17.02.2016

PVS-Studio delved into the FreeBSD kernel

Picture 14

About a year ago we checked the Linux core. It was one of the most discussed articles at that time. We also got quite a number of requests to check FreeBSD, so finally we decided to take the time to do it.

About the project

FreeBSD is a contemporary operating system for servers, desktops and embedded computer platforms. Its code has gone through more than thirty years of continuous development, improvement and optimization. It has proven itself as a system for building intranet, Internet networks, and servers. It provides reliable network services and efficient memory management.

Despite the fact that FreeBSD is regularly checked by Coverity, we had a great time checking this project because a lot of suspicious fragments were found. In this article we'll provide about 40 fragments, but the developers of this project may have a look at a full list, which contains around 1000 analyzer warnings of high severity.

To my humble opinion, a lot of those warnings issued by the analyzer are real bugs, but it's hard for me to determine how critical they are, as I am not the developer of the system. I suppose it could be a good ground for a discussion with the authors of the project.

The source code was taken from GitHub branch - 'master'. The repository contains ~23000 files and two dozens of assembly configurations for different platforms, but I checked the kernel only, which I compiled in this way:

# make buildkernel KERNCONF=MYKERNEL

Methodology

We used static code analyzer PVS-Studio, version 6.01.

For convenience, I set a PC-BSD and wrote a small utility in C++, which keeps the working environment of the compilers' runs when building the kernel. The acquired information was used to get the preprocessed files and their analysis, done by PVS-Studio. This method allowed me to quickly check a project without having to study an unfamiliar build system to integrate the analyzer. On top of it, analysis of preprocessed files allows you to do a more in-depth analysis of the code and find more sophisticated and interesting, errors, in macros for instance. This article will provide several examples of such a kind.

Linux kernel was analyzed in the same way; this mode is also available for Windows users in the Standaloneutility, which is a part of PVS-studio distribution kit. Usually PVS-Studio seamlessly integrates into the projects. There is a number of ways to integrate the analyzer, described in the documentation. Monitoring utilities have a big advantage of trying the analyzer if the project has an unusual building system.

Surprising luck

The first possible error was found before I ran the analyzer on the project, and even before I built the kernel; the build was interrupted by a linking error. Having addressed the file, specified in the error, I saw the following:

Picture 8

Pay attention to the highlighted fragment: a tab character is used for the formatting of the indentations; two statements are moved under the condition. But the last statement does not actually refer to a condition and will be always executed. Perhaps, curly braces were forgotten here.

Once we got a comment that we just copy the analyzer warnings, but it is not so. Before the analysis of the project we have to make sure that it gets compiled correctly; when the report is done, the warnings must be sorted/examined and commented. The same work is done by our customer support team, when they answer the incoming mails. There are also cases when the customers send examples of false positives (in their opinion) which turn out to be real bugs.

Capy-poste and typos

PVS-Studio analyzer is a powerful tool for static code analysis that finds bugs of various levels of severity. The first diagnostics were very simple and were created to detect most common bugs, related to typos and copy-paste programming. After the analysis review, I sort them according to the error code. So in this article we'll start with this type of diagnostic rules.

Picture 10

V501 There are identical sub-expressions '(uintptr_t) b->handler' to the left and to the right of the '>' operator. ip_fw_sockopt.c 2893

static int
compare_sh(const void *_a, const void *_b)
{
  const struct ipfw_sopt_handler *a, *b;

  a = (const struct ipfw_sopt_handler *)_a;
  b = (const struct ipfw_sopt_handler *)_b;
  ....
  if ((uintptr_t)a->handler < (uintptr_t)b->handler)
    return (-1);
  else if ((uintptr_t)b->handler > (uintptr_t)b->handler) // <=
    return (1);
  
  return (0);
}

Here is a vivid example of a bad practice - giving the variables short and uninformative names. Now because of the typo in the letter 'b', the a part of the condition will never be return 1. Thus, the function returns a zero status not always correctly.

V501 There are identical sub-expressions to the left and to the right of the '!=' operator: m->m_pkthdr.len != m->m_pkthdr.len key.c 7208

int
key_parse(struct mbuf *m, struct socket *so)
{
  ....
  if ((m->m_flags & M_PKTHDR) == 0 ||
      m->m_pkthdr.len != m->m_pkthdr.len) { // <=
    ....
    goto senderror;
  }
  ....
}

One of the fields of the structure is compared with itself; therefore, the result of the logical operation will always be False.

V501 There are identical sub-expressions to the left and to the right of the '|' operator: PIM_NOBUSRESET | PIM_NOBUSRESET sbp_targ.c 1327

typedef enum {
  PIM_EXTLUNS      = 0x100,
  PIM_SCANHILO     = 0x80,
  PIM_NOREMOVE     = 0x40,
  PIM_NOINITIATOR  = 0x20,
  PIM_NOBUSRESET   = 0x10, // <=
  PIM_NO_6_BYTE    = 0x08,
  PIM_SEQSCAN      = 0x04,
  PIM_UNMAPPED     = 0x02,
  PIM_NOSCAN       = 0x01
} pi_miscflag;

static void
sbp_targ_action1(struct cam_sim *sim, union ccb *ccb)
{
  ....
  struct ccb_pathinq *cpi = &ccb->cpi;

    cpi->version_num = 1; /* XXX??? */
    cpi->hba_inquiry = PI_TAG_ABLE;
    cpi->target_sprt = PIT_PROCESSOR
         | PIT_DISCONNECT
         | PIT_TERM_IO;
    cpi->transport = XPORT_SPI;
    cpi->hba_misc = PIM_NOBUSRESET | PIM_NOBUSRESET; // <=
  ....
}

In this example we see that the same variable "PIM_NOBUSRESET" is used in the bitwise operation, which doesn't affect the result in any way. Most likely a constant with a different value was meant to be used here, but the variable was left unchanged.

V523 The 'then' statement is equivalent to the 'else' statement. saint.c 2023

GLOBAL void siSMPRespRcvd(....)
{
  ....
  if (agNULL == frameHandle)
  {
    /* indirect mode */
    /* call back with success */
    (*(ossaSMPCompletedCB_t)(pRequest->completionCB))(agRoot,
       pRequest->pIORequestContext, OSSA_IO_SUCCESS, payloadSize,
       frameHandle);
  }
  else
  {
    /* direct mode */
    /* call back with success */
    (*(ossaSMPCompletedCB_t)(pRequest->completionCB))(agRoot,
       pRequest->pIORequestContext, OSSA_IO_SUCCESS, payloadSize,
       frameHandle);
  }
  ....
}

Two condition branches are commented differently: /* indirect mode */ and /* direct mode */, but they are implemented similarly, which is very suspicious.

V523 The 'then' statement is equivalent to the 'else' statement. smsat.c 2848

osGLOBAL void
smsatInquiryPage89(....)
{
  ....
  if (oneDeviceData->satDeviceType == SATA_ATA_DEVICE)
  {
    pInquiry[40] = 0x01; /* LBA Low          */
    pInquiry[41] = 0x00; /* LBA Mid          */
    pInquiry[42] = 0x00; /* LBA High         */
    pInquiry[43] = 0x00; /* Device           */
    pInquiry[44] = 0x00; /* LBA Low Exp      */
    pInquiry[45] = 0x00; /* LBA Mid Exp      */
    pInquiry[46] = 0x00; /* LBA High Exp     */
    pInquiry[47] = 0x00; /* Reserved         */
    pInquiry[48] = 0x01; /* Sector Count     */
    pInquiry[49] = 0x00; /* Sector Count Exp */
  }
  else
  {
    pInquiry[40] = 0x01; /* LBA Low          */
    pInquiry[41] = 0x00; /* LBA Mid          */
    pInquiry[42] = 0x00; /* LBA High         */
    pInquiry[43] = 0x00; /* Device           */
    pInquiry[44] = 0x00; /* LBA Low Exp      */
    pInquiry[45] = 0x00; /* LBA Mid Exp      */
    pInquiry[46] = 0x00; /* LBA High Exp     */
    pInquiry[47] = 0x00; /* Reserved         */
    pInquiry[48] = 0x01; /* Sector Count     */
    pInquiry[49] = 0x00; /* Sector Count Exp */
  }
  ....
}

This example is even more suspicious than the previous one. A big code fragment was copied, but later no changes were made.

V547 Expression is always true. Probably the '&&' operator should be used here. qla_hw.c 799

static int
qla_tx_tso(qla_host_t *ha, struct mbuf *mp, ....)
{
  ....
  if ((*tcp_opt != 0x01) || (*(tcp_opt + 1) != 0x01) ||
    (*(tcp_opt + 2) != 0x08) || (*(tcp_opt + 2) != 10)) { // <=
    return -1;
  }
  ....
}

Here the analyzer detected that the condition "(*(tcp_opt + 2) != 0x08) || (*(tcp_opt + 2) != 10)" is always true and it is really so, if you build a truth table. But most likely the '&&' is not needed here, it is just a typo in the address offset. Perhaps the function code should be like this:

static int
qla_tx_tso(qla_host_t *ha, struct mbuf *mp, ....)
{
  ....
  if ((*tcp_opt != 0x01) || (*(tcp_opt + 1) != 0x01) ||
    (*(tcp_opt + 2) != 0x08) || (*(tcp_opt + 3) != 10)) {
    return -1;
  }
  ....
}

V571 Recurring check. This condition was already verified in line 1946. sahw.c 1949

GLOBAL
bit32 siHDAMode_V(....)
{
  ....
  if( saRoot->memoryAllocated.agMemory[i].totalLength > biggest)
  {
    if(biggest < saRoot->memoryAllocated.agMemory[i].totalLength)
    {
      save = i;
      biggest = saRoot->memoryAllocated.agMemory[i].totalLength;
    }
  }
  ....
}

This code is really strange, if we simplify it, we'll see the following:

if( A > B )
{
  if (B < A)
  {
    ....
  }
}

The same condition is checked twice. Most likely, something else was supposed to be written here.

A similar fragment:

  • V571 Recurring check. This condition was already verified in line 1940. if_rl.c 1941

Dangerous macros

V523 The 'then' statement is equivalent to the 'else' statement. agtiapi.c 829

if (osti_strncmp(buffer, "0x", 2) == 0)
{ 
  maxTargets = osti_strtoul (buffer, &pLastUsedChar, 0);
  AGTIAPI_PRINTK( ".... maxTargets = osti_strtoul  0 \n" );
}
else
{
  maxTargets = osti_strtoul (buffer, &pLastUsedChar, 10);
  AGTIAPI_PRINTK( ".... maxTargets = osti_strtoul 10\n"   );
}

First, I skipped this analyzer warning, thinking that it is a false positive. But warnings of low severity should also be reviewed after the project check (to improve the analyzer). So I came across such a macro:

#define osti_strtoul(nptr, endptr, base)    \
          strtoul((char *)nptr, (char **)endptr, 0)

The 'base' parameter isn't used at all, and the '0' value is always passed to the "strtoul" function as the last parameter, although values '0' and '10' are passed to the macro. In the preprocessed files all macros got expanded and the code became similar. This macro is used in this way several dozen times. The entire list of such fragments was sent to the developers.

V733 It is possible that macro expansion resulted in incorrect evaluation order. Check expression: chan - 1 * 20. isp.c 2301

static void
isp_fibre_init_2400(ispsoftc_t *isp)
....
  if (ISP_CAP_VP0(isp))
    off += ICB2400_VPINFO_PORT_OFF(chan);
  else
    off += ICB2400_VPINFO_PORT_OFF(chan - 1); // <=
  ....
}

At first glance, there is nothing strange in this code fragment. We see that sometimes the 'chan' value is used, sometimes less by one 'chan - 1', but let's have look at the macro definition:

#define ICB2400_VPOPT_WRITE_SIZE 20

#define  ICB2400_VPINFO_PORT_OFF(chan) \
  (ICB2400_VPINFO_OFF +                \
   sizeof (isp_icb_2400_vpinfo_t) +    \
  (chan * ICB2400_VPOPT_WRITE_SIZE))          // <=

When passing the binary expression to the macro, the calculation logic changes dramatically. The expression "(chan - 1) * 20" turns into "chan - 1 *20", i.e. into "chan - 20", and the incorrectly calculated size gets used further in the program.

About the priorities of operations

In this section, I will discuss how important it is to know the priorities of operations, use extra parentheses, if you are not sure and sometimes test yourself by building truth tables of logical expressions.

Picture 15

V502 Perhaps the '?:' operator works in a different way than it was expected. The '?:' operator has a lower priority than the '|' operator. ata-serverworks.c 166

ata_serverworks_chipinit(device_t dev)
{
  ....
  pci_write_config(dev, 0x5a,
           (pci_read_config(dev, 0x5a, 1) & ~0x40) |
           (ctlr->chip->cfg1 == SWKS_100) ? 0x03 : 0x02, 1);
  }
  ....
}

The priority of '?:' operator is lower than of the bitwise OR '|'. As a result, in the bit operations, in addition to the numeric constants, the expression result "(ctlr-> chip > cfg1 = SWKS_100)" gets used, which suddenly changes the calculation/computation logic. Perhaps this error wasn't noticed so far because the result seemed so close to the truth.

V502 Perhaps the '?:' operator works in a different way than it was expected. The '?:' operator has a lower priority than the '|' operator. in6.c 1318

void
in6_purgeaddr(struct ifaddr *ifa)
{
  ....
  error = rtinit(&(ia->ia_ifa), RTM_DELETE, ia->ia_flags |
        (ia->ia_dstaddr.sin6_family == AF_INET6) ? RTF_HOST : 0);
  ....
}

A different file also had a fragment with a similar error with a ternary operator.

V547 Expression 'cdb[0] != 0x28 || cdb[0] != 0x2A' is always true. Probably the '&&' operator should be used here. mfi_tbolt.c 1110

int
mfi_tbolt_send_frame(struct mfi_softc *sc, struct mfi_command *cm)
{
  ....
  if (cdb[0] != 0x28 || cdb[0] != 0x2A) {  // <='
    if ((req_desc = mfi_tbolt_build_mpt_cmd(sc, cm)) == NULL) {
      device_printf(sc->mfi_dev, "Mapping from MFI "
          "to MPT Failed \n");
      return 1;
    }
  }
  else
    device_printf(sc->mfi_dev, "DJA NA XXX SYSPDIO\n");
  ....
}

The first conditional expression is always true, that's why the 'else' branch never gets control. I will provide the truth table in case of controversial logical expressions in this and the following examples. An example for this case:

Picture 11

V590 Consider inspecting the 'error == 0 || error != - 1' expression. The expression is excessive or contains a misprint. nd6.c 2119

int
nd6_output_ifp(....)
{
  ....
  /* Use the SEND socket */
  error = send_sendso_input_hook(m, ifp, SND_OUT,
      ip6len);
  /* -1 == no app on SEND socket */
  if (error == 0 || error != -1)           // <=
      return (error);
  ....
}

The problem with this fragment is that the conditional expression doesn't depend on the result "error == 0". Perhaps, something is wrong here.

Picture 12

Three more cases:

  • V590 Consider inspecting the 'error == 0 || error != 35' expression. The expression is excessive or contains a misprint. if_ipw.c 1855
  • V590 Consider inspecting the 'error == 0 || error != 27' expression. The expression is excessive or contains a misprint. if_vmx.c 2747
  • V547 Expression is always true. Probably the '&&' operator should be used here. igmp.c 1939

V590 Consider inspecting this expression. The expression is excessive or contains a misprint. sig_verify.c 94

enum uni_ieact {
  UNI_IEACT_CLEAR = 0x00, /* clear call */
  ....
}

void
uni_mandate_epref(struct uni *uni, struct uni_ie_epref *epref)
{
  ....
  maxact = -1;
  FOREACH_ERR(e, uni) {
    if (e->ie == UNI_IE_EPREF)
      continue;
    if (e->act == UNI_IEACT_CLEAR)
      maxact = UNI_IEACT_CLEAR;
    else if (e->act == UNI_IEACT_MSG_REPORT) {
      if (maxact == -1 && maxact != UNI_IEACT_CLEAR)     // <=
        maxact = UNI_IEACT_MSG_REPORT;
    } else if (e->act == UNI_IEACT_MSG_IGNORE) {
      if (maxact == -1)
        maxact = UNI_IEACT_MSG_IGNORE;
    }
  }
  ....
}

The result of the whole conditional expression doesn'r depend on the calculation of the value "maxact != UNI_IEACT_CLEAR". Here's how it looks in the table:

Picture 13

In this section I give three ways of how to make an error in seemingly simple formulas. Just think of it...

V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. aacraid.c 2854

#define EINVAL 22 /* Invalid argument */
#define EFAULT 14 /* Bad address */
#define EPERM 1 /* Operation not permitted */

static int
aac_ioctl_send_raw_srb(struct aac_softc *sc, caddr_t arg)
{
  ....
  int error, transfer_data = 0;
  ....
  if ((error = copyin((void *)&user_srb->data_len, &fibsize, 
    sizeof (u_int32_t)) != 0)) 
    goto out;
  if (fibsize > (sc->aac_max_fib_size-sizeof(....))) {
    error = EINVAL;
    goto out;
  }
  if ((error = copyin((void *)user_srb, srbcmd, fibsize) != 0)) 
    goto out;
  ....
out:
  ....
  return(error);
}

In this function the error code gets corrupted, when the assignment is executed in the 'if' operator. I.e. in the expression "error = copyin(...) != 0" the "copyin(...) != 0" is evaluated first, and then the result (0 or 1) is written to the variable 'error'.

The documentation for the function 'copyin' states that in case of an error, it returns EFAULT (value 14), and after such a check, the result of a logical operation '1' gets stored in the error code. It is actually EPERM, a completely different error status.

Unfortunately, there is quite a number of such fragments.

  • V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. aacraid.c 2861
  • V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. if_age.c 591
  • V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. if_alc.c 1535
  • V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. if_ale.c 606
  • V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. if_jme.c 807
  • V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. if_msk.c 1626
  • V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. if_stge.c 511
  • V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. hunt_filter.c 973
  • V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. if_smsc.c 1365
  • V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. if_vte.c 431
  • V593 Consider reviewing the expression of the 'A = B != C' kind. The expression is calculated as following: 'A = (B != C)'. zfs_vfsops.c 498

Strings

Picture 2

V541 It is dangerous to print the string 'buffer' into itself. ata-highpoint.c 102

static int
ata_highpoint_probe(device_t dev)
{
  ....
  char buffer[64];
  ....
  strcpy(buffer, "HighPoint ");
  strcat(buffer, idx->text);
  if (idx->cfg1 == HPT_374) {
  if (pci_get_function(dev) == 0)
      strcat(buffer, " (channel 0+1)");
  if (pci_get_function(dev) == 1)
      strcat(buffer, " (channel 2+3)");
  }
  sprintf(buffer, "%s %s controller",
    buffer, ata_mode2str(idx->max_dma));
  ....
}

Some string is formed in the buffer. Then the programmer wants to get a new string, saving the previous string value, and add two more words. It seems really simple.

To explain why unexpected result will be recieved here, I will quote a simple and clear example from the documentation for this diagnostic:

char s[100] = "test";
sprintf(s, "N = %d, S = %s", 123, s);

As a result of the work we would want to get the following string:

N = 123, S = test

But in practice it will be like this:

N = 123, S = N = 123, S =

In other situations, the same code can lead not only to the incorrect text, but also to the program abortion. The code can be fixed if you use a new buffer to store the result . The correct version:

char s1[100] = "test";
char s2[100];
sprintf(s2, "N = %d, S = %s", 123, s1);

V512 A call of the 'strcpy' function will lead to overflow of the buffer 'p->vendor'. aacraid_cam.c 571

#define  SID_VENDOR_SIZE   8
  char   vendor[SID_VENDOR_SIZE];
#define  SID_PRODUCT_SIZE  16
  char   product[SID_PRODUCT_SIZE];
#define  SID_REVISION_SIZE 4
  char   revision[SID_REVISION_SIZE];

static void
aac_container_special_command(struct cam_sim *sim, union ccb *ccb,
  u_int8_t *cmdp)
{
  ....
  /* OEM Vendor defines */
  strcpy(p->vendor,"Adaptec ");          // <=
  strcpy(p->product,"Array           "); // <=
  strcpy(p->revision,"V1.0");            // <=
  ....
}

All three strings here are filled incorrectly. There is no space for null-terminal symbol in the arrays, which may cause serious problems with such strings in the future. One space can be removed in "p->vendor" and "p->product". Then there will be room for null terminal, that the strcpy() function adds to the end of the string. But there is no free space at all for the end of line characters for the "p->revision"; that's why the value SID_REVISION_SIZE should be increased at least by one.

Of course, it is rather hard for me to judge about the code. It's possible, that the terminal null is not needed at all and everything is designed for a specific buffer size. Then the strcpy() function is chosen incorrectly. In this case the code should be written like this:

memcpy(p->vendor,   "Adaptec ",         SID_VENDOR_SIZE);
memcpy(p->product,  "Array           ", SID_PRODUCT_SIZE);
memcpy(p->revision, "V1.0",             SID_REVISION_SIZE);

V583 The '?:' operator, regardless of its conditional expression, always returns one and the same value: td->td_name. subr_turnstile.c 1029

static void
print_thread(struct thread *td, const char *prefix)
{
  db_printf("%s%p (tid %d, pid %d, ....", prefix, td, td->td_tid,
      td->td_proc->p_pid, td->td_name[0] != '\0' ? td->td_name :
      td->td_name);
}

Suspicious fragment. Despite the "td->td_name[0] != '\0'" check, this string is still printed.

Here are such fragments:

  • V583 The '?:' operator, regardless of its conditional expression, always returns one and the same value: td->td_name. subr_turnstile.c 1112
  • V583 The '?:' operator, regardless of its conditional expression, always returns one and the same value: td->td_name. subr_turnstile.c 1196

Operations with memory

In this section I will tell about incorrect usage of the following functions:

void bzero(void *b, size_t len);

int copyout(const void *kaddr, void *uaddr, size_t len);

V579 The bzero function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the second argument. osapi.c 316

/* Autosense storage */  
struct scsi_sense_data sense_data;

void
ostiInitiatorIOCompleted(....)
{
  ....
  bzero(&csio->sense_data, sizeof(&csio->sense_data));
  ....
}

To zero the structure, we should pass the structure pointer and the size of the memory to be zeroed in bytes to the bzero() function; but here the pointer size is passed to the function, not the structure size.

The correct code should be like this:

bzero(&csio->sense_data, sizeof(csio->sense_data));

V579 The bzero function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the second argument. acpi_package.c 83

int
acpi_PkgStr(...., void *dst, ....)
{
  ....
  bzero(dst, sizeof(dst));
  ....
}

In this example we see a similar situation: the size of the pointer, not the object gets passed to the 'bzero' function.

Correct version:

bzero(dst, sizeof(*dst));

V579 The copyout function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the third argument. if_nxge.c 1498

int
xge_ioctl_stats(xge_lldev_t *lldev, struct ifreq *ifreqp)
{
  ....
  *data = (*data == XGE_SET_BUFFER_MODE_1) ? 'Y':'N';
  if(copyout(data, ifreqp->ifr_data, sizeof(data)) == 0)    // <=
      retValue = 0;
  break;
  ....
}

In this example the memory is copied from 'data' to 'ifreqp->ifr_data', at the same time the size of the memory to be copied is sizeof(data), i.e. 4 or 8 bytes depending on the bitness of the architecture.

Pointers

Picture 17

V557 Array overrun is possible. The '2' index is pointing beyond array bound. if_spppsubr.c 4348

#define AUTHKEYLEN  16

struct sauth {
  u_short  proto;      /* authentication protocol to use */
  u_short  flags;

#define AUTHFLAG_NOCALLOUT  1  
          /* callouts */
#define AUTHFLAG_NORECHALLENGE  2  /* do not re-challenge CHAP */
  u_char  name[AUTHNAMELEN];  /* system identification name */
  u_char  secret[AUTHKEYLEN];  /* secret password */
  u_char  challenge[AUTHKEYLEN];  /* random challenge */
};

static void
sppp_chap_scr(struct sppp *sp)
{
  u_long *ch, seed;
  u_char clen;

  /* Compute random challenge. */
  ch = (u_long *)sp->myauth.challenge;
  read_random(&seed, sizeof seed);
  ch[0] = seed ^ random();
  ch[1] = seed ^ random();
  ch[2] = seed ^ random(); // <=
  ch[3] = seed ^ random(); // <=
  clen = AUTHKEYLEN;
  ....
}

The size of 'u_char' type is 1 byte in the 32 and 64 - bit applications; but the size of the 'u_long' type is 4 bytes in the 32-bit applications and 8 byte in the 64-bit application. So in the 32-bit application during the execution of the operation "u_long* ch = (u_long *)sp->myauth.challenge", the array 'ch' will consist of 4 elements, 4 bytes each. And in the 64-bit application the array 'ch' will consist of 2 elements, that have 8 bytes each. Therefore, if we compile the 64-bit kernel, then when accessing ch[2] and ch[3] we'll have array index out of bounds.

V503 This is a nonsensical comparison: pointer >= 0. geom_vinum_plex.c 173

gv_plex_offset(...., int *sdno, int growing)
{
  ....
  *sdno = stripeno % sdcount;
  ....
  KASSERT(sdno >= 0, ("gv_plex_offset: sdno < 0"));
  ....
}

We managed to detect a very interesting fragment with the help of diagnostic 503. There is no point in checking that the pointer is greater than or equal to 0. Most likely, the pointer "sdno" was not dereferenced in order to compare the stored value.

There are two more comparisons with null.

  • V503 This is a nonsensical comparison: pointer >= 0. geom_vinum_raid5.c 602
  • V503 This is a nonsensical comparison: pointer >= 0. geom_vinum_raid5.c 610

V522 Dereferencing of the null pointer 'sc' might take place. mrsas.c 4027

void
mrsas_aen_handler(struct mrsas_softc *sc)
{
  ....
  if (!sc) {
    device_printf(sc->mrsas_dev, "invalid instance!\n");
    return;
  }
  if (sc->evt_detail_mem) {
  ....
}

If the pointer "sc" is a null one, then the function will exit. However, it's not quite clear, why the programmer tried to dereference the "sc->mrsas_dev" pointer.

A list of strange fragments:

  • V522 Dereferencing of the null pointer 'sc' might take place. mrsas.c 1279
  • V522 Dereferencing of the null pointer 'sc' might take place. tws_cam.c 1066
  • V522 Dereferencing of the null pointer 'sc' might take place. blkfront.c 677
  • V522 Dereferencing of the null pointer 'dev_priv' might take place. radeon_cs.c 153
  • V522 Dereferencing of the null pointer 'ha' might take place. ql_isr.c 728

V713 The pointer m was utilized in the logical expression before it was verified against nullptr in the same logical expression. ip_fastfwd.c 245

struct mbuf *
ip_tryforward(struct mbuf *m)
{
  ....
  if (pfil_run_hooks(
      &V_inet_pfil_hook, &m, m->m_pkthdr.rcvif, PFIL_IN, NULL) ||
      m == NULL)
    goto drop;
  ....
}

The check "m == NULL" is placed incorrectly. First we need to check the pointer, and only then call the pfil_run_hooks() function.

Loops

Picture 18

V621 Consider inspecting the 'for' operator. It's possible that the loop will be executed incorrectly or won't be executed at all. if_ae.c 1663

#define  AE_IDLE_TIMEOUT    100

static void
ae_stop_rxmac(ae_softc_t *sc)
{
  int i;
  ....
  /*
   * Wait for IDLE state.
   */
  for (i = 0; i < AE_IDLE_TIMEOUT; i--) {  // <=
    val = AE_READ_4(sc, AE_IDLE_REG);
    if ((val & (AE_IDLE_RXMAC | AE_IDLE_DMAWRITE)) == 0)
      break;
    DELAY(100);
  }
  ....
}

In the source code of FreeBSD we found such an interesting and incorrect loop. For some reason, there is a decrement of a loop counter instead of an increment. It turns out that the loop can execute more times than the value of AE_IDLE_TIMEOUT, until the 'break' operator executes.

If the loop is not stopped, then we'll have the overflow of a signed variable 'i'. Signed variable overflow is nothing but a undefined behavior. And it's not some abstract theoretical danger, it is very real. Recently, my colleague wrote an article on this topic: Undefined behavior is closer than you think

One more interesting moment. We detected the same error in the code of Haiku operating system (see the section "Warnings #17, #18") No idea, who borrowed the "if_ae.c" file, but this error appears after Copy-Paste.

V535 The variable 'i' is being used for this loop and for the outer loop. Check lines: 182, 183. mfi_tbolt.c 183

mfi_tbolt_adp_reset(struct mfi_softc *sc)
{
  ....
  for (i=0; i < 10; i++) {
    for (i = 0; i < 10000; i++);
  }
  ....
}

Probably, this small piece of code is used for creating the delay, but in sum total only 10000 operations are executed, not 10*10000; why then 2 loops are needed here?

I specifically cited this example because it is the most vivid to show that the usage of the same variable in the external and nested loops leads to unexpected results.

V535 The variable 'i' is being used for this loop and for the outer loop. Check lines: 197, 208. linux_vdso.c 208

void
__elfN(linux_vdso_reloc)(struct sysentvec *sv, long vdso_adjust)
{
  ....
  for(i = 0; i < ehdr->e_shnum; i++) {                      // <=
    if (!(shdr[i].sh_flags & SHF_ALLOC))
      continue;
    shdr[i].sh_addr += vdso_adjust;
    if (shdr[i].sh_type != SHT_SYMTAB &&
        shdr[i].sh_type != SHT_DYNSYM)
      continue;

    sym = (Elf_Sym *)((caddr_t)ehdr + shdr[i].sh_offset);
    symcnt = shdr[i].sh_size / sizeof(*sym);

    for(i = 0; i < symcnt; i++, sym++) {                    // <=
      if (sym->st_shndx == SHN_UNDEF ||
          sym->st_shndx == SHN_ABS)
        continue;
      sym->st_value += vdso_adjust;
    }
  }
  ....
}

This is probably a too complicated example to understand if the code executes correctly. But looking at the previous example we can draw a conclusion that a wrong number of iterations is executed here as well.

V547 Expression 'j >= 0' is always true. Unsigned type value is always >= 0. safe.c 1596

static void
safe_mcopy(struct mbuf *srcm, struct mbuf *dstm, u_int offset)
{
  u_int j, dlen, slen;                   // <=
  caddr_t dptr, sptr;

  /*
   * Advance src and dst to offset.
   */
  j = offset;
  while (j >= 0) {                       // <=
    if (srcm->m_len > j)
      break;
    j -= srcm->m_len;                    // <=
    srcm = srcm->m_next;
    if (srcm == NULL)
      return;
  }
  sptr = mtod(srcm, caddr_t) + j;
  slen = srcm->m_len - j;

  j = offset;
  while (j >= 0) {                       // <=
    if (dstm->m_len > j)
      break;
    j -= dstm->m_len;                    // <=
    dstm = dstm->m_next;
    if (dstm == NULL)
      return;
  }
  dptr = mtod(dstm, caddr_t) + j;
  dlen = dstm->m_len - j;
  ....
}

There are two dangerous loops in this function. As the 'j' variable (loop counters) has an unsigned type, then the "j >= 0" check is always true and these loops are "infinity". Another problem is that some value is constantly subtracted from this counter; therefore if there is an attempt to access beyond the zero value, then the 'j' variable will get the maximum value of its type.

V711 It is dangerous to create a local variable within a loop with a same name as a variable controlling this loop. powernow.c 73

static int
pn_decode_pst(device_t dev)
{
  ....
  struct pst_header *pst;                                   // <=
  ....
  p = ((uint8_t *) psb) + sizeof(struct psb_header);
  pst = (struct pst_header*) p;

  maxpst = 200;

  do {
    struct pst_header *pst = (struct pst_header*) p;        // <=

    ....

    p += sizeof(struct pst_header) + (2 * pst->numpstates);
  } while (cpuid_is_k7(pst->cpuid) && maxpst--);            // <=
  ....
}

In the body of the loop we detected variable declaration was that matches the variable used for the loop control. I suspect that the value of the external pointer with the 'pst' name doesn't change because a local pointer with the same 'pst' is created. Perhaps the same "pst->cupid" value is always checked in the loop condition do....while(). The developers should review this fragment and give the variables different names.

Miscelleneous

V569 Truncation of constant value -96. The value range of unsigned char type: [0, 255]. if_rsu.c 1516

struct ieee80211_rx_stats {
  ....
  uint8_t nf;      /* global NF */
  uint8_t rssi;    /* global RSSI */
  ....
};

static void
rsu_event_survey(struct rsu_softc *sc, uint8_t *buf, int len)
{
  ....
  rxs.rssi = le32toh(bss->rssi) / 2;
  rxs.nf = -96;
  ....
}

It's very strange that an unsigned variable "rxs.nf" is assigned with a negative value '-96' As a result, the variable will have the value '160'.

V729 Function body contains the 'done' label that is not used by any 'goto' statements. zfs_acl.c 2023

int
zfs_setacl(znode_t *zp, vsecattr_t *vsecp, ....)
{
  ....
top:
  mutex_enter(&zp->z_acl_lock);
  mutex_enter(&zp->z_lock);
  ....
  if (error == ERESTART) {
    dmu_tx_wait(tx);
    dmu_tx_abort(tx);
    goto top;
  }
  ....
done:                            // <=
  mutex_exit(&zp->z_lock);
  mutex_exit(&zp->z_acl_lock);

  return (error);
}

In this code there are functions containing labels, but at the same time, the call of the 'goto' statement is missing for these labels. For example, we see that the 'top' label is used in this fragment, but 'done' isn't used anywhere. Perhaps the programmer forgot to add a jump to the label, or it was removed over the time, while the label was left in the code.

V646 Consider inspecting the application's logic. It's possible that 'else' keyword is missing. mac_process.c 352

static void
mac_proc_vm_revoke_recurse(struct thread *td, struct ucred *cred,
    struct vm_map *map)
{
  ....
  if (!mac_mmap_revocation_via_cow) {
    vme->max_protection &= ~VM_PROT_WRITE;
    vme->protection &= ~VM_PROT_WRITE;
  } if ((revokeperms & VM_PROT_READ) == 0)   // <=
    vme->eflags |= MAP_ENTRY_COW |
        MAP_ENTRY_NEEDS_COPY;
  ....
}

Finally, I want to tell you about suspicious formatting, which I already came across in the very beginning of the project check. Here the code is aligned in such a way that the absence of the keyword 'else' looks strange.

V705 It is possible that 'else' block was forgotten or commented out, thus altering the program's operation logics. scsi_da.c 3231

static void
dadone(struct cam_periph *periph, union ccb *done_ccb)
{
  ....
  /*
   * If we tried READ CAPACITY(16) and failed,
   * fallback to READ CAPACITY(10).
   */
  if ((state == DA_CCB_PROBE_RC16) &&
    ....
  } else                                                    // <=
  /*
   * Attach to anything that claims to be a
   * direct access or optical disk device,
   * as long as it doesn't return a "Logical
   * unit not supported" (0x25) error.
   */
  if ((have_sense) && (asc != 0x25)                         // <=
    ....
  } else { 
    ....
  }
  ....
}

This code has no error now, but it will definitely show up one day. By leaving such a big commentary before 'else' you may accidentally forget that this keyword was somewhere in the code and make some erroneous edits.

Conclusion

Picture 1

The FreeBSD project was tested by special version of PVS-Studio, which showed a great result! The whole material is impossible to fit in one article. Nevertheless, the development team of FreeBSD got the full list of the analyzer warnings that should be examined.

I suggest everyone to try PVS-Studio on your projects. The analyzer works in Windows environment. We don't have a public version for using the analyzer in the development of the projects for Linux/FreeBSD. We could also discuss possible variants of customization PVS-Studio for your projects and specific tasks.