Why Windows 8 drivers are buggy

We have checked the Windows 8 Driver Samples pack with our analyzer PVS-Studio and found various bugs in its samples. There is nothing horrible about it: bugs can be found everywhere, so the title of this article may sound a bit overblown. But these particular errors may be really dangerous, as it is common practice for developers to use demo samples as a basis for their own projects or to borrow code fragments from them.

Windows 8 Driver Samples

Windows 8 Driver Samples is a pack of 283 independent solutions. This made our task somewhat difficult, as we had no desire to open and check all the solutions (*.sln files) one by one. We investigated the issue and found out that we were not the only ones to face it. On programmers' forums you will often come across the question of how to unite several solutions into one. The task turns out to be relatively easy. Those interested, please see this post: "How to unite several separate projects into one general Visual Studio solution (.sln file): One Solution to Rule Them All".

Microsoft developers create very high-quality code. The Casablanca project's check results are good proof of that. Creating samples, however, seems to be a lower-priority task for them. I suspect they don't use static analysis or any other quality-control methods when developing these projects. The situation was similar with the IPP Samples collection created by Intel. As our checks have shown, it contains quite a number of bugs (checks 1, 2, 3).

Bugs in samples are not as critical as bugs in real-life software. Nevertheless, bugs can migrate from samples to real-life projects and cause developers a lot of trouble. Even within the Windows 8 Driver Samples pack we found identical bugs. The reason is obvious: a code fragment was copied and pasted from a nearby sample. In the same way these errors will get into real-life drivers.

Now let's see what interesting issues can be found in Windows 8 Driver Samples. Analysis was performed with the PVS-Studio 5.03 analyzer. As usual, let me point out that I'll cite only those fragments which I found undoubtedly suspicious. Also, I only skimmed through many of the diagnostic messages, so if any of the sample collection's developers notices this post, please don't limit yourself to reading the information given here, and consider analyzing your project more thoroughly.

Note. Visual Studio doesn't provide an API for projects implemented as an extension of the standard Visual C++ project model. This is exactly the case with driver development projects. That's why you'll need to additionally customize PVS-Studio to check your drivers, namely: you'll need to integrate PVS-Studio into MSBuild. To learn more about the MSBuild integration mode, see these sources:

Unnecessary semicolon ';'

NDIS_STATUS HwSetPowerMgmtMode(....)
{
  ....
  if (!HW_MULTIPLE_MAC_ENABLED(Hw) &&
      (PMMode->dot11PowerMode != dot11_power_mode_unknown));
  {
    NdisMoveMemory(&Hw->MacState.PowerMgmtMode, PMMode,
       sizeof(DOT11_POWER_MGMT_MODE));
    HalSetPowerMgmtMode(Hw->Hal, PMMode);
  }
  ....
}

V529 Odd semicolon ';' after 'if' operator. hw_mac.c 95

Note the semicolon here: "...unknown));". It causes the code following it to be executed all the time, regardless of the condition.
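To see the effect in isolation, here is a minimal sketch. The function names and the 'applied' flag are hypothetical and are not taken from the driver sample; only the stray semicolon is reproduced.

```cpp
// Hypothetical reduction of the V529 pattern: the stray ';' ends the 'if'
// statement, so the braces below it form an ordinary, unconditional block.
int apply_with_stray_semicolon(bool condition)
{
    int applied = 0;
    if (condition);   // <- the empty statement is the whole 'if' body
    {
        applied = 1;  // runs regardless of 'condition'
    }
    return applied;
}

// Same code without the semicolon: the block is guarded again.
int apply_guarded(bool condition)
{
    int applied = 0;
    if (condition)
    {
        applied = 1;
    }
    return applied;
}
```

Calling both functions with 'false' shows the difference: the first one still performs the assignment, the second one does not.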

Poor ASSERT

VOID MPCreateProgrammableFilter(....)
{
  ....
  ASSERT (0 < dwMaskSize <5);
  ....
}

V562 It's odd to compare 0 or 1 with a value of 5: 0 < dwMaskSize < 5. nic_pm.c 825

No comments.
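For readers who want it spelled out, the sketch below shows how such a chained comparison is evaluated and what was most likely intended. The function names are made up for the illustration, and the 'unsigned long' parameter type is an assumption based on the 'dw' prefix.

```cpp
// How the compiler reads "0 < v < 5": the left comparison yields a bool,
// which is converted to 0 or 1 and only then compared with 5.
constexpr bool chained_check(unsigned long v)
{
    return static_cast<int>(0 < v) < 5;   // always true
}

// What was most likely intended: v lies strictly between 0 and 5.
constexpr bool range_check(unsigned long v)
{
    return 0 < v && v < 5;
}
```

The chained form accepts any value, including 0 and 100, so an assertion written that way can never fire.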

Strange initialization function

NTSTATUS UartInitContext(_In_ WDFDEVICE Device)
{
  ....
  pDevExt->WdfDevice;
  ....
}

V607 Ownerless expression 'pDevExt->WdfDevice'. uart16550pc.cpp 58

I suspect the developers forgot to initialize the variable 'pDevExt->WdfDevice' in the function UartInitContext (). I cannot say for sure what it should be initialized with.

A misprint

BOOLEAN DsmpFindSupportedDevice(....)
{
  WCHAR tempString[32];
  ....
  tempString[(sizeof(tempString) /
              sizeof(tempString)) - 1] = L'\0';
  ....
}

V501 There are identical sub-expressions 'sizeof (tempString)' to the left and to the right of the '/' operator. utils.c 931

The misprint causes the null terminator to be written at the beginning of the string instead of its end. The buffer size, sizeof(tempString), must be divided by the size of one character. But it is divided by itself instead. This is the fixed code:

tempString[(sizeof(tempString) /
  sizeof(tempString[0])) - 1] = L'\0';
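One way to rule out this kind of misprint in C++ code is to let the compiler deduce the element count from the array type. This is only a sketch of the idea, not something the sample does; 'element_count' and 'terminate_last' are hypothetical helper names.

```cpp
#include <cstddef>

// Deduces the element count from the array type, so there is no second
// 'sizeof' to mistype. Passing a pointer instead of an array fails to compile.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N])
{
    return N;
}

// Writes the terminator into the last element of a wide-character array.
template <std::size_t N>
void terminate_last(wchar_t (&buffer)[N])
{
    buffer[element_count(buffer) - 1] = L'\0';
}
```

For a 32-element array the helper returns 32, whereas dividing sizeof of the array by itself always gives 1, which is why the original code ends up writing to index 0.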

The programmer forgot that a string consists of WCHAR characters

HRESULT CDot11SampleExtUI::CreateSecurityProperties(....)
{
  ....
  WCHAR wbuf[128];
  ....
  ZeroMemory(wbuf, 128);
  ....
}

V512 A call of the 'memset' function will lead to underflow of the buffer 'wbuf'. ihvsampleextui.cpp 288

The ZeroMemory() function will clear only half of the buffer 'wbuf'. Since this code belongs to the 'CreateSecurityProperties()' function, we may say we have a potential vulnerability here. This is the fixed code:

ZeroMemory(wbuf, 128 * sizeof(WCHAR));
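The difference between an element count and a byte count can be checked with a small portable sketch. Here std::memset stands in for ZeroMemory so that the code is not tied to Windows, and the function names are hypothetical. Note that wchar_t is 2 bytes on Windows but often 4 bytes elsewhere, so the untouched share of the buffer depends on the platform.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <iterator>

// Fills a 128-element wide buffer with a marker, clears 'bytes_to_clear'
// bytes from its start, and reports how many elements were left untouched.
// 'bytes_to_clear' must not exceed the size of the buffer in bytes.
std::size_t untouched_after_clear(std::size_t bytes_to_clear)
{
    wchar_t wbuf[128];
    std::fill(std::begin(wbuf), std::end(wbuf), L'x');
    std::memset(wbuf, 0, bytes_to_clear);
    return static_cast<std::size_t>(
        std::count(std::begin(wbuf), std::end(wbuf), L'x'));
}

// Size of the whole array in bytes; stays right if the element type changes.
constexpr std::size_t whole_buffer_bytes()
{
    return sizeof(wchar_t[128]);
}
```

Passing 128 leaves part of the buffer uncleared, while passing the byte size of the whole array clears everything. Where the array itself is in scope, writing sizeof(wbuf) is the least error-prone way to get that size.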

Another bug of that kind:

typedef struct _DEVICE_INFO
{
  ....
  WCHAR UnicodeSourceIp[MAX_LEN];
  WCHAR UnicodeDestIp[MAX_LEN];
  ....
} DEVICE_INFO, *PDEVICE_INFO;

PDEVICE_INFO FindDeviceInfo(....)
{
  ....
  PDEVICE_INFO    deviceInfo = NULL;
  ....
  memcpy(deviceInfo->UnicodeSourceIp,
         InputInfo->SourceIp, MAX_LEN);
  memcpy(deviceInfo->UnicodeDestIp,
         InputInfo->DestIp, MAX_LEN);
  ....       
}

V512 A call of the 'memcpy' function will lead to underflow of the buffer 'deviceInfo->UnicodeSourceIp'. testapp.c 729

V512 A call of the 'memcpy' function will lead to underflow of the buffer 'deviceInfo->UnicodeDestIp'. testapp.c 730

Only half of a string is copied. The analyzer generated some other V512 messages as well, but I would have to examine the code more thoroughly to judge whether those were genuine bugs. I can't do that: I have a queue of projects waiting to be checked.

A recheck

I don't think I can cite the code fragment in full. It contains very long names like "WFPSAMPLER_CALLOUT_BASIC_ACTION_BLOCK_AT_INBOUND_MAC_FRAME_NATIVE". Such long lines will break the article's format when publishing it on our viva64.com website. So let me just give you a description of the bug. The function KrnlHlprExposedCalloutToString() contains the following code:

else if (A == &inbound)
  str = "inbound";
else if (A == &inbound)
  str = "outbound";

It is meaningless because the second condition repeats the first one, so the "outbound" branch can never be executed. This code fragment is found in the helperfunctions_exposedcallouts.cpp file several times. It must be the result of copy-paste. Here is the list of these fragments' locations:

This is another example of a recheck.

HRESULT CSensor::HandleSetReportingAndPowerStates(....)
{
  ....
  else if (SENSOR_POWER_STATE_LOW_POWER == ulCurrentPowerState)
  {
    Trace(TRACE_LEVEL_ERROR,
      "%s Power State value is not correct = LOW_POWER, "
      "hr = %!HRESULT!", m_SensorName, hr);
  }
  else if (SENSOR_POWER_STATE_LOW_POWER == ulCurrentPowerState)
  {
    Trace(TRACE_LEVEL_ERROR,
      "%s Power State value is not correct = FULL_POWER, "
      "hr = %!HRESULT!", m_SensorName, hr);
  }
  ....
}

V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 5641, 5645. sensor.cpp 5641

I believe the second check must look like this:

else if (SENSOR_POWER_STATE_FULL_POWER == ulCurrentPowerState)

One-time loop

NDIS_STATUS AmSetApBeaconMode(....)
{
  ....
  while (BeaconEnabled != AssocMgr->BeaconEnabled)
  {
    ndisStatus = ....;
    if (NDIS_STATUS_SUCCESS != ndisStatus)
    {
      break;
    }
    AssocMgr->BeaconEnabled = BeaconEnabled;
    break;
  }
  ....
}

V612 An unconditional 'break' within a loop. ap_assocmgr.c 1817

The loop body is executed no more than once. I find the 'break' statement at the end unnecessary.

Incorrect swap?

NTSTATUS FatSetDispositionInfo (....)
{
  ....
  TmpChar = LocalBuffer[0];
  LocalBuffer[0] = TmpChar;
  ....
}

V587 An odd sequence of assignments of this kind: A = B; B = A;. Check lines: 2572, 2573. fileinfo.c 2573

Strange and meaningless code. Perhaps the programmer wanted to swap the value of the "LocalBuffer[0]" array item with another variable. But something got mixed up.

A condition not affecting anything

NDIS_STATUS Hw11QueryDiversitySelectionRX(....)
{
  //
  // Determine the PHY that the user wants to query
  //
  if (SelectedPhy)
    return HwQueryDiversitySelectionRX(HwMac->Hw, 
              HwMac->SelectedPhyId, 
              MaxEntries, 
              Dot11DiversitySelectionRXList
              );
  else
    return HwQueryDiversitySelectionRX(HwMac->Hw,
              HwMac->SelectedPhyId, 
              MaxEntries, 
              Dot11DiversitySelectionRXList
              );
}

V523 The 'then' statement is equivalent to the 'else' statement. hw_oids.c 1043

The value of the 'SelectedPhy' variable is of no importance: the same action is executed either way. I'm not sure whether this is an error. But the code is very suspicious. Other strange fragments:

Restoring settings incorrectly

If you want to disable warnings for a time, you should use a sequence of the following directives:

#pragma warning(push)
#pragma warning(disable: XXX)
....
#pragma warning(pop)

But programmers often do it in a simpler way:

#pragma warning(disable:XXX)
....
#pragma warning(default:XXX)

This practice is bad because the warning output state you set earlier could be different from the default state. Therefore, the #pragma warning(default:XXX) directive may result in showing warnings you don't want or, on the contrary, hiding those messages you need.

There are several fragments in Windows 8 Driver Samples where warnings are suppressed in such a poor manner. For example:

// disable nameless struct/union warnings
#pragma warning(disable:4201) 
#include <wdf.h>
#pragma warning(default:4201)

V665 Possibly, the usage of '#pragma warning(default: X)' is incorrect in this context. The '#pragma warning(push/pop)' should be used instead. Check lines: 23, 25. common.h 25

Here is the list of all the other fragments where warnings are disabled incorrectly:

A potential infinite loop

VOID HwFillRateElement(....)
{
  UCHAR i, j;
  ....
  for (i = 0; (i < basicRateSet->uRateSetLength) &&
              (i < 256); i++)
  {
    rate[i] = 0x80 | basicRateSet->ucRateSet[i];
  }
  ....
}

V547 Expression 'i < 256' is always true. The value range of unsigned char type: [0, 255]. hw_mac.c 1946

An infinite loop may occur here. The 'i' variable has the UCHAR type, so its value range is from 0 to 255, and any of its values is always below 256. The loop is therefore limited only by the (i < basicRateSet->uRateSetLength) condition. If that length ever exceeds 255, 'i' wraps around to 0 and the loop never ends.
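The wrap-around is easy to reproduce outside the driver. In the sketch below, the 'cap' parameter is a safety valve added only so that the demonstration terminates, the function names are hypothetical, and the length is assumed to be stored in a type wider than 8 bits.

```cpp
// Counts iterations of a loop shaped like the one in the sample, with the
// index kept as an 8-bit unsigned value. 'cap' exists for this sketch only.
unsigned long steps_with_uchar_index(unsigned long length, unsigned long cap)
{
    unsigned long steps = 0;
    for (unsigned char i = 0; (i < length) && (i < 256); i++)
    {
        if (++steps == cap)
            break;   // without this the loop would never end for length > 255
    }
    return steps;
}

// Same loop with an index wide enough to reach 256: the second condition
// now really limits the loop.
unsigned long steps_with_wide_index(unsigned long length)
{
    unsigned long steps = 0;
    for (unsigned long i = 0; (i < length) && (i < 256); i++)
    {
        ++steps;
    }
    return steps;
}
```

With a length of 300, the first function keeps going until the cap stops it, while the second one stops after 256 iterations. For lengths up to 255 both behave the same, which is why such a defect can stay unnoticed for a long time.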

A similar bug can be found in this fragment:

VOID HwFillRateElement(....)
{
  ....
  UCHAR rate[256];
  UCHAR rateNum;
  ....
  if (rateNum == sizeof(rate) / sizeof(UCHAR))
    break;
  ....  
}

V547 Expression is always false. The value range of unsigned char type: [0, 255]. hw_mac.c 1971

The "sizeof(rate) / sizeof(UCHAR)" expression equals 256. The 'rateNum' variable has the UCHAR type. It means that the condition will never hold.

Potential null pointer dereferencing

It is common practice to check pointers for null. But I know for sure that it is often done in a very slapdash manner. That is, the check is there, but it is useless. For example:

HRESULT CFileContext::GetNextSubscribedMessage()
{
  ....
  m_pWdfRequest = pWdfRequest;
  m_pWdfRequest->MarkCancelable(pCallbackCancel);
  if (m_pWdfRequest != NULL)
  {
    CompleteOneArrivalEvent();
  }
  ....
}

V595 The 'm_pWdfRequest' pointer was utilized before it was verified against nullptr. Check lines: 266, 267. filecontext.cpp 266

The 'm_pWdfRequest' pointer was used to call the MarkCancelable() function. And then the programmer suddenly recalled that it might be a null pointer and made a check: "if (m_pWdfRequest != NULL)".

Such code usually appears during the refactoring process. Lines are moved and new expressions are added. And it may happen that a pointer check is put below the place where the pointer is used for the first time.

However, these errors don't affect the program execution in most cases. Pointers in these places simply cannot equal zero, so the program works well. But I cannot say for sure whether or not these fragments are buggy. It's up to the project's developers to figure it out.
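If the pointer really can be null, the check simply has to come before the first use. A minimal sketch of that ordering follows; the 'Request' type is a hypothetical stand-in and not the WDF interface used in the sample.

```cpp
// Hypothetical stand-in for the request object; not the WDF interface.
struct Request
{
    bool cancelable = false;
    void MarkCancelable() { cancelable = true; }
};

// The check comes first, so a null request is rejected before any member
// access. Returns true when the request was accepted.
bool accept_request(Request* request)
{
    if (request == nullptr)
        return false;

    request->MarkCancelable();
    return true;
}
```

If the pointer can never be null at that point, the cleaner fix is the opposite one: remove the check so that it does not mislead readers of the code.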

Here is the list of the other fragments where this warning is generated:

True null pointer dereferencing

We have just discussed potential null pointer dereferencing errors. Now let's examine the case when a pointer is null for sure.

HRESULT CSensorDDI::OnGetDataFields(....)
{
  ....
  if (nullptr != pSensor)
  {
    ....
  }
  else
  {
    hr = E_POINTER;
    Trace(TRACE_LEVEL_ERROR,
      "pSensor == NULL before getting datafield %!GUID!-%i "
      "value from %s, hr = %!HRESULT!",
      &Key.fmtid, Key.pid, pSensor->m_SensorName, hr);
  }
}

V522 Dereferencing of the null pointer 'pSensor' might take place. sensorddi.cpp 903

If the 'pSensor' pointer equals zero, you want to save the related information into the log. But it's obviously a bad idea to try to take the name using "pSensor->m_SensorName".

A similar error can be found here:

V522 Dereferencing of the null pointer 'pSensor' might take place. sensorddi.cpp 1852

Strange loop

VOID ReportToString(
   PHID_DATA pData,
   _Inout_updates_bytes_(iBuffSize) LPSTR szBuff,
   UINT iBuffSize
)
{
  ....
  if(FAILED(StringCbPrintf (szBuff,
                iBuffSize,
                "Usage Page: 0x%x, Usages: ",
                pData -> UsagePage)))
  {
    for(j=0; j<sizeof(szBuff); j++)
    {
      szBuff[j] = '\0';
    }
    return;
  }
  ....
}

V604 It is odd that the number of iterations in the loop equals to the size of the 'szBuff' pointer. hclient.c 1688

Note the "j<sizeof(szBuff)" loop termination condition. It is very strange that the loop runs as many times as the size of a pointer in bytes (that is, 4 or 8). Most likely, the following code should have been written instead:

for(j=0; j<iBuffSize; j++)
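The sketch below contrasts the two loop bounds on an ordinary buffer. The function names are hypothetical; the first function assumes the buffer is at least as large as a pointer.

```cpp
#include <cstddef>

// Reproduces the sample's loop bound: 'sizeof(buffer)' is the size of the
// pointer itself (typically 4 or 8), not of the memory it points to.
void clear_pointer_sized_prefix(char* buffer)
{
    for (std::size_t j = 0; j < sizeof(buffer); j++)
        buffer[j] = '\0';
}

// Uses the size passed by the caller, as in the fixed loop.
void clear_whole_buffer(char* buffer, std::size_t size)
{
    for (std::size_t j = 0; j < size; j++)
        buffer[j] = '\0';
}
```

On a 64-byte buffer the first function clears only the first few bytes and leaves the rest intact.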

A misprint making the code vulnerable

bool ParseNumber(....)
{
  ....
  if ((*Value < Bounds.first) || 
      (*Value > Bounds.second))
  {
    printf("Value %s is out of bounds\n", String.c_str());
    false;
  }
  ....
}

V606 Ownerless token 'false'. util.cpp 91

It is checked that the variable's value is outside certain boundaries. This event must stop the function's operation, but that doesn't happen. The programmer made a misprint writing "false" instead of "return false;".
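A reduced version of the check with the missing keyword restored might look like this. The function name, the 'int' value type and the output parameter are assumptions made for the sketch; the real ParseNumber() is elided in the sample above.

```cpp
#include <utility>

// Hypothetical reduction of the bounds check. Returns false and leaves
// '*result' untouched when 'value' is outside [bounds.first, bounds.second].
bool store_if_in_bounds(int value, const std::pair<int, int>& bounds, int* result)
{
    if ((value < bounds.first) || (value > bounds.second))
    {
        return false;   // the sample has a bare 'false;' here, which does nothing
    }
    *result = value;
    return true;
}
```

With the bare 'false;' the function would fall through and go on working with the out-of-range value.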

The same bug can be found here:

V606 Ownerless token 'false'. util.cpp 131

A misprint in switch

In the beginning of the article, I pointed out that errors taken from samples tend to spread all over. Now I'll demonstrate it by an example. Take a look at this code.

PCHAR DbgDevicePowerString(IN WDF_POWER_DEVICE_STATE Type)
{
  ....
  case WdfPowerDeviceD0:
    return "WdfPowerDeviceD0";
  case PowerDeviceD1:
    return "WdfPowerDeviceD1";
  case WdfPowerDeviceD2:
    return "WdfPowerDeviceD2";
  ....
}

V556 The values of different enum types are compared: switch(ENUM_TYPE_A) { case ENUM_TYPE_B: ... }. usb.c 450

Most likely, "case WdfPowerDeviceD1:" should be written instead of "case PowerDeviceD1:". The name 'PowerDeviceD1' belongs to an entirely different enum type.

So, this error was found in several projects at once. It was multiplied thanks to Copy-Paste. These are other fragments containing this bug:

Pi equals 3

NTSTATUS KcsAddTrignometricInstance (....)
{
  ....
  Angle = (double)(Timestamp.QuadPart / 400000) *
          (22/7) / 180;
  ....
}

V636 The '22 / 7' expression was implicitly casted from 'int' type to 'double' type. Consider utilizing an explicit type cast to avoid the loss of a fractional part. An example: double A = (double)(X) / Y;. kcs.c 239

This is a strange integer division. Why not write 3 right away? Perhaps it would be better to write (22.0/7). Then we'd get 3.1428.... By the way, Wikipedia tells us that the fraction 22/7 is sometimes used as an approximate value of Pi. Well, then the programmer has got a VERY approximate value in this sample.
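The effect is easy to verify with two one-line functions. They mirror only the "(22/7) / 180" part of the expression; the names and the 'degrees' parameter are made up for this sketch.

```cpp
// Mirrors the expression from the sample: 22 and 7 are both 'int', so the
// division is done in integers and yields exactly 3 before any conversion.
constexpr double scale_with_int_division(double degrees)
{
    return degrees * (22 / 7) / 180;
}

// Making one operand a double keeps the fractional part of 22/7.
constexpr double scale_with_double_division(double degrees)
{
    return degrees * (22.0 / 7) / 180;
}
```

For an input of 180 the first function returns exactly 3, while the second returns a value between 3.14 and 3.15.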

Vestiges of the past

Long ago the 'new' operator used to return 0 if a memory allocation error occurred. Those times are long gone. Now, according to the standard, the 'new' operator throws the std::bad_alloc exception if an error occurs. But many programmers either don't know or forget about this, or still use ancient code containing such checks.

No problem, one may say. Just an extra check, that's alright. Well, the point is that a program is usually designed to perform some additional actions in case of an error. For instance it should release memory or close a file. But now it throws an exception when there is not enough memory, and the code that must handle it remains idle.

Have a look at this sample:

int SetHwidCallback(....)
{
  ....
  LPTSTR * tmpArray = new LPTSTR[cnt+2];
  if(!tmpArray) {
    goto final;
  }
  ....
final:
  if(hwlist) {
    DelMultiSz(hwlist);
  }
  return result;
}

V668 There is no sense in testing the 'tmpArray' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. cmds.cpp 2016

If a memory allocation error occurs, the program is supposed to jump to the 'final' label, where the DelMultiSz() function is called to delete something. That won't happen. An exception will be thrown and control will leave the function. Even if this exception is correctly handled later, a memory leak or some other bad thing will most likely happen.
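There are two common ways to make such cleanup reachable again; both are sketched below under stated assumptions. 'CleanupGuard' is a hypothetical stand-in for whatever is released at the 'final' label, and the explicit 'throw' only simulates a failed allocation so that the example behaves the same on every machine.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-in for the cleanup done at the 'final' label: the
// destructor runs on every exit path, including stack unwinding.
class CleanupGuard
{
public:
    explicit CleanupGuard(bool& released) : released_(released) {}
    ~CleanupGuard() { released_ = true; }
private:
    bool& released_;
};

// Option 1: request the non-throwing form, so the null check is reachable.
bool allocate_nothrow(std::size_t count)
{
    char** tmpArray = new (std::nothrow) char*[count];
    if (!tmpArray)
        return false;
    delete[] tmpArray;
    return true;
}

// Option 2: keep plain 'new' and let a guard object do the cleanup.
bool cleanup_runs_on_bad_alloc()
{
    bool released = false;
    try
    {
        CleanupGuard guard(released);
        throw std::bad_alloc();   // simulates a failed plain 'new'
    }
    catch (const std::bad_alloc&)
    {
    }
    return released;
}
```

The first option keeps the old control flow with minimal changes. The second one does not depend on remembering to jump to a label at all, which is usually the more robust choice in C++ code.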

In Windows 8 Driver Samples, there are a lot of fragments where a pointer received from the 'new' operator is checked for being null. In most cases, everything should work well. But the programmers still need to investigate these fragments more thoroughly. Here they are:

Bad macro

#define MP_FREE_MEMORY(_Memory)  \
  MpFreeMemory(_Memory); _Memory = NULL;

NDIS_STATUS StaStartScan(....)
{
  ....
  if (pExternalScanRequest != NULL)
    MP_FREE_MEMORY(pExternalScanRequest);
  ....    
}

V640 The code's operational logic does not correspond with its formatting. The second statement will always be executed. It is possible that curly brackets are missing. st_scan.c 564

The MP_FREE_MEMORY macro is written in a poor way: its two statements are not grouped into a single block by curly brackets. No error will occur in this particular place. It's simply that the pointer will be zeroed anyway, regardless of whether or not it was null.
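The usual remedy for multi-statement macros is the do { ... } while (0) idiom. The sketch below is an illustration rather than a patch for the sample: the macro and function names are hypothetical, and std::free stands in for MpFreeMemory.

```cpp
#include <cstdlib>

// Wrapping both statements in do { ... } while (0) makes the macro expand to
// exactly one statement, so it stays attached to the 'if' and works with 'else'.
#define FREE_AND_RESET(ptr) \
    do { std::free(ptr); (ptr) = nullptr; } while (0)

// Returns 1 if a buffer was released, 0 if there was nothing to release.
int release_if_allocated(void*& buffer)
{
    if (buffer != nullptr)
        FREE_AND_RESET(buffer);
    else
        return 0;
    return 1;
}
```

With a macro written as two bare statements, the 'if'/'else' above would not even compile, because the 'else' would no longer follow the 'if' directly.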

Something messed up in switch

CPSUICALLBACK TVTestCallBack(....)
{
  ....
  switch (DMPubID)
  {
    ....
    case DMPUB_TVOPT_OVERLAY_NO:
      Action = CPSUICB_ACTION_REINIT_ITEMS;
    case DMPUB_TVOPT_ECB_EP:
      ....
      Action = CPSUICB_ACTION_OPTIF_CHANGED;
      //
      // Fall through
      //
    ....
  }
  ....
}

V519 The 'Action' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 1110, 1124. cpsuidat.c 1124

Something is not right here. The assignment "Action = CPSUICB_ACTION_REINIT_ITEMS;" is pointless: the 'Action' variable will be assigned another value a bit later. Perhaps a 'break' statement is missing here. In other places where 'break' is not needed, you can see the comment "// Fall through". But there is no such comment here.

Not bugs, but code causing confusion

There are some code fragments that don't contain errors but may puzzle you very much. Since these code fragments confuse me, they will also confuse other programmers. Here is one example:

BOOLEAN FillDeviceInfo(IN  PHID_DEVICE HidDevice)
{
  ....
  HidDevice->InputReportBuffer = (PCHAR)calloc(....);
  HidDevice->InputButtonCaps = buttonCaps =
   (PHIDP_BUTTON_CAPS) calloc(....);
  
  ....
  if (NULL == buttonCaps)
  {
    free(HidDevice->InputReportBuffer);
    HidDevice->InputReportBuffer = NULL;
    free(buttonCaps);
    HidDevice->InputButtonCaps = NULL;
    return (FALSE);
  }
  ....
}

V575 The null pointer is passed into 'free' function. Inspect the first argument. pnp.c 406

The 'buttonCaps' pointer equals NULL. Despite that, free(buttonCaps) is called, which is pointless. This code makes you think there's an error here, but there isn't one. It's just an unnecessary operation and a waste of time during code examination. The same meaningless calls of the free() function can be found in some other fragments:

There were some other strange fragments as well. I won't cite them, as this post is long enough and we have to finish.

Conclusion

Because PVS-Studio finds more and more bugs in open-source projects, my articles reporting these checks tend to grow larger and larger. In the future, I suppose, I'll have to describe only the most interesting issues in my posts and attach a link to a complete list of suspicious fragments.

I hope that Microsoft will take my article the right way. By no means did I intend to show that their code is bad. The article just shows that errors can be found in any project and that we are capable of detecting some of them. In fact, each of my posts describes errors found in this or that project. I hope this one will help the developers to fix some defects. It will save other developers' time; but what's most important, no one will doubt Microsoft's reputation. Don't you find it strange to hear someone saying at a conference that Microsoft is concerned with their software's quality and then see the line "ASSERT (0 < dwMaskSize <5);" in some published project? The quality of demo samples must be as high as that of popular software. This is the code by which programmers will judge the quality of other solutions.