Trying to Sell PVS-Studio to Google, or New Bugs in Chromium

02.12.2013 Evgeniy Ryzhkov

Introduction

Publishing articles about checks of various projects with our tool PVS-Studio usually brings us new customers. It's a fair business: programmers don't like ads but readily response to interesting materials which can be easily checked. That's why we prefer to demonstrate what our tool is capable of rather than directly advertise it. Nevertheless, despite that we checked Chromium three times already and found bugs in it each time, I still haven't received an email with an order request from google.com. I want to figure out what I am doing wrong and why Google would refuse to use PVS-Studio, so I decided to write one more article on this matter.

Picture 1

PVS-Studio integrates into Ninja to check Chromium.

It consists of two parts. The first explains Chromium's infrastructure and our tool's integration specifics; the second demonstrates new bugs found in the code.

Do you want to know why developing Chromium is a difficult task and why only some of the programmer tools are applicable to this project? Then you are welcome to read on...

The Chromium developers disregard Visual Studio and don't use Makefile but still manage somehow to write an incredibly high-quality code. How come?

Developing projects like Chromium is extremely difficult. Well, I even feel a bit awkward about the phrase "projects like Chromium" because I'm not familiar with any other project of that kind. There are the Linux core, of course, and the Visual Studio environment and many other large and heavy projects, but personally I had the chance to "shake hands" with Chromium only, and what I found there would certainly be of interest to every programmer, as there is much to learn from it indeed.

For example, I found out that they don't use Visual Studio too much. The reason is obvious: Chromium includes an enormous amount of projects, and the IDE by Microsoft just can't manage it, to be honest. "Aha!", the stern linuxoids would say, "You bet it can't!!!" But the Chromium developers don't use Linux Makefiles either - just for the same reason: the standard GNU make can't manage so many projects and runs too slow.

So what do the Chromium developers use then? They use the GYP (Generate Your Projects) build system. It can generate either .vcxproj files (for MSBuild/Visual C++) or files for the Ninja system (it's kind of a much simpler and faster makefile). That's why one needs to integrate a static analyzer into this build system to be able to perform analysis regularly. And that's exactly what we've done in our attempts to sell PVS-Studio to Google.

What is special about the Chromium project is that the size of its C/C++ source code, including third party libraries, is larger than 500 Mbytes and each modification of code is verified by dozens of thousands of automatic tests simultaneously on hundreds of test computers with various architectures and configurations. The development rate is also very remarkable: more than 900 unique authors added more than 48 thousand code revisions into Chromium's repository during 2012, which makes the average rate of one revision per 11 minutes and one revision per week by each active author.

Such a heavy and fast developing project sets especially strict requirements to versatility, accuracy and efficiency of bug detectors as well as the entire testing system in general. It was when analyzing Chromium that many bugs, defects and optimization troubles of detectors were discovered for the first time. In particular, it proved unfairly troublesome to use proprietary detectors, whose source code was not available for modifications, because they would too often process even the base primitives of the project incorrectly, whereas fixing those testing defects required programmers to wait too long before the next version of a detector was released.

Although PVS-Studio is not an open source project either, one can't deny that we are flexible enough. Checking Chromium without any troubles by integrating into the same build system it employs is our way to prove that.

How to integrate PVS-Studio into Chromium's build system for regular checks?

General information on PVS-Studio's working principles

We can distinguish 2 basic components of the PVS-Studio distribution package: the command-line analyzer PVS-Studio.exe itself and the IDE plugin to integrate it into one of the supported IDEs (Microsoft Visual Studio and Embarcadero RAD Studio).

The command-line analyzer works in a way very much similar to a compiler: it is called individually for each file to be checked, each call containing parameters which, among others, include original compilation parameters for a particular file. After that, the analyzer calls a corresponding external preprocessor (depending on the compiler building the files) which generates a temporary i-file, i.e. the file with all the include and define directives expanded, for the analyzer to check it.

However, usage of PVS-Studio.exe is not limited to IDE plugins only. As said above, the command-line analyzer very much resembles a compiler in its working principles, so its call can be also integrated directly into the build system together with the compiler. For example, if you have your project built in the Eclipse IDE with gcc, you can integrate a PVS-Studio call into your makefiles.

For direct integration into the build process, you need to add a call of PVS-Studio.exe into the build script next to the call of the C/C++ compiler and pass the same parameters to the analyzer as to the compiler (and a few more additional parameters specifying the analysis report output). This requirement is determined by the consideration that the analyzer must be called for each file you want checked, each call containing specific parameters for each particular file. It can be most conveniently done at the same stage where automatic traverse of all the project's source files takes place.

Checking the Chromium project with the PVS-Studio.exe static analyzer

As said above, Chromium is developed with the aid of the GYP (Generate Your Projects) build system that allows one to get native project files for different operating systems and compilers. Since the PVS-Studio analyzer currently supports the Windows family only, we need to find possible ways to build Chromium with the Visual C++ 10 compiler. This compiler (cl.exe) comes with the Visual Studio IDE package and can also be installed separately from a free package Windows SDK.

Using MSBuild project files

The GYP system employed by Chromium allows one to use the Visual C++ compiler (cl.exe) to get MSBuild project files (vcxproj). The MSBuild build system is a component of the .NET Framework package which is one of the standard components for the Windows family's operating systems.

The easiest way to get PVS-Studio to check the Chromium project is to use its native IDE plugin for Visual Studio. MSBuild project files can be opened and checked in this environment, and the PVS-Studio IDE plugin will automatically collect all the necessary information about each of the project files and call the PVS-Studio.exe analyzer to check them. Note that the free version Visual Studio Express Edition does not support IDE plugins.

You can also use MSBuild (the command-line utility MSBuild.exe, to be more exact) to build and check vcxproj files directly, without the Visual Studio environment. To enable the analyzer to check projects in this mode, you will have to integrate the calls of the command-line analyzer PVS-Studio.exe directly into each project file (or import the shared props file with this call into all the project files).

Although MSBuild allows exe-files to be called directly from its build scripts (to which project files vcxproj also refer), the calls of building tools like the compiler and linker in the standard projects are implemented through special wrappers linked to the project (they are called Build Tasks in the terms of MSBuild). The PVS-Studio distribution package provides such a build-task wrapper for MSBuild build scripts together with the Props (property sheet) file employing it which can be directly imported into standard vcxproj projects to perform the integration of static analysis.

Using Ninja project files

Chromium can also be built under Windows with the cl.exe compiler and the Ninja build system's scripts which also can be generated by GYP.

As described above, to integrate the analyzer directly into the build process, you need to integrate a PVS-Studio.exe call at the same point where the system traverses the source files during compilation.

When dealing with Ninja files, this integration method is complicated by one thing: the files being built are strictly specified in automatically generated *.ninja files as dependences for obj files. In this connection, the build rules of this stage (they are described in the shared file build.ninja) must be modified to ensure integration of the analyzer at this point. These are the files cc and cxx - they are used when traversing the source files.

We currently haven't found a way to add PVS-Studio.exe calls directly into cc and cxx rules. Ninja's build rules allow using only one command variable to specify a command to be executed. At the same time, according to the documentation, this variable may also contain several commands separated by the characters &&. But when we add to an existing rule:

command = ninja -t msvc -e $arch -- $cc /nologo /showIncludes /FC 
@$out.rsp /c $in /Fo$out /Fd$pdbname

a PVS-Studio.exe call:

PVS = "PVS-Studio.exe"
...
command = ninja -t msvc -e $arch -- $cc /nologo /showIncludes /FC 
@$out.rsp /c $in /Fo$out /Fd$pdbname && $PVS -cfg "c:\test.cfg"

this call is interpreted as some of the arguments for the ninja process and therefore is passed as an argument into cl.exe ($cc) when calling -t msvs. In the same way, if you integrate the $PVS call in the beginning of the line, all the other parameters after && are passed as arguments into PVS-Studio.exe.

We could write a wrapper program to circumvent this limitation: it first calls ninja and then PVS-Studio.exe by turn, the call of this wrapper added into the beginning of the command variable for the necessary build rules (cc and cxx). Well, that's what we actually did in order to check Chromium with PVS-Studio.

Specifics of working with the command-line analyzer PVS-Studio.exe when directly integrated into the build system's scripts.

The main thing one needs to keep in mind when using PVS-Studio.exe in the mode of direct integration into the build system (i.e. without an IDE plugin) is that you need to preprocess all the source files you want checked in order to make them analyzable. PVS-Studio.exe must receive the flag cl-params as one of its launch parameters, followed by the "original" compilation parameters of the file. PVS-Studio.exe will automatically call the external preprocessor (for example, cl.exe), adding the necessary flags to those parameters to control the preprocessor's work (the flag /P in case of cl.exe).

But there are certain differences between the preprocessor's and compiler's behavior which may cause an issue when compilation flags are insufficient for correct preprocessing of the source C/C++ files. In particular, preprocessing may become impossible if include paths passed to the preprocessor are lacking the path to the folder with the header file serving as a precompiled header. In this case compilation will run successfully (only if the pch file specified to the compiler is already generated, of course), but the preprocessor will terminate with the error message "cannot open include file".

When dealing with a precompiled header included into a file, the PVS-Studio IDE plugin solves the issue by scanning all the files of the project the file being checked refers to and adding the folder with the file generating the needed pch (there may be a few of them in a project) into the include paths. In direct integration mode, we must ensure the preprocessor's correct work by passing this path among other compilation parameters. You can do it by adding one more -I (/I) parameter with the corresponding folder into the list of arguments passed to the analyzer.

The Chromium project contains several hundreds of such files, i.e. files that utilize precompiled headers which don't receive, while being compiled in Includes, the path to the folder with the h files themselves from which those headers were obtained. To perform a correct check of these files by PVS-Studio in direct integration mode (i.e. without using a plugin), we need to modify the build system as described above before launching analysis.

But there's an easier way to do this. We simply disabled precompiled headers when building Chromium to integrate PVS-Studio into the build system.

What to do with the check log generated after integration?

The result of such integration is a report file in a so called "raw" form. You can view it in our utility PVS-Studio Standalone (for details, look here) and start working with it in a full-function environment providing navigation and other convenient features.

Summing up the information about integrating PVS-Studio into Chromium's build system

So, how exactly is integration of PVS-Studio into Chromium's build system done?

  • Disable precompiled headers.
  • Generate Ninja projects.
  • Call a special utility PVS-Studio Wrapper (not included into the PVS-Studio distribution package) from the Ninja projects and this utility will in its turn call PVS-Studio.
  • The analysis results are output as a raw log file which can be opened and managed in PVS-Studio Standalone.

Now let's pass on to the second part of our article - examples of detected bugs.

Examples of detected bugs

It was not for the purpose of the error search as such, but for the purpose of testing the new build system that we have checked Chromium one more time. To be more exact, we wanted to see how well PVS-Studio can integrate into it. That's why Andrey Karpov has only scanned through the diagnostic messages. But he still managed to find some genuine bugs and sent me a few code fragments with comments. It's not surprising for large projects like Chromium that even a quick, superficial examination reveals bugs in them, since even a well written code of such a size will inevitably contain errors. Besides, Chromium is developing very fast and acquires new features and libraries.

Most bugs in its code are found in third-party libraries; but that doesn't make them less serious. Since the Chromium authors are very fast developing the libraries the project is comprised of, I guess they will be interested to know what new bugs PVS-Studio has found there.

Let me remind you that we already checked Chromium a few times before:

This is the reason why I can't persuade Andrey to spend more time and attention on examination of diagnostic messages. First, it's not that interesting after all those checks. Second, it's only the Chromium developers themselves who can perform a reliable and complete analysis of their project. It's too hard carrying out the analysis on one's own and, moreover, without knowing the project's and libraries' structure and principles. After all, the project can (and ought to) be checked every day, not once in a year. But that issue lies on the conscience of the programmer community working on Chromium and the libraries.

A parenthesis in a wrong place (paranoids will treat it as a tab :)

static SECStatus
ssl3_SendEncryptedExtensions(sslSocket *ss)
{
  static const unsigned char P256_SPKI_PREFIX[] = {
    0x30, 0x59, 0x30, 0x13, 0x06, 0x07, 0x2a, 0x86,
    0x48, 0xce, 0x3d, 0x02, 0x01, 0x06, 0x08, 0x2a,
    0x86, 0x48, 0xce, 0x3d, 0x03, 0x01, 0x07, 0x03,
    0x42, 0x00, 0x04
  };
  ....
  if (.... ||
      memcmp(spki->data, P256_SPKI_PREFIX,
             sizeof(P256_SPKI_PREFIX) != 0))
  {
    PORT_SetError(SSL_ERROR_INVALID_CHANNEL_ID_KEY);
    rv = SECFailure;
    goto loser;
  }
  ....
}

PVS-Studio's diagnostic message (the library Network Security Services): V526 The 'memcmp' function returns 0 if corresponding buffers are equal. Consider examining the condition for mistakes. ssl3con.c 10533

Because of this parenthesis the function memcmp() performs comparison of 1 byte.

The "sizeof(P256_SPKI_PREFIX) != 0" expression is always true. That is, it evaluates to 1 all the time.

The correct check should look like this:

if (.... ||
    memcmp(spki->data, P256_SPKI_PREFIX,
           sizeof(P256_SPKI_PREFIX)) != 0)

The variable 'i' looks like 1.

void SkCanvasStack::pushCanvas(....) {
  ....
  for (int i = fList.count() - 1; i > 0; --i) {
    SkIRect localBounds = canvasBounds;
    localBounds.offset(origin - fCanvasData[i-1].origin);

    fCanvasData[i-1].requiredClip.op(localBounds,
                                     SkRegion::kDifference_Op);
    fList[i-i]->clipRegion(fCanvasData[i-1].requiredClip);
  }
  ....
}

Can't see anything strange? :) Well, the analyzer can.

PVS-Studio's diagnostic message (the library Skia Graphics Engine): V501 There are identical sub-expressions to the left and to the right of the '-' operator: i - i SkCanvasStack.cpp 38

The [i - 1] expression is used as an index several times, but in one place it's written as [i-i]. Looks like a typo, and I bet it's 1 that must be subtracted here.

"One-time" loop

Code* Code::FindFirstCode() {
  ASSERT(is_inline_cache_stub());
  DisallowHeapAllocation no_allocation;
  int mask = RelocInfo::ModeMask(RelocInfo::CODE_TARGET);
  for (RelocIterator it(this, mask); !it.done(); it.next())
  {
    RelocInfo* info = it.rinfo();
    return
      Code::GetCodeFromTargetAddress(info->target_address());
  }
  return NULL;
}

PVS-Studio's diagnostic message (Chromium): V612 An unconditional 'return' within a loop. objects.cc 10326

The loop will terminate right after the first iteration. I suspect the programmer forgot to use some condition here. However, this code may be correct as well, but it's still odd and worth closer examination.

Here's one more loop like that:

int SymbolTable::Symbolize() {
  ....
  if (socketpair(AF_UNIX, SOCK_STREAM,
                 0, child_fds[i]) == -1)
  {
    for (int j = 0; j < i; j++) {
      close(child_fds[j][0]);
      close(child_fds[j][1]);
      PrintError("Cannot create a socket pair");
      return 0;
    }
  }
  ....
}

PVS-Studio's diagnostic message (the library tcmalloc): V612 An unconditional 'return' within a loop. symbolize.cc 154

I believe a closing curly brace is put in a wrong place in this fragment. Perhaps the code should look like this:

if (socketpair(AF_UNIX, SOCK_STREAM,
               0, child_fds[i]) == -1)
{
  for (int j = 0; j < i; j++) {
    close(child_fds[j][0]);
    close(child_fds[j][1]);
  }
  PrintError("Cannot create a socket pair");
  return 0;
}

The beginning and the end return the same value

class CONTENT_EXPORT EventPacket {
  ....
  InputEvents::const_iterator begin() const
    { return events_.end(); }
  InputEvents::const_iterator end() const
    { return events_.end(); }
  ....
protected:
  InputEvents events_;
  ....
};

PVS-Studio's diagnostic message (Chromium): V524 It is odd that the body of 'end' function is fully equivalent to the body of 'begin' function. event_packet.h 36

The functions begin() and end() return one and the same value. I guess the function begin() must look differently:

InputEvents::const_iterator begin() const
  { return events_.begin(); }

Unstable function rdtsc()

__inline__ unsigned long long int rdtsc()
{
#ifdef __x86_64__
  unsigned int a, d;
  __asm__ volatile ("rdtsc" : "=a" (a), "=d" (d));
  return (unsigned long)a | ((unsigned long)d << 32);
#elif defined(__i386__)
  unsigned long long int x;
  __asm__ volatile ("rdtsc" : "=A" (x));
  return x;
#else
#define NO_CYCLE_COUNTER
  return 0;
#endif
}

PVS-Studio's diagnostic message (the library SMHasher): V629 Consider inspecting the '(unsigned long) d << 32' expression. Bit shifting of the 32-bit value with a subsequent expansion to the 64-bit type. Platform.h 78

This function works, but not all the time. It may fail if the long type appears 32-bit - an overflow will occur in the "(unsigned long)d << 32" expression. To avoid it, the code must be changed as shown below:

return (unsigned long long)a |
       ((unsigned long long)d << 32);

The great and terrible break

Programmers are constantly forgetting about the keyword 'break' in case statements. No matter how skilled, any programmer may forget to write it in any place of code. Be careful!

The first example:

static v8::Handle<v8::Value>
toV8Object(....)
{
  switch (extension->name()) {
    ....
    case WebGLExtension::WebGLCompressedTextureATCName:
      extensionObject = toV8(....);
      referenceName = "webGLCompressedTextureATCName";
    case WebGLExtension::WebGLCompressedTexturePVRTCName:
      extensionObject = toV8(....);
      referenceName = "webGLCompressedTexturePVRTCName";
      break;
  }
  ....
}

PVS-Studio's diagnostic messages (the library WebKit):

  • V519 The 'extensionObject' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 222, 225. V8WebGLRenderingContextCustom.cpp 225
  • V519 The 'referenceName' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 223, 226. V8WebGLRenderingContextCustom.cpp 226

There's nothing to discuss here. It's just a missing 'break', that's it.

The second example:

bool ScriptDebugServer::executeSkipPauseRequest(....)
{
  const char* v8MethodName;
  switch (request)
  {
    case ScriptDebugListener::NoSkip:
      return false;
    case ScriptDebugListener::Continue:
      return true;
    case ScriptDebugListener::StepInto:
      v8MethodName = stepIntoV8MethodName;
    case ScriptDebugListener::StepOut:
      v8MethodName = stepOutV8MethodName;
  }
  ....
}

PVS-Studio's diagnostic message (the library WebKit): V519 The 'v8MethodName' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 412, 414. ScriptDebugServer.cpp 414

Andrey Karpov has sent me a few more code fragments, but they are not that interesting, so let me skip them.

Here's just one example of those:

int linux_get_device_address (....,
  uint8_t *busnum, uint8_t *devaddr,
  ....)
{
  ....
  *busnum = __read_sysfs_attr(ctx, sys_name, "busnum");
  if (0 > *busnum)
    return *busnum;
  ....
}

PVS-Studio's diagnostic message (the library LibUSB): V547 Expression '0 > * busnum' is always false. Unsigned type value is never < 0. linux_usbfs.c 620

The pointer 'busnum' refers to an unsigned variable of the uint8_t type. It means that the (0 > *busnum) condition will never be true.

It's a genuine bug indeed, but it's dull. For you not to get bored, let me finish here.

Conclusion, or a note for the Chromium developers

You surely know that PVS-Studio regularly finds bugs in Chromium's code. Now you see that PVS-Studio can be easily integrated into the build system you employ. We are ready to help you with whatever trouble you face in that aspect. So it's up to you now to decide if you want to enhance Chromium's quality with our skills combined. I do recommend you to try PVS-Studio on your project!

P.S. No NDA's were violated while writing this article; all information used here was obtained from public sources.