Valgrind: Good but Not Enough

Sep 10 2014

Author: Andrey Karpov

Not so long ago we tried to demonstrate the benefit of using the PVS-Studio static analyzer to one company. We failed, but while corresponding with them, I prepared a detailed answer about the static and dynamic analysis methodologies. Now I've decided to publish this answer in the form of a small article. I think the readers will find it interesting, and I also will be able to use this article later on to refer our new potential customers to.

So, in the course of our correspondence, I was asked a question sounding somewhat like this:

We already experimented with static analyzers and have come to the conclusion that their accuracy is much lower than that of the ordinary valgrind. So we can't see what the purpose of static analysis is all about. It produces too many false positives and finds almost no bugs among those that valgrind cannot detect when running.

I have prepared the following answer I'm publishing here with just a few corrections:

It's not that easy to demonstrate the strengths of static analysis on a couple of small projects. First, their code is of good quality. Second, static analysis is first of all meant for searching and eliminating bugs in fresh code. Third, the error density in small projects is lower than in larger ones (explanation).

Trying to find anything in a long and stably working code is quite an ungrateful task. What static analysis is all about is to prevent numbers of bugs at the very early stages. Yes, most of these bugs can be found through our methods: either by the programmer himself or by large tests or by testers. In the worst case, they will be reported by users. But anyway, it will be just a waste of time. Many typos, Copy-Paste related bugs and other defects can be eliminated at the very early stages through static analysis. What is most valuable about it is that it can find most bugs right once the code is written. Finding a bug at any other next stage is many times more expensive.

Usually, when I'm saying this, for some reason everyone will start telling me their programmers never make typos and Copy-Paste mistakes. That's not true - they do. Everyone does: http://www.viva64.com/en/b/0260/

OK, suppose we've got you to believe us now and agree that static analysis can find some bugs. But then you would ask a fair question, "Do we need it at all with tools like valgrind that obviously produce fewer false positives?"

Yes, you do, unfortunately. There is no technology that could detect all the known error types at once. It's sad, but you have to use tools of different types complementing each other to improve your code's quality.

We already wrote some time ago about how static analysis complements other technologies. For instance, see the following post about the differences between static and dynamic code analysis: http://www.viva64.com/en/b/0248/

And here is another post about how static analysis complements unit testing: http://www.viva64.com/en/a/0080/

But in order not to sound abstract, I'd like to try to explain you the difference between the two types of analysis by a few examples. For instance, let's discuss the following interesting fragment in the SlowScanner class' constructor:

class SlowScanner {
  ....
  explicit SlowScanner(Fsm& fsm)
  {
    ....
    Fill(m_letters,
         m_letters + sizeof(m_letters)/sizeof(*m_letters), 0);
    ....
  }
  ....
  size_t* m_letters;
  ....
}

The PVS-Studio analyzer generates the following warning on this: V514 Dividing sizeof a pointer 'sizeof (m_letters)' by another value. There is a probability of logical error presence. slow.h 238

It seems that the 'm_letters' class member used to be a static array in the past. It's just an assumption of course, but it's highly probable. Imagine it used to be something like this at first: size_t m_letters[MAX_COUNT];. In those times, the array size was defined correctly:

sizeof(m_letters)/sizeof(*m_letters)

Then this array turned into a dynamic one, the 'm_letters' variable becoming an ordinary pointer. Now the "sizeof(m_letters)/sizeof(*m_letters)" expression will always evaluate to one. In a 32-bit system, the pointer size and the size of the size_t type equal 4. In a 64-bit system, their sizes will equal 8. However, regardless of whether we divide 4 by 4 or 8 by 8, we always get 1.

So, the Fill() function appears to clear only one byte. The error may well remain unrevealed if the memory has already been accidentally cleared or if uninitialized items are not used. And this is what makes this error especially tricky. You can't be sure about uninitialized items not being used.

Can a dynamic analyzer find this bug? I don't know for sure. Perhaps it can detect reading from uninitialized memory, but why does it keep silent then? This is where we are facing one of the basic differences between static and dynamic analysis.

Most likely, this code branch is executed quite rarely or at least not covered by tests. Because of that, a dynamic analyzer simply skips this code and fails to notice the bug. The weak point of dynamic analysis is that it's too difficult to cover all the possible code branches with tests, which results in some rarely used code remaining untested - and that is especially common among handlers of errors and non-standard situations of all sorts.

Static analysis, on the contrary, checks all the branches that theoretically can get control. That's why it can detect errors regardless of how often certain code is executed.

Well, let's distract from the main topic for a while. We not only offer you our analyzer, but our services in code audit as well. Depending on the results of such code audit, we may work out a document with a set of recommendations on improving your code which you can include into your coding standard. We already have experience in this job. For example, to avoid errors related to array size calculation, we recommend using a special technology (borrowed from Chromium):

template <typename T, size_t N>
char (&ArraySizeHelper(T (&array)[N]))[N];
#define arraysize(array) (sizeof(ArraySizeHelper(array)))

The 'arraysize' macro cannot be applied to an ordinary pointer as it causes a compilation error. This helps us protect our code from accidental errors. If it happens that an array turns into a pointer, the program won't be allowed to skip the place where its size is calculated.

Let's get back to static and dynamic analysis. Take a look at the following function:

inline RECODE_RESULT _rune2hex(wchar32 in,
  char* out, size_t out_size, size_t &out_writed)
{
    static const char hex_digs[]="0123456789ABCDEF";
    bool leading = true;
    out_writed = 0;
    RECODE_RESULT res = RECODE_OK;
    for (int i = 7; i >=0; i--){
        unsigned char h = (unsigned char)(in>>(i*4) & 0x0F);
        if (h || !leading || i==0){
            if (out_writed + 1 >= out_size){
                res = RECODE_EOOUTPUT;
                break;
            }
            out[out_writed++] = hex_digs[h];
        }
    }
    return res;
}

From the viewpoint of dynamic analysis, this code contains nothing to worry about. In its turn, the PVS-Studio static analyzer draws your attention to the 'leading' variable: V560 A part of conditional expression is always false: !leading. recyr_int.hh 220

I don't think there is any bug here. The 'leading' variable just became redundant after refactoring. But what if I'm wrong? What if the code is incomplete? This is surely the place the programmer should examine - and remove the variable if it is redundant so it doesn't mislead both the analyzer and those people who will be maintaining the code in future.

Warnings about some part of an expression always being a constant may feel too plain. Then check a few examples of errors found through the V560 diagnostic; they'll make you wonder at what unusual things are sometimes found in code: http://www.viva64.com/en/examples/V560/

Such errors cannot be found by dynamic analysis - it just has got nothing to look for here. They are just incorrect logical expressions.

Unfortunately, the offered projects don't allow us to demonstrate the static analyzer's advantages in full. So let's take one of the libraries included into the project. A bug in a library is in a sense a bug in the project itself, isn't it?

Here is the sslDeriveKeys function working with private data:

int32 sslDeriveKeys(ssl_t *ssl)
{
  ....
  unsigned char buf[SSL_MD5_HASH_SIZE + SSL_SHA1_HASH_SIZE];
  ....
  memset(buf, 0x0, SSL_MD5_HASH_SIZE + SSL_SHA1_HASH_SIZE);

  psFree(ssl->sec.premaster);
  ssl->sec.premaster = NULL;
  ssl->sec.premasterSize = 0;
skipPremaster:
  if (createKeyBlock(ssl, ssl->sec.clientRandom,
        ssl->sec.serverRandom,
        ssl->sec.masterSecret, SSL_HS_MASTER_SIZE) < 0)
  {
    matrixStrDebugMsg("Unable to create key block\n", NULL);
    return -1;
  }
  return SSL_HS_MASTER_SIZE;
}

A dynamic analyzer won't find anything in this fragment. The code is absolutely correct from the language viewpoint. To find the error, we need a different way of thinking by higher-level patterns, which static analyzers are good in.

We are interested in the local array 'buf'. Since it stores private data, the program attempts to clear it before leaving the function with the help of the memset() function. And this is just what the error is about.

The local array 'buf' is not used anymore after calling memset(). It means the compiler is allowed to delete the call of the memset() function as it doesn't affect the code in any way from the viewpoint of the C/C++ language. Moreover, it is not only allowed to but will certainly do so in the release version.

It will result in the private data remaining in memory and quite probably getting where they shouldn't be. Thus, a bug in a third-party library makes the whole project a bit more vulnerable.

PVS-Studio generates the following warning on this: V597 The compiler could delete the 'memset' function call, which is used to flush 'buf' buffer. The RtlSecureZeroMemory() function should be used to erase the private data. sslv3.c 123

This error is a potential vulnerability. You may think it to be too insignificant. But it really can cause quite unpleasant consequences including sending fragments of private data by network. See the article by an ABBYY specialist Dmitry Meshcheryakov to find out how such "miracles" occur: http://www.viva64.com/en/k/0041/

I hope I've managed to make the differences between static and dynamic code analyzers clear enough to you. These two approaches do complement each other pretty well. Static analysis generating too many false positives is not a crucial problem. They can be handled and eliminated by customizing the analyzer. If you are interested in that, we can help you customize it for you to reduce the number of false positives to an amount comfortable to work with.

If we have got you interested, we suggest working out further steps of our potential cooperation and demonstration of the analyzer's capabilities on large living real-life projects.

#Cpp #StaticAnalysis