A User's Experience of Working with the Analyzer

25.10.2013 Alexander Lotokhov

This text is a copy of a post by one PVS-Studio user, originally published in Russian here. It was kind of Alexander to permit us to publish it at our website and translate it into English.

When the PVS-Studio team announced that they had finally released a standalone version which didn't require you to have Visual Studio installed to be able to work with it, I certainly couldn't but try it :) Before that I already had experimented with the trial version on one of our old projects. And now I got a chance to check the code of our recent project built in the AVR Studio IDE (it's eclipse-based).

To be able to work with the analyzer, you need special files generated by the preprocessor. The AVR environment can do that, but there's one subtle nuance: when you turn on the flag "Preprocessor only" you really get the preprocessed files - but they still have the .o extension instead of the .i you expected. Well, it took me 5 minutes to write a Python script to solve this small problem, and here we go - the analyzer runs well!

I was quite surprised to get very few messages - just about a couple of dozens. Most of them were insignificant warnings or false positives (one and the same value is written twice in a row into the register in embedded, and the analyzer considers it a potential error (and I agree with it on this point - you'd always better play safe and check such places to be sure)).

In a few fragments real misprints and copy-paste mistakes were found. For example, a variable of one enum is compared to a value of another enum. Or, one and the same variable is assigned two different values in a row (however, as I said before, it was in most cases false positives triggered by writing sequences into the register).

But it was the only message that I found most interesting and which made me write this post: "Possible NULL pointer dereferencing"...

It happened so that all throughout the code we used a construct of this kind:

void fun(error_t * perr)
{
 *perr = SUCCESS;
 ...
 if (something)
 {
    *perr = SOME_ERROR;
 }
}

And just in a few functions it was a bit different:

void init(void)
{
  error_t err = SUCCESS;
  ...
  fun(&err);
}

And one day, after a small refactoring, we got the following code in one fragment:

void some_init(void)
{
  error_t *perr = SUCCESS;
  ...
  some_fun(perr);
}

It was this particular line that the analyzer was angry with. SUCCESS was 0, of course.

Let's now rewind the time a bit - to the place when this change was brought into the repository.

A pretty large suite of automatic tests continued to be passed successfully after the refactoring. Code review failed discover the problem in that line (we've got too many lines *perr = SUCCESS in the code, so it was no wonder we missed that particular one).

About 30 days later after that commit the night tests crashed for the first time. We failed to reproduce the crash.

Then they crashed again. And again. We found out experimentally that the crash occurred once per thirty runs of the test suite on the average.

Our team spent about 50 hours to track the bug. All in vain. Well, we actually managed to locate the commit after which we started having troubles - but we never revealed the reason for crashes itself.

By the way, it was lying two steps lower. The function some_fun(perr) contained a call of some_other_fun(perr) which, in its turn, called some_third_fun(perr). And this last function had a code checking for errors:

for(number_of_loops)
{
  some_action(perr);
  if (*perr != SUCCESS)
    return;
}

That is, despite that no errors occurred in the function some_action (which was quite uncommon, as it made use of a whole lot of external periphery, because of which we had troubles trying to locate the problem), whether or not the loop continued depended on the value stored at the 0 address (in embedded a zero address is legal in most cases). And in most cases we had 0 written at this address.

The conclusion is: the bug we had wasted about 50 hours to track was detected at once with the first run of the analyzer and fixed in less than an hour!

A convincing argument to start using the analyzer, isn't it? Alas, not always. In particular, ours was that very case when it is not so: since we get paid for the project on the time and material scheme and those 50 hours were paid for by the customer, integration of the analyzer implies real direct damages for our administration :(((

And one more thing: the project employs FreeRTOS - and you know, we got not a single warning on it during the check!

And yes, this post was written solely from my love to analyzers.

Note by PVS-Studio developers.

Thank you for your post. We were pleasantly surprised by it: first, because you've described a positive experience; second, because you managed to cope with an AVR Studio project, though we didn't adapt the analyzer to that IDE in any way.By the way, this article shows a good example of how one should not use a static analyzer. If one used it regularly (see incremental analysis ), an error like the described one and perhaps many of plainer bugs would have never occurred. This is not the case, of course (we don't have a plugin for AVR Studio currently), but this is a thing to keep in mind for those who use Visual Studio or Embarcadero RAD Studio.Thanks again for the article.References: