Reanalyzing the Notepad++ project

13.02.2012 Andrey Karpov

More than a year has passed since we analyzed Notepad++ with PVS-Studio. We wanted to see how much better the PVS-Studio analyzer has become since then and which of the previous errors have been fixed in Notepad++.

Introduction

So, we have checked the Notepad++ project taken from the repository on 31-st January, 2012. We used the PVS-Studio analyzer (version 4.54) for analysis.

Picture 0

As we have already said, we checked this project earlier. There were not too many errors in it, but still we found something worth fixing. In the project's new version, some of the errors have been fixed, while some others haven't. It is strange. Notepad++'s authors must have missed our previous note and did not use PVS-Studio to check the project. Perhaps the current note will draw their attention, especially now that we have changed the trial mode that enables you to see all the errors found by PVS-Studio.

The fixed errors must have been detected through other testing methods or reported by users.

For example, they have fixed the error described in the previous article regarding filling the _iContMap array. This is how it looked:

memset(_iContMap, -1, CONT_MAP_MAX);

The current fixed code:

memset(_iContMap, -1, CONT_MAP_MAX * sizeof(int));

But the following error is still alive and well:

bool isPointValid() {
  return _isPointXValid && _isPointXValid;
};

PVS-Studio's diagnostic message:

V501 There are identical sub-expressions to the left and to the right of the '&&' operator: _isPointXValid && _isPointXValid Notepad++ parameters.h 166

We won't return to those errors described in the previous article. Instead, we will examine some new defects that the PVS-Studio analyzer has learned to diagnose during the passed year.

As usually, we want to note that these are not all the defects we have found; we cite only those which seem interesting to write about in the article. Do not forget about our new trial mode that suits the task of checking projects (open-source ones as well) best of all.

New errors we have found

Error N1. Array overrun

int encodings[] = {
  1250, 
  1251, 
  1252, 
  ....
};

BOOL CALLBACK DefaultNewDocDlg::run_dlgProc(
  UINT Message, WPARAM wParam, LPARAM)
{
  ...
  for (int i = 0 ; i <= sizeof(encodings)/sizeof(int) ; i++)
  {
    int cmdID = em->getIndexFromEncoding(encodings[i]);
  ...
}

PVS-Studio's diagnostic message:

V557 Array overrun is possible. The value of 'i' index could reach 46. Notepad++ preferencedlg.cpp 984

The "encodings" array's items are searched in the loop. The error is this: an incorrect condition is used to exit the loop. It can be fixed through replacing the "<=" condition with "<". This is the correct code:

for (int i = 0 ; i < sizeof(encodings)/sizeof(int) ; i++)

Error N2. Incorrect calculation of buffer size

typedef struct tagTVITEMA {
  ...
  LPSTR     pszText;
  ...
} TVITEMA, *LPTVITEMA;

#define TVITEM TVITEMA

HTREEITEM TreeView::addItem(...)
{
  TVITEM tvi;
  ...
  tvi.cchTextMax =
    sizeof(tvi.pszText)/sizeof(tvi.pszText[0]); 
  ...
}

PVS-Studio's diagnostic message:

V514 Dividing sizeof a pointer 'sizeof (tvi.pszText)' by another value. There is a probability of logical error presence. Notepad++ treeview.cpp 88

The "sizeof(tvi.pszText)/sizeof(tvi.pszText[0])" expression is used to calculate the buffer size. This expression is meaningless. The pointer size in it is divided by the size of one character. We cannot say how to fix this code, as we are not familiar with the logic of program operation.

Error N3. Incorrect check that a string is empty

size_t Printer::doPrint(bool justDoIt)
{
  ...
  TCHAR headerM[headerSize] = TEXT("");
  ...
  if (headerM != '\0')
  ...
}

PVS-Studio's diagnostic message:

V528 It is odd that pointer to 'char' type is compared with the '\0' value. Probably meant: *headerM != '\0'. Notepad++ printer.cpp 380

The pointer is compared to null. The null is declared by the '\0' literal. It indicates that the programmer forgot to dereference the pointer here. This is the correct code:

if (*headerM != '\0')

There is one more identical error in another place:

V528 It is odd that pointer to 'char' type is compared with the '\0' value. Probably meant: *headerR != '\0'. Notepad++ printer.cpp 392

Error N4. Misprint in condition

DWORD WINAPI Notepad_plus::threadTextPlayer(void *params)
{
  ...
  const char *text2display = ...;
  ...
  if (text2display[i] == ' ' && text2display[i] == '.')
  ...
}

PVS-Studio's diagnostic message:

V547 Expression is always false. Probably the '||' operator should be used here. Notepad++ notepad_plus.cpp 4967

The (text2display[i] == ' ' && text2display[i] == '.') never holds. The character cannot be a space and a dot at a time. We must be dealing with a mere misprint here, so the code should look as follows:

if (text2display[i] == ' ' || text2display[i] == '.')

There is one more identical error in another place:

V547 Expression is always false. Probably the '||' operator should be used here. Notepad++ notepad_plus.cpp 5032

Error N5. Misprint in condition

int Notepad_plus::getHtmlXmlEncoding(....) const
{
  ...
  if (langT != L_XML && langT != L_HTML && langT == L_PHP)
    return -1;
  ...
}

PVS-Studio's diagnostic message:

V590 Consider inspecting this expression. The expression is excessive or contains a misprint. Notepad++ notepad_plus.cpp 853

The check in this code can be simplified. The code will look as follows:

if (langT == L_PHP)

This is obviously not what the programmer intended. Perhaps we deal with a misprint here again. This is the correct code:

if (langT != L_XML && langT != L_HTML && langT != L_PHP)

Error N6. Incorrect bit handling

TCHAR GetASCII(WPARAM wParam, LPARAM lParam)
{
  ...
  result=ToAscii(wParam,(lParam >> 16) && 0xff,
    keys,&dwReturnedValue,0);
  ...
}

PVS-Studio's diagnostic message:

V560 A part of conditional expression is always true: 0xff. Notepad++ babygrid.cpp 694

The programmer wanted to extract the third byte from the 'lParam' variable. A misprint prevents the code from doing this. The error is this: the '&&' operator is used instead of '&'. This is the correct code:

result=ToAscii(wParam,(lParam >> 16) & 0xff,
  keys,&dwReturnedValue,0);

Error N7. Incomplete code

#define SCE_T3_BRACE 20
static inline bool IsAnOperator(const int style) {
  return style == SCE_T3_OPERATOR || SCE_T3_BRACE;
}

PVS-Studio's diagnostic message:

V560 A part of conditional expression is always true: 20. lextads3.cxx 700

The IsAnOperator() function always returns 'true'. This is the correct code:

return style == SCE_T3_OPERATOR ||
       style == SCE_T3_BRACE;

Error N8. Code that will never be executed

static void ColouriseVHDLDoc(....)
{
  ...
      } else if (sc.Match('-', '-')) {
        sc.SetState(SCE_VHDL_COMMENT);
        sc.Forward();
      } else if (sc.Match('-', '-')) {
        if (sc.Match("--!"))
          sc.SetState(SCE_VHDL_COMMENTLINEBANG);
        else
          sc.SetState(SCE_VHDL_COMMENT);
      }
  ...
}

PVS-Studio's diagnostic message:

V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 130, 133. lexvhdl.cxx 130

If the first condition (sc.Match('-', '-')) is true, the second check won't be executed. It will cause an issue when the case of a sequence of "--!" characters will never be handled correctly. This part of the code seems to be surplus, and the correct code should look in this way:

static void ColouriseVHDLDoc(....)
{
  ...
      } else if (sc.Match('-', '-')) {
        if (sc.Match("--!"))
          sc.SetState(SCE_VHDL_COMMENTLINEBANG);
        else
          sc.SetState(SCE_VHDL_COMMENT);
      }
  ...
}

Error N9. Surplus code

There are code fragments that don't cause problems but they are surplus. Let's cite two examples of this type:

void Gripper::doTabReordering(POINT pt)
{
  ...
  else if (_hTab == hTabOld)
  {
    /* delete item on switch between tabs */
    ::SendMessage(_hTab, TCM_DELETEITEM, iItemOld, 0);
  }
  else
  {
    if (_hTab == hTabOld)
    {
      /* delete item on switch between tabs */
      ::SendMessage(_hTab, TCM_DELETEITEM, iItemOld, 0);
    }
  }
  ...
}

PVS-Studio's diagnostic message:

V571 Recurring check. The 'if (_hTab == hTabOld)' condition was already verified in line 478. Notepad++ gripper.cpp 485

The second part of the code is meaningless and can be deleted.

And here is another example where the code contains surplus checks. The 'mainVerStr' and 'auxVerStr' pointers are always not equal to zero, as these are arrays created on the stack:

LRESULT Notepad_plus::process(....)
{
  ...
  TCHAR mainVerStr[16];
  TCHAR auxVerStr[16];
  ...
  if (mainVerStr)
    mainVer = generic_atoi(mainVerStr);
  if (auxVerStr)
    auxVer = generic_atoi(auxVerStr);
  ...
}

This code can be written in a simpler way:

mainVer = generic_atoi(mainVerStr);
auxVer = generic_atoi(auxVerStr);

There are many checks like the ones shown above to be found in the Notepad++ project:

V600 Consider inspecting the condition. The 'mainVerStr' pointer is always not equal to NULL. Notepad++ nppbigswitch.cpp 938

V600 Consider inspecting the condition. The 'auxVerStr' pointer is always not equal to NULL. Notepad++ nppbigswitch.cpp 940

V600 Consider inspecting the condition. The 'intStr' pointer is always not equal to NULL. Notepad++ preferencedlg.cpp 1871

V600 Consider inspecting the condition. The 'intStr' pointer is always not equal to NULL. Notepad++ userdefinedialog.cpp 222

V600 Consider inspecting the condition. The 'intStr' pointer is always not equal to NULL. Notepad++ wordstyledlg.cpp 539

Conclusions

Static code analyzers are not tools to be used from time to time. Regularly using them will help you quickly find errors, thus making the process of eliminating them tens of times cheaper.

Why should you waste your time looking for a fragment with a strange program behavior occurring because of an uncleared array? The code "memset(_iContMap, -1, CONT_MAP_MAX)" can be quickly and easily found by a static analyzer.

Even if this error was found by the PVS-Studio static analyzer, the tool was used in a wrong way. First, the other diagnostic messages were not studied with proper attention. Second, static analysis should be used regularly. It allows you to quickly eliminate errors in a fresh code. In addition, we regularly add new diagnostic rules into PVS-Studio.

I would like to draw your attention once again to the idea that a static analyzer is a tool to be used regularly. It is like expanding the list of warnings generated by the compiler. Do you work with the warnings off and turn them on just from time to time, when you feel like doing it? Of course, no. You should consider diagnostic warnings generated by static analysis tools in the same way.

How can you do that? You can launch analysis at night on the server. You can use incremental analysis to check freshly modified files. If it seems to you that analysis is slow and consumes many resources, see Tips on speeding up PVS-Studio.