Andrey Karpov

Mar 09 2011

Tags:

#Cpp

How to make fewer errors at the stage of code writing. Part N1

Introduction
1. Avoid functions memset, memcpy, ZeroMemory and the like
2. Watch closely and check if you are working with a signed or unsigned type
3. Avoid too many calculations in one line
4. Align everything you can in code
5. Do not copy a line more than once
6. Set a high warning level of your compiler and use static analyzers
Summary
P.S.

I got my hands on the source code of Miranda IM, a widely known instant messenger. Together with its various plugins, it is a rather large project of about 950 thousand lines of C and C++ code. And like any other sizable project with a long development history, it contains quite a few errors and misprints.

Introduction

While examining defects in various applications, I noticed some regularities. By the examples of defects found in Miranda IM, I will try to formulate some recommendations that will help you to avoid many errors and misprints already at the stage of code writing.

I used the PVS-Studio 4.14 analyzer to check Miranda IM. The Miranda IM project's code is of rather high quality, and its popularity confirms this. I use this messenger myself and have no complaints about its quality. The project is built in Visual Studio with Warning Level 3 (/W3), and comments make up 20% of the program's source.

1. Avoid functions memset, memcpy, ZeroMemory and the like

I will start with errors that occur when using low-level functions to handle memory such as memset, memcpy, ZeroMemory and the like.

I recommend that you avoid these functions whenever you can. Sure, you do not have to follow this tip literally and replace all these functions with loops. But I have seen so many errors related to these functions that I strongly advise you to be very careful with them and use them only when it is really necessary. In my opinion, there are only two cases where using them is justified:

1) Processing of large arrays, i.e. in those places where you can really benefit from an optimized function algorithm, as compared to simple looping.

2) Processing a large number of small arrays. The reason here is also the performance gain.

In all other cases, you had better try to do without them. For instance, I believe that these functions are unnecessary in a program such as Miranda. There are no resource-intensive algorithms or large arrays in it. So, the use of memset/memcpy is motivated only by the convenience of writing short code. But this simplicity is very deceptive: having saved a couple of seconds while writing the code, you may spend weeks catching an elusive memory corruption error. Let's examine several code samples taken from the Miranda IM project.

V512 A call of the 'memcpy' function will lead to a buffer overflow or underflow. tabsrmm utils.cpp 1080

typedef struct _textrangew
{
  CHARRANGE chrg;
  LPWSTR lpstrText;
} TEXTRANGEW;

const wchar_t* Utils::extractURLFromRichEdit(...)
{
  ...
  ::CopyMemory(tr.lpstrText, L"mailto:", 7);
  ...
}

Only a part of the string is copied here. CopyMemory takes the size in bytes, and with 2-byte wchar_t characters on Windows, 7 bytes cover only three and a half characters of "mailto:". The error is awfully simple, yet it remains. Most likely, there was a string consisting of 'char' earlier. Then they switched to Unicode strings but forgot to change the constant.

If you copy strings using functions designed specifically for this purpose, this error cannot occur. Imagine that this code sample had been written this way:

strncpy(tr.lpstrText, "mailto:", 7);

Then the programmer would not have had to change the number 7 when switching to Unicode strings:

wcsncpy(tr.lpstrText, L"mailto:", 7);

I am not saying that this code is ideal. But it is much better than using CopyMemory. Consider another sample.

V568 It's odd that the argument of sizeof() operator is the '& ImgIndex' expression. clist_modern modern_extraimage.cpp 302

void ExtraImage_SetAllExtraIcons(HWND hwndList,HANDLE hContact)
{
  ...
  char *(ImgIndex[64]);
  ...
  memset(&ImgIndex,0,sizeof(&ImgIndex));
  ...
}

The programmer intended to empty the array consisting of 64 pointers here. But only the first item will be emptied instead. The same error, by the way, can be also found in another file. Thanks to our favorite Copy-Paste:

V568 It's odd that the argument of sizeof() operator is the '& ImgIndex' expression. clist_mw extraimage.c 295

The correct code must look this way:

memset(&ImgIndex,0,sizeof(ImgIndex));

By the way, taking the address from the array might additionally confuse the one who is reading the code. Taking of the address here is unreasonable and the code may be rewritten this way:

memset(ImgIndex,0,sizeof(ImgIndex));

The next sample.

V568 It's odd that the argument of sizeof() operator is the '& rowOptTA' expression. clist_modern modern_rowtemplateopt.cpp 258

static ROWCELL* rowOptTA[100];

void rowOptAddContainer(HWND htree, HTREEITEM hti)
{
  ...
  ZeroMemory(rowOptTA,sizeof(&rowOptTA));
  ...
}

Again, it is the pointer's size which is calculated instead of the array's size. The correct expression is "sizeof(rowOptTA)". I suggest using the following code to clear the array:

const size_t ArraySize = 100;
static ROWCELL* rowOptTA[ArraySize];
...
std::fill(rowOptTA, rowOptTA + ArraySize, nullptr);

I have grown used to seeing such lines multiply across the code through copy-paste:

V568 It's odd that the argument of sizeof() operator is the '& rowOptTA' expression. clist_modern modern_rowtemplateopt.cpp 308

V568 It's odd that the argument of sizeof() operator is the '& rowOptTA' expression. clist_modern modern_rowtemplateopt.cpp 438
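The array-clearing mistakes above share one cause: the size is spelled out by hand. A minimal sketch of a helper that deduces the size instead is shown here; the name ClearArray is hypothetical and not taken from the project.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: resets every element of a built-in array.
// N is deduced by the compiler, and passing a plain pointer does not
// compile, so the sizeof(&array) slip has nowhere to occur.
template <typename T, std::size_t N>
void ClearArray(T (&items)[N])
{
    std::fill(items, items + N, T());   // T() is a null pointer for pointer types
}
```

With such a helper, a call like ClearArray(rowOptTA) would replace the ZeroMemory line, assuming rowOptTA is still declared as an array in that scope.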

Do you think that is all there is to low-level handling of arrays? Not quite. Read on, be afraid, and punish those who like to use memset.

V512 A call of the 'memset' function will lead to a buffer overflow or underflow. clist_modern modern_image_array.cpp 59

static BOOL ImageArray_Alloc(LP_IMAGE_ARRAY_DATA iad, int size)
{
  ...
  memset(&iad->nodes[iad->nodes_allocated_size], 
    (size_grow - iad->nodes_allocated_size) *
       sizeof(IMAGE_ARRAY_DATA_NODE),
    0);
  ...
}

This time, the size of the memory block to fill is calculated correctly, but the second and third arguments are swapped by mistake. Consequently, 0 bytes are filled. This is the correct code:

memset(&iad->nodes[iad->nodes_allocated_size], 0,
  (size_grow - iad->nodes_allocated_size) *
     sizeof(IMAGE_ARRAY_DATA_NODE));

I do not know how to rewrite this code fragment in a smarter way. To be more exact, you cannot make it smart without touching other fragments and data structures.

A question arises how to do without memset when handling such structures as OPENFILENAME:

OPENFILENAME x;
memset(&x, 0, sizeof(x));

It's very simple. Create an emptied structure using this method:

OPENFILENAME x = { 0 };
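The same initialization works for any plain structure. Below is a small sketch with a made-up structure standing in for OPENFILENAME, so that it can be compiled without the Windows headers; DialogParams and MakeEmptyParams are assumptions for illustration only.

```cpp
// Hypothetical stand-in for a WinAPI-style structure such as OPENFILENAME.
struct DialogParams
{
    unsigned long  structSize;
    const wchar_t* filter;
    wchar_t*       fileName;
    unsigned long  maxFile;
    int            flags;
};

// Returns a structure with every member zeroed, with no memset involved.
inline DialogParams MakeEmptyParams()
{
    DialogParams params = { 0 };   // members without an initializer are zeroed as well
    return params;
}
```

There is no size argument at all here, so neither a wrong sizeof nor swapped arguments are possible.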

2. Watch closely and check if you are working with a signed or unsigned type

The problem of confusing signed types with unsigned types might seem farfetched at first sight. But programmers make a big mistake by underestimating this issue.

In most cases, people do not like to check compiler's warning messages concerning the comparison of an int-variable to an unsigned-variable. Really, such code is usually correct. So programmers disable these warnings or just ignore them. Or, they resort to the third method - add an explicit type conversion to suppress the compiler's warning without going into details.

I suggest that you stop doing this and analyze the situation each time when a signed type meets an unsigned type. And in general, be careful about what type an expression has or what is returned by a function. Now examine several samples on this subject.

V547 Expression 'wParam >= 0' is always true. Unsigned type value is always >= 0. clist_mw cluiframes.c 3140

There is the id2pos function in program code which returns value '-1' for an error. Everything is OK with this function. In another place, the result of id2pos function is used as shown below:

typedef UINT_PTR WPARAM; 
static int id2pos(int id);
static int nFramescount=0;

INT_PTR CLUIFrameSetFloat(WPARAM wParam,LPARAM lParam)
{
  ...
  wParam=id2pos(wParam);
  if(wParam>=0&&(int)wParam<nFramescount)
    if (Frames[wParam].floating)
  ...
}

The problem is that the wParam variable has an unsigned type, so the condition 'wParam>=0' is always true. If the id2pos function returns '-1', the first check does not reject it, the second check passes because (int)wParam is -1, and we start using an invalid index into Frames.

I am almost sure that there was different code in the beginning:

if (wParam>=0 && wParam<nFramescount)

The Visual C++ compiler generated the warning "warning C4018: '<' : signed/unsigned mismatch". This very warning is enabled at Warning Level 3, with which Miranda IM is built. At that moment, the programmer paid little attention to this fragment and suppressed the warning with an explicit type conversion. But the error did not disappear; it only hid itself. This is the correct code:

if ((INT_PTR)wParam>=0 && (INT_PTR)wParam<nFramescount)

So, I urge you to be careful with such places. I counted 33 conditions in Miranda IM which are always true or always false due to confusion of signed/unsigned.
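One way to keep the '-1' error value intact is to store the lookup result in a signed variable before checking the range. The sketch below is self-contained, so IdToPos, kFramesCount and WParamLike are hypothetical stand-ins for id2pos, nFramescount and WPARAM rather than the project's real definitions.

```cpp
#include <cstdint>

typedef std::uintptr_t WParamLike;   // stand-in for WPARAM (UINT_PTR)

const int kFramesCount = 3;          // hypothetical number of frames

// Hypothetical lookup: returns the position of 'id', or -1 when not found.
inline int IdToPos(int id)
{
    return (id >= 100 && id < 100 + kFramesCount) ? id - 100 : -1;
}

// Keeps the result in a signed variable, so the -1 error value is still
// negative at the moment the range is checked.
inline bool IsValidFrame(WParamLike wParam)
{
    const int pos = IdToPos(static_cast<int>(wParam));
    return pos >= 0 && pos < kFramesCount;
}
```

The unsigned parameter is never reused to hold the signed result, so no cast is needed to silence the compiler.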

Let's go on. I especially like the next sample. And the comment, it is just beautiful.

V547 Expression 'nOldLength < 0' is always false. Unsigned type value is never < 0. IRC mstring.h 229

void Append( PCXSTR pszSrc, int nLength )
{
  ...
  UINT nOldLength = GetLength();
  if (nOldLength < 0)
  {
    // protects from underflow
    nOldLength = 0;
  }
  ...
}

I think this code needs no further explanation.

Of course, it is not only programmers' fault that errors appear in programs. Sometimes library developers play a dirty trick on us (in this case it is developers of WinAPI).

#define SRMSGSET_LIMITNAMESLEN_MIN 0
static INT_PTR CALLBACK DlgProcTabsOptions(...)
{
  ...
  limitLength =
    GetDlgItemInt(hwndDlg, IDC_LIMITNAMESLEN, NULL, TRUE) >=
    SRMSGSET_LIMITNAMESLEN_MIN ?
    GetDlgItemInt(hwndDlg, IDC_LIMITNAMESLEN, NULL, TRUE) :
    SRMSGSET_LIMITNAMESLEN_MIN;
  ...
}

If you ignore the excessively complicated expression, the code looks correct. By the way, it was one single line at first. I just arranged it into several lines to make it clearer. However, we are not discussing editing now.

The problem is that the GetDlgItemInt() function does not return 'int' as the programmer expected. It returns UINT. This is its prototype from the "WinUser.h" file:

WINUSERAPI
UINT
WINAPI
GetDlgItemInt(
    __in HWND hDlg,
    __in int nIDDlgItem,
    __out_opt BOOL *lpTranslated,
    __in BOOL bSigned);

PVS-Studio generates the following message:

V547 Expression is always true. Unsigned type value is always >= 0. scriver msgoptions.c 458

And it is really so. The "GetDlgItemInt(hwndDlg, IDC_LIMITNAMESLEN, NULL, TRUE) >= SRMSGSET_LIMITNAMESLEN_MIN" expression is always true.

Perhaps there is no error in this particular case. But I think you understand what I am driving at. Be careful and check what your functions actually return.
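A sketch of how such a comparison could be made meaningful: convert the unsigned result to int once, then compare. ReadSignedField below is a hypothetical stand-in for a call like GetDlgItemInt with bSigned set to TRUE, used so that the example compiles without the Windows headers. The conversion of a large unsigned value to int relies on the usual two's complement behavior of mainstream compilers.

```cpp
typedef unsigned int UIntLike;   // stand-in for the WinAPI UINT type

// Hypothetical stand-in for a function that hands a signed value back
// through an unsigned return type.
inline UIntLike ReadSignedField(int valueTypedByUser)
{
    return static_cast<UIntLike>(valueTypedByUser);
}

// Converts the result back to int once, then clamps it to the minimum.
inline int ReadLimit(int valueTypedByUser, int minimum)
{
    const int value = static_cast<int>(ReadSignedField(valueTypedByUser));
    return value >= minimum ? value : minimum;
}
```

Storing the result in a local variable also means the function is called once instead of twice, which shortens the expression.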

3. Avoid too many calculations in one line

Every programmer knows, and dutifully says in discussions, that one should write simple and clear code. But in practice it seems that programmers take part in a secret contest for the most intricate line featuring an interesting language construct or a display of pointer juggling.

Most often errors occur in those places where programmers gather several actions in one line to make code compact. Making code just a bit smarter, they risk misprinting or missing some side effects. Consider this sample:

V567 Undefined behavior. The 's' variable is modified while being used twice between sequence points. msn ezxml.c 371

short ezxml_internal_dtd(ezxml_root_t root, char *s, size_t len)
{
  ...
  while (*(n = ++s + strspn(s, EZXML_WS)) && *n != '>') {
  ...
}

We have undefined behavior here. This code might work correctly for a long time, but it is not guaranteed to behave the same way after moving to a different compiler version or changing optimization switches. The compiler might well calculate '++s' first and then call the function 'strspn(s, EZXML_WS)'. Or vice versa, it may call the function first and only then increment the 's' variable.
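Splitting the expression into separate statements removes the ambiguity. The sketch below assumes the intended order was "increment first, then skip whitespace"; kWhitespace is a hypothetical stand-in, since the real EZXML_WS definition is not shown in this article.

```cpp
#include <cstring>

// Hypothetical whitespace set standing in for EZXML_WS.
static const char kWhitespace[] = "\t\r\n ";

// Steps past the current character and then past any whitespace.
// 's' is modified in one statement and read in the next, so the order
// of evaluation is no longer in question.
inline const char* SkipCharAndWhitespace(const char* s)
{
    ++s;                                       // step 1: move to the next character
    return s + std::strspn(s, kWhitespace);    // step 2: skip whitespace from there
}
```

The loop condition would then test the returned pointer, at the cost of one extra line.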

Here you have another example on why you should not try to gather everything in one line. Some execution branches in Miranda IM are disabled/enabled with inserts like '&& 0'. For example:

if ((1 || altDraw) && ...
if (g_CluiData.bCurrentAlpha==GoalAlpha &&0)
if(checkboxWidth && (subindex==-1 ||1)) {

Everything is clear with these comparisons and they are well noticeable. Now imagine that you see a fragment shown below. I have edited the code but initially it was ONE SINGLE line.

V560 A part of conditional expression is always false: 0. clist_modern modern_clui.cpp 2979

LRESULT CLUI::OnDrawItem( UINT msg, WPARAM wParam, LPARAM lParam )
{
  ...
  DrawState(dis->hDC,NULL,NULL,(LPARAM)hIcon,0,
    (dis->rcItem.right+dis->rcItem.left-
    GetSystemMetrics(SM_CXSMICON))/2+dx,
    (dis->rcItem.bottom+dis->rcItem.top-
    GetSystemMetrics(SM_CYSMICON))/2+dx,
    0,0,
    DST_ICON|
    (dis->itemState&ODS_INACTIVE&&FALSE?DSS_DISABLED:DSS_NORMAL));
   ...
}

Even if there is no error here, it is still hard to spot and remember the word FALSE in this line. Have you found it? It is a difficult task, isn't it? And what if there is an error? You have no chance of finding it just by reviewing the code. Such expressions should be moved to separate lines. For example:

UINT uFlags = DST_ICON;
uFlags |= dis->itemState & ODS_INACTIVE && FALSE ?
            DSS_DISABLED : DSS_NORMAL;

Personally I would make this code longer yet clearer:

UINT uFlags;
if (dis->itemState & ODS_INACTIVE && (((FALSE))))
  uFlags = DST_ICON | DSS_DISABLED;
else 
  uFlags = DST_ICON | DSS_NORMAL;

Yes, this sample is longer, but it is easy to read and the word FALSE is easy to notice.

4. Align everything you can in code

Code alignment makes it less probable that you will misprint or make a mistake using Copy-Paste. If you still make an error, it will be much easier to find it during code review. Let's examine a code sample.

V537 Consider reviewing the correctness of 'maxX' item's usage. clist_modern modern_skinengine.cpp 2898

static BOOL ske_DrawTextEffect(...)
{
  ...
  minX=max(0,minX+mcLeftStart-2);
  minY=max(0,minY+mcTopStart-2);
  maxX=min((int)width,maxX+mcRightEnd-1);
  maxY=min((int)height,maxX+mcBottomEnd-1);
  ...
}

It is just a dense block of code, and it is not interesting to read at all. Let's edit it:

minX = max(0,           minX + mcLeftStart - 2);
minY = max(0,           minY + mcTopStart  - 2);
maxX = min((int)width,  maxX + mcRightEnd  - 1);
maxY = min((int)height, maxX + mcBottomEnd - 1);

This is not the most typical example but you agree that it is much easier to notice now that the maxX variable is used twice, don't you?

Do not take my recommendation on alignment literally writing columns of code everywhere. First, it requires some time when writing and editing code. Second, it may cause other errors. In the next sample you will see how that very wish to make a nice column caused an error in Miranda IM's code.

V536 Be advised that the utilized constant value is represented by an octal form. Oct: 037, Dec: 31. msn msn_mime.cpp 192

static const struct _tag_cpltbl
{
  unsigned cp;
  const char* mimecp;
} cptbl[] =
{
  {   037, "IBM037" },    // IBM EBCDIC US-Canada 
  {   437, "IBM437" },    // OEM United States 
  {   500, "IBM500" },    // IBM EBCDIC International 
  {   708, "ASMO-708" },  // Arabic (ASMO 708) 
  ...
}

Trying to make a nice column of numbers, you might be easily carried away and write '0' in the beginning making the constant an octal number.

So I define my recommendation more exactly: align everything you can in code, but do not align numbers by writing zeroes.
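The octal trap can be checked at compile time. In the sketch below, CodePageEntry and kCodePages are illustrative names that mirror the structure of the table above; the numbers are aligned with spaces only.

```cpp
// A leading zero makes an integer literal octal in C and C++.
static_assert(037 == 31, "037 is octal: 3 * 8 + 7");
static_assert(37 != 037, "the decimal and octal spellings differ");

// Hypothetical table entry mirroring the article's structure.
struct CodePageEntry
{
    unsigned    cp;
    const char* mimecp;
};

// Alignment is done with spaces; the numbers keep their decimal form.
static const CodePageEntry kCodePages[] =
{
    {  37, "IBM037" },
    { 437, "IBM437" },
    { 500, "IBM500" },
};
```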

5. Do not copy a line more than once

Copying lines in programming is inevitable. But you can protect yourself by not pasting the same line from the clipboard several times in a row. In most cases, you had better copy a line and then edit it. Then copy a line again and edit it. And so on. If you do so, it is much harder to forget to change something in a line or to change it wrongly. Let's examine a code sample:

V525 The code containing the collection of similar blocks. Check items '1316', '1319', '1318', '1323', '1323', '1317', '1321' in lines 954, 955, 956, 957, 958, 959, 960. clist_modern modern_clcopts.cpp 954

static INT_PTR CALLBACK DlgProcTrayOpts(...)
{
  ...
  EnableWindow(GetDlgItem(hwndDlg,IDC_PRIMARYSTATUS),TRUE);
  EnableWindow(GetDlgItem(hwndDlg,IDC_CYCLETIMESPIN),FALSE);
  EnableWindow(GetDlgItem(hwndDlg,IDC_CYCLETIME),FALSE);    
  EnableWindow(GetDlgItem(hwndDlg,IDC_ALWAYSPRIMARY),FALSE);
  EnableWindow(GetDlgItem(hwndDlg,IDC_ALWAYSPRIMARY),FALSE);
  EnableWindow(GetDlgItem(hwndDlg,IDC_CYCLE),FALSE);
  EnableWindow(GetDlgItem(hwndDlg,IDC_MULTITRAY),FALSE);
  ...
}

Most likely, there is no real error here; we just handle the item IDC_ALWAYSPRIMARY twice. However, you may easily make an error in such blocks of copied-pasted lines.
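When many lines differ only in one identifier, another option is to avoid the copies altogether and list the identifiers in a table. This is a sketch of that alternative, not the author's recommendation; the identifiers and the DisableControl function are hypothetical stand-ins for the IDC_* constants and the EnableWindow(GetDlgItem(...), FALSE) call.

```cpp
#include <cstddef>
#include <set>

// Hypothetical control identifiers; the real IDC_* values come from resource.h.
enum ControlId { kCycleTimeSpin, kCycleTime, kAlwaysPrimary, kCycle, kMultiTray };

// Hypothetical stand-in for disabling a dialog control: it only records
// which controls were disabled.
inline void DisableControl(std::set<int>& disabled, int id)
{
    disabled.insert(id);
}

// Each identifier is written once in the table and the call is written
// once in the loop, so there is no pasted line left to forget to edit.
inline std::set<int> DisableCycleControls()
{
    static const ControlId kToDisable[] =
        { kCycleTimeSpin, kCycleTime, kAlwaysPrimary, kCycle, kMultiTray };

    std::set<int> disabled;
    for (std::size_t i = 0; i < sizeof(kToDisable) / sizeof(kToDisable[0]); ++i)
        DisableControl(disabled, kToDisable[i]);
    return disabled;
}
```

A duplicated entry in the table is still possible, but it sits in a short list of names and is easier to notice than in a column of long calls.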

6. Set a high warning level of your compiler and use static analyzers

For many errors, there are no recommendations to give on how to avoid them. Most often they are misprints that both novices and skilled programmers make.

However, many of these errors can be detected as early as the code-writing stage: first of all with the help of the compiler, and then with the help of static code analyzers' reports after nightly runs.

Someone might say now that this is thinly veiled advertising. But actually it is just another recommendation that will help you to have fewer errors. If I have found errors using static analysis and cannot say how to avoid them in code, it means that using static code analyzers is just that very recommendation.

Now let's examine some samples of errors that may be quickly detected by static code analyzers:

V560 A part of conditional expression is always true: 0x01000. tabsrmm tools.cpp 1023

#define GC_UNICODE 0x01000

DWORD dwFlags;

UINT CreateGCMenu(...)
{
  ...
  if (iIndex == 1 && si->iType != GCW_SERVER &&
      !(si->dwFlags && GC_UNICODE)) {
  ...
}

We have a misprint here: the '&&' operator is used instead of '&' operator. I do not know how one could secure oneself against this error while writing code. This is the correct condition:

(si->dwFlags & GC_UNICODE)
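To see why the analyzer calls part of the original condition "always true", compare what the two operators compute. The function names below are made up for this demonstration; only the value 0x01000 comes from the code above.

```cpp
const unsigned long kGcUnicode = 0x01000;   // same value as GC_UNICODE above

// What the misprinted condition computes: true whenever both operands
// are non-zero, regardless of which bits are set.
inline bool TestLogical(unsigned long flags, unsigned long mask)
{
    return flags && mask;
}

// What was intended: true only when the bit selected by 'mask' is set.
inline bool TestBitwise(unsigned long flags, unsigned long mask)
{
    return (flags & mask) != 0;
}
```

For flags equal to 0x1, the logical version reports the Unicode flag as present while the bitwise version correctly reports it as absent.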

The next sample.

V528 It is odd that pointer to 'char' type is compared with the '\0' value. Probably meant: *str != '\0'. clist_modern modern_skinbutton.cpp 282

V528 It is odd that pointer to 'char' type is compared with the '\0' value. Probably meant: *endstr != '\0'. clist_modern modern_skinbutton.cpp 283

static char *_skipblank(char * str)
{
  char * endstr=str+strlen(str);
  while ((*str==' ' || *str=='\t') && str!='\0') str++;
  while ((*endstr==' ' || *endstr=='\t') &&
         endstr!='\0' && endstr<str)
    endstr--;
  ...
}

The programmer just missed two asterisks '*' for pointer dereferencing. The result might be fatal: this code is prone to access violation errors. This is the correct code:

while ((*str==' ' || *str=='\t') && *str!='\0') str++;
while ((*endstr==' ' || *endstr=='\t') &&
       *endstr!='\0' && endstr<str)
  endstr--;

Again I cannot give any particular tip except using special tools for code check.

The next sample.

V514 Dividing sizeof a pointer 'sizeof (text)' by another value. There is a probability of logical error presence. clist_modern modern_cachefuncs.cpp 567

#define SIZEOF(X) (sizeof(X)/sizeof(X[0]))

int Cache_GetLineText(..., LPTSTR text, int text_size, ...)
{
  ...
  tmi.printDateTime(pdnce->hTimeZone, _T("t"), text, SIZEOF(text), 0);
  ...
}

Everything is OK at first sight. The text and its length which is calculated with the SIZEOF macro are passed into the function. Actually this macro must be called COUNT_OF, but that's not the point. The point is that we are trying to calculate the number of characters in the pointer. It is "sizeof(LPTSTR) / sizeof(TCHAR)" which is calculated here. A human hardly notices such fragments but compiler and static analyzer see them well. This is the corrected code:

tmi.printDateTime(pdnce->hTimeZone, _T("t"), text, text_size, 0);
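In C++ code, an element-count helper can be written so that it rejects pointers at compile time. The sketch below uses the hypothetical name CountOf as a possible replacement for the SIZEOF macro.

```cpp
#include <cstddef>

// Hypothetical replacement for the SIZEOF macro. It accepts only real
// arrays: passing a pointer such as 'LPTSTR text' fails to compile
// instead of silently yielding sizeof(pointer) / sizeof(element).
template <typename T, std::size_t N>
constexpr std::size_t CountOf(T (&)[N])
{
    return N;
}
```

With this helper, the call CountOf(text) inside Cache_GetLineText would be a compile error, which points the programmer to text_size right away.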

The next sample.

V560 A part of conditional expression is always true: 0x29. icqoscar8 fam_03buddy.cpp 632

void CIcqProto::handleUserOffline(BYTE *buf, WORD wLen)
{
  ...
  else if (wTLVType = 0x29 && wTLVLen == sizeof(DWORD))
  ...
}

In such cases, I recommend you to write a constant first in the condition. The following code will simply not compile:

if (0x29 = wTLVType && sizeof(DWORD) == wTLVLen)

But many programmers, including myself, do not like this style. For instance, personally I get confused because I want to know first what variable is being compared and only then - to what it is being compared.

If the programmer does not want to use this comparison style, he has to either rely on the compiler/analyzer or take the risk.

By the way, this error is not a rare one despite being widely known among programmers. Here are three more examples from Miranda IM where the PVS-Studio analyzer generated the V559 warning:

else if (ft->ft_magic = FT_MAGIC_OSCAR)
if (ret=0) {return (0);}
if (Drawing->type=CLCIT_CONTACT)

The code analyzer also allows you to detect very suspicious places in code, if not errors. For instance, pointers serve not only as pointers in Miranda IM. In some places such games look fine, in other places they look scary. Here is a code sample that alerts me:

V542 Consider inspecting an odd type cast: 'char *' to 'char'. clist_modern modern_toolbar.cpp 586


static void
sttRegisterToolBarButton(..., char * pszButtonName, ...)
{
  ...
  if ((BYTE)pszButtonName)
    tbb.tbbFlags=TBBF_FLEXSIZESEPARATOR;
  else
    tbb.tbbFlags=TBBF_ISSEPARATOR;
  ...
}

Actually we are checking here whether the lowest byte of the string's address is non-zero, that is, whether the address is not a multiple of 256. I do not quite understand what the developers intended to write in this condition. Perhaps this fragment is even correct but I doubt it.
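To make the effect of the cast concrete: converting a pointer-sized value to a one-byte type keeps only its lowest 8 bits. The sketch below models the address as an integer; ByteLike and LowByteIsNonZero are illustrative names, not project code.

```cpp
#include <cstdint>

typedef unsigned char ByteLike;   // stand-in for the WinAPI BYTE type

// Mirrors the effect of (BYTE)pointer: only the lowest 8 bits of the
// address survive the conversion.
inline bool LowByteIsNonZero(std::uintptr_t address)
{
    return static_cast<ByteLike>(address) != 0;
}
```

So the condition is false for a null pointer, but also for any valid string that happens to start at an address whose lowest byte is zero.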

You may find a lot of incorrect conditions using code analysis. For example:

V501 There are identical sub-expressions 'user->statusMessage' to the left and to the right of the '&&' operator. jabber jabber_chat.cpp 214

void CJabberProto::GcLogShowInformation(...)
{
  ...
  if (user->statusMessage && user->statusMessage)
  ...
}

And so on and so forth. I could give you many other examples, but there is no need. The main point is that you can detect many errors with static analysis at the very early stages.

When a static analyzer finds few errors in your program, using it does not seem worthwhile. But this is a wrong conclusion. You see, you have already paid with blood and sweat, spending hours on debugging and correcting errors that the analyzer could have found at early stages.

Static analysis is valuable in software development as a regular practice, not as a tool for one-time checks. Many errors and misprints are detected during testing and unit-test development. But if you manage to find some of them while writing the code, you will save a great deal of time and effort. It is a pity to debug a program for two hours just to notice an unnecessary semicolon ';' after the 'for' operator. Usually you can get rid of such an error by spending 10 minutes on static analysis of the files changed during development.

Summary

In this article, I have shared only some of my ideas on how to avoid as many errors as possible in C++ programming. There are some other ideas I am pondering. I will try to write about them in future articles and posts.

P.S.

It has become a tradition to ask, after reading such an article, whether we have told the application's or library's developers about the errors found. So I will answer in advance the likely question of whether we have sent a bug report to Miranda IM's developers.

No, we have not. This task is too resource-intensive. We have shown only a small part of what we found in the project. There are about a hundred fragments in it about which I cannot say for sure whether they are errors or not. However, we will send this article to Miranda IM's authors and offer them a free version of the PVS-Studio analyzer. If they get interested in the subject, they will check their source code themselves and fix whatever they consider necessary.

I must also clarify why I often cannot say exactly if a particular code fragment has an error. This is a sample of ambiguous code:

V523 The 'then' statement is equivalent to the 'else' statement. scriver msglog.c 695

if ( streamData->isFirst ) {
  if (event->dwFlags & IEEDF_RTL) {
    AppendToBuffer(&buffer, &bufferEnd, &bufferAlloced, "\\rtlpar");
  } else {
    AppendToBuffer(&buffer, &bufferEnd, &bufferAlloced, "\\ltrpar");
  }
} else {
  if (event->dwFlags & IEEDF_RTL) {
    AppendToBuffer(&buffer, &bufferEnd, &bufferAlloced, "\\rtlpar");
  } else {
    AppendToBuffer(&buffer, &bufferEnd, &bufferAlloced, "\\ltrpar");
  }
}

Here are two identical code fragments. Perhaps it is an error. Or maybe the programmer needs to have two identical action sets in every branch, so he has written the code so that it could be easily modified later. You need to know the program to make out if this place is a mistake or not.
