Explanations to the article on Copy-Paste

Andrey Karpov
Articles: 375



Many readers liked my article "Consequences of using the Copy-Paste method in C++ programming and how to deal with it" [1]. Scott Meyers [2] noticed it too and asked me how static analysis proper helped us to detect the errors described in the article. This is his letter:

Your article on copy-paste of code fragments at http://www.viva64.com/en/a/0068/ was interesting, but it's not clear how most of the errors you use as examples could be detected by static analysis. The only one I see that looks like a good candidate for static analysis is the assignment of too many elements to invModulate. How could static analysis detect the others?

I wrote an answer to the letter and then decided to arrange it as a blog-post. Perhaps other readers will also find it interesting to know how the described errors were found.

This is my answer:

I have found all the samples of errors given in the article on "Copy-Paste" while investigating codes of projects using the PVS-Studio analyzer. Each error was detected by a certain diagnostic rule.

The first four errors were detected by the V501 diagnostic rule. To put it in simpler words, this warning is generated when there are identical subexpressions standing to the left and to the right of operators &&, ||, -, / and the like. Besides, there are a lot of exceptions to reduce the number of false alarms. For instance, the warning will not be produced for this code line: if (*p++ == *a++ && *p++ == *a++).

Now let's consider the samples from the article.

sampleCount VoiceKey::OnBackward (...) {
  ...
  int atrend = sgn(
    buffer[samplesleft - 2]-buffer[samplesleft - 1]); 
  int ztrend = sgn(
    buffer[samplesleft - WindowSizeInt-2]-
      buffer[samplesleft - WindowSizeInt-2]);
  ...
}

This is how PVS-Studio diagnoses it:

V501 There are identical sub-expressions to the left and to the right of the '-' operator. Audacity voicekey.cpp 304

inline_ bool Contains(const LSS& lss)
{
  // We check the LSS contains the two 
  // spheres at the start and end of the sweep
  return
    Contains(Sphere(lss.mP0, lss.mRadius)) && 
    Contains(Sphere(lss.mP0, lss.mRadius));
}

This is how PVS-Studio diagnoses it:

V501 There are identical sub-expressions to the left and to the right of the '&&' operator. plgcsopcode icelss.h 69

void COX3DTabViewContainer::OnNcPaint() 
{
  ...
  if(rectClient.top<rectClient.bottom &&
     rectClient.top<rectClient.bottom)
  {
    dc.ExcludeClipRect(rectClient);
  }
  ...
}

This is how PVS-Studio diagnoses it:

V501 There are identical sub-expressions to the left and to the right of the '&&' operator. UT ox3dtabview.cpp 230

void uteTestRunner::StressBayer(uint32 iFlags)
{
  ...
  static EPixelFormat ms_pfList[] = 
    { PF_Lub, PF_Lus, PF_Li, PF_Lf, PF_Ld };
  const int fsize = sizeof(ms_pfList) / sizeof(ms_pfList);

  static EBayerMatrix ms_bmList[] = 
    { BM_GRBG, BM_GBRG, BM_RGGB, BM_BGGR, BM_None };
  const int bsize = sizeof(ms_bmList) / sizeof(ms_bmList);
  ...
}

This is how PVS-Studio diagnoses it:

V501 There are identical sub-expressions to the left and to the right of the '/' operator: sizeof (ms_pfList) / sizeof (ms_pfList) IFF plugins engine.cpp 955

V501 There are identical sub-expressions to the left and to the right of the '/' operator: sizeof (ms_bmList) / sizeof (ms_bmList) IFF plugins engine.cpp 958

The next two samples are diagnosed by the V517 rule. The check detects sequences of the "if(A)... else if(A)..." kind. This is a bit simplified too, of course, and there are also special exceptions to the rule.

string TimePeriod::toString() const
{
  ...
  if (_relativeTime <= 143)
    os << ((int)_relativeTime + 1) * 5 << _(" minutes");
  else if (_relativeTime <= 167)
    os << 12 * 60 + ((int)_relativeTime - 143) * 30 << _(" minutes");
  else if (_relativeTime <= 196)
    os << (int)_relativeTime - 166 << _(" days");
  else if (_relativeTime <= 143)
    os << (int)_relativeTime - 192 << _(" weeks");
  ...
}

This is how PVS-Studio diagnoses it:

V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. GSM gsm_sms_codec.cc 175

void DXUTUpdateD3D10DeviceStats(...)
{
  ...
  else if( DeviceType == D3D10_DRIVER_TYPE_SOFTWARE )
    wcscpy_s( pstrDeviceStats, 256, L"WARP" );
  else if( DeviceType == D3D10_DRIVER_TYPE_HARDWARE )
    wcscpy_s( pstrDeviceStats, 256, L"HARDWARE" );
  else if( DeviceType == D3D10_DRIVER_TYPE_SOFTWARE )
    wcscpy_s( pstrDeviceStats, 256, L"SOFTWARE" );
  ...
}

This is how PVS-Studio diagnoses it:

V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. TickerTape dxut.cpp 6217

The next error was detected by the analyzer with the help of the V523 rule. It is odd that the then and else branches of a condition perform the same actions.

BOOL CGridCellBase::PrintCell(...)
{
  ...
  if(IsFixed())
    crFG = (GetBackClr() != CLR_DEFAULT) ?
      GetTextClr() : pDefaultCell->GetTextClr();
  else
    crFG = (GetBackClr() != CLR_DEFAULT) ?
      GetTextClr() : pDefaultCell->GetTextClr();
  ...
}

This is how PVS-Studio diagnoses it:

V523 The 'then' statement is equivalent to the 'else' statement. GridCtrl gridcellbase.cpp 652

The next sample has an obvious "Copy-Paste" error. But it is detected by a rule which is not intended to find misprints. We can say that the error is detected indirectly. And it is detected because an obvious array overrun occurs. The V557 diagnostic rule is responsible for detecting this error.

void RB_CalcColorFromOneMinusEntity( unsigned char *dstColors ) {
  ...
  unsigned char invModulate[3];
  ...
  invModulate[0] = 255 - backEnd.currentEntity->e.shaderRGBA[0];
  invModulate[1] = 255 - backEnd.currentEntity->e.shaderRGBA[1];
  invModulate[2] = 255 - backEnd.currentEntity->e.shaderRGBA[2];
  invModulate[3] = 255 - backEnd.currentEntity->e.shaderRGBA[3];
  ...
}

This is how PVS-Studio diagnoses it:

V557 Array overrun is possible. The '3' index is pointing beyond array bound. renderer tr_shade_calc.c 679

The last sample is the most interesting one. It is detected by the V525 diagnostic. We have developed this rule intentionally to detect similar code fragments where a misprint is highly probable. Schematically, its working principle is the following. Assume we have a code like this:

if (A == 1)
  Q = A + X;
if (A == 2)
  Q = A + Y;
if (A == 3)
  Q = A + Y;

The three statements have an identical structure. So let's consider this code fragment as a table consisting of names and numbers with the dimension 5x3:

A  1  Q  A  X
A  2  Q  A  Y
A  3  Q  A  Y

While investigating this table, the analyzer resorts to a heuristic algorithm to suppose that something else must stand in the place of the last 'Y'. Remember that it is just a very rough description. Unfortunately, I have to admit that this check often gives false reports and we do not know how to eliminate them. Because of this we had to set the third relevance level for the V525 warning. However, sometimes it lets you find very interesting errors like the one cited in the article:

void KeyWordsStyleDialog::updateDlg() 
{
  ...
  Style & w1Style =
    _pUserLang->_styleArray.getStyler(STYLE_WORD1_INDEX);
  styleUpdate(w1Style, _pFgColour[0], _pBgColour[0],
    IDC_KEYWORD1_FONT_COMBO, IDC_KEYWORD1_FONTSIZE_COMBO,
    IDC_KEYWORD1_BOLD_CHECK, IDC_KEYWORD1_ITALIC_CHECK,
    IDC_KEYWORD1_UNDERLINE_CHECK);

  Style & w2Style =
    _pUserLang->_styleArray.getStyler(STYLE_WORD2_INDEX);
  styleUpdate(w2Style, _pFgColour[1], _pBgColour[1],
    IDC_KEYWORD2_FONT_COMBO, IDC_KEYWORD2_FONTSIZE_COMBO,
    IDC_KEYWORD2_BOLD_CHECK, IDC_KEYWORD2_ITALIC_CHECK,
    IDC_KEYWORD2_UNDERLINE_CHECK);

  Style & w3Style =
    _pUserLang->_styleArray.getStyler(STYLE_WORD3_INDEX);
  styleUpdate(w3Style, _pFgColour[2], _pBgColour[2],
    IDC_KEYWORD3_FONT_COMBO, IDC_KEYWORD3_FONTSIZE_COMBO,
    IDC_KEYWORD3_BOLD_CHECK, IDC_KEYWORD3_BOLD_CHECK,
    IDC_KEYWORD3_UNDERLINE_CHECK);

  Style & w4Style =
    _pUserLang->_styleArray.getStyler(STYLE_WORD4_INDEX);
  styleUpdate(w4Style, _pFgColour[3], _pBgColour[3],
    IDC_KEYWORD4_FONT_COMBO, IDC_KEYWORD4_FONTSIZE_COMBO,
    IDC_KEYWORD4_BOLD_CHECK, IDC_KEYWORD4_ITALIC_CHECK,
    IDC_KEYWORD4_UNDERLINE_CHECK);
  ...
}

This is how PVS-Studio diagnoses it:

V525: The code containing the collection of similar blocks. Check items '7', '7', '6', '7' in lines 576, 580, 584, 588

The rest of the letter is not relevant to the matter and I won't cite the complete text. I have to admit that the post is a bit boring but it shows well that static analysis can be successfully used to detect errors in copy-pasted code. And these errors can be found both by specialized rules like V501 or V517 and in an indirect way, for example, by the V557 rule.

If you want to know about other diagnostics implemented in PVS-Studio, please visit our documentation page.

References:



Use PVS-Studio to search for bugs in C, C++, C# and Java

We offer you to check your project code with PVS-Studio. Just one bug found in the project will show you the benefits of the static code analysis methodology better than a dozen of the articles.

goto PVS-Studio;

Andrey Karpov
Articles: 375


Bugs Found

Checked Projects
364
Collected Errors
13 504