Search of 64-bit errors in array implementation

11.01.2010 Andrey Karpov

In PVS-Studio 3.43, we revised the way how Viva64 analyzer detects errors in the classes serving as containers (arrays). Before, we have stuck to the principle that if a class has operator[], its parameter must have memsize-type (ptrdiff_t, size_t) and not int or unsigned. We still recommend that you use memsize type as an argument for operator[]. It allows the compiler to build a more efficient code in some cases and avoid some 64-bit errors beforehand. Now we have changed the approach to working with classes that have operator[] what allows us to reduce the number of unnecessary diagnostic warnings. Let us consider an example that might contain an error if we want to work with large data amounts:

class MyArray {
  std::vector <float> m_arr;
  ...
  float &operator[](int i)
  {
    return m_arr[i];
  }
} A;
...
int x = 2000;
int y = 2000;
int z = 2000;
A[x * y * z] = 33;

The first drawback of this code is that operator[] does not allow us to access the item with the number more than INT_MAX.

Note. I would like to clarify one important thing. In the release-version, for a code like the one in the example, the compiler can provide an optimization that will work because the 64-bit register will be used to calculate and pass the index. I will make a separate post to examine this example more thoroughly. But this luck does not make the code correct. You may learn more about dangerous optimizations here.

The second drawback of the code lies in the expression "x*y*z" where an overflow might occur when working with a large array.

Before, the analyzer has generated two warnings (V108). The first is using int type when calling the array m_arr. The second is using int type when calling the array A. Although operator[] of the class MyArray takes an int argument, we offered to use a memsize-type as the index. When the programmer changed the types of the variables x, y and z to ptrdiff_t, Visual C++ compiler started warning about type conversion in the line A[x * y * z] = 33:

warning C4244: 'argument' : conversion from 'ptrdiff_t' to 'int', possible loss of data

This warning prompted the user to change the argument in operator[] and the code became absolutely correct. Here is an example of the corrected code:

class MyArray {
  std::vector <float> m_arr;
  ...
  float &operator[](ptrdiff_t i)
  {
    return m_arr[i];
  }
} A;
...
ptrdiff_t x = 2000;
ptrdiff_t y = 2000;
ptrdiff_t z = 2000;
A[x * y * z] = 33;

Unfortunately, this diagnosis approach has one great drawback. In some cases, operator[] cannot be changed or using int as the index is absolutely justified. And it appeared that Viva64 analyzer generated a lot of unnecessary warnings. CString class from MFC can serve as an example. The operator in CString class has the prototype:

TCHAR operator []( int nIndex ) const;

Because of this the code is diagnosed as dangerous:

int i = x;
CString s = y;
TCHAR c = s[i];

CString class is inaccessible to edit. And, well, hardly will anyone use CString type in a standard program to work with lines longer than two milliard characters. In its turn, Viva64 analyzer generated many warnings on this code. If the programmer changed the index's type from int to ptrdiff_t, it was the compiler that generated the warnings. We could use warning suppression //-V108, but it would overload the code. You may learn more about warning suppression in the article: PVS-Studio: using the function "Mark as False Alarm".

We made a decision to consider the construct "A[x * y * z] = 33;" from the first example safe. Now, if operator[] takes a 32-bit type as an argument (for example, int) and we call this operator also using a 32-bit type, this call is considered safe.

Of course, it might hide an error. That is why, we added a new diagnostic warning V302: "Member operator[] of 'FOO' class has a 32-bit type argument. Use memsize-type here". This diagnostic warning is generated for operator[] defined with a 32-bit argument.

The smartness of this solution consists in that this warning is not generated on the library code that is not accessible to change. I.e., V302 warning will not be generated for the class CString but will be for the user class MyArray.

If operator[] in MyArray class is correct and really should have the type int, the programmer will only need to write only one warning suppression //-V302 in this class instead of multiple places where it is used.

The last change related to array processing concerns introduction of one more warning V120: "Member operator[] of object 'FOO' declared with 32-bit type argument, but called with memsize type argument". In whole, this warning copies the compiler warning about converting a 64-bit type to a 32-bit one. It will be useful when there are a lot of warnings generated by the compiler and among them you miss the information about code efficiency on a 64-bit system.