Lesson 7. The issues of detecting 64-bit errors

23.01.2012

There are various techniques of detecting errors in program code. Let us consider the most popular ones and see how efficient they are in finding 64-bit errors.

Code review

The oldest and the most proved and reliable approach to error search is code review. This method relies on reading the code by several developers together following some rules and recommendations described in the book by Steve McConnell "Code Complete" (Steve McConnell, "Code Complete"). Unfortunately, this method cannot be applied to large-scale testing of contemporary program systems due to their huge sizes.

Code review may be considered in this case rather a good means of education and avoiding 64-bit errors in a new code being developed. But this method will be too expensive and therefore unacceptable in searching for the already existing errors. You would have to view the code of the whole project to find all 64-bit errors.

Static code analysis

The means of static code analysis will help those developers who appreciate the regular code review but do not have enough time to do that. The main purpose of static code analysis is to reduce the amount of code needed to be viewed by a human and therefore reduce the time of code review. Rather many programs refer to static code analyzers which have implementations for various programming languages and provide a lot of various functions from simple code alignment control to complex analysis of potentially dangerous fragments. The advantage of static analysis is its good scalability. You can test a project of any size in reasonable time with its help. And testing the code with static analyzer regularly will help you detect many errors at the stage of only writing the code.

The static analysis technique is the most appropriate method to detect 64-bit errors. Further, when discussing 64-bit error patterns, we will show you how to diagnose these errors using Viva64 analyzer included into PVS-Studio. In the next lesson you will learn in more detail about the static analysis methodology and PVS-Studio tool.

White box method

By the white box method we will understand the method of executing the maximum available number of different code branches using a debugger or other tools. The more code is covered during the analysis, the more complete the testing is. Also, the white box testing is sometimes understood as simple debugging of an application in order to find some known error. It became impossible a long time ago to completely test the whole program code with the white box method due to huge sizes of contemporary applications. Nowadays, the white box method is convenient to use when an error is found and you want to find out what has caused it. Some programmers oppose the white box technique denying the efficiency of real-time program debugging. The main reason they refer to is that enabling a programmer to watch the process of program execution and change it along the way leads to an unacceptable programming approach implying correction of code by the trial-and-error method. We are not going to discuss these debates but I would like to note that the white box testing is too expensive to use for enhancing the quality of large program systems anyway.

It must be evident to you that complete debugging of an application for the purpose of detecting 64-bit errors is unreal just like the complete code review.

We should also note that the step-by-step debugging might be impossible when debugging 64-bit applications that process large data arrays. Debugging of such applications may take much more time. So you should consider using logging systems or some other means to debug applications.

Black box method (unit-test)

The black box method has shown much better results. Unit tests refer to this type of testing. The working principle of this technique is writing a set of tests for separate units and functions that checks all the main modes of their operation. Some authors refer unit-testing to the white box method because it relies on knowledge of the program organization. But we think that functions and units being tested should be considered black boxes because unit tests do not take into account the inner organization of a function. This viewpoint is supported by an approach when tests are developed before the functions themselves are written and it provides an increased level of the control over their functionality in terms of specification.

Unit tests have proved to be efficient in developing both simple and complex projects. One of the advantages of unit testing is that you may check if all the changes introduced into the program are correct right along the development process. They try to make it so that tests are run in only a few minutes - it allows the developer who has modified the code to see an error and correct it right away. If it is impossible to run all the tests at once, long-term tests are usually launched separately, for example, at night. It also contributes to a quick detection of errors, at least in the next morning.

When using unit tests to search for 64-bit errors, you are likely to encounter some unpleasant things. Seeking to make quick tests, programmers try to involve a small amount of calculations and data to be processed while developing them. For example, when you develop a test for the function searching for an array item, it does not matter if there will be 100 or 10 000 000 items. A hundred of items is enough but when the function processes 10 000 000, its speed is greatly reduced. But if you want to develop efficient tests to check this function on a 64-bit system, you will have to process more than 4 billion items! You think that if the function works with 100 items, it will work with billions? No. Here is an example.

bool FooFind(char *Array, char Value,
             size_t Size)
{
  for (unsigned i = 0; i != Size; ++i)
    if (i % 5 == 0 && Array[i] == Value)
      return true;
  return false;
}
#ifdef _WIN64
  const size_t BufSize = 5368709120ui64;
#else
  const size_t BufSize = 5242880;
#endif
int _tmain(int, _TCHAR *) {
  char *Array =
    (char *)calloc(BufSize, sizeof(char));
  if (Array == NULL)
    std::cout << "Error allocate memory" << std::endl;
  if (FooFind(Array, 33, BufSize))
    std::cout << "Find" << std::endl;
  free(Array);
}

The error here is in using the type unsigned for the loop counter. As a result, the counter is overflowed and an eternal loop occurs when processing a large array on a 64-bit system.

Note. It might be so that this example will not reveal an error with some settings of the compiler. To understand this strange thing, see the article "A 64-bit horse that can count".

As you may see from the example, you cannot rely on obsolete sets of unit tests if your program processes a large data amount on a 64-bit system. You must extend them taking into account possible large data amounts to be processed.

Unfortunately, it is not enough to write new tests. Here we face the problem of the time it will take the modified test set processing large data amounts to complete this work. Consequently, such tests cannot be added to the set you could launch right along the development process. Launching them at night also causes issues. The total time of running all the tests may increase more than ten times. As a result, the test running time may become more than 24 hours. You should keep this in mind and take it very seriously when modifying the tests for the 64-bit version of your program.

Manual testing

This method can be considered the final step of any development process but you should not take it as a good and safe technique. Manual testing must exist because it is impossible to detect all the errors in the automatic mode or with code review. But you should not fully rely on it either. If a program is low-quality and has a lot of defects, it may take you a long time to test and correct it and still you cannot provide the necessary quality. The only way to get a quality program is to have a quality code. That is why we are not going to consider manual testing as an efficient method of detecting 64-bit errors.

To sum it up, I would like to say that you should not rely on only one of the methods we have discussed. Although static analysis is the most efficient technique of detecting 64-bit errors, a quality application cannot be developed when only a couple of testing methodologies are involved.

The course authors: Andrey Karpov (karpov@viva64.com), Evgeniy Ryzhkov (evg@viva64.com).

The rightholder of the course "Lessons on development of 64-bit C/C++ applications" is OOO "Program Verification Systems". The company develops software in the sphere of source program code analysis. The company's site: http://www.viva64.com.

Contacts: e-mail: support@viva64.com, Tula, 300027, PO box 1800.