Application port to 64-bit platforms or never cackle till your egg is laid

Nov 08 2007

Author: Vladimir Elesin

Why Do We Need This?
1. Much bigger memory size for the applications
2. The Rise of the Performance Speed
3. Large Number of Registers. Pinpoint Accuracy (high fidelity) Calculations
4. Improved Parallelism
Who Needs This?
How Can It Be Done?
What is Necessary for This?

64-bit systems appeared more than 10 years ago, but we became closely acquainted with them comparatively recently when they came to the mass computer market. More and more software developers talk about the necessity of the support of such systems. Formerly 64-bit processors were mainly spread in the field of prolonged and complicated calculations - computational modeling of hydrodynamics and flow dynamics processes, deformable solid body mechanics, ecology and molecular chemistry ones, etc. They were also used for the maintenance of some ultra-large data bases. But today systems based on these processors can be observed as typical work stations. So, is it really necessary to port the applications to the 64-bit platforms? And if the decision of the porting is made, then by what means can it be done with the least time and financial costs? Let us see.

Why Do We Need This?

Before we define the need in 64-bit systems maintenance we should obviously define the advantages of this maintenance.

1. Much bigger memory size for the applications

Here is some information about the address space size for 64-bit and 32-bit Windows operating systems:

Address space	64-bit Windows	32-bit Windows
Virtual memory	16 Tb	4 Gb
Swap file	512 Tb	16 Tb
System cache	1 Tb	1 Gb

Some operation systems reserve some part of the address space for their own needs, and this diminishes its total size available for user's applications. For instance, dynamic libraries of Windows XP and user's operating system components leave only 2 up to 3 Gb of addressing space available (it depends upon the settings), even if the computer possesses 4 Gb main memory, this constricts the available memory size even more.

With 32-bit operating systems a file whose size is more than 4 Gb couldn't be completely represented in the address space, and as a result of this, it was necessary to represent only a part of a file with these systems and this led to the degradation of the efficiency of work with large-size data. Yet, the presence of files larger than 4 Gb even at the work station has become most likely a rule than an exception (above all it is concerned with the DVD-video). The use of 64-bit systems allows to operate with files of such size more efficiently because the considerable extension of the memory size available for the applications allows us to represent such files in the address space wholly , and, as everybody knows, main memory access time is many times shorter than the hard disk access time.

2. The Rise of the Performance Speed

The improved bus architecture increases the productivity by means of shift of large amount of data between the cache and the processor during a shorter period of time. Bus architecture of the 64-bit chip sets provides high speed and carrying capacity; more data are transmitted to the cache and the processor. Bigger size of the second-level cache provides quicker accomplishment of the users' inquiries and more efficient use of processor time.

Certainly, it does not mean that your text editor will work much faster. But the 64-bit systems are able to increase the productivity of work with more exacting applications considerably, for example of work with CAD-systems, computational modeling complexes, audio and video coding, cryptographic systems and games.

3. Large Number of Registers. Pinpoint Accuracy (high fidelity) Calculations

In 64-bit systems there are twice as big as number of integer general-purpose registers, among them there are SIMD-registers (they support "one command stream - many data streams" concept). The use of these registers by the compiler allows to improve considerably the efficiency of the realization of many algorithms. For the operations with the floating point one does not use a stack but the registers, and this affects considerably the productivity of the applications, in which complicated mathematical calculations are performed. And finally, the use of 64-bit increases the accuracy of the calculations performed, reduces the round-off errors, and all these are especially important for the computation modeling of processes packets and some other applications.

4. Improved Parallelism

The improvements in paralleling processes and bus architecture provide the opportunity for the 64-bit platforms to support a larger number of processors (up to 64) with the maintenance of the linear scalability for each additional processor.

Who Needs This?

For a certain number of custom everyday programs the porting of them to the 64-bit platform at the moment does not provide any great qualitative advance in productivity. However, there are a number of fields where such advance will be quite powerful: programs for work with data bases (the larger is the amount of the used data, the more remarkable is the advance), programs for CAD/CAE (computer-aided design, modeling), programs for the creation of the numerical content (processing of picture, sound, video), 3D modeling (rendering, animation), among these are high-tech games, packets of scientific and highly productive calculations gas dynamics and hydrodynamics, seismology, geological survey, molecular chemistry and biology, genetics, research in the field of nanotechnology), cryptographic programs, expert systems, etc.

In spite of some kind of caution of the software developers concerning the question of porting to the 64-bit platform, there already exist many software products compatible with it. Nevertheless, it should be mentioned that the stated inertness of the program developers gives a chance to the beginner companies not only to get a certain position at the market of 64-bit software but also to break away in the case of successful advance of the their application versions for 64-bit platforms.

How Can It Be Done?

A number of the existing development tools essentially lower the costs of porting from the 32-bit platform to the 64-bit platform by means of simple recompiling of the existing code. The obtained applications, according to the development tool creators' opinions, are practically ready for the adequate work in new systems. It is only necessary to make some alterations (from this point on we will speak only about C and C++ languages, because they are two of the most widely spread programming languages, and at the same time they adequately illustrate the problems which appear while porting to the 64-bit platform).

These alterations are to correct a certain number of code blocks which work wrongly. To be more exact, to correct those ones which work incorrectly only with the 64-bit system, and with the 32-bit system they function absolutely right.

First of all such blocks can appear because of the use of a new data model (in the 64-bit Microsoft operating systems - LLP64). In it the types int and long remain 32-bit integers, and the type size_t becomes a 64-bit integer. All this causes a number of possible mistakes. Here are some examples. To simplify this explanation we shall use the notion of the memsize type, it is a type capable of storing a pointer. As memsize types we mean pointers and integer types, the size of which corresponds with the size of the pointer.

1) The error with the implicit conversion of the function argument, which possesses a memsize type to the 32-bit type.

float Foo(float *array, int arraySize) {...}
...
float *beginArray;
float *endArray;
...
float Value = Foo(beginArray, endArray - beginArray);

When you fulfill an arithmetic operation of subtraction with two pointers, according to the C++ language rules, the result will have a ptrdiff_t type. When you call a Foo function the result will be converted to int type, and this means the loss of high bits, and incorrect behavior of function if the application works with a 64-bit platform because in this case ptrdiff_t is a 64-bit integer, (contrary to the 32-bit int).

2) A similar error appears with implicit conversion of a 32-bit function argument to the memsize type. For example, with 64-bit platforms this may lead to inability of using the system's resources.

unsigned size = Size(); 
void *p = malloc(size);

According to the definition of the malloc() function, the argument which determines the size of the allocated memory is of the size_t type. The converted code block does not allow to allocate memory size larger than 4 Gb, because this size is limited by the maximum size of the size variable, which possesses the unsigned type (32-bit).

3) The error inside of the arithmetic expression, connected with implicit conversion to the memsize type and the change of the allowable limits of variables belonging to the expression. One of the typical examples is the rise of the infinite loop in the following code block:

size_t n;
unsigned i;
...
for (i = 0; i != n; ++i) { ... }

When you port it to the 64-bit platform the value of n in accordance with the data model LLP64 may exceed the maximum possible value of unsigned type, and this means that in this case the condition i != n turns out to be unsatisfiable.

It should be mentioned, that the errors similar to the examples 1, 2 and 3 may also appear with the explicit type conversion, for example, by means of the use of static_cast.

4) The error in address arythmetics with pointers with the overflow when an expression is being calculated.

short ind1, ind2, ind3;
char *pointerValue;
...
pointerValue += ind1* ind2* ind3;

In the case if the variable values ind1, ind2, ind3 are such, that their product exceeds the maximum allowable for the int type value ( and it's the int type to which the variables ind1, ind2, ind3 will be converted in C++ language, and consequently, their product will be converted to it, too), so, the overflow will occur and the pointerValue variable will get an incorrect value. Everything described above may occur when a programmer, who has decided to use a 64-bit system in his work with large numbers, will allow the variables ind1, ind2, ind3 get the values greater than in the 32-bit application version (though within the limits allowed by the short type). For example, 3000, 2000, 1000 correspondingly.

A similar error connected with the implicit type conversion and leading to the overflow may occur when you deal with the assignment operator, when the expression standing right to it is being calculated incorrectly.

These are only some examples of the errors which may arouse while porting applications to 64-bit platforms. The problems which arouse with the use of the overloaded functions, with the interaction of a 32-bit and a 64-bit applications, with reading and recording of files created in systems of different digit capacity and some other problems should be also regarded here.

The most of the mentioned mistakes unfortunately cannot be warned by a compiler. So, consequently, it is necessary to engage some additional means and (or) resources.

What is Necessary for This?

Practice proves that manual search for such errors is very labour- and time-consuning process, especially when the source code is large. Such manual corrections may take you several weeks and even months. And the vanishingly small amount of the errors found by the compiler (despite the claims of the software developers) predetermines extremely large time and money wastes.

Basically, now existing multy-purpose syntactic verifiers of program code may help, but they also have some disadvantages. So, one of the leaders in this field - PC Lint - despite all its advantages, does not define a considerable number of errors which appear while porting to the 64-bit platform and besides it is extremely difficult to use because of its abundant functionality and large number of unnecessary settings.

Static code analyzer may be of great help in this case. It must possess easy and handy interface which allows finding errors in the source codes of a program which appear as a result of porting to the 64-bit platform. It must let the developer find this error quickly and identify them correctly. It also must be reliable and flexible in order to respond to every possible error of porting (or at least to the overwhelming majority of errors) from the one hand, and, from the other hand, it must not overload the developer with excess information about those not found defects which are not essential. Such analyzers already exist and are available in the Internet. They can be useful for those who want their software product to be ported to a new, up-to-date platform possessing great facilities with minimum time and money outlay.

#Cpp #64bit