Viva64 for optimizing data structures

16.02.2009 Andrey Karpov

Finally I've come to implementation of diagnosis in Viva64 analyzer detecting structures with non-optimal arrangement of fields. Absence in VivaCore of support of type calculations for small data types as ST_CHAR, ST_SHORT restrained me from that. Earlier all the types smaller than 32-bit were called ST_LESS_INT. So the library users should download an updated version of VivaCore. Everything has been changed in it recently.

But let's return to testing optimality of arrangement of data in structures. I will focus mostly on Visual C++. You know that data in C++ structures are aligned in such a way as to provide the most effective access to them. By the way, some microprocessors cannot address non-aligned data directly at all and the compiler has to generate special code for addressing such data. Those microprocessors which can address non-aligned data still do this much less effectively. That's why the C++ compiler leaves empty cells between fields of structures to align them according to the addresses of machine words and thereby speed up addressing to them. You can turn off the alignment function by using special directives #pragma to reduce the size of main memory being used but we are not interested in this variant now. It is often possible to significantly reduce the size of used memory by simply changing the order of fields in a structure without performance loss.

Let's consider the following structure:

struct MyStruct
{
  bool m_bool;
  char *m_pointer;
  int m_int;
};

On a 32-bit system this structure will occupy 12 bytes and it is impossible to reduce this size. Each field is aligned at the border of 4 bytes. Even if we put m_bool in the end it won't change anything. The compiler will still make the structure's size multiple of 4 bytes to align these structures in arrays.

In case of a 64-bit building MyStruct structure will occupy 24 bytes. It is clear why. In the beginning there is one byte under m_bool and 7 non-used bytes for alignment as the pointer occupies 8 bytes and must be aligned at the border of 8 bytes. Then there are 4 bytes m_int and 4 non-used bytes for aligning the structure at the border of 8 bytes. Fortunately, it can be easily corrected by putting m_bool in the end of the structure as follows:

struct MyStructOpt
{
  char *m_pointer;
  int m_int;
  bool m_bool;
};

MyStructOpt occupies not 24 but 16 bytes. It is a sensible saving if we use, for example, 10 million items. In this case we will save 80 MB of memory but what is more important, we can increase performance. If there are not to many structures it doesn't matter of what size they are. Access will be performed with the same speed. But when there are a lot of items cache, the number of memory accesses etc will do make the difference. And we can for sure say that processing of 160 MB of data will take less time than in case of 240 MB. Even a simple access to all the items of the array for reading will be rather fast.

I know that changing the order of fields in structures is not always possible or convenient. But if you have millions of such structures you should pay a little time to it. The results of such simple optimization as changing the order of fields can very significant. Now I haven't figures to prove this, but perhaps I will give examples in the next notes in the blog.

Perhaps you will ask according to what rules the compiler aligns data. I will give a brief answer but if you want to learn about this issue more, please address the book by Jeffrey Richter - "Programming Applications for Microsoft Windows, 4th edition". It seems to me that this question is considered there in detail.

In whole the alignment rule is the following: each field is aligned at the address multiple of the size of this field. On a 64-bit system the field of size_t type will be aligned at the border of 8 bytes, int at the border of 4 bytes and short at the border of 2 bytes. The fields of char type are not aligned. The size of a structure is aligned up to the size multiple of the size of its maximum item. Let's show it with the help of the following example:

struct ABCD
{
  size_t m_a;
  char m_b;
};

The items will occupy 8 + 1 = 9 bytes. But if the size of the structure is 9 bytes, that is, if we want to create an array of the structures ABCD[2], m_a field of the second structure will be placed at the non-aligned address. Because of this the compiler will add 7 empty bytes to the structure to reach the size of 16 bytes.

The process of optimizing the fields' sequence may seem complicated. But we can offer a very simple and very effective way. You can just arrange the fields in descending order according to their sizes. It will be absolutely enough. In this case the fields will be situated without additional gaps. For example, let's take the following structure of 40 bytes

struct MyStruct
{
  int m_int;
  size_t m_size_t;
  short m_short;
  void *m_ptr;
  char m_char;
};

and with the help of simple size-descending sort of the fields' sequence:

struct MyStructOpt
{
  void *m_ptr;
  size_t m_size_t;
  int m_int;
  short m_short;
  char m_char;
};

we will make a structure of only 24 bytes.

A much more difficult task is detection of these very structures that should be modified. It is a thankless and tiresome task to look through all the structures and classes. It is for this purpose that I came to adding rules for searching such ineffective structures (classes) into Viva64. Besides, the analyzer will show some intelligence by giving no warning messages on the classes which are descendants of other classes. Usually such objects are not created in millions. That is, I want the analyzer to warn about ineffectiveness of MyPoint class but to keep silent about ineffectiveness of MyWindow class:

class MyPoint {
  bool m_isActive;
  size_t m_x, m_y;
  char m_color[3];
  ...
};
class MyWindow : public CWnd {
  bool m_isActive;
  size_t m_sizeX, m_ sizeY;
  char m_color[3];
  ...
};