Lesson 23. Pattern 15. Growth of structures' sizes

24.01.2012

The growth of structure sizes is not an error in itself, but it can make a program consume an unreasonably large amount of memory and therefore run slower. We will treat this pattern not as an error but as a cause of inefficiency in 64-bit code.

Data in C++ structures is aligned so that it can be accessed as efficiently as possible. Some microprocessors cannot address non-aligned data at all, and the compiler has to generate special code to deal with such data. Those microprocessors that can address non-aligned data do it much less efficiently. That is why the C++ compiler leaves empty space between the fields of a structure to align them on machine-word addresses and thereby speed up access to them. You may disable alignment with special #pragma directives to reduce memory consumption, but we are not interested in this approach here. Often the amount of memory used can be greatly reduced, without any performance penalty, simply by changing the order of the fields in the structure.

Consider the following structure:

struct MyStruct
{
  bool m_bool;
  char *m_pointer;
  int m_int;
};

This structure takes 12 bytes on a 32-bit system, and we cannot make it any smaller. Each field is aligned on a 4-byte boundary. Moving m_bool to the end changes nothing: the compiler will still make the structure's size a multiple of 4 bytes so that such structures stay aligned in arrays.

In the 64-bit build mode the structure MyStruct takes 24 bytes. The reason is clear. First comes one byte for m_bool followed by 7 unused bytes of padding, because a pointer takes 8 bytes and must be aligned on an 8-byte boundary. Then come 8 bytes for m_pointer, 4 bytes for m_int, and 4 unused bytes that pad the structure's size to a multiple of 8 bytes.
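The layout described above can be inspected directly. The following is a minimal sketch, not part of the original lesson: the helper name PrintMyStructLayout is hypothetical, and the 24-byte figure is assumed to hold only for builds where pointers are 8 bytes wide.

```cpp
#include <cassert>   // assert
#include <cstddef>   // offsetof
#include <cstdio>    // std::printf

struct MyStruct
{
  bool m_bool;
  char *m_pointer;
  int m_int;
};

// The 24-byte figure is checked only where pointers are 8 bytes wide,
// so the sketch still compiles in a 32-bit build.
static_assert(sizeof(void *) != 8 || sizeof(MyStruct) == 24,
              "unexpected MyStruct size in a 64-bit build");

// Hypothetical helper (not part of the lesson): prints where each field
// starts and how large the whole structure is.
void PrintMyStructLayout()
{
  // The first field of a standard-layout structure starts at offset 0.
  assert(offsetof(MyStruct, m_bool) == 0);

  std::printf("m_bool    at offset %zu\n", offsetof(MyStruct, m_bool));
  std::printf("m_pointer at offset %zu\n", offsetof(MyStruct, m_pointer));
  std::printf("m_int     at offset %zu\n", offsetof(MyStruct, m_int));
  std::printf("sizeof(MyStruct) = %zu\n", sizeof(MyStruct));
}
```

On a typical 64-bit build, calling PrintMyStructLayout() should report the offsets 0, 8 and 16 and a total size of 24. The gaps between those offsets and the field sizes are the unused padding bytes.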

Fortunately, we can easily fix this by moving m_bool to the end of the structure, as shown below:

struct MyStructOpt
{
  char *m_pointer;
  int m_int;
  bool m_bool;
};

The structure MyStructOpt takes 16 bytes instead of 24. The arrangement of the fields is shown in Figure 1. This is a considerable saving if we use, for instance, 10 million items: we save 80 Mbytes of memory and, more importantly, improve performance. If there are only a few structures, their size does not matter and access is equally fast. But when there are many items, such things as the cache and the number of memory accesses become significant. You may say with certainty that 160 Mbytes of data will take less time to process than 240 Mbytes. Even simply reading all the array items will be faster.
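The arithmetic behind the 80 Mbytes figure can be sketched as follows. This is an illustration under stated assumptions: the helpers SavedBytes and CheckSavings are hypothetical names, the numbers are checked only for builds with 8-byte pointers, and "Mbytes" is taken to mean millions of bytes.

```cpp
#include <cassert>   // assert
#include <cstddef>   // std::size_t

struct MyStruct
{
  bool m_bool;
  char *m_pointer;
  int m_int;
};

struct MyStructOpt
{
  char *m_pointer;
  int m_int;
  bool m_bool;
};

// Reordering the fields must never make the structure larger.
static_assert(sizeof(MyStructOpt) <= sizeof(MyStruct),
              "the reordered structure should not grow");

// Hypothetical helper: bytes saved in an array of itemCount items
// when MyStructOpt is used instead of MyStruct.
std::size_t SavedBytes(std::size_t itemCount)
{
  return itemCount * (sizeof(MyStruct) - sizeof(MyStructOpt));
}

// Checks the figures quoted in the text; they hold only for 8-byte pointers.
void CheckSavings()
{
  if (sizeof(void *) != 8)
    return;
  assert(sizeof(MyStruct) == 24 && sizeof(MyStructOpt) == 16);
  // 10 million items * 8 bytes saved per item = 80 million bytes.
  assert(SavedBytes(10000000) == 80000000u);
}
```

In a 32-bit build both structures have the same size, so SavedBytes returns 0 for any item count. The saving appears only after the move to 64 bits.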

Figure 1 - Arrangement of the fields in the structures MyStruct and MyStructOpt

It is not always possible or convenient to change the order of fields in structures. But if there are millions of such structures, it is worth finding time for refactoring. Even an optimization as simple as changing the field order can have a very large effect.

You may ask what rules the compiler follows when aligning data. We will answer briefly; if you want to study the issue in more detail, read the book by Jeffrey Richter "Programming Applications for MS Windows", which covers this question rather thoroughly.

In general, the alignment rule is as follows: each field is aligned on an address that is a multiple of the field's size. On a 64-bit system a field of size_t type is aligned on an 8-byte boundary, int on a 4-byte boundary, and short on a 2-byte boundary. Fields of char type are not aligned. The size of the structure itself is rounded up to a multiple of the size of its largest item. Let us explain this kind of alignment with an example:

struct ABCD
{
  size_t m_a;
  char m_b;
};

The fields take 8 + 1 = 9 bytes. But if the structure's size were 9 bytes and we created an array of structures ABCD[2], the field m_a of the second structure would lie at a non-aligned address. Therefore the compiler adds 7 empty bytes to the structure to make its size 16 bytes.
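The effect of the trailing padding on arrays can be checked with a short sketch. The helpers StrideOfABCD and CheckABCD are hypothetical names introduced for illustration, and the 16-byte figure is assumed to hold only where size_t is 8 bytes wide.

```cpp
#include <cassert>   // assert
#include <cstddef>   // std::size_t

struct ABCD
{
  std::size_t m_a;
  char m_b;
};

// Hypothetical helper: distance in bytes between the m_a fields
// of two neighbouring array items.
std::size_t StrideOfABCD()
{
  ABCD items[2] = {};
  const char *first = reinterpret_cast<const char *>(&items[0].m_a);
  const char *second = reinterpret_cast<const char *>(&items[1].m_a);
  return static_cast<std::size_t>(second - first);
}

void CheckABCD()
{
  // Array items are laid out back to back, so the stride equals sizeof(ABCD).
  assert(StrideOfABCD() == sizeof(ABCD));
  // The size is a multiple of the alignment, so every m_a in an array stays aligned.
  assert(sizeof(ABCD) % alignof(ABCD) == 0);
  // The concrete figure from the text holds when size_t is 8 bytes wide.
  assert(sizeof(std::size_t) != 8 || sizeof(ABCD) == 16);
}
```

Because the stride between array items is always sizeof(ABCD), the only way to keep the second m_a aligned is to pad the structure itself, which is what the compiler does.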

The process of optimizing a field arrangement may seem complicated. But there is a very simple and very effective method: just arrange the fields in decreasing order of their sizes. This is quite enough: the fields will be laid out without unnecessary gaps. For example, take the following 40-byte structure:

struct MyStruct
{
  int m_int;
  size_t m_size_t;
  short m_short;
  void *m_ptr;
  char m_char;
};

By simply sorting the sequence of the fields in decreasing order of their sizes:

struct MyStructOpt
{
  void *m_ptr;
  size_t m_size_t;
  int m_int;
  short m_short;
  char m_char;
};

we make this structure's size only 24 bytes.
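The 40-byte and 24-byte figures can be reproduced with a small model of the alignment rule given earlier. This is a simplified sketch, not the compiler's actual algorithm: it assumes every field is a scalar whose alignment equals its size (a power of two), and the function names AlignUp, StructSize and SortedDescending are hypothetical.

```cpp
#include <algorithm>   // std::max, std::sort
#include <cassert>     // assert
#include <cstddef>     // std::size_t
#include <functional>  // std::greater
#include <vector>      // std::vector

// Rounds value up to the nearest multiple of alignment.
std::size_t AlignUp(std::size_t value, std::size_t alignment)
{
  assert(alignment > 0);
  return (value + alignment - 1) / alignment * alignment;
}

// Simplified model (assumption): each field is aligned on a multiple of its
// own size, and the total is rounded up to a multiple of the largest field.
// All field sizes must be greater than zero.
std::size_t StructSize(const std::vector<std::size_t> &fieldSizes)
{
  std::size_t offset = 0;
  std::size_t maxAlign = 1;
  for (std::size_t size : fieldSizes)
  {
    offset = AlignUp(offset, size) + size;  // padding before the field, then the field
    maxAlign = std::max(maxAlign, size);
  }
  return AlignUp(offset, maxAlign);         // trailing padding for use in arrays
}

// Returns the field sizes sorted in decreasing order.
std::vector<std::size_t> SortedDescending(std::vector<std::size_t> fieldSizes)
{
  std::sort(fieldSizes.begin(), fieldSizes.end(), std::greater<std::size_t>());
  return fieldSizes;
}
```

With the 64-bit field sizes of the example (int 4, size_t 8, short 2, pointer 8, char 1), StructSize({4, 8, 2, 8, 1}) gives 40, and the same call on SortedDescending({4, 8, 2, 8, 1}) gives 24. The model does not cover arrays, nested structures or packing directives.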

Diagnosis

The PVS-Studio tool can find structures in the code of 64-bit applications whose size can be reduced by rearranging their fields. The analyzer issues the diagnostic message V401 for such non-optimal structures.

The analyzer does not always warn about inefficient structures, because it tries to avoid producing too many unnecessary warnings. In particular, it does not issue a message for complex derived classes, because there are usually very few such objects. For example:

class MyWindow : public CWnd {
  bool m_isActive;
  size_t m_sizeX, m_sizeY;
  char m_color[3];
  ...
};

You could reduce this class's size, but there is no practical point in doing so.

The course authors: Andrey Karpov (karpov@viva64.com), Evgeniy Ryzhkov (evg@viva64.com).

The rightholder of the course "Lessons on development of 64-bit C/C++ applications" is OOO "Program Verification Systems". The company develops software in the sphere of source program code analysis. The company's site: http://www.viva64.com.

Contacts: e-mail: support@viva64.com, Tula, 300027, PO box 1800.