[C++] Structure memory alignment rules

1. Structure memory alignment (important)

Structure memory alignment is the set of rules that determines the size of a structure. It is a very common topic in campus-recruitment written tests and interviews, so it is worth studying carefully.

Before learning the rules, let's first try two questions on calculating the size of a structure and see whether you can get them right:

// Calculate the size of each structure
#include <stdio.h>
struct S1 
{
char c1;
int i;
char c2;
};

struct S2 
{
char c1;
char c2;
int i;
};

int main()
{
printf("%zu\n", sizeof(struct S1));  // sizeof yields a size_t, so use %zu
printf("%zu\n", sizeof(struct S2));
return 0;
}


I know you are eager for the answer, but be patient. Let's go through the rules first and then work out where the answer comes from.

Structure memory alignment rules

Regarding structure memory alignment rules, most reference materials say this:

  1. The first member is placed at offset 0 from the start of the structure variable.

  2. Every other member is aligned to an offset that is an integer multiple of its alignment number. For example, if the alignment number of a member is 4, it can only be stored at offsets 0, 4, 8, 12…

    • Alignment number = the smaller of the compiler's default alignment number and the size of the member
    • The default alignment number of VS is 8
    • Compilers such as GCC have no default alignment number; there, the alignment number of a member is its own natural alignment, which for basic types is usually its size.
  3. The total size of the structure is an integer multiple of the maximum alignment number (the largest alignment number among all members).

  4. If a structure is nested, the nested structure is aligned to an integer multiple of its own maximum alignment number, and the overall size of the outer structure is an integer multiple of the largest alignment number of all (including the alignment numbers of the nested structure's members).

Now let me boil the rules down to three steps:
a. First calculate the alignment number of each member (and note the maximum)
b. Work out the offset of each member in turn
c. Calculate the total size of the structure

Now that we know the alignment rules, let's go back to the exercise above:

struct S1 
{
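char c1;  // size 1, default alignment number 8 -> alignment number 1
int i;    // size 4, default alignment number 8 -> alignment number 4
char c2;  // size 1, default alignment number 8 -> alignment number 1
// the maximum alignment number is 4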
};

Analysis process:
Assume struct S1 starts at offset 0. By the rule that the first member sits at offset 0 of the structure variable, c1 is at offset 0 and occupies one byte.

Other members must be aligned to an offset that is an integer multiple of their alignment number: the alignment number of i is 4, so i cannot start at offset 1. It starts at offset 4 and occupies four bytes (offsets 4~7); offsets 1~3 are left unused as padding.

By the same rule, the alignment number of c2 is 1, so c2 is stored right after i, at offset 8, and occupies one byte.

The total size of the structure is an integer multiple of the maximum alignment number: the members occupy offsets 0~8, which is 9 bytes. The maximum alignment number is 4, so the total size must be a multiple of 4; the smallest multiple of 4 that is not less than 9 is 12, so the size of the entire structure is 12 bytes.

struct S2 
{
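char c1;  // size 1, default alignment number 8 -> alignment number 1
char c2;  // size 1, default alignment number 8 -> alignment number 1
int i;    // size 4, default alignment number 8 -> alignment number 4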
};

Analysis process:
Assume struct S2 starts at offset 0. By the rule that the first member sits at offset 0 of the structure variable, c1 starts at offset 0 and occupies one byte.

Other members must be aligned to an offset that is an integer multiple of their alignment number: c2 has an alignment number of 1, so it is stored right after c1, at offset 1, and occupies one byte. The alignment number of i is 4, so i starts at the next integer multiple of 4, offset 4, and occupies 4 bytes (offsets 4~7); offsets 2~3 are padding.

After storage, offsets 0~7 are occupied, 8 bytes in total. The maximum alignment number is 4 and 8 is an integer multiple of 4, so the size stays at 8 bytes.

2. offsetof macro (finding the structure offset)

Introduction to offsetof

offsetof is a macro defined by the C language that gives the offset of a member within a structure. Its header file is <stddef.h>. Because offsetof is used in the same way as a function, it is often mistaken for one; in VS we can right-click offsetof and choose "Go to Definition" to see how it is implemented.

Parameters of offsetof

size_t offsetof(structure type name, member name);

Use of offsetof

#include <stdio.h>
#include <stddef.h>  // header that defines offsetof
struct S1
{
char c1;
int i;
char c2;
};

struct S2
{
char c1;
char c2;
int i;
};

int main()
{
printf("%zu\t", offsetof(struct S1, c1));  // offsetof yields a size_t, so use %zu
printf("%zu\t", offsetof(struct S1, i));
printf("%zu\n", offsetof(struct S1, c2));

printf("%zu\t", offsetof(struct S2, c1));
printf("%zu\t", offsetof(struct S2, c2));
printf("%zu\n", offsetof(struct S2, i));

return 0;
}


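When compiling as C++, the first parameter of offsetof can also be written simply as S1 and S2. In C the struct keyword is required unless the structure has been given a typedef name.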

Simulated implementation of offsetof

Let's take struct S1 above as an example. From the analysis above, we already know that the size of struct S1 is 12 and that c1, i and c2 sit at offsets 0, 4 and 8.

Observe that the offset of a member = the address of the member - the starting address of the structure. For example, the address of i in struct S1 minus the starting address of the structure gives 4, the offset of i. If the structure started at address 0, the offset of a member would simply equal the address of the member. So we can cast 0 to a pointer to the structure type, take the address of the member through it, and convert that address to size_t to get the offset. (This is the classic trick; strictly speaking the standard does not guarantee that it works, so real code should use offsetof from <stddef.h>.) The code is as follows:

#include <stdio.h>
// Pretend a structure of this type lives at address 0; the member's address is then its offset
#define OFFSETOF(type, member) ((size_t)&(((type*)0)->member))
struct S1
{
char c1;
int i;
char c2;
};

int main()
{
printf("%zu\n", OFFSETOF(struct S1, c1));
printf("%zu\n", OFFSETOF(struct S1, i));
printf("%zu\n", OFFSETOF(struct S1, c2));
return 0;
}


3. Why does memory alignment exist?

From the examples above, we can see that structure memory alignment wastes a certain amount of memory, yet a computer is supposed to avoid wasting resources. So why does memory alignment exist? Most reference materials give these reasons:

  1. Platform reasons (portability): not all hardware platforms can access any data at any address; some platforms can only fetch certain types of data at certain addresses, and otherwise raise a hardware exception.
  2. Performance reasons: data structures (especially stacks) should be aligned on natural boundaries whenever possible. To access unaligned memory the processor may need two memory accesses, whereas aligned memory needs only one. Memory alignment therefore improves access efficiency.
  3. In short: structure memory alignment trades space for time.

Here I will explain the second reason in more detail:

As we all know, machines are commonly divided into 32-bit and 64-bit machines. The 32 and 64 refer to the bit width of the CPU, which corresponds to its word length, and the word length determines how many bytes the CPU reads in one access. Let's take a 32-bit machine as an example:

A 32-bit machine reads four bytes at a time (offsets 0~3, then 4~7, and so on). Without memory alignment, i in struct S1 would be stored at offsets 1~4, straddling two words, so two reads would be needed to fetch it. With memory alignment, i sits at offsets 4~7 and only one read is needed.

Tips for designing structures

Now that we understand the alignment rules, is there a way to satisfy them and still save as much space as possible when designing a structure? There is: keep the members that take up little space together. In the exercise above, c1 and c2, which each occupy one byte, are placed next to each other in struct S2, so struct S2 is four bytes smaller than struct S1.

4. Modify the default alignment number

We can use the "#pragma pack(num)" directive to change the default alignment number in VS. For example:

#include <stdio.h>

#pragma pack(8)  // set the default alignment number to 8
struct S1
{
char c1;
int i;
char c2;
};
#pragma pack()  // cancel the setting and restore the default

#pragma pack(1)  // set the default alignment number to 1
struct S2
{
char c1;
int i;
char c2;
};
#pragma pack()  // cancel the setting and restore the default

int main()
{
// What is the output?
printf("%zu\n", sizeof(struct S1));
printf("%zu\n", sizeof(struct S2));
return 0;
}


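In struct S2, the "#pragma pack(1)" directive sets the default alignment number to 1 (equivalent to no alignment), so its members are stored back to back and its size is 6. struct S1 is declared with a default alignment number of 8, so its size is still 12.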

5. Structure size calculation exercises

Exercise 1

#include <stdio.h>

struct S3
{
double d;
char c;
int i;
};

int main()
{
printf("%zu\n", sizeof(struct S3));
return 0;
}

Careful: double is 8 bytes, which is easy to trip over.

d is stored starting from offset 0 and occupies 8 bytes, so 0~7; c is stored right after d and occupies one byte, so 8; i starts at the next integer multiple of 4, that is 12, and occupies 4 bytes, so 12~15. Offsets 0~15 total 16 bytes, and 16 is a multiple of the maximum alignment number 8, so the size is 16.

Exercise 2

#include <stdio.h>

struct S3
{
double d;
char c;
int i;
};

struct S4
{
char c1;
struct S3 s3;
double d;
};

int main()
{
printf("%zu\n", sizeof(struct S4));
return 0;
}

c1 is stored at offset 0 and occupies one byte, so 0. We calculated above that struct S3 occupies 16 bytes, and because a nested structure is aligned to an integer multiple of its own maximum alignment number, s3 starts at the next integer multiple of 8, offset 8, so 8~23. d starts at the next integer multiple of 8, offset 24, and occupies 8 bytes, so 24~31. That is 32 bytes in total, which is an integer multiple of the maximum alignment number 8, so the size is 32.

Exercise 3

#include <stdio.h>

#pragma pack(4)
struct tagTest1
{
    short a;
    char d;
    long b;
    long c;
};
struct tagTest2
{
    long b;
    short c;
    char d;
    long a;
};
struct tagTest3
{
    short c;
    long b;
    char d;
    long a;
};
#pragma pack()

int main(int argc, char* argv[])
{
    struct tagTest1 stT1;
    struct tagTest2 stT2;
    struct tagTest3 stT3;

    printf("%zu %zu %zu", sizeof(stT1), sizeof(stT2), sizeof(stT3));
    return 0;
}


Here are the offsets occupied by each member, assuming long is 4 bytes as it is in VS. See whether they match yours!

stT1:

a: 0~1, d: 2, b: 4~7, c: 8~11. Total: 0~11 = 12 bytes (a multiple of 4);

stT2:

b: 0~3, c: 4~5, d: 6, a: 8~11. Total: 0~11 = 12 bytes (a multiple of 4);

stT3:

c: 0~1, b: 4~7, d: 8, a: 12~15. Total: 0~15 = 16 bytes (a multiple of 4);

Woohoo! I hope this article on structure alignment rules helps you!


Origin blog.csdn.net/weixin_62985813/article/details/132782632