Memory Alignment
1. How the CPU accesses memory
The CPU does not access memory one byte at a time. It treats memory as a sequence of blocks, which may be 2, 4, 8, or 16 bytes in size, and reads one block at a time. Every memory access has a fixed cost, so reducing the number of accesses improves performance. The CPU therefore typically accesses memory in units of 2, 4, 8, 16, or 32 bytes. The size of this access unit is called the memory access granularity.
For example, with a memory access granularity of 4 bytes, first read 4 bytes into a register starting at address 0, and then read 4 bytes into a register starting at address 1.
When the read starts at address 0, the address is aligned and the read completes in a single access. When the read starts at an unaligned address (address 1 here), two accesses are needed to fetch the data.
After the two reads, the data from bytes 0-3 must be shifted up by 1 byte and the data from bytes 4-7 shifted down by 3 bytes. The two results are then merged into the final register value.
Unaligned data therefore requires a lot of extra work, which adds considerable overhead and greatly reduces CPU performance. For this reason, some processors simply refuse to do this work for you.
Memory alignment principles
1: Data member alignment: for the data members of a structure (struct) or union, the first data member is placed at offset 0. Each subsequent data member is stored starting at an integer multiple of that member's size, or of the size of its sub-members if it has any (such as an array or a structure). For example, where int is 4 bytes, an int is stored starting at an address that is an integer multiple of 4.
2: Structures as members: if a structure has other structures as members, each structure member is stored starting at an integer multiple of the size of its largest internal element. (If struct a contains struct b, and b contains char, int, double and other elements, then b should be stored starting at an integer multiple of 8.)
3: Finishing up: the total size of the structure, that is, the result of sizeof, must be an integer multiple of the size of its largest internal member. Any shortfall is filled with padding.
Example:
1.
struct StructOne {
    double b;   // 8 bytes
    char a;     // 1 byte
    short d;    // 2 bytes, after 1 byte of padding
    int c;      // 4 bytes
} MyStruct1;
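The largest member is b (8), so the total size must be a multiple of 8 and is padded if it falls short. a (1) is followed by d (2), which must start at a multiple of 2, so 1 byte of padding is inserted after a. Together they occupy 4 bytes, which is exactly where c (4) needs to start, so c packs into the same 8-byte unit as a and d.
b(8) + [a(1) + 1 + d(2) + c(4)] = 16;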
2.
struct StructTwo {
    double b;   // 8 bytes
    char a;     // 1 byte
    int c;      // 4 bytes, after 3 bytes of padding
    short d;    // 2 bytes, followed by 6 bytes of padding
} MyStruct2;
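MyStruct2:
The largest member is b (8), so the total size must be a multiple of 8 and is padded if it falls short. a (1) is followed by c (4), which must start at a multiple of 4, so 3 bytes of padding are inserted after a and the two together fill 8 bytes. d (2) then starts a new 8-byte unit and is followed by 6 bytes of padding.
b(8) + [a(1) + 3 + c(4)] + [d(2) + 6] = 24;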
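3.
struct StructThree {
    char a;                // 1 byte, offset 0, followed by 7 bytes of padding
    struct StructTwo s2;   // 24 bytes, offsets 8-31 (the type of MyStruct2)
    int c;                 // 4 bytes, offsets 32-35
    short d;               // 2 bytes, offsets 36-37, followed by 2 bytes of padding
} MyStruct3;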
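MyStruct3:
The largest member inside the nested MyStruct2 structure is b (8), so the nested structure must start at a multiple of 8. It comes right after a (1), so a is followed by 7 bytes of padding. The MyStruct2 structure itself is 24 bytes. c (4) + d (2) = 6, so 2 more bytes of padding are added to reach a multiple of 8.
[a(1) + 7] + 24 + [c(4) + d(2) + 2] = 40;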
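4.
struct StructFour {
    int c;                  // 4 bytes
} MyStruct4;
struct StructFive {
    char a;                 // 1 byte, offset 0, followed by 3 bytes of padding
    struct StructFour s4;   // 4 bytes, offsets 4-7 (the type of MyStruct4)
    int c;                  // 4 bytes, offsets 8-11
    short d;                // 2 bytes, offsets 12-13, followed by 2 bytes of padding
} MyStruct5;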
MyStruct5:
The largest member inside MyStruct4 is c (4), so MyStruct4 is padded to a multiple of 4 and must start at an integer multiple of 4. a (1) is therefore followed by 3 bytes of padding, and the nested MyStruct4 structure starts at offset 4 and is itself 4 bytes: a(1) + 3 + MyStruct4(4) = 8. c (4) + d (2) = 6, so 2 more bytes of padding are added.
[a(1) + 3 + 4] + [c(4) + d(2) + 2] = 16;