ARM architecture and C language (Wei Dongshan) study notes (7) - unions, memory alignment issues, bit fields, header files


1. How are structures and unions stored in memory?

(1) Structure

A structure is stored in memory much like an array: it occupies one contiguous block of memory, and each member takes up a fixed number of bytes within it.

When a structure is defined, the compiler determines its memory layout from the types and order of its members. Members are stored in the order in which they are declared, and each member starts at an address that is a multiple of its alignment requirement (for basic types on most platforms this equals the type's size; for an array it is the alignment of the element type). When the members have different types, the compiler may insert unused padding bytes between them so that every member is properly aligned and can be accessed efficiently.

For example, consider the following struct definition:

struct person {
    char name[20];
    int age;
    float height;
};

Assume that the structure starts at address 0x1000 on a little-endian machine, and that char occupies 1 byte, int occupies 4 bytes, and float occupies 4 bytes. The members are then stored in memory in the following order:

0x1000: name[0]
0x1001: name[1]
...
0x1013: name[19]
0x1014: age (low byte)
0x1015: age
0x1016: age
0x1017: age (high byte)
0x1018: height (low byte)
0x1019: height
0x101A: height
0x101B: height (high byte)
As the layout shows, the members are stored in declaration order, and each member starts at a multiple of its alignment requirement: name needs only 1-byte alignment, while age and height start at 0x1014 and 0x1018, both multiples of 4. In this particular structure no padding is needed. Aligned storage lets the CPU access each member efficiently.

(2) Union

All members of a union share the same memory: every member starts at the union's first byte, regardless of the order in which the members are declared. Because the storage is shared, only the member most recently written holds a meaningful value; writing one member overwrites the bytes of the others. Unions therefore need to be used with care.

For example, consider the following union definition:

union data {
    int i;
    float f;
    char str[20];
};

Suppose the union starts at address 0x1000 on a little-endian machine, and that int occupies 4 bytes, float occupies 4 bytes, and char occupies 1 byte. Every member then starts at 0x1000, and the same bytes are interpreted in one of the following ways:

0x1000: i (low byte)
0x1001: i
0x1002: i
0x1003: i (high byte)
or
0x1000: f (low byte)
0x1001: f
0x1002: f
0x1003: f (high byte)
or
0x1000: str[0]
0x1001: str[1]
...
0x1013: str[19]
As shown, all members of the union occupy the same memory, and each of them starts at the union's first byte. Only the member most recently written holds a meaningful value. For example, if the int member is written first and the float member is written afterwards, the value originally stored in the int member is overwritten. Unions therefore need to be used with care.

(3) The occupied space of the two in memory

Take this code as an example:

#include <stdio.h>

struct example_struct {
    // structure
    int a;
    char b;
    double c;
};

union example_union {
    // union
    int a;
    char b;
    double c;
};

int main(void) {
    struct example_struct s;
    printf("Size of struct: %zu bytes\n", sizeof(s));

    union example_union u;
    printf("Size of union: %zu bytes\n", sizeof(u));
    return 0;
}

The result is as follows:

Size of struct: 16 bytes
Size of union: 8 bytes

It can be seen that:
(1) The structure example_struct contains an int, a char, and a double, and it occupies 16 bytes in memory: 4 bytes for a, 1 byte for b followed by 3 bytes of padding, and 8 bytes for c.
(2) The largest member of the union example_union is the double, so the union occupies 8 bytes in memory.

In other words, the size of a structure is the sum of the sizes of its members plus any padding required by memory alignment.
All members of a union share the same memory, so the size of a union is the size of its largest member, rounded up if necessary to a multiple of the union's alignment.

2. Memory alignment mechanism

1. Raising the question

struct example_struct {
    // structure
    int a;
    char b;
    double c;
};

struct example_struct s;
printf("Size of struct: %zu bytes\n", sizeof(s));

According to the result in section 1, the size of the structure s is 16 bytes: it contains an int, a char, and a double, which take up 4 bytes + 4 bytes + 8 bytes. But a char is only 1 byte, so why does it take up 4 bytes here?
Only the first of those 4 bytes actually holds b. The remaining 3 bytes are unused padding, inserted so that c starts at an 8-byte boundary.

2. Explanation

Memory alignment means that, when memory is allocated and used, data is placed at addresses that follow certain rules, which improves the performance and reliability of the program.

In C, every variable occupies a certain number of bytes; for example, an int usually occupies 4 bytes and a double usually occupies 8 bytes. To make memory access efficient, variables are stored at addresses aligned according to specific rules, which reduces the number of memory accesses needed and improves program performance.

The alignment rules are determined by the hardware architecture together with the platform's ABI and the compiler. On x86 and ARM systems, basic types are typically aligned to their own size: for example, an int is usually placed at a 4-byte-aligned address, and a double is usually placed at an 8-byte-aligned address.

Many compilers, including GCC, Clang, and MSVC, support the #pragma pack(n) directive to change the default alignment of structure members, where n is the maximum alignment in bytes. For example, #pragma pack(1) sets the alignment to 1 byte, which removes the padding between members. This directive is a compiler extension rather than part of standard C, and packed structures can hurt performance and portability (on some processors, including some ARM cores, unaligned accesses are slow or cause faults), so it should be used with caution.


Without memory alignment, a value that straddles two aligned memory words may have to be fetched with two separate reads, after which the CPU must stitch the two pieces together. This greatly reduces the efficiency of the CPU.

3. Bit field

(1) What is a bit field

A bit field is a structure or union member in C whose width is specified in bits, so it can occupy a chosen number of bits instead of an entire byte or word. Bit fields are usually used to save memory or to communicate with hardware.

A bit field is declared by following the member name with a colon (:) and a width. The syntax is as follows:

struct {
    type [member_name] : width;
};

Among them, type indicates the basic data type of the bit field, member_name indicates the name of the bit field (optional), and width indicates the number of bits occupied by the bit field. For example:

struct {
    unsigned int flag : 1;
    unsigned int value : 15;
};

The above code defines a structure that contains two bit field members: flag and value. Among them, the flag occupies 1 bit, and the value occupies 15 bits.

The following points need attention when using bit fields:
The width of a bit field cannot exceed the width of its underlying type.
The order and placement of bit fields within a storage unit are decided by the compiler (they are implementation-defined); some compilers let the #pragma pack directive influence how the enclosing structure is packed.
Because the behavior of bit fields varies between compilers, use them with caution, especially when the layout must match hardware registers or data exchanged with other systems.

(2) Application of bit fields

Bit fields can be used to save memory space, especially when you need to store a large amount of bool type or enumeration type data. Here are some examples of bitfield applications:

1. Store Boolean data

The Boolean type has only two values: true and false. Multiple Boolean data can be compressed into one byte by using a bit field, thereby saving memory space.

struct bool_fields {
    unsigned int a : 1;
    unsigned int b : 1;
    unsigned int c : 1;
    unsigned int d : 1;
};

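In this example, we define a structure bool_fields with four 1-bit members that act as Boolean flags. Together they need only 4 bits, so they fit within a single byte.
The size of the structure is still typically 4 bytes, because the fields are declared as unsigned int and the compiler allocates a whole unsigned int as the storage unit. The members a/b/c/d occupy only 4 bits of it; with GCC on little-endian ARM or x86 these are the lowest 4 bits of the lowest byte, but the exact placement is implementation-defined.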

2. Store data of enumerated type

Enumeration types usually have only a few values, and using bit fields can compress data of multiple enumeration types into one byte, thereby saving memory space.

enum color { RED, GREEN, BLUE };

struct color_fields {
    enum color a : 2;
    enum color b : 2;
    enum color c : 2;
    enum color d : 2;
};

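In this example, we define a structure color_fields with four members of the enumeration type. Each member takes 2 bits, so all four need only 8 bits in total. Note that standard C only guarantees bit fields of type _Bool, signed int, and unsigned int; enum bit fields are accepted by common compilers such as GCC, but if the compiler treats the enum as a signed type, a 2-bit field cannot hold the value 2 (BLUE).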

3. Store bitmap data

A bitmap is a data structure used to store images, which represents each pixel in the image as a binary number. Bitfields can be used to compress bitmap data into a smaller storage space, thereby reducing storage and transmission overhead.

struct bitmap {
    unsigned int width : 10;
    unsigned int height : 10;
    unsigned char data[1024];
};

In this example, we define a bitmap structure that contains the bitmap's width and height. Width and height are 10 bits each, so the maximum value each can represent is 1023. The bitmap data is stored in a 1024-byte array of unsigned char.

4. Header file

1. The concept of header files

A header file in C is a file containing function declarations, variable declarations, macro definitions, and type declarations that other C files can use. A source file pulls in the contents of a header with the #include directive.

The following are some commonly used C language header files:

stdio.h: declares input/output functions and macros, such as printf, scanf, puts, and fgets (gets was removed in C11 because it cannot be used safely).
stdlib.h: declares general-purpose functions and types, such as malloc, calloc, realloc, exit, rand, and srand.
string.h: declares string and memory handling functions, such as strcpy, strcat, strlen, strcmp, memset, and memcpy.
math.h: declares mathematical functions, such as sqrt, sin, cos, and exp. Standard C does not define a PI constant; many platforms provide M_PI as an extension.
time.h: declares time-related functions and types, such as time, clock, strftime, and struct tm.
ctype.h: declares character classification and conversion functions, such as isalpha, isdigit, isspace, tolower, and toupper.

2. extern keyword

(1) The role of extern

extern is a keyword in C language, which is used to declare that a variable or function is defined in other files and can be used in the current file.

Specifically, when we use a variable or function defined in another file in one file, we need to use the extern keyword to declare it. For example, if we use a variable x in the main.c file, and this variable is defined in another file func.c, then we need to use the extern keyword to declare it in the main.c file:

// main.c
#include <stdio.h>

extern int x; // declare that x is defined in another file

int main(void) {
    printf("%d\n", x); // use x
    return 0;
}

// func.c
int x = 10; // define x

void func(void) {
    // ...
}

In the above example, main.c uses the variable x and declares it with the extern keyword. This tells the compiler that x is defined in another file, and the linker associates the declaration with the actual definition at link time.

In addition to variables, the extern keyword can also be used to declare that functions are defined in other files, for example:

// main.c
#include <stdio.h>

extern void func(void); // declare that func is defined in another file

int main(void) {
    func(); // call func
    return 0;
}

// func.c
void func(void) {
    // ...
}

In actual programming, extern declarations are usually placed in header files so that variables and functions can be shared among multiple source files.

(2) Can the variable be defined in the .h file?

Answer: In C, defining variables in header files (.h files) is generally not recommended, because a header is included by multiple source files. If a variable is defined in the header, every source file that includes it gets its own definition of that variable, which typically causes a "multiple definition" error at link time or leads to hard-to-diagnose behavior.

Usually, header files contain only function declarations, macro definitions, and type definitions. In other words, a header file should contain declarations, not definitions of variables or functions.

If a variable needs to be shared among multiple source files, define it in exactly one source file and declare it with the extern keyword wherever else it is used, ideally in a header that those files include. The linker then resolves all references to the single definition, which avoids duplicate and inconsistent definitions.


Origin blog.csdn.net/qq_53092944/article/details/131886369