Detailed explanation of custom types (structure, enumeration, union)

Foreword

  This issue introduces custom (user-defined) types in C. In most exercises, a single variable is enough to represent a piece of data. But what if one thing has several pieces of information attached to it? For example, suppose we need to record the name, student number, gender, and age of many students. Do we have to define name1, name2, and so on, one variable per field per student? That quickly becomes a hassle. Structures exist to solve exactly this kind of problem, so let's take a look.

1. Structure

1.1 The concept of structure type

  A structure is a collection of values called member variables. The members of a structure can be variables of different types.

1.2 Declaration of structure

struct Stu
{
    char name[20]; // name
    int age;       // age
    char sex[5];   // gender
    char id[20];   // student number
}; // the semicolon must not be omitted

  struct is the keyword that declares a structure and cannot be omitted. Stu is the tag (name) chosen by the user, and together struct Stu is used as a type name in the same way that int is used in int a. The members between the braces are the pieces of data the user wants to record.
  The struct keyword cannot be omitted, but the tag can. A structure declared without a tag is called an anonymous structure type.

// anonymous structure types
struct
{
    int a;
    char b;
    float c;
} x;
struct
{
    int a;
    char b;
    float c;
} a[20], *p;

  Look at the two anonymous structures above: are they the same type? The answer is no. Because neither declaration has a tag, the compiler has nothing to identify them by and treats them as two distinct types, even though their members are identical. An assignment such as p = &x is therefore invalid: the compiler diagnoses incompatible pointer types.

1.3 Self-reference of structure

  As the name suggests, a self-referencing structure is one that refers to its own type. Let's look at this first:

struct Node
{
    int data;
    struct Node next;
};

  Is this declaration correct? No. Think about what it would mean: the member next is itself a struct Node, so it contains its own int data and another struct Node next, which in turn contains another, and so on without end. The size of such a structure could never be determined, so the compiler rejects it.

struct Node
{
    int data;
    struct Node* next;
};

  This version is different. The member next is only a pointer, so it has a fixed size, and it stores the address of the next structure. Fill in the address of the next node when there is one, and store NULL to end the chain.

1.4 Definition and initialization of structure variables

  Take student information as an example:

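// first form
struct Stu
{
    char name[20]; // name
    int age;       // age
    char sex[5];   // gender
    char id[20];   // student number
}; // the semicolon must not be omitted

// id is a char array, so it is initialized with a string, not the integer 1001
struct Stu s = { "Zhang San", 20, "male", "1001" };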

// second form
struct Stu
{
    char name[20]; // name
    int age;       // age
    char sex[5];   // gender
    char id[20];   // student number
} s = { "Zhang San", 20, "male", "1001" }; // id is a char array, so use a string

  Both forms are valid. The first declares the type and defines the variable in two separate steps; the definition is often written inside a function such as main, where s is a local variable. The second defines s together with the type; when this is written outside any function, s is a global variable.

1.5 Structure memory alignment (key point)

  Now that we know how to declare and initialize a structure, how are its members actually laid out in memory? First look at the alignment rules:

  1. The first member is at offset 0 from the start of the structure variable.
  2. Every other member is aligned to an offset that is an integer multiple of a certain number (its alignment number).
    Alignment number = the smaller of the compiler's default alignment and the size of the member.
    • The default value in VS is 8
  3. The total size of the structure is an integer multiple of the maximum alignment number (every member has an alignment number).
  4. If a structure is nested, the nested structure is aligned to an integer multiple of its own maximum alignment number, and the overall size of the outer structure is an integer multiple of the largest alignment number of all members (including those of the nested structure).
  The rules are hard to digest from words alone, so let's work through one:
struct S1
{
    char c1;
    char c2;
    int i;
};

  Let's apply the rules to this code step by step. The first member c1 is placed at offset 0. For the second member c2, rule 2 applies (alignment number = the smaller of the compiler's default alignment and the member's size). c2 is a char with a size of 1 byte, which is smaller than the VS default of 8, so its alignment number is 1 and it must be placed at an integer multiple of 1. The next free offset after c1 is 1, which is a multiple of 1, so c2 is placed at offset 1.
  Now look at i. By the same rule, i is an int with a size of 4 bytes, which is smaller than 8, so its alignment number is 4 and it must be placed at an integer multiple of 4. The next free offset after c2 is 2, which is not a multiple of 4, so we keep moving forward to the first offset that is: offset 4. i occupies offsets 4 to 7, and offsets 2 and 3 are left unused as padding.
  All members are now placed, so we apply rule 3: the total size must be an integer multiple of the maximum alignment number. In S1 the alignment numbers are 1 for the two chars and 4 for the int, so the maximum is 4. The members span offsets 0 to 7, which is 8 bytes, and 8 is already a multiple of 4, so the final size of S1 is 8 bytes.

struct S2
{
    char s;
    struct S1 s1;
    double d;
};

  Now let's look at a nested structure and work through S2 in the same way:
  The first member s is again placed at offset 0. We already know that S1 is 8 bytes in size and has a maximum alignment number of 4. By rule 4 (a nested structure is aligned to an integer multiple of its own maximum alignment number), s1 must be placed at a multiple of 4, so it goes at offset 4 and occupies offsets 4 to 11.
  Now look at d. The size of a double is 8, which equals the VS default of 8, so its alignment number is 8 and it must be placed at a multiple of 8. The next free offset after s1 is 12, and the next multiple of 8 is 16, so d is placed at offset 16 and occupies offsets 16 to 23. The members therefore span 24 bytes. Finally apply rule 3: the alignment number of s is 1, that of s1 is 4, and that of d is 8, so the total size must be a multiple of 8. If it were not, padding would be added at the end until it was. 24 is already a multiple of 8, so the final size of S2 is 24 bytes.

1.5.1 Why does memory alignment exist?

  1. Platform reasons (portability):
  Not all hardware platforms can access arbitrary data at arbitrary addresses. Some can only fetch certain types of data at certain addresses, and raise a hardware exception otherwise.
  2. Performance reasons:
  Data structures (especially the stack) should be aligned on natural boundaries as much as possible. To access unaligned memory the processor may need two memory accesses, whereas an aligned access needs only one.
  In general:
  Structure memory alignment trades space for time.

1.5.2 How to further save space

  Keep the members that occupy little space next to each other as far as possible.

struct S1
{
    char c1;
    int i;
    char c2;
};
struct S2
{
    char c1;
    char c2;
    int i;
};

  S1 occupies 12 bytes and S2 occupies 8. The two structures have exactly the same members, but because of the order of the members S1 needs 6 bytes of padding while S2 needs only 2.

1.6 Modify the default alignment (VS)

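#include <stdio.h>

#pragma pack(8) // set the default alignment to 8
struct S1
{
    char c1;
    int i;
    char c2;
};
#pragma pack() // cancel the setting and restore the default
#pragma pack(1) // set the default alignment to 1
struct S2
{
    char c1;
    int i;
    char c2;
};
#pragma pack() // cancel the setting and restore the default
int main()
{
    // What is the output?
    printf("%zu\n", sizeof(struct S1)); // sizeof yields size_t, so print with %zu
    printf("%zu\n", sizeof(struct S2));
    return 0;
}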

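  The answers are 12 and 6. With a default alignment of 8, S1 is laid out as before and occupies 12 bytes; with a default alignment of 1, every member of S2 has an alignment number of 1, so no padding is inserted and the size is 1 + 4 + 1 = 6 bytes. When the default alignment does not suit a structure, we can change it ourselves.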

1.7 Structure parameter passing

  When a function is called, its arguments are pushed onto the stack, which costs both time and space. If a structure object is passed by value and the structure is large, copying it onto the stack is relatively expensive and performance suffers. So when passing a structure to a function, prefer to pass its address.

1.8 Bit fields

  A bit field is declared like a structure, with two differences:

  1. The members of a bit field must be of type int, unsigned int, or signed int.
  2. The member name of a bit field is followed by a colon and a number.
struct A
{
    int a:2;
    int b:5;
    int c:10;
    int d:30;
};

  The number after the colon is the number of bits the member may occupy. For example, an ordinary int a occupies 4 bytes (32 bits), but with int a:2 the member a occupies only 2 bits.

1.8.1 Memory allocation for bit fields

  1. The members of a bit field can be of type int, unsigned int, signed int, or char (types belonging to the integer family).
  2. Space for a bit field is allocated as needed, 4 bytes (int) or 1 byte (char) at a time.
  3. Bit fields involve many implementation-dependent factors and are not portable, so programs that care about portability should avoid them.
  4. If the bits remaining in the current unit are not enough for the next member, a new unit is allocated (a new byte in the char example below).
    For example:
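struct S
{
    char a:3;
    char b:4;
    char c:5;
    char d:4;
};

int main(void)
{
    struct S s = { 0 };
    s.a = 10; // binary 1010 needs 4 bits; only the low 3 bits (010) are kept
    s.b = 12; // binary 1100 fills all 4 bits
    s.c = 3;  // binary 00011 in 5 bits
    s.d = 4;  // binary 0100 in 4 bits
    return 0;
}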


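  s starts out with every bit set to 0, and the values are then stored one member at a time. In VS, the bits inside each byte are used from the low-order end towards the high-order end: a (010) and b (1100) share the first byte, which becomes 0110 0010 (0x62); c needs 5 bits but only 1 bit is left, so it goes in a new byte, 0000 0011 (0x03); d needs 4 bits but only 3 are left, so it also goes in a new byte, 0000 0100 (0x04).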

1.8.2 Cross-platform problems with bit fields

  1. Whether an int bit field is treated as signed or unsigned is implementation-defined.
  2. The maximum number of bits in a bit field is not fixed. (It can be at most 16 on a 16-bit machine and at most 32 on a 32-bit machine, so a width of 27 causes problems on a 16-bit machine.)
  3. Whether the members of a bit field are allocated from left to right or from right to left in memory is not defined by the standard.
  4. When a structure contains two bit fields and the second is too large to fit in the bits left over by the first, it is not specified whether the leftover bits are discarded or used.
  5. Summary: compared with an ordinary structure, a bit field can achieve the same effect while saving a good deal of space, but it has cross-platform problems.

2. Enumeration

2.1 The concept of enumeration

  Enumeration, as the name suggests, means listing things one by one: all the possible values are listed. Real life offers many examples. A week has exactly 7 days, Monday to Sunday, which can be listed one by one; gender (male, female, undisclosed) can be listed too. An enumeration can be used in such cases.

2.2 Definition of enumeration

enum Day // days of the week
{
    Mon,
    Tues,
    Wed,
    Thur,
    Fri,
    Sat,
    Sun
};
enum Sex // gender
{
    MALE,
    FEMALE,
    SECRET
};
enum Color // colors
{
    RED,
    GREEN,
    BLUE
};

  The enum Day, enum Sex, and enum Color defined above are all enumeration types. The names inside {} are the possible values of the enumeration type, also called enumeration constants. An enumeration constant cannot be changed: once it has been defined, it cannot be assigned to later, for example in the main function. Each constant has a value; by default the values start from 0 and increase by 1 each time, but a value can also be assigned in the definition. For example:

enum Color // colors
{
    RED = 1,
    GREEN = 2,
    BLUE = 4
};

2.3 Advantages of enumeration

  Why use enumerations? We can already define constants with #define, so why bother? The advantages of enumerations:

  1. They make code more readable and maintainable.
  2. Unlike identifiers defined by #define, enumeration constants have a type and can be type-checked, which is more rigorous.
  3. They prevent naming pollution (encapsulation).
  4. They are easy to debug.
  5. They are easy to use: several constants can be defined at once.

3. Union

3.1 The concept of a union

  A union is also a special custom type. A variable of this type also contains a series of members, with the distinguishing feature that all the members share the same memory (which is why a union is also called a shared body).

3.2 Definition of a union

// declaration of a union type
union Un
{
    char c;
    int i;
};
// definition of a union variable
union Un un;

3.3 Characteristics of a union

  The members of a union share the same memory. The size of a union variable is at least the size of its largest member, because the union must be able to hold that member. In union Un, c and i both start at the same address: c overlays the first byte of i, so writing to one member changes what is read from the other.

3.4 Calculation of union size

  1. The size of a union is at least the size of its largest member.
  2. When the size of the largest member is not an integer multiple of the maximum alignment number, the size is rounded up to an integer multiple of the maximum alignment number.
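#include <stdio.h>

union Un1
{
    char c[5];
    int i;
};
union Un2
{
    short c[7];
    int i;
};

int main(void)
{
    // What is the output?
    printf("%zu\n", sizeof(union Un1)); // sizeof yields size_t, so print with %zu
    printf("%zu\n", sizeof(union Un2));
    return 0;
}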

  The answers are 8 and 16. In Un1, i shares its 4 bytes with c[0], c[1], c[2], and c[3], and c[4] adds a fifth byte, so the largest member occupies 5 bytes. The maximum alignment number is 4 (from int), and by the second rule the size is rounded up to a multiple of 4, giving 8 bytes. Un2 works the same way: the largest member, short c[7], occupies 14 bytes, which is rounded up to 16.

4. Summary

  That is all for this issue on custom types. The key point is still how to calculate the size of a structure. I recommend reading that part several times until it makes sense, and then doing some exercises to consolidate it. I hope everyone has made some progress by getting this far.
  If you found this article useful, give it a like to encourage the blogger. If there is something you don't understand, or you find an error, leave a message in the comment area or send me a private message. That's it for this issue; see you next time.

Origin blog.csdn.net/qq_62321047/article/details/129695759