A detailed explanation of C custom types (structures, enumerations, unions)

Table of contents

Structure

1. Declaration of structure type

2. Self-reference of structure

3. Definition and initialization of structure variables

4. Structure memory alignment

Rules for structure memory alignment:

Why does memory alignment exist?

 Modify the default alignment number 

5. Structure parameter passing

6. Bit fields in structures (bit field padding & portability)

Bit field memory allocation

Cross-platform issues with bit fields

Enumeration

1. Definition of enumeration type

2. Advantages of enumeration

3. Use of enumerations

Union

1. Definition of union type

2. Characteristics of union

3. Calculation of union size


Structure

1. Declaration of structure type

A structure is a collection of values called member variables. The members of a structure can have different types.

For example, to describe a student, we can put his various information into a structure type. 

struct Stu
{
    char name[20]; // name
    int age;       // age
    char sex[2];   // sex
    char id[20];   // student ID
};  // the semicolon is required!

When declaring a structure, you can omit the tag, which gives an anonymous structure type:

struct
{
    int a;
    char b;
    float c;
}x;
struct
{
    int a;
    char b;
    float c;
}*p;

Is it legal to add the following code?

int main()
{
    p=&x;
}

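The result is as follows: the compiler reports that the two pointer types are incompatible (as a warning or an error, depending on the compiler).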

Although the members are the same, the compiler will treat the above two declarations as two completely different types.

If you create a structure that only needs to be used once, you can use anonymous structures, but anonymous structures are generally rarely used. 

2. Self-reference of structure

First, some background on data structures: a data structure describes how data is organized in memory. Here we look at linear structures.

  • Suppose you want to store 1 2 3 4 5. You can store the values in one contiguous block of memory; this is called a sequential list.
  • You can also store each value at an unrelated location and chain the locations together. From 1 you can find 2, from 2 you can find 3, and so on, so once you find 1 you can reach all the subsequent data. This is called a linked list.

 

Each structure that stores data in a linked list is called a node. To get from one node to the next, a first idea is to put the next node inside the current node:

struct Node
{
    int data;
    struct Node next;
};

Is this the right way?

No, this does not work. The next member of the first node would contain the data and next of the second node, whose next would contain the third, and so on without end, so the size of the structure type cannot be computed.

The solution is to store the address of the next node instead. If a structure needs to refer to another structure of the same type, it must do so through a pointer to that type:

struct Node
{
    int data;
    struct Node* next;
};

Let's go one step further.

We can also use typedef to give a structure type a new name. In the following code, an anonymous structure is renamed to Node, and the member next appears to be a pointer to that same type:

typedef struct
{
    int data;
    Node* next;
}Node;

int main()
{
    Node n;
    return 0;
}

But this ignores the order of declaration. The name Node only exists once the typedef declaration is complete, that is, after the closing brace and the name Node. Node* appears inside the structure body, before Node has been introduced, so the compiler does not know the name yet. A name cannot be used before it is declared.

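The fix is to give the structure a tag instead of leaving it anonymous: declare struct Node, use struct Node* inside it, and let typedef introduce the short name Node. The code is as follows:

typedef struct Node
{
    int data;
    struct Node* next; // the tag struct Node is already known here
}Node;

int main()
{
    Node n;
    return 0;
}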

3. Definition and initialization of structure variables

Structure variables can be defined in two ways:

  • together with the type, between the closing brace and the semicolon of the structure declaration;
  • separately, after the type has been declared, for example inside the main function.
struct SN
{
	char c;
	int i;
}sn1,sn2;
int main()
{
    struct SN s1,s2;
}

Initialization uses curly braces containing the values in member order. Members can also be initialized by name inside the braces with designated initializers such as .i and .c, in any order. A variable can be initialized where the structure is declared or where it is defined later, for example in the main function.

struct SN
{
	char c;
	int i;
}sn1 = { 'q', 100 }, sn2 = {.i=200, .c='w'};

int main()
{
    struct SN s1 = { 'q', 100 } , s2 = {.i=200, .c='w'};
}

A structure can also contain arrays and other structures; each of them is initialized with its own nested pair of braces.

To access a member of a structure variable, write the variable name followed by . and the member name; the same applies to a structure nested inside a structure. To access a member through a structure pointer, use -> instead.

#include <stdio.h>

struct SN
{
	char c;
	int i;
}sn1,sn2;

struct S
{
    double d;
    struct SN sn;
    int arr[10];
};

int main()
{
    struct S s = { 3.14, {'a', 99}, {1,2,3} };
    printf("%lf %c %d\n", s.d, s.sn.c, s.sn.i);
	int i = 0;
	for (i = 0; i < 10; i++)
		printf("%d ", s.arr[i]);

	return 0;
}

Output result:

3.140000 a 99
1 2 3 0 0 0 0 0 0 0

4. Structure memory alignment

#include <stdio.h>

struct S1
{
    char c1;
    int i;
    char c2;
};

struct S2
{
    int i;
    char c1;
    char c2;
};

int main()
{   
    // sizeof yields a size_t, which is printed with %zu
    printf("%zu\n", sizeof(struct S1));
    printf("%zu\n", sizeof(struct S2));
    return 0;
}

Output result: 12 and 8. This does not match the expectation that char occupies one byte and int occupies four bytes: S1 holds two char members and one int member, yet its size is clearly not six bytes. Why is that?

This is where structure memory alignment comes in.

We can use the macro offsetof() to get the offset of a structure member from the start of the structure. It requires the header <stddef.h>. The offsets show how the members are laid out in memory.

printf("%zu\n", offsetof(struct S1, c1));
printf("%zu\n", offsetof(struct S1, i));
printf("%zu\n", offsetof(struct S1, c2));

Output result: 0, 4 and 8.

Let's draw a picture from these offsets. According to offsetof(), structure S1 seems to occupy only nine bytes, but we already know that its size is twelve bytes. Why is this?

From this we can see that structure members are not simply stored one after another in memory, but follow certain alignment rules.

Rules for structure memory alignment:

  1. The first member of a structure is always placed at offset 0 from the start of the structure variable.
  2. Starting from the second member, each member must be placed at an offset that is an integer multiple of its alignment number. Alignment number: the smaller of the member's own size and the default alignment number. The default alignment number in VS is 8. gcc has no default alignment number, so the alignment number is the member's own size.
  3. The total size of the structure must be an integer multiple of the maximum alignment number, which is the largest of the alignment numbers of all members.
  4. If a structure is nested, the nested structure is aligned to an integer multiple of its own maximum alignment number, and the total size of the outer structure is an integer multiple of the largest alignment number among all members (including the members of nested structures).

Now we can explain S1:

  • char c1 is the first member, so it is placed at offset 0.
  • The second member int i is 4 bytes. Compared with the VS default alignment number of 8, the smaller value is 4, so the alignment number of i is 4. i must start at an integer multiple of 4, so it is placed at offset 4 and occupies offsets 4 to 7. Offsets 1 to 3 are padding.
  • The alignment number of c2 is 1, so it can go at any offset; it is placed at the next free offset, 8.
  • The maximum alignment number is 4, so the total size must be an integer multiple of 4. The members end at 9 bytes, so 3 more bytes of padding are added to reach 12. The size of structure S1 is therefore 12 bytes.

Now for S2:

  • The first member int i is placed at offset 0 and occupies four bytes. char c1 has alignment number 1 and is placed at offset 4. char c2 has alignment number 1 and is placed at offset 5.
  • The maximum alignment number is 4, from int i, so the size of the structure must be an integer multiple of 4. The members take 6 bytes, so two bytes of padding are added to reach 8. The size of structure S2 is eight bytes.

Example 1: 

struct S3
{
	double d;
	char c;
	int i;
};

double occupies eight bytes, so d occupies offsets 0 to 7. char c has alignment number 1 and occupies offset 8. int i has alignment number 4, so it is placed at offset 12, the next integer multiple of 4, and occupies offsets 12 to 15. That makes 16 bytes, which is already an integer multiple of the maximum alignment number 8 (from double). The size of structure S3 is 16 bytes.

 Example 2:

struct S4
{
	char c1;
	struct S3 s3;
	double d;
};
  • char c1 occupies one byte and is placed at offset 0.
  • Rule 4 applies: a nested structure is aligned to an integer multiple of its own maximum alignment number, and the total size of the outer structure is an integer multiple of the largest alignment number among all members (including the members of nested structures).
  • The maximum alignment number of structure S3 is 8, so s3 is placed at offset 8 and occupies its 16 bytes, offsets 8 to 23.
  • double d has alignment number 8, so it is placed at offset 24, an integer multiple of 8, and occupies offsets 24 to 31.
  • The members end at 32 bytes, which is already an integer multiple of the maximum alignment number 8, so the size of structure S4 is 32 bytes.

 

Why does memory alignment exist?

1. Platform reasons (portability reasons):
Not all hardware platforms can access any data at any address; some hardware platforms can only fetch certain types of data at certain addresses, otherwise a hardware exception is thrown.
2. Performance reasons:
Data structures (especially stacks) should be aligned on natural boundaries whenever possible. To access unaligned memory the processor needs two memory accesses, whereas an aligned access needs only one.
In general:
Memory alignment of structures trades space for time.
When designing a structure, we want to satisfy alignment and still save space. How to do this:
Group the members that take up less space together as much as possible.

 Modify the default alignment number 

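#include <stdio.h>

#pragma pack(1) // set the default alignment number to 1

struct S
{
	char c1; // size 1, default 1, alignment number 1
	int a;   // size 4, default 1, alignment number 1
	char c2; // size 1, default 1, alignment number 1
};
#pragma pack() // restore the default alignment number

int main()
{
	printf("%zu\n", sizeof(struct S));
	return 0;
}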

Output result: 6. #pragma pack(1) changes the default alignment number to 1, so the alignment number of every member is 1. The members are then not padded and are laid out in memory one after another, for a size of six bytes.

After the structure has been declared with alignment number 1, #pragma pack() cancels the setting and restores the default.

When the default alignment does not suit a structure, we can change the default alignment number ourselves.

5. Structure parameter passing

#include <stdio.h>

struct S
{
    int data[100];
    int num;
};

void print1(struct S tmp)
{
    printf("%d\n",tmp.num);
}

void print2(struct S* ps)
{
    printf("%d\n",ps->num);
}

int main()
{
    struct S s = { {1,2,3}, 100 };
    print1(s);
    print2(&s);
    return 0;
}

Output result: 100 and 100.

The print2 function, which takes a pointer, is preferred:

  • When a function is called, its arguments have to be copied (typically pushed onto the stack), which costs time and space.
  • If a structure object is passed by value and the structure is large, the cost of copying it is relatively high, which degrades performance.

6. Bit fields in structures (bit field padding & portability)

A bit field is declared like a structure member, with two differences:
1. The members of a bit field must be int, unsigned int or signed int (many compilers also accept other integer types such as char).
2. The member name of a bit field is followed by a colon and a number, which is the width of the member in bits.
struct A
{
 int _a:2;
 int _b:5;
 int _c:10;
 int _d:30;
};

Bit field memory allocation

1. Members of a bit field can be of type int, unsigned int, signed int or char (types in the integer family).
2. Space for bit fields is allocated as needed, 4 bytes (int) or 1 byte (char) at a time.
3. Bit fields involve many implementation-defined details and are not portable across platforms. Programs that care about portability should avoid them.

 

 Let’s take a look at this example:

#include <stdio.h>
#include <string.h>

int main()
{
  unsigned char puc[4];
  struct tagPIM
  {
    unsigned char ucPim1;
    unsigned char ucData0 : 1;
    unsigned char ucData1 : 2;
    unsigned char ucData2 : 3;
  }*pstPimData;
  pstPimData = (struct tagPIM*)puc;
  memset(puc,0,4);
  pstPimData->ucPim1 = 2; 
  pstPimData->ucData0 = 3;
  pstPimData->ucData1 = 4;
  pstPimData->ucData2 = 5;
  printf("%02x %02x %02x %02x\n",puc[0], puc[1], puc[2], puc[3]);
  return 0;
}

First, puc is defined as an array of 4 unsigned char.

A structure named tagPIM is defined, which includes:

  • An unsigned char member ucPim1.
  • Three bit fields ucData0, ucData1 and ucData2, which are 1, 2 and 3 bits wide respectively.

Then a pointer named pstPimData of type struct tagPIM* is created.

The address of puc is assigned to pstPimData, so pstPimData now points to the memory of puc.

The memset function sets all bytes of puc to 0.

Then the members of the structure pointed to by pstPimData are set:

  • ucPim1 is set to 2.
  • ucData0 is set to 3. Since it is a 1-bit field, it can only hold 0 or 1. The binary representation of 3 is 11, and only the least significant bit is kept, so it is set to 1.
  • ucData1 is set to 4. It is a 2-bit field. The binary representation of 4 is 100, and only the two least significant bits are kept, so it is set to 0.
  • ucData2 is set to 5. It is a 3-bit field. The binary representation of 5 is 101, which fits in 3 bits, so it is set to 5.

Now, let's see what the memory layout of tagPIM looks like (on a compiler that allocates bit fields starting from the low bits of each byte, as VS does):

ucPim1   | unused | ucData2 | ucData1 | ucData0
8 bits   | 2 bits | 3 bits  | 2 bits  | 1 bit
00000010 | 00     | 101     | 00      | 1

unsigned char puc[4] allocates four bytes, so the memory holds 00000010 00101001 00000000 00000000, and printed in hexadecimal the result is 02 29 00 00.

Cross-platform issues with bit fields

1. Whether an int bit field is treated as signed or unsigned is not specified.
2. The maximum number of bits in a bit field is not fixed. (The maximum is 16 on a 16-bit machine and 32 on a 32-bit machine, so a width of 27 causes problems on a 16-bit machine.)
3. Whether the members of a bit field are allocated from left to right or from right to left in memory is not defined by the standard.
4. When a structure contains two bit fields and the second is too large to fit in the bits left over after the first, it is not specified whether the leftover bits are discarded or used.
Summary:
Compared with an ordinary structure, a bit field can achieve the same effect while saving space, but it has cross-platform problems.

Enumeration

1. Definition of enumeration type

Enumeration, as the name suggests, means listing the possible values one by one.
For example, in real life:
  • A week has exactly 7 days, Monday to Sunday, which can be listed one by one.
  • Gender can be male, female or undisclosed, which can also be listed one by one.
  • A year has 12 months, which can also be listed one by one.

#include <stdio.h>

enum Color
{
	RED,//0
	GREEN,//1
	BLUE//2
};

int main()
{
    enum Color c = GREEN;
	printf("%d\n", RED);
	printf("%d\n", GREEN);
	printf("%d\n", BLUE);

	return 0;
}

Output result: 0, 1 and 2. By default, enumeration constants start from 0 and each one is 1 greater than the previous one.

You can also assign an initial value when defining:

enum Color
{
	RED = 9,  // 9
	GREEN,    // 10
	BLUE      // 11
};

enum Color
{
	RED,       // 0
	GREEN = 9, // 9
	BLUE       // 10
};

2. Advantages of enumeration

Why use enumerations?
We can already define constants with #define, so why use enumerations?
    Advantages of enumerations:
  • They make code more readable and maintainable.
  • Unlike identifiers defined with #define, enumerations have a type, so they can be type-checked, which is more rigorous.
  • They prevent naming pollution (encapsulation).
  • They are easier to debug.
  • They are easy to use: multiple constants can be defined at once.

3. Use of enumerations

enum Color // colors
{
 RED=1,
 GREEN=2,
 BLUE=4
};

enum Color clr = GREEN; // assign only enumeration constants to an enumeration variable to avoid a type mismatch

Union

1. Definition of union type

A union is also a special custom type.
A variable of this type also contains a series of members, with the characteristic that these members share the same memory space (which is why a union is also called a shared type).
// declaration of the union type
union Un
{
 char c;
 int i;
};
// definition of a union variable
union Un un;
// compute the size of the union variable
printf("%zu\n", sizeof(un));

2. Characteristics of union

The members of a union share the same memory space, so the size of a union variable is at least the size of its largest member (the union must at least be able to hold its largest member).
#include <stdio.h>

union Un
{
	char c;
	int i;
};

int main()
{
	printf("%zu\n", sizeof(union Un));
	return 0;
}

Output result: 4.

Why is it 4? Let's print the address of the union variable and of its members to find out.

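#include <stdio.h>

union Un
{
	char c;
	int i;
};

int main()
{
	printf("%zu\n", sizeof(union Un));
	union Un un = { 0 };

	// %p expects a void*, so the addresses are cast explicitly
	printf("%p\n", (void*)&un);
	printf("%p\n", (void*)&(un.i));
	printf("%p\n", (void*)&(un.c));

	return 0;
}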

Output result: the three addresses printed are identical.

Their memory layout looks like this: c occupies the first byte of the four bytes occupied by i.

#include <stdio.h>

union Un
{
	char c;
	int i;
};

int main()
{
	printf("%zu\n", sizeof(union Un));
	union Un un = { 0 };
	un.i = 0x11223344;
	un.c = 0x55; // overwrites the first byte of the shared storage

	printf("%p\n", (void*)&un);
	printf("%p\n", (void*)&(un.i));
	printf("%p\n", (void*)&(un.c));

	return 0;
}

The address of un.i and the address of un.c are identical, and writing un.c overwrites one byte of un.i.

This proves that the members of a union share the same space.


 

A union can be used to determine whether the current machine stores data in big-endian or little-endian order:

#include <stdio.h>

int check_sys()
{
	union
	{
		int i;
		char c;
	}un = {.i = 1};
	return un.c; // the byte of i stored at the lowest address
}

int main()
{
	int ret = check_sys();
	

	if (ret == 1)
		printf("little endian\n");
	else
		printf("big endian\n");

	return 0;
}
  • The check_sys function defines an anonymous union variable un, which contains an integer i and a character c.
  • The characteristic of a union is that all members share the same memory, so only one member is active at a given time.
  • The union un is initialized with .i = 1.
  • The check_sys function returns the value of the character c in the union un, which is the byte of i stored at the lowest address.
  • In the main function, check_sys is called and the return value is stored in the variable ret.
  • Then the code checks whether ret equals 1.
  • If ret equals 1, "little endian" is printed, indicating that the system uses little-endian byte order.
  • If ret does not equal 1, "big endian" is printed, indicating that the system uses big-endian byte order.

3. Calculation of union size

  • The size of a union is at least the size of its largest member.
  • When the size of the largest member is not an integer multiple of the maximum alignment number, the union size is rounded up to an integer multiple of the maximum alignment number.
#include <stdio.h>

union Un1
{
	char c[5];
    // 5 bytes, 1 byte per element; default alignment number 8, alignment number 1
	int i;
    // 4 bytes; default alignment number 8, alignment number 4
};


int main()
{
	printf("%zu\n", sizeof(union Un1));
    // 5 is not an integer multiple of the maximum alignment number 4, so 5 + 3 = 8

	return 0;
}

Origin blog.csdn.net/m0_73800602/article/details/133045752