Custom Types in C: Structures, Enums, Unions

Foreword

To describe something accurately, you usually need several pieces of data, and in daily life data tends to come in groups. For example, to record a student's information we need the student's name, age, gender and so on, but these items do not all have the same type. To handle this, C lets us define custom types that bundle data together according to our needs.

1. Structure

Struct basics:

A structure is a collection of values called member variables. Each member of the structure can be of a different type.

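Declaring structs and creating variables:

struct tag {
    member-list;    // list of members
} variable-list;    // list of variables (global variables)
// The variable list may be empty: to create no variables here,
// put the semicolon directly after the closing brace.

struct tag {
    member-list;    // list of members
};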

Note: When declaring a structure, you must list all the members it contains.

Once the structure is declared, a new data type exists, and its name is the keyword struct followed by the tag. Here we declare a structure in this form (taking a student as an example) and create a variable:

// A student's information includes name, age and sex
struct Stu {
    char name[10];
    int age;
    char sex[5];
};  // This is only a structure type; with this type we can go on to create variables

int main()
{
    struct Stu s1;  // creates a structure variable
    return 0;
}

[Figure: how a variable of a structure type is created]

Where structure variables can be created

There are three ways to create structure variables (again using a student as the example).

The first: create the variable as part of the structure declaration. Declared outside any function like this, it is a global variable.

struct stu {
    char name[10];
    int age;
    char sex[5];
} s1;   // s1 is a global variable

The second: create the variable after the declaration but outside any function. This is also a global variable.

struct stu {
    char name[10];
    int age;
    char sex[5];
};

struct stu s2;

int main()
{
    return 0;
}

The third: create the variable inside a function. This kind of variable is a local variable.

struct stu {
    char name[10];
    int age;
    char sex[5];
};

int main()
{
    struct stu s3;

    return 0;
}

Special declarations

A struct declaration may be incomplete: the tag can be left out.

Anonymous struct type:

struct
{
    char name[5];
    int age;
    float c;
} s1;   // variables of this type can only be created here
// The tag is omitted from this declaration, so this is an anonymous structure type

Points to note:

  • 1. Variables of an anonymous structure type must be created as part of the declaration.

  • 2. If two anonymous structures are declared one after another, even with identical members, the compiler treats them as two different types.

Use typedef to create variables more easily

typedef is an advanced data feature that lets you give a type a name of your own. In this respect it is similar to #define, but there are differences:

  • Unlike #define, a name created by typedef can only stand for a type, not a value
  • typedef is interpreted by the compiler, not the preprocessor
  • Within its limits, typedef is more flexible than #define

When writing out the full type every time we create a variable feels cumbersome, we can use typedef to make creating variables easier:

typedef struct stu {
    char name[10];
    int age;
    char sex[5];
} stu;

int main()
{
    stu s1;     // now we can create variables more conveniently
    return 0;
}

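Initialization of structure variables:

As with other types, a structure variable can be initialized when it is created or given a value later. A brace-enclosed list on its own is only valid as an initializer; a later assignment needs a compound literal (C99) or member-by-member assignment.

#include <stdio.h>

typedef struct stu {
    char name[10];
    int age;
    char sex[5];
} stu;

int main()
{
    stu s1;
    stu s2 = { "Xiao Ming", 5, "M" };   // initialized at creation
    s1 = (stu){ "Xiao Hong", 6, "F" };  // assigned later, using a C99 compound literal
    return 0;
}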

Calculating the size of a structure:

The sizeof operator gives the total size of a structure. The following example prints the size of a structure:

#include <stdio.h>

struct S1 {
    char c1;
    int i;
    char c2;
};

int main()
{
    printf("%zu\n", sizeof(struct S1));  // sizeof yields a size_t, so print it with %zu
    return 0;
}

Before running the program, let's add up the member sizes by hand: the two char members occupy 2 bytes and the int member occupies 4 bytes. Without knowing about "structure memory alignment", we would all expect the answer 6.

But the printed result is 12, so the structure occupies 12 bytes. The difference is caused by structure memory alignment, which we look at next.

Structure memory alignment

First we need to get to know a macro: offsetof.

offsetof gives the offset (in bytes) of a structure member from the start of the structure. Although it is used like a function, it is a macro, defined in the header stddef.h.

Its form is:

offsetof(structName, memberName)    // evaluates to a value of type size_t

We use it to compute the offset of each member of the following structure:

#include <stdio.h>
#include <stddef.h>

struct S1 {
    char c1;
    int i;
    char c2;
};

int main()
{
    // offsetof yields a size_t, so print it with %zu
    printf("%zu\n", offsetof(struct S1, c1));
    printf("%zu\n", offsetof(struct S1, i));
    printf("%zu\n", offsetof(struct S1, c2));
    return 0;
}

The output is 0, 4 and 8, so now we have the offset of each member.

Earlier we saw that the size of this type is 12. If we create a variable s1 of this type, 12 bytes of space are set aside for s1.

The rules of structure memory alignment in C:
  1. The first member of the structure is stored at offset 0 from the start of the structure variable.
  2. Every other member is aligned to an offset that is an integer multiple of a number called its alignment number.
    Alignment number: the smaller of the compiler's default alignment number and the size of the member.
    In the VS (Visual Studio) environment the default alignment number is 8.
  3. The total size of the structure is an integer multiple of the maximum alignment number (every member has its own alignment number).
  4. If a structure is nested inside another, the nested structure is aligned to an offset that is an integer multiple of its own maximum alignment number, and the total size of the outer structure is an integer multiple of the largest alignment number among all members (including those of the nested structure).

Using this rule we can explain why the size of the structure above is 12:

The first member must be placed at offset 0. Its size is 1, so it occupies the byte at offset 0.

The size of the second member is 4 and the default alignment number is 8, so the alignment number of the second member is 4. It cannot start at offset 1; it has to start at an offset that is an integer multiple of 4. The nearest such offset is 4, so the second member occupies offsets 4 to 7, and offsets 1 to 3 are left unused.

The size of the third member is 1 and the default alignment number is 8, so its alignment number is 1. Any offset is a multiple of 1, so the third member goes in the next free byte, at offset 8.

With every member placed, we work out the total size. By rule 3 it must be an integer multiple of the maximum alignment number, which here is 4 (1 for the first member, 4 for the second, 1 for the third). Nine bytes are in use so far (offsets 0 to 8), and the smallest multiple of 4 that is not less than 9 is 12. The structure is therefore padded out to offset 11, giving a total size of 12 bytes.

Why does memory alignment exist?

  • Platform reasons
    Not every hardware platform can access arbitrary data at arbitrary addresses. Some platforms can only fetch certain types of data at certain addresses and raise a hardware exception otherwise.

  • Performance reasons
    Data structures (especially the stack) should be aligned on natural boundaries as far as possible.
    The reason is that accessing unaligned memory may take the processor two memory accesses, whereas accessing aligned memory takes only one.

When designing a structure, if you want to satisfy alignment and still save space, you can do the following:

Keep members that take up less space together as much as possible

Modify the default alignment number

Use #pragma pack(n) to set the default alignment number to n, and #pragma pack() to restore the compiler's default

Note: when you modify the default alignment number, it is generally set to a power of 2

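Things to note when passing a structure as a function parameter

Prefer passing the address of the structure (a pointer) rather than the structure itself, since this avoids copying a lot of data. If the function should not modify the structure, protect it with const.

When a function is called, its arguments are pushed onto the stack. If a large structure is copied there, the call costs more time and space, which reduces efficiency.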

2. Bit field

What is a bit field

A bit field is declared like a structure, with two differences:
1. the members of a bit field must be of type int, unsigned int or signed int (char is also commonly accepted);
2. the name of each member is followed by a colon and a number

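For example:

struct A {
    int _a : 2;     // this member occupies only 2 bits
    int _b : 5;     // this member occupies only 5 bits
    int _c : 10;    // this member occupies only 10 bits
    int _d : 30;    // this member occupies only 30 bits
};

Structure A is a bit-field type: the number after each colon is the width of that member in bits.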

Calculating the size of a bit field

Memory allocation rules for bit fields:

  1. The members of a bit field can be int, unsigned int, signed int or char (types of the integer family);
  2. Space for a bit field is allocated as needed, 4 bytes (int) or 1 byte (char) at a time

Cross-platform issues with bit fields:

  1. Whether an int bit field is treated as signed or unsigned is not defined.
  2. The maximum number of bits in a bit field is not fixed; it depends on the platform.
  3. Whether the members of a bit field are allocated from left to right or from right to left in memory is not defined.
  4. When a structure contains two bit-field members and the second is too large to fit in the bits left over by the first, it is not defined whether the leftover bits are discarded or used.

The fourth point deserves a closer look.

The code below prints the size of a bit field. Depending on the machine, space for the bit field can be allocated in one of two ways:

#include <stdio.h>

struct S {
    char a : 3;
    char b : 4;
    char c : 5;
    char d : 4;
};

int main()
{
    printf("%zu\n", sizeof(struct S));   // sizeof yields a size_t, so print it with %zu
    return 0;
}

Case 1:

The four members occupy 16 bits in total and are stored contiguously, so 2 bytes are needed.

Case 2:

If the bits left over after the previous member are not enough to hold the next member, the next member starts in a new unit of space. The first two members share one byte, member c takes one byte, and member d takes one byte: 3 bytes in total.

Different environments may give different results.

The VS environment follows case 2.

3. Union

A union is declared in much the same way as a structure:

union tag {
    int i;
    char ch;
    float f;
};

The characteristic of a union is that all of its members start at the same address in memory.

Calculating the size of a union:

If the members of a union have different sizes, the union is at least as large as its largest member. When that size is not an integer multiple of the maximum alignment number, the union is padded up to such a multiple.

Union initialization

The initialization of a union must satisfy two conditions:

  • The initial value must have the type of the union's first member (a C99 designated initializer can select a different member)
  • The initial value must be placed between a pair of curly braces


A simple use of a union

A union can be used to determine the byte order (big-endian or little-endian) of the machine.

The following code snippet does this:

#include <stdio.h>

union {
    int i;
    char ch;
} endian;

int main()
{
    endian.i = 1;
    printf("%d", endian.ch);
    return 0;
}

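Since i and ch share the same starting address, ch holds the byte of i stored at the lowest address. If the program prints 1, the low-order byte comes first and the machine is little-endian; if it prints 0, the machine is big-endian.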

4. Enumeration

Enumeration, as the name suggests, means listing things one by one: all the possible values are listed.
For example, in real life:

  • A week has exactly 7 days, Monday to Sunday, which can be listed one by one.

  • Gender can be male, female or undisclosed, which can also be listed one by one.

  • A year has 12 months, which can also be listed one by one.

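Definition of enumeration types

enum Day    // days of the week
{
    // the possible values of the enumeration
    Mon,
    Tues,
    Wed,
    Thur,
    Fri,
    Sat,
    Sun
};

enum Sex    // gender
{
    MALE,
    FEMALE,
    SECRET
};

int main()
{
    enum Day d = Sun;   // a variable of an enumeration type should take one of the listed values
    return 0;
}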

The values of enumeration constants

The possible values of an enumeration each have an initial value. Let's print them and take a look.

#include <stdio.h>

enum Day    // days of the week
{
    // the possible values of the enumeration
    Mon,
    Tues,
    Wed,
    Thur,
    Fri,
    Sat,
    Sun
};

int main()
{
    printf("%d\n", Mon);
    printf("%d\n", Tues);
    printf("%d\n", Wed);
    printf("%d\n", Thur);
    printf("%d\n", Fri);
    printf("%d\n", Sat);
    printf("%d\n", Sun);
    return 0;
}

The program prints 0, 1, 2, 3, 4, 5 and 6, one value per line.

What we need to know is: by default, the integer values start at zero, and if an identifier in the list is assigned a value, the identifier immediately after it is 1 greater than the assigned value.

A value can be assigned to any of the constants when the enumeration is declared.

Note:

  • These values are constants. Assigning a value in the declaration is equivalent to initializing the constant; it cannot be changed afterwards.

  • An enumeration has the size of an integer type (typically the same as int)

Advantages of enumeration

  1. Increased code readability

  2. Compared with identifiers defined by #define, enumerations have type checking, which is more rigorous

  3. Prevents name pollution (encapsulation)

  4. Easy to debug
    Identifiers defined by #define are hard to debug, because they have already been replaced by the time debugging starts.

  5. Easy to use: several constants can be defined at once

Summary

  • The information a program needs to represent is usually more than a single number or a series of numbers. A structure puts related information into one unit, so that it is stored in one place rather than spread over several variables.
  • A union declaration is similar to a struct declaration, but the members of a union share the same storage, so a union holds only one member at a time. In effect, a union variable can store a value whose type is one of several alternatives.
  • The enum tool provides a way to define symbolic constants.
  • The typedef tool provides a way to create new identifiers for basic or derived types.

Origin blog.csdn.net/cainiaochufa2021/article/details/123095433