A 100,000+ word hard-core summary of the C language (1): worth reading and bookmarking!

1. Overview of C language

Welcome to the world of C, a powerful, professional programming language.

1.1 The origin of the C language

Dennis Ritchie of Bell Labs developed C in 1972 while he was working with Ken Thompson on the design of the UNIX operating system. C was not conceived entirely by Ritchie, however: it grew out of Thompson's B language.

1.2 Reasons to use C language

Over the past few decades, C has become one of the most popular and important programming languages. It grew because people tried it and liked it. Over the years many people have moved from C to the more feature-rich C++, but C remains an important language with advantages of its own, and it is a natural stepping stone to learning C++.

  • Efficiency. C is an efficient language. It offers the kind of fine-grained control usually found only in assembly language (an assembly language is a set of mnemonics defined for a particular CPU design; different CPU types use different assembly languages). If you like, you can fine-tune a program for maximum speed or for the most efficient use of memory.

  • Portability. C is a portable language: a C program written on one system can run on other systems with little or no modification.

  • Power and flexibility. C is powerful and flexible. For example, the UNIX operating system is written largely in C, and many compilers and interpreters for other languages (Perl, Python, BASIC, Pascal) are also written in C. As a result, when you use Python on a UNIX machine, it is ultimately a C program that does the work of running your code.

1.3 C Language Standard

1.3.1 K&R C

At first there was no official standard for C. In 1978 Brian Kernighan and Dennis Ritchie of AT&T Bell Labs published the book The C Programming Language. Known to C developers as K&R, it served for many years as the unofficial specification of the language, and the version of C it describes is called K&R C.

K&R C introduced the following main features: the structure (struct) type; the long integer (long int) type; the unsigned integer (unsigned int) type; and the replacement of the operators =+ and =- with += and -=. The old forms were ambiguous: given i=-10, the compiler could not tell whether the user meant i =- 10 (subtract 10 from i) or i = -10 (assign -10 to i).

Even many years after the ANSI C standard appeared, K&R C remained the baseline that many compilers were expected to support, and many older compilers still implemented only K&R C.

1.3.2 ANSI C/C89 Standard

From the 1970s to the 1980s, the C language was widely used, from mainframes to small microcomputers, and many different versions of the C language were derived. In 1983, the American National Standards Institute (ANSI) established a committee X3J11 to develop the C language standard.

In 1989, the American National Standards Institute (ANSI) adopted the C language standard, known as ANSI X3.159-1989 "Programming Language C". Because this standard was adopted in 1989, it is generally referred to as the C89 standard. Some people also call it ANSI C for short, because this standard is published by the American National Standards Institute (ANSI).

In 1990, the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) adopted the C89 standard as an international standard for the C language, named ISO/IEC 9899:1990 - Programming languages -- C[5]. Because this standard was published in 1990, some people call it the C90 standard for short. However, most people still call it the C89 standard because it is identical to the ANSI C89 standard.

In 1994, ISO and IEC released a correction to the C89 standard, called ISO/IEC 9899:1990/Cor 1:1994[6], which some people call the C94 standard.

In 1995, ISO and IEC released an amendment to the C89 standard, called ISO/IEC 9899:1990/Amd 1:1995 - C Integrity[7], which some people call the C95 standard for short.

1.3.3 C99 Standard

In 1999, the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) released a new standard for the C language, ISO/IEC 9899:1999 - Programming languages -- C, known as the C99 standard. It is the second official standard for the C language.

C99 added a number of features, for example:

  • New keywords: restrict, inline, _Complex, _Imaginary, _Bool

  • New types such as long long, long double _Complex, and float _Complex

  • Variable-length arrays. The length of an array can be given by a variable whose value is known only at run time. In a function prototype, such a parameter can be declared with the form int a[*]. For reasons of efficiency and implementation, however, a variable-length array is not a new kind of type.
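A minimal sketch that exercises several of the C99 additions listed above: long long, inline, _Bool (through the C99 header stdbool.h), a loop variable declared inside for, and a variable-length array. The function names and the 1000-element limit are assumptions made for this illustration. Variable-length arrays became optional in later standards, so some compilers may reject this code.

```c
#include <stdbool.h>   /* C99: bool, true, false (built on _Bool) */
#include <stdio.h>

/* C99 inline function; static keeps the definition local to this file. */
static inline long long square(long long x) {
    return x * x;
}

/* Sums 1*1 + 2*2 + ... + n*n using a variable-length array.
 * Returns -1 when n is not in 1..1000: a VLA lives on the stack,
 * so its length must be positive and reasonably small. */
long long sum_of_squares(int n) {
    if (n <= 0 || n > 1000) {
        return -1;
    }
    long long squares[n];              /* VLA: length known only at run time */
    for (int i = 0; i < n; i++) {      /* C99: declaration inside the for statement */
        squares[i] = square(i + 1);
    }
    long long total = 0;
    for (int i = 0; i < n; i++) {
        total += squares[i];
    }
    return total;
}

/* Prints the result with the C99 %lld conversion. */
void print_sum_of_squares(int n) {
    long long result = sum_of_squares(n);
    bool ok = result >= 0;
    if (ok) {
        printf("sum of squares up to %d: %lld\n", n, result);
    } else {
        printf("n is out of range\n");
    }
}
```

Calling print_sum_of_squares(3) from main prints the sum 1 + 4 + 9.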

2. Memory partitions

2.1 Data Types

2.1.1 Data Type Concept

What is a data type, and why do we need one? Data types exist for better memory management: they let the compiler determine how much memory to allocate.

In real life a dog is a dog and a bird is a bird; everything has its own type. The data types used in programs come from the same idea.

Allocating memory for a dog is like building a doghouse, and allocating memory for a bird is like building a birdhouse. We could build a villa for each of them, but that would waste memory and make poor use of the available space.

Ideally, a bird gets space the size of a bird's nest and a dog gets space the size of a kennel, rather than a villa apiece.

When we define a variable, say a = 10, how does the compiler allocate memory? A computer is just a machine; how does it know how much memory is needed to hold 10?

This is why the data type matters: it tells the compiler how much memory to allocate for our data.

A doghouse holds a dog and a birdhouse holds a bird. Without data types, how would you know that the thing in the refrigerator is an elephant?

Basic concepts of data types:

  • Types are abstractions over data;

  • Data of the same type has the same representation, storage format, and related operations;

  • All data in a program must belong to a certain data type;

  • A data type can be understood as a mold for creating variables: an alias for a fixed-size block of memory;

2.1.2 Data Type Aliases

typedef unsigned int u32;
typedef struct _PERSON{
 char name[64];
 int age;
}Person;

void test(){
 u32 val; //same as: unsigned int val;
 Person person; //same as: struct _PERSON person;
}

2.1.3 void data type

void literally means "no type". void* is an untyped pointer, and an untyped pointer can point to data of any type.

Defining a variable of type void is meaningless: if you write void a; the compiler reports an error.

void is actually used in the following two ways:

  • to qualify the return type of a function;

  • to qualify the parameters of a function;

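#include <stdio.h>

//1. void as the parameter list and the return type of a function
void test01(void){
 printf("hello world");
}

//2. A variable of type void cannot be defined
void test02(){
 //void val; //compile error if uncommented
}

//3. void* can point to data of any type; it is often called a universal pointer
void test03(){
 int a = 10;
 void* p = NULL;
 p = &a;
 printf("a:%d\n",*(int*)p);
 
 char c = 'a';
 p = &c;
 printf("c:%c\n",*(char*)p);
}

//4. void* is commonly used to write interfaces that work with any data type, such as memcpy
void test04(){
 //void * memcpy(void * _Dst, const void * _Src, size_t _Size);
}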
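To make the last point concrete, here is a small sketch of a type-independent helper built on void* and memcpy. The name swap_any and the 64-byte scratch buffer are assumptions chosen for this example, not part of any library.

```c
#include <stdio.h>
#include <string.h>

/* Swaps two objects of `size` bytes each through void* pointers.
 * Returns 0 on success, -1 if a pointer is NULL or the objects are
 * larger than the fixed scratch buffer (a limit chosen for this sketch). */
int swap_any(void *a, void *b, size_t size) {
    unsigned char tmp[64];
    if (a == NULL || b == NULL || size > sizeof tmp) {
        return -1;
    }
    if (a == b) {
        return 0;               /* nothing to do, and memcpy must not overlap */
    }
    memcpy(tmp, a, size);
    memcpy(a, b, size);
    memcpy(b, tmp, size);
    return 0;
}

/* The same function works for int and for double. */
void demo_swap_any(void) {
    int x = 1, y = 2;
    double u = 1.5, v = 2.5;
    swap_any(&x, &y, sizeof x);
    swap_any(&u, &v, sizeof u);
    printf("x=%d y=%d u=%.1f v=%.1f\n", x, y, u, v);
}
```

Because the helper sees only raw bytes, the caller is responsible for passing the correct size and for making sure both pointers refer to objects of the same type.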

2.1.4 The sizeof operator

sizeof is an operator in C, like ++ and --. It tells us how much memory the compiler allocates for a particular object, or for any object of a given type. The size is measured in bytes.

Basic syntax:

sizeof(variable);
sizeof variable;
sizeof(type);

Notes on sizeof:

  • The size returned by sizeof is the space allocated for the variable, not just the space it actually uses. It is similar to the difference between the gross floor area and the usable area of a home. For this reason, when using structures you usually have to take byte alignment into account;

  • The result of sizeof has type size_t, an unsigned integer type;

  • Note the difference between array names and pointer variables. We usually think of an array name as being much like a pointer variable, but under sizeof they behave very differently. Applied to an array name, sizeof returns the size of the whole array; applied to a pointer variable, it returns the size of the pointer itself, typically 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine. When an array is passed as a function argument, the formal parameter inside the function is a pointer, so sizeof no longer returns the size of the array;

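#include <stdio.h>

//1. Basic use of sizeof
void test01(){
 int a = 10;
 printf("len:%zu\n", sizeof(a));
 printf("len:%zu\n", sizeof(int));
 printf("len:%zu\n", sizeof a);
}

//2. Type of the sizeof result
void test02(){
 unsigned int a = 10;
 //a - 11 is computed as unsigned, so it wraps around and is never less than 0
 if (a - 11 < 0){
  printf("result is less than 0\n");
 }
 else{
  printf("result is greater than 0\n");
 }
 int b = 5;
 //sizeof yields an unsigned size_t, so this comparison is never true either
 if (sizeof(b) - 10 < 0){
  printf("result is less than 0\n");
 }
 else{
  printf("result is greater than 0\n");
 }
}

//3. sizeof and arrays
void TestArray(int arr[]){
 printf("TestArray arr size:%zu\n",sizeof(arr));
}
void test03(){
 int arr[] = { 10, 20, 30, 40, 50 };
 printf("array size: %zu\n",sizeof(arr));

 //In some contexts an array name is equivalent to a pointer
 int* pArr = arr;
 printf("arr[2]:%d\n",pArr[2]);
 printf("array size: %zu\n", sizeof(pArr));

 //An array passed as a function argument decays to a pointer, so sizeof inside the function no longer returns the array size
 TestArray(arr);
}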
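Since a function only ever receives a pointer, a common workaround is to compute the element count where the real array is visible and pass it along explicitly. The sketch below shows that pattern and also the alignment point from the notes above; ARRAY_LEN, sum_ints, and struct Sample are names invented for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Number of elements in a true array (gives a wrong answer for a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* The function receives only a pointer, so the length is passed explicitly. */
int sum_ints(const int *values, size_t count) {
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* A struct whose size may exceed the sum of its members because of padding. */
struct Sample {
    char tag;
    int value;
};

void demo_sizeof(void) {
    int arr[] = { 10, 20, 30, 40, 50 };
    printf("elements: %zu\n", ARRAY_LEN(arr));
    printf("sum: %d\n", sum_ints(arr, ARRAY_LEN(arr)));
    printf("members: %zu bytes, struct: %zu bytes\n",
           sizeof(char) + sizeof(int), sizeof(struct Sample));
}
```

The exact struct size depends on the platform's alignment rules, which is why the sketch prints it instead of assuming a value.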

2.1.5 Data Type Summary

  • A data type is essentially an alias for a fixed-size block of memory; it is a mold. In C, variables are defined from data types;

  • The size of a data type is computed with sizeof;

  • An existing data type can be given an alias with typedef;

  • Data types can be abstracted away with void (the universal type);

2.2 Variables

2.2.1 The concept of variables

A memory object that can be both read and written is called a variable;

An object that cannot be modified once initialized is called a constant.

Variable definition form: type identifier, identifier, … , identifier

2.2.2 The nature of variable names

  • The essence of a variable name: an alias for a contiguous block of memory;

  • A program requests a block of memory and gives it a name by defining a variable, for example int a = 0;

  • The memory is then accessed through the variable name;

  • Data is not read from or written to the variable name itself, but to the memory that the variable represents;

There are two ways to modify a variable:

void test(){
 
 int a = 10;

 //1. Direct modification
 a = 20;
 printf("direct modification, a:%d\n",a);

 //2. Indirect modification through a pointer
 int* p = &a;
 *p = 30;

 printf("indirect modification, a:%d\n", a);
}

2.3 The memory partition model of the program

2.3.1 Memory partition

2.3.1.1 Before running

To execute a C program we have written, the first step is to build it, which takes four stages:

1) Preprocessing: expand macro definitions, expand header files, and handle conditional compilation; no syntax checking is done here

2) Compilation: check the syntax and translate the preprocessed file into an assembly file

3) Assembly: translate the assembly file into an object file (binary file)

4) Linking: link the object files into an executable program

Before it runs, the resulting executable is already divided into the following areas:

 Code area

Holds the machine instructions executed by the CPU. The code area is usually shareable (other running instances of the program can use it), so that a frequently executed program needs only one copy of its code in memory. The code area is usually read-only as well, to prevent a program from accidentally modifying its own instructions. In addition, the code area also lays out information about local variables.

 Global initialization data area/static data area (data segment)

This area contains global variables that are explicitly initialized in the program, initialized static variables (both global and local static variables), and constant data (such as string constants).

 Uninitialized data area (also called the bss area)

Holds uninitialized global variables and uninitialized static variables. The data in this area is initialized to 0 or NULL by the kernel before the program starts executing.

Generally speaking, once the source code has been compiled, the program consists of two main parts: program instructions (the code area) and program data (the data area). The code segment holds the program instructions, while the data segment and the .bss segment hold the program data.

So why keep program instructions and program data separate?

  • Once the program is loaded, data and code can be mapped to two separate memory regions. The data region is readable and writable by the process, while the instruction region is read-only, so keeping them apart lets each region be given its own permissions. This prevents the program's instructions from being modified, intentionally or unintentionally;

  • When several instances of the same program are running, they all execute the same instructions, so only one copy of the instructions needs to be kept in memory, while each instance keeps its own data. This can save a lot of memory. For example, Windows Internet Explorer 7.0 occupied 112,844 KB of memory when running, of which about 15,944 KB was private data, meaning that 96,900 KB was shared. With hundreds of such processes running, sharing clearly saves a great deal of memory.

2.3.1.2 After running

Before the program is loaded into memory, the sizes of the code area and the global area (data and bss) are already fixed, and they cannot change while the program runs. When the executable is run, the operating system loads it from disk into memory. In addition to setting up the code area (text), data area (data), and uninitialized data area (bss) according to the information in the executable, it also adds a stack area and a heap area.

 Code area (text segment)

The executable's code segment is loaded here: all executable code goes into the code area. This memory cannot be modified while the program runs.

 Uninitialized data area (BSS)

The executable's BSS segment is loaded here. It may be adjacent to the data segment or separate from it. The data stored in the BSS segment (uninitialized global and uninitialized static data) lives for the entire run of the program.

 Global initialization data area/static data area (data segment)

The executable's data segment is loaded here. The data stored in the data segment (initialized global data, initialized static data, and read-only literal constants) lives for the entire run of the program.

 Stack area (stack)

The stack is a first-in, last-out memory structure. It is allocated and released automatically by the compiler and holds function parameter values, return values, and local variables. Stack space is claimed and released continuously as the program runs, so the lifetime of a local variable is the time between the allocation and the release of its stack space.

 Heap

The heap is a large pool whose capacity is much greater than that of the stack, but it has no first-in, last-out ordering. It is used for dynamic memory allocation. In the address space of the process, the heap lies between the BSS area and the stack area. Heap memory is generally allocated and released by the programmer; anything the programmer does not release is reclaimed by the operating system when the program ends.
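One rough way to see these areas is to print one address from each of them. The sketch below does this for the data, bss, stack, and heap areas plus a string literal; all variable and function names are made up for the example. The printed addresses differ between platforms and even between runs, so no particular output should be expected, and the code area is left out because converting a function pointer for %p is not strictly portable.

```c
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;   /* data segment: explicitly initialized */
int uninitialized_global;      /* bss: zeroed before the program starts */

/* Prints one address from each data area.
 * Returns 0 on success, -1 if the heap allocation fails. */
int print_memory_areas(void) {
    static int initialized_static = 7;              /* data segment */
    int local = 1;                                  /* stack */
    const char *literal = "hello";                  /* points at constant data */
    int *heap_value = malloc(sizeof *heap_value);   /* heap */
    if (heap_value == NULL) {
        return -1;
    }
    *heap_value = 2;

    printf("data  (global) : %p\n", (void *)&initialized_global);
    printf("data  (static) : %p\n", (void *)&initialized_static);
    printf("bss   (global) : %p\n", (void *)&uninitialized_global);
    printf("const (literal): %p\n", (void *)literal);
    printf("heap           : %p\n", (void *)heap_value);
    printf("stack (local)  : %p\n", (void *)&local);

    free(heap_value);
    return 0;
}
```

On many systems the data and bss addresses sit close together, while the stack address is far away from the heap address, which matches the layout described above.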

2.3.2 Partitioning model

2.3.2.1 Stack Area

The stack is managed by the system. It mainly stores function parameters and local variables. When a function finishes executing, the system releases its stack memory automatically; no management by the user is needed.

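char* func(){
 char p[] = "hello world!"; //stored on the stack
 printf("%s\n", p);
 return p; //bug: p no longer exists once func returns
}
void test(){
 char* p = NULL;
 p = func();  
 printf("%s\n",p); //undefined behavior: may print garbage or crash
}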
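The func above hands back the address of an array that disappears when the function returns. One common alternative, sketched here under the assumption that the caller knows how much space it needs, is to let the caller own the buffer and have the function fill it in. fill_greeting is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/* Writes the greeting into a buffer owned by the caller.
 * Returns 0 on success, -1 if the buffer is missing or too small. */
int fill_greeting(char *buf, size_t size) {
    const char *text = "hello world!";
    size_t needed = strlen(text) + 1;   /* include the terminating '\0' */
    if (buf == NULL || size < needed) {
        return -1;
    }
    memcpy(buf, text, needed);
    return 0;
}

void demo_fill_greeting(void) {
    char message[32];   /* lives in the caller's stack frame */
    if (fill_greeting(message, sizeof message) == 0) {
        printf("%s\n", message);   /* valid: message is still in scope */
    }
}
```

The buffer stays valid because it belongs to the frame of the function that uses it, not to the frame of the function that fills it.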

2.3.2.2 Heap Area

Heap memory is requested manually by the programmer and released manually. If it is not released, the system reclaims it when the program ends, so its lifetime can be as long as the whole run of the program. Heap memory is requested with malloc (or with new in C++).

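#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char* func(){
 char* str = malloc(100);
 if (str == NULL){
  return NULL;
 }
 strcpy(str, "hello world!");
 printf("%s\n",str);
 return str; //heap memory stays valid after func returns
}

void test01(){
 char* p = NULL;
 p = func();
 if (p != NULL){
  printf("%s\n",p);
  free(p);
 }
}

//Pitfall: p is a copy of the caller's pointer, so the caller never sees the new address
void allocateSpace(char* p){
 p = malloc(100);
 if (p == NULL){
  return;
 }
 strcpy(p, "hello world!");
 printf("%s\n", p);
 //the 100 bytes are leaked when the function returns
}

void test02(){
 
 char* p = NULL;
 allocateSpace(p);

 //p is still NULL here, so it must not be passed to %s directly
 printf("%s\n", p == NULL ? "(null)" : p);
}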
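In test02 the pointer is passed by value, so the address obtained inside allocateSpace never reaches the caller. A usual fix is to pass the address of the pointer instead. The following is a sketch of that approach; allocate_greeting is a name invented here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocates a buffer and stores its address through `out`.
 * Returns 0 on success, -1 on a NULL argument or allocation failure.
 * The caller owns the buffer and must free it. */
int allocate_greeting(char **out) {
    if (out == NULL) {
        return -1;
    }
    char *buf = malloc(100);
    if (buf == NULL) {
        return -1;
    }
    strcpy(buf, "hello world!");
    *out = buf;   /* writes to the caller's pointer, not to a copy */
    return 0;
}

void demo_allocate_greeting(void) {
    char *p = NULL;
    if (allocate_greeting(&p) == 0) {
        printf("%s\n", p);
        free(p);
    }
}
```

Returning the pointer from the function, as func does above, works just as well; the pointer-to-pointer form is useful when the return value is needed for an error code.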

Heap allocation API:

#include <stdlib.h>
void *calloc(size_t nmemb, size_t size);

Function:

Allocates a contiguous region in the dynamic storage area for nmemb elements of size bytes each. calloc automatically zeroes the allocated memory.

Parameters:

nmemb: the number of memory units required

size: the size of each memory unit (unit: bytes)

Return value:

Success: the starting address of the allocated space

Failure: NULL

#include <stdlib.h>
void *realloc(void *ptr, size_t size);

Function:

Changes the size of a block of heap memory previously allocated with malloc or calloc. realloc does not clear the added memory; clear it manually if needed. If there is enough free space directly after the existing block, the block is extended in place. Otherwise realloc allocates a new contiguous block, copies the contents of the old block into it, and releases the old block.

Parameters:

ptr: the address of memory previously allocated with malloc or calloc. If this parameter is NULL, realloc behaves like malloc

size: the new size of the block, unit: bytes

Return value:

Success: the address of the resized heap memory

Failure: NULL (the original block is left unchanged)

#include <stdio.h>
#include <stdlib.h>

void test01(){
 
 int* p1 = calloc(10,sizeof(int));
 if (p1 == NULL){
  return;
 }
 for (int i = 0; i < 10; i ++){
  p1[i] = i + 1;
 }
 for (int i = 0; i < 10; i++){
  printf("%d ",p1[i]);
 }
 printf("\n");
 free(p1);
}

void test02(){
 int* p1 = calloc(10, sizeof(int));
 if (p1 == NULL){
  return;
 }
 for (int i = 0; i < 10; i++){
  p1[i] = i + 1;
 }
 printf("%p\n", (void*)p1); //address before realloc

 int* p2 = realloc(p1, 15 * sizeof(int));
 if (p2 == NULL){
  free(p1); //realloc failed: the old block is still valid and must be freed
  return;
 }
 //from here on use only p2; p1 may no longer be valid
 printf("%p\n", (void*)p2); //address after realloc

 //print: the first 10 values are preserved, the 5 new elements are uninitialized
 for (int i = 0; i < 10; i++){
  printf("%d ", p2[i]);
 }
 printf("\n");

 //reassign
 for (int i = 0; i < 15; i++){
  p2[i] = i + 1;
 }
 
 //print again
 for (int i = 0; i < 15; i++){
  printf("%d ", p2[i]);
 }
 printf("\n");

 free(p2);
}
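Two points from the realloc description are easy to get wrong: the old pointer must be kept until realloc has succeeded, and the added memory is not cleared. A small helper can take care of both. The sketch below is one possible shape for such a helper; resize_ints is a hypothetical name and not part of the standard library.

```c
#include <stdlib.h>
#include <string.h>

/* Resizes *arr from old_count to new_count ints.
 * On success *arr points at the resized block, any added elements are
 * zeroed, and 0 is returned. On failure *arr is untouched and -1 is returned. */
int resize_ints(int **arr, size_t old_count, size_t new_count) {
    if (arr == NULL || new_count == 0 ||
        new_count > (size_t)-1 / sizeof(int)) {   /* guard against overflow */
        return -1;
    }
    int *tmp = realloc(*arr, new_count * sizeof(int));
    if (tmp == NULL) {
        return -1;   /* the old block is still valid and still owned by the caller */
    }
    if (new_count > old_count) {
        /* realloc leaves the added elements uninitialized, so clear them */
        memset(tmp + old_count, 0, (new_count - old_count) * sizeof(int));
    }
    *arr = tmp;
    return 0;
}
```

A caller would write resize_ints(&p, 10, 15) and keep using p afterwards, freeing it with free(p) as usual.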

2.3.2.3 Global/Static Area

Variables in the global/static area have their memory reserved and their initial values fixed at compile time. This memory exists for the entire run of the program, and it mainly holds global variables, static variables, and constants.

Note:

(1) No distinction is made here between the initialized and uninitialized data areas, because a variable in static storage that is not explicitly initialized is automatically given its default initial value (zero). In other words, there are no uninitialized variables in static storage.

(2) Constants in the global/static area are either const variables or string constants; once initialized, they cannot be modified. A const variable in static storage is a global variable, which differs from a local const variable. A local const variable is stored on the stack, and compilers often do not stop it from being changed indirectly through a pointer (although the C standard makes any such modification undefined behavior), whereas a global const variable is typically placed in read-only memory, where it cannot be modified indirectly.

(3) String constants are stored in the constant area of the global/static storage area.

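#include <stdio.h>

int v1 = 10; //global/static area
const int v2 = 20; //constant: cannot be modified once initialized
static int v3 = 20; //global/static area
char *p1; //global/static area; initialized to NULL by default

//So what is the difference between a file-scope static int and a plain global int?
//The static one is visible only inside this source file.

void test01(){
 static int v4 = 20; //global/static area
}
char* func(){
 static char arr[] = "hello world!"; //stored in the static area: readable and writable
 arr[2] = 'c';
 char* p = "hello world!"; //global/static area: string constant area
 //p[2] = 'c'; //read-only: must not be modified
 printf("%p\n",(void*)arr);
 printf("%p\n",(void*)p);
 printf("%s\n", arr);
 return arr;
}
void test02(){
 char* p = func();
 printf("%s\n",p);
}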
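Because memory in the global/static area exists for the whole run of the program, a static local variable keeps its value between calls. The sketch below illustrates this with a simple ID generator; next_id and get_call_count are names chosen for the example.

```c
#include <stdio.h>

static int call_count;   /* file-scope static: zero-initialized, visible only in this file */

/* Returns 1 on the first call, 2 on the second, and so on. */
int next_id(void) {
    static int last_id = 0;   /* initialized once, keeps its value between calls */
    call_count++;
    last_id++;
    return last_id;
}

/* Reports how many times next_id has been called. */
int get_call_count(void) {
    return call_count;
}

void demo_next_id(void) {
    int first = next_id();
    int second = next_id();
    printf("ids: %d %d, calls so far: %d\n", first, second, get_call_count());
}
```

An ordinary local variable in next_id would be recreated on every call and could not remember the previous ID.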

2.3.2.4 Summary

When learning about C/C++ memory partitions you will often meet the following terms: data area, heap, stack, static area, constant area, global area, string constant area, literal constant area, code area, and so on, which beginners tend to find bewildering. Here we try to clarify how these partitions relate to one another.

The data area includes: the heap, the stack, and the global/static storage area.

  • The global/static storage area includes: the constant area, the global area, and the static area.

  • The constant area includes: the string constant area and the constant variable area.

  • Code area: stores the binary code produced by compiling the program; it is a non-addressable area.

It can be said that there are actually only two C/C++ memory partitions, namely the code area and the data area.

2.3.3 Function call model

2.3.3.1 Function call flow

The stack is one of the most important concepts in modern computer programs. Almost every program uses it: without the stack there would be no functions, no local variables, and none of the programming languages we see today. Before explaining why the stack is so important, let's look at its traditional definition:

In classical computer science, a stack is defined as a special container. Users can push data onto the stack (push) and pop previously pushed data off it (pop), but the container must follow one rule: First In, Last Out (FILO).
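The container described above can be sketched in a few lines of C. This is only an illustration of the push/pop rule using a fixed-size array; the names IntStack, stack_push, and stack_pop and the capacity of 16 are assumptions of the example, and the program's call stack itself is managed by the compiler and the CPU, not by code like this.

```c
#include <stddef.h>

#define STACK_CAPACITY 16

/* A fixed-capacity stack of ints. */
struct IntStack {
    int items[STACK_CAPACITY];
    size_t count;   /* number of items currently stored */
};

void stack_init(struct IntStack *s) {
    s->count = 0;
}

/* Pushes a value. Returns 0 on success, -1 if the stack is full. */
int stack_push(struct IntStack *s, int value) {
    if (s->count == STACK_CAPACITY) {
        return -1;
    }
    s->items[s->count++] = value;
    return 0;
}

/* Pops the most recently pushed value into *out.
 * Returns 0 on success, -1 if the stack is empty. */
int stack_pop(struct IntStack *s, int *out) {
    if (s->count == 0) {
        return -1;
    }
    *out = s->items[--s->count];
    return 0;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1: the first value in is the last one out.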

In classic operating systems the stack typically grows downward, toward lower addresses: pushing decreases the address of the top of the stack, and popping increases it.

The stack plays an extremely important role while a program runs. Most importantly, it holds the information that a function call needs to maintain, which is usually called a stack frame or an activation record. The information required by a function call generally includes the following:

  • the return address of the function;

  • function parameters;

  • Temporary variables;

  • Saved context: includes registers that need to remain unchanged before and after a function call.

Let's analyze the function call process using the following code:

#include <stdlib.h>

int func(int a,int b){
 int t_a = a;
 int t_b = b;
 return t_a + t_b;
}

int main(){
 int ret = 0;
 ret = func(10, 20);
 return EXIT_SUCCESS;
}

2.3.3.2 Calling conventions

We now have a general picture of how a function call works. Throughout the process, the caller and the callee must share the same understanding of the call. For example, both must agree that the function's parameters are pushed onto the stack in a certain fixed order. If they do not, the function cannot work correctly.

If the caller pushes parameter a first and then parameter b, while the called function assumes that b was pushed first and then a, the called function will see the values of a and b swapped.

Therefore, the caller and the callee must have a clear agreement on how the function is called; only when both sides follow the same agreement can the call work correctly. Such an agreement is called a "calling convention". A calling convention generally covers the following aspects:

The order and method of passing function parameters

There are many ways to pass parameters to a function, the most common being through the stack. The caller pushes the parameters onto the stack, and the function itself reads them from the stack. For functions with several parameters, the calling convention dictates the order in which the caller pushes them: left to right, or right to left. Some calling conventions also allow parameters to be passed in registers to improve performance.

How the stack is maintained

After the caller has pushed the parameters and the function body has run, all the pushed parameters must be popped so that the stack is the same before and after the call. This popping can be done either by the caller or by the function itself.

Name-mangling strategy

So that calling conventions can be told apart at link time, the calling convention modifies the name of the function itself. Different calling conventions use different name-mangling strategies.

In practice, C compilers support several calling conventions, and the default is cdecl. Any function that does not explicitly specify a calling convention uses cdecl. For example, the full declaration of the func function above would be:

int _cdecl func(int a,int b);

Note: _cdecl is not a standard keyword and may be spelled differently by different compilers. For example, gcc has no _cdecl keyword; it uses __attribute__((cdecl)) instead.

2.3.3.3 Function variable transfer analysis

2.3.4 Growth direction of stack and memory storage direction

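#include <stdio.h>

//1. Growth direction of the stack
void test01(){

 int a = 10;
 int b = 20;
 int c = 30;
 int d = 40;

 printf("a = %p\n", (void*)&a);
 printf("b = %p\n", (void*)&b);
 printf("c = %p\n", (void*)&c);
 printf("d = %p\n", (void*)&d);

 //If the address of a is greater than the address of b, the stack grows downward
 //(the compiler may reorder locals, so this is only a rough indication)
}

//2. Byte order in memory (little-endian)
void test02(){
 
 //high-order byte -> low-order byte
 unsigned int num = 0xaabbccdd;
 unsigned char* p = (unsigned char*)&num;

 //bytes starting from the lowest address; a little-endian machine prints dd cc bb aa
 printf("%x\n",*p);
 printf("%x\n", *(p + 1));
 printf("%x\n", *(p + 2));
 printf("%x\n", *(p + 3));
}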
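Comparing locals inside one function, as test01 does, depends on how the compiler happens to order them. A somewhat more robust heuristic is to compare a local in a caller with a local in a function it calls, and the byte-order check can likewise be wrapped in a function that returns a result. The sketch below shows both; the function names are hypothetical, the call goes through a volatile function pointer only to discourage inlining, and the stack result remains a platform-dependent observation, not a guarantee.

```c
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if a local in this (called) function sits at a lower
 * address than the caller's local, 0 otherwise. */
static int callee_is_lower(uintptr_t caller_addr) {
    volatile char marker = 0;
    return (uintptr_t)(void *)&marker < caller_addr;
}

/* Calling through a volatile pointer makes inlining unlikely. */
static int (*volatile probe)(uintptr_t) = callee_is_lower;

/* Heuristic: 1 if the stack appears to grow toward lower addresses. */
int stack_grows_down(void) {
    volatile char marker = 0;
    return probe((uintptr_t)(void *)&marker);
}

/* Returns 1 if the lowest-addressed byte of a 32-bit value is its
 * least significant byte (little-endian), 0 otherwise. */
int is_little_endian(void) {
    uint32_t value = 0xaabbccddu;
    const unsigned char *bytes = (const unsigned char *)&value;
    return bytes[0] == 0xdd;
}

void demo_layout(void) {
    printf("stack grows down: %d\n", stack_grows_down());
    printf("little-endian   : %d\n", is_little_endian());
}
```

Both functions report what the current machine does; the results can differ on other architectures.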
