The Road to C Language Learning (Advanced) - Variables and Memory Distribution (Part 2)

Note: This blog is written word by word by the blogger, which is not easy. Please respect the originality, thank you all!

Memory partitioning model for programs

1) Memory partition

1.1 Before running

To run a C program we have written, we first have to build it. The build goes through four steps:

  1. Preprocess: expand macro definitions, expand header files, and handle conditional compilation; syntax is not checked at this stage
  2. Compile: check the syntax and translate the preprocessed file into an assembly file
  3. Assemble: translate the assembly file into an object file (a binary file)
  4. Link: link the object files into an executable program

After the executable file has been built, on Linux we can inspect the basic layout of the binary with the size command:

The output of size shows that before the program runs, that is, before it is loaded into memory, the executable is already divided into three areas: the code area (text), the data area (data), and the uninitialized data area (bss). Some people combine data and bss and call them the static area or the global area.

  • Code area (text)
    Stores the machine instructions executed by the CPU. The code area is usually shareable: for a program that is run frequently, only one copy of the code needs to be kept in memory, and every running instance can use it. The code area is also usually read-only, which prevents the program from accidentally modifying its own instructions. In addition, the instructions in the code area already encode the layout information for local variables.

  • Global initialized data area/static data area (data section)
    Contains global variables that are explicitly initialized in the program, initialized static variables (both global and local static variables), and constant data such as string constants. (Many toolchains actually place read-only constants in a separate read-only section such as .rodata.)

  • Uninitialized data area (also called the bss area)
    Stores uninitialized global variables and uninitialized static variables. Before the program starts executing, the kernel sets the data in this area to 0 or to null pointers (NULL).

Generally speaking, once the source code has been compiled, the program consists of two kinds of segments: program instructions (the code area) and program data (the data area). The code segment holds the program instructions, while the data segment and the .bss segment hold the program data.
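As a rough illustration of the data and bss areas, here is a minimal sketch. The names g_initialized, g_uninitialized, s_uninitialized, bss_starts_zeroed and print_data_addresses are made up for this example, and the exact section each variable lands in depends on the compiler and linker.

```c
#include <stdio.h>

int g_initialized = 42;      /* explicitly initialized global: usually placed in .data */
int g_uninitialized;         /* global without initializer: usually placed in .bss */
static int s_uninitialized;  /* static without initializer: usually placed in .bss as well */

/* Returns 1 if the variables without initializers start out as zero. */
int bss_starts_zeroed(void)
{
    return g_uninitialized == 0 && s_uninitialized == 0;
}

/* Prints the addresses so that the grouping into areas can be observed. */
void print_data_addresses(void)
{
    printf("g_initialized   (data): %p\n", (void *)&g_initialized);
    printf("g_uninitialized (bss):  %p\n", (void *)&g_uninitialized);
    printf("s_uninitialized (bss):  %p\n", (void *)&s_uninitialized);
}
```

Calling print_data_addresses() from main and comparing the output with the sizes reported by the size command is one way to see the two areas; the printed addresses differ from system to system.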

So why separate the program instructions and program data?

  • After the program is loaded into memory, the data and the code can be mapped to two separate memory regions. Because the data region must be readable and writable for the process while the instruction region only needs to be readable, separating them lets the two regions be given read-write and read-only permissions respectively. This prevents the program instructions from being modified, intentionally or unintentionally;
  • When several instances of the same program are running in the system, they all execute the same instructions, so only one copy of the instructions needs to be kept in memory; only the data differs between the instances, which saves a lot of memory. For example, Windows Internet Explorer 7.0 occupied 112844KB of memory when running, of which about 15944KB was private data; in other words, 96900KB was shared. If hundreds of such processes are running in the system, it is easy to see how much memory sharing can save.

1.2 After running

Before the program is loaded into memory, the sizes of the code area and the global area (data and bss) are already fixed, and they cannot change while the program runs. When the executable is run, the operating system loads the program from disk into memory. In addition to the code area (text), the data area (data), and the uninitialized data area (bss) described above, a stack area and a heap area are added.

  • Code area (text segment)
    Loaded from the code segment of the executable file. All executable code is loaded into the code area, and this memory cannot be modified while the program runs.

  • Uninitialized data area (BSS)
    Loaded from the BSS section of the executable file; it may be adjacent to the data segment or separate from it. The data stored here (uninitialized global data and uninitialized static data) lives for the entire run of the program.

  • Global initialized data area/static data area (data segment)
    Loaded from the data segment of the executable file. The data stored here (initialized global data, initialized static data, and read-only literal constants) lives for the entire run of the program.

  • Stack area (stack)
    The stack is a first-in, last-out memory structure. Its allocation and release are handled automatically by the compiler, and it stores function parameter values, return values, local variables, and so on. Stack space is claimed and released continuously while the program runs, so the lifetime of a local variable runs from the moment its stack space is claimed to the moment it is released.

  • Heap area (heap)
    The heap is a large container. Its capacity is much larger than that of the stack, and it has no first-in, last-out ordering. It is used for dynamic memory allocation. In memory, the heap typically lies between the BSS area and the stack area. It is normally allocated and released by the programmer; if the programmer does not release it, the operating system reclaims it when the program ends.

Type | Scope | Lifetime | Storage location
auto variable | within the enclosing {} | current function | stack area
static local variable | within the enclosing {} | entire run of the program | data section if initialized, BSS section if uninitialized
extern variable | the whole program | entire run of the program | data section if initialized, BSS section if uninitialized
static global variable | current file | entire run of the program | data section if initialized, BSS section if uninitialized
extern function | the whole program | entire run of the program | code area
static function | current file | entire run of the program | code area
register variable | within the enclosing {} | current function | stored in a CPU register at run time
string constant | current file | entire run of the program | data section
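The difference between the lifetimes of an auto variable and a static local variable can be sketched as follows. The function names next_auto and next_static are hypothetical and exist only for this illustration.

```c
/* auto variable: created on the stack for every call, so it always restarts at 0. */
int next_auto(void)
{
    int n = 0;
    n++;
    return n;
}

/* static local variable: it lives for the whole run of the program,
   so it keeps its value between calls. */
int next_static(void)
{
    static int n = 0;   /* initialized only once, before the first use */
    n++;
    return n;
}
```

Calling next_auto() repeatedly always gives 1, while next_static() gives 1, 2, 3 and so on, even though both variables are only visible inside their own {}.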

Summary:

Stack area
1. First in, last out (last in, first out)
2. Allocation and release are managed by the compiler
3. The capacity is limited; do not allocate large amounts of data on the stack
Heap area
1. The capacity is much larger than that of the stack area
2. The programmer allocates data manually (malloc) and releases it manually (free)
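To make the last two points concrete, here is a minimal sketch that puts a large working buffer on the heap instead of the stack. The function name sum_with_heap_buffer and its error convention (returning -1) are assumptions made for this example.

```c
#include <stdlib.h>

/* Sums the values 0 .. count-1 using a temporary heap buffer.
   Returns -1 if the allocation fails. */
long long sum_with_heap_buffer(size_t count)
{
    if (count == 0) return 0;

    int *buf = malloc(count * sizeof *buf);   /* large data goes on the heap */
    if (buf == NULL) return -1;

    long long sum = 0;
    for (size_t i = 0; i < count; i++) {
        buf[i] = (int)i;
        sum += buf[i];
    }

    free(buf);                                /* released manually by the programmer */
    return sum;
}
```

With count = 1000000 the buffer takes about 4MB when int is 4 bytes, which may exceed the default stack size on some systems if declared as a local array, but is unremarkable for the heap.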

2) Partition model

2.1 Stack area

The stack is managed by the system. It mainly stores function parameters and local variables. When a function finishes executing, the system releases its stack memory automatically; the user does not have to manage it.

Note on the stack area: do not return the address of a local variable. A local variable is released when the function body finishes executing; accessing it afterwards is an illegal operation and the result is undefined!
Example 1:

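#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int* func03()
{
	int a = 10; // variable created on the stack; it is released when the function returns
	return &a;
}

void test13()
{
	int* p = func03();
	// a was released when func03 returned, so accessing this memory is an illegal operation
	printf("*p = %d\n", *p); // the first print may still show 10 (the stack slot has not been overwritten yet); this is undefined behavior and must not be relied on
	printf("*p = %d\n", *p); // the second print is usually no longer 10
}

int main()
{
	test13();
	system("pause");
	return EXIT_SUCCESS;
}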

Example 2:

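char* getString()
{
	char str[] = "hello cdtaogang"; // the array lives on the stack and is released when the function returns
	return str; // returns the address of released stack memory
}

void test14()
{
	char* p = NULL;
	p = getString();
	printf("p = %s\n", p); // undefined behavior: typically prints garbage
}

int main()
{
	test14();
	system("pause");
	return EXIT_SUCCESS;
}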
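One common way to avoid returning the address of a local array is to let the caller supply the buffer. The following is only a sketch of that idea; the function name get_string_into and its parameters are hypothetical, not part of the original examples.

```c
#include <stddef.h>
#include <string.h>

/* Copies the greeting into a buffer owned by the caller.
   Returns dst on success, or NULL if the buffer is missing or too small. */
char *get_string_into(char *dst, size_t dst_size)
{
    static const char text[] = "hello cdtaogang";

    if (dst == NULL || dst_size < sizeof text) return NULL;
    memcpy(dst, text, sizeof text);   /* sizeof text includes the terminating '\0' */
    return dst;
}
```

The caller declares something like char buf[32] and passes buf and sizeof buf; the buffer then lives as long as the caller's own stack frame, so the returned pointer stays valid inside the caller.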

2.2 Heap area

Heap memory is requested manually by the programmer and released manually. If it is not released manually, the system reclaims it when the program ends, so its lifetime can extend over the entire run of the program. Heap memory is requested with malloc (or with new in C++).

Use of heap area:

Example:

int* getSpace()
{
	int* p = malloc(sizeof(int) * 5);
	if (p == NULL) // allocation failed
	{
		return NULL;
	}
	for (int i = 0; i < 5; i++)
	{
		p[i] = 100 + i;
	}
	return p;
}

void test15()
{
	int* p = getSpace();
	if (p == NULL)
	{
		return;
	}
	for (int i = 0; i < 5; i++)
	{
		printf("p[%d] = %d\n", i, p[i]);
	}
	// release the heap data
	free(p);
	p = NULL; // this step keeps p from becoming a dangling (wild) pointer
}

int main()
{
	test15();
	system("pause");
	return EXIT_SUCCESS;
}

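Note on the heap area: if the calling function holds a null pointer and passes it by value, the called function cannot allocate memory for it through a pointer of the same level. The assignment only changes the called function's own copy, so the caller's pointer remains null.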

Example:

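void allocateSpace(char* pp)
{
	char* temp = malloc(100);
	memset(temp, 0, 100);
	strcpy(temp, "hello cdtaogang");
	pp = temp; // only changes the local copy pp; the caller's pointer is untouched and temp is leaked
}

void test16()
{
	char* p = NULL;
	allocateSpace(p); // p is passed by value
	printf("p = %s\n", p); // p is still NULL here
}

int main()
{
	test16();
	system("pause");
	return EXIT_SUCCESS;
}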

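Solution: use a higher-level pointer to modify the lower-level pointer, that is, pass the address of the pointer (char**) so that the called function can change the caller's pointer.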

Example:

void allocateSpace2(char** pp)
{
	char* temp = malloc(100);
	memset(temp, 0, 100);
	strcpy(temp, "hello cdtaogang");
	*pp = temp; // writes through pp, so the caller's pointer now points to the heap block
}

void test17()
{
	char* p = NULL;
	allocateSpace2(&p); // pass the address of p
	printf("p = %s\n", p);
}
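A slightly more defensive variant of the same pointer-to-pointer idea checks the result of malloc and reports failure to the caller. This is a sketch under stated assumptions: the function name allocate_copy and its return convention (0 for success, -1 for failure) are made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Allocates a heap copy of text and stores its address in *out.
   Returns 0 on success and -1 on failure; on failure *out is left unchanged. */
int allocate_copy(char **out, const char *text)
{
    if (out == NULL || text == NULL) return -1;

    size_t len = strlen(text) + 1;    /* +1 for the terminating '\0' */
    char *temp = malloc(len);
    if (temp == NULL) return -1;

    memcpy(temp, text, len);
    *out = temp;                      /* modify the caller's pointer through the higher-level pointer */
    return 0;
}
```

The caller remains responsible for releasing the block with free once it is no longer needed.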

2.3 Global/static area

Variables in the global/static area have their memory allocated and initialized at the compilation stage, and this memory exists for the whole run of the program. The area mainly stores global variables, static variables, and constants.

Note:
(1) The initialized and uninitialized data areas are not distinguished here, because if a variable in the static storage area is not explicitly initialized, the compiler automatically initializes it with the default value (zero). In other words, there are no uninitialized variables in the static storage area.
(2) The constants in the global/static storage area are divided into constant variables and string constants; once initialized, they cannot be modified. Constant variables in static storage are global const variables, which differ from local const variables: a local const variable is stored on the stack and can in practice be modified indirectly through a pointer (or, in C++, a reference), while a global const variable is stored in the static constant area and cannot be modified indirectly. Note that the C standard leaves the result of modifying any const object undefined.
(3) String constants are stored in the constant area of the global/static storage area.

  1. static variable

Sample code: a static variable is initialized only once, its memory is allocated at the compilation stage, it has internal linkage, and it can only be used in the current file

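#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>


// static variable
static int a = 10; // initialized only once, memory allocated at the compilation stage, internal linkage, usable only in the current file

void test18()
{
	// local static variable: its scope is limited to test18
	// a and b have the same lifetime
	static int b = 20;
}

int main()
{
	g_a = 2000; // error: g_a has internal linkage by default, so it cannot be accessed from outside its own file
	system("pause");
	return EXIT_SUCCESS;
}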

  2. global variables

Sample code: in the C language, global variables implicitly carry the keyword extern, which gives them external linkage

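#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>


// static variable
static int a = 10; // initialized only once, memory allocated at the compilation stage, internal linkage, usable only in the current file

void test18()
{
	// local static variable: its scope is limited to test18
	// a and b have the same lifetime
	static int b = 20;
}

// global variable
extern int c = 100; // in C, global variables implicitly carry the keyword extern and have external linkage

void test19()
{
	extern int g_b; // tells the compiler that g_b is a variable with external linkage, so that using it below does not produce an error
	printf("g_b = %d\n", g_b);
}

int main()
{
	//g_a = 2000; // error: g_a has internal linkage by default, so it cannot be accessed from outside its own file
	test19();
	system("pause");
	return EXIT_SUCCESS;
}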

  3. constant

Sample code: a global variable modified by const is protected by the constant area at run time; even if the syntax passes, the run fails

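#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// 1. a global variable modified by const: even if the syntax passes, it is protected by the constant area at run time and the run fails
const int a1 = 10; // placed in the constant area

void test20()
{
	//a1 = 100; // direct modification fails

	// indirect modification: the syntax passes, but the run fails
	int* p = (int*)&a1;
	*p = 100;

	printf("%d\n", a1);
}

int main()
{
	test20();
	system("pause");
	return EXIT_SUCCESS;
}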

Sample code: for a local variable modified by const, the data is stored in the stack area; in C it is called a pseudo-constant and is not protected by the constant area

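// 2. a local variable modified by const
void test21()
{
	const int b = 10; // the data is stored in the stack area; called a pseudo-constant in C
	// b = 100; // direct modification fails
	// indirect modification succeeds: b is allocated on the stack and has no constant-area protection
	int* p = (int*)&b;
	*p = 100;
	printf("b = %d\n", b);
	//int a[b]; // a pseudo-constant cannot be used as an array size
}

int main()
{
	test21();
	system("pause");
	return EXIT_SUCCESS;
}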

Sample code: string constants can be shared, and modifying a string constant is not allowed

// 3. string constants
void test22()
{
	char* p1 = "hello cdtaogang";
	char* p2 = "hello cdtaogang";
	char* p3 = "hello cdtaogang";
	// string constants can be shared, so these addresses may all be the same
	printf("%p\n", (void*)p1);
	printf("%p\n", (void*)p2);
	printf("%p\n", (void*)p3);
	printf("%p\n", (void*)&"hello cdtaogang");

	p1[0] = 'b'; // modifying a string constant is not allowed (undefined behavior)
	printf("%c\n", p1[0]);
}

int main()
{
	test22();
	system("pause");
	return EXIT_SUCCESS;
}

Can string constants be modified? String constant optimization:

ANSI C specifies that the result of modifying a string constant is undefined. ANSI C does not specify how compiler implementers should handle string constants, for example:
1. Some compilers allow string constants to be modified, and some do not.
2. Some compilers merge multiple identical string constants into one (an optimization that saves space), and some do not. If the optimization is applied, modifying one string constant may also change the others, and the result is unknown.
So try not to modify string constants!
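When a modifiable string is really needed, a char array initialized from the literal gives a private, writable copy. A minimal sketch follows; the helper name set_first_char is hypothetical.

```c
#include <stddef.h>

/* Replaces the first character of a writable, non-empty string in place. */
void set_first_char(char *s, char c)
{
    if (s != NULL && s[0] != '\0') {
        s[0] = c;
    }
}
```

Declaring char copy[] = "hello cdtaogang"; and calling set_first_char(copy, 'b') is fine, because copy is an array whose contents were copied from the literal. Passing a pointer to the literal itself would again be a modification of a string constant.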

Are the addresses of identical string constants the same?

In TC2.0, identical string constants in the same file have different addresses.
In VS2013 and above, identical string constants have the same address both within one file and across different files.
In Dev C++ and QT, the addresses are the same within one file but different across files.

2.4 Summary

When learning about C/C++ memory partitions, you often run into terms such as data area, heap, stack, static area, constant area, global area, string constant area, literal constant area, and code area, which beginners find confusing. Here we try to clarify how these partitions relate to each other.

The data area includes: the heap, the stack, and the global/static storage area.

The global/static storage area includes: the constant area, the global area, and the static area.

The constant area includes: the string constant area and the constant variable area.

Code area: Stores the compiled binary code of the program. It is an unaddressable area.

It can be said that there are actually only two memory partitions in C/C++, namely the code area and the data area.

3) Function call model

3.1 Macro function

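#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MYADD(x, y) x+y
// parenthesized so that the expansion stays one complete expression
#define MYADD2(x, y) ((x)+(y))

// macro replacement is performed in the preprocessing stage
// macro function note: guarantee the integrity of the operation
// macro function use case: wrap short, frequently called functions as macro functions
// advantage: trades space for time (saves the time of pushing and popping the stack)
void test23()
{
	int a = 10;
	int b = 20;

	printf("a + b = %d\n", MYADD(a, b)); // x+y == a+b == 10+20 = 30
	printf("a + b = %d\n", MYADD(a, b) * 20); // x+y*20 == a+b*20 == 10+20*20 == 410
	// with the integrity of the operation guaranteed
	printf("a + b = %d\n", MYADD2(a, b)); // 30
	printf("a + b = %d\n", MYADD2(a, b) * 20); // 600
}

int main()
{
	test23();
	system("pause");
	return EXIT_SUCCESS;
}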
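For comparison, the same addition can be written as a small inline function, which avoids the precedence problem without any parentheses in the definition. This is an optional sketch, not something the original example uses; the name my_add is made up, and static inline requires C99 or later.

```c
#define MYADD2(x, y) ((x) + (y))

/* Function counterpart of MYADD2: each argument is evaluated exactly once
   and the types are checked by the compiler. */
static inline int my_add(int x, int y)
{
    return x + y;
}
```

Both MYADD2(10, 20) * 20 and my_add(10, 20) * 20 evaluate to 600; whether the function call is actually inlined is left to the compiler.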

3.2 Function calling process

The stack is one of the most important concepts in modern computer programs, and almost every program uses it. Without the stack there would be no functions and no local variables, that is, none of the programming languages we see today. Before explaining why the stack is so important, let us first look at its traditional definition:

In classic computer science, a stack is defined as a special container. Users can push data onto the stack (push) or pop data that is already on the stack (pop), but the container must follow one rule: the data pushed first is popped last (First In Last Out, FILO).
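The container described above can be sketched as a small array-based stack. The type IntStack, the capacity STACK_CAPACITY and the function names are all chosen for this illustration; this is the abstract data structure, not the call stack managed by the system.

```c
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 16

typedef struct {
    int items[STACK_CAPACITY];
    size_t count;               /* number of stored items; items[count - 1] is the top */
} IntStack;

/* Makes the stack empty. */
void stack_init(IntStack *s)
{
    s->count = 0;
}

/* Pushes value onto the stack; returns false if the stack is full. */
bool stack_push(IntStack *s, int value)
{
    if (s->count == STACK_CAPACITY) return false;
    s->items[s->count++] = value;
    return true;
}

/* Pops the most recently pushed value into *out; returns false if the stack is empty. */
bool stack_pop(IntStack *s, int *out)
{
    if (s->count == 0) return false;
    *out = s->items[--s->count];
    return true;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1: the value pushed first comes out last.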

In classic operating systems, the stack always grows downward: a push decreases the address of the top of the stack, and a pop increases it.

The stack plays an extremely important role while a program runs. Most importantly, it holds the information that must be maintained for a function call, which is often called a stack frame (Stack Frame) or an activation record (Activation Record). The information needed for a function call generally includes the following:

  • The return address of the function;
  • function parameters;
  • local variables;
  • Saved context: includes registers that need to remain unchanged before and after function calls.

Using the func and main functions below, let us analyze the calling process of a function:

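#include <stdlib.h>

int func(int a,int b){
	// t_a and t_b are local variables in func's stack frame
	int t_a = a;
	int t_b = b;
	return t_a + t_b;
}

int main(){
	int ret = 0;
	ret = func(10, 20); // the arguments 10 and 20 are passed to func
	return EXIT_SUCCESS;
}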

Thinking 1: when the variables a and b are pushed onto the stack, are they pushed from left to right or from right to left?
Thinking 2: are the variables a and b managed and released by the main function (the calling function) or by the func function (the called function)?

3.3 Calling conventions

We now have a general understanding of how a function is called. Throughout this process, the caller and the callee must share the same understanding of the call; for example, both must agree that the function's parameters are pushed onto the stack in a fixed order. Otherwise, the function cannot run correctly.

If the caller pushes parameter a first and then parameter b, while the called function assumes that b was pushed first and a second, then the called function will see the values of a and b swapped.

Therefore, the caller and the callee of a function must have a clear agreement on how the function is called; only when both sides follow the same agreement can the function be called correctly. Such an agreement is called a "Calling Convention". A calling convention generally covers the following aspects:

The order and method of passing function parameters
There are many ways to pass function parameters; the most common is through the stack. The caller pushes the parameters onto the stack, and the function itself reads them from the stack. For functions with multiple parameters, the calling convention specifies the order in which the caller pushes them: left to right or right to left. Some calling conventions also allow parameters to be passed in registers to improve performance.

Stack maintenance method
After the parameters have been pushed onto the stack, the function body is called. Afterwards, all the pushed parameters must be popped so that the stack is in the same state before and after the call. This popping can be done by the caller or by the function itself.

In order to distinguish calling conventions at link time, a calling convention decorates the name of the function. Different calling conventions use different name decoration strategies.

In fact, the C language has several calling conventions, and the default one is cdecl: any function that does not explicitly specify a calling convention uses cdecl. For example, the full declaration of the func function above would be:

 int _cdecl func(int a,int b);

Note: _cdecl is not a standard keyword and may be spelled differently by different compilers. For example, gcc has no _cdecl and uses __attribute__((cdecl)) instead.

Calling convention | Stack cleanup by | Parameter passing | Name decoration
cdecl | function caller | parameters pushed onto the stack from right to left | underscore + function name
stdcall | function itself | parameters pushed onto the stack from right to left | underscore + function name + @ + number of parameter bytes
fastcall | function itself | the first two parameters are passed in registers, the rest on the stack | @ + function name + @ + number of parameter bytes
pascal | function itself | parameters pushed onto the stack from left to right | more complicated, see the related documentation

3.4 Function variable transfer analysis

Simple example:

#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void B()
{
}

void A()
{
	int b; // can be used in A and B (by passing it as an argument), but not in main
	B();
}

int main()
{
	int a;  // can be used in main and in A and B (by passing it as an argument)
	A();
	system("pause");
	return EXIT_SUCCESS;
}

4) The growth direction of the stack and the memory storage direction

4.1 The growth direction of the stack

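#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// growth direction of the stack
void test24()
{
	int a = 10;  // bottom of the stack - high address
	int b = 20;
	int c = 30;
	int d = 40;  // top of the stack - low address

	// print the addresses with %p; the exact order of locals depends on the compiler
	printf("%p\n", (void*)&a);
	printf("%p\n", (void*)&b);
	printf("%p\n", (void*)&c);
	printf("%p\n", (void*)&d);
}

int main()
{
	test24();
	system("pause");
	return EXIT_SUCCESS;
}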
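The order of local variables inside one function is ultimately chosen by the compiler, so another way to observe the stack is to look at one local variable per call. The sketch below records the address of a local at each recursion depth; the names record_frames and all_distinct are hypothetical. On many platforms the recorded addresses decrease with depth, but the code does not rely on that and only checks that every active call has its own storage.

```c
#include <stddef.h>
#include <stdint.h>

/* Stores the address of a local variable for every recursion depth below max_depth. */
void record_frames(uintptr_t addrs[], size_t depth, size_t max_depth)
{
    volatile int local = 0;            /* volatile forces the variable to live in memory */
    if (depth >= max_depth) return;
    addrs[depth] = (uintptr_t)&local;
    record_frames(addrs, depth + 1, max_depth);
    local = 1;                         /* work after the call: local stays alive during it */
}

/* Returns 1 if all n recorded addresses differ from each other. */
int all_distinct(const uintptr_t addrs[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (addrs[i] == addrs[j]) return 0;
        }
    }
    return 1;
}
```

Printing the recorded values after record_frames(addrs, 0, 8) shows how far apart consecutive stack frames are on the machine at hand.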

4.2 Memory storage direction

#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// memory storage direction
void test25()
{
	int a = 0x11223344;
	char* p = (char*)&a;
	// results on a little-endian machine; a big-endian machine gives the reverse order
	printf("%x\n", *p); // 44 - low-order byte - low address
	printf("%x\n", *(p+1)); // 33 - higher-order byte relative to 44 - higher address
	printf("%x\n", *(p + 2)); // 22
	printf("%x\n", *(p + 3)); // 11
}

int main()
{
	test25();
	system("pause");
	return EXIT_SUCCESS;
}
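The same check can be packaged as a small helper that reports the byte order at run time. This is a minimal sketch; the function names lowest_address_byte and is_little_endian are made up, and the code only distinguishes little-endian from everything else.

```c
#include <stdint.h>
#include <string.h>

/* Returns the byte stored at the lowest address of value. */
unsigned char lowest_address_byte(uint32_t value)
{
    unsigned char bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);   /* copy out the in-memory representation */
    return bytes[0];
}

/* Returns 1 on a little-endian machine and 0 otherwise. */
int is_little_endian(void)
{
    return lowest_address_byte(1u) == 1;
}
```

On a little-endian machine lowest_address_byte(0x11223344u) is 0x44, matching the first line printed by test25; on a big-endian machine it is 0x11.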

Origin blog.csdn.net/qq_41782425/article/details/128245252