A detailed analysis of how the stack changes in memory when a C function is called (with color diagrams): how local variables are pushed onto and popped off the stack

Copyright statement: This article is the original article of the blogger and may not be reproduced without the permission of the blogger. Welcome to contact me qq2488890051 https://blog.csdn.net/kangkanglhb88008/article/details/89739105
First, some background knowledge and an outline of the process:

* On a von Neumann computer, a program's instruction codes are loaded from the hard disk into memory before they are executed (on many Harvard-architecture microcontrollers the code is instead executed directly from program memory; for details see my article "The difference between the von Neumann architecture and the Harvard architecture and the processor performance evaluation standard"). The instructions are stored in the code segment of the process, and the instructions of one function are stored at consecutive addresses in an order determined by the compiler. The address of the next instruction is therefore the address of the current instruction plus the length of the current instruction. A function call jumps into the contiguous code of a different function, so before the jump, the address of the instruction that follows the call must be pushed onto the stack so that execution can resume there afterwards.

* The top and bottom of the stack are not defined by which address is higher. They are defined by where pushing and popping happen: the end at which data is pushed and popped is the top of the stack (on Windows, for example, the stack grows from high addresses toward low addresses). The stack is last-in, first-out. The called function is pushed onto the stack last, so when it returns, its space is reclaimed first, and the space is rolled back layer by layer in reverse order of the calls.

* A stack is a stack, even though it is often loosely called a "heap stack". A heap is a heap. Do not mix up the two names. For how the stack and the heap allocate memory and how they differ, see my article "Detailed Explanation of Computer Program Storage Allocation and Overview of C Language Function Calling Process".

* A running program maintains one stack (under an operating system there may be several processes, each with its own independent stack, whereas a bare-metal microcontroller program has only one). The stack changes dynamically: allocating and releasing variables amounts to moving the stack-top pointer. If a value that is about to be released from the stack is still needed, it is popped, which copies it into a CPU register as it leaves the stack. A register such as EAX can hold a temporary value or the function's final return value.

* The CPU has several registers. They mainly hold temporary values of the currently active function (I call the function that is currently running the active function), and their contents change as that function runs. We care most about four of them: EAX, ESP, EBP, and EIP, which are explained below.

 

* The active function pushes its local variables and the values of some registers onto the stack. It is the register values that are pushed, not register addresses: registers are referred to by fixed, publicly known names such as ebx (for details see my article "The structure of the embedded microprocessor and the explanation of the process from power-on to starting to run the program"). The CPU has only one set of registers, but function calls can be nested many levels deep. When the current function calls another one, the called function becomes the active function, so the register values are first saved on the stack and the registers are then used by the new active function. When the new active function finishes, the saved values are popped back into the registers, which restores the execution state from before the call.

* A program has only one stack, but function calls can be nested, and each active function owns its own region of that stack, called its stack frame.

For example, while main is the active function, the stack looks like this:

 

The local variables here are the ones defined in main. What exactly ebx, esi, and edi are used for and why they are pushed (they certainly hold some state that must be preserved) does not matter here; it is enough to know that each active function pushes the values of these three registers onto the stack. The EBP register holds the address of the bottom of the current active function's stack frame, and ESP holds the address of the top of the current stack frame. The CPU reads the address of the next instruction to execute directly from the EIP register. During sequential execution EIP advances on its own, but across a function call it has to be loaded explicitly. That is what happens at the end of a call, as shown later: the return address that was pushed beforehand is popped off the stack, and the same instruction that pops it also writes it into EIP.

 

// Assembly code executed in main, with irrelevant instructions removed. The first column is the address of each instruction.
011C1540 push ebp // Save ebp on the stack. This is the stack-frame bottom address of whatever called main (I don't know exactly who calls main; presumably the operating system). Note that push implicitly does esp-4
011C1541 mov ebp,esp // Copy esp into ebp, which sets ebp for the current frame
011C1543 sub esp,0F0h // Reserve space for the function's locals: the range from ebp-0xF0 up to ebp
011C1549 push ebx
011C154A push esi
011C154B push edi
011C154C lea edi,[ebp-0F0h] // Set edi to ebp-0xF0. The next few instructions can be skipped on a first reading
011C1552 mov ecx,3Ch // Number of dwords in the reserved space: 0xF0>>2 = 0x3C
011C1557 mov eax,0CCCCCCCCh
011C155C rep stos dword ptr es:[edi]
// The rep prefix repeats the instruction it is attached to; ECX holds the repeat count.
// stos stores the value of eax at the address ES:EDI and then advances EDI by 4, so this loop fills the reserved space with 0xCCCCCCCC.
// The call print_out(0, 2) starts here
013D155E push 2 // Push the second actual parameter
013D1560 push 0 // Push the first actual parameter
013D1562 call print_out (13D10FAh) // Push the return address (013D1567 in this case), then jump to print_out
013D1567 add esp,8 // Remove the two actual parameters from the stack
// Note that call implicitly pushes the address of the next instruction onto the stack; that is the so-called return address.
// When the called function reaches its return statement, it prepares to end. The return sequence is:
013D141C mov eax,1 // Put the return value in eax
013D1421 pop edi
013D1422 pop esi
013D1423 pop ebx // Restore the saved registers
013D1424 add esp,0D0h // Release the local-variable space
013D142A cmp ebp,esp // This and the next instruction are VS's __RTC_CheckEsp run-time check that esp has been restored correctly
013D142C call @ILT+315(__RTC_CheckEsp) (13D1140h)
013D1431 mov esp,ebp // Copy ebp into esp, so that esp points at the saved ebp again
013D1433 pop ebp // Pop the saved value back into ebp
013D1434 ret // Pop the return address into EIP, equivalent to pop EIP
Now another function print_out is called in the main function, and its stack changes are as follows:

 

We can see that nested function calls are just the contents of successive active functions stacked on top of one another, each built in the same way. If print_out called yet another function, one more stack frame would be added in the same manner.

Now let's analyze the process and sequence of this stacking:

main is itself called by some other function, which we will not investigate here. Because the stack grows toward lower addresses, we can see that main (the current active function) first puts the local variables defined in main on the stack, followed by the contents of the three registers. Execution then continues until the call to print_out is reached. At that point two 4-byte slots are opened on the stack, because the function has two int parameters. This corresponds to declaring two variables in C. The slots are filled with 0 and 2, which completes the declaration and initialization of the function's parameters. Both steps still happen in main's stack frame, so the parameters of the called function are allocated and assigned by the calling function, not by the called function itself. These are the actual parameters 0 and 2 seen on the stack above.

Before print_out is entered, main also has to save on the stack the address of the instruction that follows the call (the return address in the figure above), so that main knows where to continue once print_out has finished. The call print_out assembly instruction does this automatically. The instruction at that address is the one that reclaims the space occupied by the two actual parameters just allocated, namely add esp,8; this is analyzed in detail later.

(A question: could print_out itself work out this address and tell main when it is about to finish? Of course not. print_out does not know which instruction follows the call, and the function may be called from many different places, each with a different return address. Only the caller knows it.) Once the return address has been pushed, execution enters print_out.

                     

On entry, print_out does the same thing main did on its own entry: it first pushes the caller's (main's) stack-frame bottom address onto the stack.

That address is the bottom of main's stack frame, the memory location the red arrow points to in the figure, and it appears on the stack as ebp(main). It is saved so that, once the call to print_out is over and main is the active function again, with main's stack frame as the current frame, the value can be written back into the EBP register and EBP immediately points at the right place again (the red arrow). ESP must then point at the slot holding edi, which is the top of main's stack frame. The stack then looks exactly as it did before print_out was called, as in the right-hand picture above. With ebp(main) saved, the body of print_out can begin.

Next, the total space required by the local variables of the current active function (print_out) is allocated. (The figure shows 8 bytes, which is not strictly accurate: the values of ebx, esi, and edi are pushed as well, so it should be 20 bytes. The figure is simplified, but the principle is correct.) The local variables and the values of the three registers ebx, esi, and edi then go onto the stack, and the body of the function runs. When the return statement is reached, print_out knows that it is about to end and starts to reclaim its stack frame. If the function has a return value, it is first saved in the EAX register (a void function has nothing to save). The saved values of ebx, esi, and edi are popped back into their registers, and the local variables are no longer needed, so their space can simply be discarded. This is done by copying the value of the EBP register into ESP, so that ESP and EBP point to the same memory location. The top of the stack is now ebp(main), and the stack memory has been reclaimed, as shown in the following figure. The corresponding assembly code is mov esp,ebp.

 

Next, the bottom address of main's stack frame, which was pushed earlier, is restored: ebp(main) is popped off the stack and written into the ebp register.

That is: pop ebp // pops the saved value into the ebp register, restoring ebp so that it again points to the bottom of main's stack frame, as shown below


At this point the stack frame of print_out has been reclaimed and the current frame is main's again, but execution has not yet returned to main's instructions.

Then print_out executes the ret instruction: the return address (stored in main's stack frame) is written into EIP, which is equivalent to pop EIP, as shown below


print_out has now finished completely, and execution is back in main's code. The next instruction reclaims the two slots that main allocated for print_out's formal parameters (allocating them was also done by instructions belonging to main), namely:

add esp,8 // Removes the two actual parameters from the stack, reclaiming their space, as shown below


In other words, the instruction that follows the call to print_out in main is add esp,8. The compiler knows the relationship between these two instructions, so nothing about it is decided at run time. The return address pushed at the start of the call is therefore the address of add esp,8, the instruction that reclaims the space of the actual parameters. And the instruction after that? As noted at the beginning, the instructions of a function are stored at consecutive addresses, so adding the length of the add esp,8 instruction to its address gives the address of the next instruction to be executed.

This completes the call to print_out, and main's stack frame is back in the state it was in before the call, as shown in the figure above.

 

Next, let's look at another example. With the analysis above as a basis, it can be worked through in the same way. The assembly code in it is clearly annotated, so the whole process is easy to follow.

 

 

/-------------------------------------------------------------------------------------------------------------------/


Now, let's summarize how the stack changes when a function is called:

1. The caller allocates, in its own stack frame, the space needed for the formal parameters of the called function and fills in the argument values

2. The return address is pushed: the address of the instruction to execute after the call ends, which is the instruction that reclaims the formal-parameter space allocated in step 1

3. Execution enters the called function, which pushes the stack bottom address of the calling function's stack frame

4. The called function allocates space for its local variables in its new stack frame and stores the local variables there

5. The called function reaches its return statement, which means it is about to end, and it starts to reclaim the space of its stack frame:

        1) If there is a return value, assign it to EAX; otherwise skip this step.

        2) Reclaim the local-variable space, so that esp points at the saved ebp of the calling function

        3) Pop the saved stack bottom address of main's stack frame into the ebp register, so that ebp points to the bottom of main's stack frame again

        4) Pop the return address into the EIP register, so that execution continues in main at the instruction that reclaims the two formal-parameter slots main allocated for the called function

        5) main reclaims the formal-parameter space

This restores main's stack frame to the state it was in before the function was called.

 

One conclusion can be drawn from the above: a function call is a dynamic thing. While it runs, it exists in memory as its stack frame, and once that stack frame has been reclaimed, the call is over.

 

Finally, let's discuss one more question. We saw that the called function passes its return value in the CPU's eax register; values of up to 8 bytes can use the edx:eax register pair, and the calling function only needs to read these registers to get the return value. eax and edx are 32 bits each, so together they can return at most 8 bytes of data. That is enough for basic types such as char, int, pointers, and 64-bit integers (on 32-bit x86, floating-point values such as float and double are normally returned in a floating-point register instead). But what if we want to return a structure whose members total more than 8 bytes? The common approach is to pass a pointer to the structure, but returning a structure by value is allowed by the language, so it is worth understanding how the compiler implements it.

Answer: a compiler can generally produce two builds of the same program, a debug build and a release build. The debug build is meant for the debugger: it is optimized less, so the generated code stays closer to the structure of the C source the developer wrote. The release build is the one shipped to users: the compiler optimizes it heavily and removes useless and unreachable code (if you are interested in code optimization, see a textbook on compiler principles), which makes it harder to debug but faster to run. The underlying principle is basically the same in both. Both builds are briefly described below.

In the first case, a structure of at most 8 bytes is returned, as shown in the following figure:

 

To sum up:

  (1) edx:eax is used to pass the return value. The caller does not need to pass the address of a return-value slot to the add function on the stack. The process is the same as returning a variable of a basic data type.

  (2) The debug build creates a temporary object for the return value in the caller and then copies the temporary object to the address of the variable t in main, which is inefficient. (The release build does not do this: the memory in the red box in the figure above does not exist, and the register values are copied directly into main's variable t, so the release build is more efficient.) The temporary object lives in main's stack frame, which means that main determines the size of the return type before calling add and allocates space for it. When the call completes, the value of the temporary object (the return value) is copied to the variable t on the left-hand side of the assignment. The temporary object has then served its purpose, and main reclaims its space.

 

In the second case, a structure of more than 8 bytes is returned, as shown in the following figure:

 

To sum up:

  (1) A structure larger than 8 bytes cannot be passed in EDX:EAX. Instead, the caller reserves space on its own stack frame for a structure to receive the return value, and pushes the address of that space onto the stack after the actual parameters have been pushed (the blue arrow in the figure above). The called function add writes the return value to that address (the red arrow).

  (2) In main, the debug build has one more temporary object than the release build, which is inefficient. The release build has only the return-value space and the variable t (the temporary object in the red box in the figure does not exist), so it is slightly more efficient than debug. The two models are otherwise basically the same: the contents of the return-value space still have to be copied into the variable t on the left-hand side of the assignment (the t in main), and the return-value space is then reclaimed. Overall this is still less efficient than returning a structure pointer (a pointer occupies only 4 bytes here, so it can be returned directly in the eax register and then assigned to a pointer t), so I recommend returning a pointer when a C function has to return structure data; the code runs more efficiently.

  (3) In both experiments the release build optimizes aggressively, and the assignment to t in main is incomplete: the compiler sees that some members are never used (the members tb and tc, for example, so assigning them is useless code) and does not copy them. This is allowed as long as the resulting code is equivalent (for details, see the code optimization chapter of a textbook on compiler principles).

The assembly code for the two experiments above is not posted here. Compiler optimization is not omnipotent; once we understand what happens underneath, we can write higher-quality, more efficient code.

Welcome to follow my blog. When I have time, I will write some easy-to-understand articles about basic computer theory. On the one hand they record my own learning process, and on the other hand I can share them with others, so that more people can understand how the computers found everywhere in life today actually work.

 

---------------------
author: biao2488890051
source: CSDN
original: https://blog.csdn.net/kangkanglhb88008/article/details/89739105
copyright: This is the blogger's original article; please include a link to the original post when reproducing it.
