Illustrated C/C++ internals: creation and destruction of function stack frames (Part 2)

Creation and destruction of function stack frames (Part 2)

From the walkthrough of the function stack frame in the previous article, we learned the following:

  • What are registers?

    Registers are the fastest storage units in a computer. They are built into the CPU and form a storage space that is separate from main memory.

  • What is a stack?

    A data structure. Elements are pushed in order and come out in reverse: the first element in is the last one out.

    The stack area used by functions is a region of memory managed at the operating-system level; it lives in system memory.

  • The formation process of the function stack frame

    The concept of a function stack frame:

    • Two registers, EBP and ESP, hold addresses, and these two pointers are used to maintain the function stack frame. The memory between the addresses they hold is the stack frame of a function.
    • Every time a function is called, space has to be set aside in the stack area, and these two pointers do that work. Whichever function is currently running, EBP and ESP delimit its memory, and that memory is the function's stack frame. For example, while main is running, esp and ebp point to the top and the bottom of main's stack frame.

    The formation process of the function stack frame:

    • When the program calls a function, EBP and ESP are set up and ESP is moved toward lower addresses to open a stack frame area. Three non-volatile registers are pushed at the top of that area, and ESP moves upward (toward lower addresses) with each push. The program then takes the 0E4h bytes that start at EBP and extend toward lower addresses, ending just beneath the pushed ebx register, and fills them with 0CCCCCCCCh. This filled region is the effective space of the stack frame in the true sense, and it is the scope of the function. At this point the stack frame is initialized, and the program starts to put real content into it, such as declared variables.
  • The process of forming function variables

    After the stack frame has been initialized, each declared variable is stored at an offset from the stack-bottom pointer EBP, upward (toward lower addresses) by the variable's size, and the variables are laid out one after another. Once allocated, these variables cannot be explicitly released by the programmer.

In this article we continue with: how a function is called and its arguments are passed, how the new function's stack frame is created, how the return value and the return itself (destruction of the stack frame) are implemented, how the arguments are destroyed, and how the caller obtains the return value.

Assembly instructions: calling functions and passing parameters

How does the program actually call a function? Let's take a look.

Without further ado, here are the assembly instructions:

image-20220111162651396

With so many assembly instructions already explained, is your DNA starting to stir at the sight of these (just kidding)?

The mov instructions on the first and third lines load the values stored at ebp-14h and ebp-8 into eax and ecx respectively. Looking back at the previous step, what are ebp-14h and ebp-8? Going by their offsets from the stack-bottom pointer toward lower addresses, they are our variables b and a. So these instructions load b into eax and a into ecx.

The push instructions on the second and fourth lines push eax and ecx onto the stack (remember that every push also moves the stack-top pointer esp). And what do eax and ecx hold now? The values of b and a.

At this time, the schematic diagram of the top stack area should be as follows

image-20220113003226231

Don't these four instructions look like preparation for passing parameters? They are. But can mov and push alone really pass parameters into a function? And how does the called function use them? Let's keep reading!

The call on the fifth line is a transfer instruction: it moves execution to another region of code. So that the program can carry on with the next instruction of the original region afterwards, call always pushes the address of that next instruction onto the stack before transferring, which makes it possible to come back once the code in the target region has finished (think of it as placing a ward at your spot before teleporting away to help, so that you can teleport back to your lane afterwards). So after call runs, the top of the stack should hold the address of the next instruction (00C21450). Opening the memory and watch windows confirms this:

image-20220113014530114

Looking further, the symbol to the right of the call instruction is the destination of the transfer. Press F11 in the debugger to step into the call and we land on the instruction at that destination. The jmp there jumps into the Add function (for now it is enough to know that jmp is also a transfer instruction; we will dig into the details later).

image-20220113014858867

Next, we keep stepping. Welcome to the inside of the Add function!

image-20220113015819685

At this point, the function call and parameter passing operation have been completed, and what we can conclude is:

  • Passing parameters: before the call, the program loads each argument into a register and pushes the register's value onto the stack. The order is b->a, that is, the arguments are pushed from right to left (a ends up on top, and by the first-in-last-out rule of the stack that means a was pushed last). The stack-top pointer changes with every push.

  • Calling the function: the program enters the function with the call instruction. call first pushes the address of the instruction that follows the call, so that execution can return to the original function and continue there; it then transfers to the target named by the symbol and finally arrives in the new function.
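The two conclusions above can be tried out on a toy model. The following is a minimal sketch in C, not the author's code: `ToyStack`, `toy_push` and `toy_call_setup` are hypothetical names, the stack is a small array addressed by byte offset, and the argument values 10 and 20 are assumptions chosen to match the result 30 that appears later. Only the return address 00C21450 is taken from the article.

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 64                      /* toy stack: 64 four-byte slots */

typedef struct {
    uint32_t mem[WORDS];              /* stack memory, indexed by address / 4 */
    uint32_t esp;                     /* byte offset of the current stack top */
} ToyStack;

/* push: move esp 4 bytes toward lower addresses, then store the value */
void toy_push(ToyStack *s, uint32_t value) {
    assert(s->esp >= 4);              /* guard against overflowing the toy stack */
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* Models: mov eax,[b] / push eax / mov ecx,[a] / push ecx / call Add */
void toy_call_setup(ToyStack *s, uint32_t a, uint32_t b, uint32_t return_address) {
    uint32_t eax, ecx;
    s->esp = WORDS * 4;               /* empty stack: esp just past the highest slot */
    eax = b;
    toy_push(s, eax);                 /* rightmost argument is pushed first */
    ecx = a;
    toy_push(s, ecx);                 /* leftmost argument is pushed last */
    toy_push(s, return_address);      /* what call pushes before it transfers */
}
```

After `toy_call_setup(&s, 10, 20, 0x00C21450u)`, the slot at `s.esp` holds the return address, the slot 4 bytes higher holds a, and the slot 8 bytes higher holds b, which is the layout the diagram describes.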

At this point, the top of the stack area looks like this:

image-20220113024520348

Assembly instructions: generation of stack frames for new functions

Step into the Add function and look at all the code that runs before the local variable z is created. Does it give you a strong sense of deja vu?

image-20220113111732263

That's right: this code pushes the basic elements of the Add function onto the stack, forms its scope, and finally creates the function's local variables.

One thing to keep in mind: esp and ebp, as mentioned earlier, are the pointers that maintain the currently running function. The value saved by push ebp is therefore the stack-bottom pointer of main, stored so that ebp can return to where it was once the transfer and the function's execution are complete.

What follows sets up the effective stack frame area of Add, which gives the following diagram of the top of the stack (it came out a little pink... pink is justice!):

image-20220113111702493

Next, let's observe how the parameters passed in are used in the new function.

Assembly instructions: use of function parameters

When we first learned C, we were taught a rule about passing arguments to functions: the formal parameter is a temporary copy of the actual parameter. Now let's see how that works!

Straight to the assembly:

image-20220113112827169

Look at these instructions: ebp+8 and ebp+0Ch. Converted from hexadecimal, these are ebp+8 and ebp+12. Given where ebp currently points in the stack frame area, where do these two addresses point?

image-20220113114358423

That's right: they are the function arguments that were pushed onto the stack before we entered the function. The formal parameter and the actual parameter are two independent entities on the stack, so changing the formal parameter does not affect the original actual parameter. That is why a formal parameter is a temporary copy of the actual parameter.

In these instructions, mov loads the value at ebp+8 (a) into eax, and add then adds the value at ebp+12 (b) to eax. This is how the program performs the addition.

On the next line, mov stores the value of eax at ebp-8. What is at ebp-8? The variable z! Its value has now changed from 0 to 30.

image-20220113114858891

We can now see how formal parameters really work. The called function does not create its formal parameters itself: before the call, the caller has already pushed copies of the actual parameters onto the stack, and the function gets the values it needs simply by reaching back to those slots (at positive offsets from ebp). This is how a function uses its formal parameters.
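The copy semantics can be checked directly in C, without looking at any assembly. This is a small sketch with hypothetical names (`add_then_clobber`, `arguments_survive`) and assumed values 10 and 20; the comment about [ebp+8] and [ebp+0Ch] describes the 32-bit layout discussed above, which other platforms need not share.

```c
#include <assert.h>

/* x and y are copies made by the caller. In the 32-bit layout discussed
   above they are the pushed slots read back as [ebp+8] and [ebp+0Ch]. */
int add_then_clobber(int x, int y) {
    int z = x + y;
    x = 0;                 /* overwrites only the copy */
    y = 0;
    return z;
}

/* Returns 1 if the caller's own variables are untouched after the call. */
int arguments_survive(void) {
    int a = 10, b = 20;    /* assumed values */
    int c = add_then_clobber(a, b);
    assert(c == 30);
    return a == 10 && b == 20;
}
```

`arguments_survive()` returns 1: the function zeroed its parameters, yet a and b in the caller still hold 10 and 20.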

Assembly instruction: function return value and implementation of function return (destruction of stack frame)

As we saw, the program assigns a+b to z and then returns z. From what we learned earlier, local variables are destroyed when their scope ends, and z is a local variable created temporarily in the new function. So how does the caller get the value of z? And once the function has finished, how do the stack-top and stack-bottom pointers esp and ebp get back to where they were, and how does the program get back to the next instruction of the original function?

Next, we continue to answer this question through assembly instructions.

image-20220113120354965

After all the assembly we have read together, you get 5 seconds to tell me what the first instruction means. 5...4...3...2...1. Correct: it loads the value at ebp-8 into eax. And ebp-8 is where z lives, so we can conclude that a function's return value is usually placed in the eax register for temporary storage (usually, because a value too large for one register needs more: a 64-bit integer, for example, comes back in the edx:eax pair).

Moving on: what does pop do? It pops the element at the top of the stack, releasing it. There are three pops in a row here. What are they popping? The three non-volatile registers at the top of the function's stack frame. When the function is about to return, these three registers are popped. Note that esp is adjusted with each pop as well (it moves toward higher addresses).

The values of esp and ebp before the pops:

image-20220113183039460

After the three pops:

image-20220113183101547

With the stack area almost unwound, shouldn't esp and ebp go back to where they were too? The next instruction, mov esp,ebp, copies the value of ebp into esp, which releases everything in the frame above ebp.

Next the program executes pop ebp inside Add. The value at the top of the stack is the ebp of main that was pushed on entry to Add. How does ebp get back? This relies on a feature of pop: the instruction can take a register operand that receives the popped value. Here pop removes the saved value from the stack and loads it into ebp, so ebp jumps back to main's stack bottom. That is what pop ebp does here.

The ret that follows is more interesting. What does it do? ret pops the value at the top of the stack and loads it into the instruction pointer (IP, or EIP on 32-bit x86), which transfers control. The instruction pointer holds the offset, within the code segment, of the next instruction to execute. So what is at the top of the stack right now?

image-20220113193952433

Remember what this value at the top of the stack (00C21450) is? If not, here is the answer: it is the address of the instruction after the call, which the program saved before calling the function. With the ret instruction, the program is back in main.

At this point, the schematic diagram of the program stack area can be as follows

image-20220113194551278

From the above we can learn that:

  • A function returns a value by temporarily placing it in the eax register.
  • When the function is about to return, the program pops the three non-volatile registers at the top of the function's stack frame, unwinding the stack area from the top down.
  • Next, the program copies the stack-bottom pointer ebp into the stack-top pointer esp. This shrinks the stack frame area, releasing the local variables and destroying the function's stack frame. The saved ebp of the original function is now at the top of the stack; pop ebp removes it from the stack and loads it into ebp, so ebp returns to the original function.
  • Finally, the program executes ret, which reads the address of the next instruction that was pushed before the function was called, so that execution returns to the original function and continues from there.

According to the above steps, the schematic diagram of the current stack area can be obtained

image-20220113213629637

Assembly instruction: Destruction of formal parameters after return and acquisition of return value

As mentioned earlier, formal parameters are not created inside the new function's stack frame area; they are temporary copies of the actual parameters that the caller pushed onto the stack. So when the function has finished, how are they destroyed? After all the steps so far, and with the current stack frame diagram in front of you, you probably already have an idea. It may well match the answer: pop them off the stack.

Only two instructions are involved here. Without further ado, the assembly:

image-20220113214120484

Now say it out loud: what does add esp,8 mean? (See the previous article, part 3 of the stack frame walkthrough.)

The answer: esp moves 8 bytes toward higher addresses. As we said earlier, one stack slot on a 32-bit machine is 4 bytes, so moving 8 bytes toward higher addresses releases exactly the space that held the two arguments. That is how a function's formal parameters are destroyed, and no rebuttals are accepted QWQ!

By now the program has returned and the formal parameters have been destroyed. The return value asks: "What about me?" Don't worry. Look at the next instruction: what does it do?

The answer is simple: it stores the value of eax at ebp-20h.

And doesn't eax hold the return value of Add? As for ebp-20h: ebp is back in the original function, as we just saw, and ebp-20h is the memory of the variable c we declared earlier. So the return value travels from the called function to the caller like this: it is placed in eax, the program returns to the original function, and the caller then reads the value from eax.
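To see the whole round trip in one place, here is a simplified simulation in C. It is a sketch under stated assumptions rather than a reproduction of the debugger session: `ToyCpu`, `cpu_push`, `cpu_pop`, `toy_add` and `toy_main` are hypothetical names, the callee leaves out the debug-build padding and the three saved registers, and the stack is a small array addressed by byte offset. The return address 00C21450 is the one quoted in the article.

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 64

typedef struct {
    uint32_t mem[WORDS];   /* toy stack memory, indexed by address / 4 */
    uint32_t esp, ebp;     /* byte offsets playing the role of the registers */
    uint32_t eax, eip;
} ToyCpu;

void cpu_push(ToyCpu *c, uint32_t v) {
    assert(c->esp >= 4);
    c->esp -= 4;                               /* a push moves esp to a lower address */
    c->mem[c->esp / 4] = v;
}

uint32_t cpu_pop(ToyCpu *c) {
    assert(c->esp + 4 <= WORDS * 4);
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;                               /* a pop moves esp to a higher address */
    return v;
}

/* Callee side: a stripped-down Add. */
void toy_add(ToyCpu *c) {
    cpu_push(c, c->ebp);                       /* push ebp (save the caller's ebp) */
    c->ebp = c->esp;                           /* mov ebp, esp */
    c->eax  = c->mem[(c->ebp + 8) / 4];        /* mov eax, [ebp+8]   (a) */
    c->eax += c->mem[(c->ebp + 12) / 4];       /* add eax, [ebp+0Ch] (b) */
    c->esp = c->ebp;                           /* mov esp, ebp */
    c->ebp = cpu_pop(c);                       /* pop ebp */
    c->eip = cpu_pop(c);                       /* ret: pop the return address */
}

/* Caller side: returns the value that would be stored into c. */
uint32_t toy_main(ToyCpu *c, uint32_t a, uint32_t b) {
    c->esp = c->ebp = WORDS * 4;               /* empty stack */
    uint32_t esp_before = c->esp;
    cpu_push(c, b);                            /* arguments, right to left */
    cpu_push(c, a);
    cpu_push(c, 0x00C21450u);                  /* call: push the return address */
    toy_add(c);
    c->esp += 8;                               /* add esp, 8: drop both arguments */
    assert(c->esp == esp_before);              /* the stack is balanced again */
    return c->eax;                             /* mov [ebp-20h], eax */
}
```

Calling `toy_main(&cpu, 10, 20)` yields 30, leaves 00C21450 in `eip`, and restores both `esp` and `ebp` to their starting values, which is exactly the bookkeeping the last few sections walked through.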

In the sample code, all that remains is the printf output and the exit of main, which we will not go through.


Creation and destruction of function stack frames - summary

That completes the walkthrough of how a function's stack frame is created and destroyed. You may still feel a bit lost in the fog.
Sure you don't want a summary?
Let's review the whole process and summarize it.

In the broad sense, a function's stack frame extends all the way up to the three non-volatile registers at its top. The region that starts at ebp-0E4h (the address loaded into edi) and is filled toward higher addresses is called the effective area of the function stack frame. Strictly speaking, then, the effective stack frame excludes the three non-volatile registers.

Function stack frame creation can be divided into 3 steps:

  • Step 1: What comes first when a function is called? Getting in! For an ordinary (non-main) function, the call instruction has already pushed the address of the next instruction so that execution can continue there after the call. On entry, the program pushes the ebp of the function that was running (the caller) onto the stack, so that ebp can go back to where it was once the function finishes.
    Because new data has been pushed, the stack-top pointer esp moves up, and then the stack-bottom pointer ebp is moved to where esp is. At this moment both esp and ebp point at the top of the stack.

  • Step 2: The program issues a sub instruction that moves esp toward lower addresses by a fixed amount. The memory between the current ebp and the new esp is this function's stack frame area, which is also the scope of the function. The program then pushes three non-volatile registers, ebx, esi and edi. Preserving these three registers is part of the calling convention, so that callers and callees agree on which registers survive a call.

  • Step 3: The program issues a lea instruction relative to the stack-bottom pointer ebp. Its purpose is to locate the effective space of the function's stack frame: usually the 0E4h bytes below ebp, the region that ends just beneath the previously pushed ebx register. The program fills this space with 0CCCCCCCCh. The filled space is the effective stack frame in the true sense, and it is the scope of this function. The stack frame is now complete, and the program starts executing the function's own code.
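The three creation steps can be modeled the same way. This is a sketch, assuming the step order given above; `ToyCpu`, `toy_reset`, `cpu_push`, `toy_prologue` and `read_uninitialized_local` are hypothetical names, and the plain loop stands in for whatever fill sequence the compiler actually emits. Only the size 0E4h and the fill value 0CCCCCCCCh come from the article.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WORDS 128                 /* toy stack: 512 bytes */
#define FRAME_BYTES 0xE4u         /* size of the local area quoted in the article */

typedef struct {
    uint32_t mem[WORDS];          /* stack memory, indexed by address / 4 */
    uint32_t esp, ebp;
    uint32_t ebx, esi, edi;
} ToyCpu;

void toy_reset(ToyCpu *c) {
    memset(c, 0, sizeof *c);
    c->esp = c->ebp = WORDS * 4;  /* empty stack */
}

void cpu_push(ToyCpu *c, uint32_t v) {
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = v;
}

void toy_prologue(ToyCpu *c) {
    cpu_push(c, c->ebp);          /* step 1: push ebp */
    c->ebp = c->esp;              /*         mov ebp, esp */
    assert(c->esp >= FRAME_BYTES + 12);
    c->esp -= FRAME_BYTES;        /* step 2: sub esp, 0E4h */
    cpu_push(c, c->ebx);          /*         push ebx */
    cpu_push(c, c->esi);          /*         push esi */
    cpu_push(c, c->edi);          /*         push edi */
    /* step 3: fill [ebp-0E4h, ebp) with 0CCCCCCCCh, one dword at a time */
    for (uint32_t addr = c->ebp - FRAME_BYTES; addr < c->ebp; addr += 4)
        c->mem[addr / 4] = 0xCCCCCCCCu;
}

/* Reads the never-assigned local slot at [ebp-8] as a signed 32-bit int. */
int32_t read_uninitialized_local(const ToyCpu *c) {
    int32_t v;
    memcpy(&v, &c->mem[(c->ebp - 8) / 4], sizeof v);
    return v;
}
```

After `toy_reset` and `toy_prologue`, `esp` sits 0E4h + 12 bytes below `ebp`, and `read_uninitialized_local` returns -858993460, which is 0CCCCCCCCh read as a signed 32-bit integer.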

The destruction and return of the function stack frame can also be divided into 3 steps:

  • Step 1: If the function has a return value, the program temporarily places it in the eax register. It then pops the non-volatile registers ebx, esi and edi from the top of the stack.
  • Step 2: The program copies the stack-bottom pointer ebp into the stack-top pointer esp. The space released as esp moves toward higher addresses is the function's stack frame area, local variables included.
  • Step 3: The program reads the original function's ebp, which was saved on the stack, loads it back into ebp so that ebp returns to the original function, and pops that saved value off the stack. Finally, the program reads the address of the next instruction saved on the stack, pops it, and continues execution at that instruction.

Whatever instructions the program executes, the key principle to remember is: every offset is taken relative to the stack-bottom pointer ebp, and every push moves the stack-top pointer esp upward (toward lower addresses).

Let's go back to the questions raised in the previous article. Now that we have studied the function stack frame, we can answer them one by one. Say the answers out loud!

  • How is the scope of a function formed?

Answer : The stack frame of a function is the scope of a function.

  • How are local variables created?

Answer : After the lea (load effective address) instruction, the program has marked out the function's scope. It then sets aside space at successively lower addresses relative to the stack-bottom pointer ebp and stores the initial values there. This is how local variables are created.

  • Why are the values ​​of uninitialized local variables random or garbled?

Answer : After the lea (load effective address) instruction, the program has marked out the function's scope, and local variables are allocated inside it. Every dword of this region was initialized to 0CCCCCCCCh, so an uninitialized variable usually prints as 0CCCCCCCCh in one form or another.

  • How is the function passed parameters? What is the order of passing parameters?

Answer : Before the call, the program loads each argument into a register and pushes the register's value onto the stack. The order is b->a, so here the arguments are pushed from right to left (the diagram shows a on top, and by the first-in-last-out rule of the stack this means a was pushed last).

  • What is the relationship between formal parameters and actual parameters?

Answer : Any local variable created in the called function lives in that function's stack frame area. Formal parameters are different: using one simply means reaching back to the argument data that was pushed onto the stack before the current stack frame was created. Because the formal parameter and the actual parameter are two independent entities on the stack, changing the formal parameter does not affect the actual parameter. That is why the formal parameter is a temporary copy of the actual parameter.

  • How is the function call implemented?

Answer : Before the call, the program pushes the function's arguments onto the stack. During the call, it pushes the address of the next instruction and then the ebp of the current function, so that execution can continue and ebp can return to its place after the call. It then allocates the frame and pushes three non-volatile registers at the top of the stack, which completes the function's stack frame area. That is all a function call does.

  • How does the function return after the end of the call?

Answer : When the stack frame of the called function is destroyed, the program reads the original function's ebp, which was saved on the stack, loads it back into ebp so that ebp returns to the original function, and pops that saved value off the stack. Finally, the program reads the address of the next instruction saved on the stack, pops it, and continues execution at that instruction.

  • Why is there a maximum depth of function recursion? What does the stack overflow error raised by reaching the maximum depth mean?

Answer : Recursion has a maximum depth because every call gets its own stack frame and the stack space is limited. If the recursion goes deeper than the stack can hold, a stack overflow error is raised. The maximum depth differs from function to function, since each function needs a different amount of stack space.
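The claim that every call gets its own frame can be observed from plain C. The sketch below uses hypothetical names (`descend`, `frames_are_distinct`) and records the address of one local variable per recursion level. How far apart those addresses are, and in which direction they move, depends on the compiler and platform, so the code only checks that each level has its own storage.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LEVELS 64

/* Each call level stores the address of one of its own locals. */
void descend(int depth, int levels, uintptr_t *addrs) {
    volatile char marker = 0;                 /* lives in this call's stack frame */
    addrs[depth] = (uintptr_t)(volatile void *)&marker;
    if (depth + 1 < levels)
        descend(depth + 1, levels, addrs);
    marker = 1;                               /* still in use after the nested call */
}

/* Returns 1 if every nested call got its own storage for marker. */
int frames_are_distinct(int levels) {
    uintptr_t addrs[MAX_LEVELS];
    assert(levels >= 1 && levels <= MAX_LEVELS);
    descend(0, levels, addrs);
    for (int i = 1; i < levels; i++)
        if (addrs[i] == addrs[i - 1])
            return 0;
    return 1;
}
```

On a typical x86 build the recorded addresses decrease with each level, and the gap between neighbors approximates the size of one frame of `descend`. Multiply that gap by the recursion depth and it becomes clear why a deep enough recursion runs out of stack.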

epilogue

That wraps up these notes on a function's stack frame, from creation to destruction! The topic is important and there is a great deal to cover, and the author wanted to include all of it, so the full article is quite long (more than 10,000 words). I hope you got a lot out of it. If you have questions or spot omissions, feel free to discuss them in the comment area!

This series should be read together with the first and second articles!

If you found this series helpful, don't forget to like and follow the author! Your encouragement is what keeps me writing and sharing. May we all meet at the top. You are welcome to visit the author's official account, "01 Programming Cabin": follow the cabin and never get lost while learning to program!

Origin blog.csdn.net/weixin_43654363/article/details/124285392