Function parameter passing and the changes of the stack pointer are rarely explained systematically. Many blog posts cover a single point in isolation and give no overall picture of how the stack works. This article explains, at the assembly level and as a whole, how the stack changes when a function is called and how the parameters are passed. A careful read should leave you with a clearer understanding of both.
Stack pointer and related registers
The stack is one of the most common data structures in an operating system. Strictly speaking, the term "heap and stack" covers two different data structures, the heap and the stack, but in everyday use it usually refers only to the stack. For the stack, the two most important pointers are SP (stack pointer) and BP (base pointer).
- SP (Stack Pointer), the stack pointer. On a 32-bit system it is held in the ESP (Extended SP) register; on a 64-bit system, in the RSP register. SP always points to the top of the topmost stack frame of the system stack, so SP is the stack-top pointer
- BP (Base Pointer), the base pointer. On a 32-bit system it is held in the EBP (Extended BP) register; on a 64-bit system, in the RBP register. BP points to the bottom of the current stack frame and is commonly called the stack-bottom pointer
You will find the definitions above in most blogs, but what do these pointers and registers actually do? SP is a pointer, that is, an address: it stores the address of the top of the stack, so that the next stack operation can find the current position of the stack immediately. For example, push decrements sp by 4 and then stores a word-length operand at the new sp, that is, at the address that was sp - 4 before the instruction. The role of BP is described below.
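The push and pop behaviour can be sketched in C. This is only a model under stated assumptions: `SimStack`, `sim_init`, `sim_push` and `sim_pop` are hypothetical names, the stack is a small byte array, and sp is an offset into that array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u  /* size of the simulated stack region (assumption) */

/* Hypothetical model of a 32-bit stack: a byte array plus an sp that holds
 * an offset into it. sp starts at the high end and moves down. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t sp;
} SimStack;

static void sim_init(SimStack *s) {
    memset(s->mem, 0, sizeof s->mem);
    s->sp = STACK_BYTES;            /* empty stack: sp at the highest address */
}

/* push: first sp = sp - 4, then store the operand at the new sp. */
static void sim_push(SimStack *s, uint32_t value) {
    assert(s->sp >= 4);
    s->sp -= 4;
    memcpy(&s->mem[s->sp], &value, sizeof value);
}

/* pop: first load the operand at sp, then sp = sp + 4. */
static uint32_t sim_pop(SimStack *s) {
    uint32_t value;
    assert(s->sp + 4 <= STACK_BYTES);
    memcpy(&value, &s->mem[s->sp], sizeof value);
    s->sp += 4;
    return value;
}
```

Calling `sim_push` twice after `sim_init` leaves sp at `STACK_BYTES - 8`, and two calls to `sim_pop` return the values in reverse order and bring sp back to `STACK_BYTES`.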
Function call
When one function calls another, how does the stack change? How do SP and BP change? How are function parameters passed? Later we will write a simple demo program to deepen the understanding of the stack-related registers.
When a function calls another function, the following steps usually occur:
Assembly instruction | Function the instruction belongs to | SP change | Effect |
---|---|---|---|
push arg2 | Main function | sp-4 | Push the second parameter |
push arg1 | Main function | sp-4 | Push the first parameter |
call function | Main function | sp-4 | Start calling the subroutine while saving the return address |
push ebp | Subfunction | sp-4 | Save the caller's ebp |
mov ebp, esp | Subfunction | unchanged | Save the current sp into bp, the purpose is to locate the function parameters |
sub sp, #num | Subfunction | sp-num | Allocate stack space for the subroutine |
… | Subfunction | … | The specific implementation logic of the function |
pop ebp | Subfunction | sp+4 | Restore the caller's ebp (sp must first be set back to ebp, e.g. with mov esp, ebp, to release the local space) |
ret | Subfunction | sp+4 | Pop the return address and return to the main function |
Explanation
push arg
Before calling a function, the parameters must be pushed onto the stack. Each push adds one word (4 bytes on a 32-bit system) to the stack. Because the stack grows toward lower addresses, the top of the stack moves down by 4 bytes, so the instruction implies sub sp, #4.
call
The call instruction calls a function. It performs two operations: (1) it pushes the return address onto the stack, so sp = sp - 4; (2) it jumps to the first instruction of the called function.
push ebp
mov ebp, esp
You will see these two instructions at the beginning of most functions.
ret
ret means return. At this point sp should point to the return address that the call instruction pushed. Executing ret pops that value from the stack into the eip register, which holds the address of the next instruction to execute. At the same time sp = sp + 4.
The ret instruction is conceptually equivalent to pop eip, which includes esp = esp + 4.
The call instruction is conceptually equivalent to push eip, which includes esp = esp - 4, followed by a jump to the function.
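The whole call and return sequence can be traced with a small C model. This is a sketch, not real machine behaviour: `Cpu`, `push`, `pop`, `call`, `ret` and `run_call_sequence` are hypothetical names, addresses are offsets into a simulated stack, and the code addresses `0x1005` and `0x2000` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 32u   /* simulated stack size in 4-byte words (assumption) */

/* Hypothetical register file and stack of a 32-bit machine. esp and ebp are
 * byte offsets into the simulated stack, so every push moves esp by 4. */
typedef struct {
    uint32_t stack[WORDS];
    uint32_t esp, ebp, eip;
} Cpu;

static void push(Cpu *c, uint32_t v) {
    assert(c->esp >= 4);
    c->esp -= 4;
    c->stack[c->esp / 4] = v;
}

static uint32_t pop(Cpu *c) {
    assert(c->esp / 4 < WORDS);
    uint32_t v = c->stack[c->esp / 4];
    c->esp += 4;
    return v;
}

/* call: push the return address, then jump to the target. */
static void call(Cpu *c, uint32_t return_addr, uint32_t target) {
    push(c, return_addr);
    c->eip = target;
}

/* ret: pop the return address into eip. */
static void ret(Cpu *c) {
    c->eip = pop(c);
}

/* Runs the instruction sequence for func(arg1, arg2) and returns esp as the
 * caller sees it right after ret, before any stack balancing. */
static uint32_t run_call_sequence(Cpu *c, uint32_t arg1, uint32_t arg2,
                                  uint32_t locals_bytes) {
    push(c, arg2);                /* push arg2     : esp - 4 */
    push(c, arg1);                /* push arg1     : esp - 4 */
    call(c, 0x1005u, 0x2000u);    /* call function : esp - 4 */
    push(c, c->ebp);              /* push ebp      : esp - 4 */
    c->ebp = c->esp;              /* mov ebp, esp  : esp unchanged */
    assert(c->esp >= locals_bytes);
    c->esp -= locals_bytes;       /* sub esp, num  : esp - num */
    /* ... function body would run here ... */
    c->esp = c->ebp;              /* mov esp, ebp  : release the locals */
    c->ebp = pop(c);              /* pop ebp       : esp + 4 */
    ret(c);                       /* ret           : esp + 4 */
    return c->esp;
}
```

Starting from esp = 128, the function returns 120: the return address and the saved ebp are gone, but the two parameters pushed by the caller are still on the stack.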
If the code and description above still leave the stack changes unclear, the figure below should help. Blue indicates the assembly instructions of the original (calling) function, and green indicates the assembly instructions of the called function and the corresponding stack space.
Main function calls subfunction
The figure above shows how the stack space changes when the function is called. Reading from the bottom to the top, sp starts at the top of the original function's stack. Here the original sp is assumed to be 0xc1111 0000. The called function always begins with
push ebp
mov ebp, esp
Combined with the figure above, it is easy to see that ebp = 0xc1111 0000 - 4 * 4, because four words have been pushed by then: arg2, arg1, the return address and the saved ebp. While the function is not calling another function, its ebp generally stays unchanged. In other words, ebp + 4 is the address of the return address, ebp + 8 is the address of arg1, and ebp + 12 is the address of arg2. So one role of ebp is to locate the function parameters; local variables on the stack are also located through ebp. The subroutine uses ebp + offset to access the parameters passed by the main program.
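The ebp-relative addressing can be checked with a short C sketch. It assumes a simulated stack held in a byte array; `load32`, `store32`, `build_frame` and `callee_add` are hypothetical helpers, and the return address and saved ebp values used with them are arbitrary markers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u  /* size of the simulated stack (assumption) */

/* Reads the 32-bit word at byte offset addr of the simulated stack. */
static uint32_t load32(const uint8_t *mem, uint32_t addr) {
    uint32_t v;
    assert(addr + 4 <= STACK_BYTES);
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* Writes a 32-bit word at byte offset addr of the simulated stack. */
static void store32(uint8_t *mem, uint32_t addr, uint32_t v) {
    assert(addr + 4 <= STACK_BYTES);
    memcpy(mem + addr, &v, sizeof v);
}

/* Builds the frame that exists right after "push ebp; mov ebp, esp" for a
 * call func(arg1, arg2), and returns the resulting ebp. */
static uint32_t build_frame(uint8_t *mem, uint32_t sp,
                            uint32_t arg1, uint32_t arg2,
                            uint32_t return_addr, uint32_t saved_ebp) {
    assert(sp >= 16 && sp <= STACK_BYTES);
    sp -= 4; store32(mem, sp, arg2);         /* push arg2 */
    sp -= 4; store32(mem, sp, arg1);         /* push arg1 */
    sp -= 4; store32(mem, sp, return_addr);  /* call */
    sp -= 4; store32(mem, sp, saved_ebp);    /* push ebp */
    return sp;                               /* mov ebp, esp */
}

/* The callee's view of c = a + b: a is at [ebp+8], b is at [ebp+12]. */
static uint32_t callee_add(const uint8_t *mem, uint32_t ebp) {
    uint32_t a = load32(mem, ebp + 8);
    uint32_t b = load32(mem, ebp + 12);
    return a + b;
}
```

With an initial sp of `STACK_BYTES`, `build_frame(mem, STACK_BYTES, 1, 2, ...)` returns `STACK_BYTES - 16`, and `callee_add` on that ebp yields 3.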
Subfunction returns to main function
The figure above shows how the stack space changes when the function returns to the previous level. This time, read from the top down. Because push ebp was executed at the start of the called function, pop ebp is needed at its end. The ret operation pops the return address from the stack into eip, and sp increases by 4. After the called function has finished, execution continues with the instruction that follows the call in the main function. But at this point sp points to the original arg1, not to the original top of the main function's stack. If the original stack holds other data, failing to restore sp will cause the main function to reference the wrong data on the stack.
Stack balance
This is where the concept of stack balance comes in: a separate operation on sp is needed so that sp points to the original top of the calling function's stack again. In C there are several calling conventions for functions, for example cdecl and stdcall.
With cdecl, the main program executes add esp, n after the call returns to adjust esp and balance the stack. With stdcall, the subroutine executes ret n when it returns to balance the stack. n is the amount of space, in bytes, occupied by the parameters of the function.
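The difference between the two conventions is only who adds n back to esp. A minimal arithmetic sketch in C, assuming 4-byte parameters; the function names are hypothetical and model nothing but the value of esp:

```c
#include <assert.h>
#include <stdint.h>

/* esp once the caller has pushed nargs 4-byte parameters and executed call.
 * After the callee's epilogue, esp is back at this value, pointing at the
 * return address. */
static uint32_t esp_before_ret(uint32_t esp, uint32_t nargs) {
    assert(esp >= 4 * nargs + 4);
    return esp - 4 * nargs - 4;     /* parameters plus return address */
}

/* cdecl: the callee runs a plain "ret", then the caller runs "add esp, n". */
static uint32_t cdecl_return(uint32_t esp_at_ret, uint32_t n) {
    uint32_t esp = esp_at_ret + 4;  /* ret pops the return address */
    esp += n;                       /* add esp, n (executed by the caller) */
    return esp;
}

/* stdcall: the callee runs "ret n", which pops the return address and then
 * releases n bytes of parameters in the same instruction. */
static uint32_t stdcall_return(uint32_t esp_at_ret, uint32_t n) {
    return esp_at_ret + 4 + n;
}
```

For two parameters (n = 8) and an initial esp of 0x1000, both `cdecl_return` and `stdcall_return` give back 0x1000, whereas a plain ret without any balancing would leave esp at 0x1000 - 8.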
Specific case
/* stack_test.c */
#include <stdio.h>

int func(int a, int b)
{
    int c = a + b;
    return c;
}

int main(int argc, char const *argv[])
{
    int result = func(1, 2); /* the call examined in the disassembly */
    return 0;
}
Compile the above code into a 32-bit program
gcc stack_test.c -m32 -o test
Use objdump or IDA to view the assembly. It shows that the default cdecl convention is used to balance the stack.
objdump -d test -M intel
Main program
Subroutine
64-bit system parameter passing
On a 32-bit system, as described in the previous section, parameters are passed on the stack and accessed through ebp + n. A 64-bit system is different: function parameters are passed directly in registers, so no stack balancing is needed for them.
Compile the previous example into a 64-bit program and then disassemble it. The main program is as follows
Subroutine
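Role of the registers on a 64-bit system: the first six integer or pointer parameters of a 64-bit program are passed in RDI, RSI, RDX, RCX, R8 and R9, in that order (this is the System V AMD64 convention used on Linux). Any further parameters are passed on the stack.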
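The register assignment can be written down as a small lookup in C. This is only an illustration of the order listed above for integer and pointer parameters under the System V AMD64 convention; `sysv_int_arg_register` is a hypothetical helper, not part of any library.

```c
#include <stddef.h>

/* Returns the name of the register that carries the integer or pointer
 * parameter at the given zero-based position, or NULL when that parameter
 * is passed on the stack instead. */
static const char *sysv_int_arg_register(size_t index) {
    static const char *const regs[] = {
        "rdi", "rsi", "rdx", "rcx", "r8", "r9"
    };
    return index < sizeof regs / sizeof regs[0] ? regs[index] : NULL;
}
```

For the example func(1, 2), position 0 (a) maps to rdi and position 1 (b) maps to rsi; position 6, a seventh parameter, returns NULL because it would go on the stack.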
Summary
This article explains function calls and parameter passing from the perspective of assembly. Taking a 32-bit system as an example, it shows how the stack changes dynamically during a call, in particular the esp and ebp registers, and introduces the concept of stack balance in that context. Finally, it describes how parameter passing differs on a 64-bit system: parameters are passed directly in registers, so no stack balancing is needed for them.
The examples in this article use C's default cdecl calling convention. In the 32-bit case, parameters are passed on the stack, and the process can be summarized as follows:
- The main program pushes the parameters onto the stack one by one from right to left, which means that the last parameter is pushed first
- The subroutine accesses the parameters through the ebp register as ebp + n. n = 8 gives the first parameter, n = 12 the second parameter, and so on
- The subroutine uses the ret instruction to return to the main program
- The main program executes add esp, n to balance the stack
- The main program obtains the return value of the subroutine through eax
Finally, here is a picture illustrating the layout of the 4 GB virtual address space of a 32-bit x86 system, to give a macro view of where the stack sits.