Function Stack Frames
Foreword:
To understand the C language in more depth, I studied function stack frames. Most of the time we learn syntax and program logic on top of abstractions that hide what the machine is doing. Watching a stack frame being created and destroyed shows what actually happens underneath a function call, which is why it is worth learning. This blog walks through the creation and destruction of function stack frames step by step. I hope we can learn together, and if anything here is wrong or missing, please point it out. Thank you!
Note:
I am using VS2022 for this demonstration. The output differs between compilers, but the general logic is the same, so this post can still serve as a reference. In general, the newer the compiler, the harder it is to observe the creation and destruction of function stack frames, because the generated code is wrapped in more layers and is more complicated.
First, understand the relevant registers and assembly instructions
1. Registers (registers are part of the CPU)
eax: accumulator register. It is used in arithmetic operations more often than the other registers.
ebx: base register. It holds the base address in memory addressing.
ecx: counter register. It holds loop and repeat counts, for example the repeat count of a string-store instruction.
edx: data register. It works as an extension of eax; for example, after an integer division it holds the remainder.
esi: source index register. It mainly holds the offset of a storage unit within a segment and usually serves as the "source address pointer" in memory instructions.
edi: destination index register. It mainly holds the offset of a storage unit within a segment and usually serves as the "destination address pointer" in memory instructions.
ebp: base pointer. It points to the bottom (the high-address end) of the current stack frame.
esp: stack pointer. It points to the top of the stack (the low-address end of the current stack frame).
esp and ebp both store addresses, and these two addresses maintain the function stack frame: whichever function is currently executing, esp and ebp delimit the stack frame of that function.
In 64-bit builds the corresponding registers are rsp and rbp; esp and ebp are their 32-bit counterparts.
2. Assembly instructions
push: pushes an element onto the top of the stack. (esp decreases at the same time, because the stack top moves.)
pop: pops an element off the top of the stack into the specified location. (esp increases at the same time.)
mov: data transfer instruction. It copies the second (source) operand into the first (destination) operand.
sub: subtraction instruction. It subtracts the second operand from the first and stores the result in the first.
add: addition instruction. It adds the second operand to the first and stores the result in the first.
call: function call. 1. Pushes the return address. 2. Transfers control to the target function.
jmp: modifies eip to transfer control to the target address. Unlike call, it does not push a return address.
lea: load effective address. It computes the address of the second operand and stores that address in the first operand, without reading memory.
Supplement:
The stack area is used from high addresses toward low addresses.
The stack is first-in, last-out (equivalently, last-in, first-out).
push places data on the stack, moving from high addresses toward low addresses.
pop removes data from the stack, moving back from low addresses toward high addresses,
as shown in the figure:
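The push and pop behavior described above can be modeled in a few lines of C. This is only a toy sketch: the type ToyStack and the functions toy_init, toy_push and toy_pop are hypothetical names made up for illustration, and an array index stands in for a real address (a higher index plays the role of a higher address).

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy model of a 32-bit stack. 'esp' is an index into 'mem';
   a higher index stands for a higher address. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    int esp;                      /* index of the current top of the stack */
} ToyStack;

/* An empty stack: esp sits just past the high-address end. */
void toy_init(ToyStack *s) {
    s->esp = STACK_WORDS;
}

/* push: move esp toward lower addresses first, then store the value. */
void toy_push(ToyStack *s, uint32_t value) {
    assert(s->esp > 0);           /* no room left would be a stack overflow */
    s->esp -= 1;
    s->mem[s->esp] = value;
}

/* pop: read the value at the top, then move esp back toward higher addresses. */
uint32_t toy_pop(ToyStack *s) {
    assert(s->esp < STACK_WORDS); /* popping an empty stack is an error */
    uint32_t value = s->mem[s->esp];
    s->esp += 1;
    return value;
}
```

Pushing 3 and then 6 moves esp down by two slots, and the first pop returns 6, which is the last-in, first-out order described above.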
Second, the process of creating and destroying function stack frames
This demo takes vs2022 as an example
demo code:
#include <stdio.h>

int ADD(int x, int y)
{
    int z = x + y;
    return z;
}

int main()
{
    int a = 3, b = 6, c = 0;
    c = ADD(a, b);
    printf("%d\n", c);
    return 0;
}
Preparations:
1) Press F10 to enter debug mode:
2) Open the Call Stack window:
3) Right-click in the code window while debugging, click "Go To Disassembly", and the disassembly view opens:
1. The call of the main function
The main function is itself called by other functions:
1) To make the disassembly easier to read, uncheck "Show symbol names".
2) Press F10. The call stack shows that the main function is called by other functions:
The main() function is called by the invoke_main() function;
the invoke_main() function is called by the __scrt_common_main_seh() function;
the __scrt_common_main_seh() function is called by the __scrt_common_main() function;
the __scrt_common_main() function is called by the mainCRTStartup(void * __formal) function.
Note:
The newer the compiler, the harder the disassembly is to follow. Newer compilers may also optimize the code, so the output may not match this post exactly.
2. Creation of function stack frame
1) The assembly code is as follows:
int main()
{
00CD18B0 push ebp
00CD18B1 mov ebp,esp
00CD18B3 sub esp,0E4h
00CD18B9 push ebx
00CD18BA push esi
00CD18BB push edi
00CD18BC lea edi,[ebp-24h]
00CD18BF mov ecx,9
00CD18C4 mov eax,0CCCCCCCCh
00CD18C9 rep stos dword ptr es:[edi]
00CD18CB mov ecx,0CDC008h
00CD18D0 call 00CD131B
int a = 3, b = 6,c = 0;
00CD18D5 mov dword ptr [ebp-8],3
00CD18DC mov dword ptr [ebp-14h],6
00CD18E3 mov dword ptr [ebp-20h],0
c = ADD(a,b);
00CD18EA mov eax,dword ptr [ebp-14h]
00CD18ED push eax
00CD18EE mov ecx,dword ptr [ebp-8]
00CD18F1 push ecx
00CD18F2 call 00CD1217
00CD18F7 add esp,8
00CD18FA mov dword ptr [ebp-20h],eax
printf("%d\n", c);
00CD18FD mov eax,dword ptr [ebp-20h]
00CD1900 push eax
00CD1901 push 0CD7B30h
00CD1906 call 00CD10CD
00CD190B add esp,8
return 0;
00CD190E xor eax,eax
}
00CD1910 pop edi
00CD1911 pop esi
00CD1912 pop ebx
00CD1913 add esp,0E4h
00CD1919 cmp ebp,esp
00CD191B call 00CD1244
00CD1920 mov esp,ebp
00CD1922 pop ebp
00CD1923 ret
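2) Allocate stack space for the main function
00CD18B0 push ebp /* Push the value of ebp onto the stack. At this point ebp still holds the ebp of
the invoke_main stack frame. esp decreases by 4. */
00CD18B1 mov ebp,esp /* Copy esp into ebp. This produces the ebp of main: it points to the slot
where the ebp of invoke_main was just saved. */
00CD18B3 sub esp,0E4h /* Subtract the hexadecimal number 0xE4 from esp to produce a new esp, which is
the esp of the main stack frame. The ebp from the previous instruction and the current esp now delimit a
block of stack space. This block is the stack frame of main, and it will hold main's local variables,
temporary data, and debugging information. */
00CD18B9 push ebx // push the value of ebx; esp decreases by 4
00CD18BA push esi // push the value of esi; esp decreases by 4
00CD18BB push edi // push the value of edi; esp decreases by 4
/* The three instructions above save the values of three registers on the stack. These registers may be
modified later in the function, so their original values are saved first and restored when the function exits. */
// The code below initializes part of main's stack frame.
// 1. Load the address ebp-24h into edi
// 2. Put 9 into ecx
// 3. Put 0xCCCCCCCC into eax
// 4. Fill the memory from ebp-24h up to ebp (9 dwords = 36 bytes = 0x24) with 0CCCCCCCCh, so every byte becomes 0xCC
00CD18BC lea edi,[ebp-24h] // load the effective address of the second operand into the first operand
00CD18BF mov ecx,9 // repeat count for rep stos
00CD18C4 mov eax,0CCCCCCCCh /* the fill value; four bytes are written each time */
00CD18C9 rep stos dword ptr es:[edi] // store eax at [edi] and advance edi, repeated ecx times. A word is 2 bytes; a dword (double word) is 4 bytes.
00CD18CB mov ecx,0CDC008h // put 0CDC008h into ecx
00CD18D0 call 00CD131B // before jumping, call pushes the address of the next instruction (00CD18D5) onto the stack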
Illustration:
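The fill step performed by lea, mov ecx, mov eax and rep stos can be imitated in C. This is a sketch rather than what the compiler actually emits: rep_stos_dword and all_bytes_equal are hypothetical helper names, and an ordinary array stands in for the region between ebp-24h and ebp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Imitates "rep stos dword ptr es:[edi]": store 'eax' at 'edi', advance edi
   by one dword, and repeat until the count 'ecx' reaches zero.
   Returns the final value of edi. */
uint32_t *rep_stos_dword(uint32_t *edi, uint32_t eax, size_t ecx) {
    assert(edi != NULL);
    while (ecx != 0) {
        *edi = eax;   /* write four bytes */
        edi += 1;     /* move to the next dword (toward higher addresses) */
        ecx -= 1;
    }
    return edi;
}

/* Returns 1 if each of the n bytes starting at p equals 'byte', else 0. */
int all_bytes_equal(const void *p, size_t n, unsigned char byte) {
    const unsigned char *b = (const unsigned char *)p;
    for (size_t i = 0; i < n; i++) {
        if (b[i] != byte) {
            return 0;
        }
    }
    return 1;
}
```

Calling rep_stos_dword on an array of 9 uint32_t values with eax = 0xCCCCCCCC and ecx = 9 writes 9 * 4 = 36 bytes, which is 0x24, the same size as the region that starts at ebp-24h and ends at ebp.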
3) Core code
int a = 3, b = 6,c = 0;//create and initialize the variables a, b and c; this is how local variables are created and initialized
00CD18D5 mov dword ptr [ebp-8],3
00CD18DC mov dword ptr [ebp-14h],6
00CD18E3 mov dword ptr [ebp-20h],0
c = ADD(a,b);
00CD18EA mov eax,dword ptr [ebp-14h]
00CD18ED push eax
00CD18EE mov ecx,dword ptr [ebp-8]
00CD18F1 push ecx
00CD18F2 call 00CD1217
00CD18F7 add esp,8
00CD18FA mov dword ptr [ebp-20h],eax
1) Create and initialize the variables a, b, and c
int a = 3, b = 6,c = 0;//create and initialize the variables a, b and c; this is how local variables are created and initialized
00CD18D5 mov dword ptr [ebp-8],3 //store 3 at address ebp-8 (a)
00CD18DC mov dword ptr [ebp-14h],6 //store 6 at address ebp-14h (b)
00CD18E3 mov dword ptr [ebp-20h],0 //store 0 at address ebp-20h (c)
Illustration:
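2) Call the ADD function
c = ADD(a,b);
00CD18EA mov eax,dword ptr [ebp-14h] //load the value at ebp-14h (b) into eax
00CD18ED push eax //push eax, which holds the value from ebp-14h (b = 6)
00CD18EE mov ecx,dword ptr [ebp-8] //load the value at ebp-8 (a) into ecx
00CD18F1 push ecx //push ecx, which holds the value from ebp-8 (a = 3)
00CD18F2 call 00CD1217 /*Call the ADD function. call pushes the return address 00CD18F7 (the address of the instruction after the call) onto the stack and then jumps to ADD at address 00CD1217. Press F11 to step into ADD. When ADD returns, execution continues at 00CD18F7.*/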
00CD18F7 add esp,8
00CD18FA mov dword ptr [ebp-20h],eax
Figure:
3) Enter the ADD function (press F11 at the call instruction, and then press F11 again).
I restarted the debugger here, so the addresses have changed (00CD... became 00C5...), but the meaning of each instruction is the same.
int main()
{
00C518B0 push ebp
00C518B1 mov ebp,esp
00C518B3 sub esp,0E4h
00C518B9 push ebx
00C518BA push esi
00C518BB push edi
00C518BC lea edi,[ebp-24h]
00C518BF mov ecx,9
00C518C4 mov eax,0CCCCCCCCh
00C518C9 rep stos dword ptr es:[edi]
00C518CB mov ecx,0C5C008h
00C518D0 call 00C5131B
int a = 3, b = 6,c = 0;
00C518D5 mov dword ptr [ebp-8],3
00C518DC mov dword ptr [ebp-14h],6
00C518E3 mov dword ptr [ebp-20h],0
c = ADD(a,b);
00C518EA mov eax,dword ptr [ebp-14h]
00C518ED push eax
00C518EE mov ecx,dword ptr [ebp-8]
00C518F1 push ecx
00C518F2 call 00C51217
00C518F7 add esp,8
00C518FA mov dword ptr [ebp-20h],eax
Press F11 to enter the ADD function
4). Create the ADD function stack frame
5). The execution process of the ADD function
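int z = x + y;
00C51795 mov eax,dword ptr [ebp+8] //load the value at ebp+8 (x) into eax
00C51798 add eax,dword ptr [ebp+0Ch] //add the value at ebp+0Ch (y) to eax
00C5179B mov dword ptr [ebp-8],eax //store eax at address ebp-8 (z)
return z;
00C5179E mov eax,dword ptr [ebp-8] //copy z (at ebp-8) into eax. eax is not part of any stack frame, so the value survives after ADD's frame is destroyed.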
As shown in the figure:
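To see why x is found at ebp+8 and y at ebp+0Ch, the sequence can be replayed on a toy stack in C. This is a simplified sketch under stated assumptions: mem, esp, ebp, push and simulate_add_call are hypothetical names, each array slot stands for one dword (so ebp+8 becomes ebp + 2 slots), and the local area is shortened to 4 slots instead of the real size.

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 32

static uint32_t mem[WORDS];   /* toy stack memory, one slot per dword       */
static int esp = WORDS;       /* slot index; higher index = higher address   */
static int ebp = WORDS;

static void push(uint32_t v) {
    assert(esp > 0);
    mem[--esp] = v;           /* esp moves down, then the value is stored */
}

/* Replays what main and ADD do around "c = ADD(a, b)" and returns the
   value that ADD leaves in eax. */
uint32_t simulate_add_call(uint32_t a, uint32_t b) {
    esp = ebp = WORDS;

    /* in main */
    push(b);                       /* push eax : b is pushed first          */
    push(a);                       /* push ecx : a is pushed second         */
    push(0x00C518F7u);             /* call     : pushes the return address  */

    /* in ADD */
    push((uint32_t)ebp);           /* push ebp : save main's ebp            */
    ebp = esp;                     /* mov ebp,esp                           */
    esp -= 4;                      /* sub esp,N (N shortened to 4 slots)    */

    uint32_t eax = mem[ebp + 2];   /* mov eax,[ebp+8]   : x                 */
    eax += mem[ebp + 3];           /* add eax,[ebp+0Ch] : x + y             */
    mem[ebp - 2] = eax;            /* mov [ebp-8],eax   : z = x + y         */
    eax = mem[ebp - 2];            /* mov eax,[ebp-8]   : return value      */
    return eax;
}
```

Above the new ebp sit, in order, the saved ebp of main (at ebp), the return address (at ebp+4), x (at ebp+8) and y (at ebp+0Ch), which is why the formal parameters are read with positive offsets while the local z uses a negative one.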
6) Overview of the stack after the function stack frames are created:
3. Destruction of the function stack frame
1) Destruction of ADD function stack frame
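00C517A1 pop edi //pop the value at the top of the stack into edi; esp increases by 4
00C517A2 pop esi //pop the value at the top of the stack into esi; esp increases by 4
00C517A3 pop ebx //pop the value at the top of the stack into ebx; esp increases by 4
00C517A4 add esp,0CCh /*add 0CCh to esp, which reclaims the stack frame space of the ADD function*/
00C517AA cmp ebp,esp //compare ebp and esp to check that the stack is balanced
00C517AC call 00C51244 //call the check routine that uses the result of the comparison above
00C517B1 mov esp,ebp //copy ebp into esp
00C517B3 pop ebp //pop the saved value back into ebp, restoring main's ebp; esp increases by 4
00C517B4 ret /*call transfers control to a subroutine; ret inside the subroutine pops the return address pushed by call into eip, which ends the subroutine and returns to the calling function so that it can continue executing*/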
Illustration:
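2) After the ADD function stack frame is destroyed, return to the main function:
After the call to ADD returns, main continues executing, and you can see:
00C518F7 add esp,8 //add 8 to esp, which discards the two arguments (a and b) that main pushed before the call
00C518FA mov dword ptr [ebp-20h],eax /*Store the value in eax at address ebp-20h, which is the variable c in main. At this point eax holds the sum of x and y computed in ADD. This shows that the return value of this call is carried back in the eax register: after the call returns, the program reads the return value from eax.*/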
printf("%d\n", c);
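The teardown steps (mov esp,ebp, pop ebp, ret, and the caller's add esp,8) undo the setup steps one by one. The sketch below replays both halves on a toy CPU in C and checks that esp and ebp end up where they started. The names Cpu, push, pop and frame_round_trip are hypothetical, slot indices stand in for addresses, and the starting values 20 and 28 are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 32

typedef struct {
    uint32_t mem[WORDS];  /* toy stack memory, one slot per dword */
    int esp;              /* slot index of the stack top          */
    int ebp;              /* slot index of the frame base         */
    uint32_t eip;         /* where execution continues            */
} Cpu;

static void push(Cpu *c, uint32_t v) {
    assert(c->esp > 0);
    c->mem[--c->esp] = v;
}

static uint32_t pop(Cpu *c) {
    assert(c->esp < WORDS);
    return c->mem[c->esp++];
}

/* Builds a callee frame and tears it down again.
   Returns 1 if esp, ebp and eip end up at the expected values. */
int frame_round_trip(void) {
    Cpu c = { .esp = 20, .ebp = 28, .eip = 0 };
    const int caller_esp = c.esp;
    const int caller_ebp = c.ebp;
    const uint32_t ret_addr = 0x00C518F7u;

    /* creation */
    push(&c, 6);                 /* push b                                */
    push(&c, 3);                 /* push a                                */
    push(&c, ret_addr);          /* call: push the return address         */
    push(&c, (uint32_t)c.ebp);   /* push ebp                              */
    c.ebp = c.esp;               /* mov ebp,esp                           */
    c.esp -= 4;                  /* sub esp,N (N shortened to 4 slots)    */

    /* destruction */
    c.esp = c.ebp;               /* mov esp,ebp: drop the locals          */
    c.ebp = (int)pop(&c);        /* pop ebp: restore the caller's ebp     */
    c.eip = pop(&c);             /* ret: pop the return address into eip  */
    c.esp += 2;                  /* add esp,8: drop the two arguments     */

    return c.esp == caller_esp && c.ebp == caller_ebp && c.eip == ret_addr;
}
```

frame_round_trip() returns 1 here, which mirrors the point of the listing above: after ret and add esp,8, main's frame is exactly as it was before the arguments were pushed, and only eax carries anything back.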
Summary:
1. Why do uninitialized local variables hold random-looking values or print as "烫" (the Chinese character for "hot")?
When a function stack frame is created, the values at the addresses inside it are not determined by the program, so reading an uninitialized variable returns whatever happens to be there. In this build the region is filled with 0CCCCCCCCh, and in the GBK Chinese character encoding the two bytes 0xCC 0xCC are the code for "烫", so uninitialized memory printed as text shows up as a run of "烫".
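The same fill pattern also explains the integer value that shows up for an uninitialized int in this kind of build. The following sketch is an imitation, not the real mechanism: read_filled_int is a hypothetical helper that fills a 4-byte variable with a given byte by hand and then reads it back.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the value seen when reading a 32-bit signed integer whose four
   bytes all hold the given fill byte. */
int32_t read_filled_int(unsigned char fill) {
    int32_t value;
    assert(sizeof value == 4);
    memset(&value, fill, sizeof value);  /* every byte becomes 'fill' */
    return value;
}
```

With the fill byte 0xCC the four bytes form 0xCCCCCCCC, which read as a signed 32-bit integer is -858993460.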
2. How are the parameters passed when the function is called? What is the order of passing parameters?
The arguments are read from the stack frame of the calling function (here main) through ebp-relative addresses, loaded into eax and ecx, and then pushed onto the stack, so the pushed values are temporary copies. They are pushed from right to left: b first, then a.
3. How are the formal parameters and actual parameters of the function instantiated?
The actual parameters are the values stored in the caller's stack frame, which main reads at ebp-8 and ebp-14h. The formal parameters are the temporary copies pushed onto the stack, which ADD reads at ebp+8 and ebp+0Ch. Because the formal parameters are copies, changing them does not change the actual parameters.
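That the formal parameters are separate copies can be checked directly in C. In this small sketch, add_and_clobber and demo_parameters_are_copies are hypothetical names; the first is a variant of ADD that overwrites its parameters after using them.

```c
#include <assert.h>

/* Like ADD, but overwrites its formal parameters after computing the sum. */
int add_and_clobber(int x, int y) {
    int z = x + y;
    x = 0;   /* changes only the copy that was pushed for this call */
    y = 0;
    return z;
}

/* Calls add_and_clobber and checks that the caller's variables are untouched. */
void demo_parameters_are_copies(void) {
    int a = 3, b = 6;
    int c = add_and_clobber(a, b);
    assert(c == 9);
    assert(a == 3);   /* a in the caller's frame is unchanged */
    assert(b == 6);   /* b in the caller's frame is unchanged */
}
```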
4. How is the value returned after the function call ends?
In the ADD function, the sum 9 computed in eax is first stored in z (at ebp-8 in the ADD function stack frame), and then the value at that address is loaded back into eax. After the ADD function stack frame is destroyed, main copies the value in eax into c (at ebp-20h in the main function stack frame).