This article examines how a C function call works at the level of the stack.
A simple program serves as the example:
#include <stdio.h>

int Add(int x, int y)
{
    int sum = 0;
    sum = x + y;
    return sum;
}

int main()
{
    int a = 2;
    int b = 3;
    int ret = 0;
    ret = Add(a, b);
    return 0;
}
The program defines three local variables in main and then calls Add(), which is defined in the same file. All three locals live on the stack. The rest of this article follows the running program to see how main uses the stack to call Add() and how Add() returns to main.
Run the program under the debugger and open the disassembly view.
The following is the assembly code of main, followed by that of Add:
--- d:\baidunetdiskdownload\vs2013\4-15\4-15\test.c ----------------------------
int main()
{
011B1410 55 push ebp
011B1411 8B EC mov ebp,esp
011B1413 81 EC E4 00 00 00 sub esp,0E4h
011B1419 53 push ebx
011B141A 56 push esi
011B141B 57 push edi
011B141C 8D BD 1C FF FF FF lea edi,[ebp+FFFFFF1Ch]
011B1422 B9 39 00 00 00 mov ecx,39h
011B1427 B8 CC CC CC CC mov eax,0CCCCCCCCh
011B142C F3 AB rep stos dword ptr es:[edi]
int a = 2;
011B142E C7 45 F8 02 00 00 00 mov dword ptr [ebp-8],2
int b = 3;
011B1435 C7 45 EC 03 00 00 00 mov dword ptr [ebp-14h],3
int ret = 0;
011B143C C7 45 E0 00 00 00 00 mov dword ptr [ebp-20h],0
ret = Add(a, b);
011B1443 8B 45 EC mov eax,dword ptr [ebp-14h]
011B1446 50 push eax
011B1447 8B 4D F8 mov ecx,dword ptr [ebp-8]
011B144A 51 push ecx
011B144B E8 91 FC FF FF call 011B10E1
011B1450 83 C4 08 add esp,8
011B1453 89 45 E0 mov dword ptr [ebp-20h],eax
return 0;
011B1456 33 C0 xor eax,eax
}
011B1458 5F pop edi
011B1459 5E pop esi
011B145A 5B pop ebx
011B145B 81 C4 E4 00 00 00 add esp,0E4h
011B1461 3B EC cmp ebp,esp
011B1463 E8 D3 FC FF FF call 011B113B
011B1468 8B E5 mov esp,ebp
011B146A 5D pop ebp
011B146B C3 ret
--- d:\baidunetdiskdownload\vs2013\4-15\4-15\test.c ----------------------------
#include <stdio.h>
int Add(int x, int y)
{
011B13C0 55 push ebp
011B13C1 8B EC mov ebp,esp
011B13C3 81 EC CC 00 00 00 sub esp,0CCh
011B13C9 53 push ebx
011B13CA 56 push esi
011B13CB 57 push edi
011B13CC 8D BD 34 FF FF FF lea edi,[ebp+FFFFFF34h]
011B13D2 B9 33 00 00 00 mov ecx,33h
011B13D7 B8 CC CC CC CC mov eax,0CCCCCCCCh
011B13DC F3 AB rep stos dword ptr es:[edi]
int sum = 0;
011B13DE C7 45 F8 00 00 00 00 mov dword ptr [ebp-8],0
sum = x + y;
011B13E5 8B 45 08 mov eax,dword ptr [ebp+8]
011B13E8 03 45 0C add eax,dword ptr [ebp+0Ch]
011B13EB 89 45 F8 mov dword ptr [ebp-8],eax
return sum;
011B13EE 8B 45 F8 mov eax,dword ptr [ebp-8]
}
011B13F1 5F pop edi
011B13F2 5E pop esi
011B13F3 5B pop ebx
011B13F4 8B E5 mov esp,ebp
011B13F6 5D pop ebp
011B13F7 C3 ret
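First we need to understand two registers:
ebp: the base pointer, which holds the address of the bottom of the current stack frame (its highest address)
esp: the stack pointer, which holds the address of the top of the stack (the lowest address in use)
Every function call sets up its own region of the stack, delimited by ebp and esp.
Before main starts executing, ebp and esp delimit the region belonging to mainCRTStartup, the function that calls main.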
Now we step through main's assembly code to follow the function call and the creation and destruction of the stack frames.
011B1410 55 push ebp
push places a value on the stack: esp is decreased by 4 and the value of ebp (at this point still the base pointer of mainCRTStartup) is stored at the new top of the stack, which esp points to. The effect is as follows:
011B1411 8B EC mov ebp,esp
This instruction copies the value of esp into ebp, so ebp now points to the same location as esp. The result is shown below:
011B1413 81 EC E4 00 00 00 sub esp,0E4h
This instruction subtracts 0E4h from esp. Because the stack grows from high addresses toward low addresses, subtracting from esp reserves space: here 0E4h (228) bytes are set aside for main, and esp again points to the top of the stack. The result is shown below:
011B1419 53 push ebx
011B141A 56 push esi
011B141B 57 push edi
These three lines push ebx, esi, and edi one after another, saving their values so they can be restored before main returns. esp moves down by 4 bytes with each push and still points to the top of the stack. The result is shown below:
011B141C 8D BD 1C FF FF FF lea edi,[ebp-0E4h]
//lea means "load effective address". ebp-0E4h is the lowest address of the space just reserved for main, so this line loads the start address of that space into edi (the raw listing shows the same operand as [ebp+FFFFFF1Ch])
011B1422 B9 39 00 00 00 mov ecx,39h
//put 39h into ecx
011B1427 B8 CC CC CC CC mov eax,0CCCCCCCCh
//put 0CCCCCCCCh into eax
011B142C F3 AB rep stos dword ptr es:[edi]
//edi holds the start address of main's space; starting there, store the contents of eax (0CCCCCCCCh) repeatedly, ecx (39h) times, advancing edi by 4 bytes after each store
In other words, every dword in this region is initialized to 0CCCCCCCCh. The result is shown below:
After these four lines have run, we can inspect memory: the 57 dwords starting at ebp-0E4h (39h is 57, and 57 * 4 = 228 = 0E4h bytes) are all set to 0CCCCCCCCh.
int a = 2;
011B142E C7 45 F8 02 00 00 00 mov dword ptr [ebp-8],2
int b = 3;
011B1435 C7 45 EC 03 00 00 00 mov dword ptr [ebp-14h],3
int ret = 0;
011B143C C7 45 E0 00 00 00 00 mov dword ptr [ebp-20h],0
These three instructions create the local variables. To see the operands as offsets such as [ebp-8] rather than as variable names, right-click in the disassembly window and uncheck "Show Symbol Names". The effect of the three instructions is shown in the following figure:
The memory window shows the same thing:
Local variables are stored on the stack, so the region just set up for main is called the stack frame of main.
Next we continue down the assembly code to the call itself:
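ret = Add(a, b);
011B1443 8B 45 EC mov eax,dword ptr [ebp-14h]
//load the value at ebp-14h (b) into eax; esp still points to the top of the stack
011B1446 50 push eax
//push eax onto the top of the stack
011B1447 8B 4D F8 mov ecx,dword ptr [ebp-8]
//load the value at ebp-8 (a) into ecx; esp still points to the top of the stack
011B144A 51 push ecx
//push ecx onto the top of the stack
011B144B E8 91 FC FF FF call _Add (011B10E1h)
//the call instruction invokes the function; this is the most important step
//pressing F11 here steps into the call and first lands on this jump: 011B10E1 E9 DA 02 00 00 jmp Add (011B13C0h);
//in memory, one more value has been pushed on top of the 2: the address of the instruction that follows the call, shown on the next line. The function needs this address in order to return. See the figure below.
011B1450 83 C4 08 add esp,8
//this is the instruction at the saved return address; it runs only after Add returns. The stack at the moment of the call is shown in the figure: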
Then we press F11 to step into the Add function:
011B13C0 55 push ebp
//the value pushed here is main's ebp
011B13C1 8B EC mov ebp,esp
011B13C3 81 EC CC 00 00 00 sub esp,0CCh
011B13C9 53 push ebx
011B13CA 56 push esi
011B13CB 57 push edi
011B13CC 8D BD 34 FF FF FF lea edi,[ebp-0CCh]
011B13D2 B9 33 00 00 00 mov ecx,33h
011B13D7 B8 CC CC CC CC mov eax,0CCCCCCCCh
011B13DC F3 AB rep stos dword ptr es:[edi]
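These instructions follow the same pattern as the start of main: save the caller's ebp, set up a new frame, reserve 0CCh bytes, save three registers, and fill the reserved space with 0CCCCCCCCh. The stack after they execute looks like this: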
int sum = 0;
011B13DE C7 45 F8 00 00 00 00 mov dword ptr [ebp-8],0
sum = x + y;
011B13E5 8B 45 08 mov eax,dword ptr [ebp+8]
//ebp+8 holds the parameter x, the copy of a that main pushed
011B13E8 03 45 0C add eax,dword ptr [ebp+0Ch]
//ebp+0Ch holds the parameter y, the copy of b; eax now holds x + y
011B13EB 89 45 F8 mov dword ptr [ebp-8],eax
//ebp-8 is where sum is stored; the value of eax (x + y) is written to sum, so sum = 5
return sum;
011B13EE 8B 45 F8 mov eax,dword ptr [ebp-8]
//load sum back into eax; eax now carries the return value
The effect diagram is as follows:
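011B13F1 5F pop edi
011B13F2 5E pop esi
011B13F3 5B pop ebx
//pop removes the value at the top of the stack. The saved edi, esi, and ebx are restored in the reverse order of the pushes, esp moves up by 4 each time, and those three slots are released
011B13F4 8B E5 mov esp,ebp
//copy the value of ebp into esp, which discards Add's local space in one step
011B13F6 5D pop ebp
//the value popped is main's ebp, saved when Add began, so ebp once again points to the base of main's stack frame
011B13F7 C3 ret
//ret pops the value at the top of the stack, which is the address of the instruction after the call, and jumps to it. This is why that address was saved earlier: it tells Add where to resume in main
011B1450 83 C4 08 add esp,8
//adding 8 to esp steps over the two arguments that were pushed for the call. At this point the stack frame of Add has been completely destroyed
011B1453 89 45 E0 mov dword ptr [ebp-20h],eax
//eax holds the value of sum from Add; storing eax at ebp-20h (ret) delivers the return value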
At this point, the function call is complete.
The final state of the stack is shown below:
This concludes the walk through the function call and the creation and destruction of the stack frame.