Disassembly analysis of the C language
Empty function disassembly
#include "stdafx.h"
// Empty function
void function(){
}
int main(int argc, char* argv[])
{
    // Call the empty function
    function();
    return 0;
}
We analyze this empty function by examining its disassembly.
Outside the function
12: function();
00401048 call @ILT+5(function) (0040100a)
13: return 0;
0040104D xor eax,eax
14: }
0040104F pop edi
00401050 pop esi
00401051 pop ebx
00401052 add esp,40h
00401055 cmp ebp,esp
00401057 call __chkesp (004010e0)
0040105C mov esp,ebp
0040105E pop ebp
0040105F ret
Inside the function
6: void function(){
00401010 push ebp
00401011 mov ebp,esp
00401013 sub esp,40h
00401016 push ebx
00401017 push esi
00401018 push edi
00401019 lea edi,[ebp-40h]
0040101C mov ecx,10h
00401021 mov eax,0CCCCCCCCh
00401026 rep stos dword ptr [edi]
7:
8: }
00401028 pop edi
00401029 pop esi
0040102A pop ebx
0040102B mov esp,ebp
0040102D pop ebp
0040102E ret
Analyzing the function
Function call
00401048 call @ILT+5(function) (0040100a)
First, the call instruction invokes our function. In this debug build, the target 0040100a is an entry in the incremental link table (ILT), a jmp that forwards to the function body at 00401010.
Inside the function
Next, we step inside the function.
Drawing on our earlier experience with stack diagrams, it is easy to see that although the function is empty, its assembly code still performs the following steps:
Raise the stack
Save the context
Initialize the raised stack
Restore the context
Return
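The five steps above can be replayed on a toy model of the stack. The sketch below is only an illustration under stated assumptions: the `Machine` type, `machine_init`, and `run_empty_function` are hypothetical names, the stack is modeled as an array of dwords, and `esp`/`ebp` are slot indices rather than real addresses. Each statement is annotated with the instruction it stands for.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model, not real machine state: the stack is an array of dwords and
   esp/ebp are slot indices (one slot = 4 bytes). All names are hypothetical. */
enum { STACK_SLOTS = 64, FRAME_SLOTS = 0x10 /* 40h bytes / 4 */ };

typedef struct {
    uint32_t mem[STACK_SLOTS];
    uint32_t esp, ebp, ebx, esi, edi;
} Machine;

void machine_init(Machine *m) {
    memset(m, 0, sizeof *m);
    m->esp = STACK_SLOTS;              /* empty stack: esp is just past the end */
}

static void push(Machine *m, uint32_t v) {
    assert(m->esp > 0);
    m->mem[--m->esp] = v;              /* a push moves esp toward lower addresses */
}

static uint32_t pop(Machine *m) {
    assert(m->esp < STACK_SLOTS);
    return m->mem[m->esp++];
}

/* Replays the body of the empty function and returns esp afterwards. */
uint32_t run_empty_function(Machine *m) {
    push(m, m->ebp);                   /* push ebp       (raise the stack) */
    m->ebp = m->esp;                   /* mov ebp,esp */
    assert(m->esp >= FRAME_SLOTS + 3);
    m->esp -= FRAME_SLOTS;             /* sub esp,40h */

    push(m, m->ebx);                   /* push ebx       (save the context) */
    push(m, m->esi);                   /* push esi */
    push(m, m->edi);                   /* push edi */

    for (uint32_t i = 0; i < FRAME_SLOTS; i++)       /* lea / mov / mov / rep stos */
        m->mem[m->ebp - FRAME_SLOTS + i] = 0xCCCCCCCCu;

    m->edi = pop(m);                   /* pop edi        (restore the context) */
    m->esi = pop(m);                   /* pop esi */
    m->ebx = pop(m);                   /* pop ebx */
    m->esp = m->ebp;                   /* mov esp,ebp */
    m->ebp = pop(m);                   /* pop ebp */
    return m->esp;                     /* ret would now pop the return address */
}
```

Calling `machine_init` and then `run_empty_function` should leave `esp`, `ebp`, `ebx`, `esi`, and `edi` exactly as they were on entry, which is the sense in which the function is "balanced".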
Raising the stack
00401010 push ebp
00401011 mov ebp,esp
00401013 sub esp,40h
Saving the context
00401016 push ebx
00401017 push esi
00401018 push edi
PS: The earlier push ebp is also part of saving the context.
Initializing the raised stack
00401019 lea edi,[ebp-40h]
0040101C mov ecx,10h
00401021 mov eax,0CCCCCCCCh
00401026 rep stos dword ptr [edi]
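In C terms, these four instructions amount to a fill loop. The sketch below is a rough equivalent, assuming the direction flag is clear so that edi advances upward; `fill_frame` is a hypothetical name. Note that 10h dwords of 4 bytes each cover exactly the 40h bytes reserved by sub esp,40h, and that 0xCC is also the opcode of the int 3 breakpoint instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rough C equivalent of:
     lea edi,[ebp-40h]      -> the caller passes the start address as edi
     mov ecx,10h            -> the caller passes the dword count as ecx
     mov eax,0CCCCCCCCh
     rep stos dword ptr [edi]                                            */
void fill_frame(uint32_t *edi, size_t ecx) {
    const uint32_t eax = 0xCCCCCCCCu;
    assert(edi != NULL);
    while (ecx != 0) {      /* rep: repeat until ecx reaches 0 */
        *edi++ = eax;       /* stos: store eax at [edi], then advance edi by 4 */
        ecx--;
    }
}
```

For a frame declared as `uint32_t frame[0x10];`, the call `fill_frame(frame, 0x10)` fills all 40h bytes and nothing beyond them.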
Restoring the context
00401028 pop edi
00401029 pop esi
0040102A pop ebx
0040102B mov esp,ebp
0040102D pop ebp
PS: The mov esp,ebp here lowers the stack, undoing the earlier sub esp,40h that raised it, so it also counts as restoring the context.
Return
0040102E ret
After the function returns
As expected, after the function returns, execution resumes at the instruction following the call. Let's take a look:
0040104D xor eax,eax
This clears eax. Note that our statement is return 0, so eax is being used here to pass the return value.
Generally speaking, eax carries the return value of a function, but this is not absolute: some functions return their value in memory or by other means, so each case needs to be analyzed on its own.
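To illustrate the difference, here is a minimal sketch. It assumes a typical 32-bit x86 compiler, where a small integer result travels in eax while a large struct is commonly returned through memory supplied by the caller. The `Big` type and both function names are hypothetical, and the second function writes out by hand what a compiler often generates for a function returning `Big` by value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 32-byte type, too large to fit in eax. */
typedef struct { int values[8]; } Big;

/* A small integer result: typically placed in eax before ret. */
int return_in_register(void) {
    return 0;
}

/* A large result written through a pointer supplied by the caller.
   Compilers often lower "Big f(void)" to something of this shape. */
void return_in_memory(Big *out) {
    assert(out != NULL);
    for (int i = 0; i < 8; i++)
        out->values[i] = i * i;
}
```

When reading a disassembly, a pointer argument that the callee only writes through is a hint that the return value is being passed in memory rather than in eax.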
Then look at the following code:
0040104F pop edi
00401050 pop esi
00401051 pop ebx
Clearly, this restores the context. Don't forget that our main program, main, is itself a function, so these pops restore the registers that main saved when it was entered.
Moving on:
00401052 add esp,40h
00401055 cmp ebp,esp
00401057 call __chkesp (004010e0)
Here, esp is first increased by 40h, releasing the 40h bytes of stack space that main reserved on entry; then ebp and esp are compared; finally, the __chkesp function is called.
As the name suggests, chkesp = check esp: this function checks whether the stack is balanced.
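The idea behind the check can be sketched in C by tracking esp as a plain number. This is only a model under stated assumptions: `stack_is_balanced` and `main_epilogue_check` are hypothetical names, the `leaked` parameter stands for bytes that a faulty call left on the stack, and the real __chkesp reports an error through the debug runtime rather than returning a flag.

```c
#include <assert.h>
#include <stdint.h>

/* cmp ebp,esp: the stack is balanced only if the two are equal. */
int stack_is_balanced(uint32_t ebp, uint32_t esp) {
    return ebp == esp;
}

/* Replays the esp arithmetic of main. 'leaked' is the number of bytes
   (hypothetical) that a badly balanced call left on the stack. */
int main_epilogue_check(uint32_t entry_esp, uint32_t leaked) {
    assert(entry_esp >= 4u + 0x40u + 12u + leaked);
    uint32_t esp = entry_esp - 4u;   /* push ebp */
    uint32_t ebp = esp;              /* mov ebp,esp */
    esp -= 0x40u;                    /* sub esp,40h */
    esp -= 12u;                      /* push ebx / push esi / push edi */
    esp -= leaked;                   /* imbalance introduced inside the body */
    esp += 12u;                      /* pop edi / pop esi / pop ebx */
    esp += 0x40u;                    /* add esp,40h */
    return stack_is_balanced(ebp, esp);   /* cmp ebp,esp + call __chkesp */
}
```

With `leaked` equal to 0 the check passes, because add esp,40h brings esp back to exactly the value that mov ebp,esp recorded. Any nonzero `leaked` makes the comparison fail.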
Continuing:
0040105C mov esp,ebp
0040105E pop ebp
This is still restoring the context.
Finally, main returns:
0040105F ret
Summary of the empty function analysis
As we can see, even though the empty function does nothing, calling it still generates quite a lot of assembly code.
There is code to save the context, restore the context, and check the stack balance. As the saying goes, the sparrow may be small, but it has all the vital organs.
Simple addition function disassembly
With the experience of analyzing the empty function behind us, let's analyze a simple addition function.
#include "stdafx.h"
int Plus(int x,int y){
    return x+y;
}
int main(int argc, char* argv[])
{
    // Call the addition function
    Plus(1,2);
    return 0;
}
Outside the function
16: Plus(1,2);
004010A8 push 2
004010AA push 1
004010AC call @ILT+0(Plus) (00401005)
004010B1 add esp,8
17: return 0;
004010B4 xor eax,eax
18: }
004010B6 pop edi
004010B7 pop esi
004010B8 pop ebx
004010B9 add esp,40h
004010BC cmp ebp,esp
004010BE call __chkesp (004010e0)
004010C3 mov esp,ebp
004010C5 pop ebp
004010C6 ret
Inside the function
10: int Plus(int x,int y){
00401060 push ebp
00401061 mov ebp,esp
00401063 sub esp,40h
00401066 push ebx
00401067 push esi
00401068 push edi
00401069 lea edi,[ebp-40h]
0040106C mov ecx,10h
00401071 mov eax,0CCCCCCCCh
00401076 rep stos dword ptr [edi]
11: return x+y;
00401078 mov eax,dword ptr [ebp+8]
0040107B add eax,dword ptr [ebp+0Ch]
12: }
0040107E pop edi
0040107F pop esi
00401080 pop ebx
00401081 mov esp,ebp
00401083 pop ebp
00401084 ret
Analyzing the function
Function call
004010A8 push 2
004010AA push 1
004010AC call @ILT+0(Plus) (00401005)
Compared with the empty function, the call sequence here clearly contains two additional push instructions.
They push the arguments required by the function onto the stack, here 2 and 1. Note that the arguments are pushed in reverse order, from right to left (this is determined by the calling convention, which will be explained in detail in the next note).
Inside the function: raising the stack, saving the context, and initialization
Raising the stack, saving the context, and initialization are exactly the same as in the empty function, so I won't explain them again here.
00401060 push ebp
00401061 mov ebp,esp
00401063 sub esp,40h
00401066 push ebx
00401067 push esi
00401068 push edi
00401069 lea edi,[ebp-40h]
0040106C mov ecx,10h
00401071 mov eax,0CCCCCCCCh
00401076 rep stos dword ptr [edi]
Actual execution
00401078 mov eax,dword ptr [ebp+8]
0040107B add eax,dword ptr [ebp+0Ch]
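Here [ebp+8] is the argument 1 that we pushed earlier, and [ebp+0Ch] is the argument 2. [ebp] holds the saved ebp and [ebp+4] holds the return address pushed by call, so the arguments start at [ebp+8].
So these two instructions are effectively: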
00401078 mov eax,1
0040107B add eax,2
The result of 1+2 is stored in eax (here eax serves as the carrier of the function's return value).
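The offsets [ebp+8] and [ebp+0Ch] follow directly from the order of the pushes, which can be checked on a small simulated stack. The sketch below is a model under stated assumptions, not the compiler's output: the `Cpu` type and `simulate_plus_call` are hypothetical names, and the stack is a 256-byte array addressed by byte offset.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_BYTES = 256 };

/* Toy CPU state (hypothetical): esp and ebp are byte offsets into mem. */
typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp, ebp, eax;
} Cpu;

static void push(Cpu *c, uint32_t v) {
    assert(c->esp >= 4);
    c->esp -= 4;                       /* esp moves down by one dword */
    c->mem[c->esp / 4] = v;
}

static uint32_t load(const Cpu *c, uint32_t addr) {
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    return c->mem[addr / 4];
}

/* Replays the call Plus(x, y) up to the add and returns eax. */
uint32_t simulate_plus_call(uint32_t x, uint32_t y) {
    Cpu c = { .esp = STACK_BYTES, .ebp = 0, .eax = 0 };
    push(&c, y);                       /* push 2   (last argument first) */
    push(&c, x);                       /* push 1 */
    push(&c, 0x004010B1u);             /* call: pushes the return address */
    push(&c, c.ebp);                   /* push ebp */
    c.ebp = c.esp;                     /* mov ebp,esp */
    /* Layout now: [ebp] saved ebp, [ebp+4] return address,
                   [ebp+8] x,       [ebp+0Ch] y              */
    c.eax = load(&c, c.ebp + 8);       /* mov eax,dword ptr [ebp+8] */
    c.eax += load(&c, c.ebp + 0x0C);   /* add eax,dword ptr [ebp+0Ch] */
    return c.eax;
}
```

Because the last argument is pushed first, the first argument ends up at the lowest address, closest to ebp.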
Restoring the context and returning
The rest is the same as in the empty function, so restoring the context and returning will not be explained again.
0040107E pop edi
0040107F pop esi
00401080 pop ebx
00401081 mov esp,ebp
00401083 pop ebp
00401084 ret
After the function returns
004010B1 add esp,8
17: return 0;
004010B4 xor eax,eax
18: }
004010B6 pop edi
004010B7 pop esi
004010B8 pop ebx
004010B9 add esp,40h
004010BC cmp ebp,esp
004010BE call __chkesp (004010e0)
004010C3 mov esp,ebp
004010C5 pop ebp
004010C6 ret
After the function returns, we find one extra line of code compared with the empty function:
004010B1 add esp,8
This instruction deals with the two arguments, 1 and 2, that we pushed earlier. Pushing them decreased esp by 8. Once the call has finished, the arguments are no longer needed, so esp is restored to the value it had before they were pushed. This can also be counted as part of restoring the context: it balances the stack.
Because this stack-balancing operation is performed by the caller after the call returns, it is called external stack balancing (balancing outside the function).
The opposite is internal stack balancing, where the stack is balanced inside the called function itself (for example, with ret 8).
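The two styles can be compared by following esp through each sequence. The sketch below is a simplified model with hypothetical function names. It tracks only the esp arithmetic for a call with two 4-byte arguments and assumes that ret 8 pops the return address and then discards 8 more bytes.

```c
#include <assert.h>
#include <stdint.h>

/* External balancing: the callee ends with a plain ret and the caller
   cleans up with add esp,8. */
uint32_t esp_after_caller_cleanup(uint32_t esp) {
    assert(esp >= 12u);
    esp -= 8u;        /* push 2 / push 1 */
    esp -= 4u;        /* call pushes the return address */
    esp += 4u;        /* ret pops the return address */
    esp += 8u;        /* add esp,8 in the caller */
    return esp;
}

/* Internal balancing: the callee ends with ret 8 and the caller does nothing. */
uint32_t esp_after_callee_cleanup(uint32_t esp) {
    assert(esp >= 12u);
    esp -= 8u;        /* push 2 / push 1 */
    esp -= 4u;        /* call pushes the return address */
    esp += 4u + 8u;   /* ret 8 pops the return address plus 8 bytes of arguments */
    return esp;
}
```

Both sequences leave esp where it started; they differ only in which side of the call performs the cleanup.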