A step-by-step analysis of the x86-64 assembly generated for a C program

First, create a command-line app (Command Line Tool) in Xcode and select C as the language. The code is as follows:

#include <stdio.h>

int main(int argc, const char * argv[]) {
    int a[7]={1,2,3,4,5,6,7};
    int *ptr =(int*)(&a+1);
    printf("%d\n",*(ptr));
    return 0;
}

Set a breakpoint at return 0, select Debug | Debug Workflow | Always Show Disassembly in the Xcode menu, and click Run. When the breakpoint is hit, Xcode shows the disassembly instead of the source. The assembly code is as follows:

Terminal`main:
    0x100003ec0 <+0>:   pushq  %rbp
    0x100003ec1 <+1>:   movq   %rsp, %rbp
    0x100003ec4 <+4>:   subq   $0x50, %rsp
    0x100003ec8 <+8>:   movq   0x131(%rip), %rax         ; (void *)0x00007ff84ef2f8a0: __stack_chk_guard
    0x100003ecf <+15>:  movq   (%rax), %rax
    0x100003ed2 <+18>:  movq   %rax, -0x8(%rbp)
    0x100003ed6 <+22>:  movl   $0x0, -0x34(%rbp)
    0x100003edd <+29>:  movl   %edi, -0x38(%rbp)
    0x100003ee0 <+32>:  movq   %rsi, -0x40(%rbp)
    0x100003ee4 <+36>:  movq   0xa5(%rip), %rax
    0x100003eeb <+43>:  movq   %rax, -0x30(%rbp)
    0x100003eef <+47>:  movq   0xa2(%rip), %rax
    0x100003ef6 <+54>:  movq   %rax, -0x28(%rbp)
    0x100003efa <+58>:  movq   0x9f(%rip), %rax
    0x100003f01 <+65>:  movq   %rax, -0x20(%rbp)
    0x100003f05 <+69>:  movl   0x9d(%rip), %eax
    0x100003f0b <+75>:  movl   %eax, -0x18(%rbp)
    0x100003f0e <+78>:  leaq   -0x30(%rbp), %rax
    0x100003f12 <+82>:  addq   $0x1c, %rax
    0x100003f16 <+86>:  movq   %rax, -0x48(%rbp)
    0x100003f1a <+90>:  movq   -0x48(%rbp), %rax
    0x100003f1e <+94>:  movl   (%rax), %esi
    0x100003f20 <+96>:  leaq   0x85(%rip), %rdi          ; "%d\n"
    0x100003f27 <+103>: movb   $0x0, %al
    0x100003f29 <+105>: callq  0x100003f5a               ; symbol stub for: printf
    0x100003f2e <+110>: movq   0xcb(%rip), %rax          ; (void *)0x00007ff84ef2f8a0: __stack_chk_guard
    0x100003f35 <+117>: movq   (%rax), %rax
    0x100003f38 <+120>: movq   -0x8(%rbp), %rcx
    0x100003f3c <+124>: cmpq   %rcx, %rax
    0x100003f3f <+127>: jne    0x100003f4d               ; <+141> at main.c
->  0x100003f45 <+133>: xorl   %eax, %eax
    0x100003f47 <+135>: addq   $0x50, %rsp
    0x100003f4b <+139>: popq   %rbp
    0x100003f4c <+140>: retq   
    0x100003f4d <+141>: callq  0x100003f54               ; symbol stub for: __stack_chk_fail
    0x100003f52 <+146>: ud2    

First, here are the registers used below:

rip: instruction pointer (program counter), holding the address of the next instruction
rsp: stack pointer register, pointing to the top of the stack
rbp: stack base address register, pointing to the base of the current stack frame
edi/rdi: first function parameter
rsi/esi: second function parameter
eax/rax: accumulator, also used for the function return value

1、pushq  %rbp

Push the current value of rbp (the caller's frame base) onto the stack. rsp decreases by 8 and still points to the top of the stack.

2、movq   %rsp, %rbp

Copy the value of rsp into rbp, so that rbp now marks the base of main's stack frame.

3、subq   $0x50, %rsp

Move the top of the stack down by 0x50 = 5*16 = 80 bytes, which reserves 80 bytes for the local data used below. On x86-64 the stack is kept 16-byte aligned, so the size of the allocation is a multiple of 0x10.
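The 0x50 can be reproduced with a little arithmetic. The lowest slot the listing uses is rbp-0x48, and rounding 0x48 (72) up to a multiple of 16 gives 0x50 (80). The sketch below shows that rounding; the helper name align_up is made up for illustration, and the idea that clang arrives at 0x50 exactly this way is an inference from the listing, not something the listing states.

```c
#include <stdint.h>

/* Round size up to the next multiple of alignment.
 * Assumes alignment is a power of two (16 for the x86-64 stack). */
uint64_t align_up(uint64_t size, uint64_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}
```

Calling align_up(0x48, 16) yields 0x50, matching the subq operand.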

4、movq   0x131(%rip), %rax         ; (void *)0x00007ff84ef2f8a0: __stack_chk_guard

0x131(%rip) means: add 0x131 to the address of the next instruction (0x100003ecf) to get the target address (0x100004000), then load the 8 bytes stored there into the rax register.
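The same calculation applies to every rip-relative operand in the listing. A minimal sketch, assuming a hypothetical helper rip_relative_target that takes the address and length of the instruction plus the displacement:

```c
#include <stdint.h>

/* Effective address of disp(%rip): rip already points at the NEXT
 * instruction when the operand is evaluated, so the instruction's own
 * length is added before the displacement. */
uint64_t rip_relative_target(uint64_t instr_addr, uint64_t instr_len,
                             int32_t disp) {
    uint64_t next_instr = instr_addr + instr_len;
    return next_instr + (uint64_t)(int64_t)disp;
}
```

With the values from the listing, the instruction at 0x100003ec8 is 7 bytes long, so rip_relative_target(0x100003ec8, 7, 0x131) gives 0x100004000. The same helper gives 0x100003f90 for the load at +36 and 0x100003fac for the leaq at +96.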

Select Debug | Debug Workflow | View Memory, enter '0x100004000' in Address and press Enter. The content here is 0x00007ff84ef2f8a0. Note that x86-64 is little-endian, so the memory view shows the least significant byte first:
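To read such a value out of a byte-by-byte memory view, the bytes have to be assembled from least significant to most significant. A small sketch, with read_le64 as an illustrative helper name:

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble 8 bytes, listed in increasing address order, into the
 * 64-bit value a little-endian CPU would load from them. */
uint64_t read_le64(const unsigned char *bytes) {
    uint64_t value = 0;
    for (size_t i = 0; i < 8; i++) {
        value |= (uint64_t)bytes[i] << (8 * i);
    }
    return value;
}
```

For the pointer above, the memory view would list the bytes a0 f8 f2 4e f8 7f 00 00, and read_le64 turns them back into 0x00007ff84ef2f8a0.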

 5、movq   (%rax), %rax

Load the 8 bytes at the address held in rax into rax itself. Viewing memory as in the previous step, the value here is 0x55d1d55afee700d6. This is the current value of __stack_chk_guard, the stack canary:

 6、movq   %rax, -0x8(%rbp)

Store the value in rax, the 8 bytes read in the previous step, at rbp-0x8. We first print the values of rbp and rsp, and then jump to rsp to view the memory:

 

The red box marks where the value was stored. The 8 bytes to its right are the location rbp points to.

7、movl   $0x0, -0x34(%rbp)

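Store a 4-byte 0 at rbp-0x34, which is address 0x7ff7bfeff3ac. This slot is separate from argc, which occupies its own 4 bytes at rbp-0x38 and is written by the next instruction. In unoptimized builds clang typically reserves a stack slot for main's return value and initializes it to 0, which is most likely what this store is: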

 8、movl %edi, -0x38(%rbp)

This instruction saves the value of the edi register at rbp-0x38, which is 0x7ff7bfeff3a8 in the picture above; the value is 1. As noted earlier, edi carries the first function parameter, here int argc. In this run argc is 1, so edi is 1.

9、movq   %rsi, -0x40(%rbp)

This instruction saves the value of the rsi register at rbp-0x40, which is 0x7ff7bfeff3a0 in the picture above; the value is 0x7ff7bfeff518. This is the parameter argv. Since argv is const char **, it is itself an address, so we go to that address to view its content. The value found there, 0x7ff7bfeff708, would be argv[0], a pointer to the first argument string.

10、movq   0xa5(%rip), %rax

Load the 8 bytes located at 0x100003eeb+0xa5=0x100003f90 into the rax register. These 8 bytes hold the first two array initializers, 1 and 2:

 11、movq   %rax, -0x30(%rbp)

 Store the value of the rax register at the location of rbp-0x30:

Steps 12-17 repeat the two steps above and store the values 3, 4, 5, 6, and 7. Note that 7 is stored on its own: movq moves 8 bytes, which is two ints at a time, while movl moves 4 bytes, which is the single remaining int.

    0x100003eef <+47>:  movq   0xa2(%rip), %rax
    0x100003ef6 <+54>:  movq   %rax, -0x28(%rbp)
    0x100003efa <+58>:  movq   0x9f(%rip), %rax
    0x100003f01 <+65>:  movq   %rax, -0x20(%rbp)
    0x100003f05 <+69>:  movl   0x9d(%rip), %eax
    0x100003f0b <+75>:  movl   %eax, -0x18(%rbp)
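The copy of the 28-byte initializer in steps 10-17 can be mimicked in C. The sketch below is only a model of what the listing does, with made-up helper names: pack_two_ints shows how two adjacent ints look as one 8-byte little-endian value, and copy_seven_ints performs the same three 8-byte moves followed by one 4-byte move.

```c
#include <stdint.h>
#include <string.h>

/* Two adjacent 32-bit ints viewed as one little-endian 64-bit value:
 * the int at the lower address ends up in the low 32 bits. */
uint64_t pack_two_ints(int32_t first, int32_t second) {
    return ((uint64_t)(uint32_t)second << 32) | (uint32_t)first;
}

/* Copy 7 ints (28 bytes) the way the listing does:
 * three 8-byte moves (movq) and one 4-byte move (movl). */
void copy_seven_ints(int32_t dst[7], const int32_t src[7]) {
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    memcpy(d + 0,  s + 0,  8);   /* elements 0 and 1 */
    memcpy(d + 8,  s + 8,  8);   /* elements 2 and 3 */
    memcpy(d + 16, s + 16, 8);   /* elements 4 and 5 */
    memcpy(d + 24, s + 24, 4);   /* element 6 alone  */
}
```

For example, pack_two_ints(1, 2) is 0x0000000200000001, which is the 8-byte value that step 10 would load for the first two elements.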

18、leaq   -0x30(%rbp), %rax

Compute rbp-0x30=0x7ff7bfeff3b0, the address of the array a, and store it in the rax register. leaq only computes the address; it does not read memory.

19、addq   $0x1c, %rax

Add 0x1c to the value 0x7ff7bfeff3b0 in the rax register and store the result (0x7ff7bfeff3cc) back into rax. The corresponding code is `int *ptr =(int*)(&a+1);`. 0x1c is 28, the size of the whole array in bytes: &a has type int (*)[7], so adding 1 to it advances by one complete int[7]. This shows that pointer addition is resolved at compile time: the integer being added is multiplied by the size of the type the pointer points to, and the product appears in the instruction as a constant.
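The difference between &a + 1 and a + 1 is easy to check from C itself. A minimal sketch with illustrative function names follows. Note that dereferencing ptr directly, as the program above does, reads one past the end of the array, which C leaves undefined; the last function therefore reads ptr - 1 instead.

```c
#include <stddef.h>

/* Bytes skipped by &a + 1: &a is int (*)[7], so one step is a whole array. */
size_t step_of_array_pointer(void) {
    int a[7] = {1, 2, 3, 4, 5, 6, 7};
    return (size_t)((char *)(&a + 1) - (char *)&a);
}

/* Bytes skipped by a + 1: a decays to int *, so one step is one int. */
size_t step_of_element_pointer(void) {
    int a[7] = {1, 2, 3, 4, 5, 6, 7};
    return (size_t)((char *)(a + 1) - (char *)a);
}

/* ptr points one past the end of a, so ptr - 1 is the last element. */
int last_element_via_array_pointer(void) {
    int a[7] = {1, 2, 3, 4, 5, 6, 7};
    int *ptr = (int *)(&a + 1);
    return *(ptr - 1);
}
```

With 4-byte ints, step_of_array_pointer() returns 28 (0x1c, the constant in the addq), step_of_element_pointer() returns 4, and last_element_via_array_pointer() returns 7.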

20、movq   %rax, -0x48(%rbp)

Next, store the rax value 0x7ff7bfeff3cc at rbp-0x48, which is the local variable ptr:

21、movq   -0x48(%rbp), %rax

Then load the value just saved back into the rax register. Storing a variable and immediately reloading it is typical of unoptimized builds.

22、movl (%rax), %esi

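Load the 4-byte value at the address held in rax into the esi register, where it serves as the second parameter of the printf call below. In this run the value is 1. Because ptr points one past the end of a, this read is undefined behavior in C, and the result is simply whatever happens to be in that stack location.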

23、leaq   0x85(%rip), %rdi          ; "%d\n"

The next instruction is at 0x100003f27, so rip=0x100003f27; adding 0x85 gives 0x100003fac, which is placed in the rdi register as the first parameter of the printf call. The bytes at 0x100003fac are the string "%d\n".

24、movb   $0x0, %al

Store the immediate value 0 into the al register. How do rax, eax, ax, and al (ah) relate to each other?
They are all views of the same register: eax is the low 32 bits of rax, ax is the low 16 bits of eax, al is the low 8 bits of ax, and ah is the high 8 bits of ax.
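The register views can be modelled with shifts and masks on a 64-bit value. The helper names below are made up for illustration; write_al models what movb $0x0, %al does to the full register.

```c
#include <stdint.h>

uint32_t reg_eax(uint64_t rax) { return (uint32_t)rax; }        /* low 32 bits */
uint16_t reg_ax(uint64_t rax)  { return (uint16_t)rax; }        /* low 16 bits */
uint8_t  reg_al(uint64_t rax)  { return (uint8_t)rax; }         /* bits 0-7    */
uint8_t  reg_ah(uint64_t rax)  { return (uint8_t)(rax >> 8); }  /* bits 8-15   */

/* An 8-bit write such as movb $imm, %al replaces only bits 0-7
 * and leaves the remaining 56 bits of rax unchanged. */
uint64_t write_al(uint64_t rax, uint8_t value) {
    return (rax & ~(uint64_t)0xff) | value;
}
```

For rax = 0x1122334455667788, the views are eax = 0x55667788, ax = 0x7788, al = 0x88 and ah = 0x77, and write_al(rax, 0) gives 0x1122334455667700.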

For functions with a variable number of parameters, such as printf, the x86-64 calling convention uses %al to tell the callee how many vector registers carry arguments. This call passes only an integer and no floating-point values, so no vector registers are used and al is set to 0.

References: Why are the %al register and stack modified before calling printf x86 assembly from C "Hello World" program compiled by gcc; Parameter passing in assembly function call

25、callq  0x100003f5a               ; symbol stub for: printf

Call the printf function. callq does two things: it pushes the address of the instruction following the call (0x100003f2e) onto the stack as the return address, and then it jumps to the target:

26、movq   0xcb(%rip), %rax          ; (void *)0x00007ff84ef2f8a0: __stack_chk_guard

27、movq   (%rax), %rax

Steps 26 and 27 are similar to steps 4 and 5 and will not be repeated here.

28、movq -0x8(%rbp), %rcx

Load the 8 bytes at rbp-0x8, the copy of the guard value saved in step 6, into the rcx register.

29、cmpq %rcx, %rax

Compare the values of the rcx and rax registers and record the result in the flags register. cmpq computes rax - rcx without storing the difference; if the two are equal, the zero flag is set.

30、jne    0x100003f4d               ; <+141> at main.c

If the comparison in step 29 found the values unequal, execution jumps to 0x100003f4d, which is step 35. If they are equal, execution continues with step 31. The purpose is to use __stack_chk_guard to detect whether a stack buffer overflow has overwritten the 8 bytes at rbp-0x8, which sit between the local variables and the saved rbp and return address. See also: the understanding of __stack_chk_guard_ptr
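The idea behind the guard can be shown with a toy model. The struct and function below are hypothetical and are not what the compiler emits: they only imitate a frame in which a guard copy sits directly above a local buffer, then compare the copy with the original after a write of a given length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy frame: the guard copy lives right after the local buffer,
 * the way rbp-0x8 sits above the locals in the listing. */
struct toy_frame {
    unsigned char buffer[8];
    uint64_t saved_guard;
};

/* Returns 1 if the guard copy is intact after writing bytes_written
 * bytes starting at the buffer, 0 if the write ran into the guard. */
int frame_survives_write(uint64_t guard, size_t bytes_written) {
    struct toy_frame frame;
    frame.saved_guard = guard;               /* like steps 4-6 */
    if (bytes_written > sizeof frame) {
        bytes_written = sizeof frame;        /* stay inside the toy frame */
    }
    memset((unsigned char *)&frame, 0x41, bytes_written);
    return frame.saved_guard == guard;       /* like steps 26-30 */
}
```

Writing 8 bytes fills the buffer exactly and the check passes; writing 9 bytes changes the first byte of the guard copy and the check fails, which is the situation in which the real code calls __stack_chk_fail.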

31、xorl   %eax, %eax

Zero the eax register, which holds the return value of the main function (return 0). XORing a register with itself is the usual way to set it to 0.

32、addq   $0x50, %rsp

This instruction is the exact counterpart of the earlier subq $0x50, %rsp. Adding the size of the allocated stack space back to the stack pointer releases the space that was reserved.

33、popq   %rbp

This instruction pops the value at the top of the stack into the rbp register, restoring the caller's rbp that was pushed in step 1.

34、retq

This instruction returns from the main function: it pops the return address from the top of the stack into rip. That address was pushed by the call instruction in the caller of main, so the push is not visible in this listing.
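How callq and retq cooperate through the stack can be sketched with a tiny simulated CPU. Everything here (toy_cpu and its functions) is a made-up model for illustration; it tracks only rip and a small stack, and it does no bounds checking.

```c
#include <stdint.h>

#define TOY_STACK_SLOTS 16

struct toy_cpu {
    uint64_t rip;                      /* address of the next instruction  */
    int sp;                            /* index of the top slot; the stack */
    uint64_t stack[TOY_STACK_SLOTS];   /* grows toward lower indexes       */
};

void toy_init(struct toy_cpu *cpu, uint64_t rip) {
    cpu->rip = rip;
    cpu->sp = TOY_STACK_SLOTS;         /* empty stack */
}

void toy_push(struct toy_cpu *cpu, uint64_t value) {
    cpu->stack[--cpu->sp] = value;
}

uint64_t toy_pop(struct toy_cpu *cpu) {
    return cpu->stack[cpu->sp++];
}

/* callq: push the address of the following instruction, then jump. */
void toy_call(struct toy_cpu *cpu, uint64_t return_addr, uint64_t target) {
    toy_push(cpu, return_addr);
    cpu->rip = target;
}

/* retq: pop the saved return address into rip. */
void toy_ret(struct toy_cpu *cpu) {
    cpu->rip = toy_pop(cpu);
}
```

Using the addresses from step 25, toy_call(&cpu, 0x100003f2e, 0x100003f5a) leaves rip at the printf stub with 0x100003f2e on the stack, and a later toy_ret(&cpu) brings rip back to 0x100003f2e with the stack at its earlier depth.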

35、callq  0x100003f54               ; symbol stub for: __stack_chk_fail

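Call the __stack_chk_fail function. This path is reached only when the guard comparison in step 29 fails, and the function terminates the program instead of returning.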

36、ud2

UD2 is a two-byte instruction with the encoding 0F 0B. Executing it raises an invalid-opcode exception, so it is used to stop a program deliberately, for example while debugging or to mark code that must never be reached. Here it follows the call to __stack_chk_fail, which is not expected to return, so it would execute only if that call somehow came back. For details, please view: ud2 assembly instructions

final stack layout
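The offsets used throughout the walkthrough can be collected into one table. The sketch below lists each slot with its displacement from rbp as written in the listing; the struct, the table name and the helper functions are made up for illustration, and rbp = 0x7ff7bfeff3e0 is derived from the statement in step 18 that rbp-0x30 is 0x7ff7bfeff3b0.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

struct frame_slot {
    int64_t offset;      /* displacement from rbp, as in the listing */
    unsigned size;       /* bytes occupied                           */
    const char *what;
};

static const struct frame_slot MAIN_FRAME[] = {
    { -0x08,  8, "copy of __stack_chk_guard value (step 6)" },
    { -0x30, 28, "int a[7] (steps 10-17)" },
    { -0x34,  4, "4-byte zero (step 7)" },
    { -0x38,  4, "argc (step 8)" },
    { -0x40,  8, "argv (step 9)" },
    { -0x48,  8, "ptr (step 20)" },
};

/* Absolute address of a slot for a given rbp. */
uint64_t slot_address(uint64_t rbp, int64_t offset) {
    return rbp + (uint64_t)offset;
}

/* Print one line per slot: displacement, address, size, meaning. */
void print_frame(uint64_t rbp) {
    size_t count = sizeof MAIN_FRAME / sizeof MAIN_FRAME[0];
    for (size_t i = 0; i < count; i++) {
        const struct frame_slot *slot = &MAIN_FRAME[i];
        printf("rbp-0x%02" PRIx64 "  0x%" PRIx64 "  %2u bytes  %s\n",
               (uint64_t)(-slot->offset),
               slot_address(rbp, slot->offset),
               slot->size, slot->what);
    }
}
```

Calling print_frame(0x7ff7bfeff3e0) lists the slots from the guard copy near rbp down to ptr at rbp-0x48, which is the lowest slot inside the 0x50 bytes reserved in step 3.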


Origin blog.csdn.net/Mamong/article/details/132126024