Analyzing the assembly instructions behind a simple piece of C code
I have recently been reading the HotSpot source code. The HotSpot bytecode interpreter translates bytecode into assembly instructions, so I am reviewing the basics here first.
C code
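#include <stdio.h>

/* Note: add1 is called before it is declared. GCC 4.8 accepts this as an
   implicit declaration; newer compilers may reject it.
   The listing below was generated from the code exactly as written. */
int main(int args, char** argv){
    printf("%d", add1(100, 200, 500, 600));
}

int add1(int i, int j, int k, int m){
    return i + j + k + m;
}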
Compile with gcc and verify the result:
gcc -g2 FunctionInvokedAssembly.c -o FunctionInvokedAssembly
./FunctionInvokedAssembly
#1400
Compile the C file to assembly code with gcc:
gcc -S -o FunctionInvokedAssembly.s FunctionInvokedAssembly.c
The generated assembly code is as follows:
.file "FunctionInvokedAssembly.c"
.section .rodata
.LC0:
.string "%d"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movl $600, %ecx
movl $500, %edx
movl $200, %esi
movl $100, %edi
movl $0, %eax
call add1
movl %eax, %esi
movl $.LC0, %edi
movl $0, %eax
call printf
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.globl add1
.type add1, @function
add1:
.LFB1:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movl %esi, -8(%rbp)
movl %edx, -12(%rbp)
movl %ecx, -16(%rbp)
movl -8(%rbp), %eax
movl -4(%rbp), %edx
addl %eax, %edx
movl -12(%rbp), %eax
addl %eax, %edx
movl -16(%rbp), %eax
addl %edx, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size add1, .-add1
.ident "GCC: (Ubuntu 4.8.5-4ubuntu8) 4.8.5"
.section .note.GNU-stack,"",@progbits
Some of the registers that appear in the compiled instructions:
- eax, ebx, ecx, edx, esi, edi, ebp (rbp) and esp (rsp) are the names of the general-purpose registers of an x86 CPU.
- rbp: the frame (base) pointer. It holds the bottom address of the stack frame of the function that is currently executing (the calling function, at the point of a call).
- rsp: the stack pointer. It holds the top of the stack; at the moment of a call, this is where the called function's stack frame begins.
- eip (rip on x86-64): the instruction pointer. It holds the memory address of the next instruction; when the CPU finishes the current instruction, it reads the next instruction from that address and continues.
- Decreasing the value of esp (rsp) extends the stack frame, because the stack grows toward lower addresses.
- On x86-64 the registers are 64 bits wide, and their names have changed relative to 32-bit x86, for example %ebp became %rbp. For backward compatibility %ebp can still be used, but it refers to the low 32 bits of %rbp.
- x86-64 has 16 64-bit general-purpose registers: %rax, %rbx, %rcx, %rdx, %rsi, %rdi, %rbp, %rsp, %r8, %r9, %r10, %r11, %r12, %r13, %r14, %r15. %rax holds the function return value. %rsp is the stack pointer and points to the top of the stack. %rdi, %rsi, %rdx, %rcx, %r8, %r9 carry the function arguments, in that order: the first argument, the second argument, and so on. %rbx, %rbp, %r12, %r13, %r14, %r15 are callee-saved data registers: a called function that uses one of them must save the original value first and restore it before returning. %r10, %r11 are caller-saved data registers: a called function may overwrite them freely, so the caller must back them up before calling a subroutine if it still needs their values.
A call instruction performs two tasks:
- It pushes the address of the next instruction of the calling function (main) onto the stack, so that execution continues from that instruction when the called function returns. On a 64-bit machine this decreases rsp by 8.
- It changes the value of the instruction pointer register rip so that it points to the first instruction of the called function.
Register diagram
63              31             0
+------------------------------+
|%rax           |%eax          | return value
+------------------------------+
|%rbx           |%ebx          | callee-saved
+------------------------------+
|%rcx           |%ecx          | fourth argument
+------------------------------+
|%rdx           |%edx          | third argument
+------------------------------+
|%rsi           |%esi          | second argument
+------------------------------+
|%rdi           |%edi          | first argument
+------------------------------+
|%rbp           |%ebp          | callee-saved
+------------------------------+
|%rsp           |%esp          | stack pointer
+------------------------------+
|%r8            |%r8d          | fifth argument
+------------------------------+
|%r9            |%r9d          | sixth argument
+------------------------------+
|%r10           |%r10d         | caller-saved
+------------------------------+
|%r11           |%r11d         | caller-saved
+------------------------------+
|%r12           |%r12d         | callee-saved
+------------------------------+
|%r13           |%r13d         | callee-saved
+------------------------------+
|%r14           |%r14d         | callee-saved
+------------------------------+
|%r15           |%r15d         | callee-saved
+------------------------------+
Stack frame
+-------------------+
| |
| |
| other frames |
| |
| |
+-------------------+
| |
| |
| last frame |
| |
| |
+-------------------+
| argument 1 |
+-------------------+
| argument 2 |
+-------------------+
| return address |
+-------------------+
%ebp-> | last frame %ebp |
+-------------------+
| |
| |
| current frame |
| |
| |
+-------------------+
%esp-> | |
+-------------------+
The entry function is main, which then calls each subroutine. In the machine code that GCC generates, every procedure call gets its own stack frame; put simply, each active procedure corresponds to one stack frame. The diagram above shows the typical x86-32 stack frame structure: %ebp points to the start (base) of the current frame, and %esp points to the top of the stack.
Disassemble the code while debugging with gdb
gdb FunctionInvokedAssembly
> b main
> r
> disassemble /rm
Breakpoint 1, main (args=1, argv=0x7fffffffdf48) at FunctionInvokedAssembly.c:11
11 printf("%d", add1(100, 200, 500, 600));
(gdb) disassemble /rm
Dump of assembler code for function main:
9 int main(int args, char** argv){
0x00000000004004fd <+0>: 55 push %rbp
0x00000000004004fe <+1>: 48 89 e5 mov %rsp,%rbp
0x0000000000400501 <+4>: 48 83 ec 10 sub $0x10,%rsp
0x0000000000400505 <+8>: 89 7d fc mov %edi,-0x4(%rbp)
0x0000000000400508 <+11>: 48 89 75 f0 mov %rsi,-0x10(%rbp)
10 // printf("%d", add1(100, 200, 500, 600, 700, 800, 900, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13));
11 printf("%d", add1(100, 200, 500, 600));
=> 0x000000000040050c <+15>: b9 58 02 00 00 mov $0x258,%ecx
0x0000000000400511 <+20>: ba f4 01 00 00 mov $0x1f4,%edx
0x0000000000400516 <+25>: be c8 00 00 00 mov $0xc8,%esi
0x000000000040051b <+30>: bf 64 00 00 00 mov $0x64,%edi
0x0000000000400520 <+35>: b8 00 00 00 00 mov $0x0,%eax
0x0000000000400525 <+40>: e8 13 00 00 00 callq 0x40053d <add1>
0x000000000040052a <+45>: 89 c6 mov %eax,%esi
0x000000000040052c <+47>: bf f4 05 40 00 mov $0x4005f4,%edi
0x0000000000400531 <+52>: b8 00 00 00 00 mov $0x0,%eax
0x0000000000400536 <+57>: e8 b5 fe ff ff callq 0x4003f0 <printf@plt>
12 }
0x000000000040053b <+62>: c9 leaveq
0x000000000040053c <+63>: c3 retq
End of assembler dump.
> info register
rax 0x4004fd 4195581
rbx 0x0 0
rcx 0x400570 4195696
rdx 0x7fffffffdf58 140737488346968
rsi 0x7fffffffdf48 140737488346952
rdi 0x1 1
rbp 0x7fffffffde60 0x7fffffffde60
rsp 0x7fffffffde50 0x7fffffffde50
r8 0x7ffff7dd0d80 140737351847296
r9 0x7ffff7dd0d80 140737351847296
r10 0x0 0
r11 0x0 0
r12 0x400400 4195328
r13 0x7fffffffdf40 140737488346944
r14 0x0 0
r15 0x0 0
rip 0x40050c 0x40050c <main+15>
eflags 0x206 [ PF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0