GNU x86-64 assembly

Source file test.c:

#include <stdio.h>

int main()
{
	printf("hello world!");
	return 0;
}

Running gcc -S test.c generates test.s:

	.file	"test.c"
	.text
	.def	__main;	.scl	2;	.type	32;	.endef
	.section .rdata,"dr"
.LC0:
	.ascii "hello world!\0"
	.text
	.globl	main
	.def	main;	.scl	2;	.type	32;	.endef
	.seh_proc	main
main:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	subq	$32, %rsp
	.seh_stackalloc	32
	.seh_endprologue
	call	__main
	leaq	.LC0(%rip), %rcx
	call	printf
	movl	$0, %eax
	addq	$32, %rsp
	popq	%rbp
	ret
	.seh_endproc
	.ident	"GCC: (GNU) 7.3.0"
	.def	printf;	.scl	2;	.type	32;	.endef

Note that the assembly output consists of three distinct kinds of elements:

  • Directives begin with a dot and give structural information to the assembler, linker, and debugger. They are not assembly instructions themselves. For example, .file records the name of the original source file, .data marks the start of the data section, and .text marks the start of the actual program code. .string denotes a string constant in the data section. (The listing above comes from a GCC build targeting Windows, so the string is emitted with .ascii in the read-only .rdata section instead.) .globl main indicates that the label main is a global symbol that can be accessed by other code modules. The remaining directives can be ignored for now.
  • Labels end with a colon and associate a name with the location where the label appears. For example, the label .LC0: gives a name to the string that follows it, and the label main: marks pushq %rbp as the first instruction of the main function. By convention, labels that begin with a dot are temporary local labels generated by the compiler, while the other labels are user-visible function and global variable names.
  • Instructions are the actual assembly code (for example pushq %rbp). They are usually indented to set them apart from directives and labels.


Registers and data types

x86-64 has about sixteen 64-bit general-purpose integer registers:

%rax  %rbx  %rcx  %rdx  %rsi  %rdi  %rbp  %rsp
%r8   %r9   %r10  %r11  %r12  %r13  %r14  %r15

We say "about sixteen general-purpose" because in early versions of the processor each register had its own special purpose, and not every instruction could be applied to every register. As the design evolved, new instructions and addressing modes were added, so the registers became nearly equivalent. A few leftover instructions, particularly those for string handling, still require the use of %rsi and %rdi. In addition, two registers are reserved as the stack pointer (%rsp) and the base pointer (%rbp). The final eight registers are simply numbered and have no special restrictions.

Over the years the architecture has been extended from 8 to 16, 32, and finally 64 bits, so each register has some internal structure:

The lowest 8 bits of %rax form the 8-bit register %al, and the next 8 bits form %ah. The lowest 16 bits are %ax, the lowest 32 bits are %eax, and the full 64 bits are %rax.

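Registers %r8 through %r15 have the same structure but a slightly different naming scheme. Taking %r8 as an example, the lowest 8 bits are %r8b, the lowest 16 bits are %r8w, the lowest 32 bits are %r8d, and the full 64 bits are %r8. These registers have no counterpart to %ah.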

For simplicity, we will focus on the 64-bit registers. However, most compilers emit a mix of 32-bit and 64-bit operations. The 32-bit registers are used for integer arithmetic, because most programs do not need integer values larger than 2^32. The 64-bit registers are generally used to hold memory addresses (pointers), which allows up to 16 EB of virtual memory to be addressed.

Addressing Modes

The first instruction you should get to know is MOV, which moves data between registers and memory. x86-64 is a complex instruction set (CISC) architecture, so MOV comes in many variants that move different types of data between different kinds of storage.
MOV, like most other instructions, takes a single-letter suffix that determines how much data is moved:

Suffix  Name      Size
B       BYTE      1 byte  (8 bits)
W       WORD      2 bytes (16 bits)
L       LONG      4 bytes (32 bits)
Q       QUADWORD  8 bytes (64 bits)

Different kinds of data call for different addressing modes. A global value (a global variable or a function) is referenced simply by its name, for example x or printf.
An immediate constant is written with a dollar sign, for example $56. A register value is referenced by the register name, for example %rbx.
Indirect addressing refers to the value in memory at the address held in a register: for example, (%rsp) refers to the value in the memory location that %rsp points to. Base-relative addressing adds a constant to the register value: for example, -16(%rcx) refers to the value at the memory location 16 bytes below the address held in %rcx. This mode is important for managing the stack, local variables, and function parameters. There are also several more complex variants of base-relative addressing: for example, -16(%rbx,%rcx,8) refers to the value at the address -16 + %rbx + %rcx * 8. This mode is useful for accessing the elements of an array whose elements have a particular size.
The following examples use each addressing mode to load a 64-bit value into %rax:

Mode            Example
Global symbol   MOVQ x, %rax
Immediate       MOVQ $56, %rax
Register        MOVQ %rbx, %rax
Indirect        MOVQ (%rsp), %rax
Base-relative   MOVQ -16(%rcx), %rax
Complex         MOVQ -16(%rbx,%rcx,8), %rax

In most cases, the same addressing modes can be used to store data into registers and memory. However, not every combination is supported. For example, it is not possible to use base-relative addressing for both operands of a MOV: MOVQ -8(%rbx), -8(%rbx). To find out exactly which combinations of addressing modes are supported, you need to consult the manual for the instruction in question.


Origin blog.csdn.net/jadeshu/article/details/103603975