Introduction to Computer Systems - Study Notes - Chapter 3: Machine-Level Representation of Programs (continuously updated)

Chapter 3: Machine-Level Representation of Programs

3.1 Historical Perspective

3.2 Program Encodings

1. Command line

(1) Compile

linux> gcc -Og -o p p1.c p2.c # compile p1.c and p2.c into the executable p

(2) ATT-format assembly

linux> gcc -Og -S mytry.c # generate the assembly file mytry.s, a text file that can be viewed directly

linux> gcc -Og -c mytry.c # generate the binary object file mytry.o, which cannot be viewed directly as text

linux> objdump -d mytry.o # disassemble the machine code in mytry.o and print the result to the terminal

linux> gcc -Og -o prog main.c mytry.c # generate the executable file prog

linux> objdump -d prog # disassemble the machine code in prog and print the result to the terminal

(3) Intel-format assembly

linux> gcc -Og -S -masm=intel mytry.c # generate the assembly file mytry.s in Intel format, which can be viewed directly

2. Stages of the gcc command

C preprocessor (expands the source code: inserts the files named by #include and expands the macros declared with #define)

Compiler (generates the assembly source file .s)

Assembler (converts the assembly code into the binary object file .o)

Linker (merges the object files with the code from library functions and generates the final executable file, e.g. p)

 

3.2.1 Machine-Level Code

1. Two abstractions

(1) Instruction set architecture (ISA)

(2) Virtual addresses

 

3.2.2 Code Examples

1. Assembly vs. disassembly

(1) Instruction length: 1-15 bytes

(2) The disassembler omits the size suffix q from the end of many instructions, but adds a q suffix to the call and ret instructions

2. Disassembling the .o file vs. disassembling prog

(1) The addresses differ

(2) In the disassembly of prog, the linker has filled in the addresses needed by instructions such as callq

(3) The disassembly of prog has nop instructions inserted at the end so that the function's code grows to 16 bytes, which gives a better placement of the next block of code in memory

 

3.2.3 Notes on Formatting

1. Lines beginning with "." are directives that guide the assembler and linker

2. ATT-format assembly vs. Intel-format assembly

(1) Intel code omits the size suffixes

(2) Intel code omits the % in front of register names

(3) Intel code describes memory locations differently, e.g. "QWORD PTR [rbx]" instead of "(%rbx)"

(4) For instructions with multiple operands, the operands are listed in the reverse order

 

3.3 Data Formats

1. Sizes of C data types in x86-64

Note:

(1) b = byte = 8 bits, w = word = 16 bits, l = long word (double word) = 32 bits, q = quad word = 64 bits

(2) Floating-point numbers: s = single precision = 32 bits, l = double precision = 64 bits

(3) The suffix l denotes both 4-byte integers and 8-byte double-precision floating-point numbers, but this causes no ambiguity, because floating-point code uses an entirely different set of instructions and registers
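
To make note (1) concrete, here is a minimal C sketch of the mapping between integer size suffixes and operand sizes. The helper names suffix_size and int_suffix are made up for these notes; they are not part of gcc or of the book.

```c
#include <stddef.h>

/* Operand size in bytes implied by an ATT integer suffix.
   Hypothetical helper for illustration; returns 0 for an unknown suffix. */
size_t suffix_size(char suffix)
{
    switch (suffix) {
    case 'b': return 1;  /* byte        */
    case 'w': return 2;  /* word        */
    case 'l': return 4;  /* double word */
    case 'q': return 8;  /* quad word   */
    default:  return 0;
    }
}

/* Inverse mapping: suffix used for an integer or pointer operand
   of the given size; returns '?' for sizes that have no suffix. */
char int_suffix(size_t size)
{
    switch (size) {
    case 1:  return 'b';
    case 2:  return 'w';
    case 4:  return 'l';
    case 8:  return 'q';
    default: return '?';
    }
}
```

For example, int_suffix(sizeof(int)) gives 'l' on x86-64, and on Linux x86-64 both sizeof(long) and sizeof(char *) are 8, which corresponds to the suffix q. The sizes of long and pointers are platform assumptions, not guarantees of the C language.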

 

3.4 Accessing Information

1. The 16 integer registers (very important)

Note:

(1) When an instruction generates a result smaller than 8 bytes, what happens to the remaining bytes of the register?

Instructions that generate 1- or 2-byte results leave the remaining bytes unchanged; instructions that generate 4-byte results set the upper 4 bytes to 0

(2) (My guess) The name of the stack pointer %rsp stands for "register stack pointer"
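
The rule in note (1) can be modeled in plain C. This is only a sketch: the 64-bit register is simulated with a uint64_t value, and the function names write_byte, write_word and write_long are invented for illustration (they stand in for writes to registers such as %al, %ax and %eax).

```c
#include <stdint.h>

/* 1-byte result (e.g. a write to %al): upper 7 bytes stay unchanged. */
uint64_t write_byte(uint64_t reg, uint8_t value)
{
    return (reg & ~UINT64_C(0xFF)) | value;
}

/* 2-byte result (e.g. a write to %ax): upper 6 bytes stay unchanged. */
uint64_t write_word(uint64_t reg, uint16_t value)
{
    return (reg & ~UINT64_C(0xFFFF)) | value;
}

/* 4-byte result (e.g. a write to %eax): upper 4 bytes are set to 0,
   so the old register contents do not matter. */
uint64_t write_long(uint64_t reg, uint32_t value)
{
    (void)reg;
    return (uint64_t)value;
}
```

Starting from 0x1122334455667788, write_byte with 0xAA gives 0x11223344556677AA, while write_long with 0xCCCCCCCC gives 0x00000000CCCCCCCC.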

 

3.4.1 Operand Specifiers

1. Operands

(1) Immediate: a constant value. In ATT assembly code it is written as "$" followed by an integer in standard C notation; the assembler automatically selects the most compact encoding of the value

(2) Register: the contents of a register; the low-order 1, 2, 4, or 8 bytes of one of the 16 registers serve as the operand

(3) Memory reference: accesses a memory location according to a computed address, called the effective address

Note:

(1) ra denotes an arbitrary register a, and R[ra] denotes its value (the set of registers is viewed as an array R indexed by register identifiers)

(2) Mb[Addr] denotes a reference to the b-byte value stored in memory starting at address Addr; the subscript b may be omitted

(3) Imm(rb, ri, s) is the most general addressing mode for memory references. It consists of an immediate offset Imm (default 0), a base register rb (contributes 0 if omitted), an index register ri (contributes 0 if omitted), and a scale factor s (s = 1, 2, 4, or 8; default 1). The effective address is Imm + R[rb] + R[ri]*s
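
The effective-address formula in note (3) can be written out as a small C function. This is a sketch with made-up names (effective_address and its parameters); register values are passed in as plain integers, and an omitted base or index register is modeled by passing 0.

```c
#include <stdint.h>

/* Effective address of Imm(rb, ri, s): Imm + R[rb] + R[ri]*s.
   base and index are the values R[rb] and R[ri]; pass 0 for an omitted
   register and 1 for an omitted scale factor. Arithmetic wraps modulo
   2^64, like address arithmetic on x86-64. */
uint64_t effective_address(int64_t imm, uint64_t base, uint64_t index,
                           unsigned scale)
{
    return (uint64_t)imm + base + index * scale;
}
```

With R[%rax] = 0x100, R[%rcx] = 0x1 and R[%rdx] = 0x3, the operand 9(%rax,%rdx) corresponds to effective_address(9, 0x100, 0x3, 1) = 0x10C, and 0xFC(,%rcx,4) corresponds to effective_address(0xFC, 0, 0x1, 4) = 0x100.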

 

 

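Preface: the book groups the many different instructions into instruction classes; the instructions in a class perform the same operation and differ only in operand size

3.4.2 Data Movement Instructions

1. The MOV class: simple data movement instructions that copy data from a source location to a destination location

Format: MOV S, D (S = source operand, D = destination operand)

Note:

(1) The size of the register operand must match the size specified by the last character of the instruction (b, w, l, q)

(2) S can be an immediate, a register, or a memory address, and D can be a register or a memory address, but S and D cannot both be memory addresses; moving data from memory to memory takes two instructions: memory -> register, then register -> memory

(3) movq vs movabsq: movq can only take as an immediate source operand a value that is representable as a 32-bit two's-complement number, which is then sign-extended to 64 bits; movabsq can take an arbitrary 64-bit immediate as its source operand, but only a register as its destination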
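
Note (3) amounts to a range check plus a sign extension, which can be sketched in C. The names fits_movq_imm and movq_imm are invented for these notes; the code only models the arithmetic and does not emit any instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if value can be used as a movq immediate, i.e. it is
   representable as a 32-bit two's-complement number. Anything
   else would need movabsq. */
bool fits_movq_imm(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}

/* What movq does with its 32-bit immediate: sign-extend it to 64 bits. */
int64_t movq_imm(int32_t imm32)
{
    return (int64_t)imm32;
}
```

For instance, -1 passes the check and sign-extends to 0xFFFFFFFFFFFFFFFF, whereas 0x0011223344556677 fails it and would have to be loaded with movabsq.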

 

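2. The MOVZ and MOVS classes: used when copying a smaller source value to a larger destination

Format:

(1) Zero extension (MOVZ), which fills the upper bits with 0: MOVZ + source size + destination size  S, R (e.g. movzbl)

(2) Sign extension (MOVS), which fills the upper bits with copies of the sign bit: MOVS + source size + destination size  S, R (e.g. movsbl)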
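
The difference between the two formats above can be mimicked with C casts. This is a rough sketch: the function names are invented, and the comments name the instruction each cast chain is meant to model, not what a particular compiler is guaranteed to emit.

```c
#include <stdint.h>

/* Models movzbl: copy 1 byte into 4 bytes, upper bits filled with 0. */
uint32_t zero_extend_b_to_l(uint8_t src)
{
    return (uint32_t)src;
}

/* Models movsbl: copy 1 byte into 4 bytes, upper bits filled with the
   sign bit. The uint8_t -> int8_t conversion is implementation-defined
   for values above 127 in C; gcc on x86-64 keeps the bit pattern. */
uint32_t sign_extend_b_to_l(uint8_t src)
{
    return (uint32_t)(int32_t)(int8_t)src;
}

/* Models movslq: copy 4 bytes into 8 bytes with sign extension. */
uint64_t sign_extend_l_to_q(uint32_t src)
{
    return (uint64_t)(int64_t)(int32_t)src;
}
```

With the byte 0xAA as source, zero extension yields 0x000000AA and sign extension yields 0xFFFFFFAA; a byte with a clear sign bit such as 0x7F gives the same result either way.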

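Note:

(1) S can be a memory address or a register; R must be a register

(2) There is no movzlq instruction, but the same effect is achieved by a movl instruction with a 32-bit register as its destination, because it sets the upper 4 bytes to 0

(3) The cltq instruction takes no operands; its effect is exactly the same as movslq %eax,%rax, but its encoding is more compact (as I understand it: it takes up less space)

(4) An interesting little exercise

Answer: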

 


Origin www.cnblogs.com/tanshiyin-20001111/p/11619024.html