Assembly final review
Chapter 1 Basic Knowledge of Assembly Language
Machine instruction: an instruction that the CPU can directly recognize and execute. It is expressed in binary code (only 0s and 1s) and consists of an opcode and operands
Machine language: the set of machine instructions in binary code, together with the rules for using them
Assembly language: uses mnemonics (abbreviations of English words) for the opcodes of machine instructions, and uses symbols, variables, and constants to describe the operands
Assembly language is a symbolic language. An assembly language source program must be translated into a machine language program before the computer can execute it. This translation process is called "assembly", and the language processing program that translates an assembly source program into an object program is called an assembler
Key point one: the composition of assembly language
Assembly language consists of the following three types of elements
- Assembly instructions: mnemonics for machine codes; each has a corresponding machine code. They are the core of assembly language
- Pseudo-instructions (directives): have no corresponding machine code; they are carried out by the assembler, and the computer does not execute them
- Other symbols: such as +, -, *, /; they are recognized by the assembler and have no corresponding machine code
When numbers are written in different positional number systems, a letter suffix usually indicates the base: B (binary), O (octal), D (decimal), or H (hexadecimal). A number with no suffix is decimal by default.
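The suffix rule can be sketched in Python. This is only an illustration: the function name `parse_number` is hypothetical, and a real assembler also applies rules not modelled here (for example, requiring a leading 0 on hexadecimal numbers that begin with a letter).

```python
def parse_number(text):
    """Convert a literal with an optional B/O/D/H suffix to an integer."""
    bases = {"B": 2, "O": 8, "D": 10, "H": 16}
    text = text.strip().upper()
    suffix = text[-1]
    if suffix in bases:
        # The suffix selects the base; the rest of the text holds the digits.
        return int(text[:-1], bases[suffix])
    return int(text, 10)  # no suffix: decimal by default


print(parse_number("1010B"))  # 10
print(parse_number("0FFH"))   # 255
```

Note that "25D" is decimal 25 here, because a hexadecimal number must end in H.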
Decimal to binary conversion:
For the fractional part of a decimal number, besides the descending-powers method, the multiplication method can be used: repeatedly multiply the fractional part by 2, record the integer part of each product as the next binary digit, and continue with the remaining fractional part until it becomes 0.
- Not every decimal fraction can be represented exactly in binary; stop at whatever precision is required.
Decimal to hexadecimal conversion:
For the fractional part of a decimal number, besides the descending-powers method, the multiplication method can be used: repeatedly multiply the fractional part by 16, record the integer part of each product as the next hexadecimal digit, and continue with the remaining fractional part until it becomes 0.
- Not every decimal fraction can be represented exactly in hexadecimal; stop at whatever precision is required.
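The multiplication method for fractions can be sketched as follows. The function name `fraction_to_base` and the `max_digits` cut-off are assumptions made for this illustration; `Fraction` is used so that the arithmetic is exact.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"


def fraction_to_base(frac, base, max_digits=8):
    """Convert a decimal fraction (0 <= frac < 1) by repeated multiplication."""
    frac = Fraction(frac)
    digits = []
    while frac != 0 and len(digits) < max_digits:
        frac *= base
        whole = int(frac)             # the integer part is the next digit
        digits.append(DIGITS[whole])
        frac -= whole                 # continue with the fractional part
    return "0." + "".join(digits)


print(fraction_to_base("0.625", 2))   # 0.101
print(fraction_to_base("0.1", 2))     # 0.00011001 (truncated, never reaches 0)
print(fraction_to_base("0.1", 16, 4)) # 0.1999
```

The second and third calls show a fraction that has no exact representation, so the loop stops at the chosen precision.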
Example of multiplication in hexadecimal:
The two's complement representation of a number is formed as follows:
- A positive number is unchanged; for a negative number, invert every bit of its absolute value and then add 1
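A minimal sketch of the rule, assuming an 8-bit word by default (the function name `twos_complement` and the `bits` parameter are illustrative):

```python
def twos_complement(value, bits=8):
    """Return the bit pattern that represents value in two's complement."""
    mask = (1 << bits) - 1
    if value >= 0:
        return value & mask           # positive numbers are unchanged
    inverted = ~abs(value) & mask     # invert every bit of the absolute value
    return (inverted + 1) & mask      # then add 1


print(format(twos_complement(-5), "08b"))  # 11111011
print(hex(twos_complement(-1, 16)))        # 0xffff
```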
Character representation:
Basic logical operations (performed bitwise):
- AND operation: AND
- OR operation: OR
- Exclusive OR operation: XOR
- NOT operation: NOT
Chapter 2 Basic Principles of Computer
A CPU with a 16-bit structure has the following characteristics:
- The data bus is 16 bits
- The arithmetic unit can process up to 16 bits of data at a time
- The maximum width of the register is 16 bits
- The path between the register and the arithmetic unit is 16 bits
CPU read and write operations to memory
General-purpose data registers
- All registers of the 8086 CPU are 16 bits wide and can store two bytes. The four registers AX, BX, CX, and DX usually hold general data and can sometimes hold addresses. They are called general-purpose data registers.
- AX: accumulator. It is used in many operations, and some instructions require it
- BX: base register. Besides data, it often holds the starting offset address of a block of memory
- CX: count register. Besides data, it often holds the number of times an operation is to be repeated
- DX: data register. Besides data, it sometimes holds the upper 16 bits of a 32-bit value
Address registers
- The 8086 has four 16-bit general address registers. Their main function is to hold the offset address of data, and they can also hold data. Unlike the data registers, these four registers cannot be split into 8-bit halves
- SP: stack pointer. This special register holds the offset address of the top of the stack
- BP: base pointer. It can hold the offset address of data in memory
- SI: source index register. It often holds the offset address of the source data area in memory. An index register is one whose value certain instructions automatically increment or decrement
- DI: destination index register. It often holds the offset address of the destination data area in memory, and certain instructions automatically increment or decrement its value
Segment registers
- The 8086 has four 16-bit segment registers: CS, SS, DS, and ES. They hold the segment base addresses of four segments
- CS: code segment register, which holds the segment base address of the code currently being executed
- SS: stack segment register, which holds the segment base address of the stack segment
- DS: data segment register, which holds the segment base address of the data segment
- ES: extra segment register, which holds the segment base address of an additional data segment
Instruction pointer and flags registers
- IP: instruction pointer register, which holds the offset address of the next instruction to be executed
- FLAGS: flags register, which holds two kinds of CPU flags
- Status flags: reflect the current state of the processor, such as whether an overflow or a carry occurred. There are 6 status flags: CF, PF, AF, ZF, SF, and OF
- Control flags: control the working mode of the processor, such as whether it responds to maskable interrupts. There are 3 control flags: TF, IF, and DF
The working process of the 8086 CPU
- Read the instruction from the memory unit pointed to by CS:IP; the instruction enters the instruction buffer
- IP = IP + the length of the instruction just read, so that IP points to the next instruction
- Execute the instruction, then return to step 1 and repeat
The contents of CS and IP give the address of the instruction the CPU is about to execute
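The fetch cycle can be sketched as below. The notes above do not spell out how CS and IP are combined; the sketch uses the standard 8086 rule (segment shifted left by 4 bits plus offset, giving a 20-bit physical address). The function names and the list of instruction lengths are hypothetical.

```python
def physical_address(segment, offset):
    """Standard 8086 rule: segment * 16 + offset, truncated to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


def fetch_addresses(cs, ip, instruction_lengths):
    """Return the physical address of each instruction fetched in turn."""
    addresses = []
    for length in instruction_lengths:
        addresses.append(physical_address(cs, ip))  # step 1: fetch at CS:IP
        ip = (ip + length) & 0xFFFF                 # step 2: IP += length
        # step 3 (executing the instruction) is not modelled here
    return addresses


print([hex(a) for a in fetch_addresses(0x2000, 0x0000, [3, 2, 1])])
# ['0x20000', '0x20003', '0x20005']
```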
The stack
- The stack is a special storage area. Its last unit (highest address) is called the bottom of the stack, and data is stored starting from the bottom. The unit holding the most recently stored data is called the top of the stack. When the stack is empty, the top and bottom coincide. Data is stored one word at a time, and each newly stored word is placed at the next lower address. The stack pointer SP is decremented by 2 on each store and always points to the current top of the stack. Data is accessed in last-in-first-out order
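A small simulation of this behaviour, as a sketch only. The class name `Stack8086`, the initial SS and SP values, and the dictionary used as memory are assumptions; the little-endian byte order and the SS:SP address calculation follow the standard 8086 conventions.

```python
class Stack8086:
    """Toy model of the 8086 stack: word-sized, grows toward lower addresses."""

    def __init__(self, ss=0x1000, sp=0x0010):
        self.memory = {}          # sparse memory: physical address -> byte
        self.ss, self.sp = ss, sp

    def _address(self):
        return ((self.ss << 4) + self.sp) & 0xFFFFF

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF           # SP is decremented by 2 first
        addr = self._address()
        self.memory[addr] = word & 0xFF            # low byte at the lower address
        self.memory[(addr + 1) & 0xFFFFF] = (word >> 8) & 0xFF

    def pop(self):
        addr = self._address()
        low = self.memory.get(addr, 0)
        high = self.memory.get((addr + 1) & 0xFFFFF, 0)
        self.sp = (self.sp + 2) & 0xFFFF           # SP is incremented by 2 after
        return (high << 8) | low


stack = Stack8086()
stack.push(0x1234)
stack.push(0xABCD)
print(hex(stack.sp))     # 0xc
print(hex(stack.pop()))  # 0xabcd (last in, first out)
```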
According to different purposes, the register ports in the interface are divided into the following three categories
- data port
- control port
- status port
The information transmission between the CPU and the port in the I/O interface is also carried out through the data bus.
Chapter 3 Assembly language program example and computer operation
System working process
- Edit source program files using an editing program
- Assemble a source program file (.asm) into an object file (.obj) using an assembler (MASM)
- Use the linker (LINK) to link the object file (.obj) into an executable file (.EXE)
- Using the debugger (DEBUG), debug the executable
Commonly used DOS commands
- Drive letter followed by a colon (for example D:); switch to that drive
- CD; select directory
- DIR; display directories and files
- REN; change file name
- CLS; clear screen
- DEL; delete file
- MD; create directory
- RD; delete directory
- COPY; copy file
- TYPE; display the content of the text file
- >; output redirection operator
- SET PATH; set or display the search path for executable files
- HELP; display command format and usage
Running assembly programs under Windows 7
- DOSBox is a DOS emulator for Windows in which DOS programs can be stored and run. Before use, a host directory must be mounted as a DOS drive with the mount command
Key point two: several commonly used DOS system function calls
- Interrupt 21H is the interrupt that DOS provides for users to call system functions. It offers nearly a hundred functions, covering three main areas: device management, directory management, and file management
- Assembly language programs need to use these system functions
A function call usually follows these 4 steps
- Set the system function number in the AH register
- Set the entry parameters in the specified registers
- Execute the instruction INT 21H to invoke the interrupt service routine
- Examine the exit parameters to determine the result of the call
Chapter 4 Operand Addressing Mode
An instruction in a computer consists of an opcode and operands
- An instruction may have one, two, or three operand fields; such instructions are called one-address, two-address, or three-address instructions
- The two operands of a two-address instruction are called the source operand and the destination operand
- The addressing mode is the way the operand of an instruction is located
The general format of 8086 assembly language instructions is: [label:] instruction mnemonic [operand] [; comment]
- The items in [ ] are optional
- Label: a symbolic address indicating the location of the instruction in memory
- Instruction mnemonic: the instruction name, an English abbreviation of the instruction's function
- Operand: the data operated on by the instruction, or the address where the data is located. It can be a register, constant, variable, or expression
- Comment: starts with a semicolon ";" and is not processed by the assembler
Immediate addressing mode: the operand is contained in the instruction itself, immediately after the opcode, and is stored in the code segment as part of the instruction
- No memory access is needed to fetch the operand during execution, so it is called an immediate value
- Mainly used to assign initial values to registers
- An immediate value can only be used as the source operand, and its length must match that of the destination operand
Register addressing mode: the operand is the value in a register, and the register name is given in the instruction
- For operands in memory (the modes below), the offset address is also called the effective address (EA)
Direct addressing mode: the effective address EA of the operand is given in the instruction, and by default the segment address is in DS
- Memory read operation
- Memory write operation
- To make the CPU write to memory, make the destination operand of the MOV instruction a memory unit and the source operand a CPU register
- Symbolic address
- In direct addressing mode, a symbolic address can be used instead of a numeric effective address. A name defined for a storage unit is its symbolic address; if the storage unit is regarded as a variable, the name is the variable name
- Segment override prefix
- In memory-related addressing modes, the segment of the operand defaults to the data segment. The 8086 allows data to be stored in the other three segments as well. If the operand is in another segment, this must be indicated in the instruction with a segment override prefix: the segment register name and a colon are written before the operand
Register indirect addressing mode: the effective address of the operand is in a register; only BX, BP, SI, and DI may be used
Register relative addressing mode: the effective address of the operand is the sum of the contents of a register and a displacement
Base-indexed addressing mode: the effective address of the operand is the sum of the contents of a base register and an index register
- The base registers are BX and BP; the index registers are SI and DI
- The default segment register pairing is the same as in register indirect addressing mode
Relative base-indexed addressing mode: the effective address of the operand is the sum of the contents of a base register, the contents of an index register, and a displacement
- The base registers are BX and BP; the index registers are SI and DI
- The default segment register pairing is the same as in register indirect addressing mode
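The memory addressing modes above all reduce to one sum, which the following sketch computes. The function names and the register values are made up for illustration. The default-segment rule used here (SS when BP is the base, DS otherwise) is the standard 8086 behaviour and is stated as an assumption because the notes do not list it.

```python
def effective_address(regs, base=None, index=None, disp=0):
    """EA = [base] + [index] + displacement, truncated to 16 bits."""
    ea = disp
    if base is not None:
        ea += regs[base]       # BX or BP
    if index is not None:
        ea += regs[index]      # SI or DI
    return ea & 0xFFFF


def default_segment(base=None):
    """Assumed standard rule: BP-based operands default to SS, others to DS."""
    return "SS" if base == "BP" else "DS"


regs = {"BX": 0x1000, "BP": 0x2000, "SI": 0x0020, "DI": 0x0030}
print(hex(effective_address(regs, base="BX")))                      # register indirect
print(hex(effective_address(regs, base="BX", disp=4)))              # register relative
print(hex(effective_address(regs, base="BX", index="SI")))          # base-indexed
print(hex(effective_address(regs, base="BP", index="DI", disp=8)))  # relative base-indexed
```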
Chapter 5 Commonly Used Instructions
The general format of 8086 assembly language instructions is: [label:] instruction mnemonic [operand] [; comment]
The 8086 instruction set can be divided into 5 groups
- Data transfer instructions
- Arithmetic instructions
- Logical instructions and shift instructions
- String manipulation instructions
- Program control transfer instructions
Data transfer instructions
General data transfer instructions
- MOV: move (copy) data
- Rules for double-operand instructions
- PUSH: push onto the stack
- POP: pop from the stack
- XCHG: exchange
Accumulator-specific transfer instructions
- I/O ports are the interface through which the CPU and peripheral devices exchange data. They are addressed separately and are not part of memory; the port address range is 0000H~FFFFH. This group of instructions can only use the accumulator AX or AL
- IN: input from an I/O port
- OUT: output to an I/O port
- XLAT: translate a code (table lookup)
Address transfer instructions
- LEA: load effective address into a register
- LDS: load pointer into a register and DS
- LES: load pointer into a register and ES
Flag register transfer instructions
Arithmetic instructions
- Addition, subtraction, multiplication, and division are the basic operations that computers perform. Arithmetic instructions mainly implement these four operations on binary (and decimal) data
Type extension instructions
Addition instructions
- ADD: addition
- ADC: addition with carry
- INC: increment by 1
Subtraction instructions
- SUB: subtraction
- SBB: subtraction with borrow
- DEC: decrement by 1
- NEG: negate. NEG computes the opposite of the number X, that is, 0-X. CF=0 only when X=0; in all other cases CF=1
- CMP: compare. CMP performs a subtraction but does not store the result; it only sets the flags, giving the programmer a basis for judging which of two numbers is larger
![typoraImage](https://img-blog.csdnimg.cn/img_convert/6593f4f7e4d20915c95a393c4d0f453c.png)
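How a subtraction sets the flags that CMP and NEG rely on can be sketched as follows. The function name `sub_flags` is hypothetical, and only four of the six status flags are modelled.

```python
def sub_flags(a, b, bits=16):
    """Compute a - b as the CPU would and return (result, flags)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    flags = {
        "CF": int(a < b),                                # unsigned borrow
        "ZF": int(result == 0),                          # operands were equal
        "SF": int(bool(result & sign)),                  # sign bit of the result
        "OF": int(bool((a ^ b) & (a ^ result) & sign)),  # signed overflow
    }
    return result, flags


# CMP 3, 5 keeps only the flags: a borrow occurred and the result is negative.
print(sub_flags(3, 5)[1])   # {'CF': 1, 'ZF': 0, 'SF': 1, 'OF': 0}
# NEG X behaves like 0 - X, so CF is 0 only when X is 0.
print(sub_flags(0, 0)[1]["CF"], sub_flags(0, 5)[1]["CF"])  # 0 1
```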
Multiplication instructions
- MUL: unsigned multiplication
- IMUL: signed multiplication
- The two numbers being multiplied must have the same length
- SRC cannot be an immediate value
Division instructions
- DIV: unsigned division
- IDIV: signed division
Decimal adjustment instructions for BCD code
- All the arithmetic instructions introduced above operate on binary numbers. To make decimal arithmetic convenient, the processor provides decimal adjustment instructions, which adjust the result of a binary calculation so that a decimal result is obtained directly
- BCD (8421 code): represents each decimal digit with a 4-bit binary code
- Converting a decimal number to BCD code: each decimal digit is replaced by its 4-bit code, for example 59 becomes 0101 1001
- There are two main packed BCD adjustment instructions:
- DAA: decimal adjust after addition
- DAS: decimal adjust after subtraction
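The idea behind DAA can be sketched as a binary addition followed by a correction. The function names are hypothetical, and the adjustment steps follow the usual description of DAA (add 06H if the low digit overflowed, add 60H if the high digit overflowed); treat it as an illustration rather than a cycle-exact model.

```python
def to_packed_bcd(n):
    """Encode a decimal value 0..99 as one packed BCD byte (59 -> 0x59)."""
    if not 0 <= n <= 99:
        raise ValueError("a packed BCD byte holds 0..99")
    return ((n // 10) << 4) | (n % 10)


def add_then_daa(a, b):
    """Add two packed BCD bytes in binary, then apply the decimal adjustment.

    Returns (AL, CF) after the adjustment.
    """
    raw = a + b
    al = raw & 0xFF
    cf = raw > 0xFF                          # carry out of the binary addition
    af = (a & 0x0F) + (b & 0x0F) > 0x0F      # carry out of the low nibble
    old_al = al
    if (al & 0x0F) > 9 or af:
        al = (al + 0x06) & 0xFF              # fix the low decimal digit
    if old_al > 0x99 or cf:
        al = (al + 0x60) & 0xFF              # fix the high decimal digit
        cf = True
    return al, int(cf)


al, cf = add_then_daa(to_packed_bcd(38), to_packed_bcd(45))
print(hex(al), cf)   # 0x83 0  (38 + 45 = 83)
```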
Logical instructions
- AND: logical AND
- OR: logical OR
- NOT: logical NOT
- XOR: exclusive OR
- TEST: test
Shift instructions
- SHL: logical shift left
- SAL: arithmetic shift left
- SHR: logical shift right
- SAR: arithmetic shift right
- ROL: rotate left
- ROR: rotate right
- RCL: rotate left through carry
- RCR: rotate right through carry
- Arithmetic shift instructions are suited to signed numbers: SAL multiplies by 2 and SAR divides by 2. Logical shift instructions are suited to unsigned numbers: SHL multiplies by 2 and SHR divides by 2
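The difference between the logical and arithmetic right shifts shows up on a value whose top bit is set. The sketch below models single-bit shifts on 8-bit values; the helper names are hypothetical and the carry flag is not modelled.

```python
def shl8(x):
    """SHL/SAL by 1: shift left, a 0 enters at bit 0."""
    return (x << 1) & 0xFF


def shr8(x):
    """SHR by 1: shift right, a 0 enters at bit 7 (unsigned divide by 2)."""
    return (x & 0xFF) >> 1


def sar8(x):
    """SAR by 1: shift right, bit 7 keeps its value (signed divide by 2)."""
    x &= 0xFF
    return (x >> 1) | (x & 0x80)


def to_signed8(x):
    """Interpret an 8-bit pattern as a signed number."""
    return x - 0x100 if x & 0x80 else x


pattern = 0xF8                       # 248 unsigned, -8 signed
print(shr8(pattern))                 # 124 (248 / 2)
print(to_signed8(sar8(pattern)))     # -4  (-8 / 2)
```

Note that SAR rounds toward negative infinity: applying `sar8` to 0xFF (-1) gives 0xFF again.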
String manipulation instructions
- MOVS: move string
- CMPS: compare strings
- SCAS: scan string
- STOS: store string
- LODS: load from string
- A string instruction processes only one byte or one word each time it executes, so it must be executed repeatedly to process a whole data string
- REP: repeats the string instruction until CX=0. CX is automatically decremented by 1 each time the string instruction executes
- REPE/REPZ: repeats the string instruction while CX≠0 and ZF=1, stopping when CX=0 or ZF=0. CX is automatically decremented by 1 each time the string instruction executes
- REPNE/REPNZ: repeats the string instruction while CX≠0 and ZF=0, stopping when CX=0 or ZF=1. CX is automatically decremented by 1 each time the string instruction executes
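The REPE prefix combined with a byte compare can be simulated as below. This is a sketch under assumptions: the function name `repe_cmpsb` is hypothetical, both strings start at index 0, the direction flag is taken as 0 (addresses increase), and `zf` is the flag value before the instruction, which is left unchanged when CX is already 0.

```python
def repe_cmpsb(src, dst, cx, zf=0):
    """Simulate REPE CMPSB; return (CX, ZF, SI) when the repetition stops."""
    si = di = 0
    while cx != 0:
        zf = int(src[si] == dst[di])   # CMPSB sets ZF from the comparison
        si += 1                        # SI and DI advance (DF = 0 assumed)
        di += 1
        cx -= 1                        # CX is decremented on every repetition
        if zf == 0:                    # REPE stops at the first mismatch
            break
    return cx, zf, si


print(repe_cmpsb(b"HELLO", b"HELLO", 5))  # (0, 1, 5): strings are equal
print(repe_cmpsb(b"HELLO", b"HELPO", 5))  # (1, 0, 4): mismatch at the 4th byte
```

After the loop, ZF tells whether the strings matched, and CX together with SI locates the first mismatch.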
Program control transfer instructions
Unconditional branch instruction
- JMP: jump. Unconditionally transfers control to the address specified by the instruction. If the target address and the jump instruction are in the same code segment, it is an intra-segment transfer; otherwise it is an inter-segment transfer. If the target address is given directly in the jump instruction, it is a direct transfer; otherwise it is an indirect transfer
- Intra-segment direct transfer
- Intra-segment indirect transfer
- Inter-segment direct transfer
- Inter-segment indirect transfer
Conditional branch instructions
- A conditional branch instruction tests a condition based on the flags set by a previous instruction and decides the direction of the program accordingly. A conditional branch is therefore usually preceded by an instruction that sets the flags, such as CMP. In the instruction format, the target address is written as a label. No conditional branch instruction affects the flags
- Branch on the setting of a single condition flag
- Test whether the value of the CX register is 0 and branch
- Compare two unsigned numbers and branch on the result
- Compare two signed numbers and branch on the result
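Why unsigned and signed comparisons need different branch instructions can be seen by evaluating the flag conditions after a compare. The notes do not list the mnemonics, so the four used here (JA, JB for unsigned; JG, JL for signed) and their standard flag conditions are supplied as an assumption; the Python function names are hypothetical.

```python
def flags_after_cmp(a, b, bits=16):
    """Flags produced by CMP a, b (a subtraction whose result is discarded)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "CF": int(a < b),
        "ZF": int(result == 0),
        "SF": int(bool(result & sign)),
        "OF": int(bool((a ^ b) & (a ^ result) & sign)),
    }


def ja(f):  # jump if above (unsigned >)
    return f["CF"] == 0 and f["ZF"] == 0


def jb(f):  # jump if below (unsigned <)
    return f["CF"] == 1


def jg(f):  # jump if greater (signed >)
    return f["ZF"] == 0 and f["SF"] == f["OF"]


def jl(f):  # jump if less (signed <)
    return f["SF"] != f["OF"]


f = flags_after_cmp(0xFFFF, 0x0001)
print(ja(f))  # True:  65535 is above 1 as an unsigned number
print(jl(f))  # True:  -1 is less than 1 as a signed number
```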
Loop instructions
Chapter 6 Directives and Source Program Format
Statements in an assembly language program: instructions, pseudo-instructions (directives), and macro instructions
Instructions are executed by the CPU while the program runs
Pseudo-instructions: mainly used to define data variables and program structure. They are handled by the assembler while it assembles the source program
Processor selection pseudo-instruction: tells the assembler which instruction set to use; the 8086 instruction set is selected by default
The ASSUME pseudo-instruction only tells the assembler which segment register is associated with a given segment; it does not load the segment address into the segment register. The code segment must therefore load the segment address into the corresponding segment register itself, usually with two MOV instructions. CS is the exception: it is loaded when the program is initialized
Simplified segment definition directives
Program start and end directives
The difference between labels and variables
- Variable: usually defined outside the code segment. It is the symbolic address of data stored in memory, that is, the name of a data area whose value can be modified at any time while the program runs. It is an identifier with no colon after it
- Label: usually defined in the code segment. It is the symbolic address of an instruction stored in memory, and its value is calculated automatically at assembly time. It is an identifier followed by a colon
Numeric data in a program is decimal by default. When the first character of a number is not a digit, a 0 must be added in front. Negative numbers are stored in two's complement form. Strings are enclosed in single quotes ' '. A '?' means that the storage unit is only allocated and no initial value is stored
Expression assignment directives "EQU" and "="
- A constant or an expression can be given a name with an assignment pseudo-operation. The format is: name EQU expression, or name = expression
- Variables or labels used in the expression must be defined before they are referenced
- A name defined with EQU cannot be redefined, while a name defined with "=" can be redefined
Procedure definition directives
- A procedure is equivalent to a subroutine in a high-level language. It is an independent code module that performs a specific function, and it is the basis of modular programming. On the 8086, the instruction that calls a procedure is CALL, and the instruction that returns from a procedure is RET
- A procedure definition uses two pseudo-instructions: PROC marks the beginning of the procedure and ENDP marks its end
- The procedure name is an identifier; it acts as a label and is the symbolic address of the subroutine entry
- A procedure has the attribute NEAR or FAR. NEAR means an intra-segment call and FAR means an inter-segment call; the default is NEAR
When there are several operators in an expression at the same time, they are executed in order of operator priority. When assembling the source program, the assembler calculates the value of the expression according to the following rules:
- Perform operations with higher priority first
- Operations with the same priority are performed in order from left to right
- You can use parentheses to change the order of operations
Assembly language source programs can be built into two types of executable files:
- Executable files with the extension .EXE (EXE files)
- Executable files with the extension .COM (a compact format, COM files)
- The two file types have different execution priorities, and their source program structures also differ considerably
In addition to the program itself, an EXE file has a file header
A COM file consists only of the program's binary code; it has no header area with file information as an EXE file does
Writing the source program in COM format does not by itself produce a COM file. The COM file is also generated by the LINK program, and /T must be added to the link command
If two files such as PROG.EXE and PROG.COM are in the same directory, typing PROG runs the COM file; type PROG.EXE to run PROG.EXE
Chapter 7 Branch and Loop Programming
Single-branch program
Multi-branch program
- If a branch structure has more than two alternative branches, it is a multi-branch structure
- Testing the conditions of the branches one by one to find the right one increases both code size and execution time. To reach the required branch as quickly as possible, the branch vector table method can be used
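The branch vector table idea can be sketched in Python with a list of functions indexed by a selector, which stands in for a table of branch addresses and an indirect jump. The handler names and the selector values are made up for this illustration.

```python
def handle_add(a, b):
    return a + b


def handle_sub(a, b):
    return a - b


def handle_mul(a, b):
    return a * b


# The table plays the role of the branch vector table: entry i holds the
# "address" of branch i, so no chain of comparisons is needed.
BRANCH_TABLE = [handle_add, handle_sub, handle_mul]


def dispatch(selector, a, b):
    """Jump to the branch chosen by selector with a single table lookup."""
    if not 0 <= selector < len(BRANCH_TABLE):
        raise ValueError("selector is outside the branch table")
    return BRANCH_TABLE[selector](a, b)


print(dispatch(0, 6, 3))  # 9
print(dispatch(2, 6, 3))  # 18
```

The cost of reaching any branch is the same, however many branches the table holds.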
Loop program structure
- A loop program has two structural forms: the DO-WHILE structure and the DO-UNTIL structure. A loop program consists of three parts: loop initialization, loop control, and the loop body. There are three types of loop control condition: counting loops, conditional loops, and conditional counting loops
Counting loop program
- Counting loop: the loop is controlled by the value of a loop counter
Conditional loop
- In a loop program, the operation performed may differ from one iteration to the next. When the loop body contains such a branch, a flag decides what to do in each iteration: a flag bit of 1 means operation A is performed, and a flag bit of 0 means operation B is performed. The word that holds these flag bits can be called a logical scale
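A loop driven by a logical scale can be sketched as below. The two operations (doubling for A, adding 1 for B), the function name, and the choice to read the scale from its lowest bit upward are assumptions for illustration only.

```python
def run_with_logical_scale(values, scale):
    """Process values in a loop; bit i of scale picks the operation for item i."""
    results = []
    for value in values:
        if scale & 1:                  # flag bit 1: operation A (double it)
            results.append(value * 2)
        else:                          # flag bit 0: operation B (add 1)
            results.append(value + 1)
        scale >>= 1                    # move the next flag bit into position
    return results


print(run_with_logical_scale([10, 20, 30, 40], 0b0101))  # [20, 21, 60, 41]
```

In assembly the same effect is usually obtained by shifting the scale one bit per iteration and branching on the bit that was shifted out.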