Assembly language 1: computer basics and registers

Basic knowledge

Early machine language: programs were sequences of 0s and 1s entered as punched holes.

Assembly language: machine language in an easy-to-remember form. It is translated into machine code by the compiler (more precisely, an assembler).

Registers: storage locations inside the CPU. For example, AX and BX in assembly code are the names of registers.

Memory is divided into storage units of 8 bits (one byte) each, numbered sequentially starting from 0.

The composition of assembly language

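  1. Assembly instructions (mnemonics for machine code; each has a corresponding machine instruction)
  2. Directives (executed by the compiler; they have no corresponding machine code)
  3. Other symbols such as + - * / (recognized by the compiler; they have no corresponding machine code)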

The core part is the assembly instruction.

CPU read and write memory

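To read or write memory, the CPU must supply three kinds of information: the address of the storage unit, the control information (which device to select, and whether to read or write), and the data to be read or written.

The wires connecting the CPU to other chips are called buses. Logically they are divided into the address bus, the data bus, and the control bus.

The width of the address bus determines the CPU's addressing capability (N address lines can address 2^N storage units), the width of the data bus determines how much data is transferred at a time, and the width of the control bus determines how many kinds of control the CPU can exert over other devices.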
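
As a rough illustration of the address-bus claim, the sketch below computes how many storage units a given number of address lines can select. It is written in Python rather than assembly so it can be run anywhere, and the function name addressable_units is made up for this example.

```python
def addressable_units(address_lines):
    """Number of distinct storage units selectable with this many address lines."""
    return 2 ** address_lines

# 16 lines reach 64 KB; 20 lines reach 1 MB.
for lines in (10, 16, 20):
    print(lines, "lines ->", addressable_units(lines), "units")
```

Each extra address line doubles the reachable memory, which is why going from 16 to 20 lines raises the limit from 64 KB to 1 MB.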

Memory

(Assembly programming should start from the CPU point of view)

From the CPU's point of view, each RAM or ROM device occupies a range of addresses. All of their storage units form one unified logical storage space, the memory address space.

Registers

The CPU's internal bus connects the arithmetic unit, the controller, and the registers inside the chip; the external bus connects the CPU to the other components on the motherboard.

Taking the 8086 CPU as an example, all registers are 16 bits (two bytes) wide. General data is held in the general-purpose registers AX, BX, CX, and DX.

The generation of CPUs before the 8086 had 8-bit registers. To stay compatible with programs written for them, each 8086 general-purpose register can also be used as two independent 8-bit registers: AX splits into AH (high byte) and AL (low byte), and likewise for BX, CX, and DX.

The maximum value a 16-bit register can hold is 2^16 - 1 (FFFFH).
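
To make the AH/AL split concrete, here is a small Python model of a 16-bit register and its two halves. It is only a sketch: split_ax and join_ax are hypothetical helper names, not anything the CPU or debug provides.

```python
def split_ax(ax):
    """Return (AH, AL) for a 16-bit AX value."""
    ax &= 0xFFFF                 # keep only 16 bits, like the real register
    return ax >> 8, ax & 0xFF    # high byte, low byte

def join_ax(ah, al):
    """Rebuild AX from its two 8-bit halves."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)

ah, al = split_ax(0x4E20)
print(f"AH={ah:02X}H AL={al:02X}H")   # AH=4EH AL=20H
```

Writing to AH or AL changes only that byte of AX; the other half is left untouched.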

Word

A word is 16 bits, that is, two bytes stored in consecutive memory units: the low byte at the lower address, then the high byte. A word fits exactly in a 16-bit register.

Examples of assembly instructions (instruction and register names are case-insensitive):

image-20230329104300956

Note the overflow when a result is assigned to an 8-bit half register (AH or AL) here: the carry out of the top bit cannot be stored in that register. The carry is not simply discarded, however; this is discussed later.
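
The overflow behaviour can be sketched in Python as follows. The values C5H and 93H are chosen only for illustration, and add_al is a hypothetical helper that models an 8-bit add into AL; the carry is returned separately to mirror the point that it is not stored in the register.

```python
def add_al(ax, value):
    """Model `add al, value`: only AL changes. Returns (new AX, carry)."""
    ah, al = (ax >> 8) & 0xFF, ax & 0xFF
    total = al + (value & 0xFF)
    carry = total > 0xFF                      # did the sum exceed 8 bits?
    return (ah << 8) | (total & 0xFF), carry  # AH is untouched

ax, carry = add_al(0x00C5, 0x93)              # C5H + 93H = 158H
print(f"AX={ax:04X}H carry={carry}")          # AX=0058H carry=True
```

AL ends up holding only the low 8 bits (58H), and AH does not receive the carry.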

Physical address

The real address of a storage unit in memory, as opposed to a purely logical one.

Internally the 8086 is a 16-bit design: it can only handle and transfer 16-bit addresses, which on their own give an addressing capability of 64 KB. Externally it has a 20-bit address bus, so it forms each 20-bit physical address from two 16-bit values. An address adder combines a 16-bit segment address and a 16-bit offset address as segment address * 16 + offset address (multiplying by 16 is the same as shifting the segment address left by 4 bits). In other words, a 20-bit address does not fit in one 16-bit register, so it is expressed as a 16-bit segment address plus a 16-bit offset address.

One physical address has many segment:offset representations. For example, 21F60H can be written as 2000H:1F60H or as 2100H:0F60H.

Formula: physical address = segment * 16 + offset.
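
The formula is easy to check with a few lines of Python. This is a minimal sketch; physical_address is a name invented for the example, and the final mask to 20 bits is an assumption that models the 8086 having only 20 address lines.

```python
def physical_address(segment, offset):
    """segment * 16 + offset, limited to the 20 address lines of the 8086."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF

# Two different segment:offset pairs that name the same storage unit.
print(f"{physical_address(0x2000, 0x1F60):05X}H")   # 21F60H
print(f"{physical_address(0x2100, 0x0F60):05X}H")   # 21F60H
```

Shifting left by 4 bits appends one hexadecimal zero to the segment address, which is why the sums can be done by eye in hex.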

Instruction execution

Segment addresses are held in four segment registers on the CPU: CS, DS, SS, and ES (code, data, stack, and extra).

CS is the code segment register, which holds the segment address of the next instruction; IP is the instruction pointer register, which holds its offset address.

IP serves as the offset address. The CPU combines CS and IP to fetch the bytes of an instruction from memory (for example B8 23 01, the machine code for mov ax,0123H) into the instruction buffer, adds the instruction's length to IP (3 in this case) so that CS:IP points to the next instruction, and then executes the fetched instruction.

After power-on, CS=FFFFH and IP=0000H, so the CPU executes the instruction at FFFF0H as its first instruction.

The values of CS and IP cannot be changed with ordinary instructions such as the mov instruction covered earlier. Special transfer instructions are used instead.

Transfer instruction: for example, jmp 3:0B46 sets CS=0003H and IP=0B46H, so the physical address is 00030H+00B46H=00B76H.

To modify only IP: jmp followed by a legal register. For example, jmp ax puts the value of AX into IP.

Segments

Memory itself is not divided into segments; segmentation is only a logical division that the CPU applies to memory.

A segment is a contiguous block of memory that starts at an address that is a multiple of 16 (because the segment address is multiplied by 16) and is at most 64 KB long (because the offset is 16 bits). A code segment contains a sequence of instructions.

When executing, the CPU does not care how the programmer has divided memory into segments. It only uses the current CS:IP to locate the next instruction.

The debug program

debug is a program for writing and debugging programs in 8086 mode.

Windows 10 cannot run debug.exe directly; it needs to be run inside DOSBox.

After each startup, DOSBox needs the folder containing the asm files and debug.exe to be mounted. Syntax: mount c path. The mount can also be configured in option.bat so that it runs automatically at startup.

After mounting, enter c: to switch to the c drive, then enter debug to run debug.exe.

Reference: "Solve the troubles of win10 learning assembly tools - download and use of assembly Debug (including available download links)", NULL not error's blog, CSDN.

r: view or modify register values. To modify: r register-name, then enter the new value on the next line.

d: view the contents of memory.

e: rewrite the contents of memory, entered as raw bytes (machine code). Data in ROM cannot be modified: for example, the production date stored in fff0:0 to fff0:ff keeps its value after an attempted modification.

Addresses starting at b810:0 belong to video memory, so modifying them with e directly changes what is displayed on the screen.

a: write instructions into memory in assembly instruction format.

After entering a segment:offset, type the code line by line and press Enter on an empty line to exit. d segment:offset shows the resulting bytes in memory; a range can also be given, for example d fff0:0 ff.

u: takes the same arguments as d, but disassembles the memory contents into assembly instructions for viewing.

Code execution: first point CS:IP at the code with r, then use r again to check the current position and confirm that the next instruction shown is mov ax,4e20.

Then enter t to execute the current instruction.

Example: a program that keeps doubling AX.

Write at position 2000:0:

mov ax,1	; address: 20000H
add ax,ax	; address: 20003H
jmp 2000:3	; address: 20005H (add ax,ax is 2 bytes long)

Execution sequence: ax=1, ax * 2, ax * 2, ax * 2...
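
The execution sequence can be traced with a short Python model. This is a sketch of what the three instructions do to AX, not an emulator: run_doubling is a hypothetical name, and the 16-bit truncation reflects AX being a 16-bit register.

```python
def run_doubling(steps):
    """Trace AX for: mov ax,1 / add ax,ax / jmp 2000:3 (back to the add)."""
    ax = 1                        # mov ax,1 executes once
    trace = []
    for _ in range(steps):
        ax = (ax + ax) & 0xFFFF   # add ax,ax, truncated to 16 bits
        trace.append(ax)          # jmp 2000:3 sends CS:IP back to the add
    return trace

print(run_doubling(5))            # [2, 4, 8, 16, 32]
```

Because the jump targets offset 3 rather than 0, mov ax,1 never runs again. After 16 doublings the single 1 bit is shifted out of the register and AX stays at 0.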

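Accessing data in memory

Memory data is located through the data segment address in DS plus an offset address.

mov bx,1000H
mov ds,bx	; data cannot be moved directly into a segment register, so go through a general-purpose register
mov al,[0]	; read the byte at ds:0 into al. [] holds the offset address; ds is not written, the CPU combines the data segment address and the offset automatically

To write register data to memory, reverse the operands, for example mov [0],al.

A word in memory is 16 bits, the same width as the registers (including the segment registers), and the 8086 can transfer 16 bits at a time.

mov al,[0] reads only the single byte at 10000H into al, whereas mov ax,[0] reads the two bytes at 10001H (high byte) and 10000H (low byte) as one word into ax.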
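
The difference between the byte read and the word read can be sketched in Python. The memory contents (23H and 11H) are made-up sample values, and read_byte and read_word are hypothetical helpers standing in for mov al,[0] and mov ax,[0] with DS=1000H.

```python
# Hypothetical memory contents at physical addresses 10000H and 10001H.
memory = {0x10000: 0x23, 0x10001: 0x11}

def read_byte(ds, offset):
    """Like mov al,[offset]: one byte from ds*16 + offset."""
    return memory.get((ds << 4) + offset, 0)

def read_word(ds, offset):
    """Like mov ax,[offset]: low byte first, high byte at the next address."""
    low = read_byte(ds, offset)
    high = read_byte(ds, offset + 1)
    return (high << 8) | low

print(f"{read_byte(0x1000, 0):02X}H")   # 23H
print(f"{read_word(0x1000, 0):04X}H")   # 1123H
```

The byte at the lower address becomes the low half of the register, so the word reads as 1123H even though 23H comes first in memory.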

Sequence for running a program in debug: first write the data into memory with e segment:offset values, then enter the program with a segment:offset, then point CS:IP at it with r, and finally execute the code one instruction at a time with t.

The stack

LIFO (last in, first out).

The CPU provides instructions for accessing a region of memory as a stack.

push ax: pushes the data in ax onto the stack.

pop ax: pops the data at the top of the stack into ax.

Both operate in units of words.

The CPU knows which memory is being used as the stack through SS and SP: SS holds the segment address of the top of the stack, and SP holds its offset address, so SS:SP always points to the top element.

image-20230331172148326

As shown in the figure, the low byte of each word is stored at the lower address and the high byte at the higher address. The stack grows from high addresses toward low addresses: a push decreases SP by 2, and a pop increases SP by 2.

When the stack (10000H to 1000FH) is initially empty, SP points to 1000FH+1=10010H, just past the bottom of the stack. To push the first element, the CPU first performs SP=SP-2 and then stores the data at SS:SP.

pop is the reverse of push: it reads the data first and then moves the pointer (SP=SP+2). After a pop, the data at 1000CH and 1000DH is still in memory; only the pointer has changed, much as formatting a hard disk leaves the old data in place. The next push overwrites it.
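
The push and pop steps can be modelled in Python as below. This is a sketch under stated assumptions: Stack8086 is a hypothetical class, memory is a plain dictionary, the pushed values 0123H and 2266H are sample data, and the stack is taken to occupy 10000H to 1000FH with SS=1000H.

```python
class Stack8086:
    """Toy model of SS:SP stack access (word-sized push and pop)."""

    def __init__(self, ss, sp):
        self.ss, self.sp = ss, sp
        self.memory = {}                       # physical address -> byte

    def _addr(self, offset):
        return ((self.ss << 4) + offset) & 0xFFFFF

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF       # move the pointer first
        self.memory[self._addr(self.sp)] = word & 0xFF                        # low byte
        self.memory[self._addr((self.sp + 1) & 0xFFFF)] = (word >> 8) & 0xFF  # high byte

    def pop(self):
        low = self.memory.get(self._addr(self.sp), 0)        # read the data first
        high = self.memory.get(self._addr((self.sp + 1) & 0xFFFF), 0)
        self.sp = (self.sp + 2) & 0xFFFF       # then move the pointer
        return (high << 8) | low

stack = Stack8086(ss=0x1000, sp=0x0010)        # empty stack: SP is one past 1000FH
stack.push(0x0123)
stack.push(0x2266)
print(f"SP={stack.sp:04X}H")                   # SP=000CH
print(f"{stack.pop():04X}H")                   # 2266H
print(f"{stack.memory[0x1000C]:02X}H")         # 66H
```

The last line shows that the popped bytes are still sitting at 1000CH; only SP has moved.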

Letting the top of the stack go out of bounds is dangerous, whether by pushing onto a full stack or popping from an empty one. The 8086 CPU has no registers that record the bounds of the stack, so it cannot check whether a push or pop stays inside them. It only knows where the top of the stack currently is and which instruction is currently being executed.

Therefore we have to guard against going out of bounds ourselves when programming.

Because push and pop change only SP, never SS, the top of the stack can move only within offsets 0 to FFFFH, so a stack can be at most 64 KB.

Example: the stack occupies 10000H to 1FFFFH and SS=1000H. Find the value of SP when the stack is empty.

Answer: when the stack holds exactly one element, that element is at 1FFFEH, that is, SS=1000H and SP=FFFEH.

Popping that element leaves the stack empty: SP=SP+2, which wraps around from FFFEH to 0000H. So SP=0000H when the stack is empty.
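
The wraparound in this answer can be checked with a short Python sketch. empty_sp is a hypothetical helper: it takes the address one past the highest stack unit, subtracts the segment base, and keeps only 16 bits, as SP itself would.

```python
def empty_sp(ss, stack_end):
    """SP for an empty stack whose highest storage unit is at stack_end."""
    one_past_bottom = stack_end + 1               # where SS:SP points when empty
    return (one_past_bottom - (ss << 4)) & 0xFFFF # SP holds only 16 bits

print(f"{empty_sp(0x1000, 0x1FFFF):04X}H")   # 0000H
print(f"{empty_sp(0x1000, 0x1000F):04X}H")   # 0010H
```

For the full 64 KB stack the offset 10000H does not fit in 16 bits, so SP reads 0000H; for the smaller 16-byte stack it is simply 0010H.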

Origin blog.csdn.net/jtwqwq/article/details/129887772