Detailed explanation of CPU and registers

1. CPU

  • A compiler is a program that translates the programs we write into the machine language instructions of a particular CPU. Typically, each type of CPU has its own unique machine language. This is one reason why programs written for the Mac won't run on IBM-type PCs.

  • Computers use a clock to synchronize the execution of instructions. The clock pulses at a fixed frequency (called the clock frequency). When you buy a 1.5GHz computer, 1.5GHz is the clock frequency, that is, 1.5 billion clock pulses per second. The clock does not keep track of minutes and seconds. It simply beats at a constant rate. The electronics of the computer use this beat to perform their operations correctly, just as the beat of a metronome helps you play music at the correct tempo. The number of beats (usually called cycles) an instruction takes depends on the generation and model of the CPU. It also depends on the instructions preceding it and on other factors.
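The relationship between clock frequency and cycle time can be worked out with a little arithmetic. The sketch below is illustrative only: the function names are made up for this example, and the 4-cycle instruction is a hypothetical figure, since the real cycle count depends on the CPU model and the surrounding instructions.

```python
def clock_period_ns(frequency_hz: float) -> float:
    """Return the duration of one clock cycle in nanoseconds."""
    return 1e9 / frequency_hz


def instruction_time_ns(cycles: int, frequency_hz: float) -> float:
    """Return the time taken by an instruction that needs `cycles` cycles."""
    return cycles * clock_period_ns(frequency_hz)


if __name__ == "__main__":
    freq = 1.5e9  # the 1.5GHz clock from the text
    print(f"{clock_period_ns(freq):.3f} ns per cycle")
    # Hypothetical instruction that needs 4 cycles on this CPU.
    print(f"{instruction_time_ns(4, freq):.3f} ns for a 4-cycle instruction")
```

At 1.5GHz one cycle lasts about two thirds of a nanosecond, so even an instruction that takes several cycles completes in a few nanoseconds.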

2. Why program memory needs to be segmented (using the 8086 CPU as an example)


  • In terms of access, memory is like a long rectangular strip of cells whose addresses increase in order. Memory is a random-access device: any location can be read or written directly, without starting from the beginning, as long as its physical address is given.
  • Segmentation is a memory access mechanism, that is, the way the CPU addresses memory. Only the CPU is concerned with segments; the memory itself is not physically divided.
    • The 8086 CPU has 20 address lines, so the maximum addressable memory space is 1MB, that is, 2^20 bytes. However, the 8086 has only 16-bit registers: the instruction pointer (IP) and the index registers (SI, DI) are also 16 bits wide, and a 16-bit address can reach only 64KB, not 1MB. The memory is therefore treated as segments: the 1MB space is divided into segments of at most 64KB each, and the 8086 provides four 16-bit segment registers to manage four segments. Note that this does not mean memory is divided into exactly four segments. The four registers correspond to the four parts a running program needs. Memory can hold more than four segments, but at most four are active at the same time; the others are "asleep" and are "woken up" when needed. The four segment registers are: CS for the code segment, DS for the data segment, SS for the stack segment, and ES for the extra segment.
    • Once memory is segmented, a memory address (the physical address) is formed from two parts: the segment address and the offset address within the segment. The segment registers hold the segment addresses. Each segment has a 20-bit segment base address, and the segment register stores its upper 16 bits. Shifting this 16-bit value left by four bits (appending four 0 bits) gives the 20-bit segment base address.
    • This is the origin of the addressing rule "segment address × 16 (or left shift by four bits) + offset address = physical address".
  • The second case is protected mode, where the emphasis of segmentation is on protection. Memory is divided into separate address spaces, one per segment, and addresses within a segment are converted into actual physical addresses by the CPU's MMU. Because each program runs in its own segments, unrelated pieces of code (separate processes or jobs) are protected from one another in protected mode.
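The real-mode rule "segment address × 16 + offset address = physical address" can be sketched in a few lines of Python. This is an illustrative model, not CPU code: the function name is invented here, and masking to 20 bits models the 8086's 20 address lines, where an address past 1MB wraps around to 0.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and keep only the 20 bits that the 8086 address bus can carry.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
    # A different segment:offset pair can name the same physical byte.
    print(hex(physical_address(0x1235, 0x0000)))  # 0x12350
    # The highest segment with a large offset wraps past 1MB.
    print(hex(physical_address(0xFFFF, 0x0010)))  # 0x0
```

Because segments start every 16 bytes and each spans up to 64KB, segments overlap, which is why many segment:offset pairs map to one physical address.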

3. Different models of CPU

  • 8088, 8086:
    These CPUs are identical from a programming point of view. They are the CPUs used in the earliest PCs. They provide several 16-bit registers: AX, BX, CX, DX, SI, DI, BP, SP, CS, DS, SS, ES, IP, FLAGS. They support only 1M bytes of memory and can work only in real mode. In this mode, a program can access any memory address, even the memory of other programs! This makes debugging and security very difficult. Also, a program's memory has to be divided into segments. Each segment cannot be larger than 64K, that is, 2^16 bytes (offsets are held in 16-bit registers).

  • 80286:
    This CPU was used in AT-class PCs. It adds some new instructions to the base machine language of the 8088/86. However, its main new feature is 16-bit protected mode. In this mode, it can access up to 16M bytes of memory and protects programs by preventing them from accessing each other's memory. However, programs are still divided into segments that cannot be larger than 64K, that is, 2^16 bytes (offsets are still held in 16-bit registers).

  • 80386:
    This CPU greatly enhanced the 80286. First, it extends many of the registers to hold 32 bits (EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP, EIP) and adds two new 16-bit segment registers (FS, GS). It also adds a new 32-bit protected mode. In this mode, it can access up to 4G bytes of memory. Programs are still divided into segments, but each segment can now be as large as 4G, that is, 2^32 bytes (offsets are held in 32-bit registers).

  • There are also many CPU models that are not listed here. After all, we do not do CPU chip development. We study CPUs with relatively simple structures and understand their principles.
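The memory limits quoted for each model follow from the width of the address bus. The sketch below is a rough illustration: the 20 address lines of the 8086 come from the text, while the 24 and 32 lines for the 80286 and 80386 are filled in here to match the 16M and 4G figures. The dictionary and function names are invented for the example.

```python
# Number of address lines per CPU model (8086 figure from the text;
# 80286 and 80386 figures assumed to match the 16M and 4G limits).
ADDRESS_LINES = {"8086/8088": 20, "80286": 24, "80386": 32}


def addressable_bytes(address_lines: int) -> int:
    """Return how many bytes can be addressed with this many address lines."""
    return 2 ** address_lines


if __name__ == "__main__":
    for model, lines in ADDRESS_LINES.items():
        size = addressable_bytes(lines)
        print(f"{model}: {lines} lines -> {size // 2**20} MB")
    # A 16-bit offset limits a segment to 2^16 bytes.
    print(f"max 16-bit segment size: {addressable_bytes(16) // 1024} KB")
```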

4. Sixteen-bit registers (using the 8086, that is, the x86 architecture, as an example)

A register is a storage component inside the CPU; it is not part of the memory address space. Registers exist to reduce the number of times the CPU has to exchange data with memory, which speeds up the computer.
The 8086 CPU has a total of 14 registers, namely AX, BX, CX, DX, SP, BP, SI, DI, IP, FLAGS, CS, DS, SS, ES, and they are all 16 bits wide.

4.1 General purpose registers:

There are 8 general-purpose registers, namely AX, BX, CX, DX, SP, BP, SI, DI.

4.1.1 Data register:

AX, BX, CX and DX are called data registers and can temporarily hold general data. Each also has a special purpose, as follows:

  • AX (Accumulator): the accumulator register. Besides addition, it is used implicitly by multiplication and division. When the operands are 16 bits wide, it is used together with DX, as follows:

    • When doing a division (DIV) operation:
      the divisor can be held in a register or in a memory location and can be 8-bit or 16-bit. The dividend is placed in AX, or in DX and AX together, by default:

      • When the divisor is 8 bits, the dividend must be 16 bits and is placed in AX by default. After the division, the quotient is saved in AL and the remainder in AH.
      • When the divisor is 16 bits, the dividend must be 32 bits. Since AX is a 16-bit register and cannot hold a 32-bit dividend, a second 16-bit register, DX, is used as well: DX holds the high 16 bits of the dividend and AX the low 16 bits. AX is not only used for the dividend: after a division by a 16-bit divisor, the quotient is stored in AX and the remainder in DX.
    • When doing a multiplication (MUL) operation:
      the two operands are either both 8-bit or both 16-bit:

      • If both operands are 8-bit, one is placed in AL by default and the other is in an 8-bit register or a memory byte. The 16-bit result is stored in AX by default.
      • If both operands are 16-bit, one is placed in AX by default and the other is in a 16-bit register or a memory word.
        The result of a 16-bit multiplication is 32 bits wide: the high 16 bits are stored in DX and the low 16 bits in AX by default.
  • BX (Base): base address register;
    BX is mainly used for its special function: holding an offset address, which is combined with a segment register to form a physical address.
    If no segment register is specified, the DS segment register is used by default, so [BX] is equivalent to DS:[BX]:
    segment address × 16 (or left shift by four bits) + offset address = physical address

  • CX (Count): Counter register;
    The C in CX stands for Count, that is, the register acts as a counter. When the LOOP instruction is used, the number of iterations is specified in CX. Each time the CPU executes a LOOP instruction, it does two things:

    • First, it sets CX = CX - 1, that is, the CX counter is automatically decremented by 1;
    • Then it tests the value in CX. If CX is 0, the loop ends and execution continues with the instruction after LOOP. If CX is not 0, execution jumps back to the label named in the LOOP instruction and the loop body runs again.
  • DX (Data): data register;
    it can be used with AX when doing division (DIV) or multiplication (MUL) operations, please refer to the explanation of the detailed usage of AX above.
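The register conventions of DIV, MUL and LOOP described above can be modeled in Python. This is a sketch for illustration, not an emulator: the function names are invented here, only unsigned operations are modeled, and the Python exceptions stand in for the divide error the real CPU raises when the divisor is 0 or the quotient does not fit.

```python
def div8(ax: int, divisor: int) -> tuple[int, int]:
    """DIV with an 8-bit divisor: dividend in AX. Returns (AL, AH)."""
    ax, divisor = ax & 0xFFFF, divisor & 0xFF
    if divisor == 0:
        raise ZeroDivisionError("divide error")
    quotient, remainder = divmod(ax, divisor)
    if quotient > 0xFF:
        raise OverflowError("quotient does not fit in AL")
    return quotient, remainder  # AL = quotient, AH = remainder


def div16(dx: int, ax: int, divisor: int) -> tuple[int, int]:
    """DIV with a 16-bit divisor: dividend in DX:AX. Returns (AX, DX)."""
    dividend = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    divisor &= 0xFFFF
    if divisor == 0:
        raise ZeroDivisionError("divide error")
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFF:
        raise OverflowError("quotient does not fit in AX")
    return quotient, remainder  # AX = quotient, DX = remainder


def mul8(al: int, operand: int) -> int:
    """MUL with 8-bit operands: returns the 16-bit product left in AX."""
    return (al & 0xFF) * (operand & 0xFF)


def mul16(ax: int, operand: int) -> tuple[int, int]:
    """MUL with 16-bit operands: returns (DX, AX), high and low words."""
    product = (ax & 0xFFFF) * (operand & 0xFFFF)
    return (product >> 16) & 0xFFFF, product & 0xFFFF


def loop_iterations(cx: int) -> int:
    """Count how often a loop body runs when LOOP closes it, starting at CX."""
    count = 0
    while True:
        count += 1               # the loop body executes once
        cx = (cx - 1) & 0xFFFF   # LOOP decrements CX (16-bit wraparound)
        if cx == 0:              # CX == 0: fall through, loop ends
            return count


if __name__ == "__main__":
    print(div8(100, 7))                   # (14, 2)
    print(div16(0x0001, 0x0000, 0x0010))  # (4096, 0)
    print(hex(mul8(200, 3)))              # 0x258
    print(mul16(0x1234, 0x0100))          # (18, 13312), i.e. DX=0x0012, AX=0x3400
    print(loop_iterations(3))             # 3
```

One consequence of the decrement-then-test order is that starting a loop with CX = 0 does not skip it: CX wraps to 0xFFFF and the body runs 65536 times.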

4.1.2 Pointer register:

  • SP (Stack Pointer): stack pointer register;
    its segment address is taken from the SS register by default.
    At any time, SS:SP points to the top element of the stack.
  • BP (Base Pointer): base pointer register;
    its segment address is also taken from the SS register by default.
    BP serves as a base address register. Inside a function it usually holds the value SP had on entry, that is, the base address of the function's stack frame.
    SP changes with every instruction that operates on the stack (such as PUSH, CALL, INT, RETF), but BP does not, so a subroutine uses BP to fetch its parameters and to access temporary variables kept on the stack.
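    Each time a subroutine is called, it saves the caller's BP on entry and restores SP and BP before it returns, as follows.
    On function entry:
push bp       ; save the caller's BP
mov bp, sp    ; copy SP into BP; BP now marks the base of this stack frame
; With a near call and 16-bit parameters, [bp] holds the saved BP and [bp+2] the return address,
; so the first parameter is at [bp+4], the second at [bp+6], and the n-th at [bp+2+2*n].
; (In 32-bit code using EBP the slots are 4 bytes wide: [ebp+8], [ebp+12], ..., [ebp+(n+1)*4].)
.....
.....
    On function exit:
mov sp, bp    ; restore SP to its value on entry, discarding any locals
pop bp        ; restore the caller's BP
ret           ; return from the subroutine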

BP is used mainly for locating data on the stack, through the few addressing modes that take it as a base.
For example, after a lot of data or addresses have been pushed onto the stack, you will want to access them. SP cannot serve for this, because it must keep pointing to the top of the stack and cannot be changed arbitrarily. Instead, copy the value of SP into BP and use BP to find the data or addresses on the stack.

SP and BP also count as general-purpose registers and can in many cases hold data temporarily, like the data registers. However, they cannot be split into two 8-bit registers; only the data registers (AX, BX, CX, DX) can.
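How BP keeps a fixed reference point while SP moves can be sketched with a small stack model in Python. Everything here is an assumption made for illustration: the class and method names are invented, the caller is assumed to push 16-bit parameters right to left and to use a near CALL (which pushes a 2-byte return address), and the concrete values are arbitrary. Under those assumptions the first parameter ends up at [BP+4].

```python
class Stack16:
    """Minimal model of a 64KB stack segment with SP and BP registers."""

    def __init__(self, sp: int = 0x0100) -> None:
        self.mem = bytearray(0x10000)
        self.sp = sp
        self.bp = 0

    def push(self, value: int) -> None:
        # PUSH decrements SP by 2, then stores a little-endian word at SS:SP.
        self.sp = (self.sp - 2) & 0xFFFF
        self.mem[self.sp] = value & 0xFF
        self.mem[(self.sp + 1) & 0xFFFF] = (value >> 8) & 0xFF

    def pop(self) -> int:
        # POP reads the word at SS:SP, then increments SP by 2.
        value = self.word_at(self.sp)
        self.sp = (self.sp + 2) & 0xFFFF
        return value

    def word_at(self, offset: int) -> int:
        offset &= 0xFFFF
        return self.mem[offset] | (self.mem[(offset + 1) & 0xFFFF] << 8)


s = Stack16()
s.bp = 0x1234        # caller's BP (arbitrary value)
s.push(0x0022)       # caller pushes the second parameter
s.push(0x0011)       # caller pushes the first parameter
s.push(0x0500)       # near CALL pushes the return address (arbitrary)

s.push(s.bp)         # push bp
s.bp = s.sp          # mov bp, sp
s.push(0xAAAA)       # a temporary: SP moves, BP stays put

first = s.word_at(s.bp + 4)    # [bp+4]
second = s.word_at(s.bp + 6)   # [bp+6]

s.sp = s.bp          # mov sp, bp (drops the temporary)
s.bp = s.pop()       # pop bp
ret_ip = s.pop()     # ret pops the return address

print(hex(first), hex(second), hex(s.bp), hex(ret_ip))
```

The push of the temporary changes SP but not the offsets of the parameters relative to BP, which is the reason for addressing them through BP.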

4.1.3 Index register:

  • SI (Source Index): source index register;
    SI uses the DS segment register by default
  • DI (Destination Index): destination index register;
    DI uses the DS segment register by default (string instructions are the exception: they address the destination as ES:DI)

SI and DI are two 16-bit index registers. They are usually used as pointers, but in many cases they can also hold data temporarily, like the data registers. However, they cannot be split into two 8-bit registers; only the data registers (AX, BX, CX, DX) can.

4.2 Segment registers:

  • CS (Code Segment): code segment register;
  • DS (Data Segment): data segment register;
  • SS (Stack Segment): stack segment register;
  • ES (Extra Segment): extra segment register;

The 16-bit CS, DS, SS and ES registers are segment registers. They indicate which memory is used by the different parts of a program. CS stands for code segment, DS for data segment, SS for stack segment and ES for extra segment. ES is used as a temporary segment register. Details of these registers are described in a later article.

4.3 Control registers:

  • IP (Instruction Pointer): instruction pointer register;
    IP is used together with the CS register to track the address of the next instruction to be executed by the CPU. Normally, as an instruction executes, IP is advanced to point to the next instruction in memory.
  • FLAGS: flag register;
    the FLAGS register stores important information about the result of the previous instruction. Each piece of information is stored as an individual bit of the register. For example, the Z bit (zero flag) is 1 if the result of the previous instruction was 0, and 0 otherwise. Not all instructions modify bits in FLAGS.
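The way a result is summarized as individual flag bits can be illustrated with a 16-bit addition. This is a simplified sketch: the function name is invented, and besides the Z bit mentioned above it reports the carry and sign flags (CF, SF), which are included here only to show that several independent bits are set by one instruction.

```python
def add16_with_flags(a: int, b: int) -> tuple[int, dict[str, int]]:
    """Add two 16-bit values and report a few of the resulting flag bits."""
    total = (a & 0xFFFF) + (b & 0xFFFF)
    result = total & 0xFFFF               # what fits in a 16-bit register
    flags = {
        "ZF": int(result == 0),           # zero flag: result is 0
        "CF": int(total > 0xFFFF),        # carry flag: carry out of bit 15
        "SF": (result >> 15) & 1,         # sign flag: copy of bit 15
    }
    return result, flags


if __name__ == "__main__":
    print(add16_with_flags(1, 2))        # (3, {'ZF': 0, 'CF': 0, 'SF': 0})
    print(add16_with_flags(0xFFFF, 1))   # (0, {'ZF': 1, 'CF': 1, 'SF': 0})
```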

5. Thirty-two-bit registers (80386)

  • The 80386 and later processors have extended registers. For example, the 16-bit AX register is extended to 32 bits. For backward compatibility, AX still refers to the 16-bit register and EAX refers to the extended 32-bit register. AX is the lower 16 bits of EAX, just as AL is the lower 8 bits of AX (and of EAX). There is no way to access the upper 16 bits of EAX directly. The other extended registers are EBX, ECX, EDX, ESI and EDI.
  • Many of the other registers are extended as well. BP becomes EBP; SP becomes ESP; FLAGS becomes EFLAGS and IP becomes EIP. However, unlike the index and general-purpose registers, only the extended forms of these registers are used in 32-bit protected mode.
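The overlap between EAX, AX, AH and AL can be shown with bit masks. The sketch below is illustrative and its function names are invented. It also shows that reaching the upper 16 bits takes an explicit shift, since they have no register name of their own, and that writing AX leaves those upper bits unchanged.

```python
def sub_registers(eax: int) -> dict[str, int]:
    """Return the named views onto a 32-bit EAX value."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF               # AX is the low 16 bits of EAX
    return {"EAX": eax, "AX": ax, "AH": ax >> 8, "AL": ax & 0xFF}


def upper16(eax: int) -> int:
    """The upper half has no name; it must be shifted down to be read."""
    return (eax >> 16) & 0xFFFF


def write_ax(eax: int, value: int) -> int:
    """Writing AX replaces the low 16 bits and keeps the upper 16 bits."""
    return (eax & 0xFFFF0000) | (value & 0xFFFF)


if __name__ == "__main__":
    views = sub_registers(0x12345678)
    print({name: hex(v) for name, v in views.items()})
    print(hex(upper16(0x12345678)))           # 0x1234
    print(hex(write_ax(0x12345678, 0xABCD)))  # 0x1234abcd
```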

Origin blog.csdn.net/MrYushiwen/article/details/122627634