Architecture of the CPU

Tomorrow we will continue with the cache and the topics that follow.

The following is the CPU block diagram.

Let's start with the CPU.

1. Controller

The controller, also called the Control Unit (CU), consists of the following components:

1. Instruction register (IR): stores the instruction currently being executed. To execute an instruction, the CPU fetches it from the cache into the IR, using the instruction address held in the program counter (PC).

An instruction consists of an operation code and an operand address. The operation code is the binary encoding of an operation that assembly language writes as a mnemonic such as mov, add, or jmp; the operand address gives the location in the data cache of the operand the instruction needs.

2. Instruction decoder (ID): decodes the instruction held in the instruction register to determine which operation to perform (the operation code) and where the operand is (the operand address).

3. Timing generator (TG): works like a timetable. It provides the timing marks that each part of the computer needs, and is generally implemented with sequences of timing pulses at different intervals.

4. Operation controller (CU): according to the operations and signals that an instruction requires, it issues sequences of micro-operation commands that control all controlled components and carry out the instruction.

5. Program counter (PC): stores the address of the next instruction to be executed and has a direct path to memory. To execute an instruction, the CPU first fetches it from memory into the instruction register (IR), using the address stored in the PC; this is the "fetch instruction" operation. The PC can increment itself automatically to produce the address of the next instruction, so that instructions are executed one after another.
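
To make the split between operation code and operand address concrete, here is a minimal sketch of what the instruction decoder does. The 16-bit instruction format (4-bit opcode, 12-bit address) and the opcode numbers are assumptions made up for this illustration; real instruction sets encode instructions differently.

```python
# Hypothetical 16-bit instruction format (an assumption for illustration):
# the top 4 bits are the operation code, the low 12 bits the operand address.
OPCODES = {0x1: "mov", 0x2: "add", 0x3: "jmp"}  # made-up encoding


def decode(instruction_word):
    """Split an instruction word into (mnemonic, operand_address)."""
    opcode = (instruction_word >> 12) & 0xF
    operand_address = instruction_word & 0xFFF
    return OPCODES.get(opcode, "unknown"), operand_address


print(decode(0x2034))  # ('add', 52)
```

In this toy format the word 0x2034 decodes to an add whose operand lives at address 0x034.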

2. Arithmetic unit

The arithmetic unit generally includes at least three registers and one arithmetic logic unit (ALU). Modern computers often also have a group of general-purpose registers inside it.

A register is a high-speed storage unit with limited capacity that can temporarily hold instructions, data, and addresses. There are many types of registers; three of them are generally involved in the four basic arithmetic operations: ACC is the accumulator, MQ is the multiplier-quotient register, and X is the operand register. The kind of operand each of the three holds depends on the operation being performed.


Regarding the high-order and low-order parts of a product: in the decimal system, for example, the hundreds digit is higher-order than the tens digit, and the tens digit is lower-order than the hundreds digit. When two 16-bit numbers are multiplied, the result can be up to 32 bits long. The left 16 bits are the high-order half of the product and are stored in ACC; the right 16 bits are the low-order half and are stored in MQ.
The arithmetic logic unit (ALU) is the component that performs arithmetic and logic operations. Arithmetic operations include integer operations such as addition, subtraction, and multiplication. Logic operations include AND, OR, NOT, and XOR, as well as shift, comparison, and transfer operations.
Shift operations move the bits of a value to the left or right. A right shift either replicates the sign bit (a signed, or arithmetic, shift) or fills the vacated bits with zeros (an unsigned, or logical, shift). Shifts are widely used in programs.
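
The product split and the two kinds of right shift can be sketched in a few lines. This is only an illustration with unsigned 16-bit values; the function names are hypothetical, and real hardware does this in circuitry rather than software.

```python
MASK16 = 0xFFFF


def multiply_16(a, b):
    """Multiply two unsigned 16-bit values and return (acc, mq)."""
    product = (a & MASK16) * (b & MASK16)
    acc = (product >> 16) & MASK16  # high 16 bits, as held in ACC
    mq = product & MASK16           # low 16 bits, as held in MQ
    return acc, mq


def logical_shift_right(value, n, width=16):
    """Shift right by n bits, filling the vacated bits with zeros."""
    mask = (1 << width) - 1
    return (value & mask) >> n


def arithmetic_shift_right(value, n, width=16):
    """Shift right by n bits (0 <= n <= width), replicating the sign bit."""
    mask = (1 << width) - 1
    value &= mask
    shifted = value >> n
    if value >> (width - 1):  # sign bit set: fill the top n bits with ones
        shifted |= (mask << (width - n)) & mask
    return shifted


print([hex(x) for x in multiply_16(0x1234, 0x0100)])  # ['0x12', '0x3400']
print(hex(logical_shift_right(0x8000, 4)))            # 0x800
print(hex(arithmetic_shift_right(0x8000, 4)))         # 0xf800
```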

3. Registers

A CPU must have at least six types of registers: Instruction Register (IR), Program Counter (PC), Address Register (AR), Data Register (DR), Accumulator (AC), and Program Status Word Register (PSW).

These registers are used to store small amounts of data for fast use by the CPU.

  1. Data Register
    The Data Register (DR), also called the data buffer register, mainly serves as a transfer station for information passing between the CPU, main memory, and peripherals, compensating for the differences in their operating speeds.
    The data register temporarily holds an instruction or data word read from main memory; conversely, an instruction or data word being written to main memory is also held temporarily in the data register.
    The roles of the data register are:
    (1) To act as a transfer station for information passing between the CPU, main memory, and peripheral devices;
    (2) To compensate for the differences in operating speed between the CPU, main memory, and peripheral devices;
    (3) In an arithmetic unit with a single-accumulator structure, to serve additionally as an operand register.

  2. Instruction Register
    The Instruction Register (IR) holds the instruction currently being executed.
    When an instruction is executed, it is first read from main memory into the data register and then transferred to the instruction register.
    An instruction has two fields: the opcode and the address code. To execute the instruction, the opcode must be examined to identify the required operation; this is the job of the instruction decoder (ID). The instruction decoder decodes the opcode part of the instruction register, generates the control levels for the operation the instruction requires, and sends them to the micro-operation control circuit, which produces the specific operation control signals under the timing signals from the timing component.
    The output of the opcode field in the instruction register is the input to the instruction decoder. Once the opcode is decoded, the signal for that specific operation can be sent to the operation controller.

  3. Program Counter
    The Program Counter (PC) holds the main-memory address of the next instruction.
    Before a program runs, its start address, that is, the address of the main memory unit holding the program's first instruction, must be loaded into the PC, so the PC initially contains the address of the first instruction to be fetched from main memory.
    While an instruction executes, the CPU automatically increments the PC so that it always holds the main-memory address of the next instruction, ready for the next fetch. For a single-word instruction, (PC)+1→PC; for a double-word instruction, (PC)+2→PC; and so on.
    When a branch instruction is encountered, however, the address of the next instruction is specified by the address code field of the branch instruction instead of being obtained by incrementing the PC sequentially.
    The program counter must therefore be able both to register (hold) information and to count.
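
The PC update rule for sequential and branch instructions can be written down as a small function. This is a sketch under the assumption that addresses and instruction lengths are counted in words; the function name and parameters are hypothetical.

```python
def next_pc(pc, instruction_length, branch_target=None):
    """Return the PC value that follows the current instruction.

    instruction_length is in words (1 for a single-word instruction,
    2 for a double-word instruction). branch_target is the address from
    the address code field of a branch instruction, or None otherwise.
    """
    if branch_target is not None:
        return branch_target        # branch: load the target address
    return pc + instruction_length  # sequential: count upward


print(next_pc(100, 1))                    # 101
print(next_pc(100, 2))                    # 102
print(next_pc(100, 1, branch_target=40))  # 40
```

The two return paths correspond to the two functions the PC needs: holding a loaded value and counting.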

  4. Address Register
    The Address Register (AR) holds the address of the main memory unit the CPU is currently accessing.
    Because main memory and the CPU operate at different speeds, the address register must hold the address until the main memory access completes.
    Whenever the CPU and main memory exchange information, that is, when the CPU writes data or instructions to main memory or reads them from it, both the address register and the data register are used.
    If peripheral devices and main memory units share a unified address space, the address register and the data register are also used when the CPU exchanges information with a peripheral device.

  5. Accumulation register
    The accumulation register, usually called simply the accumulator (AC), is a general-purpose register.
    When the ALU of the arithmetic unit performs an arithmetic or logic operation, the accumulator provides it with a work area and can temporarily hold an operand or a result for it.
    The arithmetic unit must therefore contain at least one accumulator.

  6. Program Status Word Register
    The Program Status Word (PSW) represents the current operation status and the working mode of the program.
    The program status word register holds the condition codes set by the results of arithmetic/logic instructions or tests, such as the carry/borrow flag (C), the overflow flag (O), the zero flag (Z), the negative flag (N), and the sign flag (S). Each flag is usually stored in a 1-bit flip-flop.
    The program status word register also holds information such as interrupt and system working status, so that the CPU and the system can keep track of the state of the machine and of the running program.
    The program status word register is therefore a register that holds various status and condition flags.
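
As an illustration of how condition codes come out of an ALU operation, the sketch below adds two 8-bit values and derives the C, Z, N, and O flags. The 8-bit width and the function name are assumptions for this example; the exact flag set and its semantics vary between architectures.

```python
def add_with_flags(a, b, width=8):
    """Add two width-bit values and return (result, flags)."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    a &= mask
    b &= mask
    full = a + b
    result = full & mask
    flags = {
        "C": int(full > mask),              # carry out of the top bit
        "Z": int(result == 0),              # result is zero
        "N": int(bool(result & sign_bit)),  # result is negative (top bit set)
        # Signed overflow: both operands have the same sign and the
        # result's sign differs from it.
        "O": int(bool((a ^ result) & (b ^ result) & sign_bit)),
    }
    return result, flags


print(add_with_flags(0x7F, 0x01))  # (128, {'C': 0, 'Z': 0, 'N': 1, 'O': 1})
print(add_with_flags(0xFF, 0x01))  # (0, {'C': 1, 'Z': 1, 'N': 0, 'O': 0})
```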

The following is the block diagram of the CPU and main memory.

4. MMU

The MMU is part of the CPU.

The role of the MMU is to translate virtual addresses into physical addresses.

The mapping between virtual addresses and physical addresses is recorded in the page table, which is stored in memory.

The TLB is a cache of page table translation results; it reduces the time spent on page table lookups.

On a TLB miss, the table walk unit (TWU) traverses the page table in memory, obtains the physical address, and records the translation in the TLB.
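
The lookup order described above (TLB first, then a page table walk that refills the TLB) can be sketched as follows. The 4 KB page size, the single-level page table, and its contents are assumptions for illustration; real MMUs use multi-level tables and a TLB of limited size.

```python
PAGE_SIZE = 4096  # assumed page size

# Made-up mapping: virtual page number -> physical frame number.
page_table = {0: 7, 1: 3, 2: 9}
tlb = {}  # cached translations, initially empty


def translate(virtual_address):
    """Return (physical_address, tlb_hit) for a virtual address."""
    vpn, offset = divmod(virtual_address, PAGE_SIZE)
    if vpn in tlb:
        frame, hit = tlb[vpn], True
    else:
        # TLB miss: walk the page table in memory. A missing entry
        # (KeyError here) would correspond to a page fault.
        frame = page_table[vpn]
        tlb[vpn] = frame  # record the translation in the TLB
        hit = False
    return frame * PAGE_SIZE + offset, hit
```

Calling translate(4100) twice returns (12292, False) the first time and (12292, True) the second time, because the first call fills the TLB.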

5. Cache

The cache is part of the CPU.

The cache is a small, high-speed memory. It exists because main memory, although fast, is still far too slow compared with the CPU. The fastest cache level can be on the order of a hundred times faster than main memory, with an access speed close to that of the CPU, so data in main memory is loaded into the cache ahead of time for the CPU to use.

The cache is divided into L1, L2, and L3 caches; the L1 cache is further divided into an instruction cache and a data cache.

The L1 and L2 caches are usually private to each CPU core, while the L3 cache is generally shared among multiple cores.
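
A toy model can show why the hierarchy helps: an access is served by the first level that holds the address, and only falls through to main memory when every cache level misses. The cycle counts below are assumed values chosen for illustration, not measurements, and the model ignores capacity limits and eviction.

```python
# Assumed, purely illustrative access costs in CPU cycles.
LATENCY = {"L1": 4, "L2": 12, "L3": 40, "memory": 200}
CACHE_LEVELS = ("L1", "L2", "L3")


def access(address, levels):
    """Look up address in L1, L2, L3 in order, then in main memory.

    levels maps a level name to the set of addresses it currently holds.
    Returns (level_that_served_the_access, total_cycles). Afterwards the
    address is present in every cache level (no eviction is modeled).
    """
    cycles = 0
    served = "memory"
    for name in CACHE_LEVELS:
        cycles += LATENCY[name]
        if address in levels[name]:
            served = name
            break
    else:
        cycles += LATENCY["memory"]  # missed in every cache level
    for name in CACHE_LEVELS:
        levels[name].add(address)
    return served, cycles


levels = {"L1": set(), "L2": set(), "L3": set()}
print(access(0x80, levels))  # ('memory', 256)
print(access(0x80, levels))  # ('L1', 4)
```

The second access to the same address is served from L1, which is the effect the cache is designed to produce.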

The figure below shows the levels of storage in a computer.

1. Registers 2. Cache 3. Memory 4. Hard disk

Two. CPU running process


1. Fetch instructions

The CPU reads the instruction that the program counter points to from the instruction cache into the instruction register (IR).

2. Analyze the instruction and issue control commands

The instruction decoder (ID) analyzes the instruction.

According to the function of the instruction, the operation controller (CU) and the timing generator (TG) issue control commands to the relevant components, which carry them out.

3. Execute instructions

Executing instructions is divided into two steps: fetching operands and performing operations.

Fetching operands: the CPU uses addressing to read the operands from the data cache into registers, where they are held temporarily.

Performing operations: the arithmetic unit operates on the values in the registers as directed by the opcode of the instruction.

4. Update the program counter

Modify the program counter (PC) so that it points to the address of the next instruction, then repeat the steps above until there are no more instructions.
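
The four steps can be put together in a toy accumulator machine. The instruction set (LOAD, ADD, STORE, JMP, HALT) and the tuple encoding are hypothetical and exist only for this sketch. One deviation from the step order above: the sketch advances the PC right after the fetch, so that a jump can simply overwrite it.

```python
def run(program, data):
    """Run a toy accumulator machine until HALT or the end of the program.

    program: list of (opcode, operand) tuples, standing in for the
             instruction cache.
    data:    dict mapping address -> value, standing in for the data cache.
    Returns (accumulator, data).
    """
    pc = 0
    acc = 0
    while pc < len(program):
        ir = program[pc]        # 1. fetch into the instruction register
        opcode, operand = ir    # 2. decode into opcode and operand address
        pc += 1                 # 4. point the PC at the next instruction
        if opcode == "LOAD":    # 3. execute
            acc = data[operand]
        elif opcode == "ADD":
            acc += data[operand]
        elif opcode == "STORE":
            data[operand] = acc
        elif opcode == "JMP":
            pc = operand        # a branch replaces the incremented PC
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return acc, data


program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
print(run(program, {0: 2, 1: 3, 2: 0}))  # (5, {0: 2, 1: 3, 2: 5})
```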

CPU cache sizes

The L1 cache has the smallest capacity, measured in KB, and its size differs little between CPUs.

The L2 cache is typically a single-digit number of MB, although some server CPUs have more than 10 MB.

For the L3 cache, a common CPU has only about 10 MB (AMD EPYC's X series has now reached 768 MB of L3 cache).

The relationship between the CPU and I/O

Computer hardware uses DMA to access disks and other I/O devices: after the CPU issues a request, it no longer manages the transfer; the DMA controller completes the task and then notifies the CPU through an interrupt. A single I/O operation therefore takes very little CPU time. A thread blocked on I/O does not occupy the CPU either, because it is not running and its CPU time is given to other threads and processes. Even so, very frequent I/O still wastes CPU time, so when facing a large number of I/O tasks it is sometimes necessary to merge I/O requests algorithmically or to use a cache to relieve the I/O pressure.
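
One simple way to merge I/O is to coalesce requests that target adjacent or overlapping disk blocks, so that several small requests become one larger one. The sketch below is a minimal illustration of that idea, not how any particular operating system implements it; the (start_block, block_count) request format is an assumption.

```python
def merge_requests(requests):
    """Coalesce (start_block, block_count) requests that touch or overlap.

    Returns a sorted list of merged (start_block, block_count) requests.
    """
    merged = []  # list of [start, end) ranges, kept in ascending order
    for start, count in sorted(requests):
        end = start + count
        if merged and start <= merged[-1][1]:
            # Adjacent to or overlapping the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]


print(merge_requests([(8, 4), (0, 4), (4, 4), (20, 2)]))  # [(0, 12), (20, 2)]
```

Here four requests shrink to two, which means fewer request submissions and fewer completion interrupts for the CPU to handle.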



 


Origin blog.csdn.net/weixin_70280523/article/details/132157339