ch8_2: CPU instruction cycle and pipeline technology

1. Instruction cycle

The instruction cycle is the time the CPU takes to fetch an instruction from main memory, analyze (decode) it, and execute it.

1.1 Instruction cycle

Instruction cycle: the total time the CPU needs to fetch one instruction from memory and execute it.

For example, an ADD instruction takes its operand from a memory unit, adds it to the value in the accumulator (ACC), and saves the result back in ACC.

Instruction fetch cycle: accesses memory once; the instruction is fetched from memory and sent to the CPU.
Execution cycle: also accesses memory once; the operand is fetched from memory and sent to the CPU, which performs the addition.

1.2 The instruction cycle of each instruction is different

Instruction fetch cycle: accesses memory once; the instruction is fetched from memory and sent to the CPU.

Execution cycle: also accesses memory once; the operand is fetched from memory, the corresponding operation is executed, and the result is saved in a register.

1.3 Instruction Cycle with Indirect Addressing

Instruction fetch cycle: accesses memory once; the instruction is fetched from memory and sent to the CPU.

Indirect cycle: accesses memory once to fetch the address of the operand from memory.

Execution cycle: also accesses memory once; the operand is fetched from memory, the corresponding operation is executed, and the result is saved in a register.

1.4 Instruction Cycle with Interrupt Cycle

Interrupt cycle: save the breakpoint, form the entry address of the interrupt service routine, and disable interrupts.

1.5 Instruction Cycle Flow

Depending on the nature of the CPU's memory access, the CPU working cycle is divided into the fetch cycle, the indirect cycle, the execution cycle, and the interrupt cycle.

1.6 Flags of the CPU working cycle

In each phase of the instruction cycle the controller must issue different control commands, so it also needs to know which phase of the instruction cycle it is currently in.

CPU memory accesses are of four kinds, each belonging to one working cycle:

Operation                             Cycle
Fetch an instruction                  Fetch cycle
Fetch an address                      Indirect cycle
Fetch an operand or store a result    Execution cycle
Store the program breakpoint          Interrupt cycle

D flip-flops are used to mark the different phases of the instruction cycle.
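
The flow between the four working cycles, with one flag per cycle, can be sketched as a small state machine. This is only an illustration, not the controller's actual logic: the names `Phase`, `next_phase`, and `phase_flags` are hypothetical, and the sketch assumes the common arrangement in which interrupt requests are checked at the end of the execution cycle.

```python
from enum import Enum


class Phase(Enum):
    FETCH = "fetch cycle"
    INDIRECT = "indirect cycle"
    EXECUTE = "execution cycle"
    INTERRUPT = "interrupt cycle"


def next_phase(phase, uses_indirect_addressing, interrupt_pending):
    """Return the working cycle that follows `phase`."""
    if phase is Phase.FETCH:
        return Phase.INDIRECT if uses_indirect_addressing else Phase.EXECUTE
    if phase is Phase.INDIRECT:
        return Phase.EXECUTE
    if phase is Phase.EXECUTE:
        # Assumption: interrupt requests are sampled when execution ends.
        return Phase.INTERRUPT if interrupt_pending else Phase.FETCH
    # After the interrupt cycle the CPU fetches the first instruction
    # of the interrupt service routine.
    return Phase.FETCH


def phase_flags(phase):
    """Model one flip-flop per cycle: exactly one flag is 1 at a time."""
    return {p.name: int(p is phase) for p in Phase}


phase = Phase.FETCH
for _ in range(4):
    print(phase.value, phase_flags(phase))
    phase = next_phase(phase, uses_indirect_addressing=True, interrupt_pending=True)
```

With both conditions set, the loop walks through all four cycles in order; with both cleared, the machine alternates between the fetch cycle and the execution cycle.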

2. The data flow of the instruction cycle

2.1 Instruction fetch cycle data flow

MAR: holds the address of the memory unit to be accessed.

MDR: holds the data (or instruction) read from memory.

IR: in the fetch cycle, the fetched instruction is stored in the IR, a register dedicated to holding the instruction.

CU: controls the operation of all components. Specifically, the CU sends a read signal to the memory, and it prepares the address of the next instruction by writing PC + 1 back into the PC.
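
The register transfers of the fetch cycle can be traced with a short sketch. It assumes the usual first step, in which the PC is copied into the MAR, and it models registers and memory as plain dictionaries; the function name `fetch_cycle` and the memory contents are hypothetical.

```python
def fetch_cycle(regs, memory):
    """Run one instruction fetch cycle on a dict of registers."""
    regs["MAR"] = regs["PC"]            # PC -> MAR: address of the instruction
    regs["MDR"] = memory[regs["MAR"]]   # CU issues a read: M(MAR) -> MDR
    regs["IR"] = regs["MDR"]            # MDR -> IR: the instruction is latched
    regs["PC"] = regs["PC"] + 1         # PC + 1 -> PC: next instruction address
    return regs


# Hypothetical memory contents: one instruction word per address.
memory = {0: "ADD 6", 1: "STA 7"}
regs = {"PC": 0, "MAR": None, "MDR": None, "IR": None}

fetch_cycle(regs, memory)
print(regs)
```

After the call, the IR holds the instruction that was stored at address 0 and the PC already points at address 1.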

2.2 Indirect cycle data flow

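Once the fetch cycle is complete, the CU checks the contents of the IR to determine whether the instruction uses indirect addressing.

If it does, the MDR still holds the fetched instruction, whose address field points to the memory unit that contains the operand's address.

That address field is sent from the MDR to the MAR, and the memory read that follows places the address of the operand in the MDR.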
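
A minimal sketch of these transfers follows. The instruction is modelled as a dictionary with a hypothetical `addr` field and `indirect` flag, and the memory contents are made up for illustration.

```python
def indirect_cycle(regs, memory):
    """Replace the address field held in the MDR by the operand's address."""
    regs["MAR"] = regs["MDR"]["addr"]   # address field of the MDR -> MAR
    regs["MDR"] = memory[regs["MAR"]]   # CU issues a read: M(MAR) -> MDR
    return regs["MDR"]                  # MDR now holds the operand's address


# Hypothetical state right after the fetch cycle.
instruction = {"op": "ADD", "addr": 5, "indirect": True}
memory = {5: 9, 9: 42}   # unit 5 holds the operand's address, unit 9 the operand
regs = {"MAR": 0, "MDR": instruction, "IR": instruction}

if regs["IR"]["indirect"]:              # the CU inspects the IR
    effective_address = indirect_cycle(regs, memory)
else:
    effective_address = regs["IR"]["addr"]

print(effective_address, memory[effective_address])
```

The extra memory access is what makes an instruction with indirect addressing one cycle longer than the same instruction with direct addressing.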

2.3 Execution cycle data flow

The data flow of the execution cycle differs from instruction to instruction.

2.4 Interrupt cycle data flow

1. Save the breakpoint. The CU supplies the address of the memory unit in which the breakpoint is to be saved; this address is loaded into the MAR, which places it on the address bus. The breakpoint is the location the program must return to after the interrupt, that is, the address of the next instruction to execute once the interrupted program resumes. That address is held in the PC, so the PC value is copied into the MDR and written to memory.
2. Form the entry address of the interrupt service routine. This address is also supplied by the CU and written into the PC.
3. Disable interrupts. This is done by hardware.
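
The three steps can be sketched as register transfers. The function name `interrupt_cycle`, the `EINT` interrupt-enable flag, and the concrete addresses are hypothetical; where the breakpoint is saved (a fixed unit, a stack) depends on the machine and is simply passed in here.

```python
def interrupt_cycle(regs, memory, save_address, isr_entry):
    """Save the breakpoint, load the service routine entry, disable interrupts."""
    regs["MAR"] = save_address          # CU supplies where the breakpoint goes
    regs["MDR"] = regs["PC"]            # PC -> MDR: the breakpoint itself
    memory[regs["MAR"]] = regs["MDR"]   # CU issues a write: MDR -> M(MAR)
    regs["PC"] = isr_entry              # entry address of the service routine -> PC
    regs["EINT"] = 0                    # hardware turns interrupts off
    return regs


memory = {}
regs = {"PC": 12, "MAR": None, "MDR": None, "EINT": 1}

interrupt_cycle(regs, memory, save_address=0, isr_entry=100)
print(regs, memory)
```

The value written to memory is the old PC, so the return from the service routine can reload it and resume the interrupted program at the right instruction.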

3. Instruction pipeline

3.1 How to increase machine speed

    1. Improve memory access speed
      High-speed chips, cache, multi-bank parallel memory

Multi-bank parallelism: within one main memory cycle the CPU accesses several memory banks in an interleaved fashion, so that multiple banks supply data to the CPU.

    2. Improve the transfer speed between I/O and the host
      Interrupts, DMA, channels, I/O processors, multiple buses
    3. Improve the speed of the arithmetic unit
      High-speed chips, improved algorithms, fast carry chains

To raise the operation speed, high-speed chips and fast carry chains can be used, together with improved algorithms and other measures.

  • Improve the processing capacity of the whole machine
      High-speed devices, an improved system structure, and exploitation of the parallelism of the system

3.2 Parallelism of the system

  • Parallelism includes both simultaneity and concurrency:

Simultaneity: two or more events occur at the same instant.
Concurrency: two or more events occur within the same time interval.

  • Levels of parallelism

Process level (between programs or processes): coarse-grained, implemented in software.
Instruction level (between instructions or within an instruction): fine-grained, implemented in hardware.

3.3 Principle of instruction pipeline

  • Serial execution of instructions

Fetching is carried out by the instruction fetch unit and execution by the instruction execution unit. With serial execution, one of the two units is always idle.

  • Two-stage instruction pipeline

If the instruction fetch and execution stages overlap completely in time, the instruction cycle is halved and the speed is doubled.
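
The doubling can be checked with a small timing model. It is a sketch under simple assumptions: the fetch of instruction i+1 overlaps the execution of instruction i, the slower stage sets the pace, and no branches occur. The function names are hypothetical.

```python
def serial_time(n, t_fetch, t_exec):
    """Total time when each instruction is fetched, then executed, one by one."""
    return n * (t_fetch + t_exec)


def two_stage_time(n, t_fetch, t_exec):
    """Total time when fetch of the next instruction overlaps execution."""
    step = max(t_fetch, t_exec)         # the slower stage sets the pace
    # First fetch, then n - 1 overlapped steps, then the last execution.
    return t_fetch + (n - 1) * step + t_exec


n = 1000
print(serial_time(n, 1, 1) / two_stage_time(n, 1, 1))   # equal stages
print(serial_time(n, 1, 2) / two_stage_time(n, 1, 2))   # execution is slower
```

With equal stage times the ratio approaches 2 as n grows. When execution takes longer than fetching, the ratio stays below 2, because the fetch unit spends part of every step waiting.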

3.4 Factors that keep the instruction pipeline from doubling the speed

  • Execution time > fetch time: the fetch unit has to wait for the execution unit, so the two stages cannot overlap completely.

  • Conditional branch instructions: when a conditional branch is encountered, the next instruction is unknown until the end of the execution stage, when the condition has been evaluated and the address of the next instruction can be determined. This wait costs time.

3.5 Six-level pipeline of instructions

Stage   Function
FI      Fetch instruction
DI      Decode instruction
CO      Calculate (form) the operand address
FO      Fetch operand
EI      Execute instruction
WO      Write operand (write back the result)

Writing back the result means writing the result of the operation to the designated register or memory unit.
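
The overlap of the six stages is easiest to see in a space-time diagram. The sketch below prints one, using the conventional mnemonics DI (decode) and EI (execute) for the second and fifth stages and assuming that every stage takes exactly one time period and no instruction stalls. The function name `space_time` is hypothetical.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def space_time(n):
    """Return one row per instruction; column t holds the stage active in period t+1."""
    total_periods = len(STAGES) + n - 1
    rows = []
    for i in range(n):
        row = ["  "] * total_periods
        for offset, stage in enumerate(STAGES):
            row[i + offset] = stage     # instruction i enters one period after i-1
        rows.append(row)
    return rows


for number, row in enumerate(space_time(4), start=1):
    print(f"I{number}: " + " ".join(row))
```

Each row is shifted one period to the right of the row above it, so n instructions finish in 6 + (n - 1) periods instead of 6n.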

4. Factors Affecting Instruction Pipeline Performance

4.1 Structural hazards

Resource conflicts arise when different instructions contend for the same functional unit.

For example, in the fourth time period of the six-stage pipeline, the FO stage of one instruction and the FI stage of another access the memory at the same time.

One way to resolve this memory access conflict is to provide two independent memories, one for instructions and one for operands, so that fetching one instruction and fetching the operand of another can overlap in time without conflict.

Solutions:
• Stall: of the conflicting instructions, one is delayed until the next time period.
• Separate the instruction memory from the data memory (separate instruction cache and data cache).
• Instruction prefetching (suited to the case of a short memory access cycle): the instruction fetch unit uses the idle time of the memory to fetch several instructions in advance and places them in an instruction buffer queue inside the CPU, which reduces conflicts during memory access.
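
The conflict in the fourth time period can be found mechanically. The sketch below assumes a single shared memory, one time period per stage, and that only the FI and FO stages access memory; the function name `memory_conflicts` is hypothetical.

```python
from collections import defaultdict

STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def memory_conflicts(n, memory_stages=("FI", "FO")):
    """List the time periods in which more than one instruction uses memory."""
    users = defaultdict(list)
    for instr in range(1, n + 1):
        for offset, stage in enumerate(STAGES):
            if stage in memory_stages:
                # Instruction `instr` is in this stage during period instr + offset.
                users[instr + offset].append((instr, stage))
    return [(period, who) for period, who in sorted(users.items()) if len(who) > 1]


print(memory_conflicts(4))
# [(4, [(1, 'FO'), (4, 'FI')])]
```

With four instructions in flight, the only clash is in period 4, between the operand fetch of instruction 1 and the instruction fetch of instruction 4. Passing `memory_stages=("FI",)` models a separate data memory and returns no conflicts.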

4.2 Data hazards

Because instructions overlap in the pipeline, the order in which they read and write an operand may differ from the order the program intends.

  • Read after write (RAW):

For a given memory unit or register, the write must complete before the read takes place.
In the subtraction and addition example of the figure, the addition uses the contents of register R1, so R1 must be written first and only then read.

  • Write after read (WAR):

STA stores the contents of R2 into memory unit M, so R2 must be read first;
only then may the addition write its result into R2.
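
Both cases can be detected by comparing the registers each instruction reads and writes. The sketch below is illustrative only: the operand lists of the four instructions are assumptions chosen to match the descriptions above, and `data_hazards` is a hypothetical helper.

```python
def data_hazards(first, second):
    """Hazards that arise when `second` follows `first` in program order."""
    found = []
    if first["writes"] & second["reads"]:
        found.append("RAW")   # second reads what first has to write beforehand
    if first["reads"] & second["writes"]:
        found.append("WAR")   # second overwrites what first still has to read
    return found


# Assumed operands: SUB R1, R2, R3 computes (R2) - (R3) -> R1, and so on.
sub = {"reads": {"R2", "R3"}, "writes": {"R1"}}    # SUB R1, R2, R3
add = {"reads": {"R5", "R1"}, "writes": {"R4"}}    # ADD R4, R5, R1
sta = {"reads": {"R2"}, "writes": {"M"}}           # STA M, R2
add2 = {"reads": {"R4", "R5"}, "writes": {"R2"}}   # ADD R2, R4, R5

print(data_hazards(sub, add))    # ['RAW']
print(data_hazards(sta, add2))   # ['WAR']
```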

Solutions:

  • Backward (stall) method: wait for the previous instruction to finish before executing the subsequent instruction.

  • Bypass (forwarding) technique: this works like a short-circuit connection. Once the operation has produced its result, and before that result is written into the R1 register, a direct path delivers the result to the instruction that needs it.

4.3 Control hazards

Control dependencies are mainly caused by branch instructions.

For an instruction that depends on a condition, the result of the test must be known before the next instruction to execute can be determined.

5. Pipeline performance

5.1 Throughput rate

Throughput rate: the number of instructions completed (or results output) by the pipeline per unit time.

Let the time of each segment of the m-segment pipeline be Δt.
• Maximum throughput:

$$T_{pmax} = \frac{1}{\Delta t}$$

• Actual throughput for n instructions:

$$T_{p} = \frac{n}{m \cdot \Delta t + (n-1) \cdot \Delta t}$$
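
A quick numerical check of the two formulas shows how the actual throughput approaches the maximum as more instructions flow through the pipeline. The function names and the values m = 6, Δt = 1 are chosen only for illustration.

```python
def max_throughput(dt):
    """Maximum throughput: one result per segment time."""
    return 1 / dt


def actual_throughput(n, m, dt):
    """Throughput for n instructions on an m-segment pipeline."""
    return n / (m * dt + (n - 1) * dt)


m, dt = 6, 1.0
for n in (1, 10, 100, 10_000):
    print(n, round(actual_throughput(n, m, dt), 4), "of at most", max_throughput(dt))
```

For a single instruction the throughput is only 1/(m·Δt), since the pipeline gives no overlap; the gap to the maximum comes entirely from the m − 1 periods needed to fill the pipeline.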

5.2 Speedup ratio

The speedup ratio is the ratio of the speed of the m-segment pipeline to the speed of an equivalent non-pipelined unit with the same function.
Let the time of each segment of the pipeline be Δt.

Time to complete n instructions on the m-segment pipeline:
$$T = m \cdot \Delta t + (n-1) \cdot \Delta t$$

Time to complete n instructions on the equivalent non-pipelined unit (written T' here to keep it distinct from the throughput T_p of section 5.1):
$$T' = n \cdot m \cdot \Delta t$$

Speedup ratio:
$$S_{p} = \frac{T'}{T} = \frac{n m \cdot \Delta t}{m \cdot \Delta t + (n-1) \cdot \Delta t} = \frac{nm}{m + n - 1}$$
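
The closed form nm / (m + n − 1) can be evaluated exactly with rational arithmetic. The function name `speedup` and the sample values are chosen only for illustration.

```python
from fractions import Fraction


def speedup(n, m):
    """Speedup of an m-segment pipeline over a non-pipelined unit, n instructions."""
    return Fraction(n * m, m + n - 1)


m = 6
for n in (1, 6, 100, 1_000_000):
    print(n, speedup(n, m), float(speedup(n, m)))
```

A single instruction gains nothing (the ratio is 1), and for large n the ratio approaches the number of segments m, which is the upper bound for a pipeline with equal segment times.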

5.3 Efficiency

Efficiency is the utilization rate of the functional segments of the pipeline.

Because the pipeline needs time to fill (set-up time) and to drain (emptying time), the hardware of each functional segment cannot be busy all the time.
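
As a worked form, assuming the usual space-time-diagram definition (busy area divided by total area of the m segments over the whole run) and the same equal segment time Δt as in sections 5.1 and 5.2, the efficiency E for n instructions would be:

```latex
E = \frac{n \cdot m \cdot \Delta t}{m \cdot \bigl(m \cdot \Delta t + (n-1) \cdot \Delta t\bigr)}
  = \frac{n}{m + n - 1}
  = \frac{S_p}{m}
  = T_p \cdot \Delta t
```

The symbol E is introduced here for illustration. Under these assumptions the efficiency tends to 1 as n grows, because the fill and drain periods become negligible compared with the total run time.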

6. Multi-issue technology of pipeline

6.1 Superscalar technology

6.2 Super-pipeline technology

6.3 Very long instruction word (VLIW) technology

7. Pipeline structure

7.1 Instruction Pipeline

7.2 Arithmetic pipeline

Origin blog.csdn.net/chumingqian/article/details/131256355