1. Instruction cycle
The instruction cycle is the time the CPU needs to fetch an instruction from main memory, analyze (decode) it, and execute it.
1.1 Instruction cycle
Instruction cycle: the total time the CPU needs to fetch one instruction from memory and execute it.
For example, an ADD instruction takes an operand out of a memory unit, adds it to the value in the ACC register, and saves the result back in the ACC register.
Instruction fetch cycle: memory is accessed once; the instruction is fetched from memory and sent to the CPU.
Execution cycle: memory is accessed once more; the operand is taken from memory and sent to the CPU, which performs the addition.
1.2 The instruction cycle of each instruction is different
Instruction fetch cycle: memory is accessed once; the instruction is fetched from memory and sent to the CPU.
Execution cycle: memory is accessed once more; the operand is fetched from memory, the corresponding operation is executed, and the result is saved in a register.
1.3 Instruction Cycle with Indirect Addressing
Instruction fetch cycle: memory is accessed once; the instruction is fetched from memory and sent to the CPU.
Indirect cycle: memory is accessed once to fetch the address of the operand.
Execution cycle: memory is accessed once more; the operand is fetched from memory, the corresponding operation is executed, and the result is saved in a register.
1.4 Instruction Cycle with Interrupt Cycle
Interrupt cycle: save the breakpoint, form the entry address of the interrupt service routine, and disable interrupts.
1.5 Instruction Cycle Flow
According to the kind of memory access the CPU performs, the CPU working cycle can be divided into the fetch cycle, the indirect cycle, the execution cycle, and the interrupt cycle.
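The flow through the four working cycles can be sketched in a few lines of Python. This is only an illustration: the function name `cycle_sequence` and its two boolean parameters are assumptions made here, not part of the original notes.

```python
def cycle_sequence(indirect, interrupt_pending):
    """Return, in order, the working cycles one instruction passes through."""
    cycles = ["fetch"]                 # every instruction starts with a fetch cycle
    if indirect:
        cycles.append("indirect")      # only for indirect addressing
    cycles.append("execute")           # every instruction is executed
    if interrupt_pending:
        cycles.append("interrupt")     # only when an interrupt request is pending
    return cycles

print(cycle_sequence(indirect=False, interrupt_pending=False))
print(cycle_sequence(indirect=True, interrupt_pending=True))
```

The indirect and interrupt cycles are optional, which is why the length of the instruction cycle varies from one instruction to the next.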
1.6 Flags of the CPU working cycle
In each stage of the instruction cycle the controller must issue different control commands, so the controller needs to know which stage of the instruction cycle it is currently in.
CPU memory accesses fall into four kinds:
Operation | Cycle phase |
---|---|
fetch instruction | fetch cycle |
fetch address | indirect cycle |
fetch operand or store result | execution cycle |
save program breakpoint | interrupt cycle |
D flip-flops are used to mark the different stages of the instruction cycle.
2. The data flow of the instruction cycle
2.1 Instruction fetch cycle data flow
MAR: holds the address used to access memory.
MDR: holds the data (or instruction) fetched from memory.
IR: in the instruction fetch cycle, the fetched instruction is stored in the IR, a register dedicated to holding the current instruction.
CU: controls the operation of all components; specifically, the CU sends the read signal to memory.
The CU also prepares the address of the next instruction by writing the value PC+1 back into the PC.
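The register transfers of the fetch cycle can be traced with a small simulation. This is a minimal sketch under stated assumptions: registers are modelled as a Python dict, memory as a dict from address to contents, and the memory contents shown are hypothetical.

```python
def fetch_cycle(regs, memory):
    """Apply the fetch-cycle register transfers in order."""
    regs["MAR"] = regs["PC"]            # PC -> MAR
    regs["MDR"] = memory[regs["MAR"]]   # CU issues a read; M(MAR) -> MDR
    regs["IR"] = regs["MDR"]            # MDR -> IR
    regs["PC"] = regs["PC"] + 1         # (PC) + 1 -> PC

memory = {100: "ADD 200", 101: "STA 201"}   # hypothetical program
regs = {"PC": 100, "MAR": None, "MDR": None, "IR": None}
fetch_cycle(regs, memory)
print(regs)
```

After the call, IR holds the instruction that was at address 100 and PC already points at the next instruction.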
2.2 Indirect cycle data flow
Once the fetch cycle is complete, the CU checks the contents of the IR to determine whether the instruction uses indirect addressing.
If it does, the address held in the MDR is sent to the MAR, the memory is read, and the address of the operand is returned in the MDR.
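A rough sketch of the indirect cycle follows. It assumes a hypothetical `ADDR` entry that stands for the address field of the instruction, and a dict-based memory whose contents are invented for illustration.

```python
def indirect_cycle(regs, memory):
    """Replace the address field with the effective address it points to."""
    regs["MAR"] = regs["ADDR"]          # address field of the instruction -> MAR
    regs["MDR"] = memory[regs["MAR"]]   # CU issues a read; M(MAR) -> MDR
    regs["ADDR"] = regs["MDR"]          # MDR now holds the address of the operand

memory = {200: 300, 300: 42}            # hypothetical: cell 200 points at cell 300
regs = {"ADDR": 200, "MAR": None, "MDR": None}
indirect_cycle(regs, memory)
print(regs["ADDR"], memory[regs["ADDR"]])
```

The extra memory access is what distinguishes this cycle: the first read yields an address, and only the execution cycle that follows reads the operand itself.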
2.3 Execution cycle data flow
The execution-cycle data flow differs from instruction to instruction.
2.4 Interrupt cycle data flow
1. Save the breakpoint. The CU determines the memory address at which the program breakpoint is saved and places that address in the MAR, which drives the address bus. The breakpoint is the location the program must return to after the interrupt, that is, the address of the next instruction to execute once the interrupted program resumes. This address is held in the PC, and the PC sends it to the MDR so that it can be written to memory.
2. Form the entry address of the interrupt service routine. This address is also supplied by the CU and written into the PC.
3. Disable interrupts; this is done by hardware.
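The three steps above can be sketched as register transfers. The parameter names `save_addr` and `isr_entry` and the `IE` (interrupt enable) flag are assumptions introduced for this illustration; in real hardware the CU supplies both addresses.

```python
def interrupt_cycle(regs, memory, save_addr, isr_entry):
    """Save the breakpoint, load the service routine entry, disable interrupts."""
    regs["MAR"] = save_addr             # CU supplies the address for the breakpoint
    regs["MDR"] = regs["PC"]            # PC (the breakpoint) -> MDR
    memory[regs["MAR"]] = regs["MDR"]   # CU issues a write; MDR -> M(MAR)
    regs["PC"] = isr_entry              # entry address of the service routine -> PC
    regs["IE"] = 0                      # hardware disables interrupts

memory = {}
regs = {"PC": 101, "MAR": None, "MDR": None, "IE": 1}
interrupt_cycle(regs, memory, save_addr=0, isr_entry=500)
print(regs, memory)
```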
3. Instruction pipeline
3.1 How to increase machine speed
- Improve memory access speed
High-speed chips, cache, multi-bank parallelism.
Multi-bank parallelism: within one main-memory cycle the CPU accesses several memory banks in an interleaved fashion, so that multiple banks supply data to the CPU.
- Increase the transfer speed between I/O and the host
Interrupts, DMA, channels, I/O processors, multiple buses.
- Improve the speed of the arithmetic unit
High-speed chips, improved algorithms, fast carry chains.
- Improve the processing capacity of the whole machine
Use high-speed devices, improve the system architecture, and exploit the parallelism of the system.
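To make multi-bank interleaving concrete, here is a minimal sketch that maps an address to a bank. It assumes low-order interleaving with four banks, which is one common scheme and not something the notes specify; the function name `locate` is hypothetical.

```python
def locate(address, n_banks=4):
    """Low-order interleaving: return (bank number, offset inside that bank)."""
    return address % n_banks, address // n_banks

# Consecutive addresses fall into consecutive banks, so they can be
# accessed in overlapping fashion within one main-memory cycle.
print([locate(a) for a in range(8)])
```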
3.2 Parallelism of the system
- Parallelism covers both simultaneity and concurrency:
Simultaneity: two or more events occur at the same instant.
Concurrency: two or more events occur within the same time interval.
- Levels of parallelism
Process level (between programs or processes): coarse-grained, implemented in software.
Instruction level (between instructions or within an instruction): fine-grained, implemented in hardware.
3.3 Principle of instruction pipeline
- Serial execution of instructions
The instruction fetch unit fetches an instruction and only then does the execution unit execute it, so one of the two units is always idle.
- Two-stage instruction pipeline
If the instruction fetch and execution stages overlap completely in time, the instruction cycle is halved and the speed is doubled.
3.4 Factors that keep the pipeline from doubling efficiency
- Execution time > fetch time: when execution takes longer than fetching, the two stages cannot overlap completely.
- Conditional branch instructions: when a conditional branch is encountered, the next instruction is unknown. The pipeline must wait until the end of the execution stage to learn whether the condition holds and so determine the address of the next instruction, which costs time.
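The effect of unequal stage times on a two-stage pipeline can be estimated with a short calculation. This is a sketch under stated assumptions: a buffer sits between the fetch and execute units, each stage time is constant, and branches are not modelled. The function names are hypothetical.

```python
def serial_time(n, t_fetch, t_exec):
    """Total time when each instruction is fetched and then executed in turn."""
    return n * (t_fetch + t_exec)

def two_stage_time(n, t_fetch, t_exec):
    """Total time when fetch and execute overlap; the slower stage sets the pace."""
    step = max(t_fetch, t_exec)
    return t_fetch + (n - 1) * step + t_exec

for t_exec in (1, 2):
    s = serial_time(100, 1, t_exec)
    p = two_stage_time(100, 1, t_exec)
    print(f"t_exec={t_exec}: serial={s}, pipelined={p}, ratio={s / p:.2f}")
```

With equal stage times the ratio comes close to 2; when execution takes twice as long as fetching, it drops to roughly 1.5.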
3.5 Six-level pipeline of instructions
Stage | Function |
---|---|
FI | fetch instruction |
DI | decode instruction |
CO | calculate operand address |
FO | fetch operand |
EI | execute instruction |
WO | write operand (write back the result) |
Writing back the result means writing the result of the operation to a given register or to a given memory unit.
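A space-time view of the six-stage pipeline can be generated with a few lines of code. The sketch assumes the conventional stage abbreviations FI, DI, CO, FO, EI, WO, one time slot per stage, and no stalls; the function name `schedule` is hypothetical.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]

def schedule(n_instructions):
    """Map each time slot to the {instruction number: stage} pairs active in it."""
    slots = {}
    for i in range(n_instructions):
        for s, stage in enumerate(STAGES):
            # Instruction i+1 enters stage s in time slot i+s+1 (slots start at 1).
            slots.setdefault(i + s + 1, {})[i + 1] = stage
    return slots

table = schedule(4)
for t in sorted(table):
    print(t, table[t])
```

Without stalls, n instructions occupy 6 + n - 1 time slots, compared with 6n slots for serial execution.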
4. Factors Affecting Instruction Pipeline Performance
4.1 Structural hazards
Resource conflicts arise when different instructions contend for the same functional unit.
As shown in the figure, in the fourth time period the FO stage of one instruction and the FI stage of another access the memory at the same time.
One way to resolve such memory access conflicts is to provide two independent memories, one for operands and one for instructions, so that fetching an instruction and fetching the operand of another instruction can overlap in time without conflict.
Solutions:
• Stall: one of the conflicting instructions is paused until the next time period.
• Separate the instruction memory from the data memory, or use separate instruction and data caches.
• Instruction prefetching (suitable when the memory access cycle is short): the instruction fetch unit uses idle memory time to fetch several instructions in advance and places them in an instruction buffer queue inside the CPU, which reduces conflicts during memory access.
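The memory conflict described above can be located mechanically. This sketch assumes a six-stage pipeline with the conventional stage names, that only FI and FO touch memory (WO is taken to write a register), and a single shared memory port. The function name `memory_conflicts` is hypothetical.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]
MEMORY_STAGES = {"FI", "FO"}   # assumption: WO writes a register, not memory

def memory_conflicts(n):
    """List the time slots in which more than one instruction accesses memory."""
    conflicts = []
    for t in range(1, n + len(STAGES)):          # slots 1 .. n + 5
        users = []
        for i in range(n):                       # instruction i+1 starts in slot i+1
            idx = t - 1 - i                      # stage index of that instruction
            if 0 <= idx < len(STAGES) and STAGES[idx] in MEMORY_STAGES:
                users.append((i + 1, STAGES[idx]))
        if len(users) > 1:
            conflicts.append((t, users))
    return conflicts

print(memory_conflicts(4))
```

For four instructions the only conflict is in slot 4, where instruction 1 is in FO and instruction 4 is in FI, which matches the case described in the text.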
4.2 Data hazards
Because instructions overlap in the pipeline, the order of read and write accesses to an operand may change.
- Read after write (RAW):
For a given storage unit or register, the write must complete before the read takes place.
The subtraction and addition instructions in the figure are an example: one instruction uses the contents of R1 that the other produces, so R1 must be written first and read afterwards.
- Write after read (WAR):
STA stores the content of R2 into memory unit M, so R2 must be read first; only then may the write that stores the result of the addition into R2 take place.
Solutions:
- Stalling (postponement): wait until the previous instruction has finished before executing the following instruction.
- Bypassing: this works like a short-circuit path. After the addition completes, and before the result is written into the R1 register, the dependent instruction takes the result directly over the bypass path.
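The two kinds of data hazard can be detected by comparing the registers each instruction reads and writes. This is a minimal sketch: the instruction encodings below are hypothetical, since the figure from the original notes is not available, and the function name `classify` is an assumption.

```python
def classify(first, second):
    """Return the data hazards between two instructions issued in this order."""
    hazards = []
    if first["writes"] & second["reads"]:
        hazards.append("RAW")   # second reads what first writes
    if first["reads"] & second["writes"]:
        hazards.append("WAR")   # second writes what first still has to read
    return hazards

# Hypothetical instructions, described by the registers they read and write.
add = {"reads": {"R2", "R3"}, "writes": {"R1"}}      # ADD R1, R2, R3
sub = {"reads": {"R1", "R5"}, "writes": {"R4"}}      # SUB R4, R1, R5
sta = {"reads": {"R2"}, "writes": set()}             # STA M, R2
add2 = {"reads": {"R4", "R5"}, "writes": {"R2"}}     # ADD R2, R4, R5

print(classify(add, sub))    # the SUB needs R1 produced by the ADD
print(classify(sta, add2))   # the ADD overwrites R2 that STA must read first
```

A pipeline controller that finds a RAW hazard can either stall the second instruction or forward the result over a bypass path, as described above.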
4.3 Control hazards
Control hazards are mainly caused by branch (transfer) instructions.
For a conditional branch, the pipeline must wait for the result of the condition test before it knows which instruction to execute next.
5. Pipeline performance
5.1 Throughput rate
Throughput: the number of instructions completed (or results output) by the pipeline per unit time.
Let each stage of an m-stage pipeline take time Δt, and let n be the number of instructions executed.
• Maximum throughput:
$$T_{pmax} = \frac{1}{\Delta t}$$
• Actual throughput:
$$T_{p} = \frac{n}{m \cdot \Delta t + (n-1) \cdot \Delta t}$$
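The two throughput formulas are easy to evaluate numerically. The sketch below assumes equal stage times and no stalls, as in the text; the function names and the sample values m = 6, Δt = 1 are chosen only for illustration.

```python
def max_throughput(dt):
    """Upper bound: one instruction completes every stage time dt."""
    return 1 / dt

def actual_throughput(n, m, dt):
    """n instructions on an m-stage pipeline, each stage taking dt."""
    return n / (m * dt + (n - 1) * dt)

for n in (1, 10, 100, 10000):
    print(n, actual_throughput(n, m=6, dt=1.0), max_throughput(1.0))
```

The actual throughput approaches the maximum as n grows, because the fill time of the pipeline is spread over more instructions.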
5.2 Speedup ratio
The speedup is the ratio of the speed of an m-stage pipeline to the speed of an equivalent non-pipelined unit with the same function.
Let each stage of the pipeline take time Δt.
Time to complete n instructions on the m-stage pipeline:
$$T = m \cdot \Delta t + (n-1) \cdot \Delta t$$
Time to complete n instructions on the equivalent non-pipelined unit:
$$T_{0} = n m \cdot \Delta t$$
Speedup:
$$S_{p} = \frac{T_{0}}{T} = \frac{n m \cdot \Delta t}{m \cdot \Delta t + (n-1) \cdot \Delta t} = \frac{n m}{m + n - 1}$$
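As a rough check on the speedup formula, and still assuming equal stage times Δt, the limit for a large number of instructions and one numeric case work out as follows. The values m = 6 and n = 100 are chosen only for illustration.

$$\lim_{n \to \infty} S_{p} = \lim_{n \to \infty} \frac{n m}{m + n - 1} = m$$

$$S_{p}\Big|_{m=6,\ n=100} = \frac{6 \cdot 100}{6 + 100 - 1} = \frac{600}{105} \approx 5.71$$

The speedup can approach the number of stages m but cannot exceed it.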
5.3 Efficiency
Efficiency: the utilization of the functional stages in the pipeline.
Because the pipeline needs time to fill (start-up time) and time to drain, the hardware of each functional stage cannot be busy all the time.
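Under the same assumptions used for throughput and speedup (m stages, each taking Δt, and n instructions), the efficiency can be sketched as the ratio of the busy area to the total area of the space-time diagram. The symbol E is introduced here for illustration and does not appear in the original notes.

$$E = \frac{n \cdot m \cdot \Delta t}{m \cdot (m + n - 1) \cdot \Delta t} = \frac{n}{m + n - 1}$$

For m = 6 and n = 100 this gives 100/105 ≈ 0.95, and E approaches 1 as n grows.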