[Principles of Computer Organization] Central Processing Unit

The basic structure and functions of the CPU

A central processing unit (CPU) consists of an arithmetic unit and a controller. The controller coordinates the components of the computer so that they carry out the program's instruction sequence, which includes fetching, analyzing, and executing instructions; the arithmetic unit processes data. The specific functions of the CPU are:

  1. Instruction control. Fetching, analyzing, and executing instructions, i.e., controlling the sequence of the program.
  2. Operation control. The function of an instruction is usually carried out by a combination of several operation signals. The CPU manages and generates the operation signals for each instruction fetched from memory and sends them to the relevant components, so that those components act as the instruction requires.
  3. Timing control. Controlling the timing of the various operations, i.e., providing each instruction with the proper control signals in the proper time order.
  4. Data processing. Performing arithmetic and logic operations on data.
  5. Interrupt handling. Handling the abnormal conditions and special requests that arise while the computer is running.

 

1. Arithmetic unit

  The arithmetic unit receives commands from the controller and performs the corresponding actions to process data. It is the data-processing center of the computer and consists mainly of an arithmetic logic unit (ALU), temporary registers, an accumulator (ACC), general-purpose registers, a program status word register (PSW), a shifter, a counter (CT), and so on.

  1. Arithmetic logic unit. Its main function is to perform arithmetic and logic operations.
  2. Temporary register. Temporarily holds data read from main memory. Such data cannot be placed in a general-purpose register, because that would destroy the register's original contents. Temporary registers are transparent to the application programmer.
  3. Accumulator. A general-purpose register that temporarily holds the result of an ALU operation and can also serve as an input to the adder.
  4. General-purpose registers. Examples are AX, BX, CX, DX, and SP (also commonly written R0, R1, ...). They hold operands (source operands, destination operands, and intermediate results) and various kinds of address information. SP is the stack pointer, which indicates the address of the top of the stack.
  5. Program status word register. Holds the status information set by the results of arithmetic or logic instructions or by test instructions, such as the overflow flag (OF), sign flag (SF), zero flag (ZF), and carry flag (CF). The bits of the PSW take part in, and determine, the formation of micro-operations.
  6. Shifter. Shifts operands or operation results.
  7. Counter. Controls the number of steps in multiplication and division.
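  To illustrate how the PSW flags in item 5 could be set, here is a minimal Python sketch of an 8-bit addition that derives CF, ZF, SF, and OF from the result. The 8-bit width and the name `add8` are assumptions made for this example; the text does not fix a word length.

```python
def add8(a: int, b: int) -> tuple[int, dict[str, int]]:
    """Add two 8-bit values and return (result, flags). Illustrative only."""
    a &= 0xFF
    b &= 0xFF
    raw = a + b                # may need 9 bits
    result = raw & 0xFF        # what the 8-bit destination register keeps
    flags = {
        "CF": int(raw > 0xFF),         # carry out of bit 7
        "ZF": int(result == 0),        # result is zero
        "SF": result >> 7,             # sign bit of the result
        # Signed overflow: both operands have the same sign, and the
        # result's sign differs from it.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags


# 0x7F + 0x01 overflows the signed range but produces no carry.
print(add8(0x7F, 0x01))
# 0xFF + 0x01 wraps to zero with a carry and no signed overflow.
print(add8(0xFF, 0x01))
```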

2. Controller

  The controller is the command center of the whole system. Under its control, the arithmetic unit, memory, input/output devices, and other functional components form an organic whole and work in coordination as the instructions require. The basic function of the controller is to execute instructions; each instruction is carried out through a set of micro-operations issued by the controller.

  There are two types of controller: hardwired and microprogrammed. The controller consists of a program counter (PC), an instruction register (IR), an instruction decoder, a memory address register (MAR), a memory data register (MDR), a timing system, and a micro-operation signal generator.

  1. Program counter. Holds the main memory address of the next instruction. The CPU fetches the instruction from main memory according to the contents of the PC. The PC can increment itself.
  2. Instruction register. Holds the instruction currently being executed.
  3. Instruction decoder. Decodes only the opcode field and supplies the controller with a signal identifying the specific operation.
  4. Memory address register. Holds the address of the main memory unit to be accessed.
  5. Memory data register. Holds information to be written to main memory or information read from main memory.
  6. Timing system. Generates the various timing signals, all of which are derived by dividing a single clock (CLOCK).
  7. Micro-operation signal generator. Based on the contents of the IR (the instruction), the contents of the PSW (status information), and the timing signals, it generates the control signals needed to control the whole computer system. It comes in two structural types: combinational logic and storage logic.

  The controller works as follows: according to the instruction opcode, the step of instruction execution (the micro-command sequence), and the condition signals, it forms the control signals for the components the computer needs at that moment. All hardware components work in coordination under these control signals to produce the intended result.

  The CPU's internal registers fall into two categories. User-visible registers can be programmed, for example the general-purpose registers and the program status word register. User-invisible registers are transparent to the user and cannot be programmed, for example the memory address register, the memory data register, and the instruction register.

 

The number of bits of a CPU is the number of data bits it can process at once, which is the same as the width of the processor's data bus.

The width of the program counter depends on the capacity of memory.

The width of the instruction register depends on the instruction word length.

The width of the general-purpose registers depends on the machine word length.

An instruction contains an opcode field and an address field, but the instruction decoder decodes only the opcode field, which determines the operation the instruction performs.

The address decoder is part of a memory such as main memory. Its role is to select the unique memory cell that corresponds to the input address. It is not part of the CPU.

At the end of the indirect cycle, the operand's effective address has been read from memory into the MDR.

 

Instruction execution process

  The total time the CPU needs to fetch one instruction from main memory and execute it is called the instruction cycle, i.e., the time the CPU takes to complete one instruction.

  An instruction cycle is usually expressed as a number of machine cycles, and a machine cycle consists of a number of clock cycles (also called beats or T cycles, the basic time unit of CPU operation). The number of machine cycles in an instruction cycle may vary, and so may the number of beats in each machine cycle.

 

  An unconditional jump instruction JMP X does not need to access main memory during execution, so it has only a fetch stage (including fetch and analysis) and an execution stage; its instruction cycle contains only a fetch cycle and an execution cycle. For an instruction that uses indirect addressing, fetching the operand requires one access to main memory to obtain the effective address and then another access to obtain the operand, so its instruction cycle must also include an indirect cycle, which lies between the fetch cycle and the execution cycle.

  A complete instruction cycle comprises four cycles: fetch, indirect, execute, and interrupt. All four cycles access memory, but for different purposes: the fetch cycle fetches the instruction, the indirect cycle fetches the effective address, the execution cycle fetches the operand, and the interrupt cycle saves the program breakpoint. The push in the interrupt cycle decrements SP by 1, which is the opposite of a push in the traditional sense: the computer's stack grows toward lower addresses, so a push subtracts 1 rather than adding 1.

 

The task of the fetch cycle is to fetch the instruction code from main memory at the address held in the PC and store it in the IR. As the instruction is fetched, the PC is incremented.

The task of the indirect cycle is to obtain the operand's effective address. The address field of the instruction is sent to the MAR and then onto the address bus; the CU then sends a read command to memory, and the effective address obtained is stored in the MDR.

The task of the execution cycle is to have the ALU produce a result according to the opcode and operands of the instruction word in the IR. Different instructions perform different operations in the execution cycle.

The task of the interrupt cycle is to handle the interrupt request.
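The fetch, indirect, and execution cycles described above can be traced with a small Python sketch. It is a toy model under stated assumptions: instructions are hypothetical `(opcode, address, indirect)` tuples, only `LOAD` and `JMP` are modeled, and the interrupt cycle is left out.

```python
def run_instruction(mem, regs):
    """Run one instruction and return the list of cycles it went through."""
    cycles = ["fetch"]
    # Fetch cycle: the instruction at (PC) goes to IR, then PC is incremented.
    regs["MAR"] = regs["PC"]
    regs["MDR"] = mem[regs["MAR"]]
    regs["IR"] = regs["MDR"]
    regs["PC"] += 1

    op, addr, indirect = regs["IR"]
    if indirect:
        # Indirect cycle: one extra memory read brings the effective
        # address into MDR.
        cycles.append("indirect")
        regs["MAR"] = addr
        regs["MDR"] = mem[regs["MAR"]]
        addr = regs["MDR"]

    cycles.append("execute")
    if op == "JMP":
        regs["PC"] = addr              # second change to PC, no memory access
    elif op == "LOAD":
        regs["MAR"] = addr
        regs["MDR"] = mem[regs["MAR"]] # memory read for the operand
        regs["ACC"] = regs["MDR"]
    else:
        raise ValueError(f"unknown opcode {op!r}")
    return cycles


mem = {0: ("LOAD", 10, True), 1: ("JMP", 0, False), 10: 20, 20: 99}
regs = {"PC": 0, "ACC": 0, "MAR": 0, "MDR": 0, "IR": None}
print(run_instruction(mem, regs), regs["ACC"])  # indirect LOAD
print(run_instruction(mem, regs), regs["PC"])   # JMP back to address 0
```

In this model the indirect `LOAD` reads memory three times (instruction, effective address, operand), while `JMP` reads it only once, during the fetch cycle.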

 

Instruction execution schemes

  1. Single-cycle scheme. Every instruction is given the same execution time. Each instruction completes within one fixed clock cycle, and instructions execute serially: the next instruction can start only after the previous one has finished. The clock cycle is therefore determined by the instruction with the longest execution time.

  2. Multi-cycle scheme. Different types of instructions use different execution steps. Instructions still execute serially: the next instruction can start only after the previous one has finished. However, different instructions may take different numbers of clock cycles; an instruction is allotted as many cycles as it needs, rather than all instructions being required to take the same execution time.

  3. Pipelined scheme. Instructions may execute in parallel. The goal is to complete one instruction per clock cycle (achievable only in the ideal case). One instruction is started in each clock cycle, so that as many instructions as possible are running at once, each at a different step.
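  A rough way to compare the three schemes is to estimate the total time each would need for the same short program. The sketch below is illustrative only: the latencies, the 2 ns clock, the five pipeline stages, and the absence of stalls are all assumptions chosen for the example.

```python
def single_cycle_time(latencies_ns):
    """Every instruction takes one clock cycle as long as the slowest one."""
    return len(latencies_ns) * max(latencies_ns)


def multi_cycle_time(cycles_per_instr, clock_ns):
    """Each instruction takes only as many short clock cycles as it needs."""
    return sum(cycles_per_instr) * clock_ns


def ideal_pipeline_time(n_instr, n_stages, clock_ns):
    """Ideal pipeline: the first instruction needs n_stages cycles, and
    each later instruction completes one cycle after the previous one."""
    return (n_stages + n_instr - 1) * clock_ns


# Four instructions needing 3, 5, 4, and 4 steps of 2 ns each.
steps = [3, 5, 4, 4]
clock_ns = 2
print(single_cycle_time([s * clock_ns for s in steps]))  # 40
print(multi_cycle_time(steps, clock_ns))                 # 32
print(ideal_pipeline_time(len(steps), 5, clock_ns))      # 16
```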

 

Instructions are always read from main memory according to the program counter.

During the instruction cycle of an unconditional jump instruction, the value of the program counter (PC) is modified twice: first, at the end of the fetch cycle the PC is automatically incremented; second, in the execution cycle the PC is set to the jump target address. The controller can tell whether the content of a storage unit is an instruction or data.

The fetch operation is performed automatically; the controller does not need a corresponding instruction to trigger it.

For instructions of different lengths, the fetch operation may differ.

 

The basic structure and function of the data path

  The path along which data is transferred between functional components is called the data path. The paths that carry data between the arithmetic unit and the registers form the CPU's internal data path. A description of a data path states where the information starts, which registers or multiplexers it passes through on the way, and which register it finally reaches; all of these must be controlled. Establishing the data path is the task of the "operation control unit". The function of the data path is to exchange data among the CPU's internal registers and between the registers and the arithmetic unit.

 

The basic structures of the data path are as follows:

  1. Single internal CPU bus. The inputs and outputs of all registers are connected to one common path. This structure is relatively simple, but data transfers conflict more often.
  2. Multiple internal CPU buses. The inputs and outputs of all registers are connected to several common paths, so different data can be transferred simultaneously on different buses, which improves efficiency.
  3. Dedicated data paths. Connecting lines are laid out according to the flow of data and addresses during instruction execution, which avoids a shared bus. Performance is higher, but much more hardware is needed.

 

An internal bus connects parts within the same component, for example the bus that connects the registers and the arithmetic unit inside the CPU. A system bus connects the components of the same computer system to one another, for example the CPU, memory, channels, and the various I/O interfaces.

Data transfers between CPU registers are completed over the internal bus.

Data transfers between the CPU and main memory also rely on the CPU's internal bus.

When an arithmetic or logic operation is performed, both operands must be valid at the two inputs of the ALU at the same time (for example, the two addends of an addition), because the ALU is a combinational circuit with no internal storage.

 

In a single-bus CPU, only one input of the ALU can be connected directly to the bus; the other input must be connected to the bus through a register. Otherwise both inputs would receive the same data at the same time, and the data path could not work properly.

 

Micro-operation sequence that implements ADD R1, (R2):

(PC) → MAR

M(MAR) → MDR

(PC) + 1 → PC

MDR → IR

R1 → LA

(R2) → MAR

M(MAR) → MDR

MDR → LB

(LA) + (LB) → R1
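
The sequence above can be traced one micro-operation at a time in Python. This is a sketch under assumptions: registers are modeled as a dictionary, memory as a dictionary from address to content, and LA and LB are taken to be the latches at the two ALU inputs. The function name is hypothetical.

```python
def run_add_r1_indirect_r2(regs, mem):
    """Apply the micro-operations of ADD R1, (R2) to regs in order."""
    # Fetch cycle
    regs["MAR"] = regs["PC"]              # (PC) -> MAR
    regs["MDR"] = mem[regs["MAR"]]        # M(MAR) -> MDR
    regs["PC"] += 1                       # (PC) + 1 -> PC
    regs["IR"] = regs["MDR"]              # MDR -> IR
    # Execution cycle
    regs["LA"] = regs["R1"]               # R1 -> LA
    regs["MAR"] = regs["R2"]              # (R2) -> MAR
    regs["MDR"] = mem[regs["MAR"]]        # M(MAR) -> MDR
    regs["LB"] = regs["MDR"]              # MDR -> LB
    regs["R1"] = regs["LA"] + regs["LB"]  # (LA) + (LB) -> R1
    return regs


mem = {0: "ADD R1, (R2)", 100: 7}
regs = {"PC": 0, "IR": None, "MAR": 0, "MDR": 0,
        "R1": 5, "R2": 100, "LA": 0, "LB": 0}
run_add_r1_indirect_r2(regs, mem)
print(regs["R1"], regs["PC"])  # 12 1
```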

 

Functions and working principle of the controller

  Depending on how the micro-operation control signals are generated, controllers are divided into hardwired controllers and microprogrammed controllers. The PC and IR are the same in both types, but the two differ in how they determine and represent the steps of instruction execution and in how they supply the control signals that the components need.

 

Hard-wired controller

  The basic principle of the hardwired controller is to issue a series of micro-operation control signals in time order, according to the requirements of the instruction, the current timing, and the external and internal state. It is built from a complex combination of logic gates and flip-flops, and is therefore also called a combinational logic controller.

  The opcode of the instruction is the key to which operation commands (control signals) the control unit issues. To simplify the logic of the control unit (CU), the decoding of the instruction opcode and the beat generator are separated from the CU, which yields a simplified control unit. The inputs to the CU come from the following sources:

  1. Instruction information produced by the instruction decoder. The opcode of the current instruction determines the operations to be completed in each machine cycle, so the opcode field is an input to the control unit, which combines it with the timing signals to generate the control signals.
  2. Machine-cycle signals and beat signals produced by the timing system. For the control unit to issue control signals in a definite order and at a definite rhythm, it must be driven by the clock: each clock pulse makes the control unit issue one operation command, or a set of operation commands that must be executed simultaneously.
  3. Feedback from the execution units, i.e., flags. The control unit sometimes depends on the current state of the CPU to decide which control signals to generate. For the BAN instruction, for example, the control unit generates different control signals depending on whether the result of the previous instruction is negative.

The control unit also receives control signals from the system bus (control bus), such as interrupt requests and DMA requests.

 

The timing system and micro-operations of the hardwired controller

  1. Clock cycle. The clock signal drives the beat generator, which produces beats; the width of each beat corresponds exactly to one clock cycle. Within each beat the machine can perform one operation, or several operations that must be performed simultaneously.
  2. Machine cycle. The machine cycle can be regarded as the reference time for the execution of all instructions. Different instructions perform different operations, so their instruction cycles differ. Because the time for one memory access is fixed, the memory access time is usually taken as the reference: the machine cycle is the shortest time needed to read one instruction word from memory. When the memory word length equals the instruction word length, the fetch cycle can also be regarded as one machine cycle. Several micro-operations can be completed in one machine cycle, each taking a certain time, and the clock signal can be used to control when each micro-operation command is issued.
  3. Instruction cycle. See above.
  4. Analysis of micro-operation commands. The control unit issues sequences of operation commands (control signals). These commands must be issued in a definite order for the machine to work in an orderly manner.

The execution of an instruction is divided into three working cycles: the fetch cycle, the indirect cycle, and the execution cycle.

 

There are three main CPU control modes:

  1. Synchronous control. The system has one unified clock, from which all control signals are derived. The longest and most complex micro-operation sequence is usually taken as the standard, and every instruction runs with machine cycles that have the same number of beats at the same time intervals. The advantage of synchronous control is a simple control circuit; the disadvantage is low speed.
  2. Asynchronous control. There is no reference timing signal; the components work at their own inherent speed and communicate by request and acknowledge (handshaking). The advantage of asynchronous control is high speed; the disadvantage is a more complex control circuit.
  3. Combined control. A compromise between synchronous and asynchronous control: most of the micro-operations of the various instructions are under synchronous control, and a small number are under asynchronous control.

 

Micro-program controller

  The microprogrammed controller is implemented with storage logic: the micro-operation signals are encoded, each machine instruction is converted into a microprogram and stored in a dedicated memory (the control memory), and the micro-operation control signals are generated by microinstructions.

 

(1) Micro-commands and micro-operations. A machine instruction can be decomposed into a sequence of micro-operations, which are the most basic operations in the computer and cannot be decomposed further. In a microprogrammed computer, the control commands that the control unit sends to the execution units are called micro-commands; they are the smallest unit of a control sequence. Micro-commands and micro-operations correspond one to one: a micro-command is the control signal for a micro-operation, and a micro-operation is the execution of a micro-command. Micro-commands are either compatible or mutually exclusive. Compatible micro-commands can be generated at the same time and together complete certain micro-operations; mutually exclusive micro-commands must not appear at the same time. Compatibility and mutual exclusion are relative: a micro-command may be compatible with some micro-commands and mutually exclusive with others.

(2) Microinstructions and the microcycle. A microinstruction is a set of micro-commands. The address of the control-memory cell that holds a microinstruction is called the micro-address. A microinstruction contains at least two parts:

  1. Operation control field, also called the micro-operation code field, which generates the operation control signals needed for one step.
  2. Sequence control field, also called the micro-address field, which controls the generation of the address of the next microinstruction to be executed.

    The microcycle usually means the time needed to read one microinstruction from the control memory and perform the corresponding micro-operations.

(3) Main memory and control memory. Main memory holds programs and data, is outside the CPU, and is implemented with RAM. Control memory (CM) holds microprograms, is inside the CPU, and is implemented with ROM.

(4) Programs and microprograms. A program is an ordered set of instructions that performs a specific function. A microprogram is an ordered set of microinstructions; the function of one instruction is implemented by one microprogram.

 

Microprograms and programs are two different concepts. A microprogram is made up of microinstructions and describes a machine instruction. It is in effect a real-time interpreter of machine instructions; it is written in advance by the computer designer and stored in the control memory, and is generally not available to the user. To the programmer, the structure and function of the computer system's microprograms are transparent. A program is ultimately made up of machine instructions; it is written in advance by software designers and stored in main memory or secondary storage.

 

Distinguish the following registers:

1. Address register (MAR). Holds the address of the main memory unit to be read or written.

2. Micro-address register (CMAR). Holds the address of the microinstruction to be read from or written to control memory.

3. Instruction register (IR). Holds the instruction read from main memory.

4. Microinstruction register (CMDR or μIR). Holds the microinstruction read from control memory.

 

If the instruction set has n machine instructions, the control memory holds at least n + 1 microprograms (the extra 1 is the common fetch microprogram; add 1 more if an interrupt microprogram is included).

 

Microinstruction encoding:

  1. Direct encoding (direct control). No decoding is needed: each bit of the micro-command field of the microinstruction represents one micro-command. When a microinstruction is designed, selecting or not selecting a micro-command only requires setting the corresponding bit to 1 or 0. Each micro-command corresponds to, and controls, one micro-operation in the data path.
  2. Field direct encoding. The micro-command field of the microinstruction is divided into several smaller fields. Mutually exclusive micro-commands are grouped into the same field, and compatible micro-commands are placed in different fields. Each field is encoded independently of the others, and each code in a field represents one micro-command with its own defined meaning. Each field usually reserves one state to indicate that the field issues no micro-command. Thus a 3-bit field can represent at most 7 mutually exclusive micro-commands, with 000 usually meaning no operation.
  3. Field indirect encoding. Some micro-commands of one field must be interpreted through some micro-commands of another field. Because the micro-commands are not issued by decoding the field directly, this is called field indirect encoding, also known as implicit encoding. It can shorten the microinstruction word further, but it weakens the parallel control capability of the microinstruction, so it is usually used as a supplement to field direct encoding.
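
  The saving from field direct encoding over direct encoding can be computed with a short Python sketch. The group sizes in the example are made up, and the function names are hypothetical; the only rule taken from the text is that each field reserves one code for "no operation".

```python
def field_bits(n_commands: int) -> int:
    """Bits needed by one field holding n mutually exclusive micro-commands
    plus one reserved no-operation code, i.e. ceil(log2(n + 1))."""
    if n_commands < 1:
        raise ValueError("a field must hold at least one micro-command")
    return n_commands.bit_length()


def control_field_widths(groups):
    """Return (direct encoding bits, field direct encoding bits) for a list
    giving the number of micro-commands in each mutually exclusive group."""
    direct = sum(groups)                       # one bit per micro-command
    field_direct = sum(field_bits(n) for n in groups)
    return direct, field_direct


# Four groups with 7, 3, 12, and 5 mutually exclusive micro-commands.
print(control_field_widths([7, 3, 12, 5]))  # (27, 12)
```

  A field with 7 micro-commands fits in 3 bits, matching the example in item 2, while 8 micro-commands would already need 4 bits because of the reserved code.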

 

The format of a microinstruction is related to its encoding. Microinstructions are usually divided into horizontal microinstructions and vertical microinstructions.

  1. Horizontal microinstruction. From the encoding point of view, direct encoding, field direct encoding, field indirect encoding, and mixed encoding all belong to the horizontal type. Each bit of the microinstruction word corresponds to one control signal: the bit is 1 when the signal is issued and 0 otherwise. One horizontal microinstruction defines and executes several basic operations in parallel.
  2. Vertical microinstruction. A vertical microinstruction resembles the opcode of a machine instruction: the microinstruction has a micro-opcode field, encoded in the same way as an opcode, and this micro-opcode specifies the function of the microinstruction. One vertical microinstruction defines and executes only one basic operation.
  3. Mixed microinstruction. Builds on the vertical type by adding a small number of simple parallel operations. The microinstructions are fairly short and still easy to write; the microprograms are not long, and execution is fairly fast.

 

Comparison of the microprogrammed controller and the hardwired controller

| Item | Microprogrammed controller | Hardwired controller |
| --- | --- | --- |
| Working principle | Micro-operation control signals are stored in control memory in the form of microprograms and read out when the instruction is executed | Micro-operation control signals are generated on the fly by combinational logic from the current instruction code, state, and timing |
| Execution speed | Slow | Fast |
| Regularity | Fairly regular | Cumbersome, irregular |
| Applications | CISC CPUs | RISC CPUs |
| Extensibility | Easy to modify and extend | Difficult |

 

In the microprogrammed controller, the entry address of the microprogram is formed from the opcode field of the machine instruction.

The control memory stores the microprograms. It is the core component of the microprogrammed controller and is part of the CPU.

Compared with the hardwired controller, the microprogrammed controller has a simpler timing system.

Normally, the period of one microprogram corresponds to one instruction cycle.
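
The interplay of control memory, micro-addresses, and the opcode-derived entry address can be sketched in Python. Everything concrete here is hypothetical: the micro-addresses, the micro-command names, the two opcodes, and the `DISPATCH` and `END` markers are invented for the illustration and do not describe any real machine.

```python
# Control memory: micro-address -> (micro-commands issued together, next).
# "DISPATCH" means the next micro-address is formed from the opcode;
# "END" means the microprogram for this machine instruction is finished.
CONTROL_MEMORY = {
    0: ({"PC->MAR"}, 1),               # common fetch microprogram
    1: ({"M->MDR", "PC+1->PC"}, 2),    # two compatible micro-commands
    2: ({"MDR->IR"}, "DISPATCH"),
    10: ({"R2->MAR"}, 11),             # microprogram for ADD
    11: ({"M->MDR"}, 12),
    12: ({"R1+MDR->R1"}, "END"),
    20: ({"IR.addr->PC"}, "END"),      # microprogram for JMP
}
ENTRY = {"ADD": 10, "JMP": 20}         # opcode -> microprogram entry address


def micro_trace(opcode):
    """Return the (micro-address, micro-commands) pairs issued for one
    machine instruction, starting with the common fetch microprogram."""
    trace = []
    addr = 0
    while True:
        commands, nxt = CONTROL_MEMORY[addr]
        trace.append((addr, sorted(commands)))
        if nxt == "DISPATCH":
            addr = ENTRY[opcode]
        elif nxt == "END":
            return trace
        else:
            addr = nxt


for step in micro_trace("ADD"):
    print(step)
```

With two machine instructions, this toy control memory holds three microprograms: one per instruction plus the common fetch microprogram.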

 

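Instruction pipeline

When multiple instructions execute in a processor, one of the following three modes can be used:

  1. Sequential execution. Instructions execute in order: the next instruction starts only after the previous one has finished. If the fetch, analyze, and execute stages each take the same time t, executing n instructions sequentially takes T = 3nt.
  2. Single overlap. The execute stage of instruction k proceeds at the same time as the fetch stage of instruction k + 1. In this mode, executing n instructions takes T = (1 + 2n)t.
  3. Double overlap. To speed up execution further, the fetch of instruction k + 1 is moved forward so that it takes place while instruction k is being analyzed, and the analysis of instruction k + 1 proceeds at the same time as the execution of instruction k. In this mode, executing n instructions takes T = (2 + n)t.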
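
  The three formulas can be checked numerically with a few lines of Python. The function names are hypothetical; the comments give the reasoning behind each formula, assuming all three stages take the same time t.

```python
def sequential_time(n, t):
    # Each instruction takes fetch + analyze + execute = 3t, with no overlap.
    return 3 * n * t


def single_overlap_time(n, t):
    # The first instruction takes 3t; each later one adds 2t because its
    # fetch overlaps the previous execute: 3t + 2(n - 1)t = (1 + 2n)t.
    return (1 + 2 * n) * t


def double_overlap_time(n, t):
    # The first instruction takes 3t; each later one adds only t:
    # 3t + (n - 1)t = (2 + n)t.
    return (2 + n) * t


n, t = 10, 1
print(sequential_time(n, t), single_overlap_time(n, t), double_overlap_time(n, t))
# 30 21 12
```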

     

     

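Characteristics of pipelining (compared with traditional serial execution):

  1. A task (an instruction or an operation) is decomposed into several related subtasks, each executed by a dedicated functional unit; the functional units work in parallel to shorten program execution time.
  2. Each functional unit of the pipeline must be followed by a buffer register, also called a latch, which holds the result of its pipeline stage for use by the next stage.
  3. The stages of the pipeline should take as nearly equal time as possible; otherwise the pipeline will be blocked or its flow interrupted.
  4. A pipeline is efficient only when it is continuously supplied with tasks of the same kind, so the tasks processed in a pipeline must be continuous. In a processor that works in pipelined mode, both software and hardware design should try to supply the pipeline with continuous tasks.
  5. A pipeline needs fill time and drain time. Fill time is the time from the first task entering the pipeline to its leaving the pipeline. Drain time is the time from the last task entering the pipeline to its leaving the pipeline.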

 

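Classification of pipelines:

  1. Functional-unit-level, processor-level, and inter-processor-level pipelines. A functional-unit-level pipeline organizes a complex arithmetic or logic operation as a pipeline. For example, floating-point addition can be divided into four sub-processes: computing the exponent difference, aligning the exponents, adding the mantissas, and normalizing the result. A processor-level pipeline divides the interpretation of one instruction into several sub-processes, such as fetch, decode, execute, memory access, and write-back. An inter-processor-level pipeline is a macro-pipeline in which each processor performs one specialized task, and the result of each processor is stored in memory shared with the next processor.
  2. Single-function and multi-function pipelines. A single-function pipeline can perform only one fixed, specialized function. A multi-function pipeline can perform several functions, simultaneously or at different times, by connecting its stages in different ways.
  3. Dynamic and static pipelines. In a static pipeline, at any given time all stages can work only in the connection configuration for one function. In a dynamic pipeline, while some stages are performing one operation, other stages can be performing a different operation at the same time.
  4. Linear and nonlinear pipelines. In a linear pipeline, each functional stage is passed through at most once from input to output, and there is no feedback loop. A nonlinear pipeline has feedback loops: from input to output, some functional stages are passed through several times. This kind of pipeline is suited to linear recursive computations.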

 

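Factors that affect the pipeline:

  1. Structural hazard (resource conflict). A conflict that arises when multiple instructions compete for the same resource at the same time. There are two solutions:
    1. While the earlier instruction accesses memory, stall the later conflicting instruction (and the instructions after it) for one clock cycle.
    2. Provide separate data memory and instruction memory so that the two operations proceed in different memories; this is resource duplication.
  2. Data hazard (data conflict). A data dependence exists when one instruction in a program cannot execute until a preceding instruction has completed; the two instructions are then data dependent. Conflicts arise when such instructions are overlapped. The solutions are:
    1. Stall the dependent instruction and the instructions after it for one or more clock cycles until the dependence is resolved. This can be done by a hardware stall or by software inserting NOP instructions.
    2. Add dedicated forwarding paths: instead of waiting for the previous instruction to write its result back to the register file and then reading the register file, the next instruction takes the previous instruction's ALU result directly as its input, so that an operation that would otherwise stall can proceed. This is called data bypassing.
    3. Have the compiler optimize the data-dependent instructions by reordering them to resolve the dependence.
  3. Control hazard (control conflict). Arises when the pipeline encounters a branch instruction or another instruction that changes the PC and the flow is interrupted. The solutions are:
    1. Predict branches and generate the branch target address as early as possible. Branch prediction is either simple (static) or dynamic. Static prediction always predicts that the condition is not satisfied, i.e., it continues with the instructions that follow the branch. Dynamic prediction adjusts its predictions according to the program's execution history and achieves higher accuracy.
    2. Prefetch the target instructions in both control-flow directions, branch taken and branch not taken.
    3. Speed up and advance the generation of condition codes.
    4. Improve the accuracy of guessing the branch direction.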

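Handling a cache miss also stalls the pipeline. How to maximize the efficiency (processing capacity) of the instruction pipeline without adding too much hardware cost is a problem that any use of instruction pipelining must solve.

There are three kinds of data hazard in a pipeline: read after write (RAW), write after read (WAR), and write after write (WAW).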
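
The three kinds of data hazard can be told apart by comparing the registers that two instructions read and write. The Python sketch below assumes a hypothetical representation in which each instruction is a `(destination, sources)` pair and the first instruction precedes the second in program order.

```python
def classify_hazards(first, second):
    """Return the set of hazard types between two instructions, each given
    as (destination register or None, set of source registers)."""
    dest1, srcs1 = first
    dest2, srcs2 = second
    hazards = set()
    if dest1 is not None and dest1 in srcs2:
        hazards.add("RAW")   # second reads what first writes
    if dest2 is not None and dest2 in srcs1:
        hazards.add("WAR")   # second writes what first reads
    if dest1 is not None and dest1 == dest2:
        hazards.add("WAW")   # both write the same register
    return hazards


# ADD R1, R2, R3 followed by SUB R4, R1, R5: SUB reads R1 after ADD writes it.
add = ("R1", {"R2", "R3"})
sub = ("R4", {"R1", "R5"})
print(classify_hazards(add, sub))  # {'RAW'}
```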

 

 

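Pipeline performance metrics:

1. Throughput. In an instruction-level pipeline, throughput is the number of tasks completed, or the number of results output, by the pipeline per unit time. The most basic formula for pipeline throughput (TP) is TP = n / Tk, where n is the number of tasks and Tk is the time taken to process the n tasks.

 

 2. Speedup. The ratio of the time needed to complete the same batch of tasks without a pipeline to the time needed with the pipeline.

 

 3. Efficiency. The utilization of the pipeline's hardware.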
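
The three metrics can be computed for an idealized pipeline with a short Python sketch. The assumptions are not from the text: the pipeline has k stages that each take the same time dt, it processes n tasks without stalls, so Tk = (k + n - 1) * dt, and the non-pipelined time is taken to be n * k * dt.

```python
def pipeline_metrics(n, k, dt):
    """Return (throughput, speedup, efficiency) for n tasks on an ideal
    k-stage pipeline whose stages each take dt."""
    t_pipe = (k + n - 1) * dt       # Tk: fill the pipeline, then one task per dt
    t_serial = n * k * dt           # the same tasks without pipelining
    throughput = n / t_pipe         # TP = n / Tk
    speedup = t_serial / t_pipe
    efficiency = speedup / k        # fraction of stage time slots kept busy
    return throughput, speedup, efficiency


tp, s, e = pipeline_metrics(n=100, k=5, dt=1)
print(f"TP={tp:.4f} tasks per time unit, speedup={s:.4f}, efficiency={e:.4f}")
```

Under these assumptions, as n grows the throughput approaches 1/dt, the speedup approaches k, and the efficiency approaches 1.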

 

 

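1. Superscalar pipelining. Several independent instructions can be issued concurrently in each clock cycle, i.e., two or more instructions are compiled and executed in parallel, which requires multiple functional units. A superscalar machine cannot adjust the execution order of instructions, so compiler optimization is used to group instructions that can execute in parallel and expose more parallelism.

 

2. Superpipelining. The clock cycle is subdivided so that one functional unit is used several times within a clock cycle. The execution order of instructions cannot be adjusted; optimization relies on the compiler.

 

3. Very long instruction word (VLIW). The compiler discovers the potential parallelism between instructions and combines several operations that can run in parallel into one very long instruction word with multiple opcode fields (up to several hundred bits), which requires multiple processing units.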

 

 

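A pipelined CPU is a processor built on the principle of temporal parallelism.

A superscalar pipeline can be combined with dynamic scheduling to increase instruction-level parallelism. (This contradicts the statement above; for now, take this one as correct.)

A five-stage pipeline can be divided into instruction fetch (IF), decode/operand fetch (ID), execute (EXC), memory read (MEM), and write-back (WB). In a digital system, the data transfer paths formed by connecting the subsystems through data buses are called the data path. It includes the program counter, the arithmetic logic unit, the general-purpose register file, the instruction fetch unit, and so on, but not the control unit.

When instructions flow through the pipeline in order, of RAW, WAR, and WAW only RAW hazards can occur.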

Origin www.cnblogs.com/oneMr/p/11481521.html