Principles of Microcomputers (1)-Structure and Working Principles of Computer Systems (Chapter 2)

Table of Contents

1. Related knowledge

(1) Hierarchical model of computer system

(2) Computer system

(3) CPU instruction set

(4) Memory hierarchical subsystem

(5) Bus and input/output subsystem

(6) Parallel technology

(7) Multi-machine and multi-core structure

(8) Word length

(9) Storage capacity

(10) Calculation speed

2. Important after-school questions and answers


1. Related knowledge

(1) Hierarchical model of computer system

a) Application development:

Application
User interface
System software: operating system, compiler, database management system, web browser, device drivers, interrupt service routines
System call interface
Hardware system: exception handling mechanism, instruction set, CPU, memory, I/O and communication subsystems

b) Software and hardware implementation:

System analysis layer (mathematical model, algorithm)
User program layer (syntax programming)
Language processing layer (interpretation, compilation)
Operating system layer
Instruction set layer (machine language instructions)
Microarchitecture layer (microprogram) (hard core level)
Digital logic layer (hard-wired logic) (hard core level)

c) Language function:

Application Language Virtual Machine
High-level language virtual machine
Assembly language virtual machine
Operating system virtual machine
Machine language level
Microprogram level
Register level (hardware)

Step-by-step generation process: the CPU hardware core defines the instruction set; an operating system is configured on top of it; the required language processors and other software resources are added; finally, user programs are edited, entered, and executed under the management and scheduling of the operating system. Problem-solving process: the user builds a mathematical model and designs an algorithm according to the task requirements; the user selects a suitable programming language and writes a source program based on the algorithm; under the control of the operating system, a language processor is called to translate the source program into a target program described in machine language.

(2) Computer system

Computer architecture refers to the conceptual structure and functional characteristics of a computer as seen by the programmer; computer organization refers to the logical design, hardware implementation, and interconnection of the operating units in the physical machine; the lower-level technologies, including integrated circuit design, packaging, power supply, cooling, and micro-assembly, are called computer implementation.

A von Neumann machine consists of five components: arithmetic unit, controller, memory, input devices, and output devices. It has three main features: (1) The computer is centered on memory and consists of five parts: the arithmetic unit processes data, the memory stores all kinds of information, the controller interprets program code and generates the control signals that coordinate the components, and the input and output devices are used mainly for human-computer interaction. (2) Control information and data inside the computer are both binary and are stored in the same memory. (3) The computer works on the stored-program principle and is driven by instructions: the program is written in advance, entered through an input device, and stored in memory; once the computer starts, the controller automatically fetches instructions from memory in sequence at high speed and executes them without manual intervention.

In other words, the model consists of a CPU subsystem (controller and arithmetic unit), a memory subsystem, and an input/output subsystem interconnected by a bus. The bus is the shared channel connecting these components and carries the data and other information exchanged among them. It is divided into a data bus, an address bus, and a control bus. The address bus is unidirectional: addresses are issued by the master device to select a read/write target. The data bus is bidirectional and carries data. The control bus is used to monitor and control devices. The memory subsystem stores the programs and data currently in use. The unique number of a byte location is called the memory unit address, and the byte stored there is called the content of the memory unit. The input/output subsystem handles the exchange of information between the computer and the outside world.

The arithmetic unit performs operations on and processing of all kinds of data. It generally consists of an arithmetic logic unit (ALU), an accumulator (ACC), a flag register (FR), and temporary registers. The ALU is the core; it is built on full adders, supplemented by shift registers and the corresponding control logic. The accumulator is a general-purpose register in the register array that supplies one of the two operands sent to the ALU, and the result of the operation is always written back to ACC. The accumulator latch prevents the ALU output from feeding back through ACC to the ALU input. A temporary register is similar to ACC and holds an operand, but it cannot be accessed by the programmer and is transparent to the programmer. The flag register is accessed bit by bit and records important states or characteristics of the ALU result, each represented by one binary bit.
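As a rough illustration of an ALU built on full adders that also updates a flag register, the sketch below adds two values one bit at a time and derives four flags. The 8-bit width and the flag names (carry, zero, sign, overflow) are assumptions chosen for illustration; the text above does not specify them.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def alu_add(x, y, width=8):
    """Add two unsigned `width`-bit values bit by bit; return (result, flags)."""
    result = 0
    carry = 0
    carry_into_msb = 0
    for i in range(width):
        if i == width - 1:
            carry_into_msb = carry  # remember the carry entering the top bit
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    flags = {
        "carry": carry,                        # carry out of the top bit
        "zero": int(result == 0),              # result is all zeros
        "sign": (result >> (width - 1)) & 1,   # top bit of the result
        "overflow": carry ^ carry_into_msb,    # signed overflow
    }
    return result, flags


print(alu_add(0x7F, 0x01))
print(alu_add(0xFF, 0x01))
```

Each flag occupies one bit, which matches the idea that a state of the ALU result can be identified by a single binary bit.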

The controller is the control center of the whole microprocessor. It consists of an instruction register (IR), an instruction decoder (ID), and an operation controller (OC). Using the address held in the program counter (PC), the CPU first fetches the instruction opcode from memory and loads it through the data bus into the IR. The ID then determines which operation is to be performed, and control signals are sent to the relevant components with the timing determined by the OC. The controller mainly contains control logic such as a pulse generator, a control matrix, a reset circuit, and a start-stop circuit.

The register array is temporary storage inside the CPU for data and addresses. Registers are faster to access than memory. When operands or intermediate results need to be reused, keeping them in registers avoids frequent memory accesses, which shortens instruction length and execution time, speeds up the CPU, and makes programming more convenient. Because of limits on chip area and integration density, however, a CPU usually has only a modest number of internal registers. The register array is divided into special-purpose and general-purpose registers. Special-purpose registers have fixed functions; examples are the stack pointer (SP), the program counter (PC), and the flag register (FR). The PC holds the address of the next instruction to be executed. Each time one byte of an instruction is fetched, the PC is automatically incremented by 1, so once all bytes of an instruction have been fetched from memory, the PC holds the starting address of the next instruction. To change the normal execution order, a new target address must be loaded into the PC; the program is then said to branch. The instruction set contains instructions that control such transfers, called branch instructions. A stack is a dedicated area set aside in a group of registers or in memory. Placing data on the stack is called a push (PUSH) operation and removing data from it is called a pop (POP) operation; access follows the "first in, last out" (FILO) rule, equivalently "last in, first out" (LIFO). Both push and pop always act on the top of the stack. The stack pointer (SP) is the register that holds the address of the top of the stack.
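The push and pop behavior described above can be sketched as follows. This is a minimal model, assuming a stack that grows toward lower addresses with SP pointing at the most recently pushed item; the class name, the 16-entry size, and the growth direction are illustrative choices, not something the text prescribes.

```python
class Stack:
    """A stack kept in a small memory area, accessed only through SP."""

    def __init__(self, size=16):
        self.memory = [0] * size
        self.sp = size  # empty stack: SP sits just past the last cell

    def push(self, value):
        if self.sp == 0:
            raise OverflowError("stack overflow")
        self.sp -= 1                  # move SP to the new top
        self.memory[self.sp] = value

    def pop(self):
        if self.sp == len(self.memory):
            raise IndexError("stack underflow")
        value = self.memory[self.sp]  # read the current top
        self.sp += 1                  # the previous item becomes the top
        return value


stack = Stack()
for item in (1, 2, 3):
    stack.push(item)
print(stack.pop(), stack.pop(), stack.pop())  # last in, first out
```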

The address and data buffers both serve as bus buffers. They are the entry and exit points for the microprocessor's address and data signals, isolate the microprocessor's internal bus from the external bus, and provide additional drive capability. An instruction tells the CPU to perform a specific operation, such as fetching data from memory or performing a logical operation on data. The complete set of instructions that a CPU can process is called its instruction set. If a program is written as binary code that the microcomputer can understand and execute directly, the instruction form used is called machine language, and the program is a machine language program. Symbols made up of a few letters (mnemonics) are used in place of machine language instructions; this is called assembly language, and programs written in it are called assembly language source programs. Translating a source program into a target program in machine language is called assembly.

Instruction type | Instruction | Opcode example | Operand example | Operation | Description
Arithmetic | Addition | ADD | Rs1,Rs2,Rd | (Rs1)+(Rs2)→Rd | Arithmetic and logic instructions can operate directly only on register data or immediate values
Arithmetic | Subtraction | SUB | Rs1,Rs2,Rd | (Rs1)-(Rs2)→Rd |
Logic | Bitwise AND | AND | Rs1,Rs2,Rd | (Rs1)∧(Rs2)→Rd |
Logic | Bitwise OR | OR | Rs1,Rs2,Rd | (Rs1)∨(Rs2)→Rd |
Logic | Bitwise NOT | NOT | Rs,Rd | !(Rs)→Rd |
Transfer | Memory or I/O read | LDR | [MEM],Rd | [MEM]→Rd | Read the value of the memory unit or I/O port at the specified address into register Rd
Transfer | Memory or I/O write | STR | Rs,[MEM] | (Rs)→[MEM] | Write the value of register Rs to the memory unit or I/O port at the specified address
Transfer | Register move | MOV | Rs,Rd | (Rs)→Rd |
Jump | Unconditional jump | JMP | Label | Label→PC |
Jump | Conditional jump | JX / JNX | Label | If X is true/false, Label→PC |
Jump | Procedure call | CALL | Sub-Label | Sub-Label→PC | Call a subroutine
Jump | Procedure return | RET | - | | Return to the main program
Other | Halt | HLT | - | |

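The execution of every instruction can be divided into three phases: fetch, decode, and execute. A program segment built from such instructions is executed essentially in sequence, but a branch instruction such as branch-on-overflow may cause a transfer of control, in which case the program counter (PC) is reloaded. Improvements to the von Neumann structure appear mainly in three areas: (1) updating and optimizing the instruction set; (2) using the principle of locality to divide memory into multiple levels, balancing speed, capacity, and price; (3) making a high-speed bus the core of the system. Seriality is the essential characteristic of a von Neumann computer: based on the stored-program principle, programs and data are placed together in a single memory, and a single processing unit executes instructions one after another in fetch, decode, execute order. The structural bottleneck comes mainly from the serial execution of instructions and the serial access to memory. There are two directions of development: one is to change the serial execution model and develop parallel techniques; the other is to change the control-driven approach and develop other models such as data-driven, demand-driven, and pattern-driven execution.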
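The fetch, decode, execute cycle and the effect of a branch on the PC can be sketched with a toy interpreter. Everything about the encoding here is an assumption made for illustration: instructions are Python tuples, the PC counts whole instructions rather than bytes, and the stop instruction is given the hypothetical name HALT.

```python
def run(program, registers):
    """Execute a list of (opcode, operands...) tuples until HALT is reached."""
    pc = 0
    while True:
        instruction = program[pc]          # fetch
        pc += 1                            # PC now points at the next instruction
        opcode, *operands = instruction    # decode
        if opcode == "ADD":                # execute
            rs1, rs2, rd = operands
            registers[rd] = registers[rs1] + registers[rs2]
        elif opcode == "SUB":
            rs1, rs2, rd = operands
            registers[rd] = registers[rs1] - registers[rs2]
        elif opcode == "MOV":
            rs, rd = operands
            registers[rd] = registers[rs]
        elif opcode == "JMP":
            (target,) = operands
            pc = target                    # a branch reloads the PC
        elif opcode == "HALT":
            return registers
        else:
            raise ValueError(f"unknown opcode: {opcode}")


program = [
    ("ADD", "R0", "R1", "R2"),   # R2 = R0 + R1
    ("JMP", 3),                  # skip the next instruction
    ("SUB", "R2", "R2", "R2"),   # never executed
    ("HALT",),
]
print(run(program, {"R0": 2, "R1": 3, "R2": 0}))
```

Without the JMP the instructions would run strictly in order; the jump shows how loading a new value into the PC changes the sequence.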

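(3) CPU instruction set

Designing the instruction set essentially means deciding how functions are divided between hardware and software. Three main factors are considered: speed, cost, and flexibility. In general, a hardware implementation is fast, costly, and inflexible, while a software implementation is the opposite. Differences between instruction sets reflect differences in design principles, manufacturing technology, and system class. The two main categories are reduced instruction set computers (RISC) and complex instruction set computers (CISC).

Early computer systems had fairly simple instruction sets. As semiconductor and microelectronics technology developed and hardware costs fell, more and more high-level, complex instructions were added. Because memory at the time was slow and small, designers implemented complex instruction functions as microprograms in order to reduce memory accesses and ease software development; the microprograms were then stored in firmware or hardwired and handed to the hardware. This is the design idea behind CISC systems.

In fact, about a dozen instructions covering simple data transfer, arithmetic, and branch control are generally enough to implement every operation a modern computer performs; more complex functions can be built by combining these simple instructions. As memory prices fell and CPU manufacturing technology improved, the RISC structure came into wide use. RISC simplified the instruction set and overcame the shortcomings of CISC, freeing more silicon area for pipelining and caches and effectively improving performance. RISC performance depends more heavily on the effectiveness of the compiler. A RISC design should follow these principles: (1) few instructions, with simple formats that are easy to decode; (2) enough registers, with only Load and Store instructions allowed to access memory; (3) instructions executed directly by hardware and completed in a single cycle; (4) full use of pipelining; (5) emphasis on the role of the optimizing compiler.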

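(4) Memory hierarchical subsystem

Viewed as a whole, the cache-main memory level has an access speed close to that of the cache, while its capacity and average price per bit are close to those of main memory. The main memory-secondary storage level has an access speed close to that of main memory, while its capacity and average price per bit are close to those of secondary storage, which resolves the conflict between large capacity and low cost. The higher a memory level sits in the hierarchy, the closer it is to the CPU and the faster its access speed, but also the higher its price and therefore the smaller its capacity.

A Harvard-architecture computer separates program memory from data memory, which relieves the bottleneck caused by the inefficient serial reads and writes of von Neumann memory; the CPU has two independent sets of address and data buses.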

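(5) Bus and input/output subsystem

The collection of paths connecting the subsystems of a computer system is called the interconnection structure. The bus is by far the most widely used interconnection structure. A bus is a group of shared paths for transferring information, and the devices on a bus fall into two classes: masters and slaves. A bus master is a device that can initiate bus activity (such as the CPU), while passive devices that can only wait for commands are called bus slaves. The simple bus structure has two shortcomings: (1) the CPU is the only master on the bus; (2) the bus structure is tightly coupled to the processor, so it is not general-purpose. A bus controller coordinates the masters' requests for the bus. The bus lines are grouped as follows: the data transfer bus includes the address lines, data lines, and associated control lines; the arbitration bus includes the bus request and bus grant lines; the interrupt and synchronization bus handles prioritized interrupt operations and includes the interrupt request and interrupt acknowledge lines; the utility lines include clock, power/ground, and system reset.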

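(6) Parallel technology

Parallelism means that a computer system performs multiple computations or operations at the same instant or within the same time interval. It covers simultaneity and concurrency: simultaneity means that two or more events occur at the same instant, and concurrency means that two or more events occur within the same time interval. Parallel processing techniques include processor (system)-level parallelism (SLP), thread-level parallelism (TLP), instruction-level parallelism (ILP), and circuit-level parallelism (CLP); the basic ideas are time overlap, resource replication, and resource sharing. The main goal of instruction-level parallelism is to let the computer process more instructions per unit time, while processor-level parallelism lets multiple CPUs work together on the same problem. Pipelining is the typical application of instruction-level parallelism.

[Figure: sequential execution versus pipelined execution]

A pipeline achieves high efficiency only when it does not stall or break. Common hazards are data hazards (read after write, write after write, write after read), structural hazards (insufficient hardware resources), and control hazards (caused by branches and other jump instructions). A superscalar machine improves performance by replicating hardware to provide multiple pipelines that operate in parallel. A very long instruction word (VLIW) machine relies on the compiler to find potential parallelism among instructions at compile time, uses instruction scheduling to minimize possible data conflicts, and packs several instructions that can run in parallel into one very long instruction; multiple independent execution units in the processor then each carry out one operation of the long instruction, which is equivalent to executing several instructions at once.

Data hazards can be handled by forwarding or scheduling. Forwarding passes a result directly from where it is produced to every functional unit that needs it; scheduling uses the compiler or the hardware to reorder instructions so as to reduce pipeline stalls. The two are collectively referred to as out-of-order techniques.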

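(7) Multi-machine and multi-core structure

A multi-machine system consists of two or more computers interconnected by a network that cooperate, under the control of an operating system, to solve a common problem. The idea of integrating multiple processors onto a single chip, following the multi-machine model, has become reality; such a multi-core chip is called a chip multiprocessor (CMP). The processors on the chip can execute different processes in parallel, greatly improving CPU performance. Fine-grained multithreading switches threads on every instruction, so the execution of multiple threads is interleaved; the processor must be able to switch threads every clock cycle. Coarse-grained multithreading was devised as an alternative to fine-grained multithreading: it switches threads only on a costly stall. Its main drawback is its limited ability to overcome throughput losses, especially for short stalls.

Single instruction, single data (SISD) represents the von Neumann computer; most uniprocessors are controlled by a single instruction stream. The single instruction, multiple data (SIMD) structure has one control unit and multiple processing units; its representative machine is the array processor, used mostly for problems in physics and engineering that involve arrays or other highly regular data structures. Multiple instruction, single data (MISD) has no practical machine. In multiple instruction, multiple data (MIMD), each pairing of a control unit (CU) and a processing unit (PU) can be regarded as an independent CPU core.

Centralized shared-memory multiprocessors are tightly coupled, while distributed-memory multiprocessors are loosely coupled.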

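(8) Word length

Word length is a basic microprocessor design decision. It is the maximum data width the CPU can process at one time. At the same operating speed, word length directly affects computational precision; for the same amount of work, a CPU with a larger word length is faster.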

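(9) Storage capacity

The address space is the number of memory units (and hence the capacity) that the CPU can access directly. It is generally determined by the width of the CPU's address bus: 2^(address bus width).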
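As a quick worked example of the rule above, the sketch below computes the number of addressable units for a few bus widths. The widths 16, 20, and 32 are illustrative values, not figures from the text.

```python
def addressable_units(address_bus_width):
    """Number of distinct addresses a bus of the given width can select."""
    return 2 ** address_bus_width


for width in (16, 20, 32):
    print(width, addressable_units(width))
```

If each address selects one byte, a 20-bit address bus therefore reaches 1 MB and a 32-bit address bus reaches 4 GB.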

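(10) Calculation speed

Clock frequency is an important measure of CPU performance. A single-machine user cares about overall performance, that is, the execution time of a single program, whereas a data center administrator cares more about the number of tasks completed per unit time. A consistent and reliable evaluation method is to measure the execution time of benchmark programs. Let f denote the clock frequency, IC the instruction count, CPI the average number of clock cycles per instruction, MIPS millions of instructions per second, and T the execution time. Then: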

MIPS=f(MHz)/CPI

T(s)=(IC × CPI)/f(Hz)
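The two formulas are consistent with each other. Since MIPS means millions of instructions per second, it equals the instruction count divided by the execution time and by 10^6; substituting the expression for T gives the first formula. The derivation below uses only the symbols already defined, with f in Hz until the last step.

```latex
\mathrm{MIPS}
  = \frac{IC}{T \times 10^{6}}
  = \frac{IC}{\dfrac{IC \cdot CPI}{f} \times 10^{6}}
  = \frac{f}{CPI \times 10^{6}}
  = \frac{f_{\mathrm{MHz}}}{CPI}
```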

2. Important after-school questions and answers

(1) Complete the following binary arithmetic operations

101+1.01 = 110.01;1010.001-10.1 = 111.101;-1011.0110 1-1.1001 = -1100.1111;10.1101-1.1001 = 1.01;110011/11 = 10001;(-101.01)/(-0.1) = 1010.1
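Results like these can be checked mechanically. The sketch below converts binary strings with a fractional part into exact fractions and verifies the five expressions whose operands are unambiguous in the line above; the helper name parse_binary is hypothetical.

```python
from fractions import Fraction


def parse_binary(text):
    """Convert a signed binary string such as '-101.01' to an exact Fraction."""
    sign = -1 if text.startswith("-") else 1
    digits = text.lstrip("+-")
    integer_part, _, fraction_part = digits.partition(".")
    value = Fraction(int(integer_part or "0", 2))
    for position, bit in enumerate(fraction_part, start=1):
        value += Fraction(int(bit), 2 ** position)  # bit weight is 2^-position
    return sign * value


b = parse_binary
checks = [
    b("101") + b("1.01") == b("110.01"),
    b("1010.001") - b("10.1") == b("111.101"),
    b("10.1101") - b("1.1001") == b("1.01"),
    b("110011") / b("11") == b("10001"),
    b("-101.01") / b("-0.1") == b("1010.1"),
]
print(checks)
```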

(2) Complete the following logical operations

1011 0101 ∨ 1111 0000 = 1111 0101; 1101 0001 ∧ 1010 1011 = 1000 0001; 1010 1011 ⊕ 0001 1100 = 1011 0111
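The same three operations map directly onto bitwise operators. The short sketch below reproduces them; the variable names are illustrative.

```python
or_result = 0b1011_0101 | 0b1111_0000    # bitwise OR
and_result = 0b1101_0001 & 0b1010_1011   # bitwise AND
xor_result = 0b1010_1011 ^ 0b0001_1100   # bitwise exclusive OR

print(f"{or_result:08b} {and_result:08b} {xor_result:08b}")
# prints: 11110101 10000001 10110111
```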

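(3) Multiple-choice questions

<1> Among the following unsigned numbers, the smallest is ( A ). A. (01A5)H  B. (1,1011,0101)B  C. (2590)D  D. (3764)O

<2> Among the following unsigned numbers, the largest is ( B ). A. (10010101)B  B. (227)O  C. (96)H  D. (143)D

<3> In the ( A ) representation of machine numbers, zero has a unique representation. A. Two's complement  B. Sign-magnitude  C. Two's complement and ones' complement  D. Sign-magnitude and ones' complement

<4> In theory, all the functions of a computer could be implemented in hardware. In practice, hardware implements only the relatively simple functions, and complex functions are left to software. The reasons are ( BCD ). A. Faster problem solving  B. Lower cost  C. Greater adaptability and a wider range of applications  D. Easier manufacturing

<5> Compared with an interpreter, the advantage of a compiler is ( D ), and the advantage of an interpreter is ( C ). A. The compilation process (or the interpret-and-execute process) takes little time  B. Uses less memory  C. Makes it easier to find and fix errors in the source program  D. The compiled result (target program) runs fast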
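The answers to questions <1> to <3> can be checked with a short script. It converts each option to an integer, then counts how many bit patterns stand for zero in each number representation. The 8-bit width and the helper names are assumptions made for illustration.

```python
def value(digits, base):
    """Parse an unsigned number in the given base, ignoring grouping commas."""
    return int(digits.replace(",", ""), base)


question_1 = {"A": value("01A5", 16), "B": value("1,1011,0101", 2),
              "C": value("2590", 10), "D": value("3764", 8)}
question_2 = {"A": value("10010101", 2), "B": value("227", 8),
              "C": value("96", 16), "D": value("143", 10)}
smallest = min(question_1, key=question_1.get)
largest = max(question_2, key=question_2.get)


def encodings_of_zero(width=8):
    """Bit patterns that represent zero in each signed representation."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    return {
        "sign-magnitude": {0, sign_bit},           # +0 and -0
        "ones' complement": {0, ~0 & mask},        # +0 and -0 (all ones)
        "two's complement": {0, (~0 + 1) & mask},  # negating 0 yields 0 again
    }


zero_counts = {name: len(codes) for name, codes in encodings_of_zero().items()}
print(question_1, smallest)
print(question_2, largest)
print(zero_counts)
```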

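(4) It is often convenient to use logical operations in place of numerical operations. For example, the logical operation AND combines two bits in the same way as multiplication. Which logical operation is almost the same as the addition of two bits? What error can occur in that case?

Answer: The logical operation OR is almost the same as the addition of two bits. The problem is that multi-bit multiplication or addition cannot be replaced by AND or OR, because logical operations have no carry mechanism.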
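A small truth-table check makes the point concrete: AND matches the product of two bits in every case, while OR differs from the sum exactly where a carry would be needed. The variable names below are illustrative.

```python
pairs = [(a, b) for a in (0, 1) for b in (0, 1)]

# Input pairs where the logical operation disagrees with the arithmetic one.
and_mismatches = [(a, b) for a, b in pairs if (a & b) != a * b]
or_mismatches = [(a, b) for a, b in pairs if (a | b) != a + b]

print(and_mismatches)  # []
print(or_mismatches)   # [(1, 1)]
```

For the pair (1, 1), addition gives binary 10, which needs a carry into the next bit, while OR gives 1.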

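(5) Suppose a digital camera has a storage capacity of 256 MB. If each pixel requires 3 bytes of storage and a photo has 1024 pixels per row and 1024 pixels per column, how many photos can the camera store?

Answer: Each photo requires 1024 × 1024 × 3 bytes = 3 MB, so 256 MB can hold 256 MB / 3 MB ≈ 85 photos.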

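(6) A test program runs on a 40 MHz processor. Its object code contains 100 000 instructions, made up of the following mix of instruction types and clock cycle counts. Determine the effective CPI, the MIPS rating, and the execution time of this program.

Instruction type | Instruction count | Clock cycle count
Integer arithmetic | 45 000 | 1
Data transfer | 32 000 | 2
Floating point | 15 000 | 2
Control transfer | 8 000 | 2

CPI = (45000/100000)×1 + (32000/100000)×2 + (15000/100000)×2 + (8000/100000)×2 = 0.45×1 + 0.32×2 + 0.15×2 + 0.08×2 = 1.55

MIPS = 40/1.55 ≈ 25.8

Execution time T = (100000 × 1.55) × (1/(40 × 10^6)) = (15.5/4) × 10^(-3) s = 3.875 × 10^(-3) s = 3.875 ms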
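The calculation can be reproduced with a few lines of code. This is a sketch under the stated data: the instruction mix and the 40 MHz clock come from the question, while the function name and the tuple layout are illustrative.

```python
def performance(mix, clock_hz):
    """mix is a list of (instruction_count, cycles_per_instruction) pairs."""
    instruction_count = sum(count for count, _ in mix)
    total_cycles = sum(count * cycles for count, cycles in mix)
    cpi = total_cycles / instruction_count   # effective CPI
    mips = clock_hz / (cpi * 1e6)            # MIPS = f(MHz) / CPI
    seconds = total_cycles / clock_hz        # T = IC * CPI / f
    return cpi, mips, seconds


mix = [(45_000, 1), (32_000, 2), (15_000, 2), (8_000, 2)]
cpi, mips, seconds = performance(mix, 40e6)
print(f"CPI = {cpi:.2f}, MIPS = {mips:.1f}, T = {seconds * 1e3:.3f} ms")
```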

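(7) Suppose the execution of an instruction is divided into three stages, "fetch", "decode", and "execute", which take t, 2t, and 3t respectively. Write the expression for the time required to execute n instructions consecutively in each of the following cases.

<1> Sequential execution: T = (t + 2t + 3t) × n = 6nt

<2> Only "fetch" and "execute" overlap: T = 6t + 5t(n-1) = (5n+1)t

<3> "Fetch", "decode", and "execute" all overlap: T = 6t + 3t(n-1) = (3n+3)t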
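The three expressions can be cross-checked by stepping through the schedule one instruction at a time. The sketch below makes several assumptions: n is at least 1, in the second case the fetch of the next instruction starts together with the execute stage of the current one, and in the third case each stage handles one instruction at a time with results buffered between stages. The function names are hypothetical.

```python
def sequential_time(n, fetch, decode, execute):
    """No overlap: each instruction runs all three stages before the next starts."""
    return n * (fetch + decode + execute)


def fetch_execute_overlap_time(n, fetch, decode, execute):
    """The fetch of instruction k+1 overlaps only with the execute of instruction k."""
    execute_start = fetch + decode
    execute_end = execute_start + execute
    for _ in range(n - 1):
        fetch_end = execute_start + fetch           # next fetch begins with current execute
        decode_start = max(fetch_end, execute_end)  # decode waits for execute to finish
        execute_start = decode_start + decode
        execute_end = execute_start + execute
    return execute_end


def full_overlap_time(n, fetch, decode, execute):
    """All three stages overlap; each stage is busy with one instruction at a time."""
    fetch_end = decode_end = execute_end = 0
    for _ in range(n):
        fetch_end = fetch_end + fetch
        decode_end = max(fetch_end, decode_end) + decode
        execute_end = max(decode_end, execute_end) + execute
    return execute_end


# Stage times in units of t: fetch = 1, decode = 2, execute = 3.
for n in (1, 2, 10):
    print(n, sequential_time(n, 1, 2, 3),
          fetch_execute_overlap_time(n, 1, 2, 3),
          full_overlap_time(n, 1, 2, 3))
```

With stage times 1, 2, and 3, the three functions agree with 6n, 5n+1, and 3n+3 respectively.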

 


Origin blog.csdn.net/qq_35789421/article/details/115018114