[Principles of Computer Composition] General Review Notes (Part 1)

Special statement:
This article is for reference only. Parts of it come from AI summaries, material collected from the Internet, and personal practice. If any information is incorrect, readers are welcome to point it out. This article is for learning and exchange only, not for any commercial purpose.

Chapter 1 Computer System Overview

1.1 Von Neumann computer structure

Standard definition: Von Neumann computer architecture is a computer architecture in which instructions and data are stored in binary form in memory, and the CPU distinguishes instructions and data through different stages of the instruction cycle.

Popular explanation: A von Neumann computer is like a factory. There is an operating manual (instructions) in the factory, and raw materials and finished products (data) are placed in a large warehouse (memory). The factory manager (CPU) follows the steps in the operation manual to take out the raw materials from the warehouse, process them, and finally put the finished products back into the warehouse.

Instruction Cycle
The instruction cycle is the complete process that a computer goes through to execute an instruction. It usually includes stages such as Fetch, Decode, Execute, Memory Access and Write-back. During the instruction cycle, the CPU obtains instructions from the memory and performs specific operations at each stage, including obtaining data, processing data, etc.

In this case, the CPU can distinguish whether the current operation is an instruction or data based on the different stages of the instruction cycle. Typically, during the fetch phase, the CPU fetches the next instruction from memory; during the execution phase, the CPU may need to fetch data from memory to execute the instruction operation.
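
To make these stages concrete, here is a minimal sketch in Python (not from the original notes) of a fetch-decode-execute loop over a single memory that holds both instructions and data; the four-opcode "ISA" (LOAD/ADD/STORE/HALT) is invented purely for illustration.

```python
# Toy illustration of the instruction cycle: the same memory array holds
# both instructions and data; what makes a word an "instruction" is only
# that the CPU reads it during the fetch phase.
memory = [
    ("LOAD", 4),    # address 0: load memory[4] into the accumulator
    ("ADD", 5),     # address 1: add memory[5] to the accumulator
    ("STORE", 6),   # address 2: write the accumulator to memory[6]
    ("HALT", None), # address 3: stop
    7, 35, 0,       # addresses 4-6: data
]

pc, acc = 0, 0
while True:
    opcode, operand = memory[pc]   # fetch (and trivially decode)
    pc += 1
    if opcode == "LOAD":
        acc = memory[operand]      # memory-access stage reads data
    elif opcode == "ADD":
        acc += memory[operand]     # execute stage uses the ALU
    elif opcode == "STORE":
        memory[operand] = acc      # write-back to memory
    elif opcode == "HALT":
        break

print(memory[6])  # 42
```

Note that address 4 is read as data here only because the LOAD instruction's execution reaches it; during fetch the same array positions are treated as instructions.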

Control unit (Control Unit)
The control unit is an important component of the computer and is responsible for commanding and coordinating the operation of various components within the computer. It mainly performs operations such as instruction decoding (Decode) and execution (Execute) in the instruction cycle. The control unit interprets the instruction, determines the instruction operation type, and sends corresponding control signals to other components to ensure that the instruction can be executed correctly. The control unit is also responsible for coordinating various execution units, including arithmetic logic units (ALU) and registers, etc., to ensure that instructions are executed in the correct order and time.

Memory Unit (Memory Unit)
Memory unit is a component in a computer used to store data and instructions. It is divided into main memory (RAM) and secondary memory (such as hard drive), etc. Primary memory is used for temporary storage of programs and data, while secondary memory is used for long-term storage of data and programs. The memory unit is used by the CPU to read and write data, instructions and other information, as well as temporarily save intermediate calculation results, etc.

1.2 High-level languages and machine-level object code

Standard definition: High-level language source programs need to be converted into machine-level object code through a compiler, while assembly language source programs are translated into machine language programs through an assembler.

Popular explanation: High-level language is like the everyday language that people use, while machine-level object code is the language that computers can understand. The compiler is like a translator, responsible for translating articles written by people in high-level languages into machine language that the computer can execute.
This question involves the process of converting high-level language source programs into machine-level object code. The following is an explanation of relevant professional terms:

Compiler (Compiler)
A compiler is a software tool that translates high-level language source code into target code all at once. It converts the entire high-level language program source code into equivalent machine language object code. The compiler first performs lexical analysis, syntax analysis, semantic analysis and other operations, and then generates the corresponding target code. This object code can be saved in a file and later executed without having to translate the source code again.

Assembler (Assembler)
An assembler is a translation program that translates assembly language source programs into machine language programs. Similar to a compiler, an assembler converts code into machine-level object code; the difference is that an assembler targets assembly language rather than a high-level language. It translates the symbolic instructions of assembly language into the corresponding machine codes and generates an executable machine language program.

Linker
The linker is used to combine modules in multiple object code files or library files into a single executable file. It connects symbols (such as functions or variables) referenced in various modules and resolves symbol reference relationships to create a complete executable program. This process includes steps such as address resolution, symbol resolution, and relocation.

Interpreter (Interpreter)
An interpreter is software that translates one statement of the source program at a time into the corresponding machine code and executes it immediately. It does not generate an object code file; instead it reads the source code line by line and interprets it for execution. An interpreter translates and executes statements one by one, unlike a compiler, which translates the entire program into object code in one go.

In the question, the correct answer is C, which is a compiler, because a compiler is software that translates high-level language source code into target code all at once.

1.3 Types of programs executed by computer hardware

Standard definition: Computer hardware can directly execute machine language programs, while assembly language programs need to be assembled before they can be executed.

Popular explanation: A computer is like a smart worker. The language it can understand is machine language, and assembly language is like a simple language used by workers to communicate, which needs to go through a translation process.

  • Machine language program (I):

    • Can be executed directly by computer hardware.
    • Machine language is binary code that computers can understand and execute.
  • Assembly language program (II):

    • An assembly process is required to convert assembly code into machine language.
    • Assembly language is a lower-level programming language that is more human-readable.
  • Hardware description language program (III):

    • A language that is not directly executed by computer hardware.
    • It is used to describe the hardware circuit structure and needs to be converted into an actual circuit through synthesis tools.

Therefore, what the computer hardware can directly execute is the machine language program (I), and the answer is A.

1.4 Basic ideas of von Neumann computers

Standard definition: The basic ideas of von Neumann computers include the following: the functions of a program are realized by the central processor executing instructions; instructions and data are expressed in binary, with no difference in form; instructions are accessed by address, and data are likewise stored in memory and (except for immediate operands given directly in the instruction) accessed by address; before program execution, instructions and data must be stored in memory in advance.

Popular explanation: A von Neumann computer is like an organized kitchen. The recipes (instructions) tell the chef (CPU) how to cook food. The ingredients and finished products (data) are stored in the refrigerator (memory) in the same way. Before cooking, all ingredients and recipes need to be prepared in advance.

  • Basic ideas of von Neumann computers:

    • Von Neumann architecture computers include input devices, output devices, memory, arithmetic units, and controllers.
    • Functions are implemented by the central processing unit (arithmetic unit and control unit) executing instructions.
  • Instructions and data representation:

    • Binary representation: Both instructions and data are expressed in binary numbers, with no difference in form.
    • Memory unification: instructions and data are stored in the memory with equal status.
  • Instruction access and data storage:

    • Instructions are accessed by address: The program executes instructions through the CPU, and instructions are accessed by address.
    • Data storage: Data is stored in the memory in binary form. Except for immediate addressing, the data is stored in the memory.
  • Storage before program execution:

    • Pre-storage: Before program execution, instructions and data need to be pre-stored in memory.

1.5 Conversion process of high-level language programs

Standard definition: The process of converting high-level language source programs into executable object files includes preprocessing, compilation, assembly and linking.

Popular explanation: Just like cooking a delicious dish, the high-level language source program is like a recipe, and the conversion process is to translate the recipe into steps that the chef can understand. Preprocessing is like preparing ingredients, compilation is like cooking steps, assembly is like putting ingredients together, and linking is connecting each step into the final delicious dish.
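
As a sketch of the four stages, the commands below drive a typical GCC toolchain through each step separately; the file name hello.c is hypothetical, and this assumes gcc is available on the system.

```python
# Sketch of the preprocess -> compile -> assemble -> link pipeline using
# GCC's standard per-stage flags; "hello.c" is a hypothetical source file.
import subprocess

subprocess.run(["gcc", "-E", "hello.c", "-o", "hello.i"], check=True)  # preprocess
subprocess.run(["gcc", "-S", "hello.i", "-o", "hello.s"], check=True)  # compile to assembly
subprocess.run(["gcc", "-c", "hello.s", "-o", "hello.o"], check=True)  # assemble to object code
subprocess.run(["gcc", "hello.o", "-o", "hello"], check=True)          # link into an executable
```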

  • SRAM (static random access memory):

    • No periodic refresh is required.
    • Use flip-flops as storage units to keep information unchanged.
  • SDRAM (Synchronous Dynamic Random Access Memory):

    • Requires periodic refresh.
    • Because it stores bits as charge on capacitors, it must be refreshed periodically, otherwise the information is lost.
  • ROM (Read Only Memory):

    • No periodic refresh is required.
    • Usually used to store fixed data, read-only.
  • Flash memory:

    • No periodic refresh is required.
    • Uses non-volatile storage technology, suitable for applications such as flash memory.

1.6 Measures to improve program execution efficiency

Standard definition: Measures to improve program execution efficiency include increasing the CPU clock frequency, optimizing the data path structure, and compiling and optimizing the program.

Popular explanation: Just like improving factory production efficiency, increasing the CPU clock frequency makes the workers work faster, optimizing the data path structure improves the production line, and compiling and optimizing the program streamlines the production process so that products can be produced faster.

  • Increase the CPU clock frequency (main frequency):

    • Explanation: Increasing the CPU clock frequency can shorten the time of each clock cycle, making each execution step take less time, thereby speeding up program execution. This is a method of shortening program execution time by increasing the speed of the hardware.
  • Optimize data path structure:

    • Explanation: The data path implements the data exchange among the arithmetic units, registers, and other components within the CPU. By optimizing the data path structure, the throughput of the computer system can be improved and program execution can be accelerated.
  • Compile and optimize the program:

    • Explanation: Compilation optimization is to optimize the program when converting a high-level language program into a sequence of machine instructions. By getting a better sequence of instructions, the program execution time can be shortened.

1.7 Floating point operation speed index

Standard definition: Floating point operation speed is usually described by MFLOPS (millions of floating point operations per second).

Popular explanation: Just like a mathematical genius can solve a large number of mathematical problems in a short time, MFLOPS describes how many complex floating-point number operations a computer can complete in one second. It is a standard for measuring the speed of a computer in processing mathematical problems.

  • MIPS (Millions of instructions per second):

    • Explanation: Measures how many million instructions are executed per second, suitable for measuring the performance of scalar machines.
  • CPI (Cycles Per Instruction):

    • Explanation: The average number of clock cycles per instruction.
  • IPC (instructions per clock cycle):

    • Explanation: IPC is the reciprocal of CPI, which is the number of instructions executed per clock cycle.
  • MFLOPS (million floating point operations per second):

    • Explanation: Measures how many millions of floating-point operations are performed per second. It is used to describe the speed of floating-point operations and is suitable for measuring the performance of vector machines.
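
A small worked sketch tying the four metrics above together; the program characteristics (instruction count, floating-point operation count, run time, clock frequency) are invented for illustration.

```python
# Hypothetical program: 2e8 instructions, 5e7 floating-point operations,
# running for 0.1 s on a 1 GHz machine.
freq = 1e9          # clock frequency, Hz
instructions = 2e8
fp_ops = 5e7
time_s = 0.1

cpi = freq * time_s / instructions      # clock cycles per instruction
ipc = 1 / cpi                           # instructions per clock cycle
mips = instructions / time_s / 1e6      # millions of instructions per second
mflops = fp_ops / time_s / 1e6          # millions of FP operations per second

print(cpi, ipc, mips, mflops)  # 0.5 2.0 2000.0 500.0
```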

1.8 The impact of increasing computer speed on program execution time

Standard definition: Increasing computer speed, especially increasing CPU clock frequency, can effectively reduce program execution time.

Popular explanation: Just like increasing the speed of factory machines, increasing computer speed means making the processor inside the computer work faster, thereby speeding up program execution, just like the speed at which a factory produces products.

  • Calculation of new running time after increasing CPU speed by 50%:

    • New CPU time: Original CPU time / 1.5 (50% faster)
    • New I/O time: unchanged
  • Calculate the new total running time:

    • New total run time = new CPU time + new I/O time
  • calculation process:

    • New CPU time = 90 / 1.5 = 60s
    • New total run time = 60 + 10 = 70s
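
The arithmetic in this list can be checked with a few lines of Python, using the 90 s CPU / 10 s I/O split assumed above.

```python
# Original run: 90 s of CPU time plus 10 s of I/O time.
cpu_time, io_time = 90, 10

new_cpu_time = cpu_time / 1.5          # CPU becomes 50% faster
new_total = new_cpu_time + io_time     # I/O time is unchanged

print(new_cpu_time, new_total)  # 60.0 70.0
```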

1.9 Calculation of MIPS number

Standard definition: MIPS (millions of instructions per second) is a measure of computer performance, which can be obtained through the computer's main frequency and CPI (number of clock cycles per instruction).

Popular explanation: Just like measuring the speed of a car, MIPS indicates how many millions of instructions the computer can execute per second. The main frequency is like the speed of the car, while CPI is the time spent on each instruction; dividing the main frequency by the CPI gives the MIPS number.

  • CPI(Cycles Per Instruction):

    • Explanation: The average number of clock cycles required to execute each instruction. For the benchmark program, the CPI can be obtained by weighting the CPI of each instruction, that is, CPI = Σ (the proportion of each type of instruction × the CPI of the instruction).
  • MIPS(Million Instructions Per Second):

    • Explanation: Millions of instructions executed per second. The calculation formula is MIPS = main frequency / (CPI × 10^6). The main frequency is the computer's clock frequency, and CPI is the average number of clock cycles per instruction.

In this question, the MIPS number of the computer is obtained from the CPI and the main frequency, that is, MIPS = main frequency / (CPI × 10^6).
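
A minimal sketch of this formula; the 2 GHz clock and the three-way instruction mix are assumed values for illustration.

```python
# MIPS = clock frequency / (CPI * 10^6).
# Hypothetical machine: 2 GHz clock; instruction mix of 60% type-A
# (CPI 1), 30% type-B (CPI 2), 10% type-C (CPI 4).
freq = 2e9
cpi = 0.6 * 1 + 0.3 * 2 + 0.1 * 4   # weighted average CPI = 1.6
mips = freq / (cpi * 1e6)

print(cpi, mips)  # 1.6 1250.0
```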

1.10 Factors affecting program execution time

Standard definition: Program execution time is affected by CPU clock frequency, number of instructions and CPI.

Popular explanation: Just like making a travel plan, the program execution time is jointly determined by the computer's clock frequency (how fast you move), the number of instructions (how many places you pass through), and the CPI (the time it takes to reach each place). Optimizing these factors helps the computer complete tasks faster, just like planning a trip properly.

  • The number of instructions is reduced to 70% of the original:

    • Explanation: After compilation and optimization, the number of instructions required for the execution of program P is reduced to 70% of the original, that is, the new number of instructions is 0.7x.
  • CPI increased to 1.2 times its original value:

    • Explanation: After compilation and optimization, the average number of clock cycles per instruction increases to 1.2 times the original. Since the original run takes 20 s, the original CPI satisfies 20 = x × CPI / f, i.e., CPI = 20f/x, so the new CPI is 1.2 × 20f/x = 24f/x.
  • Execution time calculation:

    • Original execution time: The execution time of the original program P on machine M is 20s.
    • New execution time: (new instruction count × new CPI) / f = (0.7x × 24f/x) / f = 0.7 × 24 = 16.8 s (checked in the sketch below).
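
The sketch below checks the 16.8 s result numerically; the concrete values chosen for x and f are arbitrary, since they cancel out.

```python
# Symbolic check of the 16.8 s result: pick any instruction count x and
# clock frequency f; the answer is independent of both.
x, f = 1e9, 2e9                 # arbitrary values
old_cpi = 20 * f / x            # from 20 s = x * CPI / f
new_time = (0.7 * x) * (1.2 * old_cpi) / f

print(new_time)  # 16.8
```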

1.11 Machine word length and number of parts

Standard definition: Machine word length refers to the width of the data path within the CPU used for integer operations.

Popular explanation: Just like the width of a conveyor belt, the machine word length determines the number of data bits that can be processed simultaneously inside the CPU. If the machine word length is 32 bits, just like the conveyor belt width is 32 inches, the data path inside the CPU can also process 32 bits of data at the same time.

  • CPI(Cycles Per Instruction):

    • Explanation: The average number of clock cycles required to execute each instruction. In this question, the average CPI on M1 is 2 and the average CPI on M2 is 1.
  • Main frequency:

    • Explanation: The clock frequency of the computer, the main frequency of M1 is 1.5GHz, and the main frequency of M2 is 1.2GHz.
  • Running time calculation:

    • Running time on M1: Number of instructions × CPI / main frequency = number of instructions × 2 / 1.5
    • Running time on M2: Number of instructions × CPI / main frequency = number of instructions × 1 / 1.2
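
A quick numeric comparison of the two formulas in the list above; the instruction count n is arbitrary because it cancels out of the ratio.

```python
# Relative run time on M1 vs M2 for the same instruction count n.
n = 1e9                          # arbitrary instruction count
t_m1 = n * 2 / 1.5e9             # CPI 2 at 1.5 GHz
t_m2 = n * 1 / 1.2e9             # CPI 1 at 1.2 GHz

print(t_m1, t_m2, t_m1 / t_m2)   # ~1.333 s, ~0.833 s, M1 is 1.6x slower
```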

1.12 Supercomputer performance indicators

Standard definition: Supercomputer performance is typically measured in PFLOPS (peta floating-point operations per second, i.e., 10^15 floating-point operations per second).

Popular explanation: Just like measuring the speed of a car, PFLOPS indicates how many quadrillions of floating-point operations a supercomputer can perform per second and is a standard for evaluating its processing power. A powerful supercomputer is like a speeding supercar.

  • Machine word length (Word Length):

    • Explanation: Refers to the width of the data path within the CPU used for integer operations. It equals the width of the integer arithmetic unit and the general registers inside the CPU.
  • ALU(Arithmetic Logic Unit):

    • Explanation: Arithmetic logic unit, used to perform arithmetic and logical operations. Its number of bits may or may not be the same as the machine word length.
  • Instruction register:

    • Explanation: It is usually used to store instructions that are being executed or are about to be executed. The number of bits is not necessarily the same as the machine word length.
  • General register:

    • Explanation: The register used to store temporary data may have the same number of bits as the machine word length or may be different.
  • Floating point registers:

    • Explanation: The register used to store floating point numbers may or may not have the same number of bits as the machine word length.

1.13 Program performance analysis and optimization

Standard definition: By compiling and optimizing the program, the execution efficiency of the program can be improved and the execution time can be reduced.

Popular explanation: Just like improving learning efficiency, compiling and optimizing a program means making adjustments to the code so that the computer can understand and execute it faster, much as students master knowledge faster by optimizing their study methods.

  • PFLOPS(PetaFLOPS):
    • Explanation: A unit for measuring supercomputer performance, the number of floating-point operations per second. 1 PFLOPS equals 10^15 floating point operations per second.

1.14 Comparison of program running time on different computers

Standard definition: Different computer clock speeds and architectures have an impact on program execution time. The running performance of programs on different computers can be evaluated by comparing the average CPI and clock speed.

Popular explanation: Just like different vehicles driving on different roads, different computers have different running speeds under different tasks. By comparing a computer's clock speed and the average number of clock cycles required for each instruction, we can learn how different computers perform, much like comparing the speed of different vehicles on different roads.

  • Main frequency:

    • Explanation: The clock frequency of the computer. In this question, the main frequency is 1GHz, which is 1×10^9 clock cycles per second.
  • CPI(Cycles Per Instruction):

    • Explanation: The average number of clock cycles required to execute each instruction. In this question, 80% of the instructions require an average of 1 clock cycle to execute, and 20% of the instructions require an average of 10 clock cycles to execute. Therefore, CPI = 80% × 1 + 20% × 10 = 2.8.
  • CPU execution time:

    • Explanation: Refers to the time required for the program to run on the CPU. In this question, program P executed a total of 10,000 instructions, and each instruction required an average of 2.8 clock cycles. Therefore, the CPU execution time = (10000 × 2.8) / 10^9 = 28 × 10^-6 seconds = 28μs.
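
The CPI and execution-time arithmetic above can be verified directly:

```python
freq = 1e9                       # 1 GHz
cpi = 0.8 * 1 + 0.2 * 10         # weighted CPI = 2.8
instructions = 10_000
exec_time = instructions * cpi / freq

print(cpi, exec_time)  # 2.8 2.8e-05  (i.e., 28 microseconds)
```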

Chapter 2 Bus

2.1 Bus bandwidth calculation

  • Definition: Bus bandwidth refers to the number of bits of data transmitted on the bus per unit time. It is usually measured by the number of bytes of information transmitted per second, and the unit is B/s.

  • Metaphor: Bus bandwidth is like a water pipe. It represents how much water (data) this water pipe can transport per second.

  • Bus cycle:

    • Explanation: In a computer system, a bus cycle is the time it takes for the bus to complete a data transfer cycle. In this question, one bus cycle takes 2 clock cycles.
  • Clock period:

    • Explanation: In computers, the clock cycle is determined by the computer's clock frequency and represents the length of a clock pulse. In this question, the bus clock frequency is 10MHz, that is, there are 10^7 clock cycles per second.
  • Bus bandwidth:

    • Explanation: The number of bits of data transmitted on the bus per unit time, usually measured by the number of bytes of information transmitted per second, in B/s. In this question, 4 bytes are transmitted per bus cycle, and one bus cycle takes 2 clock cycles at 10 MHz, so the bus bandwidth is 4 B / (2 × 1/10 MHz) = 4 B / 200 ns = 20 MB/s (see the sketch below).
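
A minimal sketch of the bandwidth calculation above:

```python
clock_freq = 10e6            # 10 MHz bus clock
cycles_per_transfer = 2      # one bus cycle = 2 clock cycles
bytes_per_transfer = 4

bus_cycle_s = cycles_per_transfer / clock_freq      # 200 ns
bandwidth = bytes_per_transfer / bus_cycle_s        # bytes per second

print(bandwidth / 1e6)  # 20.0 MB/s
```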

2.2 Bus standards

  • Definition: A bus standard is a specification for connecting various components inside a computer. Common ones include PCI, EISA, ISA, PCI-Express, etc.

  • Metaphor: The bus standard is like the plug used by each device in the computer. Different devices need to match the corresponding plug to connect.

  • PCI(Peripheral Component Interconnect):

    • Explanation: PCI is a standard bus used to connect expansion devices within a computer. It defines the physical and electrical characteristics of slots and cards, allowing a variety of devices, such as graphics cards, network cards, and storage controllers, to be connected to the computer motherboard.
  • CRT:

    • Explanation: CRT, short for Cathode Ray Tube, is a traditional display technology that uses electron beams to generate images on a fluorescent screen.
  • USB(Universal Serial Bus):

    • Explanation: USB is a universal external device connection standard used to connect computers with various external devices, such as printers, keyboards, mice, and removable storage devices.
  • EISA(Extended Industry Standard Architecture):

    • Explanation: EISA is a bus standard for connecting expansion devices within a computer. It is an extended version of ISA that provides higher performance and more slots for applications that require more bandwidth and larger system capacity.
  • ISA(Industry Standard Architecture):

    • Explanation: ISA is an early computer bus standard used to connect expansion devices inside the computer, such as plug-in cards. The ISA bus has been superseded by later standards, but has played an important role in history.
  • CPI(Cycles Per Instruction):

    • Explanation: CPI is a measure of computer instruction execution efficiency, indicating the average number of clock cycles required to execute each instruction.
  • VESA(Video Electronics Standards Association):

    • Explanation: VESA is an organization dedicated to promoting and developing video electronics standards. In the computer world, VESA specifications typically address display and graphics standards, ensuring compatibility and interoperability.
  • SCSI(Small Computer System Interface):

    • Explanation: SCSI is an interface standard used to connect computers and external devices (such as hard drives, printers), providing high performance and reliability.
  • RAM(Random Access Memory):

    • Explanation: RAM is a type of computer memory used for temporary storage of data and programs that are in use. RAM allows fast reads and writes, but stored information is lost when power is lost.
  • MIPS(Million Instructions Per Second):

    • Explanation: MIPS is a measure of computer performance and represents the number of millions of instructions executed per second.

2.3 Information transmission on data lines

  • Transmitting information: Instructions, operands, and interrupt type numbers may be transmitted on the data lines of the system bus.

  • Metaphor: A data line is like a highway inside a computer, through which different types of information can be transmitted.

  • instruction:

    • Explanation: In a computer system, an instruction is information passed from the control unit to the execution unit to perform a specific operation. On the bus, instructions are transmitted over data lines.
  • Operands:

    • Explanation: Operands refer to data in computer instructions used to perform arithmetic or logical operations. On the bus, operands are also transmitted via data lines.
  • Handshake (response) signal:

    • Explanation: The handshake signal usually refers to the confirmation signal during the communication process, which is used to ensure the correct transmission of data. On the data line of the system bus, the handshake signal is not a direct data transmission, but a control signal used to confirm the status of receiving or sending data.
  • Interrupt type number:

    • Explanation: The interrupt type number is information that identifies the type of interrupt when an interrupt occurs. It is usually transferred as data onto the data bus to inform the interrupt handler about the details of the interrupt.

2.4 Data transmission on synchronous bus

  • Features: A synchronous bus uses the same clock signal for communication, but a bus transaction may not be completed within one clock cycle.

  • Metaphor: A synchronous bus communicates the way a band plays at a concert: all the members follow the rhythm of the same conductor, but not every instrument plays on every beat.

  • Synchronous bus:

    • Explanation: A synchronous bus is a bus that synchronizes various components in the system through clock signals. It uses a unified clock signal to coordinate the transmission of data and ensure that the operations of various components are synchronized.
  • Clock frequency:

    • Explanation: Clock frequency refers to the number of oscillations of the clock signal per unit time. For a synchronous bus, the clock frequency determines the speed of data transfer.
  • Width is 32 bits:

    • Explanation: The width of the bus represents the number of bits transferred per clock cycle. Here, the bus width is 32 bits, that is, 32 bits of data can be transferred per clock cycle.
  • Address/data line multiplexing:

    • Explanation: Address/data line multiplexing means that on the bus, the same line can transmit both address information and data information. This technology can reduce the number of buses and improve the efficiency of the system.
  • Burst transmission method:

    • Explanation: The burst transmission method means that during data transmission, data of a series of adjacent addresses can be transmitted continuously without the need for independent address transmission each time.

According to the information in the question, the clock frequency of the bus is 100 MHz (10 ns per clock cycle), the width is 32 bits, the address/data lines are multiplexed, and each address or data transmission takes one clock cycle. Therefore, transmitting 128 bits of data takes 40 ns (4 clock cycles), plus 10 ns (1 clock cycle) to transmit the address, for a total of 50 ns (see the sketch below).
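
The same 50 ns figure, computed step by step:

```python
clock_freq = 100e6                   # 100 MHz -> 10 ns per clock cycle
cycle_ns = 1e9 / clock_freq
addr_cycles = 1                      # multiplexed lines: 1 cycle for the address
data_cycles = 128 // 32              # 128 bits over a 32-bit bus = 4 cycles

total_ns = (addr_cycles + data_cycles) * cycle_ns
print(total_ns)  # 50.0
```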

2.5 USB bus characteristics

  • Features: USB supports plug-and-play, hot-swap, and connects multiple peripherals. It is a serial communication bus.

  • Metaphor: USB is like a multi-function power strip on a computer, which can easily connect various peripherals and supports plug-and-play.

  • USB (Universal Serial Bus):

    • Explanation: USB is a serial bus standard used to connect computers and external devices. It has the characteristics of plug-and-play, hot-swappable, and can be cascaded to connect multiple peripherals.
  • Plug and Play and Hot Swap:

    • Explanation: Plug and play means that after plugging in the device, the system can automatically recognize and configure the device without restarting. Hot plugging means that devices can be plugged in or out while the system is running without affecting the normal operation of the system.
  • Connect multiple peripherals via cascading:

    • Explanation: USB supports connecting multiple external devices in a cascading manner to form a daisy chain connection structure. This allows users to conveniently connect multiple devices without the need for expansion slots.
  • Communication bus, connecting different peripherals:

    • Explanation: USB is a communication bus that can connect various types of external devices, including printers, keyboards, mice, cameras, etc.
  • It can transmit 2 bits of data at the same time, with high data transmission rate:

    • Explanation: This option describes an error. USB is a serial bus that transfers one bit of data per clock cycle, rather than two bits at once. The maximum transfer rate of USB 2.0 is 480Mb/s.

2.6 Device interconnection interface standards

  • Connection interface standards: Common device interconnection interface standards include PCI, USB, AGP, PCI-Express, etc.

  • Metaphor: Device interconnection interface standards are like multiple doors on a computer. Each door corresponds to a connection method, and different devices enter the computer through the corresponding door.

  • PCI(Peripheral Component Interconnect):

    • Explanation: PCI is a standard interface used to connect internal computer devices (such as network cards, sound cards, etc.) and the motherboard. It is a universal, parallel interface standard.
  • USB(Universal Serial Bus):

    • Explanation: USB is a serial bus standard used to connect computers and external devices. It supports plug-and-play, hot-swappable, and connects various types of external devices such as printers, keyboards, mice, etc.
  • AGP(Accelerated Graphics Port):

    • Explanation: AGP is an interface standard specifically used to connect graphics cards (graphics cards) and motherboards to increase the speed of graphics data transmission.
  • PCI-Express(Peripheral Component Interconnect Express):

    • Explanation: PCI-Express is an evolution of the PCI standard and is used for high-speed data transmission. It is often used to connect graphics cards, hard disks, network cards, etc.

2.7 Bus frequency and bandwidth calculation

  • Calculation method: bus bandwidth = (bus width in bytes) × clock frequency × 2 (data is transferred on both the rising and falling edges of each clock).

  • Metaphor: The calculation of bus frequency and bandwidth is like calculating the speed of vehicles passing in a lane, which depends on the width of each vehicle and the frequency of passing.
    Bus bandwidth:

  • Explanation: Bus bandwidth refers to the number of data bits transmitted by the bus per unit time, usually expressed as a bit rate or byte rate. In this context, it refers to the amount of data transferred over the bus per second.

  • Calculation: There are 32 data lines (4 bytes per transfer), data is transmitted twice per clock cycle, and the bus clock frequency is 66 MHz. Therefore, the maximum amount of data transmitted per second is 4 bytes/transfer × 2 transfers/cycle × 66M cycles/second = 528 MB/s (see the sketch below).
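
The 528 MB/s figure, computed step by step:

```python
data_lines = 32
bytes_per_transfer = data_lines // 8     # 4 bytes
transfers_per_cycle = 2                  # rising and falling edge
clock_freq = 66e6                        # 66 MHz

bandwidth = bytes_per_transfer * transfers_per_cycle * clock_freq
print(bandwidth / 1e6)  # 528.0 MB/s
```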

2.8 Bus transaction mode

  • Burst transfer: In a bus transaction, the master device gives a first address, and the slave device can read or write multiple data continuously.
  • Metaphor: Burst transmission is like shopping in a supermarket. You can handle multiple items in sequence at one time.
    Parallel Transfer :

Parallel transfer is a method of data transfer in which multiple data bits are transferred between devices at the same time. In parallel transmission, multiple bits of data can be transmitted per clock cycle. Compared with serial transmission, it can transmit more information at the same time. Parallel transmission is often used for high-speed data transmission between internal buses or devices. For example, in computer systems, data between memory and processor are usually transmitted in parallel.

Serial Transfer :

Serial transmission is a method of data transmission in which the binary code of data is transmitted bit by bit in time sequence, one bit per clock cycle. Compared with parallel transmission, serial transmission reduces the number and complexity of cables, but the amount of information transmitted at the same time is smaller. Serial transmission is commonly used in long-distance communications, external device connections, and high-speed communications.

Burst Transfer :

Burst transfer is a bus transaction method in which the master device only needs to provide a first address, and the slave device can sequentially read or write data in several consecutive units starting from the first address. This method can effectively utilize the bus bandwidth, reduce address transmission overhead, and improve data transmission efficiency. Burst transfers are often used to transfer data between cache and memory, helping to improve system performance.

Synchronous Transfer :

Synchronous transmission means that during the data transmission process, the transmission process is controlled by a unified clock signal to ensure synchronization between sending and receiving. Synchronous transmission can coordinate data transmission at the sending and receiving ends by using a synchronous clock signal on the data line to ensure that they occur within the same time interval. This helps improve the stability and reliability of data transmission. In serial communication, synchronous transmission is a common method.

2.9 Bus timing

  • Asynchronous communication: In asynchronous communication, fully interlocked protocols are slower and non-interlocked protocols are less reliable.
  • Metaphor: Asynchronous communication is like when communicating, some people may need to wait for the other party to respond, while some people do not wait for the other party to respond and proceed directly to the next step.
    Bus timing :

Bus timing means that in a computer system, data transmission between devices needs to be carried out in a certain time sequence to ensure reliable transmission of data. In computers, bus timing involves concepts such as asynchronous communication, synchronous communication, and interlocking protocols. The following is a brief explanation of related professional terms:

  • Asynchronous Communication : Asynchronous communication is a communication method that does not use clock signals for synchronization. The start and end of communication are identified by specific start and stop bits, and the data transfer rate is not controlled by the global clock.

  • Synchronous Communication : Synchronous communication is a communication method that uses a unified clock signal for synchronization. Both the sending and receiving of communications rely on a global clock signal to ensure data transmission between devices within the same clock cycle.

  • Interlocking Protocol (Handshaking Protocol): The interlocking protocol is a protocol that ensures that both parties in communication proceed simultaneously. At different stages of communication, devices interact with each other through handshake signals to ensure the correct transmission of data. In asynchronous communication, the fully interlocked protocol requires that the sending and receiving of communications be carried out in a certain order and timing, while the non-interlocked protocol is less reliable because there are no strict interlocking requirements.

  • Semi-synchronous communication method : Semi-synchronous communication method combines the characteristics of asynchronous and synchronous communication. The start and end of communication are marked by asynchronous start and stop bits, but the data transfer rate is controlled by a synchronous clock. In semi-synchronous communication, the sampling of handshake signals is controlled by a synchronous clock.

The choice of bus timing has an important impact on the performance and reliability of a computer system and therefore requires careful consideration when designing and configuring computer systems.

2.10 Bus design principles

  • Principles: Parallel buses and serial buses, signal line multiplexing, burst transmission and separate transaction communication methods.

  • Metaphor: Bus design principles are like architectural design principles, taking into account different needs and materials, choosing appropriate design solutions.
    Bus design :

  • Parallel bus : A parallel bus refers to a bus that transmits multiple bits of data at the same time. Each wire can carry one binary bit and is often used to transmit large amounts of data at high speed over short distances. While faster in some cases, they may face issues such as mutual interference at high clock frequencies.

  • Serial bus : A serial bus is a bus that transfers data bit by bit, one binary bit at a time. Since the serial bus has fewer wires, it is easier to control interference between lines, so it can provide higher transmission rates at high clock frequencies.

  • Signal line multiplexing technology : Signal line multiplexing refers to a technology that can use fewer lines to transmit more information by transmitting different information at different times to save space and cost.

  • Burst transmission mode : Burst transmission can transmit data at multiple consecutive storage addresses in one bus transaction: one address and a batch of data at consecutive addresses are transmitted at a time, which helps improve the bus data transmission rate.

  • Separate transaction communication method : Separate transaction communication is a bus reuse method. By separating different communication transactions, the bus utilization can be improved. This approach allows multiple transactions to be transferred by efficiently utilizing time slices on the bus.

In bus design, choosing the appropriate transmission method and technology depends on the system's needs and design goals.

2.11 Multi-bus structure

  • Connection principle: The bus close to the CPU is faster, and the memory bus supports burst transmission.

  • Metaphor: A multi-bus structure is like urban transportation planning. To improve efficiency, roads near the city center need to be wider, and highways need to support continuous high-speed circulation.
    Multi-bus structure :

  • Buses closer to the CPU are faster : In computer architecture, to improve performance, a faster bus is usually connected to the central processing unit (CPU). This ensures faster data transfer between high-speed devices and the CPU.

  • The memory bus can support burst transfer mode : The memory bus supports one transfer mode, namely burst transfer. This approach allows multiple data units to be transferred continuously in a single transaction, thereby improving the efficiency of memory reading and writing.

  • Buses must be connected through bridges : In computer systems, buses of different types or speeds usually need to be connected through bridges. A bridge is a device that acts as an interface between different buses, allowing them to work together.

  • PCI-Express×16 adopts serial transmission mode : PCI-Express×16 is a high-speed serial bus standard, where “×16” indicates that it provides 16 lanes. PCIe uses serial transfer, moving one bit at a time per lane, but by using multiple lanes at high frequencies it provides very high data transfer rates.

2.12 Improvement of bus transmission rate

  • Factors: The impact of bus width and operating frequency on the transmission rate, the improvement effect of burst transmission and address/data line multiplexing.

  • Metaphor: Improving the bus transmission rate is like increasing the water delivery speed of a water pipe. It can be achieved by increasing the diameter of the pipe and increasing the water pressure.
    Explanation of professional terms related to the improvement of synchronous bus data transmission rate :

  • Bus width (I) : Bus width refers to the number of data bits that can be transmitted at the same time. Increasing the bus width allows more data to be transferred at one time, thereby increasing the data transfer rate.

  • Bus operating frequency (II) : The bus operating frequency refers to the number of data transmissions on the bus within unit time. Increasing the bus operating frequency means that more data can be transferred in the same time, thus helping to increase the data transfer rate.

  • Support burst transfer (III) : Burst transfer is a way of transmitting data with multiple consecutive storage addresses in one bus cycle. By supporting burst transmission, more data can be transmitted in a shorter time and the data transmission efficiency of the bus can be improved.

  • Address/data line multiplexing (IV) : Address/data line multiplexing refers to using a set of lines on the bus to transmit address and data at the same time. This can reduce the number of bus lines, but does not directly increase the transmission rate. It mainly helps reduce system cost and complexity.

2.13 Memory bus bandwidth calculation

  • Calculation method: total memory bus bandwidth = number of channels × data width × operating frequency.
  • Metaphor: Calculating memory bus bandwidth is like calculating the total water delivery volume of a water pipe, which depends on the number, diameter, and water flow rate of the water pipes.
  1. 3-channel memory bus : The memory bus in a computer is often divided into multiple channels, so that several independent data transfer paths can operate simultaneously. A 3-channel memory bus has three independent channels, which helps improve memory access performance.

  2. DDR3-1333 : DDR3 (Double Data Rate 3) is the third-generation double data rate synchronous dynamic random access memory standard. 1333 means the memory performs 1333 million data transfers per second (1333 MT/s).

  3. Bus width is 64 bits : refers to the number of data bits transmitted by the memory bus in each clock cycle. A 64-bit bus width means that 64 bits of data can be transferred per clock cycle.

  4. The total bandwidth of the memory bus : represents the amount of data transferred through the memory bus per unit time. The calculation is: number of channels × (bit width per channel / 8, converted to bytes) × transfer rate. In this example, the total bandwidth is 3 × (64 / 8) B × 1333M/s = 31992 MB/s, or about 32 GB/s.

These technical terms refer to aspects of computer hardware related to memory access and transfer rates.
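
The 32 GB/s figure can be checked as follows (DDR3-1333 is treated as 1333 million transfers per second):

```python
channels = 3
bits_per_channel = 64
transfers_per_s = 1333e6        # DDR3-1333: 1333 MT/s

bandwidth = channels * (bits_per_channel // 8) * transfers_per_s
print(bandwidth / 1e9)  # 31.992 GB/s, i.e. about 32 GB/s
```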

2.14 QPI bus bandwidth calculation

  • Features: The QPI bus is a point-to-point full-duplex synchronous serial bus that can transmit 20 bits of information simultaneously in each direction.
  • Metaphor: The characteristics of the QPI bus are like a pair of walkie-talkies, which can send and receive information at the same time.
  1. QPI bus : The QuickPath Interconnect (QPI) bus is a point-to-point full-duplex synchronous serial bus launched by Intel for connecting processors, memory and other related chips. It is a high-speed, low-latency bus architecture commonly used in servers and high-performance computing systems.

  2. Point-to-point : On a bus, point-to-point means communication between two nodes connected directly, rather than through a shared bus. QPI is a point-to-point connection that allows for more direct and efficient data transfer.

  3. Full-duplex : In full-duplex communication, a device can send and receive information at the same time without switching modes. The QPI bus supports full-duplex communication, which means connected devices can transmit data in both directions simultaneously.

  4. Synchronous Serial Bus : A synchronous serial bus is a bus that synchronizes data transmission by coordinating clock signals between sending and receiving devices. QPI is a synchronous serial bus whose data transmission is synchronized with a clock signal.

  5. Bus bandwidth : Bus bandwidth refers to the amount of data transmitted through the bus in unit time. It is usually measured in bytes per second (B/s) or gigabytes per second (GB/s). In this example, bus bandwidth is the amount of data transferred over the QPI bus, calculated as the number of transfers per second multiplied by the number of bits per transfer, divided by 8 to convert to bytes.

The above is an explanation of terms related to the QPI bus, which involves a high-speed bus for communication between processors in computer architecture.
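
A worked sketch under assumed figures: a transfer rate of 4.8 GT/s and 16 data bits out of the 20 bits per direction (both values are assumptions typical of exam problems, not given in the notes above):

```python
transfers_per_s = 4.8e9      # assumed transfer rate: 4.8 GT/s
data_bits = 16               # assumed: 16 of the 20 bits per direction carry data
directions = 2               # full duplex: both directions count

bandwidth = transfers_per_s * (data_bits / 8) * directions
print(bandwidth / 1e9)  # 19.2 GB/s
```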

2.15 Analysis of Bus Concept

  • Definition and role: A bus is a transmission medium for communication between two or more devices.
  • Metaphor: The bus is like a city's transportation network, connecting various parts so that information can flow freely.

Explanation of related professional terms about bus:

  1. Bus: In computer architecture, a bus is a physical channel or logical channel that connects various internal components of a computer (such as CPU, memory, I/O devices, etc.) for data transmission and communication.

  2. Synchronous bus : A synchronous bus is a bus that uses a shared clock signal for data transmission. Transfers are synchronized with the clock, but not every bus transaction necessarily completes within one clock cycle.

  3. Clock frequency : Clock frequency refers to the rate of the clock signal in the synchronous bus, usually in Hertz (Hz), indicating the number of clock cycles per second.

  4. Operating frequency : Operating frequency refers to the operating rate of the bus or device in actual operation. In a synchronous bus, the operating frequency may or may not be the same as the clock frequency.

  5. Asynchronous bus : An asynchronous bus is a bus that transmits data through handshake signals. The two parties in communication synchronize data transmission through handshake signals, and each handshake process completes a data exchange.

  6. Handshake signal : Handshake signal is a signal used for synchronous communication in an asynchronous bus. It indicates the start, end or other status of communication and ensures the correct transmission of data.

  7. Burst transmission : Burst transmission refers to the continuous transmission of multiple data within a bus cycle to improve the efficiency of data transmission. In this case, the transfer between address and data is continuous.

The above is an explanation of relevant professional terms about the bus, covering terms related to bus communication such as synchronous bus, asynchronous bus, clock frequency, handshake signal, etc.

There are three common centralized control priority arbitration methods:
(1) Chain query
The chain query method is shown in Figure 3.15(a). Three lines in the control bus are used for bus control (BS bus busy, BR bus request, BG bus grant), in which the bus grant signal BG is passed serially from one I/O interface to the next. If the interface that BG reaches has a pending bus request, the BG signal is not passed further down; this means the interface has obtained the right to use the bus, and it asserts the bus busy signal BS to indicate that it occupies the bus. It can be seen that in the chain query, the device closest to the bus control component has the highest priority. The characteristics of this method are that only a few lines are needed to achieve bus control in a fixed priority order and it is easy to add devices, but it is sensitive to circuit faults, and low-priority devices may find it difficult to get their requests granted.
(2) Counter timing query
The counter timing query method is shown in Figure 3.15(b). Compared with Figure 3.15(a), there is an additional set of device address lines, and the bus grant line BG is absent. After the bus control component receives the bus request signal on BR, and when the bus is not in use (BS = 0), the counter in the bus control component starts counting and sends the count value to each device through the device address lines as an address signal. When the address of a device requesting the bus matches the count value, that device obtains the right to use the bus, and the count query terminates. The characteristics of this method are: counting can start from 0, in which case device priority is fixed in descending order 0, 1, ...; counting can also start from the end point of the last count, a round-robin method in which all devices have equal priority; and the initial value of the counter can be set by program, so the priorities can be changed. This method is not as sensitive to circuit faults as the chain query method, but it adds control lines (the device address lines) and makes the control more complex.
(3) Independent request method
The independent request method is shown in Figure 3.15(c). As can be seen from the figure, each device has its own pair of bus request line BRi and bus grant line BGi. When a device needs the bus, it sends its request signal. A queuing circuit in the bus control component decides which request to grant according to priority. The characteristics of this method are: fast response and flexible priority control (changeable by program), but the number of control lines is large and the bus control is more complex. In the chain query, only two lines are used to determine which device gets the bus; in the counter query, roughly log2(n) lines are used, where n is the maximum number of devices allowed; the independent request method requires 2n lines.

[Figure 3.15: (a) chain query, (b) counter timing query, (c) independent request — three centralized bus arbitration methods]
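
As a toy model (not from the original notes) of the chain query described in (1), the function below propagates the grant signal BG down the chain and stops at the first device with a pending request, so device index doubles as priority:

```python
# Toy model of chain (daisy-chain) query arbitration: the grant signal BG
# propagates from the device nearest the bus controller; the first device
# with a pending request keeps the grant, so lower indices mean higher priority.
def chain_arbitrate(requests):
    """requests[i] is True if device i has raised BR."""
    for device, br in enumerate(requests):
        if br:              # BG stops propagating here; device asserts BS
            return device
    return None             # no device requested the bus

print(chain_arbitrate([False, True, False, True]))  # 1 (nearest requester wins)
```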

Chapter 3 Memory

3.1 Memory type

3.1.1 Random access method

Standard definition: A memory access method that allows direct access to a storage unit at any location.

Popular explanation: Just like opening a book, you can read any page directly without having to read page by page in order.

  1. EPROM (Erasable Programmable Read-Only Memory) : EPROM is a read-only memory that stores data that can be erased and reprogrammed after being written. It uses random access, allowing the CPU to read or write data in it randomly.

  2. CD-ROM (Compact Disc Read Only) : CD-ROM is a type of read-only memory in which data is usually read optically. Unlike random access, CD-ROM uses serial access, and data is read sequentially according to the physical order on the disc.

  3. DRAM (Dynamic Random Access Memory) : DRAM is a type of random access memory used to temporarily store data and programs in use. It needs to be refreshed regularly to maintain the stored data, using random access methods.

  4. SRAM (Static Random Access Memory) : SRAM is also a type of random access memory. Unlike DRAM, its storage cells do not require refresh operations. SRAM offers faster access speeds but is generally more expensive than DRAM.

Among the options given, CD-ROM is memory that does not use random access.

3.1.2 Memory type explanation

  • EPROM (erasable programmable read-only memory):

    • Standard Definition: Non-volatile memory that requires special operations to erase and reprogram.
    • Popular explanation: Similar to a write-once electronic blackboard, if you need to modify it, you must use special tools.
  • CD-ROM (Compact Disc Read Only Memory):

    • Standard definition: read-only storage media, often used for optical disk storage.
    • Popular explanation: It's like a non-rewritable CD. You can only read the information inside, but you can't write new content on it.
  • DRAM (Dynamic Random Access Memory):

    • Standard definition: Volatile memory, which uses capacitors to store data and needs to be refreshed regularly.
    • Popular explanation: Similar to a bucket, water needs to be continuously added (refreshed) to maintain the existence of water (data).
  • SRAM (Static Random Access Memory):

    • Standard definition: Volatile memory that does not need to be refreshed but is relatively expensive.
    • Popular explanation: Like an open book, there is no need to constantly turn pages (refresh), but the cost is higher.

3.2 RAM and ROM characteristics

3.2.1 RAM: Volatile memory, using random access method

Standard definition: Random access memory is a type of volatile memory that allows random access to memory cells. Data will be lost when power is unavailable.

Popular explanation: Just like a whiteboard, you can write whatever you want, but when you turn off the power, the content will disappear.

3.2.2 ROM: Non-volatile memory, can be used for Cache and does not need to be refreshed

Standard definition: Read-only memory is a type of non-volatile memory, usually used to store fixed data. Can be used in Cache and does not require refresh operations.

Popular explanation: Similar to a printed book, the content is fixed and can be read over and over again without changing.

  1. RAM (Random Access Memory) : A type of computer main memory that enables random reading and writing of storage cells. RAM is volatile memory and its contents are lost when power is removed. Mainly divided into dynamic RAM (DRAM) and static RAM (SRAM), the former needs to be refreshed to retain data, while the latter does not.

  2. ROM (read-only memory) : A type of computer memory that stores fixed data or programs whose contents can only be read during normal operation and cannot be written randomly. ROM is non-volatile memory that retains its stored contents even when power is lost.

  3. Cache : A high-speed temporary memory used to store data that is frequently accessed or may be needed by the computer processor to increase the speed of data access. Cache is usually divided into multiple levels, including first-level cache (L1 Cache) and second-level cache (L2 Cache). SRAM is often used to make Cache because it provides faster read and write speeds.

  4. DRAM (Dynamic Random Access Memory) : A type of random access memory that uses capacitors to store each bit of data. DRAM requires periodic refreshes to prevent the capacitor charge from disappearing, making it more economical but slower than SRAM.

  5. SRAM (Static Random Access Memory) : A type of random access memory that uses flip-flop circuitry to store each bit of data. Relative to DRAM, SRAM is faster but more expensive and is often used for caches and other applications that require fast access.

3.3 Flash memory characteristics

3.3.1 Readable and writable, slow

Standard definition: A readable and writable non-volatile memory that is slower than RAM.

Popular explanation: Like an eraser, you can write content, but the erasing and writing speed is relatively slow.

3.3.2 Adopt random access method

Standard definition: Flash memory uses random access, allowing direct access to any storage location.

Popular explanation: Similar to the pages of a book, you can turn directly to the part you need without having to go through page by page.

3.3.3 Non-volatile memory

Standard definition: Unlike RAM, flash memory is a type of non-volatile memory in which data is retained when power is removed.

Popular explanation: Like a magnetic board, data will not disappear due to power outage.

Flash Memory

Flash memory is a type of semiconductor memory that is a non-volatile memory (NVM). It is widely used for its high speed, low energy consumption and durability. Flash memory is widely used in a variety of electronic devices, including mobile devices, cameras, USB flash drives, solid-state drives (SSD), and more.

Features and Properties:

  1. Read/write capability : Unlike read-only memory (ROM), flash memory is both readable and writable, allowing the stored data to be modified and updated.

  2. Composition : Storage elements are usually composed of Metal-Oxide-Semiconductor Field-Effect Transistor (MOSFET). Data is stored on the floating gate in the form of electric charges.

  3. Non-volatile : Flash memory is a type of non-volatile memory that retains stored data even when power is lost.

  4. Write-erase mechanism : A write operation requires the memory unit to be erased first and then written. This erase-write mechanism affects write speeds, which are generally slower than read speeds.

  5. Random Access : Data can be read and written with random access, which makes it possible for flash memory to replace traditional disk drives.

Applications:

  1. Mobile devices : used to store operating systems, applications, and user data.

  2. Solid-state drive (SSD) : replaces traditional mechanical hard drives, providing faster read and write speeds and higher durability.

  3. Cameras and camcorders : Used to store photos and videos.

  4. USB flash drive : A portable storage device that plugs into a computer or other device to transfer and store data.

  5. Embedded Systems : Used in embedded computing devices to store firmware and operating systems.

Non-Volatile Memory

Non-volatile memory , or NVM for short, is a storage technology that retains stored information after a power outage. Unlike volatile memory (such as RAM), non-volatile memory does not require a continuous power supply to retain stored data. This type of memory is typically used for long-term storage and retrieval of data, such as flash memory, hard drives, and read-only memory (ROM).

NVM plays an important role in computer systems as it preserves stored data for long periods of time without the need for power. This is critical for storing bootloaders, operating systems, applications, and other critical information. Some of these techniques include:

  1. Flash Memory : As a type of non-volatile memory, flash memory is often used in mobile devices, solid-state drives (SSD), and other scenarios that require persistent storage. It is both rewritable and electrically programmable.

  2. EEPROM (Electrically Erasable Programmable Read-Only Memory) : EEPROM is an electrically erasable ROM that can be reprogrammed on the circuit board. This kind of memory is often used to store device parameters, BIOS settings, etc.

  3. Hard Disk Drive (HDD) : HDD uses a magnetic coating on the surface of the disk to store data. Even if the power is turned off, the data remains on the platter.

  4. Solid State Drive (SSD) : SSD uses flash memory or other non-volatile memory technology to provide similar functionality to a traditional hard drive, but is faster and more durable.

Non-volatile memory plays a key role in computer architecture, embedded systems, and various electronic devices.

3.4 Memory chip address and data pins

3.4.1 Number of address bits: log2(4M) = 22

Standard definition: The number of address bits of a memory chip is log2(number of addressable storage units). For a chip with 4M storage units, 22 address bits are needed (with DRAM address multiplexing, only 11 address pins are required).

Popular explanation: Just like the number of digits in a house number determines how many rooms can be told apart, the number of address bits determines how many storage locations can be selected.

3.4.2 Number of data pins: 8

Standard definition: The number of data pins of the memory chip is 8, which means that 8 bits of data can be transmitted each time.

Popular explanation: Like the number of goods that can be transported at one time, 8 units of data can be transmitted each time.

Address pins and data pins

In the design of computer memory, address pins and data pins are two key concepts. They are used to connect the motherboard (or other processor) of the computer system and the memory device to implement read and write operations on data in the memory.

1. Address pin:

  • Definition: Address pins are pins on a computer interface or plug used to convey address information required when accessing memory.
  • Function: When the computer needs to read or write data in the memory, it passes the address of the target storage unit through the address pin, so that the system can accurately locate the required storage unit.
  • Example: If a memory chip has n address pins, it can address 2^n memory cells.

2. Data pin:

  • Definition: Data pins are pins on a computer interface or plug that are used to transmit actual data (bits).
  • Function: The data pin is responsible for transferring binary data between the computer and the memory to implement data reading and writing operations.
  • Example: If a memory chip has m data pins, then m bits of data can be transmitted simultaneously through those pins in each access.

Related concepts:

  • Multiplexing of address lines and data lines: In some memory designs, in order to reduce the number of pins, address lines and data lines may be multiplexed. This means that the same pin may be used to transmit address information and data information at different times.
  • Dynamic Random Access Memory (DRAM): It is a common type of memory that is internally composed of capacitors and data is stored in the capacitors. The number of DRAM address pins and data pins depends on the specific chip design.

Analyze the situation in the question:
For the 4M×8-bit DRAM chip mentioned in the question, 4M means that the storage capacity is 4 megabits, and 8 bits means the number of bits in each storage unit. Based on this information, the total number of address pins and data pins can be calculated. In this example, the number of address pins is log2(4M) and the number of data pins is 8. Then, considering that DRAM uses address multiplexing, the number of address pins is actually half the number of data pins. Therefore, the total number of pins is the number of address pins plus the number of data pins, which is log2(4M)/2 + 8. According to the calculation, this total is 19, so the answer is A.

Dynamic Random Access Memory (DRAM)

Dynamic Random Access Memory (DRAM) is a type of computer main memory used to temporarily store data and instructions required for computer operation. The following is an explanation of professional terms related to DRAM:

  1. Dynamic: DRAM is called "dynamic" because it uses capacitors to store data, and the charge in the capacitors gradually leaks out. Therefore, in order to retain data, DRAM needs to be refreshed periodically (refresh cycle), during which the data is rewritten.

  2. Random Access: DRAM allows a computer to access memory cells in a random manner, without requiring sequential or address order. This makes it ideal for primary memory (RAM) as it can read or write data quickly and randomly.

  3. Storage unit: The basic unit of DRAM to store information is the storage unit, which usually consists of a capacitor and a transistor. Capacitors store charge, and transistors act as switches, reading and writing charge.

  4. Refresh cycle: Due to the characteristics of capacitor leakage, DRAM requires regular refresh operations to prevent data loss. The refresh cycle refers to the time interval between DRAM chips performing refresh operations, usually measured in milliseconds.

  5. Address multiplexing: DRAM uses address multiplexing technology, in which the same set of address lines transfers different address information at different times. This reduces the number of pins required and increases memory efficiency.

  6. Bits and Words: DRAM is addressed and stored in units of bits, and multiple bits are combined to form bytes. For example, an 8-bit DRAM means that each memory cell stores 8 binary bits to form a byte.

  7. Storage capacity: The storage capacity of DRAM is usually measured in bits or bytes. For example, 4M×8 describes a DRAM with 4M storage units, each storing 8 bits, for a total of 32 megabits.

Overall, DRAM is a common main memory technology that is widely used in personal computers and other computing devices due to its high density and relatively low cost, despite its need for regular refreshes and relatively slow speeds.

3.5 Memory refresh

3.5.1 SDRAM needs to be refreshed periodically

Standard Definition: SDRAM (Synchronous Dynamic Random Access Memory) requires periodic refreshes to keep stored data valid.

Popular explanation: Just like a plant needs regular watering, SDRAM needs to be refreshed regularly to maintain data retention.

Explanation of professional terms for SRAM, SDRAM, ROM and Flash:

  1. SRAM (Static Random Access Memory): SRAM is a static random access memory. Unlike dynamic memory (such as DRAM), SRAM's memory cells use flip-flop circuits to store each bit and do not require periodic refreshes. This makes SRAM relatively fast to access, but it is generally more expensive and less dense than DRAM.

  2. SDRAM (Synchronous Dynamic Random Access Memory): SDRAM is also a dynamic random access memory, but it runs synchronously with the system clock. Its synchronicity allows for more efficient data transfer and works better with modern computer systems. SDRAM is commonly used as main memory, providing high storage density and performance.

  3. ROM (Read-Only Memory): ROM is a read-only memory in which the data stored cannot be changed during normal operations. ROM is typically used to store firmware, boot code, and other information that needs to remain unchanged. Unlike RAM, ROM is generally non-volatile and retains stored data even when power is lost.

  4. Flash Memory: Flash memory is a type of non-volatile memory, similar to ROM, but it is generally erasable and programmable. Flash memory is used in a wide variety of devices, including USB drives, solid-state drives (SSDs), and mobile devices. The main feature of Flash memory is its ability to be erased and programmed multiple times, making it suitable for applications that require frequent updates.

To sum up, each memory type has its unique characteristics and application scenarios. SRAM and SDRAM are used for main memory, providing fast read and write access. ROM is used to store immutable data, while Flash memory combines rewritable and non-volatile characteristics and is suitable for applications that require flexibility and programmability.

3.6 Memory interleaving addressing

3.6.1 Four-body cross addressing rules

Standard definition: Memory address allocation rules, cross-addressing through four banks to improve access efficiency.

Popular explanation: It's like distributing books on four bookshelves to speed up search and reading efficiency.

3.6.2 Determination of possible memory access conflicts

Standard definition: In four-body interleaved addressing, it is necessary to determine whether a memory access conflict may occur to ensure correct reading and writing of data.

Popular explanation: Similar to when looking for items on four shelves, you need to confirm whether a collision may occur to prevent confusion about the location of the items.

  1. Four-body interleaved addressing memory: This is a memory organization method in which the memory address space is divided into four modules. It is usually used to increase the memory access speed. In this architecture, four adjacent storage units respectively belong to four modules, and these four modules are cross-addressed in a round-robin manner.

  2. Memory access conflict: In multi-body interleaved addressing memory, if two or more addresses are mapped to the same module and these addresses are accessed in adjacent storage cycles, a memory access conflict may occur. Memory access conflicts may cause memory access efficiency to decrease.

  3. Storage module: The memory is divided into multiple modules, each module has a certain address range. In a four-body interleaved memory, there are four memory modules, each of which is responsible for processing one quarter of the address space.

  4. Module serial number: the number of the storage module to which a memory access address maps. It is calculated by taking the address modulo the number of modules; with four modules, module number = address mod 4.

  5. Number of interleaved modules: the number of modules into which the memory is divided, here four. It is the divisor in the modulo operation above.

In the given problem, through the module division and module number calculation rules of the four-body cross-addressed memory, the address pairs that may cause memory access conflicts are determined.
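As a concrete illustration, here is a short sketch of low-order four-bank interleaving: the bank number is simply address mod 4, and two accesses in adjacent storage cycles may conflict when they land in the same bank (the access sequence below is made up for demonstration):

```c
#include <stdio.h>

#define BANKS 4

int main(void) {
    /* Hypothetical sequence of consecutive memory accesses. */
    unsigned addrs[] = {8004, 8008, 8005, 8006};
    int n = sizeof addrs / sizeof addrs[0];

    for (int i = 1; i < n; i++) {
        unsigned prev = addrs[i - 1] % BANKS;   /* bank of previous access */
        unsigned cur  = addrs[i] % BANKS;       /* bank of current access  */
        printf("%u (bank %u) then %u (bank %u): %s\n",
               addrs[i - 1], prev, addrs[i], cur,
               cur == prev ? "conflict possible" : "no conflict");
    }
    return 0;
}
```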

3.7 Memory burst transfer

3.7.1 64-bit bus, 8 8192×8192×8-bit DRAM chips

Standard definition: Memory burst transmission passes through a 64-bit bus and simultaneously accesses eight 8192×8192×8-bit DRAM chips to improve data transmission efficiency.

Popular explanation: It is like transporting multiple goods at one time and accessing multiple storage units at the same time through a spacious channel (64-bit bus).

3.7.2 The role of the row buffer

Standard definition: The row buffer is used to temporarily store a row of data in the memory to speed up the memory burst transfer process.

Popular explanation: Just like when moving goods, a whole row of goods is temporarily placed in one area to improve handling efficiency.

  1. Main memory byte addressing: This means that each byte in the computer's main memory is assigned a unique address. Main memory is addressed in bytes.

  2. DRAM (Dynamic Random-Access Memory): DRAM is a dynamic random access memory used to temporarily store data and machine code. In the title, it refers to the chip in the memory using DRAM architecture.

  3. Cross-addressing method: This is a memory organization method that divides the main memory into multiple modules, and then performs cross-addressing on these modules according to certain rules to improve access speed.

  4. Memory bus: A bus that connects a computer's main memory to other components and is used to transfer data between computer components.

  5. 32-bit width memory bus: The width of a memory bus represents the number of bits that can be transferred in one operation. This means that 32 bits of data can be transmitted each time.

  6. Number of storage cycles: The number of storage cycles required to read or write a piece of data. The number of storage cycles depends on the memory organization and the size of the data.

  7. Double variable: In C language, double is a double-precision floating-point number type that usually occupies 64 bits (8 bytes) of storage space.

In the given problem, the organization of the main memory (interleaved addressing), the width of the memory bus, the storage structure of DRAM, and the calculation of the number of storage cycles required to read a double variable are mainly examined.
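Since a double occupies 64 bits, reading one over a 32-bit memory bus takes two bus transfers; the total number of storage cycles then depends on the interleaving scheme given in the original question. A minimal sketch of the bus-transfer count:

```c
#include <stdio.h>

int main(void) {
    int bus_width_bits = 32;         /* 32-bit memory bus         */
    int double_bits    = 8 * 8;      /* double: 8 bytes = 64 bits */

    /* Each bus transfer moves one bus width of data. */
    int transfers = double_bits / bus_width_bits;
    printf("bus transfers to read one double: %d\n", transfers);   /* 2 */
    return 0;
}
```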

3.8 Memory design

3.8.1 Quantity calculation of ROM and RAM

Standard definition: Memory design requires calculation of the required amount of ROM and RAM to meet system requirements.

Popular explanation: Similar to planning a library, determining how many books (ROM) and notebooks (RAM) are needed to accommodate reader needs.

3.8.2 Calculation of the number of bits in the main memory address register (MAR)

Standard definition: The number of bits in the main memory address register (MAR) is determined by the size of the addressable main memory space, so that every storage unit can be addressed correctly.

Popular explanation: Just like the house number needs enough digits to ensure that each room can be accurately found, the MAR also needs to have enough digits to accurately access the memory.

  1. Burst Transfer: This is a memory access method in which after requesting an address, several subsequent data items are also continuously transferred. This helps improve the efficiency of memory access and reduce access latency.

  2. Memory bus width: Memory bus width refers to the number of bits that can be transferred simultaneously in a memory access. A 64-bit memory bus width means that 64 bits of data can be transferred at a time.

  3. Multi-module cross-addressing method: This is a memory organization method that uses multiple storage modules to perform cross-addressing in a certain pattern to improve the access speed of the overall memory system.

  4. Address pins: The number of address pins of each memory chip indicates the number of addresses that the chip can address. It is usually used to calculate the number of bits in the address line.

  5. Row Buffer: This is the part inside the DRAM chip that stores the data of the recently accessed row. Its existence can speed up consecutive accesses to the same row of data.

In the given question, various aspects of the description of the memory module are examined, including the memory capacity, interleaved addressing mode, number of address pins, and the row buffer inside the chip. The last point corrects the erroneous option: the length of the on-chip row buffer equals the size of one row.

3.9 Memory address calculation

3.9.1 Division of address ranges

Standard definition: Ranges of memory addresses need to be divided in order to manage data efficiently.

Popular explanation: It's like dividing a city into different areas to facilitate management and finding addresses.

3.9.2 Calculation of memory capacity

Standard definition: The number of address bits needs to be considered when calculating the memory capacity to ensure sufficient storage space.

Popular explanation: Like calculating how many rooms a house has, consider the address digits to ensure there is enough space to store items.

  1. ROM (Read-Only Memory): ROM is read-only memory, and its contents cannot be modified or written under normal conditions. In computer systems, ROM is usually used to store firmware, startup programs and other data that do not need to be modified frequently.

  2. RAM (Random Access Memory): RAM is random access memory, which allows random reading and writing of data in storage units. Unlike ROM, RAM is volatile memory, meaning its contents are lost when power is removed.

  3. Byte addressing: The byte addressing method of memory means that each storage unit has a unique address, and this address increments in bytes.

  4. 2K×8-bit ROM chip: a chip with 2K storage units, each 8 bits wide (2KB of data per chip). Similarly, a 4K×4-bit RAM chip has 4K storage units, each 4 bits wide (2KB of data per chip).

  5. Word expansion: extending the address space of the memory by adding chips, so that more storage units of the same width are available. Here, 2K×8-bit ROM chips are word-expanded to fill the 4KB ROM area.

  6. Word-and-bit expansion: extending both the address lines (more storage units) and the data lines (wider units) at the same time. Here, 4K×4-bit RAM chips are expanded in both dimensions to fill the 60KB RAM area.

In the given question, the basic concepts of ROM, RAM and byte addressing, together with word expansion and word-and-bit expansion, are explained. Finally, the number of ROM chips and RAM chips required is calculated from the question, as in the sketch below.
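A minimal sketch of the chip-count arithmetic, assuming a byte-addressable memory with 8-bit locations, a 4KB ROM area built from 2K×8-bit chips, and a 60KB RAM area built from 4K×4-bit chips, as described above:

```c
#include <stdio.h>

int main(void) {
    /* ROM: 4KB area from 2K x 8-bit chips -> word expansion only.      */
    int rom_chips = (4 * 1024) / (2 * 1024) * (8 / 8);   /* 2 chips     */

    /* RAM: 60KB area from 4K x 4-bit chips -> word expansion (15 groups)
       times bit expansion (2 chips per group).                          */
    int ram_chips = (60 * 1024) / (4 * 1024) * (8 / 4);  /* 30 chips    */

    printf("ROM chips: %d, RAM chips: %d\n", rom_chips, ram_chips);
    return 0;
}
```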

3.10 Memory byte addressing

3.10.1 Chip combinations with different digits

Standard definition: In memory byte addressing, it is necessary to consider how chips with different bits are combined to meet system requirements.

Popular explanation: Just like when putting together a jigsaw puzzle, you need to consider how different shaped pieces fit together.

3.10.2 Address calculation rules

Standard definition: Develop address calculation rules for memory byte addressing to ensure accurate access to data.

Popular explanation: It is similar to calculating the distance between two locations through rules on a map to ensure that the required data can be found accurately.

  1. Address 0B1FH: This is a hexadecimal address that represents a specific address in memory. In this problem, the address is used to determine the location of some data in memory.

  2. 2K×4-bit chip: each chip has 2K storage units, each 4 bits wide, so one chip holds 1KB of data and two chips side by side supply a full 8-bit byte.

  3. 8K×8-bit memory: the whole memory has 8K storage units of 8 bits each, i.e. 8KB in total. This notation gives the overall size of the memory and the number of bits stored at each address.

  4. Minimum address: in a memory assembled from several chip groups, the lowest address covered by the chip group that contains a given address. Here, the problem combines multiple 2K×4-bit chips into an 8K×8-bit memory.

In the given question, the representation of the hexadecimal address and the concepts of the 2K×4-bit chip and the 8K×8-bit memory are explained. Then the minimum address of the chip group containing address 0B1FH is determined by calculation, as sketched below.
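A minimal sketch of that calculation, assuming the 8K×8-bit memory is built as four groups of two 2K×4-bit chips, so that each group spans 2K = 0800H consecutive addresses:

```c
#include <stdio.h>

int main(void) {
    unsigned addr       = 0x0B1F;     /* the address in question          */
    unsigned group_span = 2 * 1024;   /* each 2K x 4-bit chip pair covers
                                         2K consecutive addresses         */

    unsigned group = addr / group_span;        /* which chip pair: 1      */
    unsigned min   = group * group_span;       /* its lowest address      */
    printf("address 0x%04X lies in group %u; minimum address 0x%04X\n",
           addr, group, min);                  /* minimum address 0800H   */
    return 0;
}
```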

3.11 Number of memory address register bits

3.11.1 The relationship between the main memory size and the number of address register bits

Standard definition: Calculate the relationship between the size of main memory and the number of bits in the address register to meet the addressing needs of the system.

Popular explanation: Just like ensuring that there are enough house numbers to accommodate an entire city of houses, the number of bits in the address register must be large enough to correctly locate the data in main memory.

  1. Main Memory (RAM): Main memory is the portion of memory in a computer used to store data and programs. In this problem, a 4M×8-bit RAM chip is used to form the main memory.

  2. Byte addressing: This means that each address in the computer's main memory corresponds to a byte. A byte is the smallest addressable unit of storage in a computer.

  3. Main memory address space: This refers to the range of all possible addresses in the main memory of a computer. In this problem, the main memory address space size is 64MB.

  4. Memory Address Register (MAR): Memory Address Register is a register used to store the memory address to be accessed. Here, the problem involves counting the number of bits in the MAR to determine its addressing range.

  5. Number of bits: In computers, "bits" refers to the number of binary digits used to represent an address or a datum. An n-bit MAR can distinguish 2^n different addresses.

In this problem, the MAR needs to be at least 26 bits in size to ensure that it can address the 64MB of main memory address space. This involves the relationship between the addressing range of computer memory and the number of address bits.
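The arithmetic is simply the smallest n with 2^n ≥ 64M; a one-loop sketch:

```c
#include <stdio.h>

int main(void) {
    unsigned long long space = 64ULL * 1024 * 1024;  /* 64MB, byte addressable */

    int bits = 0;
    while ((1ULL << bits) < space)   /* smallest n with 2^n >= space */
        bits++;
    printf("MAR needs at least %d bits\n", bits);    /* 26 */
    return 0;
}
```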

3.12 Memory design and number of chips

3.12.1 64KB capacity memory design

Standard definition: Design a memory that meets the 64KB capacity requirement and determine the number of chips required.

Popular explanation: Similar to planning a 64KB book library, determine how many bookshelves (chips) are needed to store all books.

3.12.2 Calculation of the number of SRAM chips

Standard definition: Calculate the number of chips required to build SRAM memory to meet system performance and capacity requirements.

Popular explanation: Like computing to build a study area with advanced features and determining how many desks (SRAM chips) are needed to meet the needs of students.

  1. SRAM (Static Random-Access Memory): SRAM is a static random access memory that can maintain stored data without power interruption. Compared with DRAM (Dynamic Random Access Memory), SRAM does not require refresh operations, but is relatively more expensive.

  2. Byte addressing: This means that each address in the computer's main memory corresponds to a byte. A byte is the smallest addressable unit of storage in a computer.

  3. ROM (Read-Only Memory): ROM is read-only memory in which the data stored cannot be modified or written during normal operation. In this question, the area with address range 4000H to 5FFFH is allocated to ROM.

  4. RAM (Random-Access Memory): RAM is a random access memory in which data can be read and written at any time. In this problem, the remaining area except the ROM area is allocated to RAM.

  5. Address range: This represents the range of addresses in computer memory that can be used to uniquely identify different storage locations.

In this problem, by calculating the capacity requirements of ROM and RAM, the required number of 8K×4-bit SRAM chips is determined to be 14. This involves calculating the capacity of the memory area and selecting an appropriately sized memory chip.
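A minimal sketch of that capacity arithmetic, assuming a 64KB address space in which 4000H–5FFFH is ROM and everything else is RAM, as the notes state:

```c
#include <stdio.h>

int main(void) {
    unsigned total     = 64 * 1024;                /* 64KB address space */
    unsigned rom_bytes = 0x5FFF - 0x4000 + 1;      /* 8KB of ROM         */
    unsigned ram_bytes = total - rom_bytes;        /* 56KB of RAM        */

    /* One 8K x 4-bit SRAM chip holds 8K locations of 4 bits, so two
       chips side by side provide 8KB of byte-wide storage.           */
    unsigned chips = ram_bytes / (8 * 1024) * (8 / 4);   /* 7 x 2 = 14 */
    printf("RAM area: %uKB, 8Kx4-bit SRAM chips needed: %u\n",
           ram_bytes / 1024, chips);
    return 0;
}
```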

3.13 DRAM chip array design

3.13.1 Selection of minimum number of address pins

Standard definition: When designing a row of DRAM chips, choose the smallest number of address pins to reduce complexity.

Popular explanation: Just like designing a garden, choose the simplest path to make it easier for visitors to find their destination.

3.13.2 Strategies to reduce refresh overhead

Standard definition: Develop strategies to reduce refresh overhead and improve DRAM performance.

Popular explanation: Like designing an automatic watering system to reduce the number of manual waterings and improve efficiency.

  1. DRAM (Dynamic Random-Access Memory): DRAM is a type of dynamic random access memory. Compared with static RAM (SRAM), it needs to be refreshed regularly to maintain the stored data because the stored data gradually disappears. DRAM is commonly used for main memory.

  2. Memory Array: A memory array is a structure in a DRAM chip used to store data. It usually consists of rows and columns to form a two-dimensional array of storage cells.

  3. Number of rows (r) and number of columns (c): In the memory array of a DRAM chip, the number of rows and columns indicates the layout of the memory cells. The row and column selections are related to the address lines, and a data bit is stored at the intersection of the row and column. In this question, r represents the number of rows and c represents the number of columns.

  4. Refresh overhead: Data in DRAM is stored in capacitors. Due to the self-discharge characteristics of capacitors, regular refreshes are required to maintain the stored data. Refresh overhead refers to the performance overhead caused by refresh operations to prevent data loss.

In this problem, to minimize the number of address pins and to reduce refresh overhead, the row and column counts are chosen so that they are as close to each other as possible (which minimizes the multiplexed address pins) and, among such splits, the number of rows is the smaller one (fewer rows means fewer refresh operations). Option C (32 rows, 64 columns) is the choice that best meets these criteria; a small enumeration is sketched below.
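A small enumeration of the trade-off, assuming a 2048-cell array (inferred from the 32×64 option); with multiplexed row/column addressing the pin count is max(log2 r, log2 c), while refresh cost grows with the number of rows:

```c
#include <stdio.h>

static int bits(int n) {            /* smallest b with 2^b >= n */
    int b = 0;
    while ((1 << b) < n) b++;
    return b;
}

int main(void) {
    int options[][2] = { {2048, 1}, {64, 32}, {32, 64}, {1, 2048} };

    for (int i = 0; i < 4; i++) {
        int r = options[i][0], c = options[i][1];
        int br = bits(r), bc = bits(c);
        int pins = br > bc ? br : bc;     /* shared row/column address pins */
        printf("%4d rows x %4d cols: %2d address pins, %4d rows to refresh\n",
               r, c, pins, r);
    }
    return 0;   /* 32 x 64 combines the fewest pins with the fewer rows */
}
```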

3.14 Memory bus number and capacity

3.14.1 Relationship between address lines and data lines

Standard definition: Determine the number of bits on the memory bus to ensure compatibility between address lines and data lines.

Popular explanation: Just like making sure traffic roads are wide enough to accommodate different types of vehicles, memory bus bits need to be wide enough to transfer addresses and data.

3.14.2 Calculation of the number of RAM chips

Standard definition: Calculate the number of chips required to build RAM memory to meet system capacity requirements.

Popular explanation: Similar to determining a parking lot large enough, calculate how many parking spaces (RAM chips) are needed to accommodate all vehicles.

  1. Word addressing: addressing the memory in units of words rather than bytes. In this mode, each address refers to one word of data (here, one 32-bit word).

  2. Word length: The word length represents the number of binary digits that the computer can process at one time. In this problem, the word length is 32 bits, that is, 32 bits of binary data can be processed at one time.

  3. RAM (Random-Access Memory): RAM is a random access memory that allows direct access to data at any location in the memory. RAM is usually used for main memory.

  4. RAM chip capacity: RAM chip capacity indicates the amount of data each RAM chip can store. In this problem, the RAM chip capacity is 512K × 8 bits, which means that each RAM chip can store 512K bytes (or 4M bits) of data.

Based on the address range in the question and the size of the RAM area, calculate the total capacity of the RAM area. Then, divide the total capacity of the RAM area by the capacity of each RAM chip to obtain the required number of RAM chips. In this problem, the number of 512K×8-bit RAM chips required is 32.

3.15 Disk read time

3.15.1 Influence of rotation speed, seek time and transmission rate

Standard definition: Analyzes how disk read time is affected by factors such as rotation speed, seek time, and transfer rate.

Popular explanation: Like calculating the time required to reach a destination, taking into account factors such as vehicle speed, road selection, and traffic conditions.

3.15.2 Calculation of average time to read sectors

Standard definition: Evaluate disk performance by calculating the average time to read a sector.

Popular explanation: Just like calculating the average time it takes a postman to deliver a letter, consider the number of letters and the distance they are delivered.

  1. Disk RPM: Disk RPM is the number of revolutions the disk makes per minute. In this problem, the disk is rotating at 10,000 rpm.

  2. Average seek time: Average seek time is the average time it takes for the head to move from one track to another on the disk. In this problem, the average seek time is 6ms.

  3. Disk transfer rate: Disk transfer rate indicates how fast data can be read from or written to a disk. In this problem, the disk transfer rate is 20MB/s.

  4. Disk controller latency: Disk controller latency is the time it takes for the disk controller to process a read or write request. In this question, the disk controller latency is 0.2ms.

  5. Sectors: Disk storage is divided into small data blocks, each data block is called a sector. In this problem, a 4KB sector is read.

Based on these parameters, calculate the average time to read a 4KB sector, including seek time, rotation delay, transfer time and controller delay. In this problem, the calculated result is 9.4ms.
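The same calculation written out, using the figures quoted above (10,000 rpm, 6ms average seek, 20MB/s transfer rate, 0.2ms controller latency, 4KB sector):

```c
#include <stdio.h>

int main(void) {
    double rpm       = 10000.0;   /* rotation speed        */
    double seek_ms   = 6.0;       /* average seek time     */
    double rate_mb   = 20.0;      /* transfer rate in MB/s */
    double ctrl_ms   = 0.2;       /* controller latency    */
    double sector_kb = 4.0;       /* sector size           */

    double rot_ms  = 60.0 * 1000.0 / rpm / 2.0;               /* half a revolution: 3ms */
    double xfer_ms = sector_kb / 1024.0 / rate_mb * 1000.0;   /* ~0.2ms                 */

    printf("average time to read a sector: %.1f ms\n",
           seek_ms + rot_ms + xfer_ms + ctrl_ms);             /* 9.4 ms */
    return 0;
}
```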

3.16 RAID reliability improvement measures

3.16.1 Disk mirroring

Standard Definition: Disk mirroring is implemented as a measure to improve the reliability of a RAID system.

Popular explanation: Similar to backing up important files, disk mirroring is a real-time backup of data to improve system reliability.

3.16.2 Striping

Standard definition: Using striping technology to improve the read and write speed of the RAID system.

Popular explanation: It is like splitting a set of books across several shelves so that several volumes can be fetched at the same time, improving read and write efficiency.

3.16.3 Parity check

Standard definition: Apply parity check algorithm to provide data redundancy and error detection for RAID systems.

Popular explanation: Like using a spare puzzle piece, it can be used to correct errors in other puzzle pieces to ensure a complete pattern.

3.16.4 Add Cache mechanism

Standard definition: Introducing a caching mechanism to speed up access to the RAID system.

Popular explanation: It is similar to placing the required books on the table in advance to speed up reading.

  1. RAID (Redundant Array of Independent Disks): RAID is a disk array technology that combines multiple hard drives to provide higher performance, fault tolerance, or both. RAID systems can be divided into multiple levels, such as RAID 0, RAID 1, RAID 5, etc. Each level has different data storage and redundancy methods.

  2. Disk mirroring: RAID 1 uses disk mirroring technology to write data to two disks at the same time to achieve redundant backup of data. If one disk fails, the system can still serve data from another mirrored disk.

  3. Striping: RAID 0 uses striping technology to divide data into small pieces and then write them to different hard drives. This can improve the reading and writing speed of data, but there is no redundant backup, so a hard drive failure will cause data loss.

  4. Parity Check: RAID 3, RAID 4 and RAID 5 use parity check technology. This technology adds parity bits to data blocks so that when a certain disk fails, the data can be restored using the data and parity bits on other disks to achieve redundant backup.

  5. Adding a cache mechanism: In a RAID system, adding a cache mechanism can improve read and write performance. Caching can be used to temporarily store data being read or written, thereby speeding up system response. However, it should be noted that adding cache also brings a certain risk of data loss, so battery-backed cache is usually required to prevent data loss in the event of power failure.

3.17 Disk access time calculation

3.17.1 Access time = seek time + delay time + transmission time

Standard definition: Calculates disk access time, taking into account factors such as seek time, latency, and transfer time.

Popular explanation: Just like calculating the time it takes to go to the store to buy something, taking into account the time to find the product, the time to queue and the time to pay.

  1. Average seek time: The average seek time of disk storage refers to the average time it takes for the disk head to move from the current track to the target track. This time includes seek time and positioning time. The seek time is the time to move the magnetic head to the target track, and the positioning time is the time to accurately position the magnetic head to the target track.

  2. Sector: A sector on a disk is the smallest unit for storing data, usually 512 bytes or 4KB. Sector is the basic unit for data reading and writing on the disk. Each track on the disk is divided into multiple sectors. After the magnetic head locates the target track during reading and writing, it rotates the disk to access specific sectors.

  3. Disk rotation speed: the speed at which the platters spin, usually expressed in revolutions per minute (RPM, Revolutions Per Minute). The rotation speed directly affects the rotational delay and therefore the read and write speed of the disk.

  4. Average access time: The average access time of a disk refers to the average time required from issuing an access request to obtaining data. It includes seek time, rotation delay time and data transmission time. This is an important indicator for evaluating disk performance.

In this problem, the average seek time, rotation delay time and data transfer time are taken into account when calculating the average access time. The average access time to access a sector is obtained by accumulating these times.

3.18 Disk storage description

3.18.1 The formatted capacity of the disk is smaller than the unformatted capacity

Standard definition: Explain why the formatted disk capacity is smaller than the unformatted capacity.

Popular explanation: Similar to the capacity of a shopping bag, the actual things that can be packed are smaller than the size of the bag because part of the bag's space is used for labeling and neatly placing items.

3.18.2 Sectors contain data, address, checksum and other information

Standard definition: Describes the structure of disk sectors, including storage of data, addresses, parity and other information.

Popular explanation: Just like a page of a book, the sector contains information such as data content (text), address (page number), and verification (error correction means).

3.18.3 The minimum read and write unit of disk storage is one sector

Standard definition: Determine the minimum read and write unit of disk storage and consider its impact.

Popular explanation: Just like you can only carry one box at a time, the minimum read and write unit of a disk is a sector, and you can only read and write the contents of this sector.

3.18.4 Disk storage consists of disk controller, disk drive and disk platter

Standard definition: Analyze the structure of disk storage, including disk controllers, disk drives, platters and other components.

Popular explanation: Just like a library, there are administrators (disk controllers) who manage books, bookshelves (disk drives) where books are stored, and specific books (discs) on the bookshelves.

  1. Formatted capacity and unformatted capacity of disk:

    • Unformatted Capacity of a disk: This refers to the total storage capacity on the disk, including all tracks and sectors. At this stage, the disk has not yet been formatted and divided into logical structures usable by the file system.
    • Formatted Capacity of the disk: When the disk is formatted, part of the capacity will be used to store file system metadata, file allocation tables and other information. This part of the capacity cannot be used to store user data. Therefore, the formatted capacity of a disk is usually less than the unformatted capacity.
  2. Information contained in sectors:

    • A disk sector is the smallest unit of disk storage and usually contains a certain number of bytes of data. The information in each sector usually includes user data, sector address (used for disk addressing), verification information (such as checksum or error correction code), etc.
  3. The minimum read and write unit of disk storage:

    • The smallest read and write unit of disk storage is a sector. Even reading and writing data smaller than the sector size requires reading and writing the entire sector.
  4. Disk storage consists of:

    • Disk Controller: Controls the read and write operations of the disk and is responsible for communicating with other parts of the computer system.
    • Disk Drive: the device containing the mechanical and electronic parts (spindle motor, read/write heads, actuator) that read and write the platters.
    • Platter: A circular platter in a disk drive, usually with multiple platters stacked on top of each other, with both surfaces of each platter used to store data.

The error in option C is that the smallest read and write unit of disk storage is a sector, not a byte.

3.19 Cache hit rate calculation

3.19.1 Hit rate = Number of Cache hits/Total number of accesses

Standard definition: Calculate Cache hit rate and evaluate Cache performance.

Popular explanation: Just like the shelves in a store, count how many times customers can find the goods they need here to evaluate the efficiency of the shelves.
Explanation of professional terms:

  1. Cache:

    • Cache is a type of cache memory located in the computer's storage hierarchy, between the main memory (RAM) and the central processing unit (CPU). Its purpose is to provide faster access speed than main memory and reduce the CPU access time to main memory.
  2. Cache hit rate:

    • Cache hit rate refers to the ratio of the number of times that the cache is successfully read or written to the total number of accesses when accessing the cache. This is an important performance indicator. A high hit rate usually indicates that the design and use of Cache is effective, which can reduce the number of accesses to slow main memory and improve overall computer performance.
  3. Cache miss:

    • When the CPU requests data or instructions, if the required data is found in the Cache, it is called a Cache hit; on the contrary, if it is not found, it is called a Cache miss. A miss results in the need to fetch the data from main memory, which incurs additional access time.

In this question, the Cache hit rate calculation formula is:

\[ \text{Hit rate} = \frac{\text{Number of Cache hits}}{\text{Total number of accesses}} \]

The question states that 50 of the 1000 accesses to the Cache miss, so the number of hits is the total number of accesses minus the number of misses: 1000 − 50 = 950. Substituting into the formula:

\[ \text{Hit rate} = \frac{950}{1000} \times 100\% = 95\% \]

So, the answer is D. 95%.

3.20 Cache mapping method calculation

3.20.1 Two-way group associative mapping

Standard definition: Analyze the two-way set associative mapping method and calculate the relationship between the main memory unit address and the Cache group number.

Popular explanation: Just like determining where books should be placed in a library, calculations are used to determine which bookshelf each book should be placed on.
Explanation of professional terms:

  1. Group associative mapping:

    • Set associative mapping is one of the cache mapping methods, which divides the main memory block into several groups, each group containing multiple cache lines. When a main memory block needs to be loaded into the Cache, it can be mapped to any row in the group. This design can reduce conflicts in direct mapping.
  2. Cache group:

    • A Cache set is a logical unit of the Cache consisting of multiple Cache lines. In set-associative mapping, a set contains multiple cache lines, and each line can store one main memory block. The number of lines per set determines the associativity of the Cache.
  3. Cache line:

    • Cache line is the basic storage unit of Cache, used to store a piece of data in main memory. In set-associative mapping, one cache line corresponds to one main memory block. Each cache line has a tag that identifies the stored main memory block.

In this question, the Cache has a total of 16 lines and uses two-way set associative mapping, i.e. 8 sets. The main memory block size is 32B. To which set does main memory unit address 129 (binary 1000 0001) map?

The low 5 bits of the address are the offset within a block (2^5 = 32B), so the block number consists of the remaining high bits: 100 in binary, which is 4 in decimal. The set number is the block number modulo the number of sets: 4 mod 8 = 4. Therefore the unit at address 129 is loaded into set 4 of the Cache. So, the answer is C. 4.
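The same mapping written as a short sketch:

```c
#include <stdio.h>

int main(void) {
    unsigned addr       = 129;   /* main memory unit address                 */
    unsigned block_size = 32;    /* bytes per block (5 offset bits)          */
    unsigned num_sets   = 8;     /* 16 lines, two-way set associative        */

    unsigned block = addr / block_size;   /* discard the in-block offset: 4  */
    unsigned set   = block % num_sets;    /* low bits of the block number: 4 */
    printf("address %u -> block %u -> set %u\n", addr, block, set);
    return 0;
}
```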

3.21 Cache mapping and replacement strategy

3.21.1 Two-way set associative mapping

Explain the two-way set associative mapping, considering the LRU replacement strategy.
In the principles of computer composition, the following are explanations of relevant professional terms:

  1. Cache: Cache memory, used to temporarily store data and instructions frequently accessed by the computer processor to improve data access speed.

  2. LRU (Least Recently Used): LRU is a cache replacement algorithm that eliminates cache blocks that have not been used for the longest time based on recent access patterns.

  3. Set-associative mapping: A type of cache mapping in which a portion of a main memory address is mapped into multiple groups in the cache, each group containing multiple cache lines. This helps reduce conflicts and improve cache efficiency.

  4. Main memory (main memory): The memory in a computer used to store data and instructions. It is the main storage area directly accessed by the processor.

  5. Byte addressing: Computer memory is addressed in bytes, and each address unit corresponds to a byte.

  6. Block (Cache block): A block of data stored in the cache, usually corresponding to a block in main memory.

  7. Replacement policy: The rules that determine which cache block is selected for replacement in the event of a cache miss. LRU is one of the common replacement strategies.


3.22 Purpose of separation of instruction Cache and data Cache

3.22.1 Reduce instruction pipeline resource conflicts

Explain the purpose of separating the instruction cache and data cache to reduce instruction pipeline resource conflicts.

Separation of instruction Cache and data cache: This is a way of organizing the cache structure in a computer system, in which the instruction cache (Instruction Cache) and the data cache (Data Cache) are two independent cache parts. Each part is dedicated to storing the corresponding type of information to improve the efficiency of memory access.

  • Instruction Cache (I-Cache) : used to store the instructions of computer programs. When the central processor needs to execute an instruction, it first searches the instruction Cache. If it hits (the instruction is found in the Cache), the instruction is fetched directly from the Cache for execution; otherwise it is loaded from main memory into the Cache and then executed.

  • Data Cache (D-Cache) : used to store the data of computer programs. When the central processor needs to read or write data, it first searches the data Cache. If it hits, the data is read or written directly in the Cache; otherwise the corresponding block is loaded from main memory into the Cache and then operated on.

The design of separating the instruction Cache and the data Cache helps to avoid access conflicts between instructions and data, improves the cache hit rate, and thereby improves the overall computer performance. This separation is usually designed to reduce cache contention that may occur when instructions and data are accessed at the same time, improve the parallelism of instructions and data, and thereby utilize computing resources more efficiently.
Instruction Pipeline:

Instruction pipeline is an optimization technology for computer instruction execution. By dividing different stages of instruction execution into multiple pipeline stages, multiple instructions can be executed at different stages at the same time, thereby improving the throughput of the processor. Each pipeline stage performs specific tasks such as fetch, decode, execute, memory access, and writeback. When an instruction completes a certain stage, it enters the next stage, and at the same time, the next instruction begins to execute the current stage, forming parallel execution of instructions in the pipeline.

The advantage of pipeline technology is that it can make full use of processor resources and increase the execution speed of instructions. Different pipeline stages can be executed in parallel, allowing multiple instructions to complete different stages within the same time period, thereby achieving higher efficiency. However, the pipeline also faces some problems, such as pipeline pauses, data correlation, branch prediction, etc., and corresponding measures need to be taken to solve them.

3.23 Data Cache Missing Rate Calculation

3.23.1 Calculate data cache miss rate

Evaluate the system's data access efficiency by calculating the miss rate of the data cache.
Direct Mapping: Direct mapping is a cache mapping technique in which each block in main memory can only be mapped to a specific location in the cache. Specifically, in direct mapping, each block of main memory can only be mapped to a specific block of cache, and this mapping relationship is directly determined by using certain bits of the main memory address. This means that if two main memory blocks have the same bits, they will map to the same location in the cache.

LRU replacement policy (Least Recently Used): LRU is a cache replacement policy that determines which blocks in the cache are selected to be replaced to make room for new blocks. In LRU policy, the block considered to be the least recently used will be replaced. Specifically, when a block is accessed, it is marked as the most recently used block. When a block needs to be replaced, the oldest unused block in the cache is selected for replacement.

Cache Miss Rate: Cache Miss Rate is an indicator used to measure cache performance. It indicates the proportion of the number of accesses that the cache cannot satisfy to the total number of accesses in a series of accesses. The lower the cache miss rate, the better the cache's performance because it means more accesses are successfully satisfied from the cache without loading data from main memory.

Block: The smallest unit of data transferred between cache and main memory. Also called cache block or main memory block. The block size is usually a power of 2, for example, 16 bytes, 32 bytes, etc. The size of the block determines the amount of data transferred between the cache and main memory. Larger blocks can increase the memory access hit rate and reduce the cache miss rate.
This C language program loops through the array a, increasing each array element a[k] by 32. The memory accesses during execution go through a direct-mapped data Cache with a 1KB data area and 16B blocks; the Cache is initially empty. The analysis is as follows:

  • Accessing a[k] in this program requires two memory accesses: one to read the value of a[k], and one to write back the new value.
  • Each element in array a is of type int, occupying 4B.

First, calculate the number of elements in array a: the array occupies 1KB and each element is 4B, so 1KB / 4B = 256 elements.

Then, consider that the size of each block is 16B, then each block can hold 4 elements (16B / 4B = 4).

From this we can see that array a has a total of 256 elements and each block can hold 4 elements, so 64 blocks are required (256 elements / 4 elements / block = 64 blocks).

During the execution of the program, the first access to each block of the array misses, causing the corresponding block (4 elements) to be loaded into the Cache. Each element is accessed twice (one read, one write back), so a block accounts for 8 accesses in total; the 7 accesses after the initial miss all fall within the block that has just been loaded and therefore hit the Cache.

Therefore, the cache miss rate of accessing array a during the execution of this program segment is approximately (1 miss/8 accesses) * 100% = 12.5%. The answer option is C.
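The program text itself is not reproduced in these notes; a minimal reconstruction consistent with the description above (256 int elements, each increased by 32) might look like this:

```c
#include <stdio.h>

#define N 256            /* 256 int elements x 4B = 1KB array */

int a[N];

int main(void) {
    /* Each iteration reads a[k] and writes the new value back: two
       accesses per element. With 16B blocks (4 ints per block), the
       first access to a block misses and the next 7 accesses hit.  */
    for (int k = 0; k < N; k++)
        a[k] = a[k] + 32;

    printf("expected miss rate: %.1f%%\n", 1.0 / 8.0 * 100.0);  /* 12.5% */
    return 0;
}
```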

3.24 Array access locality description

3.24.1 Temporal locality and spatial locality

Describe the time locality and space locality of array access and optimize memory access efficiency.
Temporal Locality

Temporal locality refers to the tendency that once a memory location is accessed during program execution, it may be accessed again in the near future. This locality is reflected in the program's repeated use of certain data or instructions, making full use of storage hierarchies such as cache and registers in the computer system, thereby improving program execution efficiency.

In computer systems, program execution usually follows regular patterns such as loops and sequential execution, so a given storage unit may be accessed many times within a short period. Keeping that unit's data in the Cache during this period reduces the number of accesses to main memory and increases program execution speed.

Spatial Locality

Spatial locality refers to the tendency that once a certain storage unit is accessed during program execution, nearby storage units will also be accessed quickly. This locality is reflected in the program's access to adjacent memory addresses within a period of time, usually in the form of access patterns to contiguous memory areas such as arrays and data structures.

Similar to temporal locality, spatial locality also exists to make full use of caches and other storage hierarchies and reduce access to main memory. Through technologies such as pre-reading and cache pre-fetching, the system can load nearby data into the cache in advance after discovering that a certain storage unit has been accessed to meet possible subsequent access needs and improve the overall performance of the program.
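A classic illustration of both kinds of locality is the order in which a two-dimensional array is traversed; in the sketch below, the row-major loop enjoys good spatial locality, while the column-major loop strides across memory and wastes most of each cache block it touches:

```c
#include <stdio.h>

#define ROWS 256
#define COLS 256

static int m[ROWS][COLS];   /* C stores this array row by row */

int main(void) {
    long sum = 0;

    /* Row-major traversal: consecutive accesses touch adjacent addresses,
       so every cache block loaded is fully used before being evicted.   */
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            sum += m[i][j];

    /* Column-major traversal: successive accesses are COLS*sizeof(int)
       bytes apart, so each one may land in a different cache block.     */
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            sum += m[i][j];

    printf("sum = %ld\n", sum);
    return 0;
}
```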

3.25 Cache row number calculation

3.25.1 Direct mapping method and write-back strategy

Calculate the number of cache rows in direct mapping mode and consider the impact of the write-back strategy on performance.

  1. Direct mapping method: It is a Cache mapping strategy in which each main memory block can only be mapped to a specific line of the Cache. That is, each block in main memory can only be placed in one location in the Cache. This can be achieved by using part of the block address to select the cache line.

  2. Write Back: It is a cache write strategy in which the contents of a Cache line are written back to main memory only when it is replaced. When a write occurs, the data is first written to the Cache rather than immediately written back to main memory. This way, only when the cache line is replaced, it needs to be written back to main memory.

  3. The number of bits in a cache line: the total number of bits in each line of the Cache, including data bits, tag bits, the dirty bit (Dirty Bit), the valid bit (Valid Bit), etc. In this context it includes data bits, tag bits, a dirty bit and a valid bit, for a total of 275 bits.

  4. Physical address: In computers, refers to the addressing of memory and peripherals in the computer. Here, it specifically refers to the main memory address, which usually consists of a tag (Tag), a group number (Index), an intra-block address (Offset) and other parts, and is used for Cache mapping and identification.

These terminology explanations cover computer architecture concepts related to Cache mapping methods, write strategies, and Cache row structures.


3.26 Calculation of the number and digits of Cache comparators

3.26.1 Cache with set associative mapping

Calculate the number and digits of comparators in the Cache under set associative mapping to meet system requirements.
set associative mapping

Set-associative mapping is a cache mapping method in which the Cache is divided into multiple sets, each containing several cache lines. A main memory address is split into a tag, a set index and a block offset; a block may be placed in any line of the set selected by its index, and each cache line keeps its own tag.

  • Set associative mapping structure: In this structure, main memory addresses are divided into groups and blocks, with multiple cache lines in each group. Cache lines are the same size as main memory blocks.

  • Number of sets and associativity: the associativity is the number of cache lines per set. With 8-way set associativity, each set contains 8 cache lines, and each line has its own tag.

  • Comparator: Used to check on access whether the tag in the cache matches the tag of the main memory block to be accessed. Within a given group, the tags of each cache line need to be compared.

  • Example: If the Cache uses 8-way set associative mapping, there are 8 cache lines in each set, so 8 comparators are needed to compare the tags of the 8 lines in parallel; each comparator is as wide as the tag field.

The advantage of the set-associative mapping structure is that it has better flexibility and higher hit rate than direct mapping, but accordingly, more hardware overhead is required to implement it.


3.27 Computer memory access

3.27.1 Virtual storage concept

Explain the concept of virtual storage and introduce how to improve system performance and flexibility through virtual storage.

3.27.2 The role of TLB (fast table) and Cache

Explain the role of TLB and Cache in computer memory access to improve memory access efficiency.

3.27.3 Page fault and missing processing during memory access

Describe in detail the page fault and missing processing mechanisms that may occur during computer memory access to ensure correct acquisition of data.
Computer clock cycles and bus bandwidth

CPU and bus clock cycles

  • CPU clock cycle: the duration of one tick of the CPU clock; it is the reciprocal of the clock frequency, that is, 1/frequency. Here the CPU clock frequency (main frequency) is 800MHz, so the CPU clock cycle is 1/800MHz = 1.25ns.

  • Bus clock cycle: refers to the time required for the bus to perform a clock oscillation, and is also the reciprocal of the bus frequency. Here, the bus frequency is 200MHz, so the bus clock period is 1/200MHz = 5ns.

Bus bandwidth

  • Bus bandwidth: refers to the amount of data transmitted through the bus in unit time. For a 32-bit bus, 32 bits of data are transferred per clock cycle, so the bandwidth is 32 bits × clock frequency. Here, the bus width is 32 bits and the clock frequency is 200MHz, so the bus bandwidth is 32 bits × 200MHz = 800MB/s or 4B/5ns = 800MB/s.

Read bus transactions when cache is missing

  • Cache block size: refers to the size of each block in the cache, usually in bytes. Here, the Cache block size is 32B.

  • Read burst transfer bus transaction: When a cache miss occurs, a block needs to be read from main memory into the cache. This process is called a read burst transfer bus transaction.

Memory bus read transaction time

  • Read transaction time: includes processes such as sending addresses and commands, preparing data in the memory, and transmitting data. Here, the time of each read transaction is the sum of the address transfer time, the first read data time, and the data transfer time.

CPU execution time

  • Instruction execution time: refers to the time required for the CPU to execute an instruction. It can be divided into the execution time when the cache hits and the average additional overhead when the cache is missing.

  • Average overhead: refers to the average additional overhead caused by cache misses for each instruction, including the comprehensive consideration of cache access times, miss rates, and read bus transaction times.

The explanations of these professional terms cover key concepts such as clock cycles, bus bandwidth, cache misses, and memory bus transactions in computer architecture.
Cache capacity

Cache capacity refers to the amount of data or instructions that the cache memory can hold. In computers, Cache is used to store the most frequently accessed data and instructions to improve the access speed to these data and instructions.

  • Calculation formula: The Cache capacity calculation usually depends on the number of cache lines, the size of each line (in bytes), and the associativity (number of ways).

  • Capacity calculation in direct mapping mode: For direct mapping, each line corresponds to one main memory block, and the formula is Cache capacity = number of lines × line size. The line size includes the data part, the tag part, the valid bit, etc.

  • Capacity calculation in set-associative mapping mode: For set-associative mapping, the capacity takes into account the number of lines per group, the size of each line, and the number of groups. The formula is Cache capacity = number of lines per group × line size × number of groups.
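A quick sketch of the two capacity formulas above, using illustrative (assumed) numbers:

```python
# Illustrative numbers only (assumptions), checking the two formulas above.
line_size = 64                                # bytes of data per line
direct_lines = 1024
direct_capacity = direct_lines * line_size    # 1024 x 64 B = 64 KB of data

ways, groups = 8, 128                         # 8 lines per group, 128 groups
assoc_capacity = ways * line_size * groups    # also 64 KB of data
print(direct_capacity, assoc_capacity)        # 65536 65536
```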

Cache line number calculation

In the set-associative mapping method, determining the Cache group number for a specific main memory address involves decomposing the address and applying the mapping rule. The general calculation is: Cache group number = [(start address + offset within the data) / block size] mod number of groups. The factors involved are the starting main memory address, the offset, the block size, and the number of groups.

Hit rate

Cache hit rate is the proportion of accesses during program execution whose data or instructions are already in the Cache. The calculation formula is hit rate = (total number of accesses − number of misses) / total number of accesses. For different memory access patterns (row-first traversal vs. column-first traversal), the hit rate can differ significantly, affecting program execution efficiency.

In the above problems, the calculation of Cache capacity, determination of Cache line number, and hit rate analysis of programs A and B all involve the application of these concepts.
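To make the traversal-order effect concrete, here is a small direct-mapped cache simulation (all parameters are assumptions, not taken from the original problem) contrasting row-first and column-first traversal of a row-major array:

```python
BLOCK = 64            # bytes per cache block
ELEM = 4              # bytes per array element
ROWS = COLS = 128     # array dimensions, row-major storage assumed
LINES = 64            # direct-mapped cache with 64 lines

def hit_rate(row_major: bool) -> float:
    cache = {}        # line index -> block number currently stored there
    hits = 0
    if row_major:
        coords = [(r, c) for r in range(ROWS) for c in range(COLS)]
    else:
        coords = [(r, c) for c in range(COLS) for r in range(ROWS)]
    for r, c in coords:
        block = (r * COLS + c) * ELEM // BLOCK
        line = block % LINES
        if cache.get(line) == block:
            hits += 1
        else:
            cache[line] = block      # miss: load the block into its line
    return hits / len(coords)

print("row-first   :", hit_rate(True))   # 0.9375 (15/16)
print("column-first:", hit_rate(False))  # 0.0 -- every access misses here
```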



3.28 Memory Hierarchy and Performance

3.28.1 Memory Hierarchy Overview

Summarize the memory hierarchy, including main memory, Cache, and auxiliary memory, to improve overall system performance.

3.28.2 CPI calculation when Cache hits

Calculate the CPI (number of cycles per instruction) when the cache hits and evaluate the cache's improvement in instruction execution efficiency.

3.28.3 Memory bus bandwidth calculation

Calculate the bandwidth of the memory bus to ensure the efficiency and speed of data transmission.

3.28.4 Memory Bus Burst Transfer

Analyze the burst transmission mechanism of the memory bus and optimize the continuous transmission of data.

  1. Clock period and bus bandwidth
  • CPU clock cycle: The time it takes for the CPU to complete one clock oscillation, usually the reciprocal of the main frequency. Here, the CPU frequency is 800MHz, so the CPU clock cycle is 1/800MHz = 1.25ns.

  • Bus clock cycle: The time required for the bus to perform a clock oscillation, which is the reciprocal of the bus frequency. The bus frequency is 200MHz, so the bus clock period is 1/200MHz = 5ns.

  • Bus bandwidth: refers to the amount of data transmitted through the bus per unit time. For a 32-bit bus, 32 bits of data are transmitted per clock cycle, and the bandwidth is 32 bits × clock frequency. Here, the bus width is 32 bits, the clock frequency is 200MHz, and the bandwidth is 32 bits × 200MHz = 800MB/s or 4B/5ns = 800MB/s.

  2. Cache misses and read bus transactions
  • Cache block size: refers to the size of each block in the cache, usually in bytes. Here, the Cache block size is 32B.

  • Read burst transfer bus transaction: When a cache miss occurs, a block needs to be read from main memory into the cache. This process is called a read burst transfer bus transaction.

  3. Memory bus read transaction time
  • Read transaction time: includes sending the address and command, the memory preparing the data, and transmitting the data. Here, the time of each read transaction is the sum of the address transfer time, the memory's first-data access time, and the data transfer time: 5ns + 40ns + 8×5ns = 85ns.
  4. CPU execution time
  • Instruction execution time: The time it takes for the CPU to execute an instruction. It includes the instruction execution time when cache hits and the average overhead when cache misses.

  • Average overhead: The average overhead caused by cache misses, taking into account the average number of memory accesses, cache miss rate and read bus transaction time. Here, the average CPU execution time of an instruction is 10.1ns, and the total CPU execution time of BP is 1010ns.

These glossaries cover key concepts in computer architecture such as clock cycles, bus bandwidth, cache misses, read bus transactions, and CPU execution time.
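The quoted figures can be checked with a few lines of arithmetic, using only the numbers stated above (800 MHz CPU, 200 MHz 32-bit bus, 32 B cache block, 40 ns memory preparation time):

```python
# Verifying the numbers above.
cpu_cycle = 1 / 800e6              # 1.25 ns CPU clock cycle
bus_cycle = 1 / 200e6              # 5 ns bus clock cycle
bandwidth = 4 * 200e6              # 4 B per bus cycle -> 800 MB/s

# Read burst transaction: 1 address cycle + 40 ns memory prep + 8 data cycles
# (32 B block / 4 B bus width = 8 transfers).
read_transaction = bus_cycle + 40e-9 + 8 * bus_cycle   # 85 ns
print(cpu_cycle, bus_cycle, bandwidth, read_transaction)
```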


3.29 Page virtual storage management

3.29.1 Basic concepts of page storage

Introduces the basic concepts of page storage, including virtual pages, physical pages, and page tables.

3.29.2 Fully associative mapping of TLB

Detailed description of TLB's fully associative mapping method to improve address mapping efficiency.

3.29.3 Cache data area and page table entry size calculation

Calculate the size of the cache data area and page table entries to ensure that they match and work together.

3.29.4 Virtual address to physical address conversion process

Describe the conversion process from virtual address to physical address and reveal the working principle of page-based virtual storage.

  1. Page virtual storage management method
  • Page size: The size of each page, in bytes. Here, the page size is 8KB.

  • Virtual address and physical address: The virtual address is the address used in the program, and the physical address is the actual address in main memory. Here, the virtual address is 32 bits and the physical address is 24 bits.

  • TLB (Translation Lookaside Buffer): Cache of page table entries, used to accelerate the conversion of virtual addresses to physical addresses.

  • Fully associative mapping: Each entry in the TLB can be mapped to any virtual page, providing greater flexibility.

  • Cache data area size and group association: The size of the area where Cache stores data, and the group association mapping method. Here, the Cache data area size is 64KB, using a two-way set associative method.

  • Main memory block size: The size of the data block in main memory. Here, the main memory block size is 64B.

  2. Schematic diagram of the storage access process
  • The number of bits in fields A~G: A and B are the virtual page number bits, C is the physical page frame number bits, D is the intra-page offset bits, E and F are the Cache tag and group index bits, and G is the block offset bits. Here, A = B = 19, C = 11, D = 13, E = F = 9, G = 6.

  • TLB tag field B: stores the virtual page number, indicating the page table entry of which virtual page the TLB entry corresponds to.

  3. Cache mapping and block number loading
  • Cache group number: The number of each group in the Cache, used to determine which group to map to. Here, block number 4099 is mapped to Cache group number 3.

  • Content of field H: A field in Cache, related to the group number. Here, the H field content corresponding to block number 4099 is 0 0000 1000B.

  4. Cache miss and page fault handling
  • Cache miss: A Cache miss occurs when the accessed data is not in the Cache. Handling a Cache miss is usually far less expensive than handling a page fault.

  • Page fault processing: When the accessed data is not in main memory, the missing page needs to be loaded from external storage (usually the hard disk). This process is expensive because it requires disk I/O operations.

  • Write-through strategy and write-back strategy: With write-through, every modification of the data is immediately written back to main memory; with write-back, modified data is written back to main memory only when it is replaced out of the Cache. Here, the Cache may adopt a write-through strategy, but modified page content is always handled with a write-back strategy, because disk write operations are much slower than main memory write operations.

These terminology explanations cover important concepts in computer architecture such as page virtual storage management, TLB, Cache mapping, block number loading, Cache missing and page fault handling.
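The mapping of block 4099 quoted above can be reproduced directly from the stated parameters (64 KB two-way set-associative data area, 64 B blocks, hence 512 groups):

```python
# Reproducing the quoted mapping of main memory block 4099.
NUM_GROUPS = (64 * 1024) // (64 * 2)      # 64 KB data / (64 B x 2 ways) = 512
block_number = 4099
group_number = block_number % NUM_GROUPS  # 4099 mod 512 = 3
tag = block_number // NUM_GROUPS          # 4099 // 512 = 8
print(group_number, format(tag, "09b"))   # 3 000001000  (i.e. 0 0000 1000B)
```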


3.30 Write-through and write-back strategies

3.30.1 Cache write strategy overview

Summarize Cache's write strategies, including write-through and write-back, and their advantages and disadvantages.

3.30.2 Differences between write-through and write-back strategies

Compare the differences between write-through and write-back strategies to help choose a write strategy suitable for the system.

3.30.3 Why use the write-back strategy when modifying page content?

Explain why you choose the write-back strategy when modifying page content, and analyze its advantages and application scenarios.

  1. Cache line mark, LRU bit, modification bit
  • Tag: Used to identify the uniqueness of the main memory block in the Cache, and determine whether it is a hit by comparing it with the high bits of the main memory address.

  • LRU bit: Least Recently Used, indicating the least recently used bit, used to implement the LRU replacement algorithm. The LRU bits of each row are used to record the last time this row was accessed to determine which row is the least recently used when replacing.

  • Modification bit: also called dirty bit, used to identify whether the data in the cache has been modified. In the write-through strategy, the modified data will be written back to the main memory immediately, so this bit is generally not needed.

  2. Number of data cache misses when accessing array s
  • Main memory block size: The size of the data block in main memory, here is 64B.

  • Number of Cache groups: The number of groups in the Cache, based on 8-way group associative mapping, is calculated as 32KB/(64B×8) = 64.

  • LRU replacement algorithm: Least Recently Used replacement algorithm, used to select the least recently used row in the Cache row for replacement.

  3. The process of accessing instructions from the Cache
  • Main memory address division: The main memory address is divided into its constituent fields (tag, group number, block offset).

  • Group index: Cache group index calculated based on the main memory address, used to locate the corresponding Cache group.

  • Valid bit: The valid bit of the Cache line identifies whether the line stores valid data.

  • LRU replacement process: Select the least recently used row in the cache for replacement, usually by updating the LRU bit.

These terminology explanations cover computer architecture concepts related to cache line structure, LRU replacement algorithm, and cache misses.
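As a sketch of how the LRU bits drive replacement within one group, here is a minimal (hypothetical) model of a single cache set; the OrderedDict insertion order, oldest first, stands in for the per-line LRU bits:

```python
from collections import OrderedDict

class LRUGroup:
    def __init__(self, ways: int = 8):
        self.ways = ways
        self.lines = OrderedDict()          # tag -> line contents (ignored here)

    def access(self, tag: int) -> bool:
        """Return True on hit; on a miss, load tag, evicting the LRU line."""
        if tag in self.lines:
            self.lines.move_to_end(tag)     # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[tag] = None
        return False

g = LRUGroup(ways=2)
print([g.access(t) for t in (1, 2, 1, 3, 2)])  # [False, False, True, False, False]
```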


3.31 Memory access hit rate

3.31.1 Memory access hit rate calculation

Evaluate the system's data access efficiency by calculating the memory access hit rate.

3.31.2 Cache, TLB and Page hit analysis

Analyze Cache, TLB and Page hits in memory access to reveal the impact of different components on performance.

3.31.3 Impact of hit rate on system performance

Discuss the impact of memory access hit rate on system performance and provide guidance for system optimization.

  1. TLB (Translation Lookaside Buffer): It is a cache used to store the translation mapping from virtual addresses to physical addresses. TLB accelerates the conversion process of virtual memory addresses to physical memory addresses and reduces the time to access main memory. When the CPU accesses a virtual address, the TLB is first queried. If the virtual page number is in the TLB, the TLB is hit and the corresponding physical page frame number is directly obtained. Otherwise, a TLB miss will occur.

  2. Cache Miss: When performing memory access in the Cache, if the required data or instructions are not found in the Cache, a Cache Miss occurs. At this time, the system needs to load the corresponding data block or instruction block from the main memory into the Cache so that the CPU can continue execution.

  3. Page miss (Page Fault): In the virtual memory system, if the CPU tries to access a virtual page that has not been transferred into the main memory, a Page miss will occur. At this time, the operating system will load the corresponding page from the disk to the main memory to satisfy the CPU's memory access request.

In the given options, the combination described in option D (TLB hit, Cache hit, Page miss) cannot occur: if a page miss occurs, the page is not in main memory, so its page table entry cannot be in the TLB, and a TLB hit is therefore impossible.

3.32 Virtual address translation

3.32.1 Meaning of TLB tag field

Explain the meaning of the TLB tag field and its role in virtual address translation.

3.32.2 TLB and Cache access process

Describe the access process of TLB and Cache in detail, and reveal their cooperative work in virtual address translation.

3.32.3 Analysis of virtual address translation results

Analyze the results of virtual address translation to ensure that the system can correctly access the corresponding physical address.

  1. Page storage management (Paging): A virtual memory management technique that divides main memory and virtual memory into fixed-size pages (Page), usually 4KB or another power-of-two size. This division splits the program's address space into pages of the same size. Page storage management allows a program's pages to be stored non-contiguously in main memory, without requiring consecutive physical addresses.

  2. TLB (Translation Lookaside Buffer): It is a cache used to store the translation mapping from virtual addresses to physical addresses. TLB accelerates the conversion process of virtual memory addresses to physical memory addresses and reduces the time to access main memory. When the CPU accesses a virtual address, the TLB is first queried. If the virtual page number is in the TLB, the TLB is hit and the corresponding physical page frame number is directly obtained.

In this problem, through the TLB's fully associative mapping, the TLB entry corresponding to virtual page number 03FFFH is looked up, yielding page frame number 0153H. Concatenating the frame number with the intra-page offset of the virtual address gives the physical address 0153180H.
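The translation can be re-derived in a few lines; the 12-bit page offset (4 KB page) is an assumption inferred from the quoted result 0153180H:

```python
PAGE_BITS = 12
tlb = {0x03FFF: 0x0153}          # virtual page number -> page frame number

vpn, offset = 0x03FFF, 0x180     # offset taken from the quoted physical address
frame = tlb[vpn]                 # fully associative TLB lookup (hit)
physical = (frame << PAGE_BITS) | offset
print(format(physical, "07X"))   # 0153180
```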

3.33 Computer instruction execution process

3.33.1 Memory access during instruction execution

Reveal the memory access process during computer instruction execution and emphasize the importance of memory to instruction execution.

3.33.2 Cache access count estimation

Estimate the number of Cache accesses to provide basic data for system performance analysis.

3.33.3 TLB hit and Cache hit analysis

Analyze TLB and Cache hits to gain an in-depth understanding of their contribution to instruction execution efficiency.
Explain professional terms:

  1. Page virtual storage management method: A computer memory management method that divides main memory and auxiliary memory into fixed-size pages (usually 4KB or 8KB), and divides programs and data into pages of the same size. Virtual storage space is also divided into pages of the same size. When a program is executed, only the pages currently needed are loaded into main memory, while the remaining pages remain in secondary storage. This allows the size of the program to exceed the size of physical memory.

  2. TLB (Translation Lookaside Buffer, fast table): a hardware cache used to store virtual address to physical address conversion information to accelerate the address translation process. The TLB is usually a small and fast cache that stores recently accessed page table entries, reducing the time to obtain translation information from main memory.

  3. Cache: A cache in the memory hierarchy used to temporarily store the most commonly used data and instructions to increase the speed of CPU access to memory. The Cache stores recently used data. If the data the CPU needs is found in the Cache (hit), it can be obtained faster; otherwise, it needs to be obtained from main memory or other slower memory levels (missing).

  4. Write Through: A cache writing strategy that writes data to the main memory immediately when it is written to the cache. This ensures that the data in main memory and cache are consistent, but the write overhead is larger.

  5. Memory Address: An address used to uniquely identify a unit in computer storage. In this context, xaddr represents the address of the storage unit corresponding to variable x.

  6. Fetch, operate and write-back process: the three basic steps when executing an instruction. The fetch phase reads data from memory, the operate phase performs arithmetic or logical operations, and the write-back phase writes the result back to memory.

Hopefully the above explanation will help you understand the relevant computer architecture and operating system concepts.

3.34 Cache in direct mapping mode

3.34.1 Explanation of Cache related concepts

Explain the related concepts of Cache in direct mapping mode, including row tags, indexes, blocks, etc.

3.34.2 Direct mapping address structure

Describe in detail the address structure of Cache in direct mapping mode, including the location and role of fields such as tags, indexes, and block offsets.

3.34.3 Cache capacity calculation

Calculate the Cache capacity in direct mapping mode to ensure that it meets the system's storage capacity requirements.

3.34.4 Cache line tag entry structure

Describe the structure of Cache line tag items in direct mapping mode, and reveal the function and storage structure of the tag.
Explain professional terms:

  1. Direct-mapped mode: A cache mapping mode in which each main memory block can only be mapped to a specific line of the cache. This means that a specific block of main memory can only be stored in one location in the cache and has no alternative location.

  2. Main memory block: A fixed-size block of data in main memory; the Cache holds copies of such blocks. The size of the main memory block is determined by the computer architecture.

  3. Word: In computers, a word is a group of bits, usually 32 or 64 bits, that represents the size of an integer or other data type. Here, the word size is 32 bits.

  4. Write-back mode: A cache write strategy in which modified data in the cache is written back to main memory only when the block is replaced. This reduces the number of write operations to main memory and improves performance.

  5. Cache line: A storage unit in the cache that usually corresponds to a block in main memory. Each Cache line contains a portion of the main memory block's data and additional information for marking and control.

  6. Cache capacity: The total amount of data that the cache can store. Here, it refers to the total capacity of the Cache that can store 4K words of data.

  7. Tag bit: In the cache line's tag array, the tag bits store identification information of the main memory block, used to determine whether the requested data is in the Cache.

  8. Valid bit: In the cache line's tag array, the valid bit is used to indicate whether the data in the cache line is valid. If the valid bit is 1, it means that the data in the cache line is valid.

  9. Consistency maintenance bit (dirty bit): In cache, the consistency maintenance bit is used to mark whether the data in the cache line has been modified. When using write-back mode, the dirty bit is used to indicate whether the data in the cache needs to be written back to main memory.

  10. Replacement algorithm control bit: In the cache, the replacement algorithm control bit is used to record which replacement algorithm is used. Here, the question does not mention relevant information about the replacement algorithm, so it will not be discussed further.

Based on the above explanations of professional terms, you can better understand the related concepts of computer architecture and cache management.
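Here is a hedged capacity sketch for a direct-mapped, write-back cache of this kind. The 4K words of 32-bit data come from the text above, while the block size and main memory address width are illustrative assumptions:

```python
# Hedged sketch: block size and address width below are assumptions.
WORD_BITS = 32
DATA_WORDS = 4 * 1024
WORDS_PER_BLOCK = 4                          # assumption: 4-word blocks
LINES = DATA_WORDS // WORDS_PER_BLOCK        # 1024 lines -> 10 index bits
ADDR_BITS = 24                               # assumption: 24-bit byte address
offset_bits = 4                              # 4 words x 4 B = 16 B per block
index_bits = 10
tag_bits = ADDR_BITS - index_bits - offset_bits        # 10 tag bits

# Per line: data + tag + valid bit + dirty bit (write-back needs the dirty bit).
line_bits = WORDS_PER_BLOCK * WORD_BITS + tag_bits + 1 + 1   # 140 bits
print(tag_bits, line_bits, LINES * line_bits)                # 10 140 143360
```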

3.35 Page fault handling and exceptions

3.35.1 Causes of page fault processing exceptions

Explain the causes of page fault processing exceptions, including the situation where the virtual page is not in main memory.

3.35.2 Overview of page fault processing process

Outline the page fault handling process, including loading the missing page from auxiliary storage to main memory.

3.35.3 Execution process of page fault handler

Describe the execution process of the page fault handler in detail to ensure that page fault exceptions are handled effectively and the correct data state is maintained.
When it comes to page fault handling in request paging systems, the following terminology explanations relate to these statements:

  • Page fault : A page fault refers to an abnormal situation when the CPU detects that the required page is not in main memory during program execution. This means that the required page is not currently in memory, resulting in a page fault interrupt.

  • Page fault handler : A piece of code provided by the operating system that handles page fault interrupts. Its function is to load the required pages from external storage (such as hard disk) into memory based on the address information provided by the page fault interrupt.

  • Page fault interrupt : When the CPU detects that the required page is not in memory, a page fault interrupt is generated. This is an abnormal situation detected by the CPU. The CPU will temporarily stop executing the program and ask the operating system for help to solve the page fault problem.

  • Address Translation Exception : A page fault is an address translation exception that occurs when the CPU attempts to access a page that does not exist in main memory.

Option D in the description is wrong because, after page fault processing is completed, the CPU does not move on to the instruction following the one that caused the page fault; instead, it re-executes the faulting instruction. Re-execution is needed because the instruction could not complete when the fault occurred; once the missing page has been loaded into memory, the instruction can run to completion.

3.36 TLB and Cache features

3.36.1 Relationship between hit rate and locality of TLB and Cache

Analyze the relationship between the hit rate and locality of TLB and Cache, and consider the impact of locality on address mapping.

3.36.2 Analysis of memory access process when missing

When TLB or Cache is missing, analyze the additional memory access process performed by the system to understand the missing processing mechanism.

3.36.3 Feasibility of missing hardware implementation

Evaluate the feasibility of missing processing hardware implementation, taking into account factors such as cost and performance.

  1. TLB (Translation Lookaside Buffer) : TLB is a hardware cache used to store the most recent or most commonly used virtual address to physical address translation information. It accelerates the conversion process from virtual address to physical address, caches some page table entries, reduces frequent access to the page table, and improves the speed of address conversion.

  2. Cache : Cache is a buffer that stores copies of the most commonly used data and instructions in main memory. It sits between the CPU and main memory and has a much faster access speed, bridging the gap between the slower main memory and the faster CPU and improving the overall performance of the system.

  3. DRAM (Dynamic Random Access Memory) : DRAM is a dynamic random access memory used for main memory. Each memory cell consists of a capacitor and a transistor and must be refreshed regularly to maintain the stored data. Compared with SRAM, DRAM has slower access speed but higher storage density and lower cost.

  4. SRAM (Static Random Access Memory) : SRAM is a static random access memory, commonly used for Cache and other high-speed buffers. Its cells are built from flip-flops, so it does not require periodic refreshing; it is faster but more expensive, and is used where the fastest access speeds are required.

3.37 Virtual address translation and storage management

3.37.1 Paging virtual storage management method

3.37.1.1 Conversion between virtual address and physical address
3.37.1.2 The impact of page size on the number of address bits

Explain the paging virtual storage management method, analyze the conversion between virtual address and physical address, and the impact of page size on the number of address bits.

3.37.2 The role and organization of page tables

3.37.2.1 Explanation of virtual page number, real page number, and existence bit

Describe the role of the page table and explain the meaning of fields such as virtual page number, real page number, and existence bit.

3.37.3 Process of CPU accessing virtual address

3.37.3.1 Page table query and TLB mapping
3.37.3.2 Virtual address to physical address mapping

Describes the process of CPU accessing virtual addresses, including page table query, TLB mapping and virtual address to physical address mapping.
In computer virtual storage management, the following are explanations of related terms:

  1. Paging virtual storage management method : A memory management technology that divides the process memory into fixed-sized pages (page frames), and divides the process's virtual address space into virtual pages of the same size. When a process needs to access a virtual page, the system maps it to a physical page (page frame) in main memory (physical memory).

  2. Page size : refers to the capacity of one page in memory. It determines the number of bytes in a single page of memory. In this example, the page size is 4KB (2^12 bytes).

  3. Virtual address space size : Indicates the total number of virtual addresses provided by the operating system for each process. In this example, the virtual address space size is 4GB (2^32 bytes).

  4. Virtual address : An address in the address space assigned to a process by the operating system. Typically, it is a value starting from 0 and going up to the size of the virtual address space.

  5. Page table : A data structure used for mapping between virtual addresses and physical addresses. Entries in the page table indicate the relationship between virtual pages and real pages.

  6. Virtual address translation : The process of converting virtual addresses in a process to corresponding physical addresses. This process involves querying the page table to find the physical page corresponding to the virtual page.

In this problem, according to the given page table content, virtual address 0008 2840H has virtual page number 00082H. Looking up the page table shows that the corresponding real page number (page frame number) is 018H. The actual main memory address 01 8840H is obtained by concatenating the page frame number with the intra-page offset.
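The same translation, re-derived in a few lines (4 KB pages, hence a 12-bit offset):

```python
PAGE_BITS = 12
page_table = {0x00082: 0x018}        # virtual page number -> page frame number

va = 0x00082840
vpn, offset = va >> PAGE_BITS, va & 0xFFF    # 00082H and 840H
pa = (page_table[vpn] << PAGE_BITS) | offset
print(format(pa, "06X"))             # 018840, i.e. 01 8840H
```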

3.38 Cache structure and page virtual storage management

3.38.1 Cache direct mapping method

3.38.1.1 The location and meaning of Cache line number, tag, valid bit and other fields
3.38.1.2 Virtual address to physical address mapping
3.38.1.3 Judgment of Cache hits and misses

Describe in detail the structure of the Cache direct mapping method, including the location and meaning of fields such as line numbers, tags, and valid bits, the mapping process from virtual addresses to physical addresses, and the judgment conditions for Cache hits and misses.

3.38.2 Cooperative work of Cache and page virtual storage

3.38.2.1 How Cache and page virtual storage work together
3.38.2.2 Selection of page replacement algorithm
3.38.2.3 Synchronization of data in Cache and physical memory

Detailed explanation of how Cache and paged virtual storage work together, including key aspects such as selecting the page replacement algorithm and ensuring synchronization of data in Cache with physical memory.

3.38.3 Maintaining cache consistency

3.38.3.1 Basic principles of MESI protocol
3.38.3.2 Challenges in cache consistency maintenance

Explain how to maintain cache consistency through methods such as the MESI protocol, and analyze the challenges faced in maintaining cache consistency.

3.39 Memory hierarchy optimization

3.39.1 Data transfer process in hierarchical structure

3.39.1.1 Data transfer between main memory, cache and registers
3.39.1.2 Timing and synchronization control of data transfer

Detailed description of the data transfer process in the memory hierarchy, including the transfer of data between main memory, cache and registers, as well as the timing and synchronization control of data transfer.

3.39.2 Optimization strategies for memory hierarchy

3.39.2.1 Mechanism of data prefetching
3.39.2.2 Selection of replacement algorithm

Discuss optimization strategies for memory hierarchies, including data prefetching mechanisms and selection of replacement algorithms, to improve system performance.

3.39.3 Memory bus conflict resolution

3.39.3.1 Bus separation and parallel transmission
3.39.3.2 Conflict detection and scheduling strategy

Methods to solve memory bus conflicts include bus separation and parallel transmission, as well as conflict detection and optimization of scheduling strategies.

3.39.4 The role of MMU and the mapping method of TLB

The role of MMU: The Memory Management Unit (MMU) is an important part of the computer architecture and is responsible for the mapping of virtual addresses to physical addresses. It implements the concept of virtual memory, allowing programs to use a larger address space than actual physical memory.

TLB mapping method: TLB (Translation Lookaside Buffer) is a cache used to accelerate the conversion of virtual addresses to physical addresses. The TLB stores the mapping relationship between virtual page numbers and physical page frame numbers.

3.39.4.1 Explanation of TLB fully associative mapping

In the fully associative mapping of the TLB, any virtual page can be mapped to any TLB entry, without fixed location constraints. The advantage of this mapping method is flexibility and a low conflict rate; the disadvantage is higher hardware cost, because the tags of all TLB entries must be compared in parallel on every lookup.

3.39.5 Cache mapping method and replacement strategy

3.39.5.1 Two-way group associative mapping and LRU replacement strategy

Two-way set associative mapping: In two-way set associative mapping, the Cache is divided into several groups, each group containing two cache lines. A given main memory block maps to exactly one group, but can be placed in either of the two lines within that group.

LRU replacement policy: LRU (Least Recently Used) is a cache replacement policy that selects the least recently used cache block for replacement. This means that the oldest cache block will be replaced.

3.39.5.2 Physical address translation process of virtual address

The physical address translation process of virtual addresses involves the collaborative work of the MMU and TLB. The brief steps are as follows:

  1. When a program accesses a virtual address, the MMU first checks whether there is a virtual address to physical address mapping in the TLB.

  2. If there is a mapping in the TLB, the physical address provided by the TLB is used directly.

  3. If there is no mapping in the TLB, the MMU matches the page number of the virtual address with the page table to find the corresponding physical page frame number.

  4. After the page table walk, the new mapping is added to the TLB; if the TLB is full, an entry in the TLB is first replaced according to the replacement policy.

  5. Finally, the physical address is obtained and the program can access the corresponding data. (A code sketch of these steps follows the term list below.)

Below is an explanation of related terms:

  1. Paged Virtual Memory Management : This management method divides the virtual address space and the physical address space into fixed-size pages or page frames. The virtual address is mapped to the physical address through the mapping relationship recorded in the page table.

  2. TLB (Translation Lookaside Buffer) : TLB is a cache used to store the most recent or most commonly used virtual address to physical address translation information. It accelerates the conversion process from virtual address to physical address, caches some page table entries, reduces frequent access to the page table, and improves the speed of address translation.

  3. Cache : Cache is a cache memory used to store copies of the most frequently used data and instructions. It is located between the CPU and main memory, providing faster data access speeds.

  4. Direct Mapping : A common Cache mapping method that maps each block in main memory to a unique line in the Cache. This means a main memory block can only be stored at one specific location in the Cache.

  5. LRU replacement algorithm (Least Recently Used Replacement Algorithm) : LRU is a Cache replacement algorithm that replaces the cache line that has gone unused for the longest time, retaining recently used data to improve the Cache hit rate.

  6. Write Back Policy : A data update strategy in Cache: modified data is written back to main memory only when the block is replaced out of the Cache, rather than immediately on every write.
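A minimal sketch of steps 1-5 above (the structures and capacities here are hypothetical, and page fault handling is omitted):

```python
PAGE_BITS = 12                                  # assumed 4 KB pages

def translate(va, tlb, page_table, tlb_capacity=16):
    vpn = va >> PAGE_BITS
    offset = va & ((1 << PAGE_BITS) - 1)
    frame = tlb.get(vpn)
    if frame is None:                           # step 3: TLB miss, walk page table
        frame = page_table[vpn]                 # page fault handling omitted
        if len(tlb) >= tlb_capacity:            # step 4: full TLB -> replace
            tlb.pop(next(iter(tlb)))            # placeholder for a real LRU choice
        tlb[vpn] = frame                        # refill the TLB
    return (frame << PAGE_BITS) | offset        # step 5: physical address

tlb, pt = {}, {0x00082: 0x018}
print(hex(translate(0x00082840, tlb, pt)))      # 0x18840; a repeat call hits the TLB
```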

3.40 Future Trends in Memory Technology

3.40.1 Research on new storage media

3.40.1.1 Innovation based on non-volatile storage media
3.40.1.2 Balance between storage density and read and write speed

Research the development trends of new storage media, including innovations based on non-volatile storage media and the balance between storage density and read and write speeds.

3.40.2 Evolution of memory architecture

3.40.2.1 Multi-channel, multi-level memory architecture
3.40.2.2 Trend of integrated storage and computing

Discuss the future evolution of memory architecture, including the trends of multi-channel, multi-level memory architecture and integrated storage and computing.

3.40.3 Challenges and solutions to memory security

3.40.3.1 Physical attacks and side channel attacks
3.40.3.2 Application of encryption and isolation technology

Analyze the challenges facing memory security, including physical attacks and side-channel attacks, and explore the application of encryption and isolation technologies.

3.40.4 TLB two-way group associative method

Allocation of TLB tags and group numbers: In the TLB two-way set-associative method, the virtual page number is divided into a tag and a group number, with the remaining low-order address bits forming the page offset. The tag uniquely identifies a virtual page's mapping within a TLB group, the group number selects the group in the TLB, and the offset locates the byte within the page.

3.40.5 TLB replacement strategy

Replacement process and LRU algorithm in TLB: When the required mapping is not hit in the TLB, replacement is required. When using the LRU algorithm, the least recently used TLB entry is selected for replacement. This ensures that relatively rarely used mappings are replaced, improving the TLB hit rate.

3.40.6 TLB hit and replacement of virtual address

TLB hit of virtual address: When the virtual address accessed by the program finds a mapping in the TLB, it is a TLB hit. At this time, the MMU can directly use the physical address provided by the TLB without looking up the page table.

TLB replacement of virtual address: When the mapping of the virtual address is not found in the TLB, TLB replacement is required. The replacement policy determines which TLB entry is selected for replacement, usually using the LRU algorithm to ensure that the least used entries are replaced. After the replacement is completed, the new mapping relationship will be added to the TLB to improve future access efficiency.

  1. Paged Memory Management : This is a virtual storage management method that divides the virtual address and physical address space into fixed-size pages or frames. By using page tables, virtual addresses are mapped to physical addresses for memory management and address translation.

  2. TLB (Translation Lookaside Buffer) : TLB is a cache used to store the most recent or most commonly used virtual address to physical address translation information. It accelerates the conversion process from virtual address to physical address and reduces frequent access to the page table, thus improving memory access speed.

  3. Two-way Set-Associative : A Cache organization in which each group (set) contains two cache lines. A main memory block maps to exactly one group and may be placed in either of that group's two lines, which improves the Cache hit rate compared with direct mapping.

  4. LRU replacement policy (Least Recently Used Replacement Policy) : LRU is a cache replacement policy used to select the cache line to be replaced when a group is full. It replaces the line that has gone unused for the longest time, making room for new data.

  5. Virtual Address : refers to the address used in the program, usually the address used in the code written by the programmer. Virtual addresses require address translation to map to actual addresses in physical memory.

  6. Page Size : refers to the size of a page in paging storage management, indicating the unit size of the virtual address space and the physical address space. Page size determines how the program is stored in memory and the size of the page table.

These terms cover key concepts common in computer memory management and storage systems, including virtual storage, caching technology, and address translation.

Chapter 4 Input/Output Systems

1. Display memory and bandwidth calculations

  • The relationship between display resolution, color depth, and frame rate
  • Calculation of bandwidth required to refresh screen
  • Allocation and calculation of total video memory bandwidth
  1. DRAM chip: Dynamic Random Access Memory chip, a type of memory used in computer systems. DRAM stores data as charge on capacitors and must be refreshed regularly to retain the stored information.

  2. Resolution: The number of pixels on the display, usually expressed as the number of horizontal pixels × the number of vertical pixels, used to describe the clarity of the screen and the details of the image.

  3. Color depth: describes the number of colors that can be represented by each pixel in an image, usually expressed in bits. 24-bit color depth means there are 2^24 different color choices for each pixel.

  4. Frame rate: refers to the number of image frames displayed per second, usually measured in Hertz (Hz). The higher the frame rate, the smoother the image appears on the screen.

  5. Total memory bandwidth: Used to describe the rate at which the graphics card or display memory transfers data, usually measured in bits (or bytes) transferred per second. The total bandwidth of the video memory determines the transfer speed of image data from the video memory to the display.

  6. Refresh the screen: Transfer image data in the video memory to the monitor to update the image displayed on the screen, usually at a certain frame rate.

  7. Mb/s: Megabit per second, a unit of network or data transmission rate, indicating the number of bits transmitted per second.

Please note that these terms are common concepts in the computer and display technology fields.
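As a worked example of the refresh-bandwidth calculation, assume a hypothetical 1024×768 display with 24-bit color depth refreshed at 60 Hz:

```python
h, v, depth_bits, frame_rate = 1024, 768, 24, 60

refresh_bw = h * v * depth_bits * frame_rate   # bits transferred per second
print(round(refresh_bw / 1e6, 2), "Mb/s")      # 1132.46 Mb/s
print(round(refresh_bw / 8 / 2**20), "MiB/s")  # ~135 MiB/s
```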

2. Data transmission on I/O bus

  • The functions of data lines, control lines, and address lines
  • The command word, status word, and interrupt type number in the I/O interface are transmitted on the data line.
    This question involves the information transmitted by the data line of the I/O (Input/Output) bus. The following are explanations of related professional terms:
  1. I/O bus: A communication channel used to connect the computer's central processing unit (CPU) and external devices for input and output operations.

  2. Data line: Part of the I/O bus, an electronic signal line used to transmit data between a computer and external devices. The data lines carry the actual data transfer.

  3. I/O interface: A hardware or electronic interface used to connect a computer system to a peripheral device, allowing data to be transferred between the computer and the peripheral device.

  4. Command Word: A control signal in the I/O interface, used to instruct peripherals to perform specific operations or commands.

  5. Status Word: A signal in the I/O interface that reflects the status of the peripheral, such as whether the operation is completed or whether an error occurred.

  6. Interrupt type number: used to indicate the type or source of an interrupt, helping the computer system determine how to handle a specific interrupt event.

In this question, the correct answer is D, because the information transmitted on the data line of the I/O bus includes the command word, status word and interrupt type number in the I/O interface. This information is transmitted through data lines to enable efficient communication between the computer and external devices.

3. I/O access

  • Combined use of status port and control port
  • I/O port definition
  • The difference between independent addressing and unified addressing
  1. Status port and control port: Two registers in the I/O interface, used to transmit status information and control information of peripherals respectively. They can share the same register or can be used independently, depending on the system design.

  2. I/O port: In the I/O interface, the registers accessible to the CPU are used for input and output operations. By reading and writing I/O ports, the CPU can communicate with external devices, execute specific commands or obtain device status information.

  3. Independent addressing method: A design method of the I/O interface in which the address space of the I/O ports is separate from the main memory address space and the two do not overlap. Dedicated I/O instructions access the ports, and control signals distinguish I/O accesses from memory accesses.

  4. Unified addressing: A design method for I/O interfaces in which the I/O ports are assigned addresses within the main memory address space (memory-mapped I/O). The same address lines address both main memory and I/O ports, and the ports can be accessed through ordinary memory access instructions.

These professional terms involve related concepts of input/output operations and interface design in computer systems, including the purpose of registers, the allocation of address space, and the communication mechanism between the CPU and external devices.

4. Data transfer by I/O instructions

  • How I/O instructions transfer data
  • Where data transfers occur, such as between general-purpose registers and I/O ports
  1. I/O instructions: Instructions issued by the CPU to perform input and output operations. These instructions include reading and writing operations on I/O ports to realize data transfer between the computer and external devices.

  2. I/O port: Register in the I/O interface, used for the transmission of buffered information. The CPU interacts with these ports through I/O instructions, reads or writes data, and communicates with external devices.

  3. General register: A register in the CPU used to store temporary data and operation results. During I/O operations, general purpose registers can be used to temporarily store or process data exchanged with external devices.

These technical terms involve concepts related to computer instructions, I/O operations, and registers.

5. I/O access

  • Classification of disk drives, printer adapters, network controllers, and programmable interrupt controllers
  • Functions and functions of I/O interfaces
  1. I/O interface (or I/O controller): A hardware or electronic interface responsible for connecting a computer system to external devices and managing input and output operations. Its main function is to receive I/O control signals sent by the host and implement information exchange between the host and external devices.

  2. Disk drive: A hardware device used to read and write data to a disk. Disk drives usually include components such as heads, disks, and related read and write circuits.

  3. Printer Adapter: An I/O controller that connects a computer to a printer and manages the printer's input and output operations. Printer adapters act as a bridge between your computer and printer, allowing them to work together.

  4. Network Controller: An I/O controller that connects computers to a network and handles input and output for network communications. The network controller is responsible for managing the transmission of data between the computer and the network, and enabling communication between the computer and other devices.

  5. Programmable Interrupt Controller: I/O controller that manages and distributes interrupt signals to help handle system interrupts. Interrupts are a mechanism that allow external devices to notify the CPU that an event has occurred, and a programmable interrupt controller helps the system handle these interrupts efficiently.

These technical terms involve concepts related to computer architecture, input/output operations, and external device connections.

6. External interruption

  • Events that cause external interrupts
  • Interrupt service routine execution sequence
  1. External interrupts: refer to interrupts generated by events other than when the CPU executes instructions. They are usually interrupts from outside the CPU and memory, such as interrupts caused by external events such as input devices and timers.

  2. Keyboard input: An external event involving an input device. Each keyboard input may cause the CPU to interrupt execution to read the input data.

  3. The divisor is 0: An exception, that is, an error that occurs during an arithmetic operation. Exceptions arise inside the CPU rather than from external devices.

  4. Floating-point operation underflow: Underflow occurs when the result of a floating-point operation is smaller in magnitude than the minimum representable floating-point value. It usually does not cause an interrupt; the result is instead treated as machine zero or handled in some other way.

  5. Memory page fault: An interrupt that may occur when the CPU attempts to access a page that does not exist in the memory while executing instructions. This is not an external interrupt either.

These professional terms involve concepts related to computer interrupt handling, abnormal situations, and external device events.

7. Single-level interrupt system

  • Execution sequence of single-level interrupt system
  • Operations within the interrupt service routine
  1. Single-level interrupt system: An interrupt system in which only one interrupt is serviced at a time; interrupt nesting is not allowed.

  2. Protect the scene: Before the interrupt service routine is executed, the current state of the CPU registers needs to be saved so that it can be restored correctly after the interrupt service routine is executed.

  3. Turn on interrupts: After the interrupt service routine is executed, interrupts need to be turned on to allow the system to respond to other interrupt events.

  4. Save breakpoints: Before the interrupt service routine is executed, the breakpoint position of the current program needs to be saved so that it can correctly return to the original execution position when the interrupt returns.

  5. Interrupt event processing: The main stage of interrupt service program execution, processing specific interrupt events, which may include identifying the source of the interrupt and performing corresponding operations.

  6. Restoring the scene: After the interrupt service routine is executed, the previously saved CPU register state needs to be restored in order to restore the program execution state before the interruption.

  7. Interrupt return: After the interrupt service routine is executed, return to the breakpoint position of the original program through the interrupt return instruction.

These professional terms involve the concepts of interrupt handling, interrupt service routines and single-level interrupt systems.

8. Multi-level interrupt system

  • The meaning and settings of interrupt mask words
  • Judgment and setting of interrupt priority
  1. Interrupt priority: Indicates the order in which the computer system selects the interrupts to be processed according to certain rules when multiple interrupt requests arrive at the same time. Higher priority interrupts are usually processed first.

  2. Interrupt mask word: A mechanism used to control the priority of interrupt processing. By setting different bits of the interrupt mask word, interrupts of different priorities can be masked or allowed. The setting of the mask word determines which interrupts can be responded to and which interrupts will be masked.

These professional terms involve the management of computer interrupt systems and the concept of interrupt priority.

9. Device I/O time calculation

  • Calculation of the I/O time used by the CPU as a percentage of the entire CPU time
  1. Clock frequency: refers to the clock frequency of a computer processor, measured in Hertz (Hz), indicating the number of clock cycles per second of the processor.

  2. Timed polling method: A device I/O control method in which the device status is checked at regular intervals, and the result of each check decides whether a data transfer is performed.

  3. Clock cycle: In a computer, a clock cycle is the duration of one oscillation of the processor clock, i.e., the reciprocal of the main frequency.
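A worked example of the percentage calculation (all numbers are assumptions for illustration):

```python
# Assumed figures: 50 MHz CPU, device polled 1000 times/s, 500 cycles/poll.
cpu_hz = 50e6
polls_per_second = 1000
cycles_per_poll = 500

polling_cycles = polls_per_second * cycles_per_poll   # 500,000 cycles per second
fraction = polling_cycles / cpu_hz
print(f"{fraction:.1%} of all CPU time is spent polling")   # 1.0%
```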

10. Handling of external interrupts

  • Interrupt the operation of implicit instructions
  • Interrupt service routine execution steps
  1. External interrupts: refer to interrupts generated by events other than when the CPU executes instructions. They are usually interrupts from outside the CPU and memory, such as interrupts caused by external events such as input devices and timers.

  2. Interrupt implicit instructions: refer to some operations that the system automatically performs when processing interrupts, usually including turning off interrupts, saving breakpoints, forming the entry address of the interrupt service program and sending it to the PC, etc.

  3. Turn off interrupts: The operation is to prevent other interrupts from occurring during the execution of the interrupt service routine to ensure the complete execution of the interrupt service routine.

  4. Saving the contents of general-purpose registers: After entering the interrupt service routine, in order to prevent the execution of the interrupt service routine from affecting the general-purpose registers, it is usually necessary to save the contents of the general-purpose registers.

  5. Form the entry address of the interrupt service routine and send it to the PC: The interrupt implicit instruction will set the program counter (PC) to the entry address of the interrupt service routine, thereby ensuring that control is transferred to the correct handler.

11. Comparison of interrupt I/O and DMA methods

  • The difference between interrupt I/O and DMA methods
  • Applicable scenarios for interrupt I/O and DMA methods
  1. Interrupt I/O mode: In this mode, each time the I/O device has transferred one datum, it sends an interrupt request to the CPU; the CPU handles it in the interrupt handler and then continues with other tasks.

  2. DMA mode: In this mode, a dedicated DMA controller transfers data directly between main memory and I/O devices without CPU intervention. The DMA controller performs the transfers by requesting use of the bus.

12. Interrupt response time and CPU utilization

  • Calculation of interrupt response time
  • Calculation of the percentage of time the CPU spends on device I/O as a percentage of the entire CPU time
  1. Interrupt request response time: refers to the time from when the device issues an interrupt request until the system detects the request and begins executing the interrupt service routine.

  2. Interrupt processing time: refers to the time required to execute the interrupt service program, including processing the interrupt request, saving the scene, executing the interrupt service program and restoring the scene, etc.

  3. The maximum delay time of interrupt response: indicates the maximum waiting time for interrupt response that the system can tolerate, that is, after the interrupt request is issued, the system needs to respond and start processing the interrupt within this time.

13. Interrupt I/O information exchange

  • Information exchanged between the CPU and the I/O ports in the print control interface
  1. Interrupt I/O mode: In this mode, the CPU handles requests from external devices through interrupt response. In the case of printout, communication between the print control interface and the CPU occurs through the I/O port.

  2. I/O port: refers to a specific port used for input and output operations, through which data can be exchanged with external devices.

14. Characteristics of DMA method

  • Settings before DMA transfer
  • The process by which the DMA controller requests bus usage rights and data transfers
  1. Multiple interrupt system: refers to the system that can handle multiple interrupt requests at the same time and manage multiple interrupt sources through mechanisms such as interrupt controllers or interrupt vector tables.

  2. Interrupt response: refers to the response measures taken by the system when an interrupt request occurs, usually including saving the current process state, executing an interrupt service routine, etc.

  3. Off interrupt state: refers to turning off interrupts during interrupt processing to prevent interference from other interrupts.

  4. Open interrupt state: refers to the state that allows interrupts after the interrupt processing is completed so that the system can respond to other interrupts.

15. Calculation of CPU time usage percentage

  • Calculation of the percentage of time the CPU spends on device input/output as a percentage of the entire CPU time
  1. External I/O interrupt: refers to an interrupt request from an external device, which usually requires the CPU to respond and execute the corresponding interrupt service routine.

  2. Interrupt controller: It is a hardware device responsible for managing and scheduling multiple interrupt sources. It determines which interrupt will be responded to based on the set interrupt priority.

  3. Interrupt implicit instructions: refer to some operations that the system automatically performs when processing interrupts, usually including turning off interrupts, saving breakpoints, initiating interrupt service routines, etc.

  4. Interrupt enabled state: refers to the state in which the CPU is allowed to respond to interrupts, usually expressed by setting the interrupt state to on.

16. CPU utilization in DMA mode

  • Calculation of the percentage of time the CPU spends on device input/output in DMA mode as a percentage of the entire CPU time
  1. Interrupt mode: means that the device sends an interrupt request to the CPU, and the CPU responds to the interrupt and executes the corresponding interrupt service program to complete the data exchange between the device and the CPU.

  2. Interrupt overhead: refers to the time required for each interrupt, including interrupt response and interrupt processing time.

  3. Data buffer register: It is a register used to store data in the device interface. The data in it can be transferred to the CPU through interrupts.

  4. Data transfer rate: Indicates the amount of data the device can transfer per second.

  5. CPU frequency: refers to the number of clock cycles executed by the CPU per second.
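To make the contrast concrete, here is a hedged comparison of the CPU time fraction spent on I/O under interrupt mode versus DMA mode (all figures are assumptions for illustration):

```python
# Assumed figures: 100 MHz CPU, 1 MB/s device, 4 B data register,
# ~200 cycles per interrupt; DMA: 4 KB blocks, ~2000 cycles setup+completion.
cpu_hz = 100e6
device_rate = 1e6                                    # bytes per second

interrupts_per_sec = device_rate / 4                 # one per 4 B datum
intr_fraction = interrupts_per_sec * 200 / cpu_hz    # 0.5 -> 50% of CPU time

blocks_per_sec = device_rate / 4096                  # one DMA block per 4 KB
dma_fraction = blocks_per_sec * 2000 / cpu_hz        # ~0.005 -> ~0.5%
print(f"interrupt mode: {intr_fraction:.0%}, DMA mode: {dma_fraction:.2%}")
```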

17. DMA mode and external I/O interrupt

  • Who completes the processing after the DMA transfer is completed?
  • Direct control method of data transfer in DMA mode
  1. DMA mode: means that the DMA controller in the system is responsible for directly controlling the transmission of data between the device and the memory, reducing the burden on the CPU.

  2. Device driver: is a software module that controls the operation of a specific device. In DMA mode, the device driver is usually responsible for setting DMA transfer parameters.

  3. DMA controller: is a hardware device used to manage DMA transfers. It requests bus usage rights and directly controls the transfer of data between memory and the device.

  4. Interrupt service routine: the program the CPU executes after the DMA transfer completes; it performs the post-processing for the transfer, such as checking whether the block was transferred correctly.

18. External interrupt event

  • Knowledge points: Types and events of external interrupts
  1. External interrupt event: refers to an interrupt caused by events other than the CPU executing instructions. These events are usually related to changes in the status of external devices or systems.

  2. Internal exception: refers to an abnormal situation caused by a program error or specific conditions, which is usually detected and processed internally by the CPU.

19. Characteristics of external interrupts

  • Knowledge point: Characteristics of maskable interrupts and non-maskable interrupts
  1. External interrupt: refers to an interrupt request from outside the CPU, usually triggered by the interrupt request line INTR and the non-maskable interrupt line NMI.

  2. Non-maskable interrupt (NMI): It is a special external interrupt that has a higher priority than maskable interrupts and can be responded to even if the CPU is in an interrupt-off state.

  3. Maskable interrupt: an interrupt from an external device that can be blocked by the interrupt mask word; the mask word can also be used to adjust the handling priority among interrupt sources.

20. Cycle-stealing DMA method

  • Knowledge point: cycle-stealing DMA data transfer (a worked example follows this list)
  1. DMA mode: means that the DMA controller in the system is responsible for directly controlling the transmission of data between the device and the memory, reducing the burden on the CPU.

  2. Cycle-stealing DMA mode: a DMA working method in which, each time a unit of data is ready, the DMA controller issues a bus request and "steals" one main-memory cycle from the CPU to transfer that data to main memory.

  3. Data block: refers to a piece of data transferred by DMA, usually composed of multiple bytes.

  4. Bus usage rights: the DMA controller obtains the right to use the bus while transferring data, temporarily preventing the CPU from accessing the bus during the stolen cycle.
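A small worked example, with assumed numbers, of how much main-memory bandwidth cycle stealing takes away from the CPU: each word the device delivers steals exactly one memory cycle.

```c
#include <stdio.h>

/* Worked example (assumed numbers): fraction of main-memory cycles a
   cycle-stealing DMA transfer takes from the CPU. */
int main(void)
{
    double mem_cycle_s = 100e-9; /* main-memory cycle time: 100 ns (assumed) */
    double words_per_s = 250e3;  /* device delivers 250K words/s (assumed)   */

    /* each word steals exactly one memory cycle */
    double stolen = words_per_s * mem_cycle_s;
    printf("Memory cycles stolen: %.1f%%\n", stolen * 100.0); /* 2.5% */
    return 0;
}
```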

21. Multiple interrupt system

  • Knowledge point: Response and detection conditions of multiple interrupt systems (a priority-selection sketch follows this list)
  1. Multiple interrupt system: a system containing multiple interrupt sources, where several interrupt requests may occur at the same time; the CPU selects which interrupt to respond to according to the priorities and the mask word settings.

  2. User mode and kernel mode: refer to the running status of the CPU. User mode executes user programs, while kernel mode executes operating system kernel code.

  3. Interrupt response cycle: refers to the entire process of interrupt response by the CPU after detecting an interrupt request, including steps such as saving breakpoints, selecting interrupt service routines, and executing interrupt service routines.

  4. Interrupt enable status: refers to whether the CPU allows interrupts to occur, usually controlled by the setting of the interrupt mask word.
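A minimal sketch, under an assumed convention (bit i of an 8-bit word stands for interrupt source i, a lower bit number means a higher priority, and a 0 bit in the mask word blocks that source), of how a multiple-interrupt system picks which pending, unmasked request to service.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed convention: bit i = interrupt source i; lower bit number =
   higher priority; a 0 bit in the mask word blocks that source. */
static int select_interrupt(uint8_t pending, uint8_t mask)
{
    uint8_t candidates = pending & mask;  /* drop masked-off requests    */
    for (int i = 0; i < 8; i++)           /* scan from highest priority  */
        if (candidates & (1u << i))
            return i;
    return -1;                            /* no unmasked request pending */
}

int main(void)
{
    /* sources 1 and 3 are pending; the mask word blocks source 1 */
    printf("service source %d\n", select_interrupt(0x0A, 0xFD)); /* -> 3 */
    return 0;
}
```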

22. Interrupt I/O mode

  • Knowledge point: Characteristics and applicable scenarios of interrupt I/O method
  1. Interrupt I/O mode: It is an I/O operation mode in which the peripheral notifies the CPU to complete the data transfer by issuing an interrupt request, without the need for the CPU to continuously poll and check the status of the peripheral.

23. Calculation of CPU time usage percentage

  • Knowledge point: Formula for calculating the CPU time usage percentage (a comparison sketch follows at the end of this section)

  • CPU frequency: refers to the number of oscillation cycles completed by the central processing unit (CPU) per second, usually in Hertz (Hz), indicating the running speed of the processor.

  • CPI (Cycles Per Instruction): the average number of clock cycles required to execute each instruction. The lower the CPI, the fewer clock cycles an instruction takes on average and the higher the performance.

  • Interrupt mode: a working mode of a computer system in which a peripheral issues an interrupt request asking the CPU to suspend its current task, execute the service program associated with the interrupt, and then return to the original task once the interrupt has been handled.

  • Interrupt service routine: A special program used to respond to interrupt requests, usually provided by the operating system. When an interrupt occurs in the system, the CPU will jump to execute the corresponding interrupt service routine to handle the interrupt event.

  • DMA (Direct Memory Access): Direct memory access is a data transmission method that allows peripherals to directly transfer data to the memory without direct participation of the CPU. DMA can improve the efficiency of data transmission and reduce the burden on the CPU.

  • DMA preprocessing and postprocessing: During the DMA transfer process, some preprocessing and postprocessing steps are involved, which are used to initialize the DMA controller and clean up after completing the transfer.

  • Memory access conflict: A conflict that may occur when two or more devices (such as CPU and DMA) try to access the same memory area at the same time. Memory access conflicts may lead to data inconsistency or transmission errors.

  • Data block transfer: In I/O operations, it refers to the unit of data transferred at one time. DMA usually transfers data in block sizes.

The above explanation provides the basic concepts of related professional terms and helps to understand the concepts of hardware and data transmission involved in computer systems.
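To tie these terms together, here is a hedged comparison, with assumed numbers, of the CPU overhead of interrupt-per-byte I/O versus DMA-per-block I/O for the same device.

```c
#include <stdio.h>

/* Comparison with assumed numbers: CPU overhead for the same device under
   interrupt-per-byte I/O versus DMA-per-block I/O. */
int main(void)
{
    double cpu_hz      = 100e6; /* 100 MHz clock (assumed)            */
    double rate_Bps    = 1e6;   /* device moves 1 MB/s (assumed)      */
    double irq_cycles  = 400;   /* cycles per interrupt, one per byte */
    double block_bytes = 8192;  /* DMA block size (assumed)           */
    double dma_cycles  = 1200;  /* DMA pre+post processing per block  */

    double irq_busy = rate_Bps * irq_cycles / cpu_hz;
    double dma_busy = (rate_Bps / block_bytes) * dma_cycles / cpu_hz;
    printf("interrupt mode: %7.2f%% of CPU time\n", irq_busy * 100.0); /* 400%   */
    printf("DMA mode:       %7.3f%% of CPU time\n", dma_busy * 100.0); /* ~0.146% */
    return 0;
}
```

An interrupt-mode figure above 100% means the CPU could not keep up at all, which is exactly why fast block devices use DMA.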

24. Computer performance parameters

  • Knowledge point: Calculation of computer performance parameters (a worked MIPS example follows this section)

  • MIPS (Million Instructions Per Second): a unit for measuring computer performance, indicating how many million instructions are executed per second. It is calculated as MIPS = clock frequency / (CPI × 10^6).

  • Cache hit rate: In a computer system, it refers to the probability that the data or instructions accessed by the CPU can be found in the cache (Cache). Expressed as a percentage, a high hit rate indicates efficient cache utilization.

  • Memory access bandwidth: refers to the rate of data transmission between the CPU and main memory, usually measured in the amount of data transmitted per second. High memory access bandwidth helps meet the CPU's high-speed reading and writing requirements for memory.

  • Page fault rate: in a virtual storage system, the frequency of page faults while a program runs, i.e., the probability that a page accessed by the CPU is not in main memory.

  • DMA request: During direct memory access (DMA), the peripheral device sends a request to the DMA controller to notify the DMA controller to perform a data transfer operation. DMA requests typically include requests to the memory bus.

  • Interleaved (cross) memory organization: main memory is divided into several banks, each working independently; consecutive accesses are directed to the banks in turn, improving the main memory's concurrent access performance.

The above explanation provides the basic concepts of related professional terms and helps to understand concepts related to performance, storage access, etc. in computer systems.
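A one-line worked example of the MIPS formula above, with an assumed clock frequency and CPI.

```c
#include <stdio.h>

/* Worked example (assumed numbers): MIPS = f / (CPI * 10^6). */
int main(void)
{
    double f_hz = 2.0e9; /* 2 GHz clock (assumed)                */
    double cpi  = 2.5;   /* average clock cycles per instruction */

    double mips = f_hz / (cpi * 1e6);
    printf("MIPS = %.0f\n", mips); /* 2e9 / 2.5e6 = 800 MIPS */
    return 0;
}
```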

25. Asynchronous serial communication of peripherals

  • Knowledge points: Asynchronous serial communication and serial-port data transmission (a frame-rate and parity example follows this section)

  • Asynchronous serial communication: a data transmission method in which bits are sent one at a time and each character is framed by its own start and stop bits, so there is no fixed time interval between characters and the clocks of the two communicating parties need not be synchronized.

  • Odd parity bit: a check method used in communications to verify data transmission. The parity bit is chosen so that the total number of 1 bits (data bits plus the parity bit) is odd, allowing single-bit errors to be detected.

  • Stop bit: In serial communication, it specifies the number of bits after the data bit and is used to inform the receiving end of the end of the data bit. Generally, 1 or 2 stop bits are used.

  • Interrupt mode: A method of handling I/O events in a computer system. When a peripheral needs processing, it notifies the CPU through an interrupt signal, and the CPU suspends the current task and switches to the interrupt service routine.

  • Interrupt response time: The time from when the CPU receives an interrupt request to when it starts executing the interrupt service routine, usually including the preparation time for interrupt response.

  • Interrupt Service Routine: A program that handles specific interrupt events. Including responding to interrupt sources, saving breakpoints and program status, and performing interrupt service related operations.

  • Clock cycle: the basic unit of time in a computer system, equal to the reciprocal of the CPU clock frequency, usually measured in nanoseconds (ns); executing one instruction typically takes several clock cycles (see CPI).

The above explanation provides the basic concepts of related professional terms and helps to understand concepts related to asynchronous serial communication, interrupt processing, etc. in computer systems.
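A short example, assuming a common frame format (1 start bit, 8 data bits, 1 odd-parity bit, 1 stop bit, i.e., 11 bits per character), of the two calculations these terms support: the odd-parity bit for a byte and the effective character rate at a given baud rate.

```c
#include <stdio.h>
#include <stdint.h>

/* Assumed frame: 1 start + 8 data + 1 odd-parity + 1 stop = 11 bits/char. */
static int odd_parity(uint8_t byte)
{
    int ones = 0;
    for (int i = 0; i < 8; i++)
        ones += (byte >> i) & 1;
    /* choose the parity bit so the total number of 1s is odd */
    return (ones % 2 == 0) ? 1 : 0;
}

int main(void)
{
    double baud = 9600;        /* 9600 bits/s line rate (assumed) */
    double bits_per_char = 11; /* 1+8+1+1 frame                   */
    printf("characters/s = %.1f\n", baud / bits_per_char);  /* ~872.7 */
    printf("parity bit for 0x41 = %d\n", odd_parity(0x41)); /* two 1s -> 1 */
    return 0;
}
```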

26. Calculation of time occupied percentage in I/O mode

  • Knowledge point: Calculate the time occupied percentage of each I/O mode (a worked polling example follows this section)

1) Timed-polling (programmed query) I/O method: an I/O data transmission method in which the CPU periodically polls the device status to check whether a data transfer is needed; if the device is ready, the CPU performs the corresponding input/output operation.

2) Interrupt I/O mode: An I/O data transmission mode. When the device is ready, it notifies the CPU of data transmission through an interrupt signal. After the CPU responds to the interrupt, it executes the interrupt service routine and completes the data exchange with the device.

3) DMA method: Direct Memory Access (DMA) is a computer data transmission method that allows peripherals to exchange data directly with main memory without the need for the CPU to participate in each data transfer cycle.

The above explanation provides the basic concepts of related professional terms and helps to understand timed-polling I/O, interrupt I/O, and DMA in computer systems.
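A worked example, with assumed numbers, of the polling-overhead half of this calculation; the interrupt and DMA cases follow the patterns shown under items 15 and 16 above.

```c
#include <stdio.h>

/* Worked example (assumed numbers): CPU overhead of timed-polling I/O.
   The CPU must poll at least as fast as data arrives or data is lost. */
int main(void)
{
    double cpu_hz          = 50e6; /* 50 MHz clock (assumed)      */
    double polls_per_s     = 20e3; /* device needs 20K polls/s    */
    double cycles_per_poll = 100;  /* read status + test + branch */

    double busy = polls_per_s * cycles_per_poll / cpu_hz;
    printf("polling overhead: %.1f%% of CPU time\n", busy * 100.0); /* 4.0% */
    return 0;
}
```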

27. Structure and performance parameters of disk drives

  • Knowledge points: The composition of disk drives and calculation of performance parameters (worked examples follow each part below)

1) Address field:

  • Cylinder number (track number): identifies a cylinder on the disk, i.e., the set of tracks with the same track number across all recording surfaces.
  • Head number (disk surface number): selects one recording surface of the disk; there is one read/write head per surface.
  • Sector number: identifies a sector within a track, i.e., one of the equal-length data blocks into which the track is divided.

The number of bits in each field is determined by the disk's capacity and structure: the cylinder number needs at least ⌈log2(tracks per surface)⌉ bits, the head number at least ⌈log2(number of recording surfaces)⌉ bits, and the sector number at least ⌈log2(sectors per track)⌉ bits.
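A small sketch, with an assumed disk geometry, of the ⌈log2⌉ bit-width calculation for each address field (compile with -lm).

```c
#include <stdio.h>
#include <math.h>

/* ceil(log2(n)) bits are needed to number n items. Geometry is assumed. */
static int bits_for(int n) { return (int)ceil(log2((double)n)); }

int main(void)
{
    int cylinders         = 1000; /* tracks per surface (assumed)        */
    int surfaces          = 16;   /* 8 platters x 2 sides (assumed)      */
    int sectors_per_track = 64;   /* assumed                             */

    printf("cylinder field: %d bits\n", bits_for(cylinders));         /* 10 */
    printf("head field:     %d bits\n", bits_for(surfaces));          /*  4 */
    printf("sector field:   %d bits\n", bits_for(sectors_per_track)); /*  6 */
    return 0;
}
```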

2) Average access time:

  • Seek time: The time it takes to move the head to the target track.
  • Rotational latency (delay time): the time spent waiting for the target sector to rotate under the head; half a revolution on average.
  • Transfer time: the time to actually read or write the data once the head is over the target sector.

Average access time is the sum of average seek time, average rotational latency, and transfer time, and is determined by the disk's performance and mechanical structure.
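A worked example, with assumed parameters, of the average access time as the sum of the three components above.

```c
#include <stdio.h>

/* Worked example (assumed numbers): average access time =
   average seek + average rotational latency (half a revolution) + transfer. */
int main(void)
{
    double seek_ms  = 8.0;               /* average seek time (assumed)   */
    double rpm      = 7200;              /* rotation speed (assumed)      */
    double rev_ms   = 60e3 / rpm;        /* one revolution: ~8.33 ms      */
    double half_rev = 0.5 * rev_ms;      /* average rotational latency    */
    double sector_kb = 4, track_kb = 512;            /* assumed sizes     */
    double xfer_ms  = rev_ms * sector_kb / track_kb; /* fraction of a turn */

    printf("avg access = %.2f + %.2f + %.3f = %.2f ms\n",
           seek_ms, half_rev, xfer_ms,
           seek_ms + half_rev + xfer_ms); /* 8.00 + 4.17 + 0.065 = 12.23 */
    return 0;
}
```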

3) DMA controller:

  • Bus request: The DMA controller notifies the CPU through a bus request, requesting the use of the system bus.
  • Cycle-stealing DMA method: the DMA controller transfers data by stealing main-memory cycles from the CPU.

In transferring data between the disk and the host, the number of bus requests the DMA controller sends to the CPU depends on the size of the data buffer: one request is issued each time the buffer fills. In cycle-stealing DMA mode, the DMA controller is given priority in obtaining the bus so that data is transferred in time, since the disk keeps rotating and its data cannot wait.
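A quick worked example, with assumed sizes, of counting the DMA controller's bus requests: one request per full buffer.

```c
#include <stdio.h>

/* Worked example (assumed numbers): bus requests needed to transfer a
   file when the DMA controller issues one request per full buffer. */
int main(void)
{
    long file_bytes   = 512L * 1024; /* 512 KB file (assumed)            */
    long buffer_bytes = 32;          /* interface buffer: 32 B (assumed) */

    long requests = (file_bytes + buffer_bytes - 1) / buffer_bytes;
    printf("bus requests: %ld\n", requests); /* 16384 */
    return 0;
}
```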

The above explanation helps to understand the related concepts involving disk structure, access time and DMA controller.
