Chapter 3 Bus and Memory

Bus overview

Definition : Information transmission channel, a set of shared electrical wires that transmit information between various components in a computer system (even between systems).

The bus system generally consists of a transmission line + interface + bus controller .

Electrical characteristics : These specify the transmission direction and the valid level range of the signal on each transmission line. By direction, buses are divided into unidirectional (simplex) and bidirectional (full duplex, half duplex).

The most basic electrical characteristic : at any moment, information on a line can flow in only one direction; opposing drivers are not allowed. One sender with one receiver, or one sender with several receivers, is fine; several senders with one receiver, or several senders with several receivers, is not.

For the master devices connected to the bus, only one master device is allowed to send data to the bus at any time.

Three-state logic : high level, low level, high impedance state

  • High impedance is equivalent to removing the influence of the output from the subsequent circuit.
  • There are often multiple bus masters connected to the system bus. Only one bus master can occupy the bus at a time, and the address, data, and control signals output by the other bus masters must be in the high-impedance state.
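The three-state rule above can be sketched in a few lines of Python. This is only an illustrative model, not a circuit description: the name `resolve_bus` and the use of `None` to stand for the high-impedance state are assumptions made for the example.

```python
from typing import Optional, Sequence

HIGH_Z = None  # assumption: None models a driver in the high-impedance state


def resolve_bus(outputs: Sequence[Optional[int]]) -> Optional[int]:
    """Return the level seen on one shared line, given each driver's output.

    Each entry is 0, 1, or HIGH_Z. At most one driver may be active at a time.
    """
    active = [level for level in outputs if level is not HIGH_Z]
    if len(active) > 1:
        raise ValueError("bus contention: more than one driver is active")
    return active[0] if active else HIGH_Z


print(resolve_bus([HIGH_Z, 1, HIGH_Z]))  # 1: only the second master drives the line
```

A driver in the high-impedance state simply drops out of the list of active drivers, which is the sense in which it "removes its influence" from the line.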

Bus arbitration : When multiple candidate master components on the bus apply for using the bus at the same time, there must be a bus arbitration mechanism to arbitrate the application according to the policy.

Centralized bus arbitration:

  • Serial/chained query (daisy chain)

    • Advantages: fixed priority and simple design
    • Disadvantages: Sensitive to hardware circuit failures, and the priority cannot be changed. When high-priority components use the bus frequently, low-priority components may be unable to get the bus for a long time.
  • Counter timing query

    • Advantages: priorities can be changed, and a circuit failure does not propagate down a chain

      If counting starts from device 0 each time, the priority order is the same as in the chained query method and is fixed; if counting starts from the device after the previous stop point each time, all devices have equal priority.

    • Disadvantages: more control lines are needed; with $N$ devices, the number of control lines required is $\log_2 N + 2$

  • Parallel/independent request

    • Advantages:

      • Fast response: the bus grant signal BG is sent directly from the controller to the requesting device, without being passed along or polled between devices
      • Very flexible control over priorities
    • Disadvantages:

      • Many control lines: with $n$ devices, $2n+1$ control lines are needed. The +1 is the BS line, which a device uses to tell the bus controller that the bus is in use.
      • The bus control logic is more complex
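The counter query behaviour and the control-line counts quoted above can be checked with a small sketch. The function names are hypothetical, and rounding $\log_2 N$ up for device counts that are not powers of two is an assumption of this example rather than something the notes state.

```python
from typing import Optional, Sequence


def counter_query_grant(requests: Sequence[bool], start: int) -> Optional[int]:
    """Count through the devices from `start`, wrapping around, and grant the first requester."""
    n = len(requests)
    for offset in range(n):
        device = (start + offset) % n
        if requests[device]:
            return device
    return None  # nobody is requesting the bus


def control_lines(n_devices: int) -> dict:
    """Control-line counts for N devices, using the formulas quoted in the text."""
    log2_n = (n_devices - 1).bit_length()  # ceil(log2 N); the rounding is an assumption
    return {
        "counter_query": log2_n + 2,
        "independent_request": 2 * n_devices + 1,
    }


requests = [True, False, True, True]

# Fixed priority: always start counting from device 0.
fixed = [counter_query_grant(requests, 0) for _ in range(3)]

# Equal priority: start from the device after the previous stop point.
rotating, start = [], 0
for _ in range(3):
    granted = counter_query_grant(requests, start)
    rotating.append(granted)
    start = (granted + 1) % len(requests)

print(fixed)             # [0, 0, 0]
print(rotating)          # [0, 2, 3]
print(control_lines(8))  # {'counter_query': 5, 'independent_request': 17}
```

With a fixed starting point, device 0 wins every time it requests, which is the starvation risk noted for the chained method; moving the starting point lets devices 2 and 3 get their turn.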


Bus communication :

In bus communication, the component that sends data is called the source component, and the component that receives the data is called the destination component.

Bus communication control in which neither side needs to be aware of the other is called synchronous communication.

Bus communication control in which each side needs to be aware of the other is called asynchronous communication.

Synchronous communication : It means that the information transmission between the two components participating in the communication is controlled by a fixed-width and fixed-distance time scale.

At every tick, the source component places one data item on the bus without confirming that the destination component received it. The destination component samples the bus at every tick to obtain data, without confirming that the data has actually been sent and without reporting back to the source component that the data was received correctly.

Asynchronous communication : The two components participating in the communication need to "perceive" each other's operation. This "perception" is realized through handshake signals, usually a pair of request and response lines.

It is suitable for communication between components with different working speeds and occasions where communication lines are disturbed (long distance).

The "handshake" protocol for asynchronous communication is divided into:

  • Unilateral control means that the communication process is controlled by either the source component or the destination component alone

    Source-controlled: the sending end puts the data on the bus and, after a delay, signals that the data is ready. The receiving end is assumed to see the ready signal and take the data.

    Destination-controlled: the receiving end sends out a data request and takes the data after a fixed delay.

  • Bilateral control means that the communication process is jointly controlled by the source component and the destination component

    No interlock : There is only one dependency: the response signal is sent after the request signal from the other party is received. The requesting end does not wait for the response signal; it withdraws the request signal automatically after a delay.

    In essence, it is a kind of unilateral control.

    Semi-interlock : There are two dependencies: the response signal is sent after the request signal is received, and the request signal is withdrawn after the response signal is received.

    Reliability is better.

    Full interlock : There are three dependencies: the response signal is sent after the request signal is received, the request signal is withdrawn after the response signal is received, and the response signal is withdrawn after the request signal is seen to be withdrawn.

    Reliability is the highest, but the control is the most complex; it is often used in network communication.
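The full-interlock sequence can be walked through step by step. The sketch below is a sequential simulation only, not a timing-accurate model; the signal names `req` and `ack` and the function name are assumptions chosen for the example.

```python
def full_interlock_transfer(items):
    """Move `items` from source to destination with a fully interlocked handshake."""
    req = ack = False
    bus = None
    received, trace = [], []

    for item in items:
        # 1. The source places data on the bus and raises its request.
        bus, req = item, True
        trace.append("req=1")
        # 2. The destination sees the request, takes the data, raises its response.
        if req:
            received.append(bus)
            ack = True
            trace.append("ack=1")
        # 3. The source sees the response and withdraws the request.
        if ack:
            req = False
            trace.append("req=0")
        # 4. The destination sees the request withdrawn and withdraws the response.
        if not req:
            ack = False
            trace.append("ack=0")

    return received, trace


received, trace = full_interlock_transfer([0x41, 0x42])
print(received)   # [65, 66]
print(trace[:4])  # ['req=1', 'ack=1', 'req=0', 'ack=0']
```

Steps 2 to 4 are the three dependencies of the full interlock. Dropping the check in step 4 would give the semi-interlock, and dropping the checks in steps 3 and 4 would give the no-interlock case.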


Bus performance indicators

  • Bus width: the number of data bus bits, that is, the number of data lines.

  • Bus clock frequency: the number of clock cycles per second, that is, the reciprocal of the clock period T.

  • Bus bandwidth (standard transmission rate): the number of data bits that can be transmitted on the bus per unit time.
    $$\begin{aligned}\text{Bus bandwidth} &= \text{bus operating frequency} \times \text{bus width} \quad (\text{bit/s}) \\ &= \text{bus operating frequency} \times \frac{\text{bus width}}{8} \quad (\text{B/s}) \\ &= \frac{\text{bus width}}{8 \times \text{bus cycle}} \quad (\text{B/s})\end{aligned}$$

  • Number of bus cycles per data transfer

    • Normal mode: for each transfer, send the address first and then the data (that is, each data item takes two bus cycles)

    • Burst mode: only the first data item is transferred the normal way (two bus cycles); each following item needs only one bus cycle. Accesses usually satisfy the principle of locality, so the address does not need to be sent again; it is automatically incremented by 1.

  • RS232-C: low-speed, single-ended, asynchronous serial communication

    It uses unbalanced transmission: the data signals at both the sending and receiving ends are referenced to signal ground, so resistance to common-mode interference is poor. It is suited to short distances and supports only point-to-point communication. Short-distance links often use a three-wire connection (RXD, TXD, GND), which gives two-way, full-duplex communication. Since RS232-C uses negative logic, a level conversion chip is usually needed when connecting to a microcontroller.

    In RS232-C transmission, the line stays at logic 1 when idle, and data bits are sent LSB first, so in a waveform the leftmost data bit is the LSB and the rightmost is the MSB.
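Returning to the bandwidth formula and the normal/burst transfer modes in the list above, the arithmetic can be checked with a short sketch. The 100 MHz, 32-bit bus is a made-up example, and the function names are hypothetical.

```python
def bus_bandwidth_bytes_per_s(frequency_hz: float, width_bits: int) -> float:
    """Bus bandwidth in B/s = bus operating frequency x (bus width / 8)."""
    return frequency_hz * width_bits / 8


def transfer_cycles(n_items: int, burst: bool) -> int:
    """Bus cycles needed to move n_items data items.

    Normal mode: address then data for every item (2 cycles each).
    Burst mode: 2 cycles for the first item, 1 cycle for each following item.
    """
    if n_items <= 0:
        return 0
    return n_items + 1 if burst else 2 * n_items


# Hypothetical bus: 100 MHz operating frequency, 32 data lines.
print(bus_bandwidth_bytes_per_s(100e6, 32))  # 400000000.0 (400 MB/s)
print(transfer_cycles(4, burst=False))       # 8
print(transfer_cycles(4, burst=True))        # 5
```

For long bursts the cost approaches one cycle per item, which is why burst mode nearly doubles the effective transfer rate on the same bus.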

Memory

Memory: It is a collection of many storage units used to store currently running programs and data.

According to the principle of von Neumann architecture, all programs must be loaded into memory before execution. Therefore, when the CPU executes the program, it fetches instructions and data from the memory.

Basic memory operations:

  • Read - take out data from memory without destroying the contents of the original storage unit
  • Write - store information in a storage unit, overwriting its previous contents

Storage classification:


Main memory: used to store instructions and data, which can be directly addressed by the CPU.

Auxiliary memory: unable to directly interact with the CPU, it is characterized by a large capacity and low cost

Cache: Located between the main memory and the CPU, it mainly stores the program segments being executed and related data for high-speed use by the CPU; the speed of Cache is the highest among all memories, with small capacity and high cost.

Storage capacity = number of stored words $\times$ word length. Each memory unit stores 8 binary bits, called 1 byte.

  • Hierarchy between main memory and auxiliary memory: solves the problem of insufficient main memory capacity
  • Hierarchy between Cache and main memory: solves the problem of main memory being too slow for the CPU

Data longer than one byte is stored in either little-endian or big-endian mode.

Little-endian mode: the low-order byte goes at the low address and the high-order byte at the high address. Big-endian mode is the reverse.
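As a quick illustration of the two byte orders, the sketch below lays out an arbitrary 32-bit example value with Python's built-in `int.to_bytes`. Index 0 of each byte string corresponds to the lowest address.

```python
value = 0x12345678  # arbitrary example value

little = value.to_bytes(4, byteorder="little")  # low-order byte at the lowest address
big = value.to_bytes(4, byteorder="big")        # high-order byte at the lowest address

print([f"{b:02x}" for b in little])  # ['78', '56', '34', '12']
print([f"{b:02x}" for b in big])     # ['12', '34', '56', '78']
```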

Each memory unit stores 8 bits (binary digits), which form one byte.

Each storage bit is implemented by a circuit with a memory function, similar to a flip-flop.

RAM (Random Access Memory) : any storage unit in it can be read or written at random, and it is the main storage component of a computer system. It is what we usually call memory.

  • The time required to read and write the content of any cell is independent of the location of the cell
  • The contents of the storage unit are all lost after power failure and cannot be recovered
  • According to the working principle of the storage circuit, RAM can be divided into static RAM and dynamic RAM

**SRAM (Static RAM)** is based on a bistable flip-flop (complex circuit, high cost). As long as power is not lost, the information is not lost, and no refresh circuit is needed.

  • The stored information is stable: as long as power stays on and no write occurs, the saved information does not change
  • Fast read and write operations, close to CPU speed
  • Large power consumption, low integration, expensive, generally used as Cache

**DRAM (Dynamic RAM)** relies on capacitors to store information. The circuit is simple and highly integrated, but the capacitor leaks and the information will be lost. Therefore, a dedicated circuit is required to refresh regularly.

  • It has large capacity, low power consumption, and slow speed, and is widely used as memory.
  • Refresh: The process of reading out and rewriting every bit of information stored in DRAM.
  • Features: The stored information is unstable and needs to be refreshed regularly; integration is high and the price is low. It is generally used as the system's main memory.

ROM (Read Only Memory) : in normal operation the content can only be read, not written. Its biggest advantage is that the stored information is kept for a long time: when the power is turned off, the information in the ROM does not disappear. It is mainly used to store fixed programs and data, typically the boot loader.

The 2764 has a storage capacity of 8K$\times$8 bits, which means the chip has 13 address lines ($2^{13}$ = 8K) and 8 data lines.

  • Types of memory

    ROM stores system programs, standard subroutines and various constants

    RAM is provided for user programs

Memory expansion :

  • Bit expansion: increase the storage word length

    To build a 1024$\times$8-bit memory, two 1024$\times$4-bit chips can be used. The address lines of the two chips are connected in parallel, and each chip supplies 4 of the 8 data lines.

  • Word expansion: increase the number of memory words

    To build a 64K$\times$8-bit memory, four 16K$\times$8-bit chips can be used. The original fourteen address lines are connected to all chips in parallel, and the two extra address lines drive a 2-4 decoder that generates the chip select signals.

  • Judging odd and even addresses: to tell whether a word/byte address is odd or even, look only at bit 0: 1 is odd, 0 is even.
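The two expansion examples above follow the same arithmetic, which can be sketched as below. The function name and the returned field names are hypothetical, and the sketch assumes that all sizes are powers of two and that the target is an exact multiple of the chip size.

```python
def expansion_plan(target_words: int, target_bits: int,
                   chip_words: int, chip_bits: int) -> dict:
    """Plan a memory built from identical chips (sizes assumed to be powers of two)."""
    chips_per_group = target_bits // chip_bits  # bit expansion: chips side by side
    groups = target_words // chip_words         # word expansion: groups picked by chip select
    return {
        "chips_per_group": chips_per_group,
        "groups": groups,
        "total_chips": chips_per_group * groups,
        "chip_address_lines": (chip_words - 1).bit_length(),
        "decoder_input_lines": (groups - 1).bit_length(),
    }


K = 1024
print(expansion_plan(1 * K, 8, 1 * K, 4))
# {'chips_per_group': 2, 'groups': 1, 'total_chips': 2, 'chip_address_lines': 10, 'decoder_input_lines': 0}
print(expansion_plan(64 * K, 8, 16 * K, 8))
# {'chips_per_group': 1, 'groups': 4, 'total_chips': 4, 'chip_address_lines': 14, 'decoder_input_lines': 2}

# Odd/even address check: only bit 0 matters.
print(0x1235 & 1)  # 1, so the address is odd
```

The second result matches the text: fourteen address lines go to every chip, and the two remaining lines feed the 2-4 decoder.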

Memory and CPU interface

x86 accesses peripherals through a separate I/O address space; ARM does not.

The I/O interface is the bridge between the CPU and peripherals, and the memory is the warehouse of data and programs.

80x86 access I/O mode: independent I/O address, isolated from access memory

MIPS/PowerPC/ARM access I/O mode: memory-mapped I/O address (unified addressing)

Memory and I/O chip addressing issues:

[Figure: address decoding circuit with a 138 decoder]

From the truth table of the 138 decoder, with $A_{15}=1$ and $A_{14}=0$, output $\overline{Y_4}=0$ requires $A_{13}A_{12}A_{11}=100$, so the memory range is $1010\,0000\,0000\,0000 \sim 1010\,0111\,1111\,1111$, that is, A000H $\sim$ A7FFH.
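The same range can be derived mechanically: fix the decoded high address bits and let the remaining low bits run from all zeros to all ones. A minimal sketch follows, where the function name is hypothetical and a 16-bit address bus is assumed from the $A_{15}$ numbering.

```python
def decoded_range(fixed_high_bits: str, address_width: int = 16):
    """Address range selected when the high address bits equal `fixed_high_bits`."""
    free_bits = address_width - len(fixed_high_bits)
    base = int(fixed_high_bits, 2) << free_bits
    return base, base | ((1 << free_bits) - 1)


# A15..A11 = 1 0 1 0 0, with A10..A0 free.
low, high = decoded_range("10100")
print(f"{low:04X}H ~ {high:04X}H")  # A000H ~ A7FFH
```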

Cache memory

The large-capacity memory that makes up the system's main memory is DRAM, which is much slower than the CPU, while SRAM that matches the CPU's speed is very expensive. The most frequent operation of a computer is fetching instructions and data from memory, so slow memory severely limits CPU performance. A small amount of static memory is therefore added between the CPU and the main memory; this static memory is the high-speed cache.

  • Temporal locality: If an information item is being accessed, it is likely to be accessed again in the near future. Such as program loop, stack, etc.
  • Spatial locality: The information that will be used in the near future is likely to be close to the information being used in the space address, such as sequence codes, elements of arrays.

Cache and main memory are like a desk and a bookcase.

Cache and main memory are divided into pages (blocks) of the same size, so converting between a main memory address and a Cache address comes down to designing how main memory pages are mapped to Cache pages.

The mapping schemes, as stated in the course slides (PPT), are:

  • Direct mapped: one-to-one mapping according to the block number
  • Fully associative: any page in the main memory can be mapped to any page in the cache
  • Set associative: the cache is divided into several sets, with direct mapping to select the set and fully associative mapping within the set

Exercises:

  • Bus bandwidth = bus operating frequency * bus width
  • In the address mapping of the Cache, if any block in the main memory can be mapped to any block in the Cache, this method is called - fully associative mapping
  • In the centralized bus arbitration, the independent request mode has the fastest response time
  • The bus cycle refers to the time required for the BIU to complete an access memory or I/O port operation
  • Taking the same number of words as the comparison condition, the highest read data transfer rate is - SRAM
  • Use 2K*4-bit SRAM chips to form a 16K- byte memory, and a total of 16 SRAM chips are required
  • In a microcomputer system, the lower 10 address lines A0~A9 of the CPU are used as the address lines of the I/O ports. An interface chip in the system has 16 port addresses inside it, and its chip select signal is generated by an address decoder. The input address lines of the address decoder are generally A4~A9. (The 16 internal ports take the 4 low lines A0~A3; the remaining high lines are decoded to produce the chip select.)
  • A DRAM chip with a capacity of 1M*1bit has, besides OE, WE, RAS, CAS, power supply and ground, 10 address pins and 1 data pin (the 20-bit address is sent as a row half and a column half over the same 10 pins)
  • When reading and writing a word (32 bits) on a 32-bit data bus, the efficiency is the highest when the lowest two bits of the word start address are 00
  • Generally, the RAM that constitutes the memory of a microcomputer system mainly includes SRAM and DRAM. Compared with SRAM, the main technical advantage of DRAM is its high storage density.
  • After power failure, none of SRAM, DRAM and SDRAM can retain data
  • The advantage of dynamic memory over static memory is its low cost
  • In the case of address/data bus multiplexing, the memory address line and the CPU address line pin need to be connected through the address latch
  • Static memory has the advantage of being faster than dynamic memory
  • The technical differences between main memory and external storage are mainly reflected in how they connect to the CPU
  • In the memory hierarchy, the order of performance from high to low is Reg>SRAM>DRAM>HDD
  • The data bus transmits data and status information and is bi-directional and tri-state.
  • The address bus transmits address information and is unidirectional and tri-state.
  • The control bus transmits control signals and is unidirectional.
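A few of the numeric answers in the list above can be verified with a short sketch. All function names are hypothetical, and the DRAM helper assumes the row and column halves of the address share the same pins, as the RAS and CAS signals imply.

```python
def dram_address_pins(cells: int) -> int:
    """Address pins on a DRAM whose row and column addresses share the same pins."""
    address_bits = (cells - 1).bit_length()
    return (address_bits + 1) // 2  # the row half and the column half are sent in turn


def is_word_aligned(address: int, word_bytes: int = 4) -> bool:
    """True when the lowest address bits are zero (lowest two bits for a 32-bit word)."""
    return address % word_bytes == 0


def decoder_input_lines(total_lines: int, internal_ports: int) -> list:
    """Address lines left for the chip-select decoder after the chip's own port lines."""
    internal = (internal_ports - 1).bit_length()
    return [f"A{i}" for i in range(internal, total_lines)]


K = 1024
print((16 * K * 8) // (2 * K * 4))  # 16 chips of 2K*4 bits for a 16K-byte memory
print(dram_address_pins(K * K))     # 10
print(is_word_aligned(0x1000), is_word_aligned(0x1002))  # True False
print(decoder_input_lines(10, 16))  # ['A4', 'A5', 'A6', 'A7', 'A8', 'A9']
```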

Cache block

  • Direct mapping: one-to-one mapping according to the block number. First divide the main memory and the Cache into blocks (also called lines) of the same size, then divide the main memory into several areas, each the size of the entire Cache. Mapping rule: block Y of any area of main memory can only be loaded into block Y of the Cache.
  • Set associative: direct mapping to select the set, fully associative mapping within the set. The main memory and the Cache are divided into blocks of the same size, the Cache is divided into several sets (here, each set holds two blocks), and the main memory is partitioned into areas according to the number of Cache sets. Mapping rule: the nth block of any area of main memory can only be mapped to the nth set of the Cache (direct method), where it may occupy either of the two blocks in that set (fully associative method).
  • Fully associative: any block in the main memory can be mapped to any block in the Cache. First divide the main memory and the Cache into blocks (also called lines) of the same size. Mapping rule: a block in main memory can be loaded into any block in the Cache, so the correspondence has to be recorded. Address conversion table TAG: it has one entry per Cache block, and each entry holds the number of the main memory block currently held in that Cache block.
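The three mapping rules above differ only in how many Cache blocks a given main memory block may occupy. The sketch below treats them as one formula with a `ways` parameter. The function name, the 8-block Cache, and the layout of each set as consecutive Cache blocks are assumptions made for illustration.

```python
def place(main_block: int, cache_blocks: int, ways: int):
    """Return (set_index, tag, candidate Cache blocks) for a main memory block.

    ways == 1            -> direct mapping
    1 < ways < blocks    -> set associative
    ways == cache_blocks -> fully associative
    """
    sets = cache_blocks // ways
    set_index = main_block % sets  # direct mapping picks the set
    tag = main_block // sets       # recorded so the block can be identified later
    first = set_index * ways       # assumption: a set is `ways` consecutive blocks
    return set_index, tag, list(range(first, first + ways))


# Main memory block 13 and a Cache of 8 blocks (hypothetical sizes).
print(place(13, 8, ways=1))  # (5, 1, [5])
print(place(13, 8, ways=2))  # (1, 3, [2, 3])
print(place(13, 8, ways=8))  # (0, 13, [0, 1, 2, 3, 4, 5, 6, 7])
```

In the fully associative case the tag is the whole main memory block number, which matches the description of the TAG table above.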


Origin blog.csdn.net/wjrzm2001/article/details/125354481