Computer Organization and Design - Chapter 5 Memory Hierarchy (1)

5.1 Introduction

Memory access is the most common operation in a computer. Ideally, memory would be both unlimited in capacity and fast to access, but these two goals conflict. In general, the smaller a memory is and the closer it sits to the CPU, the faster it can be accessed; the larger it is and the farther it sits from the CPU, the slower the access.

Memory accesses exhibit two kinds of locality (the principle of locality). The definitions below are quoted directly from the book:

Temporal locality (locality in time): if an item is referenced, it will tend to be referenced again soon.

Spatial locality (locality in space): if an item is referenced, items whose addresses are close by will tend to be referenced soon.

Temporal locality shows up most often in loops, where the same instructions and data are accessed repeatedly.

Spatial locality shows up in sequentially executed instructions and in contiguous data structures such as arrays.

A cache exploits both kinds of locality. Spatial locality: each cache line holds a block of data at adjacent addresses, so a miss brings in the neighbors of the requested item as well. Temporal locality: data already in the cache is served directly from the cache the next time it is accessed.
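The effect of both kinds of locality on a cache can be sketched with a toy model. This is a minimal illustration, not a description of any real cache: the direct-mapped organization, the function name `simulate`, and the sizes (8 lines of 4 words each) are all assumptions chosen to keep the numbers small.

```python
def simulate(addresses, num_lines=8, words_per_line=4):
    """Count (hits, misses) for a toy direct-mapped cache (hypothetical model)."""
    lines = [None] * num_lines          # block number currently held by each line
    hits = misses = 0
    for addr in addresses:
        block = addr // words_per_line  # adjacent addresses share one block
        index = block % num_lines       # the single line this block can occupy
        if lines[index] == block:
            hits += 1
        else:
            misses += 1
            lines[index] = block        # a miss fetches the whole block
    return hits, misses


sequential = list(range(32))

# Spatial locality: one miss per block, then 3 hits on its neighbors.
print(simulate(sequential))                 # (24, 8)

# Temporal locality: the second pass finds everything already cached.
print(simulate(sequential * 2))             # (56, 8)

# No locality: a stride of one block per access misses every time.
print(simulate(list(range(0, 128, 4))))     # (0, 32)
```

In this sketch the whole 32-word working set fits in the cache, which is why the second pass hits on every access; a larger working set would evict blocks and lower the hit count.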

The memory hierarchy is built on the principle of locality, as shown in the figure below.

Top-level memory: small capacity; fast; high cost per bit; low storage density

Bottom-level memory: large capacity; slow; low cost per bit; high storage density

Any data held in an upper level is also present in the level below it. Data is usually transferred only between two adjacent levels.

Hit rate: the fraction of memory accesses found in a level of the memory hierarchy.

Miss rate: the fraction of memory accesses not found in a level of the memory hierarchy.

Hit time: the time required to access a level of the memory hierarchy, including the time needed to determine whether the access is a hit or a miss.

Miss penalty: the time required to fetch a block into a level of the memory hierarchy from the lower level, including the time to access the block, transmit it from one level to the other, insert it in the level that experienced the miss, and then pass the block to the requestor.
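These four quantities combine into the standard average memory access time formula, hit time + miss rate × miss penalty. The formula is not stated in the notes above, and the function names and numbers below (1-cycle hit time, 95% hit rate, 100-cycle miss penalty) are hypothetical values picked only for illustration.

```python
def miss_rate(hit_rate):
    """Every access is either a hit or a miss, so the two rates sum to 1."""
    return 1.0 - hit_rate


def average_access_time(hit_time, hit_rate, miss_penalty):
    """Every access pays the hit time; misses additionally pay the penalty."""
    return hit_time + miss_rate(hit_rate) * miss_penalty


# Hypothetical numbers, in clock cycles.
amat = average_access_time(hit_time=1, hit_rate=0.95, miss_penalty=100)
print(f"average access time: {amat:.2f} cycles")
```

Even with a 95% hit rate, the rare misses dominate the average in this example, which is why lowering the miss rate or the miss penalty matters so much.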

 

5.2 Memory Technology

SRAM (static random access memory)

Each bit is stored using 6 to 8 transistors. Reading does not disturb the stored value, so no refresh is needed; access is fast; and the data is retained as long as power is applied.

DRAM (dynamic random access memory)

- The value kept in a cell is stored as a charge in a capacitor.

- A single transistor is then used to access this stored charge, so there is one transistor per bit of storage.

- The charge leaks away over time, so each cell must be periodically refreshed.

Refresh method: read the contents out, then write them back.

The data is retained for only a few milliseconds, so a refresh is needed every few milliseconds. A DDR chip can refresh one bank at a time or all banks together.
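The read-then-write-back refresh described above can be sketched with a toy model of one DRAM row. Everything here is an assumption made for illustration: the class `DramRow`, the leak factor, and the read threshold are hypothetical and do not model a real device.

```python
LEAK_PER_MS = 0.8   # hypothetical: fraction of charge remaining after 1 ms
THRESHOLD = 0.5     # hypothetical: charge above this level reads as 1


class DramRow:
    """Toy row of capacitor cells whose charge decays over time."""

    def __init__(self, bits):
        self.write(bits)

    def write(self, bits):
        self.charge = [float(b) for b in bits]   # writing restores full charge

    def read(self):
        return [1 if c > THRESHOLD else 0 for c in self.charge]

    def leak(self, ms):
        self.charge = [c * LEAK_PER_MS ** ms for c in self.charge]

    def refresh(self):
        self.write(self.read())                  # read out, then write back


data = [1, 0, 1, 1]

unrefreshed = DramRow(data)
unrefreshed.leak(4)            # 4 ms with no refresh
print(unrefreshed.read())      # [0, 0, 0, 0]: the stored 1s have decayed

refreshed = DramRow(data)
for _ in range(2):             # same 4 ms, but refreshed every 2 ms
    refreshed.leak(2)
    refreshed.refresh()
print(refreshed.read())        # [1, 0, 1, 1]
```

The refresh works only because it happens before the charge drops below the read threshold, which is why the refresh interval must be shorter than the retention time.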

 

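DDR SDRAM (double data rate synchronous DRAM)

Each bank has its own row buffer.

Address interleaving: consecutive addresses are spread across multiple banks, so several banks can be accessed at the same time. The bandwidth scales with the number of banks; for example, four banks can supply up to four times the bandwidth of one.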
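The bandwidth gain from address interleaving can be sketched with a small timing model. This is a simplified illustration under stated assumptions: low-order interleaving (bank = address mod number of banks), one address issued per cycle, and a hypothetical 4-cycle bank latency. The function names are invented for this example.

```python
def bank_of(address, num_banks=4):
    """Low-order interleaving: consecutive addresses go to consecutive banks."""
    return address % num_banks


def cycles_to_read(num_words, num_banks, bank_latency=4):
    """Cycles to read num_words sequential words (hypothetical timing model)."""
    free_at = [0] * num_banks       # cycle at which each bank becomes idle
    now = 0                         # earliest cycle the next address can issue
    finish = 0
    for addr in range(num_words):
        bank = bank_of(addr, num_banks)
        start = max(now, free_at[bank])       # wait if the bank is still busy
        free_at[bank] = start + bank_latency
        finish = free_at[bank]
        now = start + 1                       # one address issued per cycle
    return finish


print([bank_of(a) for a in range(8)])   # [0, 1, 2, 3, 0, 1, 2, 3]
print(cycles_to_read(16, num_banks=1))  # 64: every access waits for the same bank
print(cycles_to_read(16, num_banks=4))  # 19: four accesses overlap at a time
```

With four banks and a 4-cycle latency, each bank is ready again by the time the access pattern returns to it, so the model sustains about one word per cycle after the initial latency.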

Reference:

1. Computer Organization and Design, Fifth Edition, ARM Edition

Origin blog.csdn.net/m0_38037810/article/details/126638169