5.1 Introduction
Memory access is the most common operation in a computer. Ideally we would have unlimited storage capacity with fast access, but the two goals conflict. In general, the smaller a memory is and the closer it sits to the CPU, the faster it can be accessed; the larger it is and the farther it sits from the CPU, the slower the access.
Memory accesses exhibit two kinds of locality (the principle of locality). The definitions below are quoted directly from the textbook:
Temporal locality (locality in time): if an item is referenced, it will tend to be referenced again soon.
Spatial locality (locality in space): if an item is referenced, items whose addresses are close by will tend to be referenced soon.
Temporal locality shows up most often in loops, where the same instructions and variables are accessed repeatedly.
Spatial locality shows up in sequentially executed instructions and in contiguous data structures such as arrays.
A cache exploits both kinds of locality. Spatial locality: each cache line holds a block of data from adjacent addresses, so fetching one item also brings in its neighbors. Temporal locality: once data is in the cache, the next access to it is served directly from the cache.
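The way a cache benefits from both kinds of locality can be sketched with a tiny direct-mapped cache model. This is only an illustration: the function name simulate, the 4-line cache, and the 16-byte line size are assumptions chosen for the example, not values from the book.

```python
def simulate(addresses, num_lines=4, line_size=16):
    """Count (hits, misses) for a direct-mapped cache (illustrative model)."""
    tags = [None] * num_lines  # tag held by each cache line; None means empty
    hits = misses = 0
    for addr in addresses:
        block = addr // line_size   # memory block that contains this byte
        index = block % num_lines   # cache line the block maps to
        tag = block // num_lines    # distinguishes blocks sharing a line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag       # a miss brings in the whole line
    return hits, misses


# Spatial locality: 16 consecutive 4-byte words starting at address 0.
sequential = [4 * i for i in range(16)]
seq_result = simulate(sequential)

# Temporal locality: the same words are touched again, as in a loop.
rep_result = simulate(sequential + sequential)

print(seq_result)  # (12, 4): one miss per line, then 3 hits on its neighbors
print(rep_result)  # (28, 4): the second pass hits on every access
```

In the first trace only the first word of each 16-byte line misses, which is the spatial-locality effect. In the second trace the repeated pass never misses, which is the temporal-locality effect.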
The principle of locality motivates the memory hierarchy, as shown in the figure below.
Top-level memory: small capacity; fast; expensive; low storage density
Bottom-level memory: large capacity; slow; cheap; high storage density
The contents of an upper level are a subset of the contents of the levels below it. Data is normally transferred only between two adjacent levels.
Hit rate: The fraction of memory accesses found in a level of the memory hierarchy.
Miss rate: The fraction of memory accesses not found in a level of the memory hierarchy.
Hit time: The time required to access a level of the memory hierarchy, including the time needed to determine whether the access is a hit or a miss.
Miss penalty: The time required to fetch a block into a level of the memory hierarchy from the lower level, including the time to access the block, transmit it from one level to the other, insert it in the level that experienced the miss, and then pass the block to the requestor.
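These four terms are commonly combined into the average memory access time, AMAT = hit time + miss rate × miss penalty. The notes above do not state this formula, so the sketch below is a supplementary illustration. The function name amat and the sample numbers are assumptions.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in the same unit as hit_time."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be a fraction between 0 and 1")
    # Every access pays the hit time; only misses also pay the penalty.
    return hit_time + miss_rate * miss_penalty


# Hypothetical numbers: 1-cycle hit, 5% miss rate, 100-cycle miss penalty.
example = amat(hit_time=1, miss_rate=0.05, miss_penalty=100)
print(example)
```

With these assumed numbers the average access costs about 6 cycles, even though 95% of accesses hit in 1 cycle. This is why a small change in miss rate can matter a great deal.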
5.2 Memory Technology
SRAM (static random access memory)
Each bit is stored using six to eight transistors. Reads do not disturb the stored data, and no refresh is needed. Access is fast, and data is retained as long as power is applied.
DRAM (dynamic random access memory)
- The value kept in a cell is stored as a charge in a capacitor.
- A single transistor is then used to access this stored charge, so there is one transistor per bit of storage.
- The charge leaks away over time, so cells must be refreshed periodically.
Refresh method: read the row out, then write it back.
Data is retained for only a matter of milliseconds (a refresh window of tens of milliseconds, such as 64 ms, is typical), so every row must be refreshed within that interval. DDR chips can refresh one bank at a time or all banks together.
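To get a rough feel for what periodic refresh costs, the fraction of time a bank spends refreshing can be estimated as (rows × time per row refresh) / refresh window. The sketch below uses this simplified model. The row count, the 50 ns per-row time, and the 64 ms window are hypothetical values for illustration, not figures from the book or from any datasheet.

```python
def refresh_overhead(rows, row_refresh_ns, window_ms):
    """Fraction of time a bank is busy refreshing (simplified model)."""
    busy_ns = rows * row_refresh_ns      # total time to refresh every row once
    window_ns = window_ms * 1_000_000    # refresh window converted to ns
    return busy_ns / window_ns


# Hypothetical device: 8192 rows, 50 ns per row refresh, 64 ms window.
overhead = refresh_overhead(rows=8192, row_refresh_ns=50, window_ms=64)
print(f"{overhead:.2%}")  # 0.64%
```

Under these assumptions refresh takes well under 1% of the bank's time. Real devices differ, but the estimate shows why refresh is usually a small, though not negligible, cost.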
DDR SDRAM
Each bank has its own row buffer.
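Address interleaving: the address is sent to several banks so that they can all be accessed at the same time, and the accesses then rotate among the banks. Bandwidth scales with the number of banks; for example, four banks can supply up to four times the bandwidth of one.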
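A minimal sketch of address interleaving, assuming low-order interleaving where consecutive word addresses rotate across banks. The function names and the timing parameters (1 cycle to send the address, 15 cycles per bank access, 1 cycle per word transfer) are hypothetical values chosen for illustration.

```python
def bank_of(word_addr, num_banks=4):
    """Low-order interleaving: consecutive words land in consecutive banks."""
    return word_addr % num_banks


def fetch_cycles(words, banks, addr=1, access=15, transfer=1):
    """Cycles to fetch `words` consecutive words (simplified model).

    Banks overlap their access latency, so only one access time is paid
    per group of `banks` words; words still cross the bus one at a time.
    """
    rounds = -(-words // banks)  # ceiling division
    return addr + rounds * access + words * transfer


banks_hit = [bank_of(a) for a in range(8)]
one_bank = fetch_cycles(words=4, banks=1)
four_banks = fetch_cycles(words=4, banks=4)

print(banks_hit)   # [0, 1, 2, 3, 0, 1, 2, 3]
print(one_bank)    # 65
print(four_banks)  # 20
```

In this model a 4-word fetch drops from 65 cycles to 20 with four banks, because the four bank accesses overlap instead of running back to back.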
Reference:
1. Computer Organization and Design, Fifth Edition, ARM Edition