Architecture 16: Memory hierarchy

From single-level memory to multi-level memory

There is a huge gap between processor performance and memory system performance.

1. Why?

   (1) The speed gap between main memory and the CPU keeps widening, and this "memory wall" seriously limits improvements in computer performance.

   (2) Systems and applications keep growing in scale, so programs need ever larger memory to run.

   (3) No single memory technology (SRAM, DRAM, magnetic disk, etc.) offers large capacity, high speed, and low price at the same time, so a viable storage system cannot be built from one kind of existing device. SRAM, for example, is roughly as fast as the CPU, but it takes up more area per bit and is more expensive.

2. What to do?

   Use several kinds of storage devices that complement one another to build a hierarchical storage system:

       Fast but expensive memory: small capacity; serve as many CPU accesses as possible from it

       Slow but large-capacity memory: large capacity; have the CPU access it as rarely as possible

3. Can the effect be achieved?

   In terms of access speed: use fast memory, and let the CPU find what it needs in that fast memory as often as possible (add a Cache level)

        Principle of program locality:

            Temporal locality: data accessed now is likely to be accessed again soon, so it is kept in the Cache

            Spatial locality: data near the current access address is likely to be accessed soon, so it is brought into the Cache as well (data is loaded from memory in block units)

   In terms of capacity: use slow but large-capacity memory; when main memory is not large enough, data can be placed in external memory (add an auxiliary memory level)
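The effect of locality can be illustrated with a small simulation. The sketch below is only an illustration under assumed parameters, not a model of any real processor: `hit_rate`, `num_blocks`, and `block_size` are hypothetical names, and the cache is a tiny fully associative one with least-recently-used replacement. It walks a 16x16 array stored row by row, once in row order and once in column order.

```python
from collections import OrderedDict


def hit_rate(addresses, num_blocks=4, block_size=8):
    """Fraction of accesses served by a tiny fully associative LRU cache."""
    cache = OrderedDict()  # block number -> None, least recently used first
    hits = 0
    for addr in addresses:
        block = addr // block_size  # a miss loads the whole block (spatial locality)
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # recently used blocks are kept (temporal locality)
        else:
            if len(cache) >= num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = None
    return hits / len(addresses)


ROWS, COLS = 16, 16
# The array is stored row by row, so element (r, c) is at address r * COLS + c.
row_order = [r * COLS + c for r in range(ROWS) for c in range(COLS)]
col_order = [r * COLS + c for c in range(COLS) for r in range(ROWS)]

print(hit_rate(row_order))  # 0.875: each loaded block serves 8 consecutive accesses
print(hit_rate(col_order))  # 0.0: every access lands in a block that was already evicted
```

With these assumed sizes, the same data and the same cache give very different results depending only on the access order, which is why the hierarchy relies on programs having good locality.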

Cache-main memory and main memory-auxiliary memory levels

The two main levels

From the perspective of main memory:

   Cache-main memory level: compensates for the insufficient speed of main memory

   Main memory-auxiliary memory level: compensates for the insufficient capacity of main memory

Four questions about the storage hierarchy

1. When a block is loaded into the higher level (the one nearer the CPU), where can it be placed?

   (Mapping rule: where a loaded block may be placed)

2. When the block being accessed is in the higher level, how is it found?

   (Lookup algorithm: how to search the candidate locations specified by the mapping rule)

3. When a miss occurs, which block should be replaced?

   (Replacement algorithm: needed when the candidate locations specified by the mapping rule are all occupied by other blocks)

4. What should be done on a write access?

   (Write strategy: how write operations are handled)
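The four questions can be seen together in a toy model. The following is a minimal sketch that assumes one particular set of answers: set-associative mapping, tag comparison for lookup, LRU replacement, and a write-back, write-allocate policy. The class name `SetAssociativeCache` and all of its parameters are hypothetical and chosen only for illustration; real caches may answer each question differently.

```python
class SetAssociativeCache:
    """Toy cache that tracks only tags and dirty bits, not the data itself."""

    def __init__(self, num_sets=2, ways=2, block_size=4):
        self.num_sets = num_sets
        self.ways = ways
        self.block_size = block_size
        # Each set is a list of [tag, dirty] lines, least recently used first.
        self.sets = [[] for _ in range(num_sets)]
        self.writebacks = 0  # dirty blocks written back to the lower level

    def access(self, addr, write=False):
        """Return True on a hit, False on a miss."""
        block = addr // self.block_size
        index = block % self.num_sets  # Q1 mapping: the block may live only in this set
        tag = block // self.num_sets
        lines = self.sets[index]
        for i, line in enumerate(lines):  # Q2 lookup: compare tags within the set
            if line[0] == tag:
                line[1] = line[1] or write  # Q4 write-back: a write only marks the line dirty
                lines.append(lines.pop(i))  # move to the most recently used position
                return True
        if len(lines) == self.ways:  # Q3 replacement: the set is full, evict the LRU line
            victim = lines.pop(0)
            if victim[1]:
                self.writebacks += 1  # a dirty victim must be written to the lower level
        lines.append([tag, write])  # write-allocate: a write miss also loads the block
        return False


cache = SetAssociativeCache()
trace = [(0, False), (1, False), (8, True), (16, False), (24, False)]
print([cache.access(addr, write) for addr, write in trace])
# [False, True, False, False, False]
print(cache.writebacks)  # 1: the block written at address 8 was evicted while dirty
```

In this trace all five addresses map to set 0, so the third and later blocks compete for the same two lines even though the other set stays empty.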

Performance parameters of the storage hierarchy
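The original post does not list the parameters under this heading. As an assumption, the sketch below uses the quantities commonly used to describe a two-level hierarchy: hit rate, average access time, and average cost per bit. The function names, the argument names, and the example numbers are all hypothetical, and `t_lower` is taken to mean the total time to serve an access that misses in the upper level.

```python
def average_access_time(hit_rate, t_upper, t_lower):
    """Average access time: T = H * t_upper + (1 - H) * t_lower."""
    return hit_rate * t_upper + (1.0 - hit_rate) * t_lower


def average_cost_per_bit(c_upper, s_upper, c_lower, s_lower):
    """Capacity-weighted cost: C = (c1 * s1 + c2 * s2) / (s1 + s2)."""
    return (c_upper * s_upper + c_lower * s_lower) / (s_upper + s_lower)


# Made-up example: 95% of accesses hit a 1 ns upper level, misses cost 100 ns.
print(average_access_time(0.95, 1.0, 100.0))  # about 5.95 ns, close to the fast level
# Made-up example: the upper level is 1% of capacity and 10x the price per bit.
print(average_cost_per_bit(10.0, 1.0, 1.0, 99.0))  # 1.09, close to the cheap level
```

Under these assumed numbers the hierarchy behaves as intended: its access time stays near that of the fast level while its cost per bit stays near that of the cheap level.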

 


Origin blog.csdn.net/weixin_42596333/article/details/104237407