[In-depth Understanding of Computer Systems] Chapter 6 - The Memory Hierarchy

I still remember being confused by the calculations in this chapter on the mid-term exam ... I worked through the exercises in the book again, and I found them quite interesting.

  1. [DRAM, SRAM] dynamic / static random access memory.
  2. [Disk] Sectors (typically 512 B per sector). The gaps between sectors store no data bits; they hold only formatting bits that identify sectors. A disk access has three main components: seek time (usually 6~9 ms), rotational latency (depends on the rotation speed; on average half a revolution), and transfer time (depends on the rotation speed and the number of sectors per track; usually the smallest of the three).
  3. [Locality] Temporal and spatial locality. As an example of poor temporal locality, a loop that sums a vector accesses each element only once. A reference pattern with stride k visits every k-th element of a contiguous vector; in general, spatial locality decreases as k increases.
  4. [Memory Hierarchy] Central idea: for each k, the faster and smaller storage device at layer k serves as a cache for the larger and slower storage device at layer k + 1. Data is divided into blocks, and transfers between layers are always in units of blocks. L0 <=> L1: 1 B; L1 <=> L2 and L2 <=> L3: 4~8 B; L3 <=> L4: hundreds or thousands of bytes. The lower the layer, the slower the transfer and the larger the block size.
    • Cache reads:
      • When the program needs data from layer k + 1, it first looks in layer k. Finding it there is a cache hit; not finding it is a cache miss. On a miss, the block is fetched from layer k + 1; if layer k is full, a replacement policy must choose a block to evict, e.g. LRU (least recently used).
      • Kinds of cache misses: cold miss (layer k is empty; also called a compulsory miss), conflict miss (block placement is restricted, e.g. block i of layer k + 1 must go to position i mod 4 of layer k, so even though the layer-k cache is large enough, referenced blocks may keep evicting each other from the same position), and capacity miss (the cache is too small to hold the working set).
      • Note: t = m − (s + b), where m is the number of address bits, s the set-index bits, b the block-offset bits, and t the tag bits. E and t are not directly related. The advantage of putting the set index in the middle bits: if the high-order bits were used as the index, blocks at consecutive addresses would map to the same cache set, reducing cache utilization.
      • According to E (the number of cache lines per set), caches are classified as:
        • Direct-mapped cache (E = 1)
        • Set-associative cache (1 < E < C / B)
        • Fully associative cache (E = C / B, S = 1; only practical for small caches, such as the TLB, which caches page table entries)
      • Direct-mapped caches can suffer thrashing from conflict misses, e.g. two arrays accessed in a loop that both map to the same cache set. A simple fix is to "stagger" them (in my own words), i.e. insert padding so they map to different sets.
    • Cache writes:
      • On a write hit, the value is changed in the cache; the update to memory can be done by:
        • Write-through (write to memory immediately)
        • Write-back (requires an extra dirty bit; writes to memory only when the replacement policy evicts the updated block)
      • Handling write misses:
        • Write-allocate (load the block into the cache, then update it)
        • No-write-allocate (bypass the cache and write directly to memory)
      • Write-through caches are usually no-write-allocate; write-back caches are usually write-allocate.
      • Performance impact of cache parameters
        • Miss rate
        • Hit rate
        • Hit time
        • Miss penalty

Origin www.cnblogs.com/zhouys96/p/12702758.html