lmbench testing notes (for personal use)

Definitions:
lmbench: a benchmark suite; here it is used to measure memory access latency (see the pointer-chasing sketch below).
memory: main memory (DRAM).
cache: a small, fast memory; its reason for existing is to bridge the speed gap between main memory and the CPU.
TLB: in essence, a cache for page table entries.
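
To make "memory access latency" concrete, here is a minimal pointer-chasing sketch in the spirit of lmbench's lat_mem_rd. It is an illustrative simplification, not lmbench's code; the stride, working-set sizes and iteration count are assumptions. A buffer is turned into a chain of pointers, the chain is walked so that every load depends on the previous one, and the average time per load approximates the access latency at that working-set size (the real lat_mem_rd additionally arranges the chain to defeat hardware prefetching):

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define STRIDE 64              /* bytes between chain elements (one cache line) */
#define ITERS  (1 << 24)       /* number of dependent loads to time */

/* Walk a pointer chain covering `bytes` of memory; return average ns per load. */
static double chase(size_t bytes)
{
    size_t n = bytes / STRIDE;
    char *buf = malloc(bytes);
    void **p;

    /* Build a sequential chain with the given stride, wrapping at the end. */
    for (size_t i = 0; i < n; i++)
        *(void **)(buf + i * STRIDE) = buf + ((i + 1) % n) * STRIDE;

    p = (void **)buf;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < ITERS; i++)
        p = *p;                /* dependent load: the next address comes from memory */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    if (p == NULL)             /* keep p "used" so the loop is not optimized away */
        puts("unreachable");
    free(buf);
    return ns / ITERS;
}

int main(void)
{
    /* Working sets chosen to land in L1, L2, L3 and main memory on a typical CPU. */
    size_t sizes[] = { 16 << 10, 256 << 10, 4 << 20, 64 << 20 };
    for (int i = 0; i < 4; i++)
        printf("%8zu KB : %.2f ns/load\n", sizes[i] >> 10, chase(sizes[i]));
    return 0;
}
```

On a typical machine the reported ns/load rises in steps as the working set outgrows L1, then L2 and L3, and finally spills into main memory.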

Single/Dual usually refers to the number of hardware threads.
Dual-threaded means that one physical CPU core is presented as two logical CPUs, so two threads can run at the same time. Relatively speaking, performance improves, and the system also shows two CPUs.

To speed up address translation, a small table of recently used page table entries is kept in a cache (the TLB), so that most translations avoid the performance loss of reading the page table from memory. Unlike the page table, the TLB stores the virtual page number in the tag and the physical page number in the data field, and it also carries dirty, reference, and valid bits. The TLB corresponds to a subset of the page table: if a virtual address hits in the TLB, it must also hit in the page table.
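
As a rough illustration of the entry layout just described, a direct-mapped TLB lookup might look like the sketch below; the field names, entry count and page size are illustrative assumptions, not any particular CPU's design:

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 64
#define PAGE_SHIFT  12                 /* 4 KB pages */

/* One TLB entry: the tag holds the virtual page number, the data holds the
 * physical page number, plus valid/dirty/reference bits, as described above. */
struct tlb_entry {
    uint64_t vpn;                      /* tag: virtual page number */
    uint64_t ppn;                      /* data: physical page number */
    bool     valid;
    bool     dirty;
    bool     ref;
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* Translate a virtual address; return true on a TLB hit and fill *paddr.
 * On a miss, a real MMU would walk the page table and refill this entry. */
static bool tlb_lookup(uint64_t vaddr, uint64_t *paddr)
{
    uint64_t vpn = vaddr >> PAGE_SHIFT;
    struct tlb_entry *e = &tlb[vpn % TLB_ENTRIES];   /* direct-mapped index */

    if (e->valid && e->vpn == vpn) {
        e->ref = true;                               /* mark recently used */
        *paddr = (e->ppn << PAGE_SHIFT) | (vaddr & ((1u << PAGE_SHIFT) - 1));
        return true;                                 /* hit: the page table must also hit */
    }
    return false;                                    /* miss: fall back to a page-table walk */
}
```
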
A CPU has several levels of cache. The L1 and L2 caches are typically private to each core, and L1 is split into L1i and L1d, for storing instructions and data respectively; the L2 cache makes no distinction between instructions and data. The L3 cache is shared by multiple cores and usually does not distinguish instructions from data either. There is also a cache called the TLB, which is mainly used by the MMU as a page table cache; it is not what we usually mean when we say "cache".
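
On Linux this hierarchy can be inspected through sysfs. Below is a small sketch, assuming the usual /sys/devices/system/cpu/cpu0/cache/indexN layout (on typical x86 machines index0..index3 are L1d, L1i, L2 and L3), that prints each cache's level, type and size:

```c
#include <stdio.h>

int main(void)
{
    char path[128], buf[64];

    /* Iterate over cpu0's caches until no more indexN directories exist. */
    for (int i = 0; ; i++) {
        snprintf(path, sizeof path,
                 "/sys/devices/system/cpu/cpu0/cache/index%d/level", i);
        FILE *f = fopen(path, "r");
        if (!f)
            break;                                    /* no more cache levels */
        if (fgets(buf, sizeof buf, f))
            printf("index%d: L%s", i, buf);           /* e.g. "L1" */
        fclose(f);

        snprintf(path, sizeof path,
                 "/sys/devices/system/cpu/cpu0/cache/index%d/type", i);
        if ((f = fopen(path, "r")) != NULL) {
            if (fgets(buf, sizeof buf, f))
                printf("  type: %s", buf);            /* Data / Instruction / Unified */
            fclose(f);
        }

        snprintf(path, sizeof path,
                 "/sys/devices/system/cpu/cpu0/cache/index%d/size", i);
        if ((f = fopen(path, "r")) != NULL) {
            if (fgets(buf, sizeof buf, f))
                printf("  size: %s", buf);            /* e.g. "32K" */
            fclose(f);
        }
    }
    return 0;
}
```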

Both the L1 and L2 caches use SRAM as the storage medium, so why is L1 faster than L2? For the following reasons:

  1. Different capacities lead to different speeds

Typically the L1 cache has a smaller capacity than the L2 cache. The larger an SRAM array, the longer its access time; under the same design process, access latency is roughly proportional to the square root of the capacity. For example, under this rule an L2 that is 16 times larger than the L1 (say, 512 KB vs. 32 KB) would be expected to have about 4 times its access latency, before wiring distance is even considered.

  2. Different distances from the CPU core lead to different speeds

The L1 cache is generally placed close to the CPU core that needs its data, while the L2 cache sits in a more remote location; to access data in the L2 cache, signals travel over longer wires and through more circuitry, which increases the delay.

The L1 cache is split into an ICache (instruction cache) and a DCache (data cache). The ICache is usually placed near the CPU core's instruction prefetch unit, and the DCache near the core's load/store unit, while the L2 cache is placed outside the CPU pipeline.

Why not place the L2 cache just as close? Because the larger a cache's capacity, the larger its area and, correspondingly, the longer its side length (assuming it is square), so some part of it will always be far from the core.

(The material above was gathered from the Internet; if there is any infringement, please contact the author for removal.)
