Understanding the CPU Cache and Performance Optimization

Copyright notice: if you repost this article, please follow my WeChat official account 青儿创客基地. https://blog.csdn.net/Zhu_Zhu_2009/article/details/89213039

Cache

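There are three physical cache organizations:

  1. Direct mapped: a memory address can appear in exactly one location in the cache. This is equivalent to a set-associative cache in which each set has only one way.
  2. Set-associative: the cache is divided into N sets, and a memory address can appear in any location (way) within the one set it maps to.
  3. Fully associative: a memory address can appear in any location in the cache. This is equivalent to a set-associative cache with only one set.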
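The three organizations differ only in how the same number of cache lines is divided into sets and ways. The sketch below illustrates this with a hypothetical cache of 512 lines of 64 bytes each. The helper name `candidate_slots` and the sample address are assumptions made for illustration and do not come from the original article.

```python
def candidate_slots(addr, num_lines, ways, line_size=64):
    """Return (set_index, ways): the set an address maps to and the number
    of slots inside that set that may hold it (hypothetical helper)."""
    num_sets = num_lines // ways
    block = addr // line_size      # index of the memory block containing addr
    return block % num_sets, ways

NUM_LINES = 512  # assumed: a 32 KiB cache with 64-byte lines

addr = 0x12345
direct = candidate_slots(addr, NUM_LINES, ways=1)          # 512 sets x 1 way
set_assoc = candidate_slots(addr, NUM_LINES, ways=8)       # 64 sets x 8 ways
fully = candidate_slots(addr, NUM_LINES, ways=NUM_LINES)   # 1 set x 512 ways

print(direct)     # (141, 1): exactly one possible location
print(set_assoc)  # (13, 8): any of 8 ways in set 13
print(fully)      # (0, 512): any line in the cache
```

With `ways=1` the address has a single possible slot, and with `ways=NUM_LINES` there is a single set, which matches the two "equivalent to" remarks in the list.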

Let's look at the cache structure of my virtual machine, taking the L1 D-Cache as an example:

zc@ubuntu:~/project/fdk/mwm197$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 158
Model name:            Intel(R) Core(TM) i3-8100 CPU @ 3.60GHz
Stepping:              11
CPU MHz:               3600.007
BogoMIPS:              7200.01
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              6144K
NUMA node0 CPU(s):     0,1
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc cpuid aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch epb invpcid_single pti retpoline rsb_ctxsw fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid rdseed adx smap xsaveopt dtherm arat pln pts hwp hwp_notify hwp_act_window hwp_epp              
zc@ubuntu:~/project/fdk/mwm197$ cat /sys/devices/system/cpu/
cpu0/            cpufreq/         hotplug/         isolated         modalias         online           power/           uevent           
cpu1/            cpuidle/         intel_pstate/    kernel_max       offline          possible         present          vulnerabilities/ 
zc@ubuntu:~/project/fdk/mwm197$ cat /sys/devices/system/cpu/cpu0/
cache/            crash_notes_size  firmware_node/    node0/            subsystem/        uevent            
crash_notes       driver/           hotplug/          power/            topology/         
zc@ubuntu:~/project/fdk/mwm197$ cat /sys/devices/system/cpu/cpu0/cache/
index0/ index1/ index2/ index3/ power/  uevent  
zc@ubuntu:~/project/fdk/mwm197$ cat /sys/devices/system/cpu/cpu0/cache/index0/
coherency_line_size      level                    physical_line_partition  shared_cpu_list          size                     uevent                   
id                       number_of_sets           power/                   shared_cpu_map           type                     ways_of_associativity    
zc@ubuntu:~/project/fdk/mwm197$ cat /sys/devices/system/cpu/cpu0/cache/index0/type 
Data
zc@ubuntu:~/project/fdk/mwm197$ cat /sys/devices/system/cpu/cpu0/cache/index0/level 
1
zc@ubuntu:~/project/fdk/mwm197$ cat /sys/devices/system/cpu/cpu0/cache/index0/size 
32K
zc@ubuntu:~/project/fdk/mwm197$ cat /sys/devices/system/cpu/cpu0/cache/index0/number_of_sets 
64
zc@ubuntu:~/project/fdk/mwm197$ cat /sys/devices/system/cpu/cpu0/cache/index0/ways_of_associativity 
8
zc@ubuntu:~/project/fdk/mwm197$ cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size 
64
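The values reported by sysfs can be cross-checked against each other, since the cache size should equal sets × ways × line size. The following is a minimal sketch of that check. The function names are hypothetical; the file names are the ones shown in the listing, and the sysfs read is skipped when the files are not available.

```python
from pathlib import Path

def cache_size_bytes(number_of_sets, ways_of_associativity, line_size):
    """Total capacity implied by the cache geometry."""
    return number_of_sets * ways_of_associativity * line_size

def read_cache_geometry(index_dir):
    """Read (sets, ways, line size) from a sysfs cache index directory."""
    d = Path(index_dir)
    names = ("number_of_sets", "ways_of_associativity", "coherency_line_size")
    return tuple(int((d / n).read_text()) for n in names)

# Values taken from the terminal listing: 64 sets, 8 ways, 64-byte lines.
expected = cache_size_bytes(64, 8, 64)
print(expected, hex(expected))   # 32768 0x8000, i.e. the reported 32K

# Optionally repeat the check on the local machine, if sysfs exposes the files.
index0 = "/sys/devices/system/cpu/cpu0/cache/index0"
try:
    print(cache_size_bytes(*read_cache_geometry(index0)))
except (OSError, ValueError):
    print("sysfs cache information is not available on this machine")
```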

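The L1 D-Cache is 32K (0x8000 bytes), with 64 sets, 8 ways per set, and 64-byte cache lines. Assuming the CPU address is 64 bits wide (I am not sure about this), the memory address would be split as follows (I am not sure about this either). The 8-way parameter does not show up in the address split, because placement within a set is effectively fully associative.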

|---- tag ----|---- set ----|---- block internal offset ----|---- byte offset ----|
|--- 63:12 ---|--- 11:6 ----|------------ 5:3 --------------|-------- 2:0 --------|
|----- 52 ----|----- 6 -----|------------- 3 ---------------|--------- 3 ---------|
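The split in the table can be sketched with shifts and masks. This follows the article's tentative assumption of a flat 64-bit address and is not a statement about how the hardware actually indexes the cache. The function name `split_address` and the sample addresses are hypothetical.

```python
LINE_SIZE = 64   # bytes per cache line (coherency_line_size)
NUM_SETS = 64    # number_of_sets

OFFSET_BITS = LINE_SIZE.bit_length() - 1   # 6 bits: address bits 5:0
SET_BITS = NUM_SETS.bit_length() - 1       # 6 bits: address bits 11:6

def split_address(addr):
    """Split an address into (tag, set index, 8-byte word in line, byte in word)."""
    offset = addr & (LINE_SIZE - 1)
    word = offset >> 3                       # bits 5:3, block internal offset
    byte = offset & 0x7                      # bits 2:0, byte offset
    set_index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + SET_BITS)   # everything above bit 11
    return tag, set_index, word, byte

a = split_address(0x12345)
b = split_address(0x12345 + 4096)   # 4 KiB further on
print(a)  # (18, 13, 0, 5)
print(b)  # (19, 13, 0, 5): same set, different tag
```

Under this split, addresses that are a multiple of 4 KiB (64 sets × 64 bytes) apart fall into the same set and differ only in the tag, so they compete for the same 8 ways.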
