About NUMA

  1. NUMA issues
    1. Cause

The Linux operating system includes algorithms that attempt to keep memory objects close to the CPU that accesses them. However, an application’s tasks can migrate over time to CPUs in other NUMA nodes and away from their memory objects, resulting in reduced performance.

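The most direct remedy is to bind the task with the numactl command, as follows:

#numactl -m 1 -N 1 numactl --show

taskset and cpuset are alternatives.

numactl is implemented on top of the libnuma interface. taskset controls only where a task's threads run and leaves memory allocation at the default policy. cpuset is a subset of cgroups that can bind tasks to specific cores and their memory objects to specific memory nodes. Unlike numactl, cpuset bindings can be changed dynamically.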

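    1. Determining where a task's memory resides

The /proc/<pid>/numa_maps file shows the memory objects a task has allocated and their addresses.

To show the heap and stack of the current shell process: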

#grep -e heap -e stack /proc/$$/numa_maps

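The first column is the virtual address, the second is the memory allocation policy, and the third is the path of the mapped file. anon and dirty give page counts. N<node> gives the number of pages allocated on each <node>.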
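The N<node> fields can be totalled to see how a task's pages are spread across nodes. The following is a minimal sketch, not part of the original procedure; the function name numa_pages_per_node is hypothetical, and it assumes every per-node field has the form N<node>=<pages> as described above.

```shell
# Sum the pages per NUMA node from numa_maps-formatted text on stdin.
# Every field of the form N<node>=<pages> is added to that node's total.
# Output: one "<node> <pages>" line per node, sorted by node number.
numa_pages_per_node() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^N[0-9]+=[0-9]+$/) {
                split($i, kv, "=")
                pages[substr(kv[1], 2)] += kv[2]
            }
        }
    }
    END {
        for (n in pages) printf "%d %d\n", n, pages[n]
    }' | sort -n
}

# Example: numa_pages_per_node < /proc/$$/numa_maps
```

A task whose pages sit mostly on a node other than the one it runs on is a candidate for binding with numactl.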

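    1. Automatic NUMA balancing

Automatic NUMA balancing was introduced after kernel 3.8. It moves tasks closer to their memory, or moves memory closer to where the task is executing.

It consists of the following: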

•  Periodic unmapping of process memory pages, a little at a time. These unmapped pages result in NUMA-hinting faults, allowing the kernel to track and process memory location.

•  Migrate-on-Fault (MoF): This function moves memory pages to where the task accessing them is executing.

•  task_numa_placement(): This kernel routine moves running tasks closer to their memory objects.

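Many applications benefit from this, but for some it can increase page-fault latency (Oracle disables NUMA balancing entirely on OEL).

numa_balancing can be disabled with either of the following commands:

#sysctl -w kernel.numa_balancing=0

or #echo 0 >/proc/sys/kernel/numa_balancing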
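Before and after changing the setting, it helps to confirm what the kernel currently reports. This is a small sketch under the assumption that any non-zero value means balancing is active; the function name numa_balancing_state is hypothetical, and the file path can be overridden for testing.

```shell
# Report whether automatic NUMA balancing is on.
# $1: path of the control file (defaults to the procfs location used above).
numa_balancing_state() {
    file=${1:-/proc/sys/kernel/numa_balancing}
    if [ ! -r "$file" ]; then
        # Kernel built without NUMA balancing, or file not readable.
        echo "unavailable"
        return 1
    fi
    case "$(cat "$file")" in
        0) echo "disabled" ;;
        *) echo "enabled" ;;
    esac
}

# Example: numa_balancing_state
```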

  1. Power management
    1. C-state

To identify the C-states the hardware exposes, use the following command:

# cat /sys/devices/system/cpu/cpu0/cpuidle/state<C-state number>/name

To restrict the intel_idle driver to C-state 1, add "intel_idle.max_cstate=1" to the kernel boot parameters. The same can be done with the tuned-adm utility, using tuned-adm profile latency-performance, or with the cpupower idle-set command.
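Rather than reading one state file at a time, the names of all C-states for a CPU can be listed in a loop. The sketch below assumes the sysfs layout shown in the cat command above; the function name list_cstates is hypothetical and the directory argument exists only so that the function can be pointed at another CPU.

```shell
# Print "<state directory> <name>" for every C-state a CPU exposes.
# $1: cpuidle directory of the CPU (defaults to cpu0).
list_cstates() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpuidle}
    for state in "$dir"/state*; do
        # Skip the unexpanded glob when no state directories exist.
        [ -r "$state/name" ] || continue
        printf '%s %s\n' "${state##*/}" "$(cat "$state/name")"
    done
}

# Example: list_cstates /sys/devices/system/cpu/cpu3/cpuidle
```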

    1. P-state

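P-states control the CPU clock frequency within a range. A low clock frequency hurts performance; locking the CPU at its high frequency can improve it.

Use the latency-performance profile of tuned-adm.

Or: cpupower frequency-set -g performance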
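After selecting the performance governor, one way to verify the result is to read each CPU's scaling_governor file under sysfs. This is an illustrative sketch; the function name cpus_not_using_governor is hypothetical, and it assumes the cpufreq directory is present for each CPU (it is skipped otherwise).

```shell
# List CPUs whose cpufreq governor differs from the wanted one.
# $1: wanted governor, e.g. performance
# $2: sysfs CPU root (defaults to /sys/devices/system/cpu)
# Output: one "<cpu> <current governor>" line per mismatching CPU.
cpus_not_using_governor() {
    want=$1
    root=${2:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue
        gov=$(cat "$f")
        if [ "$gov" != "$want" ]; then
            cpu=${f%/cpufreq/scaling_governor}
            printf '%s %s\n' "${cpu##*/}" "$gov"
        fi
    done
}

# Example: cpus_not_using_governor performance
```

An empty result means every CPU with a cpufreq directory already uses the requested governor.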

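    1. Commands for managing power usage

C-states are managed by a kernel-level driver: intel_idle on RHEL 7, acpi_idle on earlier versions.

If intel_idle is disabled, the acpi_idle driver is used instead.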

#dmesg | grep acpi_idle

or

#cpupower idle-info

To disable C-states deeper than C1:

Add this boot parameter: intel_idle.max_cstate=1

If the acpi_idle driver is in use,

add instead: processor.max_cstate=1

Setting intel_idle.max_cstate=0 disables the intel_idle driver, after which the acpi_idle driver takes over.

To disable all C-states under acpi_idle as well, set: processor.max_cstate=0

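      1. User space

C-states can also be controlled from user space, without changing boot parameters.

  1. Write a maximum latency value to /dev/cpu_dma_latency to limit which C-states may be entered; the setting holds only while the file stays open.
  2. Enable or disable individual C-states per CPU through sysfs, at /sys/devices/system/cpu/cpu<n>/cpuidle/state<n>/disable
    1. The cpupower management command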
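The sysfs approach in item 2 can be applied to all CPUs in one pass. The following is a sketch under the assumption that writing 1 to a disable file turns that state off and 0 turns it back on; the function name set_deep_cstates is hypothetical, and on a real system it must run as root.

```shell
# Write a value into the "disable" file of every C-state deeper than a limit.
# $1: deepest state number to leave untouched (e.g. 1 keeps state0 and state1)
# $2: value to write (1 = disable, 0 = enable)
# $3: sysfs CPU root (defaults to /sys/devices/system/cpu)
set_deep_cstates() {
    max_state=$1
    value=$2
    root=${3:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpuidle/state[0-9]*/disable; do
        [ -e "$f" ] || continue
        state_dir=${f%/disable}
        n=${state_dir##*state}      # numeric suffix of the state directory
        if [ "$n" -gt "$max_state" ]; then
            echo "$value" > "$f"
        fi
    done
}

# Example: set_deep_cstates 1 1    # disable everything deeper than state1
```

Unlike the boot parameters, a change made this way takes effect immediately and is lost at reboot.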

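From RHEL 7 onward the intel_pstate driver is used; earlier releases used acpi_cpufreq.

cpupower can control both C-states and P-states.

Frequency is controlled with the cpupower frequency-info and cpupower frequency-set commands.

C-state control

To view: cpupower idle-info

To set: cpupower idle-set -d (-e) <C-state>

For example: cpupower idle-set -d 2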

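    1. The tuned-adm command

tuned is Red Hat's designated tool for controlling C-states and P-states. It defines a set of profiles, which differ between RHEL 6 and RHEL 7. Besides C-states and P-states, the profiles can also control the I/O scheduler, transparent huge pages, scheduler parameters, and more.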

 

  1. Database tuning
    1. Disabling numa_balancing

sysctl kernel.numa_balancing=0

 

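    1. Huge pages

The kernel must maintain the mappings that translate virtual addresses to physical addresses.

Huge pages reduce the cost of this virtual-to-physical translation by making more efficient use of the TLB.

In addition, backing the SGA with huge pages prevents it from being swapped out.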
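Sizing the huge page pool is a ceiling division of the SGA size by the huge page size. The sketch below shows the arithmetic only; the function name hugepages_needed and the 8 GiB SGA in the example are assumptions for illustration, and the page size should be taken from the Hugepagesize line of /proc/meminfo.

```shell
# Number of huge pages needed to hold an SGA.
# $1: SGA size in bytes
# $2: huge page size in kB (Hugepagesize in /proc/meminfo, commonly 2048)
hugepages_needed() {
    sga_bytes=$1
    page_bytes=$(( $2 * 1024 ))
    # Round up so a partially filled last page is still counted.
    echo $(( (sga_bytes + page_bytes - 1) / page_bytes ))
}

# Example: an 8 GiB SGA with 2048 kB huge pages
# hugepages_needed 8589934592 2048
```

The result is a lower bound for the vm.nr_hugepages sysctl; a real deployment may want some headroom on top of it.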

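    1. Multiple database listener processes

Configure IP addresses on multiple network interfaces.

Configure multiple listeners.

Start each listener bound to its own NUMA node, as follows: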

/bin/numactl --cpunodebind=0 /oracle/product/12.1.0/grid/bin/lsnrctl start ORCL_1601

/bin/numactl --cpunodebind=1 /oracle/product/12.1.0/grid/bin/lsnrctl start ORCL_1602

/bin/numactl --cpunodebind=2 /oracle/product/12.1.0/grid/bin/lsnrctl start ORCL_1603

/bin/numactl --cpunodebind=3 /oracle/product/12.1.0/grid/bin/lsnrctl start ORCL_1604

/bin/numactl --cpunodebind=4 /oracle/product/12.1.0/grid/bin/lsnrctl start ORCL_1605

/bin/numactl --cpunodebind=5 /oracle/product/12.1.0/grid/bin/lsnrctl start ORCL_1606

/bin/numactl --cpunodebind=6 /oracle/product/12.1.0/grid/bin/lsnrctl start ORCL_1607

/bin/numactl --cpunodebind=7 /oracle/product/12.1.0/grid/bin/lsnrctl start ORCL_1608
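The eight commands above follow one pattern, so they can be generated in a loop. This is a sketch that only prints the commands; the function name listener_commands is hypothetical, and the defaults for the lsnrctl path, the ORCL_ prefix and the base port 1601 are taken from the listing above.

```shell
# Print one numactl-wrapped lsnrctl command per NUMA node.
# $1: number of NUMA nodes
# $2: path to lsnrctl      $3: listener name prefix      $4: first port number
listener_commands() {
    nodes=$1
    lsnrctl=${2:-/oracle/product/12.1.0/grid/bin/lsnrctl}
    prefix=${3:-ORCL_}
    base=${4:-1601}
    i=0
    while [ "$i" -lt "$nodes" ]; do
        # Node i is paired with listener <prefix><base + i>.
        echo "/bin/numactl --cpunodebind=$i $lsnrctl start $prefix$(( base + i ))"
        i=$(( i + 1 ))
    done
}

# Example: listener_commands 8
```

Printing rather than executing makes it easy to review the node-to-listener pairing before piping the output to sh.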

Configure the client's tnsnames.ora:

ORCLML =

  (DESCRIPTION =

    (ADDRESS_LIST =

      (LOAD_BALANCE = on)

      (FAILOVER = off)

      (ADDRESS = (PROTOCOL = TCP)(HOST = 11.1.1.1)(PORT = 1601))

      (ADDRESS = (PROTOCOL = TCP)(HOST = 12.1.1.1)(PORT = 1602))

      (ADDRESS = (PROTOCOL = TCP)(HOST = 13.1.1.1)(PORT = 1603))

      (ADDRESS = (PROTOCOL = TCP)(HOST = 14.1.1.1)(PORT = 1604))

      (ADDRESS = (PROTOCOL = TCP)(HOST = 15.1.1.1)(PORT = 1605))

      (ADDRESS = (PROTOCOL = TCP)(HOST = 16.1.1.1)(PORT = 1606))

      (ADDRESS = (PROTOCOL = TCP)(HOST = 17.1.1.1)(PORT = 1607))

      (ADDRESS = (PROTOCOL = TCP)(HOST = 18.1.1.1)(PORT = 1608))

    )

    (CONNECT_DATA =

      (SERVER = DEDICATED)

      (SERVICE_NAME = orclml)

    )

  )



Reposted from blog.csdn.net/notbaron/article/details/81147778