NUMA trade-offs and optimization settings

https://www.cnblogs.com/tcicy/p/10191505.html

 

  If NUMA is disabled at the OS layer but still enabled at the BIOS layer, performance suffers: QPS drops by 15-30%.

  If NUMA is disabled at the BIOS level, it does not matter whether the OS layer enables it; performance is unaffected.

      Installing numactl:
      # yum install numactl -y
      # numastat                is equivalent to cat /sys/devices/system/node/node0/numastat; detailed statistics for every memory node in the system are recorded under /sys/devices/system/node/
      # numactl --hardware      lists the NUMA nodes present on the system

      # numactl --show          shows the current NUMA binding/policy

 

 

 

      On a RedHat or CentOS system, you can determine whether NUMA is enabled at the BIOS layer with the command:
      # grep -i numa /var/log/dmesg
      If the output is: No NUMA configuration found
      then NUMA is disabled; otherwise NUMA is enabled, for example the output: NUMA: Using 30 for the hash shift.
      You can view the machine's NUMA topology with the lscpu command.

When you find that the numa_miss value reported by numastat is relatively high, it indicates that the allocation policy needs adjusting, for example binding the given process to specific CPUs and memory nodes to improve the memory hit rate, as sketched below.
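For reference, a two-node machine's numastat output looks roughly like the following (all values are illustrative), and a process can then be pinned with numactl (./myapp is a placeholder):

      # numastat
                                 node0           node1
      numa_hit              1775216556       953131452
      numa_miss                3124562        12890342
      numa_foreign            12890342         3124562
      interleave_hit             28962           28964
      local_node            1772977166       949445819
      other_node               5363952        16575975

      # numactl --cpunodebind=0 --membind=0 ./myapp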


---------------------------------------------

     Today's machines have multiple CPUs and multiple memory blocks. In the past we treated memory as one big block, with every CPU paying the same cost to access this shared memory; this is the SMP model that used to be in widespread use. But as processors multiply, shared memory causes increasingly severe memory-access contention, and once memory access becomes the bottleneck, performance cannot scale any further. NUMA (Non-Uniform Memory Access) is a model introduced for exactly this environment. Suppose a machine has two processors and four memory blocks. We group one processor with two memory blocks and call that a NUMA node, so this machine has two NUMA nodes. Physically, the processor and memory blocks of one NUMA node are closer together, so access between them is faster. For example, the machine is split into left and right processors (cpu1, cpu2), with two memory blocks placed beside each processor (memory1.1, memory1.2, memory2.1, memory2.2); then cpu1 in NUMA node1 accesses memory1.1 and memory1.2 faster than it accesses memory2.1 and memory2.2. So under NUMA, if we can ensure that the CPUs within a node access only the memory blocks within that node, efficiency is highest.

When running a program, numactl -m (--membind) and --physcpubind let you specify which memory node and which CPUs the program runs on. The article 玩转cpu-topology gives a table comparing the program using only one node's resources against using multiple nodes' resources (roughly a 38s vs 28s gap), so restricting a program to run within one NUMA node is genuinely meaningful.
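As a minimal sketch (the program name, node number, and core list are placeholders): this pins ./myapp's memory to node 0 and its execution to cores 0-7.

# numactl --membind=0 --physcpubind=0-7 ./myapp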

But then again, is pinning to a NUMA node always good? Consider the NUMA trap. The article "SWAP's Crime and Punishment" (SWAP的罪与罚) describes exactly this NUMA-trap problem. The symptom: your server still has free memory, yet you find it has already started using swap, sometimes even to the point that the machine stalls. This can be caused by the NUMA restriction: if a process is limited to the memory of its own NUMA node, then once that node's memory is used up, the process will not use memory from other NUMA nodes and will start using swap instead; even worse, if the machine has no swap configured, it may simply crash. So you can use numactl --interleave=all to lift the NUMA node restriction.

 

In summary, the conclusion is: decide how to use NUMA based on the specific workload.

If your program consumes large amounts of memory, you should usually choose to lift the NUMA node restriction (or disable NUMA in hardware), because such a program has a good chance of running into the NUMA trap.

Conversely, if your program does not use much memory but demands faster run time, you should usually restrict it to access only its local NUMA node.

---------------------------------------------------------------------

Kernel parameter vm.overcommit_memory:

It controls the memory allocation (overcommit) policy.

Possible values: 0, 1, 2.

0: the kernel checks whether enough free memory is available for the requesting process; if so, the allocation is allowed; otherwise the allocation fails and an error is returned to the application process.

1: the kernel allows all physical memory to be allocated, regardless of the current memory state.

2: the kernel refuses allocations that would exceed the sum of swap space and a configurable share (vm.overcommit_ratio) of physical memory.
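For example, the current value can be inspected and changed with sysctl (the value shown is illustrative):

# sysctl vm.overcommit_memory
vm.overcommit_memory = 0
# sysctl -w vm.overcommit_memory=1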

Kernel parameter vm.zone_reclaim_mode:

Possible values: 0, 1

a. When a node runs short of available memory:

1. if the value is 0, the system tends to allocate memory from other nodes;

2. if the value is 1, the system tends, most of the time, to reclaim cache memory from the local node.

b. Cache matters a lot for performance, so 0 is the better choice.
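For example, checking it and changing it temporarily (the mongodb section below shows how to make the change persistent):

# cat /proc/sys/vm/zone_reclaim_mode
1
# echo 0 > /proc/sys/vm/zone_reclaim_mode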

----------------------------------------------------------------------

mongodb's NUMA problem

The mongodb log shows the following:

WARNING: You are running on a NUMA machine.

We suggest launching mongod like this to avoid performance problems:

numactl --interleave=all mongod [other options]

Solution: temporarily change the NUMA memory allocation policy to interleave=all (interleave allocation across all nodes):

1. Prefix the original startup command with numactl --interleave=all

e.g. # numactl --interleave=all ${MONGODB_HOME}/bin/mongod --config conf/mongodb.conf

2. Modify the kernel parameter:

echo 0 > /proc/sys/vm/zone_reclaim_mode ; echo "vm.zone_reclaim_mode = 0" >> /etc/sysctl.conf

----------------------------------------------------------------------

1. NUMA and SMP

NUMA and SMP are two CPU-related hardware architectures. In the SMP architecture, all CPUs contend on a single bus to access all of memory; the advantage is resource sharing, the disadvantage is intense bus contention. As the number of CPUs on PC servers grew (not just the number of CPU cores), the drawbacks of bus contention became more and more apparent, so Intel introduced the NUMA architecture with its Nehalem CPUs, and AMD also shipped Opteron CPUs based on the same kind of architecture.

NUMA's most distinctive feature is the introduction of the concepts of node and distance. For the two most precious hardware resources, CPU and memory, NUMA divides them into resource groups (nodes) in an almost strict fashion, with roughly equal CPU and memory resources in each group. The number of resource groups depends on the number of physical CPUs (most current PC servers have two physical CPUs, each with 4 cores); the distance concept defines the cost of using resources across nodes and provides the data that resource-scheduling optimization algorithms rely on.
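To make node and distance concrete, numactl --hardware on a two-physical-CPU machine prints something like the following (the CPU lists, sizes, and distances here are illustrative):

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5
node 0 size: 32768 MB
node 0 free: 18321 MB
node 1 cpus: 6 7 8 9 10 11
node 1 size: 32768 MB
node 1 free: 21004 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10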

2. NUMA-related policies

1. Every process (or thread) inherits its NUMA policy from its parent process and is assigned a preferred node. If the NUMA policy allows, the process can use resources on other nodes.

2. NUMA's CPU allocation policies are cpunodebind and physcpubind. cpunodebind restricts a process to run on the CPUs of certain nodes, while physcpubind specifies more precisely which cores it runs on.

3. NUMA's memory allocation policies are localalloc, preferred, membind, and interleave; each is illustrated with a numactl invocation after this list.

localalloc requires the process to request memory from the node it is currently on;

preferred loosely designates a recommended node from which to obtain memory; if the recommended node does not have enough memory, the process may try other nodes;

membind specifies a set of nodes, and the process may only request memory allocations from those nodes;

interleave has the process request memory from the specified set of nodes in an interleaved fashion, using an RR (round-robin scheduling) algorithm.
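As a sketch, each memory policy maps onto a numactl flag (./myapp and the node numbers are placeholders):

# numactl --localalloc ./myapp              allocate on the node the process runs on
# numactl --preferred=0 ./myapp             prefer node 0, fall back to other nodes
# numactl --membind=0,1 ./myapp             allocate only from nodes 0 and 1
# numactl --interleave=all ./myapp          round-robin page allocation across all nodes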

 

 

Because NUMA's default memory allocation policy is to allocate preferentially from the local memory of the CPU the process is running on, memory allocation can become unbalanced across CPU nodes: when one CPU node runs out of memory, swapping results instead of memory being allocated from a remote node. This phenomenon is called swap insanity.

MySQL uses a thread model and its support for NUMA characteristics is not good. If the machine runs only a single MySQL instance, we can choose to disable NUMA; there are three ways to disable it:

1. At the hardware layer, disable it in the BIOS.

2. In the OS kernel, set numa=off at boot time;

3. Use the numactl command to change the memory allocation policy to interleave.

If the machine runs multiple MySQL instances, we can instead bind each MySQL instance to a different CPU node and use a bound memory allocation policy that forces allocation within that node, as sketched below. This both exploits the hardware's NUMA characteristics and avoids the poor multi-core CPU utilization of a single MySQL instance.
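A minimal sketch for two instances on a two-node machine (the binary name, config paths, and node numbers are assumptions):

# numactl --cpunodebind=0 --membind=0 mysqld --defaults-file=/etc/mysql/instance0.cnf &
# numactl --cpunodebind=1 --membind=1 mysqld --defaults-file=/etc/mysql/instance1.cnf &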

3. The relationship between NUMA and swap

You may already have noticed that NUMA's memory allocation policy is not fair between processes (or threads). In current RedHat Linux, localalloc is the default NUMA memory allocation policy, and this configuration makes it very easy for a resource-hungry program to exhaust one node's memory. When a node's memory is exhausted and Linux happens to assign that node to a process (or thread) that needs to consume a lot of memory, swapping duly occurs, even though plenty of page cache could still be released and plenty of free memory may remain elsewhere.

4. Solving the swap problem

Although NUMA's principles are relatively complex, solving the swap problem is actually simple: just run MySQL under numactl --interleave to change the NUMA policy, as in the mongodb example above.

It is worth noting that numactl can not only adjust the NUMA policy but also show the current resource usage of each node; it is a command well worth studying.

 

 

1. CPU
  Let's start with the CPU.
  If you look carefully, some servers show an interesting phenomenon: when you cat /proc/cpuinfo, you find that the CPU's frequency is not the same as its nominal frequency:
  # cat /proc/cpuinfo
  processor : 5
  model name : Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
  cpu MHz : 1200.000
  This is an Intel E5-2620 CPU, a 2.00 GHz * 24 part, yet we find that CPU number 5 is running at 1.2 GHz.
  Why is that?
  This actually comes from the CPU's latest technology: power-saving mode. The operating system and the CPU hardware cooperate: when the system is not busy, the CPU is clocked down to save power and lower the temperature. That is good news for environmentalists and the fight against global warming, but for MySQL it can be a disaster.
  To make sure MySQL can fully use the CPU's resources, it is recommended to set the CPU to maximum-performance mode. This can be configured in the BIOS or in the operating system; the BIOS setting is better and more thorough, but since BIOS types vary widely we will not show the specific steps here. An OS-level sketch follows.
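  At the operating-system level, one hedged sketch uses the cpufreq sysfs interface (the paths and the available governors depend on the cpufreq driver in use):
  # cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
  ondemand
  # for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo performance > $g; done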
  Next let's look at what we can optimize on the memory side.
  i) First, let's look at NUMA.
  Non-Uniform Memory Access (NUMA) is also one of the newer memory management technologies, and it is the counterpart of the Symmetric Multi-Processor (SMP) architecture. A simple comparison follows:
  As the figure in the original post shows (not reproduced here), we can see intuitively that under SMP every memory access has the same cost, while under the NUMA architecture local and non-local memory accesses have different costs. Based on this characteristic, we can set a process's memory allocation policy in the operating system. The currently supported options include:
  --interleave=nodes
  --membind=nodes
  --cpunodebind=nodes
  --physcpubind=cpus
  --localalloc
  --preferred=node
  In short, you can have memory allocated locally, on a few specified CPU nodes, or in round-robin fashion. Except for the round-robin mode, --interleave=nodes, in which memory can be allocated on any NUMA node, the other modes will not give the process leftover memory from other NUMA nodes even when such memory remains; Linux instead resorts to SWAP to obtain memory. Experienced system administrators and DBAs all know what a pit the database performance degradation caused by SWAP is.
  So the simplest approach is still to turn this feature off.
  The feature can be turned off in the BIOS, in the operating system, or temporarily when starting the process:
  a) Because of the differences between BIOS types, how to disable NUMA varies widely, and we will not show the specific steps here.
  b) To disable it in the operating system, append numa=off to the end of the kernel line in /etc/grub.conf, as follows:
  kernel /vmlinuz-2.6.32-220.el6.x86_64 ro root=/dev/mapper/VolGroup-root rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=VolGroup/root rd_NO_MD quiet SYSFONT=latarcyrheb-sun16 rhgb crashkernel=auto rd_LVM_LV=VolGroup/swap rhgb crashkernel=auto quiet KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM numa=off
  You can additionally set vm.zone_reclaim_mode=0 (see the kernel-parameter section above) so that the system prefers allocating from other nodes over reclaiming local cache.
  c) Disable the NUMA behavior when starting MySQL:
  numactl --interleave=all mysqld
  Of course, the best way is to turn it off in the BIOS.
  ii) Next let's look at vm.swappiness.
  vm.swappiness is the operating system's policy for swapping physical memory out. Its value is a percentage: the minimum is 0, the maximum is 100, and the default is 60. Setting vm.swappiness to 0 means swap as little as possible; 100 means swap inactive memory pages out as aggressively as possible.
  Specifically: when memory is nearly full, the system uses this parameter to decide whether to swap out rarely-used inactive memory or to release cached data. The cache holds data read from disk; by the principle of locality, that data may well be read again soon. Inactive memory, as the name implies, is memory that is mapped by applications but has gone unused for a long time.
  We can use vmstat to see the amount of inactive memory:
  # vmstat -an 1
  procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
   r b swpd free inact active si so bi bo in cs us sy id wa st
   1 0 0 27522384 326928 1704644 0 0 0 153 11 10 0 0 100 0 0
   0 0 0 27523300 326936 1704164 0 0 0 74 784 590 0 0 100 0 0
   0 0 0 27523656 326936 1704692 0 0 8 8 439 1686 0 0 100 0 0
   0 0 0 27524300 326916 1703412 0 0 4 52 198 262 0 0 100 0 0
  Through /proc/meminfo you can see more detailed information:
  # cat /proc/meminfo | grep -i inact
  Inactive: 326972 kB
  Inactive(anon): 248 kB
  Inactive(file): 326724 kB
  Here we discuss inactive memory a bit further. In Linux, memory can be in three states: free, active, and inactive. As is well known, the Linux kernel internally maintains many LRU lists to manage memory, such as LRU_INACTIVE_ANON, LRU_ACTIVE_ANON, LRU_INACTIVE_FILE, LRU_ACTIVE_FILE, and LRU_UNEVICTABLE; LRU_INACTIVE_ANON and LRU_ACTIVE_ANON manage anonymous pages, while LRU_INACTIVE_FILE and LRU_ACTIVE_FILE manage page-cache pages. Based on how many memory pages are available, the kernel periodically moves active memory onto the inactive list, and this inactive memory can then be swapped out to swap.
  In general MySQL, and especially InnoDB's cache management, occupies a lot of memory, and much of that memory is accessed infrequently; if Linux mistakenly swaps it out, a lot of CPU and IO resources are wasted. InnoDB manages its own cache, so spending memory on caching file data brings InnoDB almost no benefit.
  So on a MySQL server we had better set vm.swappiness = 1 or 0.
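  A sketch of applying this, both at runtime and persistently:
  # echo 1 > /proc/sys/vm/swappiness
  # echo "vm.swappiness = 1" >> /etc/sysctl.conf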
