When Transparent hugepage encounters fork

        The online counting system ran into a strange problem. While the process was backing up, free system memory shrank rapidly until the full 25G was used up. The process then consumed a large amount of swap, which slowed service responses and caused a serious drop in SLA.

        The cause turned out to be Transparent hugepage (THP). The details are recorded below.

 

1 Counting System Backup Overview

        The cache system holds 10-100G of data in memory. The backup runs every day during the early-morning off-peak hours. The backup logic: the main process forks, the child process persists the data to disk, and the parent process keeps serving requests. During the backup, the volume of changes is very small: a single kv is less than 64 bytes, and the change rate is less than 100 tps.
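The fork-based backup described above can be sketched as follows. This is a minimal illustration, not the counting system's actual code: the function name `backup_with_fork`, the dict-based `store`, and the JSON snapshot format are all assumptions made for the example. It relies on `os.fork`, so it runs on Unix-like systems only.

```python
import json
import os


def backup_with_fork(store, path):
    """Fork a child that writes a snapshot of `store` to `path`.

    Hypothetical helper for illustration. Returns the child's pid so the
    caller can reap it later with os.waitpid().
    """
    pid = os.fork()
    if pid == 0:
        # Child: sees the parent's memory as it was at fork time.
        status = 1
        try:
            with open(path, "w") as f:
                json.dump(store, f)
            status = 0
        finally:
            # Leave immediately, without running the parent's cleanup handlers.
            os._exit(status)
    # Parent: returns at once and keeps serving requests. Every page it
    # writes to from now on is copied by the kernel (copy-on-write).
    return pid
```

The parent can keep modifying `store` right after the call; the child still writes the state as of the fork. The cost is that each page the parent touches while the child is alive has to be duplicated.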

 

2 OS version and THP

        The online machines run Linux kernel 2.6.38 - 2.7, and since 2.6.38 the Linux kernel enables Transparent hugepage (THP) by default. Applications that previously could not benefit from hugepages, because explicit hugepages are complex to set up, get the improvement without any modification; the expected performance gain is around 10%.

 

3 Where the memory went

        Log analysis showed that the entire backup lasted 10 minutes. At fewer than 100 changes per second, at most 100*60*10=60000 kvs were modified during the backup, so the effective data change was 60000*64 bytes=about 4M.

        After the fork, every write in the parent process goes through copy-on-write (COW), which copies the whole page that contains the changed bytes. Take the most extreme case first, where each changed kv lies on a different page, to get an upper bound.

        With ordinary 4K pages, the memory that can be copied is 60000*4K=about 234M.

        With THP turned on by default, each page becomes 2M, so the upper bound grows to 60000*2M=about 117G, 512 times the 4K case.

        Since each kv is very small, many changed kvs share a page, so the actual usage stayed below this bound: 25G of memory and 20G+ of swap.
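The worst-case estimate can be reproduced with a few lines of arithmetic. The sketch below assumes, as the analysis does, that every write lands on a different page; the function and constant names are made up for the example.

```python
PAGE_4K = 4 * 1024          # ordinary page size in bytes
PAGE_2M = 2 * 1024 * 1024   # transparent hugepage size in bytes


def worst_case_cow_bytes(writes_per_second, duration_seconds, page_size):
    """Upper bound on COW-copied memory if each write dirties its own page."""
    dirty_pages = writes_per_second * duration_seconds
    return dirty_pages * page_size


def to_gib(num_bytes):
    """Convert bytes to GiB for readable output."""
    return num_bytes / 2**30


# 100 writes per second for 10 minutes, as in the backup described above.
small = worst_case_cow_bytes(100, 600, PAGE_4K)
huge = worst_case_cow_bytes(100, 600, PAGE_2M)
print(f"4K pages: {to_gib(small):.2f} GiB, 2M pages: {to_gib(huge):.2f} GiB")
```

In practice the bound is also capped by the resident size of the parent process, since COW cannot copy more memory than the process holds.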

 

 

4 Turn off THP

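        After turning off THP and restarting the counting service, backing up 100G of data while still serving updates put no pressure on memory.

        To summarize: with a fork-based backup, writes in the main process go through COW, and because Linux enables THP by default, write amplification appears (even if you update 1 byte, 2M bytes of new memory may actually be needed). Memory then becomes the bottleneck: swapping, slow responses, and in extreme cases the process being killed by the OS and crashing can all occur.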

 

5 How to turn off THP

1) Effective at boot: add the following parameter to the kernel command line in grub.cfg

 

transparent_hugepage=never

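2) Change at runtime (the redhat_transparent_hugepage path below is specific to RHEL 6 kernels; mainline kernels use /sys/kernel/mm/transparent_hugepage)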

 

# echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
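To confirm that the change took effect, the same sysfs file can be read back. The kernel lists all modes and puts square brackets around the active one, for example `always madvise [never]`. The helper below is a small sketch that extracts the bracketed value; the function names are hypothetical, and the second path is the RHEL 6 location used in the commands above.

```python
import re

CANDIDATE_PATHS = (
    "/sys/kernel/mm/transparent_hugepage/enabled",         # mainline kernels
    "/sys/kernel/mm/redhat_transparent_hugepage/enabled",  # RHEL 6 kernels
)


def active_mode(text):
    """Return the mode shown in square brackets, or None if there is none."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def current_thp_mode():
    """Read the first readable sysfs file; None if THP is not exposed."""
    for path in CANDIDATE_PATHS:
        try:
            with open(path) as f:
                return active_mode(f.read())
        except OSError:
            continue  # file missing or unreadable, try the next location
    return None
```

Calling `current_thp_mode()` after the `echo never` commands should return `"never"` on a kernel that exposes one of these files.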

 

6 More about THP

        Transparent hugepage is a simplified form of hugepage management. It takes effect automatically from Linux 2.6.38 on and applies only to anonymous memory. Actual usage can be checked with  grep -i AnonHugePages /proc/meminfo.
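For monitoring, the same counter can be read programmatically. The sketch below parses the `AnonHugePages` line from `/proc/meminfo`, which the kernel reports in kB; the function names are made up for the example.

```python
import re


def anon_hugepages_kb(meminfo_text):
    """Return the AnonHugePages value in kB, or None if the line is absent."""
    match = re.search(r"^AnonHugePages:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_anon_hugepages_kb(path="/proc/meminfo"):
    """Read and parse the live value from the given meminfo file."""
    with open(path) as f:
        return anon_hugepages_kb(f.read())
```

Sampling this value before and during a backup shows how much anonymous memory is currently backed by hugepages.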

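        For an ordinary cache application, the default THP setting improves performance. If the application forks and the forked child runs for a long time, however, watch resource usage and turn THP off if necessary.

        When THP is turned off at runtime, hugepages that were already allocated stay in effect. Only newly allocated pages stop using hugepages, and the OS no longer scans and merges pages.

        A process can use madvise to control whether THP is used for a given memory range (advice MADV_HUGEPAGE or MADV_NOHUGEPAGE):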

#include <sys/mman.h>

int madvise(void *addr, size_t length, int advice);
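As a rough illustration of the per-region control, the sketch below creates a private anonymous mapping and asks the kernel not to back it with hugepages. It uses Python's `mmap.madvise` wrapper (Python 3.8+, Linux) rather than the C call shown above; the function name and the 4 MiB size are arbitrary choices for the example, and the advice is treated as best effort because kernels built without THP reject it.

```python
import mmap

REGION_SIZE = 4 * 1024 * 1024  # 4 MiB, chosen only for illustration


def map_without_thp(size=REGION_SIZE):
    """Map private anonymous memory and request MADV_NOHUGEPAGE for it.

    Returns (region, applied); applied is False if the advice is not
    available or the kernel refused it.
    """
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    advice = getattr(mmap, "MADV_NOHUGEPAGE", None)
    if advice is None:
        return region, False  # constant not provided on this platform
    try:
        region.madvise(advice)  # applies to the whole mapping
    except (OSError, AttributeError):
        return region, False  # kernel without THP, or madvise unsupported
    return region, True
```

A region marked this way keeps 4K pages even when the system-wide setting is `always`, which limits COW amplification to the memory that is written most often during a fork.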

 

 References:

https://access.redhat.com/solutions/46111

http://lwn.net/Articles/423584/

http://lwn.net/Articles/423592/

 

        
