Analyzing Curve resource usage: memory management

Foreword

This article talks about Curve's memory management from a practitioner's point of view; it does not require readers to master memory management theory. The goal is to share our experience with Linux memory management and memory problem analysis, gathered during the development of Curve, around two problems we ran into:

  • The memory on the chunkserver cannot be released

  • MDS memory grows slowly but continuously

Most memory problems are exposed during the development and testing phases, typically in high-pressure or long-running stability tests (7*24 hours or longer) or in abnormal test scenarios; only a few first show up at runtime in production. This requires that during testing, besides watching io-related test metrics, we also monitor server resources such as memory, CPU and network cards, and check whether the collected metrics meet expectations. For example, the slow MDS memory growth mentioned above would not have been found during testing if we had only checked whether memory usage looked normal at a single point in time. Locating a memory problem after it appears is also not easy, especially in large software.

As mentioned, this article looks at Curve's memory management from a practitioner's perspective rather than diving into allocator theory, and shares our views on Linux memory management and memory problem analysis during software development. It expands on the following aspects:

  • Memory layout. A brief introduction to how a process's virtual address space is laid out.

  • Memory allocation strategy. Explains why a memory allocator is necessary, the problems it has to solve and the characteristics it should have, and illustrates the memory management approach of one allocator through an example.

  • Curve's memory management. Describes the memory allocators currently chosen by the Curve software and the reasons behind those choices.

Curve is a cloud-native distributed storage system open sourced by NetEase and accepted into the CNCF sandbox; it consists of two parts: block storage (CurveBS) and file storage (CurveFS).

1

Memory layout

Before talking about memory management, let's briefly introduce memory layout. Processes use memory through virtual memory: physical memory is the real memory hardware, while virtual memory hides physical memory from the process and provides it with a simple, convenient interface. Why virtual memory is needed, how virtual memory is mapped onto and managed against physical memory, and similar questions are not discussed here.

Linux maintains a separate virtual address space for each process, consisting of two parts: process virtual memory (user space) and kernel virtual memory (kernel space). This article mainly discusses the user space that the process itself can operate on, which is laid out as in the figure below.

Now let's use pmap to view the virtual address space layout of a running curve-mds process. pmap shows the memory mapping information of a process; the command reads the information in /proc/[pid]/maps.

// pmap -X {pid} shows the memory layout of a process
sudo pmap -X 2804620

// The curve-mds layout reported by pmap has many columns:
// Address Perm Offset Device Inode Size Rss Pss Referenced Anonymous ShmemPmdMapped Shared_Hugetlb Private_Hugetlb Swap SwapPss Locked Mapping
// For readability, the columns after Pss are removed below and part of the addresses in the middle are omitted
2804620:   /usr/bin/curve-mds -confPath=/etc/curve/mds.conf -mdsAddr=127.0.0.1:6666 -log_dir=/data/log/curve/mds -graceful_quit_on_sigterm=true -stderrthreshold=3
         Address Perm   Offset Device    Inode    Size   Rss   Pss   Mapping
      c000000000 rw-p 00000000  00:00        0   65536  1852  1852
    559f0e2b9000 r-xp 00000000  41:42 37763836    9112  6296  6296   curve-mds
    559f0eb9f000 r--p 008e5000  41:42 37763836     136   136   136   curve-mds
    559f0ebc1000 rw-p 00907000  41:42 37763836       4     4     4   curve-mds
    559f0ebc2000 rw-p 00000000  00:00        0   10040  4244  4244
    559f1110a000 rw-p 00000000  00:00        0    2912  2596  2596   [heap]
    7f6124000000 rw-p 00000000  00:00        0     156   156   156
    7f6124027000 ---p 00000000  00:00        0   65380     0     0
    7f612b7ff000 ---p 00000000  00:00        0       4     0     0
    7f612b800000 rw-p 00000000  00:00        0    8192     8     8
    7f612c000000 rw-p 00000000  00:00        0     132     4     4
    7f612c021000 ---p 00000000  00:00        0   65404     0     0
    .....
    7f6188cff000 ---p 0026c000  41:42 37750237    2044     0     0
    7f61895b7000 r-xp 00000000  41:42 50201214      96    96     0    libpthread-2.24.so
    7f61895cf000 ---p 00018000  41:42 50201214    2044     0     0    libpthread-2.24.so
    7f61897ce000 r--p 00017000  41:42 50201214       4     4     4    libpthread-2.24.so
    7f61897cf000 rw-p 00018000  41:42 50201214       4     4     4    libpthread-2.24.so
    7f61897d0000 rw-p 00000000  00:00        0      16     4     4
    7f61897d4000 r-xp 00000000  41:42 50200647      16    16     0    libuuid.so.1.3.0
    7f61897d8000 ---p 00004000  41:42 50200647    2044     0     0    libuuid.so.1.3.0
    7f61899d7000 r--p 00003000  41:42 50200647       4     4     4    libuuid.so.1.3.0
    7f61899d8000 rw-p 00004000  41:42 50200647       4     4     4    libuuid.so.1.3.0
    7f61899d9000 r-xp 00000000  41:42 37617895    9672  8904  8904    libetcdclient.so
    7f618a34b000 ---p 00972000  41:42 37617895    2048     0     0    libetcdclient.so
    7f618a54b000 r--p 00972000  41:42 37617895    6556  5664  5664    libetcdclient.so
    7f618abb2000 rw-p 00fd9000  41:42 37617895     292   252   252    libetcdclient.so
    7f618abfb000 rw-p 00000000  00:00        0     140    60    60
    7f618ac1e000 r-xp 00000000  41:42 50201195     140   136     0    ld-2.24.so
    7f618ac4a000 rw-p 00000000  00:00        0    1964  1236  1236
    7f618ae41000 r--p 00023000  41:42 50201195       4     4     4    ld-2.24.so
    7f618ae42000 rw-p 00024000  41:42 50201195       4     4     4    ld-2.24.so
    7f618ae43000 rw-p 00000000  00:00        0       4     4     4
    7fffffd19000 rw-p 00000000  00:00        0     132    24    24    [stack]
    7fffffdec000 r--p 00000000  00:00        0       8     0     0    [vvar]
    7fffffdee000 r-xp 00000000  00:00        0       8     4     0    [vdso]
ffffffffff600000 r-xp 00000000  00:00        0       4     0     0    [vsyscall]
                                               ======= ===== =====
                                                1709344 42800 37113
  • In the output above, the process's address space actually starts at 0x559f0e2b9000 rather than the 0x400000 shown in the memory layout figure above. This is caused by Address Space Layout Randomization (ASLR), which randomizes the starting addresses of key parts of the process address space (such as the stack, shared libraries and heap) to defend against attacks on known addresses. On Linux, a value of 1 or 2 in /proc/sys/kernel/randomize_va_space means randomization is enabled, the difference being which parts are randomized; 0 means it is off.

  • Next, the three regions starting at 0x559f0e2b9000, 0x559f0eb9f000 and 0x559f0ebc1000 all map the curve-mds executable, but with different permissions. When the executable is loaded, from the ELF file's point of view it contains a code segment, a data segment, a BSS segment and so on; from the operating system's point of view, what matters is not what each segment contains but its permissions, so segments with the same permissions can be loaded into the same region: the r-xp region holds the code segment, the r--p region holds read-only data, and the rw-p regions hold the writable data segment and the BSS segment (a small sketch after this bullet list shows how a toy program's segments show up in pmap).

  • The region starting at 0x559f1110a000 corresponds to the runtime heap in the layout figure above; memory allocated dynamically at runtime comes from this region. Note that there is also a random offset between the heap and the .bss segment.

  • Further up, starting at 0x7f6124000000, is the mmap memory mapping region; the images of dynamically linked libraries, large blocks of memory requested by the application and so on are all contained in this region. It will be expanded on in the next section on memory allocation strategy and is the focus of this article.

  • The region starting at 0x7fffffd19000 is the stack space, generally a few megabytes in size.

  • Finally, the vvar, vdso and vsyscall regions exist to implement virtual system calls, which speed up some system calls by letting the program invoke them without trapping into kernel mode. They are not covered here.
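To make the relationship between ELF segments and the pmap regions above concrete, here is a minimal toy sketch (not Curve code): roughly, its code ends up in an r-xp region, its initialized global in a file-backed rw-p region, and its zero-initialized array in the anonymous rw-p region that holds the BSS.

#include <unistd.h>
#include <cstdio>

int initialized_data = 42;        // .data -> file-backed rw-p region
int zeroed_data[1024 * 1024];     // .bss  -> anonymous rw-p region

int main() {                      // .text -> r-xp region
    printf("pid = %d, run `pmap -X %d` in another terminal\n", getpid(), getpid());
    sleep(300);                   // keep the process alive so the layout can be inspected
    return 0;
}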

2

Memory allocation strategy

At the operating-system level, memory in the heap and in the memory mapping region is obtained through two system calls: brk and mmap.

  • brk allocates memory by growing the heap

  • mmap allocates memory in the memory mapping region

If developers had to call brk and mmap directly to allocate and release memory, software development would be error-prone and burdensome, so these system calls are rarely used directly. Instead, programs rely on a memory management library (a memory allocator); commonly used allocators include glibc's ptmalloc, Google's tcmalloc and Facebook's jemalloc, which expose malloc/free style interfaces and use the system calls above to request memory from, and return memory to, the operating system. A general-purpose memory allocator should have the following characteristics:

  • Low extra space consumption. For example, if the application asks for 5K of memory and the allocator hands it 10K, space is wasted.

  • Allocation should be as fast as possible.

  • Memory fragmentation should be avoided as much as possible.

  • Generality, compatibility, portability, and ease of debugging.

Below we use the following diagram to walk through how glibc's allocator (ptmalloc) manages memory; a small sketch program after the list makes the corresponding system calls observable with strace.

  • malloc(30k) allocates memory by extending the top of the heap through the system call brk.

  • malloc(20k) continues to expand the top of the heap through the system call brk.

  • malloc(200k): by default a request larger than 128K (determined by M_MMAP_THRESHOLD, default 128K, adjustable) is served with the mmap system call.

  • free(30k): this space is not returned to the operating system but is kept and managed by ptmalloc. The memory allocated by malloc in steps 1 and 2 was obtained by extending the heap top with brk; once freed it sits below the current heap top, so ptmalloc keeps it in its free lists, and a later malloc(10k) can be carved directly out of this freed space without another brk call. Now consider the situation where the space at the top of the heap stays occupied while some space below it is freed by the application: since the heap can only shrink from the top, that freed space cannot be returned to the system, and if it is never reused it becomes memory fragmentation.

  • free(20k): after the application frees this space, ptmalloc merges it with the adjacent 30k block freed earlier. If the free block at the top of the heap then exceeds M_TRIM_THRESHOLD, the heap top is trimmed and that memory is returned to the operating system.

  • free(200k): memory that was allocated with mmap is returned directly to the operating system via munmap. See the sketch below for a way to observe these calls.
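A minimal sketch of the sequence above, assuming glibc's ptmalloc with default settings; running it under strace -e trace=brk,mmap,munmap makes the brk/mmap split and the final munmap visible.

#include <cstdlib>

int main() {
    void* a = malloc(30 * 1024);   // below the 128K M_MMAP_THRESHOLD: served from the heap (brk)
    void* b = malloc(20 * 1024);   // still below the threshold: heap top is extended again
    void* c = malloc(200 * 1024);  // above the threshold: served via mmap
    free(a);                       // kept in ptmalloc's free lists, not returned to the kernel
    free(b);                       // merged with the adjacent free block
    free(c);                       // munmap'ed back to the kernel immediately
    return 0;
}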

How does ptmalloc allocate memory in a multi-threaded program? If every thread allocated from the same heap top, the threads would compete for the same lock and allocation would be inefficient. ptmalloc therefore maintains multiple allocation areas (arenas) of two kinds: the main allocation area and dynamic allocation areas.

  • Main allocation area: can allocate memory from both the heap and the memory mapping region;

  • Dynamic allocation areas: allocate memory from the memory mapping region, requesting HEAP_MAX_SIZE from the operating system at a time (64M by default on 64-bit systems) and then carving allocations out of it. The main thread and the threads that execute malloc use different allocation areas, and once the number of dynamic allocation areas grows it does not shrink. The number of dynamic allocation areas is at most (2 x number of cores + 1) on 32-bit systems and at most (8 x number of cores + 1) on 64-bit systems. A short sketch after this list shows how the number of arenas can be capped.
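As a side note, the per-arena footprint can be limited. The following is a minimal sketch, assuming glibc, of capping the number of ptmalloc arenas with mallopt; setting the MALLOC_ARENA_MAX environment variable has the same effect.

#include <malloc.h>   // mallopt, M_ARENA_MAX (glibc-specific)

int main() {
    mallopt(M_ARENA_MAX, 2);   // allow at most 2 allocation areas for this process
    // ... start worker threads and allocate as usual ...
    return 0;
}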

Let's use the following program as an example to see how memory is allocated in the multi-threaded case:

// Three threads in total
// main thread: allocates 4k once
// thread 1: allocates 4k 100 times
// thread 2: allocates 4k 100 times
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/types.h>
#include <vector>

void* threadFunc(void* id) {
    std::vector<char*> malloclist;
    for (int i = 0; i < 100; i++) {
        malloclist.emplace_back((char*) malloc(1024 * 4));
    }
    sleep(300);  // wait here so the memory layout can be inspected
    return NULL;
}

int main() {
    pthread_t t1, t2;
    int id1 = 1;
    int id2 = 2;
    char* addr;

    addr = (char*) malloc(4 * 1024);
    pthread_create(&t1, NULL, threadFunc, (void*) &id1);
    pthread_create(&t2, NULL, threadFunc, (void*) &id2);

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}

Let's use pmap to look at the memory layout of this program:

741545:   ./memory_test
         Address Perm   Offset Device    Inode   Size  Rss Pss  Mapping
    56127705a000 r-xp 00000000  08:02 62259273      4    4   4  memory_test
    56127725a000 r--p 00000000  08:02 62259273      4    4   4  memory_test
    56127725b000 rw-p 00001000  08:02 62259273      4    4   4  memory_test
    5612784b9000 rw-p 00000000  00:00        0    132    8   8  [heap]
    7f0df0000000 rw-p 00000000  00:00        0    404  404 404
    7f0df0065000 ---p 00000000  00:00        0  65132    0   0
    7f0df8000000 rw-p 00000000  00:00        0    404  404 404
    7f0df8065000 ---p 00000000  00:00        0  65132    0   0
    7f0dff467000 ---p 00000000  00:00        0      4    0   0
    7f0dff468000 rw-p 00000000  00:00        0   8192    8   8
    7f0dffc68000 ---p 00000000  00:00        0      4    0   0
    7f0dffc69000 rw-p 00000000  00:00        0   8192    8   8
    7f0e00469000 r-xp 00000000  08:02 50856517   1620 1052   9   libc-2.24.so
    7f0e005fe000 ---p 00195000  08:02 50856517   2048    0   0   libc-2.24.so
    7f0e007fe000 r--p 00195000  08:02 50856517     16   16  16   libc-2.24.so
    7f0e00802000 rw-p 00199000  08:02 50856517      8    8   8   libc-2.24.so
    7f0e00804000 rw-p 00000000  00:00        0     16   12  12
    7f0e00808000 r-xp 00000000  08:02 50856539     96   96   1   libpthread-2.24.so
    7f0e00820000 ---p 00018000  08:02 50856539   2044    0   0   libpthread-2.24.so
    7f0e00a1f000 r--p 00017000  08:02 50856539      4    4   4   libpthread-2.24.so
    7f0e00a20000 rw-p 00018000  08:02 50856539      4    4   4   libpthread-2.24.so
    7f0e00a21000 rw-p 00000000  00:00        0     16    4   4
    7f0e00a25000 r-xp 00000000  08:02 50856513    140  140   1   ld-2.24.so
    7f0e00c31000 rw-p 00000000  00:00        0     16   16  16
    7f0e00c48000 r--p 00023000  08:02 50856513      4    4   4   ld-2.24.so
    7f0e00c49000 rw-p 00024000  08:02 50856513      4    4   4   ld-2.24.so
    7f0e00c4a000 rw-p 00000000  00:00        0      4    4   4
    7ffe340be000 rw-p 00000000  00:00        0    132   12  12   [stack]
    7ffe3415c000 r--p 00000000  00:00        0      8    0   0   [vvar]
    7ffe3415e000 r-xp 00000000  00:00        0      8    4   0   [vdso]
ffffffffff600000 r-xp 00000000  00:00        0      4    0   0   [vsyscall]
                                               ====== ==== ===
                                               153800 2224 943

Pay attention to the two pairs of regions starting at 7f0df0000000 and 7f0df8000000: each pair totals 65536K (64M), of which 404K has rw-p (readable and writable) permission and 65132K has ---p (inaccessible, reserved address space) permission. When the two threads allocate memory, ptmalloc gives each of them a dynamic allocation area, requests 64M from the operating system at a time, and then carves the threads' allocations out of that 64M.

There is also a counter-intuitive phenomenon here. We use strace -f -e trace=brk,mmap,munmap -p {pid} to trace the system calls issued when malloc is called:

mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f624a169000
strace: Process 774601 attached
[pid 774018] mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f6249968000
[pid 774601] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f6241968000
[pid 774601] munmap(0x7f6241968000, 40468480) = 0
strace: Process 774602 attached
[pid 774601] munmap(0x7f6248000000, 26640384) = 0
[pid 774602] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f623c000000
[pid 774602] munmap(0x7f6240000000, 67108864) = 0

Here the 8M+4K mmap calls made by the main process [774018] allocate stacks for the new threads. Thread 1 [774601] first mmaps 128M of space, then returns the part from 0x7f6241968000 up to 0x7f6244000000 and the part beyond 0x7f6248000000, keeping the 64M in between. The purpose of allocating first and then returning part of it is to make the start and end addresses of the retained 64M block aligned to 64M.
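A quick arithmetic check of the trimming seen above; the addresses and lengths are taken from the strace output, and the static_asserts compile only if the retained region really is the 64M-aligned block.

#include <cstdint>

constexpr uint64_t kMmapStart = 0x7f6241968000;        // returned by mmap(128M)
constexpr uint64_t kFrontTrim = 40468480;              // length of the first munmap
constexpr uint64_t kKeepStart = kMmapStart + kFrontTrim;
constexpr uint64_t kKeepSize  = 64ULL * 1024 * 1024;   // HEAP_MAX_SIZE on 64-bit

static_assert(kKeepStart == 0x7f6244000000, "front trim ends where the kept region starts");
static_assert(kKeepStart % kKeepSize == 0, "the kept 64M region is aligned to 64M");

int main() { return 0; }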

3

Curve's memory management

Curve currently uses two allocators: ptmalloc and jemalloc. MDS uses the default ptmalloc, while Chunkserver and Client use jemalloc.

Here we come back to the two problems mentioned at the beginning of this article. The first is the slow growth of MDS memory; the symptom was a growth of roughly 3G per day. The analysis of this problem went as follows:

1. First use pmap to look at the memory layout of the process. The output contained a large number of fully resident 64M regions, which made us suspect a leak.

2815659:   /usr/bin/curve-mds -confPath=/etc/curve/mds.conf -mdsAddr=*.*.*.*:6666 -log_dir=/data/log/curve/mds -graceful_quit_on_sigterm=true -stderrthreshold=3
         Address Perm   Offset Device    Inode    Size    Rss    Pss Referenced Anonymous ShmemPmdMapped Shared_Hugetlb Private_Hugetlb Swap SwapPss Locked Mapping
      c000000000 rw-p 00000000  00:00        0    8192   4988   4988       4988      4988              0              0               0    0       0      0
      c000800000 rw-p 00000000  00:00        0   57344      0      0          0         0              0              0               0    0       0      0
    557c5abb6000 r-xp 00000000  41:42 55845493    9112   6488   6488       6488         0              0              0               0    0       0      0 /usr/bin/curve-mds
    557c5b49c000 r--p 008e5000  41:42 55845493     136    136    136        136       136              0              0               0    0       0      0 /usr/bin/curve-mds
    557c5b4be000 rw-p 00907000  41:42 55845493       4      4      4          4         4              0              0               0    0       0      0 /usr/bin/curve-mds
    557c5b4bf000 rw-p 00000000  00:00        0   10040   2224   2224       2224      2224              0              0               0    0       0      0
    557c5cce2000 rw-p 00000000  00:00        0    5604   5252   5252       5252      5252              0              0               0    0       0      0 [heap]
    7f837f7ff000 ---p 00000000  00:00        0       4      0      0          0         0              0              0               0    0       0      0
    7f837f800000 rw-p 00000000  00:00        0    8192      8      8          8         8              0              0               0    0       0      0
    7f8380000000 rw-p 00000000  00:00        0     132     12     12         12        12              0              0               0    0       0      0
......
    7fbcf8000000 rw-p 00000000  00:00        0   65536  65536  65536      65536     65536              0              0               0    0       0      0
    7fbcfc000000 rw-p 00000000  00:00        0   65536  65536  65536      65528     65536              0              0               0    0       0      0
    7fbd04000000 rw-p 00000000  00:00        0   65536  65536  65536      65520     65536              0              0               0    0       0      0
    7fbd08000000 rw-p 00000000  00:00        0   65536  65536  65536      65536     65536              0              0               0    0       0      0
    7fbd0c000000 rw-p 00000000  00:00        0   65536  65536  65536      65528     65536              0              0               0    0       0      0
    7fbd10000000 rw-p 00000000  00:00        0   65536  65536  65536      65524     65536              0              0               0    0       0      0
    7fbd14000000 rw-p 00000000  00:00        0   65536  65536  65536      65532     65536              0              0               0    0       0      0
    7fbd18000000 rw-p 00000000  00:00        0   65536  65536  65536      65536     65536              0              0               0    0       0      0
    7fbd1c000000 rw-p 00000000  00:00        0   65536  65536  65536      65524     65536              0              0               0    0       0      0
    7fbd20000000 rw-p 00000000  00:00        0   65536  65536  65536      65524     65536              0              0               0    0       0      0
    7fbd24000000 rw-p 00000000  00:00        0   65536  65536  65536      65512     65536              0              0               0    0       0      0
    7fbd28000000 rw-p 00000000  00:00        0   65536  65536  65536      65520     65536              0              0               0    0       0      0
    7fbd2c000000 rw-p 00000000  00:00        0   65536  65536  65536      65520     65536              0              0               0    0       0      0
    7fbd30000000 rw-p 00000000  00:00        0   65536  65536  65536      65516     65536              0              0               0    0       0      0
......
                                               ======= ====== ====== ========== ========= ============== ============== =============== ==== ======= ======
                                               7814504 272928 263610     272928    248772              0              0               0    0       0      0 KB

2. Check the pressure-related monitoring metrics on MDS. The pressure on MDS was small and the iops observed during the test were low, so the growth should not be caused by requests piling up on MDS under heavy pressure.

3. Use gdb -p {pid} to attach to the process, then dump memory mem.bin {addr1} {addr2} to dump the memory of a specified address range, and examine part of its content. The dumped content consisted largely of 01 strings, together with some metadata and other information, and could roughly be divided into a few kinds.

4. Go through the code with these clues and look for anything suspicious. The 01 strings are our file information, segment information and other metadata, which are serialized and encoded into key-value pairs; the interfaces that allocate this kind of memory are GetFileInfo and other similar interfaces. Examining these interfaces showed that the memory MDS receives data into through the C interface has to be released by the caller, and part of this memory had been forgotten when the code was written. The sketch below illustrates the pattern.
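A rough sketch of that pattern, with purely hypothetical names (the real code goes through the cgo-based etcd client): a buffer allocated with malloc on the C side must be freed by the caller once the data has been copied out, otherwise every call leaks the buffer.

#include <cstdlib>
#include <cstring>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a C interface that returns a malloc'ed buffer owned by the caller.
extern "C" char* GetEncodedFileInfo(const char* key, int* len) {
    const char* encoded = "0101...";              // pretend this is the encoded fileInfo
    *len = (int) strlen(encoded);
    char* buf = (char*) malloc(*len);
    memcpy(buf, encoded, *len);
    return buf;
}

std::string LookupFileInfo(const std::string& key) {
    int len = 0;
    char* buf = GetEncodedFileInfo(key.c_str(), &len);
    std::string value(buf, len);   // copy the data into C++-managed memory
    free(buf);                     // without this free, every lookup leaks the C-side buffer
    return value;
}

int main() {
    printf("%s\n", LookupFileInfo("/file1").c_str());
    return 0;
}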

During the investigation we also ran the valgrind tool once. The command:

valgrind --tool=memcheck --leak-check=full --show-reachable=yes --trace-children=yes --track-origins=yes /usr/bin/curve-mds -confPath=/etc/curve/mds.conf -mdsAddr=*.*.*.*:6666 -log_dir=/data/log/curve/mds -graceful_quit_on_sigterm=true -stderrthreshold=3

The output is very long; only part of it is excerpted here. The second leak record below, pointing at a cgo Cmalloc call, is a fairly clear hint:

==1559781== 13,440 bytes in 40 blocks are possibly lost in loss record 2,296 of 2,367
==1559781==    at 0x4C2DBC5: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1559781==    by 0x4011E31: allocate_dtv (dl-tls.c:322)
==1559781==    by 0x40127BD: _dl_allocate_tls (dl-tls.c:539)
==1559781==    by 0x628A189: allocate_stack (allocatestack.c:584)
==1559781==    by 0x628A189: pthread_create@@GLIBC_2.2.5 (pthread_create.c:663)
==1559781==    by 0x54F88D: bthread::TaskControl::add_workers(int) (in /usr/bin/curve-mds)
==1559781==    by 0x54147B: bthread_setconcurrency (in /usr/bin/curve-mds)
==1559781==    by 0x35A0C9: brpc::Server::Init(brpc::ServerOptions const*) (in /usr/bin/curve-mds)
==1559781==    by 0x35AAD7: brpc::Server::StartInternal(in_addr const&, brpc::PortRange const&, brpc::ServerOptions const*) (in /usr/bin/curve-mds)
==1559781==    by 0x35BCBC: brpc::Server::Start(butil::EndPoint const&, brpc::ServerOptions const*) (in /usr/bin/curve-mds)
==1559781==    by 0x35BD60: brpc::Server::Start(char const*, brpc::ServerOptions const*) (in /usr/bin/curve-mds)
==1559781==    by 0x19442B: curve::mds::MDS::StartServer() (in /usr/bin/curve-mds)
==1559781==    by 0x194A61: curve::mds::MDS::Run() (in /usr/bin/curve-mds)
==1559781==
==1559781== 85,608 bytes in 4,125 blocks are definitely lost in loss record 2,333 of 2,367
==1559781==    at 0x4C2BBAF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1559781==    by 0x56D6863: _cgo_728933f4f8ea_Cfunc__Cmalloc (_cgo_export.c:502)
==1559781==    by 0x51A9817: runtime.asmcgocall (/home/xuchaojie/github/curve/thirdparties/etcdclient/tmp/go/src/runtime/asm_amd64.s:635)
==1559781==    by 0xC00000077F: ???
==1559781==    by 0x300000005: ???
==1559781==
==1559781== LEAK SUMMARY:
==1559781==    definitely lost: 85,655 bytes in 4,128 blocks
==1559781==    indirectly lost: 0 bytes in 0 blocks
==1559781==      possibly lost: 94,392 bytes in 125 blocks
==1559781==    still reachable: 27,012,904 bytes in 13,518 blocks
==1559781==                       of which reachable via heuristic:
==1559781==                         newarray           : 3,136 bytes in 2 blocks
==1559781==                         multipleinheritance: 1,616 bytes in 1 blocks
==1559781==         suppressed: 0 bytes in 0 blocks
==1559781== Reachable blocks (those to which a pointer was found) are not shown.
==1559781== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==1559781==
==1559781== For counts of detected and suppressed errors, rerun with: -v
==1559781== ERROR SUMMARY: 10331 errors from 988 contexts (suppressed: 0 from 0)

Chunkserver did not use jemalloc from the beginning; initially it also used the default ptmalloc. The switch to jemalloc was made precisely because of the problem mentioned at the beginning of this article, namely that Chunkserver memory could not be released during testing. The symptom was that the Chunkserver's memory grew very quickly during the test, by roughly 50G in total, and was not released afterwards.

  • This is different from MDS, which mostly handles control-plane requests and caches some metadata. Memory growth on the Chunkserver generally comes from two places: requests sent by users, and data synchronization between the leader and followers of a replication group. Both paths involve brpc. brpc's memory management involves two modules, IOBuf and ResourcePool. The space in IOBuf is generally used to store user data; ResourcePool manages objects such as socket and bthread_id in units of 64K object blocks. The detailed data structures of these modules are not explained here; readers interested in brpc can read its documentation, and the details of ResourcePool can be seen in its source code.

  • Looking at the memory metrics of these two modules, we found that they handle memory in different ways: IOBuf returns the memory it has used back to the allocator, while the memory in ResourcePool is not returned to ptmalloc and is managed by ResourcePool itself.

  • Combined with the memory allocation strategy in Section 2: if the space at the top of the heap stays occupied, the freed space below the heap top cannot be returned to the operating system. This can be corroborated by looking at the amount of memory currently on the heap and the permissions of the mapped regions (for example, whether there is a lot of ---p memory). Switching Chunkserver to jemalloc later resolved this problem; a small sketch after this list shows one way to inspect jemalloc's view of memory.
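A minimal sketch, assuming jemalloc is linked into the process (for example via -ljemalloc or LD_PRELOAD), of dumping the allocator's statistics to see how much memory is active, resident, or retained by the allocator rather than returned to the OS:

#include <jemalloc/jemalloc.h>

int main() {
    // ... run the workload here ...
    // Print a human-readable report of arenas and allocator-held memory.
    malloc_stats_print(NULL, NULL, NULL);
    return 0;
}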

Through these two different problems, one in MDS and one in Chunkserver, we have described why the Curve project selects different memory allocators for different situations. If you have plenty of experience with memory allocators, you can settle the choice at the start of a project; if not, you can pick one first and then evaluate and adjust after analyzing the problems you run into. We hope this gives readers who encounter similar problems some ideas for solving them~


Source: blog.csdn.net/m0_72650596/article/details/126163846