Precious experience! Spring Boot leaks memory again, and it is really hard to troubleshoot!

Background

In order to manage projects better, our group migrated one project to the MDP framework (based on Spring Boot), and then found that the system frequently reported alarms for excessive Swap usage. The author was called in to help find the cause and discovered that although 4G of heap memory was configured, the physical memory actually used was as high as 7G, which is clearly abnormal. The JVM parameters were "-XX:MetaspaceSize=256M -XX:MaxMetaspaceSize=256M -XX:+AlwaysPreTouch -XX:ReservedCodeCacheSize=128m -XX:InitialCodeCacheSize=128m -Xss512k -Xmx4g -Xms4g -XX:+UseG1GC -XX:G1HeapRegionSize=4M", and the physical memory actually used is shown in the figure below:

Memory status displayed by top command

Investigation process

1. Use Java-level tools to locate the memory area (in-heap memory, Code area, or off-heap memory requested by unsafe.allocateMemory and DirectByteBuffer)

The author added the -XX:NativeMemoryTracking=detail JVM parameter to the project, restarted it, and then used the command jcmd pid VM.native_memory detail to view the memory distribution, as follows:

Memory status displayed by jcmd

It was found that the committed memory shown by the command is smaller than the physical memory actually used, because the memory reported by jcmd includes the heap, the Code area, and memory requested through unsafe.allocateMemory and DirectByteBuffer, but does not include off-heap memory requested by other native code (C code). So the guess was that the problem was caused by native code requesting memory.
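For reference, the two kinds of off-heap allocation that NMT does account for are direct buffers and Unsafe allocations. A minimal sketch, with arbitrary sizes, of what such tracked allocations look like in Java (memory malloc'ed inside native C libraries such as zlib never goes through these paths):

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;
    import sun.misc.Unsafe;

    public class TrackedOffHeapAllocations {
        public static void main(String[] args) throws Exception {
            // A direct buffer's backing memory is requested off-heap and shows up in NMT output.
            ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024 * 1024);

            // Unsafe.allocateMemory is the other off-heap path that NMT accounts for.
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Unsafe unsafe = (Unsafe) f.get(null);
            long address = unsafe.allocateMemory(64 * 1024 * 1024);
            unsafe.freeMemory(address);

            System.out.println("direct buffer capacity: " + direct.capacity());
        }
    }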

To avoid misjudging, the author used pmap to check the memory distribution and found a large number of 64M blocks, and these address ranges were not among the ranges reported by jcmd, so it could basically be concluded that these 64M blocks were responsible.

Memory status displayed by pmap

2. Use system-level tools to locate off-heap memory

Because the author had basically determined that the memory was being requested by native code, and Java-level tools are not convenient for troubleshooting such problems, only system-level tools could be used to locate the problem.

First, use gperftools to locate the problem

For how to use gperftools, please refer to the gperftools documentation. The gperftools monitoring output is as follows:

gperftools monitoring

As can be seen from the figure above, the memory requested through malloc grows to at most 3G, is then released, and afterwards stays at 700M-800M. The author's first reaction was: does the native code skip malloc entirely and request memory directly with mmap/brk? (gperftools works by using dynamic linking to replace the operating system's default memory allocator (glibc).)

Then, use strace to track system calls

Because gperftools did not catch the memory, the author directly used the command strace -f -e "brk,mmap,munmap" -p pid to trace the memory requests made to the OS, but no suspicious memory allocation was found. The strace monitoring is shown in the following figure:

strace monitoring

Next, use GDB to dump the suspicious memory

Because strace did not catch any suspicious memory request, the author wanted to look at the memory contents directly. The command gdb -pid pid was used to attach GDB, and then the command dump memory mem.bin startAddress endAddress was used to dump the memory, where startAddress and endAddress can be found in /proc/pid/smaps. Then strings mem.bin was used to view the contents of the dump, as follows:

Contents of the dumped memory

Judging from the content, it looks like decompressed JAR package information. The JAR information should be read when the project starts, so running strace after the project has already started is not very useful. strace should therefore be run while the project is starting, not after startup has completed.

Again, use strace to track system calls while the project starts

Using strace to track system calls during project startup, it was found that a lot of 64M memory blocks were indeed being requested. The screenshot is as follows:

strace monitoring

The address range requested by this mmap call corresponds to the following region in pmap:

The pmap entry corresponding to the address range requested in strace

Finally, use jstack to view the corresponding thread

Because the strace output already shows the ID of the thread requesting the memory, the command jstack pid can be used directly to view the thread stacks and find the corresponding thread (note the conversion between decimal and hexadecimal), as follows:

Thread stack of the thread that requested the memory in strace
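As a side note, strace reports the native thread ID (LWP) in decimal, while jstack prints it as a hexadecimal nid, so a quick conversion is needed. A trivial sketch (the thread ID below is made up):

    public class TidToNid {
        public static void main(String[] args) {
            long lwp = 28765;  // decimal thread ID as shown by strace (hypothetical value)
            // jstack prints the same ID in the "nid=0x..." field of each thread.
            System.out.println("nid=0x" + Long.toHexString(lwp));
        }
    }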

At this point the problem can basically be seen: MCC (Meituan's unified configuration center) uses Reflections to scan packages, and underneath it uses Spring Boot Loader to load JAR files. Because the Inflater class needs off-heap memory to decompress the JARs, BTrace was then used to trace this class. The stack is as follows:

btrace trace stack

Then the places where MCC is used were checked, and it turned out that no package-scanning path was configured, so by default all packages were scanned. The code was therefore modified to configure a package-scanning path, and after the fix went online the memory problem was solved.
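The article does not show MCC's actual configuration API; purely as a rough illustration of the idea, restricting a Reflections scan to a concrete package prefix (the package name below is hypothetical) instead of scanning everything looks roughly like this:

    import java.util.Set;

    import org.reflections.Reflections;

    public class RestrictedScanExample {
        public static void main(String[] args) {
            // Scanning with no prefix walks every JAR on the classpath and forces them
            // all to be decompressed; a concrete prefix keeps the scan small.
            Reflections reflections = new Reflections("com.example.myapp"); // hypothetical package
            Set<Class<? extends Object>> types = reflections.getSubTypesOf(Object.class);
            System.out.println("types found: " + types.size());
        }
    }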

3. Why is the off-heap memory not released?

Although the problem has been solved, several questions remain:

  • Why was there no such problem with the old framework?

  • Why is the off-heap memory not released?

  • Why are the memory blocks all 64M? The JAR files cannot be that big, so why are they all the same size?

  • Why does gperftools show that the memory finally in use is only about 700M? Does the decompression code really not use malloc to request memory?

With these doubts in mind, the author looked directly at the source code of Spring Boot Loader. It turns out that Spring Boot wraps the JDK's InflaterInputStream and uses Inflater, and Inflater itself needs off-heap memory to decompress JAR files. The wrapping class ZipInflaterInputStream did not release the off-heap memory held by the Inflater. Thinking the cause had been found, the author immediately reported this bug to the Spring Boot community. But after the report, the author noticed that the Inflater object itself implements the finalize method, which contains the logic for releasing the off-heap memory. In other words, Spring Boot relies on GC to release the off-heap memory.
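For context, a minimal sketch of how an Inflater holds native (zlib) memory and how that memory can be released explicitly with end() rather than waiting for GC to run finalize (the input data below is made up):

    import java.util.zip.Deflater;
    import java.util.zip.Inflater;

    public class InflaterEndDemo {
        public static void main(String[] args) throws Exception {
            byte[] input = "hello hello hello".getBytes("UTF-8");

            // Compress something first so there is data to inflate.
            Deflater deflater = new Deflater();
            deflater.setInput(input);
            deflater.finish();
            byte[] compressed = new byte[128];
            int compressedLen = deflater.deflate(compressed);
            deflater.end(); // releases the Deflater's native memory

            Inflater inflater = new Inflater(); // allocates off-heap memory through zlib
            try {
                inflater.setInput(compressed, 0, compressedLen);
                byte[] output = new byte[128];
                int n = inflater.inflate(output);
                System.out.println(new String(output, 0, n, "UTF-8"));
            } finally {
                // Releases the off-heap memory immediately; without this call it is only
                // freed when GC eventually runs the Inflater's finalize().
                inflater.end();
            }
        }
    }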

When the author used jmap to look at the objects in the heap, there were basically no Inflater objects left, so the suspicion was that finalize was not being called during GC. With this suspicion, the author wrapped the Inflater used in Spring Boot Loader, replaced it with the wrapped version, added some monitoring in finalize, and found that the finalize method was indeed called. The author then looked at the C code behind Inflater and found that it uses malloc to request memory during initialization and calls free to release the memory at the end.
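The article does not show that wrapper; a minimal sketch of the idea, with made-up log messages, could look like this:

    import java.util.zip.Inflater;

    // Hypothetical monitoring wrapper: behaves exactly like Inflater, but logs when the
    // native memory is released, either explicitly via end() or by GC via finalize().
    public class MonitoredInflater extends Inflater {
        public MonitoredInflater(boolean nowrap) {
            super(nowrap);
        }

        @Override
        public void end() {
            System.out.println("MonitoredInflater.end(): native memory released");
            super.end();
        }

        @Override
        protected void finalize() {
            System.out.println("MonitoredInflater.finalize(): called by GC");
            end(); // on JDK 8 this is exactly what Inflater's own finalize() does
        }
    }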

At this point the author could only suspect that free does not really return the memory to the operating system, so the InflaterInputStream wrapped by Spring Boot was replaced with the one that comes with the JDK, and after this replacement the memory problem was indeed also solved.

Looking back at the memory distribution reported by gperftools, it was found that when Spring Boot is used, memory usage keeps increasing and then, at a certain point, suddenly drops a lot (from about 3G down to about 700M). That point should be a GC: the memory should have been released, yet no change was visible at the operating system level. Was the memory not returned to the operating system but held by the memory allocator instead?

Continuing to dig, the author found that the memory address distribution of the system's default memory allocator (glibc 2.12) is very different from that seen with gperftools: the 2.5G of addresses in question were found via smaps to belong to the native stack. The memory addresses are distributed as follows:

Memory address distribution displayed by gperftools

At this point it can basically be confirmed that the memory allocator is playing tricks. Searching for "glibc 64M", the author found that starting from version 2.11 glibc introduces a per-thread memory pool (arena), 64M in size on 64-bit machines. The original text is as follows:

glibc memory pool description

The MALLOC_ARENA_MAX environment variable was modified as that article suggests, but it had no effect. It also turned out that tcmalloc (the memory allocator used by gperftools) uses a memory-pool approach as well.

In order to verify that the memory pool was the culprit, the author simply wrote a memory allocator without a memory pool. Use the command gcc zjbmalloc.c -fPIC -shared -o zjbmalloc.so to build it as a dynamic library, and then use export LD_PRELOAD=zjbmalloc.so to replace the glibc memory allocator. The demo code is as follows:

 
    #include <sys/mman.h>
    #include <stdlib.h>
    #include <string.h>
    #include <stdio.h>

    // The author uses a 64-bit machine, so sizeof(size_t) equals sizeof(long).
    void* malloc(size_t size)
    {
        // Request the block straight from the OS; the first 8 bytes store the length.
        long* ptr = mmap(0, size + sizeof(long), PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (ptr == MAP_FAILED) {
            return NULL;
        }
        *ptr = size;                 // First 8 bytes contain the length.
        return (void*)(&ptr[1]);     // Memory that is after the length variable.
    }

    void* calloc(size_t n, size_t size)
    {
        void* ptr = malloc(n * size);
        if (ptr == NULL) {
            return NULL;
        }
        memset(ptr, 0, n * size);
        return ptr;
    }

    void* realloc(void* ptr, size_t size)
    {
        if (size == 0) {
            free(ptr);
            return NULL;
        }
        if (ptr == NULL) {
            return malloc(size);
        }
        long* plen = (long*)ptr;
        plen--;                      // Step back to the stored length.
        long len = *plen;
        if (size <= len) {
            return ptr;
        }
        void* rptr = malloc(size);
        if (rptr == NULL) {
            free(ptr);
            return NULL;
        }
        rptr = memcpy(rptr, ptr, len);
        free(ptr);
        return rptr;
    }

    void free(void* ptr)
    {
        if (ptr == NULL) {
            return;
        }
        long* plen = (long*)ptr;
        plen--;                      // Step back to the stored length.
        long len = *plen;            // Read the length.
        munmap((void*)plen, len + sizeof(long));
    }

By adding instrumentation points to the custom allocator, it can be seen that the off-heap memory actually requested by the application after startup always stays between 700M and 800M, while gperftools monitoring also shows memory usage of around 700M-800M. But from the operating system's point of view, the memory occupied by the process is very different (only off-heap memory is monitored here).

The author ran a test using different allocators and different scanning scopes; the memory occupied is as follows:

Memory test comparison

Why does the custom malloc request 800M but end up occupying 1.7G of physical memory?

Because the custom memory allocator uses mmap to allocate memory, and mmap rounds each allocation up to a whole number of pages, there is a huge amount of wasted space. Monitoring showed that about 536k pages were requested in the end, so the memory requested from the system is roughly 536k * 4k (page size) ≈ 2G. Why is this figure larger than the 1.7G actually occupied?

Because the operating system allocates memory lazily: when memory is requested from the system through mmap, the system only returns an address range and does not allocate real physical memory. Only when the memory is actually used does the system raise a page fault and allocate the actual physical page.

Summary

Flow chart

The entire memory allocation process is shown in the figure above. MCC's default configuration is to scan all JAR packages. During scanning, Spring Boot does not actively release the off-heap memory, so the off-heap memory usage keeps rising during the scanning phase. When GC occurs, Spring Boot relies on the finalize mechanism to release the off-heap memory; however, for performance reasons, glibc does not really return that memory to the operating system but keeps it in its memory pool, which makes the application layer believe a "memory leak" has occurred. So the MCC configuration was changed to scan only the specific JAR packages needed, and the problem was solved. By the time this article was published, the author found that the latest version of Spring Boot (2.0.5.RELEASE) had already been fixed: ZipInflaterInputStream actively releases the off-heap memory and no longer depends on GC, so upgrading Spring Boot to the latest version can also solve the problem.
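The article does not quote the changed Spring Boot source, but the general shape of such a fix is shown below as a hypothetical sketch: the JDK's InflaterInputStream only calls end() on an Inflater it created itself, so a stream that is handed its own Inflater has to release the native memory in close() rather than leaving it to GC/finalize.

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.zip.Inflater;
    import java.util.zip.InflaterInputStream;

    // Hypothetical illustration, not Spring Boot's actual code.
    public class EagerInflaterInputStream extends InflaterInputStream {
        private final Inflater inflater;

        public EagerInflaterInputStream(InputStream in, Inflater inflater) {
            super(in, inflater);
            this.inflater = inflater;
        }

        @Override
        public void close() throws IOException {
            super.close();
            inflater.end(); // release zlib's off-heap memory as soon as the stream is closed
        }
    }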


Origin blog.csdn.net/sinat_37903468/article/details/108730256