What a shock! A Spring Boot memory leak, and the investigation turned out to be this hard!

Author | Ji Bing

Source | http://suo.im/5MABXL

Background

To manage projects better, our group migrated one project to the MDP framework (based on Spring Boot), after which we found the system frequently raising alarms that Swap usage was too high. I was asked to help look into it: the heap was configured at 4G, yet the physical memory actually in use was as high as 7G, which is clearly not normal. The JVM parameters were "-XX:MetaspaceSize=256M -XX:MaxMetaspaceSize=256M -XX:+AlwaysPreTouch -XX:ReservedCodeCacheSize=128m -XX:InitialCodeCacheSize=128m -Xss512k -Xmx4g -Xms4g -XX:+UseG1GC -XX:G1HeapRegionSize=4M", and the physical memory actually used is shown below:

Memory usage shown by the top command

Investigation process

1. Use Java-level tools to locate the memory region (heap memory, the code cache, or off-heap memory requested through unsafe.allocateMemory and DirectByteBuffer)

Add the -XX:NativeMemoryTracking=detail JVM parameter to the project, restart it, and use the command "jcmd pid VM.native_memory detail" to view the memory distribution, as follows:

Memory distribution shown by jcmd

We found that the committed memory shown by the command was smaller than the physical memory in use, because the memory jcmd reports includes the heap, the code cache, and memory requested through unsafe.allocateMemory and DirectByteBuffer, but does not include off-heap memory allocated by other native code (C code). So we guessed the problem was caused by memory requested by native code.
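
As a side note on what this tooling can and cannot see, here is a minimal sketch (the 64M size is just an illustrative value): off-heap memory obtained through DirectByteBuffer or Unsafe.allocateMemory is accounted for by the JVM and therefore shows up in jcmd's output, while malloc calls made inside arbitrary native libraries do not.

import java.nio.ByteBuffer;

public class TrackedOffHeapDemo {
    public static void main(String[] args) throws Exception {
        // This allocation lives outside the Java heap, but the JVM still accounts for it,
        // so it appears in "jcmd pid VM.native_memory" output (under Internal/Other,
        // depending on the JDK version).
        ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024 * 1024);

        // Memory malloc'ed by native code (for example inside zlib, reached via JNI) is
        // not accounted for here -- which is exactly the blind spot in this investigation.
        Thread.sleep(Long.MAX_VALUE); // keep the process alive so it can be inspected
    }
}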

To avoid a misjudgment, I used pmap to look at the memory distribution and found a large number of 64M address blocks; these address ranges were not among the ranges given by the jcmd output, so we could basically conclude that these 64M blocks were the cause.

Memory distribution shown by pmap

2. Use system-level tools to locate the off-heap memory

Since I had basically determined that native code was the cause, and Java-level tools are not much help for this kind of problem, the only option was to use system-level tools to locate the problem.

First, use gperftools to locate the problem

For how to use gperftools, refer to the gperftools documentation. Its monitoring output is as follows:

gperftools monitoring

As the figure shows, memory requested with malloc peaks at 3G, is then released, and afterwards stays at 700M-800M. My first reaction was: does the native code skip malloc and request memory directly with mmap/brk? (gperftools works by using dynamic linking to replace the operating system's default memory allocator (glibc).)

Then, use strace to trace system calls

Since this memory could not be tracked with gperftools, I directly used the command strace -f -e "brk,mmap,munmap" -p pid to trace the memory requests made to the OS, but found no suspicious allocations. The strace output is shown below:

strace monitoring

Next, use GDB to dump the suspicious memory

Since strace did not catch any suspicious memory requests, I decided to look at what was actually in the memory. I attached to the process with gdb -p pid, then used the command "dump memory mem.bin startAddress endAddress" to dump the memory; startAddress and endAddress can be found in /proc/pid/smaps. Then I used strings mem.bin to view the contents of the dump, as follows:

Contents of the dumped memory

Judging by the content, it looked like information from decompressed JAR packages. Reading JAR package information should happen when the project starts, so running strace after startup is not of much use. strace should therefore be run while the project is starting, not after startup has completed.

Again, use strace during project startup to trace system calls

Starting the project while tracing system calls with strace, I found a great many requests for 64M blocks of memory. A screenshot is shown below:

strace monitoring

The address space requested through mmap corresponds to the following in pmap:

The address space requested in strace, as it appears in pmap

Finally, use jstack to view the corresponding thread

The strace output already shows the ID of the thread requesting the memory, so I ran jstack pid to look at the thread stacks and found the corresponding thread's stack (note the conversion between decimal and hexadecimal), as follows:

Stack of the thread that requested the memory in strace
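
For reference, the thread ID printed by strace is decimal, while jstack prints the native thread ID in the hexadecimal nid field, so a quick conversion is needed to match the two. A trivial sketch:

public class TidToNid {
    public static void main(String[] args) {
        long tid = Long.parseLong(args[0]);  // decimal LWP id taken from the strace output
        // jstack lines look like "... nid=0x4b2f ...", so compare against the hex form
        System.out.println("nid=0x" + Long.toHexString(tid));
    }
}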

Here the problem is basically visible: MCC (Meituan's unified configuration center) uses Reflections to scan packages, and underneath that Spring Boot is used to load the JARs. Because decompressing JARs uses the Inflater class, which needs off-heap memory, I then used Btrace to trace the callers of this class, as follows:

btrace stack trace

Then, checking where MCC was used, I found that no scan-package path was configured, so the default is to scan all packages. The code was modified to configure a specific scan-package path, and after the change went live the memory problem was solved; a small illustration of the difference follows.
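
MCC is an internal Meituan component, so its exact configuration API is not reproduced here; the underlying Reflections library is enough to illustrate the difference. A rough sketch, in which the package name com.example.myapp is purely hypothetical:

import org.reflections.Reflections;
import org.reflections.util.ClasspathHelper;
import org.reflections.util.ConfigurationBuilder;

public class ScanScopeDemo {
    public static void main(String[] args) {
        // Unscoped scan: every URL on the class path is visited, so every nested JAR
        // has to be opened and inflated -- this is where the off-heap Inflater memory goes.
        Reflections everything = new Reflections(new ConfigurationBuilder()
                .setUrls(ClasspathHelper.forJavaClassPath()));

        // Scoped scan: only the application's own package (hypothetical name) is visited,
        // which bounds the amount of JAR decompression work.
        Reflections scoped = new Reflections("com.example.myapp");

        System.out.println("Both scans finished: " + everything + ", " + scoped);
    }
}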

3. Why is the off-heap memory not released?

Although the problem had been solved, a few questions remained:

  • Why is there no problem with the old framework?

  • Why is the off-heap memory not released?

  • Why are the memory blocks all 64M? JAR files are nowhere near that big, and the blocks are all the same size.

  • Why does gperftools show final memory usage of only about 700M? Does decompressing the packages really not use malloc to request memory?

With these questions in mind, I read through the relevant Spring Boot Loader source code. It turns out that Spring Boot wraps the JDK's InflaterInputStream and uses Inflater, and Inflater itself needs off-heap memory to decompress JAR packages. The wrapper class, ZipInflaterInputStream, does not release the off-heap memory held by the Inflater. Believing I had found the cause, I immediately reported the bug to the Spring Boot community. After reporting it, however, I noticed that the Inflater object itself implements a finalize method, and that method contains the logic to release the off-heap memory. In other words, Spring Boot relies on GC to release the off-heap memory.
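
The eventual fix mentioned at the end of this article boils down to releasing the Inflater eagerly in close() instead of waiting for finalize(). A rough sketch of that pattern (not Spring Boot's actual class, just the shape of the fix):

import java.io.IOException;
import java.io.InputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

// Sketch of the "release on close" pattern: when a stream creates its own Inflater,
// the JDK's InflaterInputStream.close() will not call end() on it, so zlib's native
// (off-heap) buffers stay alive until GC runs finalize(). Ending the Inflater in
// close() returns that memory immediately.
class EagerInflaterInputStream extends InflaterInputStream {

    EagerInflaterInputStream(InputStream in) {
        super(in, new Inflater(true), 512);
    }

    @Override
    public void close() throws IOException {
        try {
            super.close();
        } finally {
            inf.end(); // safe to call even if end() has already run
        }
    }
}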

When I used jmap to look at the objects in the heap, I found there were almost no Inflater objects left. So I began to suspect that GC was not calling finalize. With that suspicion, I replaced the Inflater wrapped inside Spring Boot Loader with a wrapped Inflater of my own and instrumented finalize; the result showed that the finalize method was indeed called. So I went on to read the C code behind Inflater and found that initialization requests memory with malloc, and that end() does call free to release it.
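
The probe described above used Btrace; a standalone way to get the same signal (on JDK 8, where Inflater still has a finalize() that simply delegates to end()) is to swap in a logging subclass, roughly like this:

import java.util.zip.Inflater;

// Logging substitute for the Inflater created by the loader, used only to observe
// whether end() is ever reached via finalize() during GC.
class TracingInflater extends Inflater {

    TracingInflater(boolean nowrap) {
        super(nowrap);
    }

    @Override
    public void end() {
        // Reached either from an explicit call or from Inflater.finalize() on JDK 8.
        System.out.println("Inflater.end() on thread " + Thread.currentThread().getName());
        super.end();
    }

    @Override
    protected void finalize() {
        System.out.println("Inflater.finalize() invoked by GC");
        super.finalize(); // on JDK 8 this simply calls end()
    }
}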

At that point I could only suspect that free was not actually releasing the memory when called, so I replaced Spring Boot's wrapped InflaterInputStream with the one that comes with the Java JDK, and found that after the replacement the memory problem was indeed resolved.

At this point, going back to look at the gperftools memory distribution, I found that when Spring Boot was used, memory usage kept climbing and then at one point suddenly dropped a great deal (straight from 3G down to about 700M). That point should be when GC occurred, and the memory should have been released then, but no change was visible at the operating-system level. Could it be that the memory was not released back to the operating system and was instead held by the memory allocator?

Digging further, I found that the memory address distribution with the system's default allocator (glibc 2.12) differed markedly from the distribution when using gperftools; using smaps, the 2.5G of addresses turned out to belong to the native stack. The memory address distribution is as follows:

Memory address distribution shown with gperftools

At this point it could basically be determined that the memory allocator was up to mischief. Searching for glibc and 64M, I found that glibc introduced a per-thread memory pool (arena) starting with version 2.11, 64M in size on 64-bit machines. The description reads as follows:

glibc's explanation of the memory pool

I modified the MALLOC_ARENA_MAX environment variable as that article suggested, but it had no effect. I also looked at tcmalloc (the memory allocator gperftools uses) and found that it uses a memory-pool approach as well.

To verify that the memory pool was the culprit, I simply wrote a memory allocator with no memory pool. Build it into a shared library with the command gcc zjbmalloc.c -fPIC -shared -o zjbmalloc.so, then replace the glibc memory allocator with export LD_PRELOAD=zjbmalloc.so. The demo code is as follows:

#include <sys/mman.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

// The author's machine is 64-bit, so sizeof(size_t) == sizeof(long).
void* malloc(size_t size)
{
    long* ptr = mmap(0, size + sizeof(long), PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ptr == MAP_FAILED) {
        return NULL;
    }
    *ptr = size;                    // First 8 bytes store the length.
    return (void*)(&ptr[1]);        // Return the memory just after the length field.
}

void* calloc(size_t n, size_t size)
{
    void* ptr = malloc(n * size);
    if (ptr == NULL) {
        return NULL;
    }
    memset(ptr, 0, n * size);
    return ptr;
}

void* realloc(void* ptr, size_t size)
{
    if (size == 0) {
        free(ptr);
        return NULL;
    }
    if (ptr == NULL) {
        return malloc(size);
    }
    long* plen = (long*)ptr;
    plen--;                         // Step back to the stored length field.
    long len = *plen;
    if (size <= len) {
        return ptr;
    }
    void* rptr = malloc(size);
    if (rptr == NULL) {
        free(ptr);
        return NULL;
    }
    rptr = memcpy(rptr, ptr, len);
    free(ptr);
    return rptr;
}

void free(void* ptr)
{
    if (ptr == NULL) {
        return;
    }
    long* plen = (long*)ptr;
    plen--;                         // Step back to the stored length field.
    long len = *plen;               // Read the length.
    munmap((void*)plen, len + sizeof(long));
}

By instrumenting the custom allocator, I could see that after the program started, the off-heap memory actually requested stayed between 700M and 800M, and gperftools likewise showed memory usage of roughly 700M-800M. From the operating system's perspective, however, the memory occupied by the process differs greatly (only off-heap memory is being monitored here).

I ran a small test: with different allocators and different scan-package scopes, the memory occupied is as follows:

Memory test comparison

Why does the custom malloc request 800M yet end up occupying 1.7G of physical memory?

Because the custom memory allocator allocates memory with mmap, and mmap rounds each request up to a whole number of pages, there is enormous space waste. Monitoring showed that the final number of pages requested was around 536k, so the memory actually requested from the system comes to roughly 512k * 4k (page size) = 2G. Why is this figure larger than the 1.7G actually occupied?

Because the operating system allocates lazily: when memory is requested from the system via mmap, the system only returns an address range and does not allocate real physical memory. Only when the memory is actually used does the system raise a page fault and then allocate an actual physical page.

Summary

flow chart

The whole memory allocation process is shown in the figure above. MCC's default configuration is to scan all JAR packages. During the scan, Spring Boot does not actively release the off-heap memory, so off-heap memory usage keeps soaring during the scanning phase. When GC occurs, Spring Boot relies on the finalize mechanism to release the off-heap memory; but for performance reasons glibc does not actually return that memory to the operating system and instead keeps it in its memory pool, which makes the application layer appear to have a "memory leak". So the problem was solved by changing the MCC configuration to scan a specific JAR-package path. At the time this article was published, the latest version of Spring Boot (2.0.5.RELEASE) had already changed ZipInflaterInputStream to release the off-heap memory actively rather than relying on GC, so upgrading Spring Boot to the latest version also resolves the problem.
