Troubleshooting Java application off-heap memory leaks | JD Cloud technical team

How the problem was discovered

Recently, a Java application was undergoing stress testing. The stress-test environment was configured as follows:
CentOS, 4-core CPU, 8 GB memory, jdk1.6.0_25, JVM options -server -Xms2048m -Xmx2048m
The following problem appeared:
With 300 concurrent requests sustained for 1 hour, memory usage rose from 20% to 100%, and TPS dropped from more than 1100 to just over 600.

Detailed process for troubleshooting

First, the top command was used to check memory usage:
[Figure: output of top showing memory usage]

Next, the Java heap memory distribution was checked. Heap usage was normal, and JVM garbage collection showed no abnormal behavior.
[Figure: Java heap memory distribution]

That pointed to an off-heap memory leak. The system uses many jsf interfaces, and jsf depends on netty at the bottom layer.

  • The first suspect was DirectByteBuffer in Java's nio package, which allocates off-heap memory directly. Memory allocated through this class is subject to a size limit, which can be set with -XX:MaxDirectMemorySize=1g. When that limit is reached, the JDK code explicitly calls System.gc() to trigger a Full GC. If there is still not enough memory afterwards, an OutOfMemoryError is thrown.

  • To verify this idea, the limit was set to 1 GB by adding -XX:MaxDirectMemorySize=1g to the startup parameters, and the stress test was run again. Memory kept growing and eventually exceeded the sum of the 2 GB heap and the 1 GB direct memory limit, yet no out-of-memory error appeared and Full GC did not run frequently. So the off-heap memory was probably not being consumed by nio's DirectByteBuffer.
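As a complementary check, the JVM can report how much memory its direct buffers hold. The sketch below is an illustration, not part of the original investigation: DirectMemoryProbe is a hypothetical name, and it relies on BufferPoolMXBean, which only exists from Java 7 onward, so it would not run on the jdk1.6.0_25 used in the test.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper: reads the JVM's own accounting of direct buffer memory.
final class DirectMemoryProbe {

    /** Bytes held by the "direct" buffer pool, or -1 if the JVM does not report it. */
    public static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();
        // Allocate 16 MB off-heap; this counts against -XX:MaxDirectMemorySize.
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        long after = directMemoryUsed();
        System.out.println("direct pool before: " + before + " bytes");
        System.out.println("direct pool after:  " + after + " bytes");
        System.out.println("buffer capacity:    " + buffer.capacity() + " bytes");
    }
}
```

If this figure stays small while the process's resident memory keeps climbing, the growth comes from some other native allocator, which is consistent with the conclusion above.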

To find out what was occupying the off-heap memory, the google-perftools tool was installed for analysis. It works by having the running Java application's malloc calls go to its libtcmalloc.so instead, which lets it collect allocation statistics.
The installation steps are as follows:

  • Download http://download.savannah.gnu.org/releases/libunwind/libunwind-0.99-beta.tar.gz,

  • ./configure

  • make

  • sudo make install //requires root privileges

  • Download http://google-perftools.googlecode.com/files/google-perftools-1.8.1.tar.gz,

  • ./configure --prefix=/home/admin/tools/perftools --enable-frame-pointers

  • make

  • sudo make install //requires root privileges

  • Update the ld.so configuration: sudo vi /etc/ld.so.conf.d/usr-local_lib.conf, and add /usr/local/lib (the directory where libunwind's libraries are installed)

  • Run sudo /sbin/ldconfig so that libunwind takes effect

  • Add before application start:

  • export LD_PRELOAD=/home/admin/tools/perftools/lib/libtcmalloc.so

  • export HEAPPROFILE=/home/admin/heap/gzip

  • Start the application, and you will see heap files such as gzip_pid.xxxx.heap under /home/admin/heap

  • Use /home/admin/tools/perftools/bin/pprof --text $JAVA_HOME/bin/java gzip_pid.xxxx.heap to view a heap file, for example:

  • /home/admin/tools/perftools/bin/pprof --text $JAVA_HOME/bin/java gzip_22366.0005.heap > gzip-0005.txt

  • The analysis results are as follows:

Total: 4504.5 MB
4413.9  98.0%  98.0%  4413.9  98.0%  zcalloc
  60.0   1.3%  99.3%    60.0   1.3%  os::malloc
  16.4   0.4%  99.7%    16.4   0.4%  ObjectSynchronizer::omAlloc
   8.7   0.2%  99.9%  4422.7  98.2%  Java_java_util_zip_Inflater_init
   4.7   0.1% 100.0%     4.7   0.1%  init
   0.3   0.0% 100.0%     0.3   0.0%  readCEN
   0.2   0.0% 100.0%     0.2   0.0%  instanceKlass::add_dependent_nmethod
   0.1   0.0% 100.0%     0.1   0.0%  _dl_allocate_tls
   0.0   0.0% 100.0%     0.0   0.0%  pthread_cond_wait@GLIBC_2.2.5
   0.0   0.0% 100.0%     1.7   0.0%  Thread::Thread
   0.0   0.0% 100.0%     0.0   0.0%  _dl_new_object
   0.0   0.0% 100.0%     0.0   0.0%  pthread_cond_timedwait@GLIBC_2.2.5
   0.0   0.0% 100.0%     0.0   0.0%  _dlerror_run
   0.0   0.0% 100.0%     0.0   0.0%  allocZip
   0.0   0.0% 100.0%     0.0   0.0%  __strdup
   0.0   0.0% 100.0%     0.0   0.0%  _nl_intern_locale_data
   0.0   0.0% 100.0%     0.0   0.0%  addMetaName
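Each row of the pprof --text output has six fields: memory allocated directly by the function (MB), that value as a percentage, a running sum of the percentages so far, memory allocated by the function plus everything it calls (MB), that value as a percentage, and the function name. The sketch below shows one way to read those rows and pick out the call paths that dominate the profile. PprofTextReader and its method names are hypothetical, written for this illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical reader for lines of `pprof --text` heap output.
final class PprofTextReader {

    /** The fields of one output row that matter when looking for the dominant allocator. */
    static final class Row {
        final double flatMb;        // allocated directly by this function
        final double cumulativeMb;  // allocated by this function and its callees
        final double cumulativePct; // cumulativeMb as a share of the total
        final String function;

        Row(double flatMb, double cumulativeMb, double cumulativePct, String function) {
            this.flatMb = flatMb;
            this.cumulativeMb = cumulativeMb;
            this.cumulativePct = cumulativePct;
            this.function = function;
        }
    }

    /** Parses one row; returns null for lines such as "Total: ..." that are not rows. */
    public static Row parseRow(String line) {
        String[] f = line.trim().split("\\s+", 6);
        if (f.length != 6) {
            return null;
        }
        try {
            return new Row(Double.parseDouble(f[0]), Double.parseDouble(f[3]), percent(f[4]), f[5]);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    private static double percent(String field) {
        String digits = field.endsWith("%") ? field.substring(0, field.length() - 1) : field;
        return Double.parseDouble(digits);
    }

    /** Rows whose cumulative share is at least minCumulativePct, in input order. */
    public static List<Row> heavyPaths(List<String> lines, double minCumulativePct) {
        List<Row> result = new ArrayList<Row>();
        for (String line : lines) {
            Row row = parseRow(line);
            if (row != null && row.cumulativePct >= minCumulativePct) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "Total: 4504.5 MB",
                "4413.9 98.0% 98.0% 4413.9 98.0% zcalloc",
                "60.0 1.3% 99.3% 60.0 1.3% os::malloc",
                "8.7 0.2% 99.9% 4422.7 98.2% Java_java_util_zip_Inflater_init");
        for (Row row : heavyPaths(sample, 90.0)) {
            System.out.println(row.function + " cumulative=" + row.cumulativeMb + " MB");
        }
    }
}
```

A function with a small direct value but a large cumulative value, such as Java_java_util_zip_Inflater_init here, is an entry point whose callees do the actual allocating.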

The output shows that almost all of the memory (about 98%) is allocated under the function Java_java_util_zip_Inflater_init. The corresponding Java source code is:

public GZIPInputStream(InputStream in, int size) throws IOException {
    super(in, new Inflater(true), size);
    usesDefaultInflater = true;
    readHeader(in);
}

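It turned out that Java's gzip compression and decompression classes were exhausting system memory. Tracing the source code further led to the SerializationUtils class of the jimdb client used in the system. The jimdb client uses this utility class to serialize and deserialize the keys and objects stored in jimdb, and it applies gzip compression and decompression when serializing and deserializing values of type Object. In other words, calls to the jimdb client's getObject and setObject methods use Java's GZIPInputStream and GZIPOutputStream internally. Under a high-concurrency stress test this causes a memory leak: memory keeps growing, and it is not released even after the stress test stops.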
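To illustrate why these classes can hold native memory: every GZIPInputStream and GZIPOutputStream creates its own Inflater or Deflater, whose zlib state lives outside the Java heap. That state is released when end() is called, which close() does for these streams; otherwise it waits for finalization. The sketch below is not the jimdb client's code. GzipCodec is a hypothetical helper, and whether SerializationUtils left streams unclosed is an assumption. It uses try-with-resources and UncheckedIOException, so it needs Java 8 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip round trip that always releases the native zlib state.
final class GzipCodec {

    public static byte[] compress(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // close() writes the gzip trailer and ends the stream's own Deflater.
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // close() ends the stream's own Inflater, freeing memory allocated in Inflater init.
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                sink.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = "off-heap off-heap off-heap".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(raw));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

On a JDK without try-with-resources, the same effect comes from calling close() in a finally block.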

How to solve the problem

1. Upgrade the JDK to jdk7u71. After a period of stress testing, memory growth slowed and then stabilized within a certain range, and it no longer exhausted all of the server's memory. This suggests the problem may be a bug in jdk1.6.
2. Avoid the getObject and setObject methods of the jimdb client where possible. If objects really must be stored, implement serialization and deserialization yourself and leave compression out: the objects are small, so compression saves little space. If compression is really needed, set a size threshold and compress only objects larger than it, rather than compressing every object.
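The threshold idea in item 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the jimdb client's format: ThresholdSerializer is a hypothetical class, the one-byte flag in front of the payload is an invented convention, and the threshold value is whatever the caller chooses. It needs Java 8 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical serializer: gzip is applied only to payloads above a size threshold.
final class ThresholdSerializer {
    private static final byte RAW = 0;   // flag byte: payload stored as-is
    private static final byte GZIP = 1;  // flag byte: payload stored gzip-compressed

    public static byte[] encode(Serializable value, int thresholdBytes) {
        try {
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(plain)) {
                out.writeObject(value);
            }
            byte[] payload = plain.toByteArray();
            boolean compress = payload.length > thresholdBytes;

            ByteArrayOutputStream stored = new ByteArrayOutputStream();
            stored.write(compress ? GZIP : RAW);
            if (compress) {
                // Closing the stream releases its Deflater's native memory.
                try (OutputStream gzip = new GZIPOutputStream(stored)) {
                    gzip.write(payload);
                }
            } else {
                stored.write(payload);
            }
            return stored.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static Object decode(byte[] stored) {
        InputStream raw = new ByteArrayInputStream(stored, 1, stored.length - 1);
        // Both streams are closed on exit, which also ends the Inflater if one was created.
        try (InputStream body = isCompressed(stored) ? new GZIPInputStream(raw) : raw;
             ObjectInputStream in = new ObjectInputStream(body)) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static boolean isCompressed(byte[] stored) {
        return stored[0] == GZIP;
    }
}
```

With a threshold of 1024 bytes, for example, `ThresholdSerializer.encode("small", 1024)` stores the value uncompressed and never touches zlib, so small objects on the hot path allocate no native memory at all.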

Author: Jingdong Retail Cao Zhifei

Source: JD Cloud Developer Community

Origin my.oschina.net/u/4090830/blog/10097265