Four steps to analyze and fix a JVM memory leak in a production environment

Author: Unfinished Symphony

The symptom

It started in the production environment of our in-house log platform, a Java application: a large number of HTTP interface requests were timing out. We logged in to the Linux server and found nothing wrong with the network, so we concluded that the application itself was misbehaving. Restarting the application did not help, so we set out to find the cause.

Initial diagnosis

We ran jstat -gcutil to check the JVM's GC activity and memory usage:
[Figure: jstat -gcutil output]

The output showed that the old generation's memory usage was far too high and that each full GC failed to reclaim it effectively. The old generation's usage percentage changed as follows:
[Figure: old-generation memory usage over time]

Our preliminary conclusion was that the mass request timeouts and the eventual service outage had this direct cause:
memory usage kept rising after each full GC
memory usage grew faster and faster
full GCs became more and more frequent
occupancy finally reached 100% and the service was completely paralyzed
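
The first two signs above can be checked mechanically once the old-generation percentages (the O column of jstat -gcutil) have been noted after each full GC. The following is a minimal sketch only: OldGenTrend and its method names are hypothetical, and the readings in main are made-up values, not the figures from this incident.

```java
final class OldGenTrend {

    /** True when every reading is strictly higher than the one before it. */
    static boolean risesAfterEveryFullGc(double[] oldGenPercent) {
        if (oldGenPercent.length < 2) {
            return false; // not enough readings to speak of a trend
        }
        for (int i = 1; i < oldGenPercent.length; i++) {
            if (oldGenPercent[i] <= oldGenPercent[i - 1]) {
                return false;
            }
        }
        return true;
    }

    /** True when each increase is larger than the increase before it. */
    static boolean growthAccelerates(double[] oldGenPercent) {
        if (oldGenPercent.length < 3) {
            return false; // need at least two steps to compare
        }
        for (int i = 2; i < oldGenPercent.length; i++) {
            double previousStep = oldGenPercent[i - 1] - oldGenPercent[i - 2];
            double currentStep = oldGenPercent[i] - oldGenPercent[i - 1];
            if (currentStep <= previousStep) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical old-generation percentages taken after four full GCs.
        double[] readings = {40.0, 45.0, 55.0, 70.0};
        System.out.println("rises after every full GC: " + risesAfterEveryFullGc(readings));
        System.out.println("growth accelerates: " + growthAccelerates(readings));
    }
}
```

A healthy application usually drops back to roughly the same old-generation level after a full GC, so a series that only ever rises is a reasonable hint, though not proof, of a leak.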

Analysis and fix

We ran jmap -histo:live *** | more to see the instance counts and sizes of the objects in heap memory:
[Figure: jmap -histo:live output]
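
Paging through a long histogram with more gets tedious. As a rough sketch, the Java below parses text in the usual jmap -histo column layout (rank, instance count, byte count, class name) and sorts the classes by bytes. HistoParser and Row are hypothetical helpers, not part of the JDK, and the sample text in main is invented for illustration; it is not the output captured during this incident.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class HistoParser {

    /** One data row of the histogram. */
    static final class Row {
        final long instances;
        final long bytes;
        final String className;

        Row(long instances, long bytes, String className) {
            this.instances = instances;
            this.bytes = bytes;
            this.className = className;
        }
    }

    /** Parses histogram text and returns its rows, largest byte count first. */
    static List<Row> parse(String histoOutput) {
        List<Row> rows = new ArrayList<>();
        for (String line : histoOutput.split("\\R")) {
            String[] tokens = line.trim().split("\\s+");
            // Data rows start with a rank such as "1:"; the header, the
            // separator line and the "Total" line do not, so they are skipped.
            if (tokens.length < 4 || !tokens[0].endsWith(":")) {
                continue;
            }
            rows.add(new Row(Long.parseLong(tokens[1]), Long.parseLong(tokens[2]), tokens[3]));
        }
        rows.sort(Comparator.comparingLong((Row r) -> r.bytes).reversed());
        return rows;
    }

    public static void main(String[] args) {
        // Invented sample in the jmap -histo layout.
        String sample =
                " num     #instances         #bytes  class name\n"
              + "----------------------------------------------\n"
              + "   1:          1000          64000  [C\n"
              + "   2:        500000       40000000  org.apache.logging.log4j.core.impl.Log4jLogEvent\n"
              + "Total        501000       40064000\n";
        for (Row row : parse(sample)) {
            System.out.println(row.bytes + " bytes in " + row.instances + " x " + row.className);
        }
    }
}
```

Sorting by bytes rather than by instance count puts the classes that actually hold the memory at the top, which is usually the more useful view when the old generation is full.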

We found a very large number of Log4jLogEvent instances, and they also occupied an extremely large amount of memory. Our preliminary analysis was that asynchronous log shipping could not keep up, so log event objects were piling up in memory.
We tried adjusting the parameters Flume uses to ship logs: we increased the amount Flume sends in a single transmission and reduced the maximum delay.
After restarting the application and monitoring the interface calls, we found that the application was back to normal for the time being.
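
Why a larger batch helps can be seen with some back-of-the-envelope arithmetic. The sketch below rests on a deliberately simplified assumption that is not taken from the original setup: every transmission costs one fixed round trip regardless of how many events it carries. LogBacklogModel, its parameters and all numbers are hypothetical.

```java
final class LogBacklogModel {

    /** Events shipped per second when each batch costs one fixed round trip. */
    static double drainRatePerSecond(int batchSize, double roundTripMillis) {
        return batchSize * 1000.0 / roundTripMillis;
    }

    /** Events still queued in memory after the given time (never negative). */
    static double backlogAfter(double produceRatePerSecond, int batchSize,
                               double roundTripMillis, double seconds) {
        double surplus = produceRatePerSecond - drainRatePerSecond(batchSize, roundTripMillis);
        return Math.max(0.0, surplus * seconds);
    }

    public static void main(String[] args) {
        // Hypothetical load: 5000 events/s produced, 50 ms per round trip.
        System.out.println("batch of 100: " + backlogAfter(5000, 100, 50, 60) + " events queued after 60 s");
        System.out.println("batch of 500: " + backlogAfter(5000, 500, 50, 60) + " events queued after 60 s");
    }
}
```

Under this model, the backlog grows without bound whenever events are produced faster than they are shipped, which is consistent with the pile-up of Log4jLogEvent objects seen in the histogram. Real senders also have retries, timeouts and network limits that the model ignores.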

Follow-up analysis

While working on the previous step, we also exported the live heap with jmap -dump:format=b,file=heapDump.hprof (the dump is relatively slow, which is why the temporary fix was already in place before this analysis began) and analyzed the memory structure with MAT:
[Figure: MAT analysis of the heap dump]

The main objects occupying the heap are clearly visible, and they confirm that the problem really was a backlog in Flume's asynchronous log shipping.

Summary

Solving a JVM memory leak like this one comes down mainly to making good use of the analysis tools the JVM provides, such as jstat and jmap, to locate the problem. Although this particular problem is solved, similar ones may well recur. So besides strengthening our JVM troubleshooting skills, we have put building an application monitoring platform on the agenda, so that JVM memory, threads, and other performance indicators can be monitored in real time and problems detected early.
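
As one possible starting point for such monitoring, the standard java.lang.management API exposes heap usage, per-pool usage after the last collection, and the live thread count from inside the application. This is only a sketch of what could be collected; JvmSnapshot is a hypothetical class name, and how the values would be shipped to a monitoring platform is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

final class JvmSnapshot {

    /** Heap in use as a percentage of the maximum heap, or -1 if the maximum is undefined. */
    static double heapUsedPercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        return max <= 0 ? -1.0 : 100.0 * heap.getUsed() / max;
    }

    /** Number of live threads, daemon and non-daemon. */
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Human-readable summary, including per-pool usage after the most recent GC. */
    static String describe() {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("heap used: %.1f%%, live threads: %d%n",
                heapUsedPercent(), liveThreadCount()));
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache and other non-heap pools
            }
            // May be null when the pool does not support this or no GC has run yet.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                sb.append(String.format("%s after last GC: %d bytes used%n",
                        pool.getName(), afterGc.getUsed()));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

The usage after the last collection is the interesting number for leak detection: if it keeps climbing for the old-generation pool, the situation resembles the one described in this article.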



Origin blog.51cto.com/14442094/2426782