Notes on a JVM GC storm that drove CPU sky-high, and how it was solved

A record of handling a Java virtual machine CPU spike

Our production web server would intermittently become extremely sluggish. Logging into the server and running the top command showed that CPU usage was very high.

After restarting Tomcat the CPU returned to normal, but half a day or a day later the same problem would show up again.

To solve the problem you first have to find its trigger point, and an intermittent problem is very hard to pin down.

After a restart, all we could do was wait for the problem to recur. At this point my first suspicions were either a scheduled job kicking off a large amount of computation, or a request triggering an infinite loop.

So first I went through the suspect parts of the code, added some logging, and the next afternoon the problem happened again.

The first move was to preserve the crime scene. Since production runs on two nodes, I restarted one node to restore service, and took the other node offline without restarting it so the scene stayed intact.

First I checked the application logs on the server and found no large volume of repeated log lines, which initially ruled out an infinite loop; that left only the JVM to analyze.

Step one: run the top command to see which PID is eating CPU.

 

This screenshot was taken after the fact; at the time the CPU had soared above 500%, and the PID was 27683.

Then I ran ps aux | grep 27683 to confirm that it was our Tomcat process occupying the CPU, which it basically had to be, since the CPU dropped right back down whenever Tomcat was restarted.

You can also use jps to show the PIDs of Java processes.

Step two: run top -H -p 27683 to list the threads under PID 27683, along with each thread's CPU time and usage percentage. It turned out many threads had high CPU usage, so they had to be checked one by one.

 

Step three: use jstack to view the thread information. The command jstack 27683 >> aaaa.txt dumps the stacks to a text file you can filter, or you can pipe directly: jstack 27683 | grep "6c23". In the dump the thread ids are shown in hexadecimal, so the decimal tid from top needs converting; you can do that with printf "%x\n" tid, or just use a calculator.
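If it's more convenient, the same conversion is a one-liner in Java; a minimal sketch (27686 is simply the decimal form of the 6c26 thread id that comes up below):

    // Convert the decimal thread id shown by top -H into the hex form used in jstack output.
    public class TidToHex {
        public static void main(String[] args) {
            long tid = 27686;                          // decimal tid from top -H -p <pid>
            System.out.println(Long.toHexString(tid)); // prints "6c26"
        }
    }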

The tragedy is that this led me down the wrong path at first. When I searched for thread 6c26, I found it was doing GC: frantic GC was what drove the CPU so high. But I couldn't find what was producing so many objects, so I kept hunting for every possible infinite loop and every possible memory leak.

 

You can then watch GC activity second by second with jstat -gcutil [PID] 1000 100 (sample every 1000 ms, 100 samples).

 

This screenshot was also re-taken afterwards; I didn't capture one at the time.

Survivor space S0 kept filling with newly created objects, then GC ran, then it filled again, over and over. Looking at the heap, I found nothing unusual beyond String objects and maps dominating; I couldn't identify any infinite-loop code or find the trigger point, and I fell into complete confusion. Nor could I find confirmation of a memory leak. After struggling fruitlessly, I was lost in thought.

The CPU stayed stubbornly high. At a loss, I ran jstack 27683 again and browsed the thread stacks aimlessly, and then noticed something: the node I was on had been taken offline, so no users could be reaching it, yet the CPU stayed high and thread stacks kept printing. Whatever threads were still running were probably the culprit. I immediately analyzed those stack traces, and made a major discovery.

 

One thread showed up in large numbers in the dump: httpProxy_jsp, a jsp thread that was constantly active. What the hell was this jsp? Was the server under attack? I went straight to the codebase and found that the jsp genuinely existed; the git commit log showed it had been committed by a colleague a few days earlier, which matched the time the problem first appeared very well. Quietly pleased that the trouble spot had probably been found, I called the colleague over to analyze the code together. The jsp itself was very simple: it acted as a proxy, making an HTTP request to a backend service.

 
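The jsp code screenshot isn't reproduced here, but going by the description above, a proxy page of this shape would look roughly like the following sketch (the page layout and parameter names are my assumptions; only httpProxy_jsp and HttpRequestUtil come from the article):

    <%-- httpProxy.jsp: sketch of a pass-through proxy page --%>
    <%@ page import="com.example.util.HttpRequestUtil" %>
    <%
        // Forward the incoming request to a backend service and echo the response.
        String targetUrl = request.getParameter("url");
        String body = request.getParameter("body");
        out.print(HttpRequestUtil.post(targetUrl, body));
    %>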

HttpRequestUtil is shown below. It was a utility the colleague had written himself rather than one of the common, well-tested libraries, and its post method set neither a connect timeout nor a read timeout on the connection:

 
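The screenshot of the utility is likewise lost; here is a minimal sketch of a post method with exactly this flaw, assuming it was built directly on HttpURLConnection (the structure is my reconstruction, not the colleague's actual code):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class HttpRequestUtil {
        // Flawed version: neither connect nor read timeout is ever set, so both
        // stay at the default of 0, which HttpURLConnection treats as "wait forever".
        public static String post(String url, String body) throws IOException {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            try (OutputStream os = conn.getOutputStream()) {
                os.write(body.getBytes(StandardCharsets.UTF_8));
            }
            // If the remote side never responds, this read blocks indefinitely
            // and the servlet thread is stuck for good.
            StringBuilder sb = new StringBuilder();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    sb.append(line);
                }
            }
            return sb.toString();
        }
    }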

And there was a fatal problem: the HTTP request set no timeout. If you don't set one, a timeout of 0 is treated as infinite, meaning the request never times out. If the third-party page doesn't respond, or responds very slowly, the request just keeps waiting; the next request comes in and waits too, so these threads pile up and get stuck. The stuck threads weren't idle, either; they kept doing work that generated lots of objects, which drove the JVM into non-stop frantic GC, pushed the server CPU to its limit, and made the server's responses extremely slow. This matched the symptoms of the problem exactly. We rewrote this spot to add a 2-second timeout limit, and the problem never occurred again.
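The fix itself amounts to two lines on the connection; continuing the sketch above (the 2-second figure is the limit we actually added; setConnectTimeout and setReadTimeout are the standard java.net.URLConnection calls for this):

    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
    conn.setConnectTimeout(2000); // give up if the TCP connection takes longer than 2s
    conn.setReadTimeout(2000);    // give up if the response stalls for more than 2s

With these set, a dead or slow backend makes the request fail fast with an exception instead of pinning a Tomcat thread forever.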

Some takeaways from the troubleshooting process:

1. JVM problems are complex, and the root cause may not be visible in the logs; solving them takes some luck. Analyzing the business scenario, reading logs, and even blind guessing are all legitimate ways to attack a problem. Don't tunnel in on a single method; combine the evidence at the scene with some conjecture.

2. The root symptom was the CPU spike. At first I kept thinking about the code, suspecting an infinite loop, and later a memory leak, and so took many detours; in the end it was the dumbest method, reading through the thread stacks like logs, that surfaced the problem. There is no uniform recipe: specific problems need specific analysis, and past experience shouldn't be applied rigidly.

3. When writing code, prefer the common utilities already in wide use in the project. Avoid introducing your own untested code; even a seemingly simple snippet can bring disaster to the project unless you are fully confident you understand the underlying behavior, the timeout settings here being a case in point.
----------------
Disclaimer: this is an original article by the CSDN blogger "super tomato egg", licensed under the CC 4.0 BY-SA agreement; for reproduction, please attach the original source link and this statement.
Original link: https://blog.csdn.net/hotthought/article/details/82987428
