Troubleshooting high CPU usage in Java applications

1. Background

Recently, our testers reported that API requests in the test environment occasionally timed out, after which the app showed a "network failure" prompt. Each time I checked the application in the test environment it looked perfectly healthy, so I had assumed it was a network problem.

The testers reported it again today, and I immediately checked the test server. This time the symptom was still there: CPU usage had shot up to 300%.

Now that the problem has reproduced, let's get straight to locating it.


2. Solutions

1. Locate the offending application process

Run the top command and sort by CPU usage. As shown in the figure below, the Java process with PID 13258 is using 300% CPU. top counts each core as 100%, so this is roughly three cores kept fully busy.

[Figure: top output sorted by CPU, showing the Java process with PID 13258 at 300% CPU]

The main reason for high CPU usage in a Java application is that some threads stay in the RUNNABLE state. Such threads are usually doing non-blocking work: loops, regular-expression matching, or pure computation. Another possible cause of high CPU usage is frequent GC.

So the next step is to check the status of the threads inside the process.
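The article does not show how to rule out the GC cause. One possible check, as a minimal sketch, is to sample the JDK's jstat tool a few times. The helper name gc_snapshot and its defaults (1000 ms interval, 5 samples) are illustrative assumptions, not part of the original troubleshooting session.

```shell
# gc_snapshot PID [INTERVAL_MS] [COUNT]
# Prints COUNT samples of GC utilisation for the JVM with the given PID,
# one sample every INTERVAL_MS milliseconds (hypothetical helper).
gc_snapshot() {
    jstat -gcutil "$1" "${2:-1000}" "${3:-5}"
}

# Example, using the PID from this article:
#   gc_snapshot 13258
```

If the FGC (full GC count) and GCT (total GC time) columns climb quickly between samples, GC is a likely contributor. If they stay flat, busy application threads are the more likely cause.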

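2. View per-thread CPU usage in the process

Use top -H -p <pid> to list the threads of that process, as shown in the figure below:

# -H: show resource usage for every individual thread.
# -p <pid>: monitor only the specified process.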
top -H -p 13258

[Figure: top -H -p 13258 output listing per-thread CPU usage]

Three threads show high real-time CPU usage and very long accumulated CPU time. Thread 25438 alone has occupied the processor for as long as 190 minutes.

That narrows the problem down: these three threads keep occupying CPU resources. Next, let's take a closer look at what such a thread is actually doing.
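When the thread list is long, the busy threads can be picked out of a single batch-mode sample of top instead of being read off the screen. The sketch below is only illustrative: hot_threads is a hypothetical helper, the threshold is arbitrary, and it assumes top's default column layout (%CPU in column 9, TIME+ in column 11), which a customised top configuration may change.

```shell
# hot_threads MIN_CPU
# Reads `top -H -b` output on stdin and prints "TID %CPU TIME+" for every
# thread whose %CPU is at or above MIN_CPU.
# Assumption: default top columns (%CPU is field 9, TIME+ is field 11).
hot_threads() {
    awk -v min="$1" '
        # Thread rows start with a numeric ID; header lines do not.
        $1 ~ /^[0-9]+$/ && ($9 + 0) >= (min + 0) { print $1, $9, $11 }
    '
}

# Example (13258 is the PID from this article; 50 is an arbitrary threshold):
#   LC_ALL=C top -H -b -n 1 -p 13258 | hot_threads 50
```

The first column of each printed row is the thread ID to feed into the next step.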

3. View a stack snapshot of the thread

Use the jstack command to view a snapshot of thread 14689:

jstack 13258 |grep "3961" -A 30

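[Figure: jstack output for the thread whose hexadecimal ID is 3961]

The argument after jstack is the PID of the process, and 3961 is thread ID 14689 written in hexadecimal, which is how the thread ID appears in the jstack output.

You can convert a thread ID to hexadecimal with the following command: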

 printf "%x\n" 14689
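For convenience, the conversion can be wrapped in two small shell functions, one for each direction. The function names are illustrative only.

```shell
# Decimal thread ID (as shown by top -H) to the hex form used when
# searching the jstack output.
tid_to_hex() {
    printf '%x\n' "$1"
}

# Hex ID from the jstack output back to the decimal thread ID shown by top.
hex_to_tid() {
    printf '%d\n' "0x$1"
}

# tid_to_hex 14689   prints 3961
# hex_to_tid 3961    prints 14689
```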

The output shows that the thread has been in the RUNNABLE state the whole time, and the stack trace leads directly to the corresponding code.
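The convert-then-grep steps can be combined into one helper, sketched below. dump_thread is a hypothetical name, and the sketch assumes jstack prints the native thread ID as nid=0x<hex>, the format used by JDK 8 through 17. If your JDK prints it differently, grep for the bare hex value as in the command above.

```shell
# dump_thread PID TID [LINES]
# Prints the jstack entry for native thread TID (decimal, as shown by
# top -H) in process PID, plus LINES lines of stack after it (default 30).
# Assumption: jstack reports the native thread ID as nid=0x<hex>.
dump_thread() (
    hex=$(printf '%x' "$2")
    jstack "$1" | grep -A "${3:-30}" "nid=0x$hex"
)

# Example, using the IDs from this article:
#   dump_thread 13258 14689
```

Matching on the nid= prefix rather than the bare number avoids accidental hits on addresses or line numbers that happen to contain the same digits.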

4. Finally

After finding the corresponding code and fixing it, the problem was essentially solved.


Origin blog.csdn.net/yucdsn/article/details/132652261