Reprinted: How to Troubleshoot Java Online Problems?

This article is reproduced in: here

1. Handling process for Java online problems.

   The process has three main steps, and it applies to most applications. The first priority is to restore service quickly; after that we locate and solve the problem, and finally we work on preventing it from happening again.

  1. Quick recovery
  2. Problem location and resolution
  3. Problem prevention

   This article focuses on locating the problem: when an online application misbehaves, how do we find the root cause?

2. Problem location

   When an online application has a problem, the first thing to do is check the logs. Usually the logs reflect the problem directly (if they do not, you may need to reconsider whether your logging is adequate). However,
   in some cases the logs do not show the problem, so we have to investigate ourselves, which is the purpose of this article. The investigation has three main levels: machine level, process level and thread level.

1. Machine level.

(1) Check the CPU usage of the machine.

Command: top
us: CPU percentage occupied by user space, here 7.3%
sy: CPU percentage occupied by kernel space, here 2.0%
ni: CPU percentage occupied by user-space processes whose priority (nice value) has been changed, here 0.0%
id: idle CPU percentage, here 90.4%
wa: percentage of CPU time spent waiting for input/output, here 0.3%

load average: the average system load. The three values are the averages over the last 1 minute, 5 minutes and 15 minutes. As a rule of thumb, when this value exceeds the number of CPU cores, the CPU is under relatively heavy load.
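
The rule of thumb above can be turned into a small check. This is a minimal sketch, assuming a Linux machine where /proc/loadavg and nproc are available; the function name load_exceeds_cpus is made up for illustration.

```shell
# Hypothetical helper: succeeds (exit status 0) when the given load average
# is greater than the given number of CPU cores, and fails otherwise.
load_exceeds_cpus() {
    awk -v l="$1" -v c="$2" 'BEGIN { exit !(l > c) }'
}

# Usage on Linux, guarded so it is skipped where the inputs do not exist.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one_min five_min fifteen_min rest < /proc/loadavg
    if load_exceeds_cpus "$one_min" "$(nproc)"; then
        echo "1-minute load $one_min exceeds $(nproc) cores"
    else
        echo "1-minute load $one_min is within $(nproc) cores"
    fi
fi
```

Comparing against the 5-minute or 15-minute value instead gives a less jumpy signal; which one to use is a judgment call.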

(2) Check the memory usage of the machine.

Command: free -h
total: total physical memory size.
used: how much memory has been used.
free: how much memory is unused.
shared: the total amount of memory shared by multiple processes.
buffers/cached: the amount of memory used for buffers and the disk cache.

(3) Check the use of the hard disk of the machine:

Command: df -h
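
To spot nearly full disks without reading the whole table, the output can be filtered. The following is a sketch only: it assumes the POSIX column layout printed by `df -P` (capacity in column 5, mount point in column 6) and mount points without spaces; full_filesystems is a hypothetical name.

```shell
# Hypothetical helper: reads `df -P` output on stdin and prints the mount
# point and usage of every filesystem at or above the given percentage.
full_filesystems() {
    awk -v t="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                 # "90%" -> "90"
        if (use + 0 >= t + 0) print $6, $5
    }'
}

# Usage: df -P | full_filesystems 80
```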

(4) Check the disk IO status of the machine (iostat reports CPU and disk device statistics, not network traffic):

Command: iostat

2. Process level (overall situation).

(1) First get the process ID.

Command: ps -ef | grep appname

(2) Check the CPU and memory occupied by the process.

Command: ps aux | grep <process ID>
USER: user
PID: process number
%CPU: CPU percentage occupied by the process when executing the command
%MEM: memory percentage occupied by the process when executing the command
VSZ: virtual memory occupied by the process (usually not of interest)
RSS: physical memory occupied by the process (the memory actually in use, in KB)
TTY: terminal number
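
Because grep on a process ID can also match unrelated lines, it can help to select the row by the PID column. This is a minimal sketch, assuming the usual `ps aux` column order (PID in column 2, %CPU in 3, %MEM in 4, RSS in 6); proc_usage is a hypothetical name.

```shell
# Hypothetical helper: reads `ps aux` output on stdin and prints CPU, memory
# and RSS (converted from KB to MB) for the row whose PID column matches.
proc_usage() {
    awk -v pid="$1" '$2 == pid {
        printf "cpu=%s%% mem=%s%% rss_mb=%.1f\n", $3, $4, $6 / 1024
    }'
}

# Usage: ps aux | proc_usage <process ID>
```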

3. Thread level and process internal memory usage analysis.

(1) Check what the threads are doing, that is, take a thread snapshot.

Command: jstack process ID
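
A thread snapshot can be long, so a quick summary of thread states is often a useful first look. The sketch below assumes the dump contains lines of the form `java.lang.Thread.State: RUNNABLE`, as HotSpot's jstack prints them; thread_state_counts is a hypothetical name.

```shell
# Hypothetical helper: reads a jstack dump on stdin and prints how many
# threads are in each state, sorted by state name.
thread_state_counts() {
    awk '$1 == "java.lang.Thread.State:" { count[$2]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Usage: jstack <process ID> | thread_state_counts
```

A large number of BLOCKED threads, or many threads waiting at the same place, is usually where to start reading the full dump.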

(2) Check the JVM old generation and new generation usage.

Command: jmap -heap <process ID> (on JDK 8; on JDK 9 and later the equivalent is jhsdb jmap --heap --pid <process ID>)
This command prints a heap summary, including the GC algorithm in use, the heap configuration and the memory usage of each memory area.

(3) Check which class uses the most memory

Command: jmap -histo:live process ID

(4) Check the detailed GC situation.

Command: jstat -gcutil <process ID>
S0, S1: usage percentage of the two Survivor areas
E: usage percentage of the Eden area (part of the new generation)
O: usage percentage of the Old area (old generation)
M: usage percentage of Metaspace
YGC: number of Minor GCs
YGCT: total time spent in Minor GC, in seconds
FGC: number of Full GCs
FGCT: total time spent in Full GC, in seconds
GCT: total time spent in GC, in seconds

Note: YGCT and FGCT above are the total GC time of the new generation and the old generation respectively. To calculate the average time of a Full GC, use FGCT/FGC. If the average is greater than 1s, you need to consider optimization.
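
The FGCT/FGC calculation can be scripted so it does not have to be done by hand. This is a sketch under the assumption that the input is the two-line output of `jstat -gcutil` (a header line followed by one data line) and that the header contains FGC and FGCT columns; avg_full_gc_seconds is a hypothetical name.

```shell
# Hypothetical helper: reads `jstat -gcutil` output on stdin, locates the
# FGC and FGCT columns by header name, and prints FGCT / FGC in seconds.
# Prints "n/a" when no Full GC has happened yet, to avoid dividing by zero.
avg_full_gc_seconds() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
         NR == 2 {
             fgc = $(col["FGC"])
             fgct = $(col["FGCT"])
             if (fgc > 0) printf "%.3f\n", fgct / fgc
             else print "n/a"
         }'
}

# Usage: jstat -gcutil <process ID> | avg_full_gc_seconds
```

Looking the columns up by name rather than by position keeps the helper working when the column set differs between JDK versions.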

After the above steps, the problem can usually be located. If what you find is, for example, high CPU usage or an OOM (memory overflow) in the process, you can check the detailed steps in this article: Analysis of OOM and high CPU usage

Origin blog.csdn.net/qq_36256590/article/details/132433302