Stack Analysis - Server Problem Location

First, locating common server problems:

1. Common server problems:

In daily development, common server problems can be classified into the following categories:

  • High CPU usage
  • High memory usage
  • Disk IO problems
  • Network problems

2. General approach to locating server problems:

When a problem occurs on a server, we can generally locate it along the following lines:

  • ① If a version has been released recently, first check the commit history to see whether the latest code changes introduced the problem
  • ② Use commands such as ps and top to analyze the process and thread status
  • ③ Take the abnormal server offline first while preserving its environment, then use jstack to export a thread dump and jmap to export a heap dump for analysis. These commands affect the performance of a running server, which is why the server is taken offline before executing them, to avoid impacting users

        Disk IO problems and network problems can usually be located quickly, so they are not discussed in this article. Here we mainly introduce how to locate and solve high CPU usage and high Java memory usage.

Second, locating high CPU usage:

1. Common CPU-intensive operations in Java:

  • Frequent GC: if traffic is heavy and memory is allocated too quickly, frequent GC or even full GC (FGC) may occur, which drives CPU usage up
  • Thread context switching: a large number of threads keep switching between Blocked (lock waiting, IO waiting, etc.) and Running; this happens easily when lock contention is high
  • Threads performing non-blocking busy operations, such as infinite loops
  • Serialization and deserialization
  • Regular expression matching (catastrophic backtracking; see the sketch after this list)

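As an illustration of the last bullet, the sketch below shows how a pathological regular expression can pin a CPU core through catastrophic backtracking; the pattern and input are made up for demonstration:

import java.util.regex.Pattern;

public class RegexBacktracking {
    public static void main(String[] args) {
        // Nested quantifiers such as (a+)+ force the regex engine to try an
        // exponential number of ways to split the run of 'a's once the match fails.
        Pattern pattern = Pattern.compile("(a+)+$");
        String input = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!"; // long run of 'a's plus a tail that cannot match
        // This call keeps the thread in the RUNNABLE state at ~100% CPU for a long time.
        System.out.println(pattern.matcher(input).matches());
    }
}
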
2. Steps to locate high CPU usage in a Java program:

(1) Use the top command to find the thread ID that consumes the most CPU:

top -Hp <pid>

(2) Convert the thread ID that consumes the most CPU to hexadecimal, because thread numbers (nid) in the thread dump are recorded in hexadecimal; for example, thread ID 3140 becomes c44:

printf '%x\n' <thread-id>

(3) Take the server in question offline, export a thread dump with jstack, and check what the thread with the hexadecimal ID found in step (2) is doing:

jstack <pid> | grep <hex-thread-id> -A 30

        At this point, we have identified what the abnormal thread is doing and can apply a targeted fix.

Third, locating high Java memory usage:

        High memory usage is generally caused by memory leaks or out-of-memory conditions. First check the thread dump for anomalies, then use jmap -dump to export a heap dump and see which objects occupy the most space. The cause is often careless or imprecise code, or too many threads being created.

Export the heap dump: jmap -dump:format=b,file=<filename> <pid>

Then analyze the dump with IBM HeapAnalyzer or Eclipse Memory Analyzer (MAT).
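
As a hypothetical example of the kind of careless code that causes a leak, the sketch below keeps strong references in a static collection, so the buffers are never reclaimed and would dominate the heap dump; the class and field names are invented:

import java.util.ArrayList;
import java.util.List;

public class LeakExample {
    // A static collection holds strong references forever, so nothing added here
    // is ever garbage-collected; the heap dump would show it dominating memory.
    private static final List<byte[]> CACHE = new ArrayList<>();

    public static void handleRequest() {
        // Each "request" caches 1 MB and never evicts it.
        CACHE.add(new byte[1024 * 1024]);
    }

    public static void main(String[] args) throws InterruptedException {
        while (true) {
            handleRequest();
            Thread.sleep(10); // heap usage climbs steadily until OutOfMemoryError
        }
    }
}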

Fourth, analyzing the jstack thread dump:

4.1. Thread states shown in the stack trace:

4.1.1. Deadlock: multiple threads each hold resources the others need and wait for each other indefinitely, so all of them block.

4.1.2. Runnable: the thread has everything it needs to run and is either running or ready to run.

4.1.3. Waiting for monitor entry and in Object.wait(): the monitor is the main mechanism behind Java's synchronized keyword for mutual exclusion and cooperation between threads; it can be thought of as an object or class lock. Each object has exactly one monitor, and a monitor can be owned by only one thread at a time (the "Active Thread"); all other threads are "Waiting Threads", held in two sets, the "EntrySet" and the "WaitSet". A thread waiting in the EntrySet shows the state "waiting for monitor entry", and a thread waiting in the WaitSet shows "in Object.wait()". The code protected by synchronized is the critical section: when a thread asks to enter it, it joins the EntrySet; after it acquires the monitor and enters the critical section, if it finds that the condition it needs is not satisfied, it calls the object's wait() method, releases the monitor, and moves into the WaitSet. Only when another thread calls notify() or notifyAll() on that object do the threads in the WaitSet get a chance to compete for the monitor again.
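
A minimal sketch of these two states, using an invented shared lock object: the consumer thread calls wait() and will appear as "in Object.wait()", while any thread blocked trying to enter the same synchronized block would appear as "waiting for monitor entry":

public class MonitorStates {
    private static final Object LOCK = new Object();
    private static boolean ready = false;

    public static void main(String[] args) {
        // Consumer: acquires the monitor, finds the condition unmet, calls wait(),
        // releases the monitor and moves into the WaitSet ("in Object.wait()").
        Thread consumer = new Thread(() -> {
            synchronized (LOCK) {
                while (!ready) {
                    try {
                        LOCK.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
                System.out.println("condition met, continuing");
            }
        }, "consumer");

        // Producer: a thread stuck at this synchronized block while the monitor is
        // held would show "waiting for monitor entry"; once inside, it sets the
        // condition and wakes up the threads in the WaitSet.
        Thread producer = new Thread(() -> {
            synchronized (LOCK) {
                ready = true;
                LOCK.notifyAll();
            }
        }, "producer");

        consumer.start();
        producer.start();
    }
}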

4.1.4. Waiting on condition: the thread is waiting for a resource or for some condition to occur; this needs to be analyzed together with the stack trace:

(1) The most common case is that the thread is sleeping and waiting to be woken up.

(2) Another common case is waiting on network IO. Before NIO was introduced, each network connection had a corresponding thread handling its reads and writes; even with no data to read or write, the thread stayed blocked on the read/write call. After the introduction of NIO, if a large number of threads are found blocked on the network, it may be a symptom of a network bottleneck, because the network blockage prevents the threads from executing:

  • One possibility is that the network is very busy, consuming almost all of the bandwidth, with a large amount of data still waiting to be read or written;
  • Another possibility is that the network is idle but, due to problems such as routing, packets cannot arrive normally.

4.1.5. Blocked: during execution, the thread has been waiting a long time for a resource it needs without obtaining it and is marked as blocked by the container's thread manager; it can be understood as a thread whose wait for a resource has timed out.

4.2. Patterns to watch for in the thread dump:

(1) IO threads that stay RUNNABLE:

        IO operations can block while showing the RUNNABLE state, for example during database deadlocks or network reads and writes, so pay special attention to the real state of IO threads. Generally speaking, an IO call captured in the RUNNABLE state is suspicious.

        The following stack shows a thread in the RUNNABLE state with its call stack in SocketInputStream/SocketImpl methods such as socketRead0; because the stack also contains JDBC-related packages, a database deadlock has likely occurred:

"d&a-614" daemon prio=6 tid=0x0000000022f1f000 nid=0x37c8 runnable
[0x0000000027cbd000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
at oracle.net.ns.Packet.receive(Packet.java:240)
at oracle.net.ns.DataPacket.receive(DataPacket.java:92)
at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:172)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:117)
at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1034)
at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:588)

The following stack shows the corresponding case for a thread pool thread:

"http-bio-8082-exec-3858" #50615 daemon prio=5 os_prio=0 tid=0x00007f7cc002f800 nid=0xc5c0 runnable [0x00007f7c34659000]
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:171)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:516)
	at org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:501)
	at org.apache.coyote.http11.Http11Processor.setRequestLineReadTimeout(Http11Processor.java:167)
	at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:946)
	at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
	at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:315)
	- locked <0x0000000093db7ce8> (a org.apache.tomcat.util.net.SocketWrapper)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

(2) Deadlock:

        jstack detects Java-level deadlocks for us and reports them directly in the dump, so they can be located immediately and then analyzed and fixed.
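
A minimal, hypothetical example that produces such a report: two threads acquire the same pair of locks in opposite order, and jstack prints a "Found one Java-level deadlock" section for it:

public class DeadlockDemo {
    private static final Object LOCK_A = new Object();
    private static final Object LOCK_B = new Object();

    public static void main(String[] args) {
        // Thread 1 takes A then B; thread 2 takes B then A.
        // Each ends up holding one lock and waiting for the other forever.
        new Thread(() -> lockInOrder(LOCK_A, LOCK_B), "thread-1").start();
        new Thread(() -> lockInOrder(LOCK_B, LOCK_A), "thread-2").start();
    }

    private static void lockInOrder(Object first, Object second) {
        synchronized (first) {
            sleepQuietly(100); // give the other thread time to grab its first lock
            synchronized (second) {
                System.out.println(Thread.currentThread().getName() + " got both locks");
            }
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}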

(3) Infinite loop:

        An infinite loop shows up as soaring CPU usage: checking the thread statistics reveals a thread with a long accumulated CPU time and high CPU usage, and printing the thread dump shows that the same thread stays in the RUNNABLE state the whole time.
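
A hypothetical sketch of this pattern: a worker thread spins on a flag without ever blocking, so it never leaves RUNNABLE and burns a full core:

public class BusyLoop {
    private static volatile boolean stop = false;

    public static void main(String[] args) {
        Thread worker = new Thread(() -> {
            // No sleep, no wait, no IO: the thread stays RUNNABLE and consumes
            // ~100% of one CPU core until 'stop' becomes true.
            while (!stop) {
                // spin
            }
        }, "busy-worker");
        worker.start();
        // In a real service, 'stop' might never be set, and top -Hp / jstack
        // would show this thread at the top with state RUNNABLE.
    }
}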

(4) Overly frequent thread context switching:

① The JVM has a thread scheduler that decides which thread runs at which time. There are two main scheduling models: preemptive scheduling and cooperative scheduling. Each thread may have its own priority, but a high priority does not guarantee that the thread will be scheduled; which thread actually runs is ultimately decided by the scheduler.

  • Preemptive scheduling: while a thread is executing its task, even though the task is not finished, it can be forcibly suspended so that other threads get the CPU.
  • Cooperative scheduling: a running thread cannot be interrupted in the middle of its task; it must finish and release the CPU itself before other threads can take over.

② This type of problem usually shows up as both CPU and memory soaring, with a large number of threads in the jstack output in the WAITING or TIMED_WAITING state. Because the CPU coordinates thread scheduling and allocates time slices to each thread, handling this scheduling too frequently drives the CPU load up.
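
A hypothetical sketch of this pattern: far more threads than CPU cores, all contending for a single lock, so at any moment most of them are parked (showing WAITING in jstack) and the system spends a large share of its time switching between them:

import java.util.concurrent.locks.ReentrantLock;

public class ContextSwitchStorm {
    private static final ReentrantLock LOCK = new ReentrantLock();
    private static long counter = 0;

    public static void main(String[] args) {
        // Hundreds of threads fight over one lock: each does a tiny amount of
        // work, then blocks again, forcing constant parking, unparking and
        // context switching between threads.
        for (int i = 0; i < 500; i++) {
            new Thread(() -> {
                while (true) {
                    LOCK.lock();
                    try {
                        counter++;
                    } finally {
                        LOCK.unlock();
                    }
                }
            }, "contender-" + i).start();
        }
    }
}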

(5) A large number of GC threads:

        This type of problem shows up as a large number of GC threads in the jstack output, with the GC threads consuming the most CPU; this is what is usually meant by frequent GC:

The jstat command can show how each part of the heap is used and how many classes are loaded. The command format is as follows:

jstat -<option> <vmid> [interval(ms) [count]]

root@8d36124607a0:/# jstat -gcutil 9 1000 10
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00   0.00   0.00  75.07  59.09  59.60   3259    0.919  6517    7.715    8.635
  0.00   0.00   0.00   0.08  59.09  59.60   3306    0.930  6611    7.822    8.752
  0.00   0.00   0.00   0.08  59.09  59.60   3351    0.943  6701    7.924    8.867
  0.00   0.00   0.00   0.08  59.09  59.60   3397    0.955  6793    8.029    8.984
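
In this sample the FGC column climbs from 6517 to 6793 over a few one-second intervals, i.e. roughly 90 full GCs per second, which is a clear sign of allocation pressure. The hypothetical sketch below reproduces this kind of GC churn when run with a deliberately small heap (for example -Xmx64m, an assumption for the demo); watch YGC and FGC climb in jstat -gcutil:

import java.util.ArrayList;
import java.util.List;

public class GcPressure {
    public static void main(String[] args) {
        List<byte[]> retained = new ArrayList<>();
        while (true) {
            // Rapid allocation: most arrays die young and drive frequent young GCs,
            // while the retained ones keep the old generation nearly full, so full
            // GCs fire again and again.
            byte[] chunk = new byte[128 * 1024];
            retained.add(chunk);
            if (retained.size() > 300) {   // cap live data at roughly 300 * 128 KB ≈ 38 MB
                retained.remove(0);        // evict the oldest entry so the loop never runs out of heap
            }
        }
    }
}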

Source: blog.csdn.net/a745233700/article/details/122660058