JVM series: performance tuning cases

1. Basic problems of tuning

1.1. Why tuning?

  • Prevent OOM: plan and pre-tune the JVM
  • Resolve the various OOMs that occur while the program is running
  • Reduce the frequency of Full GC to resolve slowness and stalls

1.2. General direction of tuning

  • Write sensible code
  • Make full and reasonable use of hardware resources
  • Tune the JVM sensibly

1.3. Considerations at different stages

  • Before going live
  • While the project is running
  • When an OOM occurs in production

1.4. What monitoring and tuning rely on

  • Runtime logs
  • Exception stack traces
  • GC logs
  • Thread snapshots (thread dumps)
  • Heap dump snapshots

1.5. Performance optimization steps

1.5.1. Step 1: Familiarize yourself with business scenarios

1.5.2. Step 2 (Finding problems): Performance monitoring

Performance monitoring is the activity of collecting or viewing an application's runtime performance data in a non-intrusive way. Monitoring generally refers to a preventive or proactive activity carried out in a production, quality assurance, or development environment.
When application stakeholders report a performance issue without providing enough clues to locate the root cause, performance monitoring comes first, followed by performance analysis.
Before monitoring, choose the garbage collector combination, select the CPU (the higher the clock frequency, the better), set the generation ratio, and set the log parameters (a production environment usually does not write everything to a single log file). For example:

-Xloggc:/opt/xxx/logs/xxx-xxx-gc-%t.log 
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=5 
-XX:GCLogFileSize=20M 
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps 
-XX:+PrintGCCause

1.5.2.1. Possible situations

  • Frequent GC
  • CPU load is too high
  • OOM
  • Memory leak
  • Deadlock
  • Long program response time

1.5.3. Step 3 (Troubleshooting): Performance analysis

  • Print the GC log and analyze it with GCViewer or http://gceasy.io
  • Make flexible use of command-line tools such as jstack, jmap, and jinfo
  • Dump the heap to a file and analyze the file with a memory analysis tool
  • Use Alibaba's Arthas, or jconsole or JVisualVM, to view the JVM status in real time
  • Use jstack to view thread stack information

1.5.4. Step 4 (Solving the problem): Performance tuning

Performance tuning is the activity of changing parameters, source code, or configuration properties to improve an application's responsiveness or throughput. It follows performance monitoring and performance analysis.

  • Appropriately increase the memory and select the garbage collector according to the business background
  • Optimize code, control memory usage
  • Add machines to spread the load across nodes
  • Set the number of threads in the thread pool sensibly
  • Use middleware, such as caches and message queues, to improve program efficiency
  • Other…

1.6. Summary

Tuning starts from the business scenario; tuning that is divorced from the business scenario is meaningless. No monitoring, no tuning!

JMeter is used for load testing.

1.7. Interview questions

  • How do you optimize and reduce Full GC? (Alibaba Xianyu)
  • When a memory overflow occurs, how do you troubleshoot it? (JD.com)
  • Do you have any real JVM performance tuning cases? Which core parameters should we focus on? (DiDi)
  • Tell me about OOM. How do you investigate it? What causes OOM? When does OOM appear? (Tencent)
  • What JVM performance tuning have you done? (Alipay)
  • Have you done JVM memory optimization? (Xiaomi)
  • JVM compilation optimization (Ant Financial Services)
  • What JVM performance tuning have you done? (Ant Financial Services)
  • How do you tune the JVM, and what is an appropriate stack size? (Ant Financial Services)
  • Which JVM analysis tools have you used? What are the specific performance tuning steps? (Ant Financial Services)
  • How do you perform JVM tuning? What methods are there? (Alibaba)
  • How do you tune the JVM and how do you adjust the parameters? (ByteDance)
  • Why do GCs occur frequently in a flash-sale (seckill) system with hundreds of thousands of concurrent requests per second? (JD.com)
  • How do you optimize the JVM for a trading system that handles millions of transactions per day? (JD.com)
  • How do you monitor, locate, and solve an OOM in an online production system? (JD.com)
  • How do you optimize the performance of a high-concurrency system based on the G1 garbage collector? (JD.com)

2. Performance optimization case 1: Adjust the heap size to improve the throughput of the service

Initial configuration

export CATALINA_OPTS="$CATALINA_OPTS -Xms30m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:SurvivorRatio=8"
export CATALINA_OPTS="$CATALINA_OPTS -Xmx120m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+UseParallelGC"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+PrintGCDetails"
export CATALINA_OPTS="$CATALINA_OPTS -XX:MetaspaceSize=64m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+PrintGCDateStamps"
export CATALINA_OPTS="$CATALINA_OPTS -Xloggc:/opt/model/tomcat-8.5/logs/gc.log"

Optimization

export CATALINA_OPTS="$CATALINA_OPTS -Xms120m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:SurvivorRatio=8"
export CATALINA_OPTS="$CATALINA_OPTS -Xmx120m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+UseParallelGC"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+PrintGCDetails"
export CATALINA_OPTS="$CATALINA_OPTS -XX:MetaspaceSize=64m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+PrintGCDateStamps"
export CATALINA_OPTS="$CATALINA_OPTS -Xloggc:/opt/model/tomcat-8.5/logs/gc.log"
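
To confirm that heap settings like the ones above actually took effect, the heap limits can be printed from inside the running JVM. The following is only a minimal sketch and not part of the original case; the class name HeapSettingsCheck is made up for illustration. With -Xms equal to -Xmx, the committed heap should start out roughly equal to the maximum, so the JVM does not have to grow the heap under load.

```java
// Illustrative sketch: HeapSettingsCheck is a hypothetical name, not from the original article.
class HeapSettingsCheck {

    // Converts a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Summarizes the heap as seen by the running JVM.
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        return String.format("committed=%dMiB max=%dMiB free=%dMiB",
                toMiB(rt.totalMemory()),  // heap currently committed by the JVM
                toMiB(rt.maxMemory()),    // upper bound the heap may grow to (related to -Xmx)
                toMiB(rt.freeMemory()));  // unused part of the committed heap
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The reported maximum can be slightly below the -Xmx value, because some collectors exclude one survivor space from it.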

3. Performance optimization case 2: Reasonable allocation of heap memory

In case 1, we saw that increasing the memory improves system performance, and that the effect is significant. The question that follows is: how much memory should we add? If the heap is too large, a Full GC, when it occurs, takes relatively long. If the heap is too small, GC is triggered frequently. So how do we choose a reasonable heap size?
Analysis:
As a rule, set it according to the formula recommended in the book Java Performance.

3.1. Recommended configuration

Overall Java heap size:
Set Xmx and Xms to 3-4 times the size of the surviving objects in the old generation, that is, 3-4 times the memory occupied by the old generation after a Full GC.
Set the method area (PermSize and MaxPermSize for the permanent generation, or MetaspaceSize and MaxMetaspaceSize for the metaspace) to 1.2-1.5 times the size of the surviving objects in the old generation.
Set the young generation (Xmn) to 1-1.5 times the size of the surviving objects in the old generation.
Set the old generation to 2-3 times the size of the surviving objects in the old generation.
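
The ratios above can be turned into a small calculator. This is a sketch under stated assumptions: the class name HeapSizing and the 20 MB live-data figure are hypothetical and chosen only to make the arithmetic easy to follow; the multipliers are the ones listed above.

```java
// Illustrative sketch: HeapSizing and the 20 MB input are hypothetical.
class HeapSizing {

    // Returns {low, high} in MB for a given live-data size and multiplier range.
    static long[] range(long liveMb, double low, double high) {
        return new long[] { Math.round(liveMb * low), Math.round(liveMb * high) };
    }

    static long[] heap(long liveMb)       { return range(liveMb, 3.0, 4.0); } // -Xms / -Xmx
    static long[] methodArea(long liveMb) { return range(liveMb, 1.2, 1.5); } // metaspace or permgen
    static long[] young(long liveMb)      { return range(liveMb, 1.0, 1.5); } // -Xmn
    static long[] old(long liveMb)        { return range(liveMb, 2.0, 3.0); } // heap minus young

    public static void main(String[] args) {
        long liveMb = 20; // assumed old-generation occupancy after a Full GC
        System.out.printf("heap:        %d-%d MB%n", heap(liveMb)[0], heap(liveMb)[1]);
        System.out.printf("method area: %d-%d MB%n", methodArea(liveMb)[0], methodArea(liveMb)[1]);
        System.out.printf("young gen:   %d-%d MB%n", young(liveMb)[0], young(liveMb)[1]);
        System.out.printf("old gen:     %d-%d MB%n", old(liveMb)[0], old(liveMb)[1]);
    }
}
```

For 20 MB of live data this gives a heap of 60-80 MB, which is only a starting point to be checked against the GC logs.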

However, the figures above are not absolute. They are reference values, conclusions drawn from many rounds of tuning. You can set the initial memory according to them so that the program runs normally. While it runs, we still need to check the GC reclamation rate, the duration of GC pauses, and the actual data in memory. Full GC should basically not occur; if it does, do a heap dump analysis and then allocate memory more reasonably.
We also need to know how to determine the size of the surviving objects in the old generation mentioned above.

3.2. How to calculate the surviving objects in the old generation

3.2.1. Method 1: Check the GC log

Recommended and safe!
Enable GC logging in the JVM parameters. The GC log records the size of each generation after every Full GC, so you can read off the old-generation occupancy after each collection. Observe the memory after Full GC over a period of time (for example, 2 days), and estimate the size of the surviving objects in the old generation from the old-generation occupancy after several Full GCs (take the average).
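
The averaging step can be sketched in a few lines. The example below assumes JDK 8 style -XX:+PrintGCDetails output from the Parallel collector, where the old generation appears as "ParOldGen: before->after(capacity)"; other collectors and JDK versions use different formats. The class name and the sample log lines are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: class name and sample log lines are hypothetical.
class OldGenAfterFullGc {

    // Captures the old-generation size after GC, e.g. 15000 in "ParOldGen: 19000K->15000K(81920K)".
    private static final Pattern OLD_GEN =
            Pattern.compile("ParOldGen: \\d+K->(\\d+)K\\(\\d+K\\)");

    // Averages the old-generation occupancy (in KB) over all Full GC lines; NaN if there are none.
    static double averageAfterFullGcKb(List<String> lines) {
        long sum = 0;
        int count = 0;
        for (String line : lines) {
            if (!line.contains("[Full GC")) {
                continue; // skip young collections
            }
            Matcher m = OLD_GEN.matcher(line);
            if (m.find()) {
                sum += Long.parseLong(m.group(1));
                count++;
            }
        }
        return count == 0 ? Double.NaN : (double) sum / count;
    }

    // Made-up lines in the assumed log format.
    static final List<String> SAMPLE = Arrays.asList(
            "1.234: [GC (Allocation Failure) [PSYoungGen: 8192K->1000K(9216K)] 20000K->13000K(91136K), 0.0050000 secs]",
            "12.345: [Full GC (Ergonomics) [PSYoungGen: 1024K->0K(9216K)] [ParOldGen: 19000K->15000K(81920K)] 20024K->15000K(91136K), [Metaspace: 30000K->30000K(1075200K)], 0.0500000 secs]",
            "98.765: [Full GC (Ergonomics) [PSYoungGen: 900K->0K(9216K)] [ParOldGen: 21000K->17000K(81920K)] 21900K->17000K(91136K), [Metaspace: 30100K->30100K(1075200K)], 0.0600000 secs]");

    public static void main(String[] args) {
        System.out.println(averageAfterFullGcKb(SAMPLE) + " KB");
    }
}
```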

3.2.2. Method 2: Force a Full GC

Forcibly triggering a Full GC affects online services. Use with caution!
Method 1 is more practical, but it requires changing the JVM parameters and analyzing the logs. Also, with the CMS collector a Full GC may never be triggered, so the log contains no Full GC records, which makes the analysis harder. It is therefore sometimes necessary to force a Full GC and observe the size of the surviving objects in the old generation afterwards.

Note: Forcing a Full GC pauses the online service (STW). Be careful! The recommended procedure is to take the service node out of rotation before forcing the Full GC, and to put it back among the available nodes afterwards so that it serves traffic again. Trigger Full GC at several different times, and estimate the size of the surviving objects in the old generation from the old-generation occupancy after these Full GCs.

How to forcibly trigger Full GC?
1. jmap -dump:live,format=b,file=heap.bin <pid> dumps the currently live objects to a file; this triggers a Full GC.
2. jmap -histo:live <pid> prints the instance count, memory usage, and fully qualified class name for each class. With the live sub-option, only live objects are counted; this triggers a Full GC.
3. In a performance test environment, you can trigger Full GC through Java monitoring tools such as VisualVM and JConsole (VisualVM integrates JConsole); both have a button that triggers GC.
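
In a test environment, the same observation can also be made from inside the JVM through the standard java.lang.management API. This is a minimal sketch, not the article's method: the class name OldGenProbe is hypothetical, the pool-name matching assumes HotSpot names such as "PS Old Gen", "G1 Old Gen", "CMS Old Gen", or "Tenured Gen", and the GC request is only a request (it is ignored when -XX:+DisableExplicitGC is set). The same STW caveat applies.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.OptionalLong;

// Illustrative sketch: OldGenProbe is a hypothetical name.
class OldGenProbe {

    // Requests a GC, then returns the bytes used in the old-generation pool, if one is found.
    static OptionalLong oldGenUsedAfterGcBytes() {
        ManagementFactory.getMemoryMXBean().gc(); // effectively the same as System.gc()
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            boolean looksLikeOldGen = name.contains("Old Gen") || name.contains("Tenured");
            if (pool.getType() == MemoryType.HEAP && looksLikeOldGen) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) {
                    return OptionalLong.of(usage.getUsed());
                }
            }
        }
        return OptionalLong.empty(); // collector without a recognizable old-generation pool
    }

    public static void main(String[] args) {
        OptionalLong used = oldGenUsedAfterGcBytes();
        System.out.println(used.isPresent()
                ? "old gen used after GC: " + used.getAsLong() / 1024 + " KB"
                : "no old-generation pool found");
    }
}
```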

3.2.3. How do you estimate the GC frequency?

Normally we should estimate memory usage for our own system. This can be measured in the test environment: start with a generous setting, such as 4G. The estimate can also be derived from the business workload.
For example, suppose one record fetched from the database occupies 128 bytes and a request fetches 1000 records. The memory read per request is then 128 B × 1000 / 1024 / 1024 ≈ 0.122 MB. If the program serves such reads concurrently, for example 100 reads per second, it allocates 0.122 MB × 100 = 12.2 MB per second. With a 1 GB heap, the young generation is about 333 MB (one third of the heap by default) and Eden is about 80% of that, so Eden fills up in 333 MB × 80% / 12.2 MB ≈ 21.84 s. In other words, the program performs two or three young GCs per minute. This gives us a rough estimate for the system.
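
The same estimate can be written as a small function so that the inputs are easy to vary. This is a sketch with a hypothetical class name; it uses unrounded values, so it yields roughly 22 seconds rather than the 21.84 seconds obtained above from the rounded figures 333 MB and 12.2 MB.

```java
// Illustrative sketch: YoungGcEstimate is a hypothetical name.
class YoungGcEstimate {

    // Estimates the seconds between young GCs: Eden capacity divided by the allocation rate.
    static double secondsBetweenYoungGcs(long heapBytes,
                                         double youngFraction,  // share of the heap given to the young generation
                                         double edenFraction,   // share of the young generation given to Eden
                                         long bytesPerRecord,
                                         long recordsPerRead,
                                         long readsPerSecond) {
        double edenBytes = heapBytes * youngFraction * edenFraction;
        double allocatedPerSecond = (double) bytesPerRecord * recordsPerRead * readsPerSecond;
        return edenBytes / allocatedPerSecond;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024L * 1024L;
        // 1 GB heap, young generation = 1/3 of the heap, Eden = 80% of the young generation,
        // 128-byte records, 1000 records per read, 100 reads per second.
        double seconds = secondsBetweenYoungGcs(oneGiB, 1.0 / 3.0, 0.8, 128, 1000, 100);
        System.out.printf("about one young GC every %.1f s%n", seconds);
    }
}
```

The estimate ignores every other allocation the application makes, so real young GCs will be more frequent.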

3.2.4. Case analysis

Now we start the Spring Boot project from IntelliJ IDEA with the initial memory set to 1024M. Starting from this 1024M setting, we analyze the GC logs and use the knowledge above to arrive at a reasonable memory setting.

Load-test endpoint

    // Endpoint hit by the load test; returns the full list of people.
    @RequestMapping("/getData")
    public List<People> getProduct() {
        List<People> peopleList = peoplesevice.getPeopleList();
        return peopleList;
    }

The JVM settings are as follows:

-XX:+PrintGCDetails 
-XX:MetaspaceSize=64m
-Xss512K
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=heap/heapdump3.hprof 
-XX:SurvivorRatio=8
-XX:+PrintGCDateStamps 
-Xms1024M -Xmx1024M
-Xloggc:log/gc-oom3.log

Force Full GC

View heap space ratio

jmap -heap 12324

Following the recommended ratios, the heap is set to 3 to 4 times the memory used by the old generation after a Full GC:

-XX:+PrintGCDetails 
-XX:MetaspaceSize=64m
-Xss512K
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=heap/heapdump3.hprof 
-XX:SurvivorRatio=8
-XX:+PrintGCDateStamps 
-Xms80M -Xmx80M
-Xloggc:log/gc-oom3.log

4. Performance optimization case 3: Troubleshooting solution for high CPU usage

Troubleshooting process for high CPU usage:

  1. ps aux | grep java: view the CPU and memory usage of the current Java processes and identify the process with abnormal usage
  2. top -Hp <process pid>: find the ID of the abnormal thread within that process
  3. Convert the thread ID to hexadecimal, for example 31695 -> 7bcf, giving 0x7bcf
  4. jstack <process pid> | grep -A20 0x7bcf: get the stack of the offending thread. Alternatively, export the dump to a file (jstack <process pid> > xx.log) and search the file for the hexadecimal thread ID
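
Steps 3 and 4 can be sketched in code: convert the decimal thread ID reported by top into the hexadecimal nid that jstack prints, then pull the matching thread's lines out of a saved dump. The class name, the helper names, and the sample dump text are hypothetical; the dump only imitates the general shape of jstack output.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: class name, helpers, and SAMPLE_DUMP are hypothetical.
class ThreadIdToNid {

    // Converts a decimal OS thread ID (as shown by top -Hp) to the nid form used by jstack.
    static String toNid(long threadId) {
        return "0x" + Long.toHexString(threadId);
    }

    // Returns the line containing the nid plus up to linesAfter following lines (like grep -A).
    static List<String> linesForNid(String jstackOutput, String nid, int linesAfter) {
        String[] lines = jstackOutput.split("\\R");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].contains("nid=" + nid + " ")) {
                for (int j = i; j <= i + linesAfter && j < lines.length; j++) {
                    result.add(lines[j]);
                }
                break;
            }
        }
        return result;
    }

    // Made-up text shaped like a jstack thread dump.
    static final String SAMPLE_DUMP =
            "\"main\" #1 prio=5 os_prio=0 tid=0x00007f0000001000 nid=0x7bc0 waiting on condition [0x00007f0000100000]\n"
          + "   java.lang.Thread.State: TIMED_WAITING (sleeping)\n"
          + "\n"
          + "\"busy-worker\" #12 prio=5 os_prio=0 tid=0x00007f0000002000 nid=0x7bcf runnable [0x00007f0000200000]\n"
          + "   java.lang.Thread.State: RUNNABLE\n"
          + "\tat com.example.Busy.loop(Busy.java:10)\n";

    public static void main(String[] args) {
        String nid = toNid(31695);
        System.out.println(nid);
        linesForNid(SAMPLE_DUMP, nid, 20).forEach(System.out::println);
    }
}
```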

5. Performance optimization case 4: The impact of G1's concurrent GC thread count on performance

JVM parameter settings

export CATALINA_OPTS="$CATALINA_OPTS -XX:+UseG1GC"
export CATALINA_OPTS="$CATALINA_OPTS -Xms30m"
export CATALINA_OPTS="$CATALINA_OPTS -Xmx30m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+PrintGCDetails"
export CATALINA_OPTS="$CATALINA_OPTS -XX:MetaspaceSize=64m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+PrintGCDateStamps"
export CATALINA_OPTS="$CATALINA_OPTS -Xloggc:/opt/tomcat8.5/logs/gc.log"
export CATALINA_OPTS="$CATALINA_OPTS -XX:ConcGCThreads=1"

Note: The last parameter can be added after first testing G1 with its default number of concurrent GC threads. The initial and maximum heap sizes are deliberately set small in order to provoke Full GC and observe the GC time. The metrics to watch are the GC count, the GC time, and the average response time reported by JMeter.

After JVM parameter settings are modified

export CATALINA_OPTS="$CATALINA_OPTS -XX:+UseG1GC"
export CATALINA_OPTS="$CATALINA_OPTS -Xms30m"
export CATALINA_OPTS="$CATALINA_OPTS -Xmx30m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+PrintGCDetails"
export CATALINA_OPTS="$CATALINA_OPTS -XX:MetaspaceSize=64m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+PrintGCDateStamps"
export CATALINA_OPTS="$CATALINA_OPTS -Xloggc:/opt/tomcat8.5/logs/gc.log"
export CATALINA_OPTS="$CATALINA_OPTS -XX:ConcGCThreads=2"

6. Performance optimization case 5: How to set JVM parameters for a trading system that handles millions of orders per day

[Figure: JVM parameter settings for this case; image not included]

7. Performance optimization case 6: Website problem analysis

There is a website with 500,000 PV (it loads documents from disk into memory). The original server was 32-bit with a 1.5G heap, and users reported that the website was slow. The company therefore decided to upgrade: the new server is 64-bit with a 16G heap. As a result, users reported severe stuttering, and the site was even less efficient than before!

7.1. Why is the original website slow?

Frequent GC and long STW pauses made the response time slow!

7.2. Why is it more stuck?

The larger the memory space, the longer each Full GC takes and the longer the resulting pause.

7.3. How to deal with it

Garbage collector: Parallel GC; ParNew + CMS; G1
Configure GC parameters: -XX:MaxGCPauseMillis, -XX:ConcGCThreads
Analyze the logs and dump files, and optimize the proportions of the memory spaces
Tools: jstat, jinfo, jstack, jmap

8. The system memory is soaring, how to find the problem?

  1. On the one hand: check jmap -heap, jstat, and the GC logs
  2. On the other hand: analyze a dump file
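
As a complement to jstat, heap usage can also be sampled from inside the application. The sketch below is a crude illustration with hypothetical names: it reads the used heap through the standard MemoryMXBean and flags a series of samples that only ever grows. Usage between collections normally rises anyway, so a real investigation should compare occupancy after Full GCs and confirm with a heap dump.

```java
import java.lang.management.ManagementFactory;

// Illustrative sketch: HeapWatch and its heuristic are hypothetical.
class HeapWatch {

    // Returns the number of heap bytes currently in use.
    static long usedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    // True if every sample is strictly larger than the one before it.
    static boolean steadilyGrowing(long[] samples) {
        if (samples.length < 2) {
            return false;
        }
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] <= samples[i - 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] samples = new long[5];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = usedHeapBytes();
            System.out.println("used heap: " + samples[i] / 1024 + " KB");
            Thread.sleep(200); // sampling interval chosen arbitrarily for the demo
        }
        System.out.println("steadily growing: " + steadilyGrowing(samples));
    }
}
```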

Origin blog.csdn.net/prefect_start/article/details/124145643