"Sandbox Simulation Series" How to tune the JVM

What you learn on paper always feels shallow; to truly understand something, you have to do it yourself.

At my company there is basically no chance to tune JVM parameters. But if you never experience some things yourself, then no matter how much theory you read, it remains knowledge on paper, and when you run into a real problem you still won't know how to analyze it. So just create some problems yourself and observe what happens: use what you have learned to predict the outcome in advance, then check whether the observed behavior matches your prediction. This not only consolidates what you have learned, it also exercises your problem-solving ability (even though the problems are of your own making).

In fact, before writing this article I had read about JVM tuning several times, in both books and blogs. After most of those readings I felt I understood it, yet when I actually tried to simulate it myself I found I could not do anything. After working through the simulation, though, I could connect the earlier pieces of knowledge into a whole, and my understanding became a little deeper. So let me stress this: I hope that after reading, you will run the simulation on your own machine, try different parameters, guess the result, and verify it yourself.

Tool preparation

To do a good job, a worker must first sharpen his tools. Before analyzing the JVM we need to prepare two tools: a garbage collection visualization tool and a stress testing tool.

GCViewer installation

  1. Download the GCViewer source code from its GitHub repository
  2. Run the command mvn clean install in the project root directory
  3. A target folder is then generated in the root directory, and the file gcviewer-1.37-SNAPSHOT.jar can be found in it

JMeter installation

Apache JMeter is an open source stress testing tool developed in Java. It is not limited to web stress testing; it can also stress test static files, databases, FTP, SSH, and so on.

  • Download JMeter from its download page
  • Unzip it; my location is /Users/hupengfei/apache-jmeter-5.1.1
  • Open a terminal and go to its bin directory
  • Run the command sh jmeter

I won't go into detail here about how to configure its parameters. You can read the article "JMeter Http Stress Test [Illustration]".

Introduction to Theory

JVM tuning is mainly about optimizing JVM garbage collection. Generally speaking, you optimize because there is a problem. So for JVM GC, if you observe that the CPU usage of your application process is relatively high, and the GC log shows that GCs are frequent and GC pauses are long, then you need to optimize the GC.

In GC tuning, knowing some GC principles is not enough. More importantly, we need to be proficient with the various monitoring and analysis tools and have hands-on GC tuning ability. At present the two most widely used garbage collectors are CMS and G1. Starting with Java 9, G1 is the default garbage collector, and the goal of G1 is to gradually replace CMS. So let me briefly introduce the difference between these two collectors.

You can use the command java -XX:+PrintCommandLineFlags -version to print some of the default parameters on the command line. The default garbage collector for each version is listed here

  • Java 7: Parallel GC
  • Java 8: Parallel GC
  • Java 9: G1 GC
  • Java 10: G1 GC

CMS collector

The CMS collector divides the Java heap into the young generation and the old generation (the permanent generation was removed in Java 8 and replaced by metaspace, which is allocated in native memory rather than in the Java heap). The main reason for this split is that studies have shown that more than 90% of objects are collected in their first GC, while a few objects survive for a long time.

In CMS, the young generation is further divided into two parts: the Survivor space and the Eden space. New objects are always created in Eden, and once an object survives a garbage collection it is moved to a Survivor space. After surviving multiple garbage collections, it is moved to the old generation. The purpose is to use different garbage collection algorithms in the young and old generations to achieve higher collection efficiency. For example, objects in the young generation are short-lived and few survive a collection, so a copying algorithm is used. In the old generation, objects live longer, so a collection may reclaim few objects and leave many behind, and a mark-based algorithm is used instead (CMS itself is concurrent mark-sweep, and falls back to a compacting mark-compact collection when a Full GC is needed).

G1 collector

Compared with CMS, G1 has two major characteristics

  • G1 can complete most of the GC work concurrently, and will not "Stop-The-World" during this period
  • G1 uses non-contiguous space, which lets it handle very large heaps efficiently, and G1 can collect both the young and the old generation. Instead of dividing the Java heap into three contiguous spaces (Eden, Survivor, and Old), G1 divides the heap into many small regions. The region size is fixed for a given run (a power of two between 1MB and 32MB, chosen from the heap size unless set with -XX:G1HeapRegionSize). Each region is assigned one role (Eden, Survivor, or Old) or is left unallocated.

U in the figure represents an unallocated region. One of the biggest advantages of splitting the heap into small regions is that G1 can collect garbage in a subset of regions instead of reclaiming a whole area, such as the entire young or old generation, each time, so collection pauses are shorter. The collection process is roughly as follows

  • Copy all live objects from the regions being collected to an unallocated region. For example, if the region being collected is an Eden region, the surviving objects in Eden are copied to an unallocated region, which then becomes a Survivor region. Ideally, if a region is all garbage (meaning it contains no surviving objects), it can be declared "unallocated" directly.
  • To optimize collection time, G1 always prioritizes the regions with the most garbage, thus minimizing the amount of work required to allocate and free heap space subsequently. This is also the origin of the name of the G1 collector - Garbage-First
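
The "most garbage first" idea can be illustrated with a toy model. This is only a sketch and not how G1 is actually implemented: the Region class, the chooseRegions method, and the "copy budget" standing in for a pause-time goal are all hypothetical names invented for the illustration. Regions are sorted by how much garbage they hold, and regions are picked until the amount of live data that would have to be copied exceeds the budget.

```java
import java.util.ArrayList;
import java.util.List;

public class GarbageFirstToy {

    /** A heap region in the toy model (hypothetical, not a JVM class). */
    public static final class Region {
        final int id;
        final long sizeBytes;
        final long liveBytes;

        public Region(int id, long sizeBytes, long liveBytes) {
            this.id = id;
            this.sizeBytes = sizeBytes;
            this.liveBytes = liveBytes;
        }

        long garbageBytes() {
            return sizeBytes - liveBytes;
        }
    }

    /**
     * Picks regions with the most garbage first. Collecting a region means
     * copying its live bytes elsewhere, so we stop once the next region
     * would push the total copy work over the budget.
     */
    public static List<Integer> chooseRegions(List<Region> regions, long copyBudgetBytes) {
        List<Region> byGarbage = new ArrayList<>(regions);
        // Sort descending by garbage: the emptiest-of-live-data regions come first.
        byGarbage.sort((a, b) -> Long.compare(b.garbageBytes(), a.garbageBytes()));

        List<Integer> chosen = new ArrayList<>();
        long bytesToCopy = 0;
        for (Region r : byGarbage) {
            if (bytesToCopy + r.liveBytes > copyBudgetBytes) {
                break;
            }
            bytesToCopy += r.liveBytes;
            chosen.add(r.id);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(0, 1000, 100)); // 900 bytes of garbage
        regions.add(new Region(1, 1000, 900)); // 100 bytes of garbage
        regions.add(new Region(2, 1000, 0));   // all garbage, nothing to copy
        regions.add(new Region(3, 1000, 500)); // 500 bytes of garbage
        System.out.println(chooseRegions(regions, 600)); // prints [2, 0, 3]
    }
}
```

Region 2 costs nothing to reclaim because it has no live objects, which matches the ideal case described above where a region is simply declared unallocated.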

Practical drill

I am using Java 8, and the garbage collector used is CMS.

Next, I will use a practical example to tackle frequent GCs in a Java program caused by a young generation that is set too small. We will use a GC log analysis tool to observe GC activity and locate the problem.

First, we create a Spring Boot program as our tuning target. The code is as follows:

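import java.util.ArrayList;
import java.util.List;

import lombok.Data;
import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@Slf4j
public class GcTestController {

    // Long-lived pool: it keeps Greeting objects reachable so they are promoted to the old generation.
    // Not thread-safe, which is acceptable for the single request thread used in this test.
    private List<Greeting> objListCache = new ArrayList<>();

    @RequestMapping("/greeting")
    public Greeting greeting() {
        Greeting greeting = new Greeting();
        if (objListCache.size() >= 100000) {
            // Pool is full: drop all references so a later GC can reclaim the objects.
            log.info("clean the List!!!!!!!!!!");
            objListCache.clear();
        } else {
            objListCache.add(greeting);
        }
        return greeting;
    }
}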

@Data
class Greeting {
    private String message1;
    private String message2;
    private String message3;
    private String message4;
    private String message5;
    private String message6;
    private String message7;
    private String message8;
    private String message9;
    private String message10;
    private String message11;
    private String message12;
    private String message13;
    private String message14;
    private String message15;
    private String message16;
    private String message17;
    private String message18;
    private String message19;
    private String message20;
}

The code above creates an object pool that is emptied once the number of objects in it reaches 100,000; it is used to simulate old generation objects. You can use the approach from my last article, on whether putting millions of records into memory blows up the system, to roughly calculate how much memory 100,000 objects occupy. Here I will just state the result: 100,000 Greeting objects occupy about 10MB.

So I set the following startup parameters in IDEA
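
As a rough check of the 10MB figure, here is a back-of-the-envelope sketch. It assumes a 64-bit HotSpot JVM with compressed references, which is the usual default for small heaps: a 12-byte object header, 4 bytes per reference field, and object sizes rounded up to a multiple of 8 bytes. These constants are assumptions about the JVM, not something measured in this article, and the class name is made up for the example. All 20 String fields stay null in the test, so no String objects need to be counted.

```java
public class GreetingSizeEstimate {

    // Assumed layout constants for 64-bit HotSpot with compressed references.
    static final int HEADER_BYTES = 12;
    static final int REFERENCE_BYTES = 4;
    static final int ALIGNMENT = 8;

    /** Rounds a size up to the next multiple of the object alignment. */
    public static long alignUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Shallow size of an object that has only reference fields. */
    public static long shallowSize(int referenceFields) {
        return alignUp(HEADER_BYTES + (long) referenceFields * REFERENCE_BYTES);
    }

    public static void main(String[] args) {
        long perObject = shallowSize(20);    // Greeting has 20 String fields
        long total = perObject * 100_000L;   // the pool holds up to 100,000 objects
        System.out.println("Per Greeting: " + perObject + " bytes");
        System.out.println("100,000 Greetings: " + total + " bytes");
    }
}
```

Under these assumptions each Greeting takes 96 bytes, so 100,000 of them take 9,600,000 bytes. The ArrayList backing array adds at least 4 more bytes per element, which brings the total to roughly 10MB.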

-Xmx52m -Xmn9m -Xss256k -XX:+PrintGC -XX:+UseConcMarkSweepGC -Xloggc:/Users/hupengfei/Downloads/gclog/gc.log

The maximum heap size I set for the program is 52MB and the young generation is 9MB. By default the ratio of the Eden space to the two Survivor spaces combined is 4:1 (-XX:SurvivorRatio=8, that is, Eden is 8 times the size of one Survivor space), so Eden is about 7.2MB. The purpose is to let everyone see that a Young GC occurs when Eden fills up, and that objects which are not reclaimed in the young generation end up in the old generation.

Then we use JMeter to send test requests to the program. Note that I set the test duration to 10 minutes, with a single thread sending requests without interruption.

After ten minutes we can open the GC log with GCViewer, and we see the following picture

  • The blue line: shows the used heap size. It oscillates up and down periodically, because our object pool is emptied only after it has grown to 100,000 objects.
  • The green line at the bottom: shows GC activity. We can see that frequent GCs are triggered as heap usage increases
  • The black line in the middle: indicates a Full GC. The blue line drops along with the Full GC, which means the Full GC reclaims old generation objects

From the figure above we can conclude that the young generation is set too small. Why do we come to this conclusion?
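
The Eden arithmetic can be written out and compared with what the running JVM reports. The sketch below assumes the default -XX:SurvivorRatio=8, where the young generation consists of Eden plus two Survivor spaces and Eden is 8 times one Survivor space. The class and method names are made up for the example; the memory pool listing uses the standard java.lang.management API, and the pool names it prints depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class YoungGenLayout {

    /**
     * Expected Eden size when Eden : one Survivor = survivorRatio : 1
     * and the young generation is Eden plus two Survivor spaces.
     */
    public static double edenMb(double youngMb, int survivorRatio) {
        return youngMb * survivorRatio / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        System.out.printf("Expected Eden for -Xmn9m: %.1f MB%n", edenMb(9, 8));

        // Print what the JVM actually allocated, to compare with the estimate.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined.
            System.out.printf("%-30s max=%d bytes%n", pool.getName(), usage.getMax());
        }
    }
}
```

Running it with the same -Xmx, -Xmn, and collector flags as the test program lets you check the estimate; the real sizes can differ slightly because the JVM aligns the spaces.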
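
If you do not want to set up JMeter, a single-threaded load loop can be approximated in plain Java. This is only a rough stand-in and not what was used for the measurements in this article. The URL http://localhost:8080/greeting assumes Spring Boot's default port, and the class and method names are made up for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class SimpleLoadClient {

    /** Runs the task back to back on the calling thread until the time is up; returns the call count. */
    public static long runFor(long millis, Runnable task) {
        long deadline = System.currentTimeMillis() + millis;
        long calls = 0;
        while (System.currentTimeMillis() < deadline) {
            task.run();
            calls++;
        }
        return calls;
    }

    /** Sends one GET request and reads the response body to the end. */
    public static void get(String address) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
            try (InputStream in = conn.getInputStream()) {
                byte[] buffer = new byte[4096];
                while (in.read(buffer) != -1) {
                    // discard the body; we only care about generating load
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed address: Spring Boot default port 8080 on the local machine.
        String address = args.length > 0 ? args[0] : "http://localhost:8080/greeting";
        long calls = runFor(10 * 60 * 1000L, () -> get(address)); // 10 minutes, one thread
        System.out.println("Requests sent: " + calls);
    }
}
```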

  • Frequent GC activity: the green lines are dense
  • Java heap memory is reclaimed after the Full GC, which shows that it is not a memory leak

From the summary shown on the left of GCViewer, we can see that 1622 GCs occurred in total, of which one was a Full GC.

Next, keeping the total heap size unchanged, we only increase the young generation to 16MB (-Xmn16m), and then look at the picture again

We can see that although there is still one Full GC, young generation GCs are much less frequent, and the cumulative GC pause time is only 1.48 seconds

What if we want to keep optimizing? Then we continue by enlarging the total heap. Next we set the heap to 200MB and the young generation to 80MB (-Xmx200m -Xmn80m). Let's look at the effect.

We can see that over the same period there is no Full GC, and fewer GCs occur in the young generation
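
GC counts like these can also be read from inside the JVM, which is handy for a quick comparison between runs without opening the log. The sketch below uses the standard GarbageCollectorMXBean API; the class name is made up, and in the test program you would call summary() from within the application (for example at shutdown) rather than run it standalone. The collector names it prints depend on the JVM flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCounters {

    /** Returns one line per collector with its cumulative collection count and time. */
    public static String summary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values are cumulative since JVM start; -1 means "undefined".
            sb.append(String.format("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summary());
    }
}
```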

Tuning strategy

For the CMS collector, we need to set reasonable sizes for the young and old generations. You may ask whether there is a fixed formula. In fact, I don't have one. Tuning is an iterative process: you can start with the JVM defaults, run a stress test, analyze the GC log, and observe how GC reclaims memory in different situations.

If we see that Minor GCs occur frequently and each of them reclaims little, it means our objects are not being reclaimed that quickly. In that case we can increase the size of the young generation appropriately and observe again.

If we see that old generation memory usage is high, resulting in frequent Full GCs, there are generally two cases

  • If old generation occupancy does not decrease after each Full GC, it may be a memory leak, and the code needs to be checked.
  • If memory usage decreases after a Full GC, it is not a memory leak, and you can consider adjusting the size of the old generation
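
The two cases above can be expressed as a small decision rule. This is only a sketch: the class, the method, and the threshold parameter are hypothetical, and the sample numbers in main are made up for illustration rather than taken from a real GC log. The input is the old generation occupancy before and after each Full GC.

```java
public class FullGcTriage {

    public enum Verdict { POSSIBLE_MEMORY_LEAK, CONSIDER_RESIZING_OLD_GENERATION }

    /**
     * Each sample is {oldGenBeforeKb, oldGenAfterKb} for one Full GC.
     * If no Full GC reclaimed at least minReclaimedFraction of the old
     * generation, the occupancy "does not decrease" and a leak is suspected.
     */
    public static Verdict triage(long[][] samples, double minReclaimedFraction) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("need at least one Full GC sample");
        }
        for (long[] sample : samples) {
            long before = sample[0];
            long after = sample[1];
            double reclaimed = before == 0 ? 0.0 : (before - after) / (double) before;
            if (reclaimed >= minReclaimedFraction) {
                return Verdict.CONSIDER_RESIZING_OLD_GENERATION;
            }
        }
        return Verdict.POSSIBLE_MEMORY_LEAK;
    }

    public static void main(String[] args) {
        // Made-up numbers: occupancy barely drops after each Full GC.
        long[][] stuck = { {40000, 39500}, {41000, 40800} };
        // Made-up numbers: a Full GC frees most of the old generation.
        long[][] healthy = { {40000, 12000} };
        System.out.println(triage(stuck, 0.05));   // prints POSSIBLE_MEMORY_LEAK
        System.out.println(triage(healthy, 0.05)); // prints CONSIDER_RESIZING_OLD_GENERATION
    }
}
```

The 5% threshold is arbitrary. In practice you would judge this from the trend in the GC log rather than from a fixed number.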

Code address

The test code is on GitHub at https://github.com/modouxiansheng/Doraemon, together with the GC logs from my repeated tests. If you don't want to run the tests yourself, you can download the GC logs and look at the charts yourself.

The author's writing skill is still limited. If anything is wrong, please point it out and I will be grateful.

Summary

Knowledge on paper, whether from books or online, is ultimately a summary of the author's own experience and reflects the author's ideas, and it may not fit your actual situation. More importantly, not every reader can extract from a written description the exact information a sentence is meant to convey. Only a reader who has had the same experience will feel it resonate.

I believe people read in order to learn, and learn in order to grow. So the center of learning is the person, not the book. The essence of learning is to combine what you learn with yourself. Without that, you cannot resonate with what the book says, it is hard to understand it deeply, and naturally it is hard to remember.

Most of the knowledge in a book is the author's own understanding and insight, so it is hard for the scene that resonated with the author to be reproduced in the reader's mind and make the reader resonate too. Only by taking the knowledge in the book apart and connecting it with ourselves, by learning through doing, do we, while acquiring knowledge, also come to understand ourselves and the world we live in, gain that resonance, and consolidate what we know.

That is why I stress this again and again in my articles: if you want a deeper understanding of this topic, you must run it on your own machine, observe it yourself, change a few parameters, and verify what happens. You may never hit the pitfalls I hit, and I may not have hit the ones you will. If you run into a pitfall and solve it yourself, that improves your ability.
