5 tips to thoroughly understand the JVM memory model [for Java development over 3 years]

Preface

This article analyzes the JVM. It covers the JVM memory model, class loaders, GC algorithms and GC collectors, and leans toward theory. It is not suitable for beginners. Because space is limited, the editor has also compiled more than 400 pages of study notes on JVM performance tuning, shared through the public account "Kylin changes bugs"; they are aimed at engineers with more than 3 years of development experience. Discussion is welcome, and if there are any shortcomings in the article, readers are welcome to point them out. Thanks in advance.

One. The relationship between JDK, JRE and JVM

The following figure is the architecture diagram of the JDK, JRE and JVM from the official website. From it, the relationship between the three is easy to see:

(1) The JDK contains the JRE, and the JRE contains the JVM.

(2) The JDK is mainly used in development environments, and the JRE is mainly used in production environments. Running a JDK in production also works; it mainly costs extra disk space for tools that are never used there. The relationship between the JDK and the JRE is loosely similar to that between the debug build and the release build of a program.

(3) In terms of file size, the JDK is larger than the JRE. As the figure shows, the JDK adds a set of development tools on top of the JRE, such as the javac compiler.

[Figure: JDK, JRE and JVM architecture diagram from the official website]

Two. Class loaders

The JVM class loaders can be summarized as follows:

[Figure: overview of the JVM class loaders]

1. Why is there a class loader?

(1) To load bytecode files into the runtime data area. The bytecode file (.class) produced by compiling .java source code with the javac command is loaded into the JVM by a class loader.

(2) To determine the uniqueness of a class in the runtime data area. The same bytecode file loaded by different class loaders produces different classes. A class at runtime is therefore uniquely identified by the bytecode file together with the class loader that loaded it.

2. Types of class loaders

By category, class loaders fall into four main types:

(1) Bootstrap class loader (Bootstrap ClassLoader, the root class loader): sits at the top of the class loader hierarchy and mainly loads the core JRE jar packages, such as /jre/lib/rt.jar

(2) Extension class loader (Extension ClassLoader): sits at the second level of the hierarchy and mainly loads the JRE extension jar packages, such as /jre/lib/ext/*.jar

(3) Application class loader (Application ClassLoader): sits at the third level of the hierarchy and mainly loads the classes and jar packages on the classpath

(4) User-defined class loader (User ClassLoader): a class loader written by the user, which mainly loads the classes and jar packages under a path the user specifies
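The hierarchy above can be inspected from a running program. The following is a minimal sketch (the class name LoaderChain is made up for illustration) that walks the parent links starting from the application class loader. The bootstrap loader is represented as null in Java code. On JDK 1.8 the chain should show an application loader and an extension loader; on JDK 9 and later the extension loader is replaced by a platform loader, so the printed names depend on the JDK in use.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChain {
    /** Walks parent links from the given loader up to the bootstrap loader (seen as null). */
    static List<String> chainFrom(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String[] args) {
        // Core classes such as String are defined by the bootstrap loader,
        // which getClassLoader() reports as null.
        System.out.println("String loader: " + String.class.getClassLoader());
        for (String name : chainFrom(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
    }
}
```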

3. The class loading mechanism (parent delegation)

Bytecode is loaded through the parent delegation mechanism. What is parent delegation?

When a class loader receives a request to load a class, it does not load the class itself right away. It first passes the request to its direct parent loader, which in turn passes it to its own direct parent, and so on up to the root (bootstrap) loader. If the root loader can load the class, it does. Otherwise the request falls back to its direct child loader; if that loader can load the class, it does, and if not, the request falls back one more level, and so on down the chain.

4. How is class loading implemented in JDK 1.8?

The following is the JDK 1.8 implementation of ClassLoader.loadClass. The delegation is recursive: each loader calls loadClass on its parent before trying findClass itself.

    protected Class<?> loadClass(String name, boolean resolve)
        throws ClassNotFoundException
    {
        synchronized (getClassLoadingLock(name)) {
            // First, check if the class has already been loaded
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                long t0 = System.nanoTime();
                try {
                    if (parent != null) {
                        c = parent.loadClass(name, false);
                    } else {
                        c = findBootstrapClassOrNull(name);
                    }
                } catch (ClassNotFoundException e) {
                    // ClassNotFoundException thrown if class not found
                    // from the non-null parent class loader
                }

                if (c == null) {
                    // If still not found, then invoke findClass in order
                    // to find the class.
                    long t1 = System.nanoTime();
                    c = findClass(name);

                    // this is the defining class loader; record the stats
                    sun.misc.PerfCounter.getParentDelegationTime().addTime(t1 - t0);
                    sun.misc.PerfCounter.getFindClassTime().addElapsedTimeFrom(t1);
                    sun.misc.PerfCounter.getFindClasses().increment();
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
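The effect of this method can be observed with a small custom loader. The sketch below is only an illustration: CountingLoader and tryLoad are made-up names, and the loader defines no classes of its own. It overrides findClass just to count how often delegation falls through to it. A request for java.lang.String is answered by a parent, so findClass is never reached; a request for a class that nobody can find reaches findClass exactly once.

```java
public class DelegationDemo {
    /** A loader with no classes of its own; it counts how often delegation reaches it. */
    static class CountingLoader extends ClassLoader {
        int findClassCalls = 0;

        CountingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findClassCalls++; // only reached after every parent has failed
            throw new ClassNotFoundException(name);
        }
    }

    /** Returns the loaded class, or null if no loader in the chain can find it. */
    static Class<?> tryLoad(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        CountingLoader loader = new CountingLoader(ClassLoader.getSystemClassLoader());

        Class<?> s = tryLoad(loader, "java.lang.String");
        System.out.println("same String class: " + (s == String.class));
        System.out.println("findClass calls:   " + loader.findClassCalls);

        tryLoad(loader, "no.such.Type"); // hypothetical class name that does not exist
        System.out.println("findClass calls after miss: " + loader.findClassCalls);
    }
}
```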

5. Breaking the parent delegation model

In some cases the parent class loader cannot load the class it needs because of the limits on what it is allowed to load, so the parent loader has to delegate to a child class loader to load the corresponding bytecode file.

For example, the database driver interface Driver is defined in the JDK, but its implementations are provided by the individual database vendors. This creates a problem: DriverManager, which is loaded by the bootstrap class loader (Bootstrap ClassLoader), must load the classes that implement the Driver interface in order to manage them uniformly, but the Bootstrap ClassLoader can only load files under jre/lib and cannot load the vendors' Driver implementation classes (those are loaded by the Application ClassLoader). The Bootstrap ClassLoader therefore has to entrust a child class loader with loading the Driver implementations, which breaks the parent delegation model. In practice this is done through the thread context class loader, which by default is the Application ClassLoader.
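The lookup described above can be sketched with the standard ServiceLoader API, which searches for implementations through a loader the caller chooses. The example below is a simplified illustration, not a copy of the DriverManager source; ContextLoaderLookup and driverNames are made-up names. It passes the thread context class loader explicitly, so driver jars on the classpath become visible even though the Driver interface itself is loaded higher up in the hierarchy. With no JDBC driver on the classpath the list is simply empty.

```java
import java.sql.Driver;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ContextLoaderLookup {
    /** Finds Driver implementations that the given loader can see via META-INF/services. */
    static List<String> driverNames(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (Driver d : ServiceLoader.load(Driver.class, loader)) {
            names.add(d.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        // On JDK 1.8 DriverManager's loader prints as null (bootstrap);
        // newer JDKs may report a different loader.
        System.out.println("DriverManager loader: " + java.sql.DriverManager.class.getClassLoader());
        System.out.println("context loader:       " + context);
        System.out.println("drivers found:        " + driverNames(context));
    }
}
```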

Three. The class life cycle

The life cycle of a class in Java and the JVM is roughly divided into five stages:

1. Loading stage: obtain the binary byte stream of the class, convert its static storage structure into the runtime data structure of the method area, and generate the corresponding java.lang.Class object, which serves as the entry point for accessing the class's data.

2. Linking stage: this stage consists of three smaller phases: verification, preparation and resolution

(1) Verification: ensure that the bytecode file meets the requirements of the virtual machine specification. This includes file format verification, metadata verification, bytecode verification and symbolic reference verification.

(2) Preparation: allocate memory for the class's static variables and set them to the JVM default (zero) values. Memory for non-static (instance) variables is not allocated at this stage.

(3) Resolution: convert the symbolic references in the constant pool into direct references

3. Initialization stage: the initialization work that must be completed before the class can be used

The following explanation is quoted from another blogger; I think it explains the point very well.

In Java code, a static field can be initialized either by assigning it directly in its declaration or by assigning it in a static code block.

Apart from constants declared static final, all direct assignments and all code in static code blocks are placed by the Java compiler into a single method named <clinit>. Initialization is the process of assigning values to the fields marked as constant values and executing the <clinit> method. The Java virtual machine uses locking to ensure that a class's <clinit> method is executed only once.

Under what conditions will class initialization occur?

(1) When the virtual machine starts, it initializes the main class specified by the user (the class that contains the main method);

(2) When encountering the new instruction for creating a new instance of the target class, initialize the target class of the new instruction;

(3) When an instruction to call a static method is encountered, initialize the class where the static method is located;

(4) The initialization of the subclass will trigger the initialization of the parent class;

(5) If an interface defines a default method, the initialization of the class that directly implements or indirectly implements the interface will trigger the initialization of the interface;

(6) When using reflection API to make a reflection call to a class, initialize this class;

(7) When the MethodHandle instance is called for the first time, initialize the class of the method pointed to by the MethodHandle.

4. Use stage: the class is used in the JVM, for example by creating and using its objects

5. Unloading stage: the class is unloaded from the JVM. Under what conditions can the JVM unload a class? All of the following must hold:

(1) The class loader that loaded the class has been garbage collected

(2) All instances of the class have been garbage collected

(3) The java.lang.Class object of the class is no longer referenced anywhere
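The initialization stage (stage 3 above) can be observed directly. The sketch below uses made-up names (InitOrderDemo, Config, LIMIT) and a shared log to record when code runs. Reading a static final constant of primitive type does not initialize the class, because the compiler treats it as a constant value. Calling a static method does trigger initialization, and the field assignments and the static block then run in the order they appear in the source, as they would inside <clinit>.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrderDemo {
    /** Records initialization events in the order they happen. */
    static final List<String> LOG = new ArrayList<>();

    static class Config {
        // static final primitive with a constant initializer: handled as a constant
        // value, so reading it elsewhere does not run Config's <clinit>.
        static final int LIMIT = 10;

        // Everything below is compiled into <clinit>, in textual order.
        static String first = record("field: first");
        static {
            record("static block");
        }
        static String second = record("field: second");

        static String record(String event) {
            LOG.add(event);
            return event;
        }

        static void touch() {
            // calling any static method of Config triggers its initialization
        }
    }

    public static void main(String[] args) {
        int limit = Config.LIMIT; // constant read: Config is not initialized yet
        System.out.println("limit = " + limit + ", log after constant read: " + LOG);
        Config.touch();           // static method call: <clinit> runs now
        System.out.println("log after touch(): " + LOG);
    }
}
```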

Four. The JVM memory model

1. What is the JVM memory model?

The following is an architecture diagram of the JVM memory model. The individual areas were covered in a previous article, so they are not discussed one by one here; this section mainly explains the heap.

[Figure: JVM memory model architecture]

Before JDK 1.8, the heap was usually described as three parts: the young generation, the old generation and the permanent generation. Starting with JDK 1.8, the permanent generation has been removed and the Metaspace area has been added. This article covers JDK 1.8.

In JDK 1.8, memory can logically be viewed as three parts:

(1) Young generation: the Eden area, the S0 area (also called the From area) and the S1 area (also called the To area)

(2) Old generation

(3) Metaspace (it takes over the role of the permanent generation, but is allocated in native memory rather than inside the Java heap)

2. How large are the young generation and the old generation?

With the commonly recommended defaults, the young generation takes one third of the heap (Eden:S0:S1 = 8:1:1) and the old generation takes two thirds. In HotSpot these ratios correspond to -XX:NewRatio=2 and -XX:SurvivorRatio=8. The resulting layout is:

[Figure: memory allocation of the young and old generations]
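The ratios can be turned into concrete sizes with a little arithmetic. The following sketch assumes the simple proportional model described above; HeapLayout and split are made-up names, and the 3000 MB heap is an arbitrary example figure. A real JVM aligns region sizes and may resize them adaptively, so actual numbers can differ.

```java
public class HeapLayout {
    /** Region sizes, in the same unit as the heap size that was passed in. */
    static final class Sizes {
        final long young, old, eden, survivor;

        Sizes(long young, long old, long eden, long survivor) {
            this.young = young;
            this.old = old;
            this.eden = eden;
            this.survivor = survivor;
        }
    }

    /**
     * newRatio is old/young (as in -XX:NewRatio);
     * survivorRatio is eden/one survivor area (as in -XX:SurvivorRatio).
     */
    static Sizes split(long heap, int newRatio, int survivorRatio) {
        long young = heap / (newRatio + 1);          // 1 part young, newRatio parts old
        long old = heap - young;
        long survivor = young / (survivorRatio + 2); // eden + two survivor areas
        long eden = young - 2 * survivor;
        return new Sizes(young, old, eden, survivor);
    }

    public static void main(String[] args) {
        Sizes s = split(3000, 2, 8);
        System.out.printf("young=%d old=%d eden=%d s0=%d s1=%d (MB)%n",
                s.young, s.old, s.eden, s.survivor, s.survivor);
    }
}
```

For a 3000 MB heap this gives 1000 MB for the young generation (800 MB Eden plus two 100 MB survivor areas) and 2000 MB for the old generation.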

3. How does garbage collection work?

New objects are allocated in the Eden area first. When Eden fills up, a Minor GC is triggered. It does two things: it reclaims the objects that are no longer in use, and it copies the surviving objects from Eden and from the current From survivor area into the To survivor area. The two survivor areas then swap roles (S0 becomes S1 and S1 becomes S0), and Eden is empty again. The same steps repeat every time Eden fills up. If the survivors of a Minor GC do not fit into the survivor area, or have already survived enough collections, they are moved into the old generation. If the old generation does not have enough free space for the objects being moved into it, a Major GC of the old generation is triggered, and in practice this usually means a Full GC of the whole heap. Full GC is very expensive, so it deserves attention during JVM tuning.
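The flow can be sketched as a toy simulation. Everything here is an assumption made for illustration: MinorGcSketch and Obj are made-up names, liveness is a flag set by hand instead of being found by tracing, and the survivor capacity and tenuring threshold are deliberately tiny (HotSpot exposes the real threshold as -XX:MaxTenuringThreshold). The sketch only shows how objects move between Eden, the two survivor areas and the old generation.

```java
import java.util.ArrayList;
import java.util.List;

public class MinorGcSketch {
    /** A toy object: an id, a hand-set liveness flag and a survival age. */
    static final class Obj {
        final String id;
        boolean reachable = true;
        int age = 0;

        Obj(String id) {
            this.id = id;
        }
    }

    static final int SURVIVOR_CAPACITY = 2;  // assumed, tiny on purpose
    static final int TENURING_THRESHOLD = 2; // assumed, tiny on purpose

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    Obj allocate(String id) {
        Obj o = new Obj(id);
        eden.add(o);
        return o;
    }

    /** One minor GC: copy live young objects into "to", promote the rest, swap survivors. */
    void minorGc() {
        List<Obj> young = new ArrayList<>(eden);
        young.addAll(from);
        for (Obj o : young) {
            if (!o.reachable) {
                continue; // garbage is simply not copied
            }
            o.age++;
            if (o.age > TENURING_THRESHOLD || to.size() >= SURVIVOR_CAPACITY) {
                old.add(o); // too old, or the survivor area is full: promote
            } else {
                to.add(o);
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from; // the two survivor areas swap roles
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        MinorGcSketch heap = new MinorGcSketch();
        heap.allocate("a");
        heap.allocate("b").reachable = false;
        heap.allocate("c");
        heap.minorGc();
        System.out.println("survivors: " + heap.from.size() + ", old: " + heap.old.size());
    }
}
```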

The following is a GC chart I captured in the production environment with the monitoring tool VisualVM:

[Figure: VisualVM GC chart]

4. What are the garbage collection algorithms?

(1) Mark-sweep algorithm

The algorithm has two stages, marking and sweeping. First all objects to be reclaimed are marked, and then the marked objects are reclaimed. The algorithm is inefficient and prone to memory fragmentation.

a. Low efficiency: memory has to be traversed twice, once to mark and once to reclaim the marked objects

b. Fragmentation: the freed memory is scattered in non-contiguous pieces. When a large object cannot find a contiguous block, another collection (possibly a Full GC) is triggered early

[Figure: the mark-sweep algorithm before and after collection]
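The two passes can be sketched on a toy heap. This is a simplified illustration with made-up names (MarkSweepSketch, Node), not how a real JVM lays out memory. It uses the usual formulation in which the mark pass flags the objects reachable from the roots and the sweep pass frees everything left unflagged. The holes left in the array are the fragmentation mentioned in point b.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class MarkSweepSketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;

        Node(String name) {
            this.name = name;
        }
    }

    /** Pass 1: mark everything reachable from the roots. */
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) {
                continue;
            }
            n.marked = true;
            pending.addAll(n.refs);
        }
    }

    /** Pass 2: walk the whole heap, free unmarked slots in place, reset the marks. */
    static int sweep(Node[] heap) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) {
                continue;
            }
            if (heap[i].marked) {
                heap[i].marked = false; // ready for the next cycle
            } else {
                heap[i] = null;         // leaves a hole where the object was
                freed++;
            }
        }
        return freed;
    }

    public static void main(String[] args) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        a.refs.add(c);                  // A -> C is reachable, B is not
        Node[] heap = {a, b, c};
        mark(Arrays.asList(a));
        System.out.println("freed: " + sweep(heap));
    }
}
```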

(2) Mark-copy algorithm

This algorithm addresses the low efficiency of mark-sweep and most of its memory fragmentation. It divides memory into two blocks of equal size and uses only one of them at a time. When the block in use needs to be collected, only its surviving objects are copied to the other block, and the used block is then cleaned up in one go. The cycle then repeats with the roles of the two blocks swapped.

[Figure: the mark-copy algorithm before and after collection]
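A minimal sketch of the copy step, under simplifying assumptions: the two blocks are plain arrays, and liveness is supplied as a precomputed flag array instead of being discovered by tracing from roots. CopyingSketch and copyLive are made-up names. The survivors end up packed back to back in the other block, which is why copying leaves no fragmentation.

```java
import java.util.Arrays;

public class CopyingSketch {
    /**
     * Copies the live entries of fromSpace into toSpace back to back and
     * releases the whole fromSpace. Returns the next free index in toSpace.
     */
    static int copyLive(String[] fromSpace, boolean[] live, String[] toSpace) {
        int free = 0;
        for (int i = 0; i < fromSpace.length; i++) {
            if (fromSpace[i] != null && live[i]) {
                toSpace[free++] = fromSpace[i];
            }
            fromSpace[i] = null; // the used block is cleared as a whole
        }
        return free;
    }

    public static void main(String[] args) {
        String[] from = {"A", "B", "C", "D"};
        boolean[] live = {true, false, true, false};
        String[] to = new String[from.length];
        int free = copyLive(from, live, to);
        System.out.println(Arrays.toString(to) + ", next free index: " + free);
    }
}
```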

However, most objects in the young generation are very short-lived: by a commonly cited estimate about 98% of them are reclaimed quickly, and very few survive. There is therefore no need to split memory 1:1. It is split 8:1:1 instead, and the roughly 2% of objects that survive are placed in S0 (the From area).

[Figure: division according to Eden:S0:S1 = 8:1:1]

(3) Mark-compact algorithm

The algorithm has two stages, marking and compacting. The marking stage is the same as in mark-sweep. The following step, however, does not reclaim the dead objects in place: all surviving objects are moved toward one end of the memory, and everything beyond the end boundary is then cleaned up directly. Because objects in the old generation live longer, this algorithm suits the old generation.

[Figure: the mark-compact algorithm before and after collection]
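The compacting step can be sketched on an array. As a simplification, liveness is again given as a precomputed flag array, and CompactSketch and compact are made-up names. Unlike copying, the survivors are slid toward one end of the same block, so no second block of equal size is needed.

```java
import java.util.Arrays;

public class CompactSketch {
    /**
     * Slides live entries toward index 0, keeping their order, then clears
     * everything past the boundary. Returns the first free index.
     * The live array describes the heap before compaction and is not updated.
     */
    static int compact(String[] heap, boolean[] live) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) {
                heap[free++] = heap[i]; // free <= i, so nothing live is overwritten
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = null;
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E"};
        boolean[] live = {true, false, true, false, true};
        int free = compact(heap, live);
        System.out.println(Arrays.toString(heap) + ", first free index: " + free);
    }
}
```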

(4) Generational collection algorithm

This is the approach current JVMs use. It applies the generational idea: the heap is split into generations, and each generation is collected with the algorithm that suits it. The model is as follows:

[Figure: generational collection model]

5. What are the common GC collectors?

(1) SerialGC

SerialGC, also called the serial collector, is the most basic GC collector and is mainly suited to single-core CPUs. The young generation uses the copying algorithm and the old generation uses the mark-compact algorithm. The application has to be paused while it runs, so it causes stop-the-world (STW) pauses. The JVM flag is -XX:+UseSerialGC.

(2) ParallelGC

ParallelGC builds on SerialGC. It replaces the single collection thread of SerialGC with multiple threads working in parallel, but it still causes STW pauses. Key JVM flags:

a. -XX:+UseParNewGC: the young generation is collected in parallel (copying algorithm) and the old generation serially (mark-compact)

b. -XX:+UseParallelOldGC: the old generation is collected in parallel as well

(3) CMS GC

CMS GC is an old-generation collector. It uses the mark-sweep algorithm and does most of its work concurrently with the application, so its STW pauses are short (they are limited to the initial-mark and remark phases) rather than absent. The JVM flag:

-XX:+UseConcMarkSweepGC: the old generation uses the CMS collector

(4) Garbage First

Garbage First (G1) is a server-oriented garbage collector. It aims for short pauses while still achieving high throughput, and it suits servers with multi-core CPUs and large memory. It is also the default garbage collector since JDK 9.
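To check which collectors a JVM actually selected for a given set of flags, the standard java.lang.management API can be queried from inside the program. The sketch below (ActiveCollectors is a made-up class name) lists each active collector with its collection count and accumulated time. The reported names depend on the JVM and on the flags it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ActiveCollectors {
    /** Names of the collectors the running JVM reports, young generation and old generation. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Running it once with -XX:+UseSerialGC and once with another collector flag is a quick way to confirm that a flag took effect.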

Five. Summary

This article analyzed the JVM memory model, focusing on the relationship between the JDK, JRE and JVM, the JVM class loaders, the division of JVM heap memory, the GC algorithms and the GC collectors. It leans toward theory. Because space is limited, the editor has also compiled more than 400 pages of study notes on JVM performance tuning in practice, shared through the public account "Kylin changes bugs". This article does not cover how these techniques are applied in actual JVM tuning; that will be shared in the next article.

Origin blog.51cto.com/14994509/2596518