JVM series, Section 1: Introduction to the JVM, the runtime data area, and the memory generation model

1. What is the JVM?

JVM stands for Java Virtual Machine. The JVM is a specification for a computing device: an abstract computer that is implemented by simulating the functions of a computer on a real machine.

Because the JVM is a specification, it has many implementations. Oracle/Sun JDK and OpenJDK use the same JVM, HotSpot VM; IBM developed J9, a highly modular JVM; and there are many others. When people discuss questions such as "How does Java perform?", "How many kinds of GC does Java have?", or "How do I tune the JVM?", they usually mean HotSpot VM by default, which makes it the de facto mainstream implementation. In the rest of this article, "JVM" refers to HotSpot.

The Java virtual machine is essentially a program. When started from the command line, it begins executing the instructions stored in bytecode files. The JVM has two important functions. First, translation to machine code: the JVM makes "compile once, run anywhere" possible because each platform has its own JVM. HotSpot, for example, ships in Windows and Linux versions, and each platform uses its own version. Programmers only need to care about their own code and do not need to worry about its portability, because the JVMs of the different platforms hide the differences between systems. Second, memory management: to use an object, a programmer only needs to create it with new, without caring how the memory is allocated, how long the object lives, or when it is reclaimed.

[Figure: JVM partition]

The Java virtual machine is mainly divided into five modules: the class loader, the runtime data area, the execution engine, the native method interface, and the garbage collection module. The rest of this article focuses on two of them: the runtime data area and the garbage collector.

2. Runtime data area

Before a thread runs, the different parts of the code it needs to execute are placed in different areas of the runtime data area, and the thread fetches data from these locations while it runs.

2.1 Program counter

The program counter stores the address (in effect, a line-number indicator) of the bytecode instruction that the current thread is executing. Why record it? A thread is the smallest unit of execution in Java, and when the CPU runs multiple threads it has to switch between them. When the CPU switches away from a thread, this position must be recorded so that when the CPU switches back, the thread knows where to resume execution. Each thread has its own program counter.
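As a rough illustration of why a per-thread position matters, the sketch below runs two threads that each execute a simple loop. The class name PcDemo and the method sumTo are made up for this example. However the CPU interleaves the two threads, each one resumes exactly where it was paused, so both sums come out right.

```java
class PcDemo {
    // Sums 1..n in a loop; the thread may be paused and resumed at any instruction.
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] results = new long[2];
        Thread t1 = new Thread(() -> results[0] = sumTo(100_000));
        Thread t2 = new Thread(() -> results[1] = sumTo(200_000));
        t1.start();
        t2.start();
        t1.join(); // wait for both threads before reading their results
        t2.join();
        System.out.println(results[0]); // prints 5000050000
        System.out.println(results[1]); // prints 20000100000
    }
}
```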

2.2 Virtual machine stack

The virtual machine stack stores the data, instructions, and return addresses required by the methods that the current thread is running.

Here is a simple code example:

package com.wuxiaolong.jvm;

/**
 * Description:
 *
 * @author 诸葛小猿
 * @date 2020-09-06
 */
public class TestJVM {

    public static final int AGE = 30;

    public static void test() {
        int a = 1;
        int b = 2;
        int c = a + b;
        Object objc = new Object();
    }
}

Find the TestJVM.class file compiled from TestJVM.java above, use the javap command to view each bytecode instruction, and save the output to the file TestJVM.txt.

$ javap -c -v ./TestJVM.class > TestJVM.txt

Contents of TestJVM.txt:

Classfile /C:/Users/WuXiaoLong/Desktop/java-summary/target/classes/com/wuxiaolong/jvm/TestJVM.class
  Last modified 2020-9-6; size 497 bytes
  MD5 checksum e2bee1c0136645a123ea37b4c6aba4a2
  Compiled from "TestJVM.java"
public class com.wuxiaolong.jvm.TestJVM
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
// Description of the constant pool
Constant pool:
   #1 = Methodref          #2.#23         // java/lang/Object."<init>":()V
   #2 = Class              #24            // java/lang/Object
   #3 = Class              #25            // com/wuxiaolong/jvm/TestJVM
   #4 = Utf8               AGE
   #5 = Utf8               I
   #6 = Utf8               ConstantValue
   #7 = Integer            30
   #8 = Utf8               <init>
   #9 = Utf8               ()V
  #10 = Utf8               Code
  #11 = Utf8               LineNumberTable
  #12 = Utf8               LocalVariableTable
  #13 = Utf8               this
  #14 = Utf8               Lcom/wuxiaolong/jvm/TestJVM;
  #15 = Utf8               test
  #16 = Utf8               a
  #17 = Utf8               b
  #18 = Utf8               c
  #19 = Utf8               objc
  #20 = Utf8               Ljava/lang/Object;
  #21 = Utf8               SourceFile
  #22 = Utf8               TestJVM.java
  #23 = NameAndType        #8:#9          // "<init>":()V
  #24 = Utf8               java/lang/Object
  #25 = Utf8               com/wuxiaolong/jvm/TestJVM
{
  // Description of the static constant AGE
  public static final int AGE;
    descriptor: I
    flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL
    ConstantValue: int 30
  // Description of the TestJVM class (its default constructor)
  public com.wuxiaolong.jvm.TestJVM();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 9: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   Lcom/wuxiaolong/jvm/TestJVM;
  // Description of the test method
  public static void test();
    descriptor: ()V
    flags: ACC_PUBLIC, ACC_STATIC
    Code: // Instructions of the test method
      stack=2, locals=4, args_size=0
         0: iconst_1
         1: istore_0
         2: iconst_2
         3: istore_1
         4: iload_0
         5: iload_1
         6: iadd
         7: istore_2
         8: new           #2                  // class java/lang/Object // creates an object: allocates memory on the heap and pushes the address of that memory onto the top of the stack
        11: dup
        12: invokespecial #1                  // Method java/lang/Object."<init>":()V  // calls the constructor (the instance initialization method)
        15: astore_3
        16: return
      LineNumberTable:  // Line numbers of the test method in the Java source
        line 14: 0
        line 15: 2
        line 16: 4
        line 17: 8
        line 18: 16
      LocalVariableTable: // Local variable table of the test method
        Start  Length  Slot  Name   Signature
            2      15     0     a   I
            4      13     1     b   I
            8       9     2     c   I
           16       1     3  objc   Ljava/lang/Object;
}
SourceFile: "TestJVM.java"

The listing above describes every instruction of the TestJVM class.

Each thread in the JVM has its own runtime stack. For every method that a thread invokes, the JVM sets aside a region of that stack called a stack frame. Each stack frame is divided into several parts, such as the method's local variable table, operand stack, dynamic link, and method exit (return address).

[Figure: Virtual machine stack]

In the stack frame of the test method, the local variable table holds the method's four local variables, a, b, c, and objc, which correspond to lines 81-86 of TestJVM.txt.

The operand stack holds the operands that instructions work on as values move to and from the local variables. For example, lines 62-74 of TestJVM.txt list the instructions of the test method. The first instruction, iconst_1, pushes the int constant 1 onto the operand stack; it corresponds to the operand 1 in the statement int a = 1 (at this point the stack holds only the number 1). The second instruction, istore_0, pops the int value on top of the stack and stores it into local variable 0. Which variable is that? Line 83 of TestJVM.txt shows that local variable 0 is a, so operand 1 leaves the stack and is stored in a. Together, iconst_1 and istore_0 are the instructions executed for int a = 1. The remaining instructions are not analyzed in detail here; the meaning of each one is easy to look up online.
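The walk-through above can be mimicked by hand. The following is only a toy sketch, not how HotSpot is implemented: it replays bytecode offsets 0 to 7 of the test method using an ordinary Deque as the operand stack and an int array as the local variable table. The class name OperandStackDemo and the method run are invented for the illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackDemo {
    // Hand-simulates bytecode offsets 0-7 of test(): a = 1; b = 2; c = a + b.
    static int[] run() {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack (never deeper than 2 here)
        int[] locals = new int[3];                 // slots 0-2 of the local variable table

        stack.push(1);                          // iconst_1: push int constant 1
        locals[0] = stack.pop();                // istore_0: pop into slot 0 (a)
        stack.push(2);                          // iconst_2: push int constant 2
        locals[1] = stack.pop();                // istore_1: pop into slot 1 (b)
        stack.push(locals[0]);                  // iload_0: push a
        stack.push(locals[1]);                  // iload_1: push b
        stack.push(stack.pop() + stack.pop());  // iadd: pop two ints, push their sum
        locals[2] = stack.pop();                // istore_2: pop into slot 2 (c)
        return locals;
    }

    public static void main(String[] args) {
        int[] locals = run();
        System.out.println("a=" + locals[0] + " b=" + locals[1] + " c=" + locals[2]);
        // prints a=1 b=2 c=3
    }
}
```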

Pay attention to line 17 of TestJVM.java, Object objc = new Object();, in the test method. It creates an object, which differs from the primitive local variables a, b, and c above. This statement compiles to four instructions: new, dup, invokespecial, and astore_3. Of these, new creates the object: it allocates memory on the heap and pushes the address of that memory onto the top of the stack. Objects are stored in the heap; the stack stores only the address of the object in the heap.
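One visible consequence of "object in the heap, address on the stack" is that two local variables can refer to the same object. A small sketch, with the hypothetical class name ReferenceDemo:

```java
class ReferenceDemo {
    static int[] demo() {
        int[] first = new int[] {1, 2, 3}; // the array lives on the heap; 'first' holds its reference
        int[] second = first;              // copies the reference, not the array
        second[0] = 99;                    // a change made through one reference...
        return first;                      // ...is visible through the other
    }

    public static void main(String[] args) {
        System.out.println(demo()[0]); // prints 99
    }
}
```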

Dynamic linking applies when the called method or object cannot be determined at compile time. The symbolic reference to the called method can then only be converted into a direct reference while the program runs; because this conversion happens dynamically, it is called dynamic linking. For example, the #2 and #1 in lines 70 and 72 of TestJVM.txt correspond to constant pool entries in lines 12 and 11 of TestJVM.txt, and the job of dynamic linking is to convert these symbolic references (#) into direct references to the called methods.

To give a simple example: when a Controller calls the Service layer, a Service is usually injected through @Autowired, and the injected type is usually the Service interface rather than an implementation class. A Controller method calls the concrete Service implementation through a method of the interface. If the interface has several implementations, the program does not know at compile time which implementation class will be used. The bytecode therefore contains only a symbolic reference (#) in the constant pool, which is converted into a direct reference to the called method at run time.

The bytecode listing also shows that the name "constant pool" is not entirely accurate: besides constants, the Constant pool also holds symbolic references (including descriptors for classes, methods, fields, and so on).
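The Service scenario can be reduced to plain Java without Spring. In this sketch the interface GreetingService, its two implementations, and the class DynamicLinkDemo are all invented for illustration. The call site in call() refers only to the interface method; which implementation actually runs is decided while the program is running.

```java
interface GreetingService {
    String greet(String name);
}

class EnglishGreetingService implements GreetingService {
    public String greet(String name) { return "Hello, " + name; }
}

class FrenchGreetingService implements GreetingService {
    public String greet(String name) { return "Bonjour, " + name; }
}

class DynamicLinkDemo {
    // This call site names only GreetingService.greet;
    // the concrete method is chosen at run time from the object passed in.
    static String call(GreetingService service) {
        return service.greet("JVM");
    }

    public static void main(String[] args) {
        // The implementation depends on a run-time condition the compiler cannot know.
        GreetingService service = args.length > 0
                ? new FrenchGreetingService()
                : new EnglishGreetingService();
        System.out.println(call(service));
    }
}
```

Running javap -c -v on DynamicLinkDemo.class should show the call as an invokeinterface instruction pointing at a constant pool entry for GreetingService.greet.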

Method exit (return address): when a method finishes, its frame is popped from the stack, and execution has to continue somewhere after the pop. A method that completes normally leaves the stack differently from one that terminates with an exception.
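The two kinds of exit can be observed from Java code. In the sketch below (MethodExitDemo, parse, and describe are hypothetical names), the same method leaves the stack either by returning a value to its caller or by throwing an exception that transfers control to a handler.

```java
class MethodExitDemo {
    static int parse(String text) {
        // Normal exit: returns a value. Invalid input: exits by throwing an exception.
        return Integer.parseInt(text);
    }

    static String describe(String text) {
        try {
            int value = parse(text);         // normal completion: execution resumes right here
            return "returned " + value;
        } catch (NumberFormatException e) {  // abrupt completion: control jumps to this handler
            return "threw NumberFormatException";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("42"));  // prints returned 42
        System.out.println(describe("abc")); // prints threw NumberFormatException
    }
}
```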

Note that when a method calls itself recursively, every call gets its own stack frame. If the stack depth requested by the thread exceeds the depth allowed by the virtual machine, a StackOverflowError is thrown. Where the stack can expand dynamically, an OutOfMemoryError is still reported if a larger space cannot be obtained during expansion.
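A quick way to see this is to recurse without a stopping condition and catch the error. This is a minimal sketch with made-up names (StackDepthDemo, recurse, measureDepth); the depth it reports varies with the JVM, the platform, and the thread stack size.

```java
class StackDepthDemo {
    private static int depth = 0;

    private static void recurse() {
        depth++;
        recurse(); // every call pushes one more stack frame
    }

    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The thread's stack is exhausted; the frames are unwound up to this handler.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed after " + measureDepth() + " calls");
    }
}
```

On HotSpot, the -Xss option sets the thread stack size, so running the sketch with a larger value should report a greater depth.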

The size of a stack frame is determined at compile time and is not affected by runtime data. The memory required by the local variable table is therefore fixed during compilation (the locals=4 entry in the javap output above records it for the test method). When a method is entered, the amount of local variable space it needs in its frame is fully determined, and the size of the local variable table does not change while the method runs.

2.3 Native method stack

The native method stack is similar to the virtual machine stack, except that it serves the execution of native methods. What is a native method? It is a method declared with the native keyword. It has no implementation in the JDK's Java code; the actual implementation lives in the JVM's own code. The source code of the various HotSpot versions is available online and is written in C and C++. An earlier article in this series used the HotSpot source code when analyzing CAS.
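Whether a JDK method is native can be checked with reflection. The sketch below uses the standard java.lang.reflect API; the class name NativeMethodDemo and the helper isNative are invented for this example, and the results noted in the comments assume a HotSpot-based JDK.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodDemo {
    // Returns true if the public no-argument method is declared with the native keyword.
    static boolean isNative(Class<?> type, String methodName) {
        try {
            Method method = type.getMethod(methodName);
            return Modifier.isNative(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isNative(Object.class, "hashCode")); // expected true on HotSpot-based JDKs
        System.out.println(isNative(Object.class, "toString")); // false: implemented in Java
    }
}
```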

2.4 Method area

The method area stores class information, constants, static variables, code compiled by the JIT compiler, and other data derived from class bytecode.

Lines 10-35 of TestJVM.txt are the constant pool; its contents are placed in the constant pool in the method area.

[Figure: Method area]

Here is something to think about: why not put static variables and constants in the heap with the objects? My take is that constants and static variables generally do not change, so one copy is enough. If the same data were stored with every new object in the heap, space would be wasted.
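The "one copy" point can be seen in a few lines. In this sketch (the class StaticFieldDemo and its fields are hypothetical), the static field exists once for the class, while every object carries its own instance field.

```java
class StaticFieldDemo {
    static int created = 0; // one copy, shared by all instances of the class
    final int id;           // one copy per object, stored with the object

    StaticFieldDemo() {
        created++;
        id = created;
    }

    public static void main(String[] args) {
        StaticFieldDemo first = new StaticFieldDemo();
        StaticFieldDemo second = new StaticFieldDemo();
        System.out.println(first.id + " " + second.id); // prints 1 2
        System.out.println(StaticFieldDemo.created);    // prints 2
    }
}
```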

2.5 Heap

For most applications, the heap is the largest memory area managed by the Java virtual machine. Because the objects in the heap are shared by all threads, access from multiple threads requires a synchronization mechanism. The heap therefore deserves particular attention.
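As a small illustration of why shared heap objects need synchronization, the sketch below lets two threads increment one counter object. The names SharedCounterDemo and runTwoThreads are invented for this example. With the synchronized keyword the result is exact; without it, increments from the two threads could be lost.

```java
class SharedCounterDemo {
    private int count = 0; // lives in a heap object that both threads can reach

    synchronized void increment() { count++; }
    synchronized int get() { return count; }

    static int runTwoThreads(int perThread) {
        SharedCounterDemo counter = new SharedCounterDemo();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(100_000)); // prints 200000
    }
}
```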

Objects created with new are stored in the heap, and the references held on the stack point to the memory addresses of those objects in the heap. In principle, all object instances and arrays are allocated on the heap. With the development of JIT compilers and the maturing of escape analysis (see "Talking about HotSpot escape analysis"), this statement is no longer absolute, but it still holds in most cases.

3. JVM memory generation model

Before JDK 1.8, JVM memory was divided into three major blocks: the new generation, the old generation, and the permanent generation. The first two are in the heap, and the last is in the method area. In JDK 1.8 and later, the permanent generation has been removed and replaced by Metaspace.

[Figure: JVM memory model]

Why divide memory into generations? Because different objects have different life cycles: objects with different life cycles are placed in different generations, and each generation is collected with a garbage collection algorithm suited to it.
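The memory areas of a running JVM can be listed with the standard java.lang.management API. This is a minimal sketch; the class name MemoryPoolsDemo and the method poolNames are made up. The pool names it prints depend on the JDK version and on the garbage collector in use, so the exact output is not shown here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolsDemo {
    // Collects the name and type (heap or non-heap) of every memory pool in this JVM.
    static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(pool.getName() + " (" + pool.getType() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        poolNames().forEach(System.out::println);
    }
}
```

On a JDK 1.8 or later HotSpot, the list typically contains pools for Eden, a survivor space, and the old generation, plus a non-heap Metaspace pool.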

3.1 New generation

Generally speaking, newly created objects are placed in the new generation. Objects there usually have a short life cycle, and more than 98% of them can be reclaimed in a single collection (a Minor GC).

The new generation's memory is divided into three areas: the Eden area, the s0 area, and the s1 area. Why three?

The main reason is that the new generation is collected with a copying algorithm.

[Figure: Minor GC]

S0 and S1 are two areas of the same size and the same function, but only one of them holds objects at any time. When the Eden area fills up for the first time, the first Minor GC is triggered and the objects in Eden are collected. If an object b is still reachable after this GC, it is a surviving object and is placed in the s0 area. When Eden fills up for the second time, the second Minor GC is triggered, and this time both Eden and s0 are collected. If object b is still reachable and another object j is also reachable, both objects move into the s1 area. Surviving objects are thus copied back and forth between s0 and s1 on every GC; this is the copying algorithm. Each time an object survives a GC, its age increases by 1. Once its age reaches a fixed threshold (15 by default) after multiple GCs, the object is promoted to the old generation.
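The back-and-forth copying can be sketched as a toy model. This is not HotSpot's implementation: the class CopyingGcDemo, its lists, and the reachable flag are all invented for illustration, and reachability is simply set by hand. The threshold of 15 is the default quoted above.

```java
import java.util.ArrayList;
import java.util.List;

class CopyingGcDemo {
    static final int TENURING_THRESHOLD = 15; // default quoted in the text

    static class Obj {
        final String name;
        int age = 0;
        boolean reachable = true;
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor space that currently holds objects
    List<Obj> to = new ArrayList<>();   // empty survivor space, target of the next copy
    final List<Obj> old = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.reachable) continue;   // unreachable objects are simply not copied
            o.age++;
            if (o.age >= TENURING_THRESHOLD) {
                old.add(o);               // promoted to the old generation
            } else {
                to.add(o);                // copied into the empty survivor space
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;             // the two survivor spaces swap roles
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        CopyingGcDemo heap = new CopyingGcDemo();
        Obj b = heap.allocate("b");
        heap.allocate("garbage").reachable = false;
        heap.minorGc();                   // b survives and leaves Eden
        Obj j = heap.allocate("j");
        heap.minorGc();                   // b and j are copied to the other survivor space
        System.out.println(b.age + " " + j.age);  // prints 2 1
        for (int i = 0; i < 13; i++) heap.minorGc();
        System.out.println(heap.old.contains(b)); // prints true (b reached age 15)
        System.out.println(heap.old.contains(j)); // prints false (j is only 14)
    }
}
```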

The new generation is divided into the Eden area, s0, and s1 in the ratio 8:1:1. Why 8:1:1? In the copying algorithm only one of s0 and s1 is in use while the other stays empty, so not all of the new generation is usable storage. If s0 and s1 are too large, the usable memory shrinks and Eden becomes too small, so Minor GCs become more frequent. If s0 and s1 are too small, they fill up after only a few GCs, and young objects are pushed into the old generation prematurely. The 8:1:1 split can be seen as an application of the 80/20 rule.

3.2 Old generation

By default, the memory ratio between the new generation and the old generation is 1:2.

Objects that still exist after multiple collections in the new generation will enter the old generation.
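Putting the two ratios together (new generation to old generation 1:2, and Eden to s0 to s1 8:1:1) gives a quick way to estimate how a heap is carved up. The sketch below is plain arithmetic under those two ratios; the class GenerationSizesDemo and the 300 MB heap are assumptions chosen to keep the numbers round.

```java
class GenerationSizesDemo {
    // Returns {young, old, eden, s0, s1} in MB for the ratios quoted in the text.
    static long[] sizes(long heapMb) {
        long young = heapMb / 3;          // new : old = 1 : 2
        long old = heapMb - young;
        long survivor = young / 10;       // eden : s0 : s1 = 8 : 1 : 1
        long eden = young - 2 * survivor;
        return new long[] {young, old, eden, survivor, survivor};
    }

    public static void main(String[] args) {
        long[] s = sizes(300);
        System.out.println("young=" + s[0] + " old=" + s[1]
                + " eden=" + s[2] + " s0=" + s[3] + " s1=" + s[4]);
        // prints young=100 old=200 eden=80 s0=10 s1=10
    }
}
```

On HotSpot these two ratios correspond to the -XX:NewRatio and -XX:SurvivorRatio options. Since one survivor space is always empty, only 90 MB of the 100 MB new generation in this example can hold objects at any one time.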

3.3 Permanent generation and Metaspace

To avoid permanent generation overflow, JDK 1.8 and later remove the permanent generation and use Metaspace instead. Metaspace memory is allocated outside the generations and uses the machine's native memory directly. Metaspace can expand automatically, but bigger is not always better: the machine's total memory is fixed, so a growing Metaspace squeezes the memory available for other uses.

Follow-up articles will cover the Java memory model, the four reference types, GC algorithms, garbage collectors, JVM tuning, and more.

Follow the official account and enter "java-summary" to get the source code.

Finished, call it a day!

[Dissemination of knowledge, sharing of value] Thank you, friends, for your attention and support. I am [Zhuge Xiaoyuan], an Internet migrant worker struggling through uncertainty.


Origin blog.csdn.net/wuxiaolongah/article/details/109323029