[Java] JVM runtime data area and memory recovery mechanism

Foreword

The JVM (Java Virtual Machine) is the virtual machine on which programs written in Java, Kotlin, Groovy, and other languages run. It is one of the most important components of the Java technology system. Understanding how the JVM runs, its runtime data areas, and its memory reclamation mechanism helps us understand the Java language and solve problems such as memory leaks and out-of-memory errors.

1. Java technology system

In a broad sense, programming languages that run on the Java virtual machine, such as Java, Kotlin, Scala, Clojure, JRuby, and Groovy, and their related programs are all members of the Java technology system. In the traditional sense, the Java technology system includes the following components:

  • Java programming language
  • Java virtual machine implementations on various hardware platforms
  • Class file format
  • Java class library API
  • Third-party libraries from commercial organizations and the open source community

Advantages of Java technology:

  • Rigorous structure, object-oriented
  • Cross-platform: "write once, run anywhere"
  • Automatic memory management
  • Hot-spot code detection and runtime compilation optimization
  • A complete set of application programming interfaces and third-party class libraries

2. Java runtime data areas

While executing a Java program, the Java virtual machine divides the memory it manages into several data areas. Each area has its own purpose and its own creation and destruction time. Some areas exist for as long as the virtual machine process runs, while others are created and destroyed as user threads start and end. According to the Java Virtual Machine Specification, the memory managed by the Java virtual machine includes the following runtime data areas.

[Figure: JVM runtime data areas]

(1) Program counter

The program counter (Program Counter Register) is a small memory area that can be regarded as the line-number indicator of the bytecode being executed by the current thread. Multithreading in the Java virtual machine is implemented by switching threads in turn and allocating processor time to them, so at any given moment a processor (or a core, for a multi-core processor) executes instructions from only one thread. To resume at the correct position after a thread switch, each thread needs its own program counter. The counters of different threads do not affect each other and are stored independently. We call this kind of memory area "thread-private" memory.

(2) Java virtual machine stack

Like the program counter, the Java virtual machine stack is thread-private, and its life cycle is the same as the thread's. The virtual machine stack describes the thread memory model of Java method execution: each time a method is executed, the Java virtual machine creates a stack frame that stores the local variable table, the operand stack, dynamic linking information, the method return address, and other information. The process from a method being called to its completion corresponds to a stack frame being pushed onto and popped off the virtual machine stack.

The local variable table stores the primitive data types known at compile time (boolean, byte, char, short, int, float, long, double), object references (the reference type, which is not the object itself; it may be a pointer to the object's starting address, or it may point to a handle representing the object or to another location related to the object), and the returnAddress type (which points to the address of a bytecode instruction).

The specification defines two error conditions for the Java virtual machine stack:

  • If the stack depth requested by the thread is greater than the depth allowed by the virtual machine, a StackOverflowError is thrown;
  • If the stack can be expanded dynamically but not enough memory can be obtained for the expansion, an OutOfMemoryError is thrown.

The stack size of each thread is specified by the -Xss parameter.
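The first error condition is easy to provoke. The following is a minimal sketch, not taken from the original article: the class name StackDepthDemo and its methods are made up for illustration. It recurses without a base case and reports how many frames were pushed before the JVM threw StackOverflowError.

```java
// Sketch: unbounded recursion pushes stack frames until the thread's stack is exhausted.
// StackDepthDemo is a hypothetical name used only for this illustration.
class StackDepthDemo {
    private int depth = 0;

    // Each call pushes one more stack frame; there is deliberately no base case.
    private void recurse() {
        depth++;
        recurse();
    }

    // Returns the depth reached when the JVM threw StackOverflowError.
    static int measureDepth() {
        StackDepthDemo demo = new StackDepthDemo();
        try {
            demo.recurse();
        } catch (StackOverflowError e) {
            // Expected: the requested depth exceeded what the stack allows.
        }
        return demo.depth;
    }

    public static void main(String[] args) {
        System.out.println("stack depth reached: " + measureDepth());
    }
}
```

The number printed depends on the JVM, the platform, and the frame size. Running the same class with a different -Xss value (for example -Xss256k versus -Xss2m) should change it noticeably.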

(3) Native method stack

Native method stacks (Native Method Stacks) are very similar to the virtual machine stack. The difference is that the virtual machine stack serves the execution of Java methods (that is, bytecode), while native method stacks serve the native methods used by the virtual machine. The Java Virtual Machine Specification does not mandate the language, usage, or data structures of methods in the native method stack, so virtual machines are free to implement it as needed. Some Java virtual machines (such as the HotSpot virtual machine) even combine the native method stack and the virtual machine stack into one. Like the virtual machine stack, the native method stack throws StackOverflowError when the stack depth overflows and OutOfMemoryError when stack expansion fails.

(4) Java heap

For Java applications, the Java heap (Java Heap) is the largest piece of memory managed by the virtual machine. The Java heap is shared by all threads and is created when the virtual machine starts. The sole purpose of this memory area is to store object instances, and "almost" all object instances in the Java world are allocated here. The Java Virtual Machine Specification describes the Java heap as follows: "All object instances and arrays should be allocated on the heap."

The Java heap can be implemented with either a fixed size or an expandable size, but current mainstream Java virtual machines all implement it as expandable (controlled by the -Xmx and -Xms parameters). If the heap has no memory left to complete an instance allocation and cannot be expanded any further, the Java virtual machine throws an OutOfMemoryError.

Java heap memory size parameters:

-Xmx: maximum heap size

-Xms: initial heap size

-Xmn: size of the young generation in the Java heap

e.g.: java -Xmx128m -Xms64m -Xmn32m -Xss128k Test
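To see what the heap parameters resolve to in a running JVM, a program can query java.lang.Runtime. The sketch below is only an illustration (the class name HeapInfo and the helper toMiB are made up). Note that maxMemory() only roughly corresponds to -Xmx; the reported value can be slightly smaller, depending on the collector.

```java
// Sketch: print the heap limits the running JVM reports.
// HeapInfo and toMiB are hypothetical names used only for this illustration.
class HeapInfo {
    // Converts a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper bound the heap may grow to (roughly -Xmx).
        System.out.println("max heap:          " + toMiB(rt.maxMemory()) + " MiB");
        // Heap memory currently reserved by the JVM (starts near -Xms).
        System.out.println("committed heap:    " + toMiB(rt.totalMemory()) + " MiB");
        // Unused part of the committed heap.
        System.out.println("free in committed: " + toMiB(rt.freeMemory()) + " MiB");
    }
}
```

Running it as `java -Xmx128m -Xms64m HeapInfo` and then again with other values shows how the reported numbers follow the parameters.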

(5) Method area

The method area (Method Area), like the Java heap, is a memory area shared by all threads. It stores data such as type information loaded by the virtual machine, constants, static variables, and the code cache produced by the just-in-time compiler. On HotSpot before JDK 8 the method area was implemented by the "permanent generation", so the two names were often used interchangeably; since JDK 8 the permanent generation has been replaced by Metaspace, which lives in native memory. Garbage collection is indeed relatively rare in this area, but data that enters the method area does not exist "permanently" as the name permanent generation suggests. Memory reclamation here mainly targets the constant pool and the unloading of types.

The permanent generation size is set with the following parameters (JDK 7 and earlier):

-XX:PermSize=56m -XX:MaxPermSize=128m

Since JDK 8 these parameters are no longer supported; the Metaspace size is set with -XX:MetaspaceSize and -XX:MaxMetaspaceSize instead.

(6) Runtime constant pool

The runtime constant pool (Runtime Constant Pool) is part of the method area. Besides descriptive information such as the class version, fields, methods, and interfaces, a Class file also contains a constant pool table (Constant Pool Table) that stores the literals and symbolic references generated at compile time. After the class is loaded, this content is stored in the runtime constant pool in the method area.

Another important feature of the runtime constant pool compared with the Class file constant pool is that it is dynamic. The Java language does not require constants to be generated only at compile time. In other words, the runtime constant pool is not limited to the content preset in the Class file constant pool: new constants can also be put into the pool at runtime. The most common use of this feature by developers is the intern() method of the String class.

What intern() does: When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

Question 1:

String s = new String("abc"); // How many objects are created?

Tip: A string object declared with double quotes is placed in the string constant pool, and an object created with the new keyword is stored in heap memory.

Question 2:

Integer a = 200, b = 200;
System.out.println(a == b); // prints true or false?
a = 100;
b = 100;
System.out.println(a == b); // prints true or false?

Tip: By default, Integer in Java caches the values in the range -128 to 127.

Question 3:

        String str1 = "abc";
        String str2 = "abc";
        System.out.println("str1==str2:" + (str1 == str2)); // prints true

        String str3 = new String("abc");
        System.out.println("str1==str3:" + (str1 == str3)); // prints false
        System.out.println("str1==str3.intern():" + (str1 == str3.intern())); // prints true
        System.out.println("str2==str3.intern():" + (str2 == str3.intern())); // prints true

        String str4 = new String("abc");
        System.out.println("str3==str4:" + (str3 == str4)); // prints false
        System.out.println("str3.intern()==str4.intern():" + (str3.intern() == str4.intern())); // prints true

        // The results below assume JDK 7 or later and that the strings are not already in the pool.
        String str5 = new StringBuilder("程序").append("员").toString();
        System.out.println("str5.intern() == str5:" + (str5.intern() == str5)); // prints true

        String str6 = new String(new StringBuilder("中").append("国").toString());
        System.out.println("str6.intern() == str6:" + (str6.intern() == str6)); // prints true

        String str7 = new String("工程师");
        System.out.println("str7.intern() == str7:" + (str7.intern() == str7)); // prints false

Question 4: Can a StackOverflowError in the virtual machine stack be solved by increasing the memory capacity of the virtual machine stack?


3. JVM memory recovery mechanism

In the JVM runtime data areas, the program counter, the virtual machine stack, and the native method stack are created and destroyed together with their thread, and the stack frames in the stack are pushed and popped in an orderly way as methods are entered and exited. Memory allocation and reclamation in these areas are deterministic, so there is little to worry about: the memory is reclaimed naturally when the method ends or the thread ends. The Java heap and the method area are different. Memory allocation and reclamation in these areas are dynamic, and they are the two parts of memory that the garbage collector mainly focuses on.

(1) How to determine whether the memory needs to be recovered

1. Reference counting method

Add a reference counter to the object. Whenever a reference to it is created, the counter is incremented by one; when a reference becomes invalid, the counter is decremented by one; an object whose counter is zero can no longer be used. In the Java field, at least, mainstream Java virtual machines do not use reference counting to manage memory. The main reason is that this seemingly simple algorithm has many special cases to consider and needs a lot of extra processing to work correctly. For example, plain reference counting has difficulty handling circular references between objects.
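The circular reference problem can be made concrete with a toy model. The sketch below is not how any JVM works; it manually keeps a counter per object (the class RefCountDemo, the Node type, and the setRef method are all made up for illustration) and shows that two objects pointing at each other never reach a count of zero, even after the program has dropped every outside reference to them.

```java
// Toy model of naive reference counting; all names here are hypothetical.
class RefCountDemo {
    // A toy object that tracks how many references point at it.
    static class Node {
        final String name;
        int refCount = 0;
        Node ref; // the single outgoing reference this node may hold

        Node(String name) {
            this.name = name;
        }

        // Points this node's field at target and updates both counters.
        void setRef(Node target) {
            if (target != null) {
                target.refCount++;
            }
            if (ref != null) {
                ref.refCount--;
            }
            ref = target;
        }
    }

    // Builds a <-> b, then simulates dropping both external references.
    // Returns the counters that remain: {a.refCount, b.refCount}.
    static int[] cycleCounts() {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refCount++; // external reference held by the program
        b.refCount++; // external reference held by the program
        a.setRef(b);
        b.setRef(a);
        a.refCount--; // the program "forgets" a
        b.refCount--; // the program "forgets" b
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        System.out.println("a=" + counts[0] + ", b=" + counts[1]); // prints a=1, b=1
    }
}
```

Both counters stay at 1, so a collector based only on these counters would never reclaim the pair.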

2. Reachability analysis algorithm (root search algorithm)

The memory management subsystems of current mainstream commercial programming languages (Java, C#) all use the reachability analysis (Reachability Analysis) algorithm to determine whether an object is alive. The basic idea is to take a set of root objects called "GC Roots" as the starting nodes and search downward from them along the reference relationships. The path traveled by the search is called a "reference chain" (Reference Chain). If no reference chain connects an object to GC Roots, or in graph theory terms, if the object is unreachable from GC Roots, the object cannot be used again.

As shown in the figure, although object 5, object 6, and object 7 reference each other, they are not reachable from GC Roots, so they are judged to be reclaimable.

[Figure: objects reachable and unreachable from GC Roots]

In the Java language, objects that can serve as GC Roots include the following:

  • Objects referenced in the virtual machine stack (the local variable table in a stack frame). For example, the parameters, local variables, and temporary variables used in the methods on each thread's call stack.
  • Objects referenced by class static fields in the method area. For example, a reference-type static variable of a Java class.
  • Objects referenced by constants in the method area. For example, references in the string constant pool (String Table).
  • Objects referenced by JNI (that is, native methods) in the native method stack.
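Reachability analysis is essentially a graph traversal. The following sketch models objects as strings and references as a map from an object to the objects it points to; it is an illustration under those assumptions, not the collector's real data structure, and the names ReachabilityDemo, reachable, and sampleGraph are hypothetical. The sample graph mirrors the situation described above: obj5, obj6, and obj7 reference each other but are not connected to the root.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch of reachability analysis as a breadth-first search; all names are hypothetical.
class ReachabilityDemo {
    // Returns every object reachable from the roots by following references.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> seen = new HashSet<>(roots);
        Deque<String> queue = new ArrayDeque<>(roots);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String next : refs.getOrDefault(current, List.of())) {
                // add() returns false for objects that were already visited.
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return seen;
    }

    // obj1..obj4 hang off the root; obj5..obj7 only reference each other.
    static Map<String, List<String>> sampleGraph() {
        return Map.of(
            "root", List.of("obj1"),
            "obj1", List.of("obj2", "obj3"),
            "obj3", List.of("obj4"),
            "obj5", List.of("obj6"),
            "obj6", List.of("obj7"),
            "obj7", List.of("obj5"));
    }

    public static void main(String[] args) {
        Set<String> live = reachable(sampleGraph(), List.of("root"));
        System.out.println(new TreeSet<>(live)); // prints [obj1, obj2, obj3, obj4, root]
    }
}
```

Everything not in the returned set, here obj5, obj6, and obj7, is considered reclaimable even though the three objects still reference each other.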

(2) Reference types

If an object can only be "referenced" or "unreferenced", there is no way to describe objects that are "tasteless to eat, but a pity to discard". For example, we may want to describe a class of objects that stay in memory while space is sufficient, but can be discarded if memory is still tight after garbage collection. Many system cache features fit this scenario. Since JDK 1.2, Java has expanded the concept of references and divided them into four types: Strong Reference, Soft Reference, Weak Reference, and Phantom Reference. Their strength decreases in that order.

  • Strong reference: The most traditional definition of "reference". It refers to the reference assignments that are common in program code, such as "Object obj = new Object()". As long as a strong reference exists, the garbage collector never reclaims the referenced object.

  • Soft reference: Soft references describe objects that are useful but not essential. Before the system throws an out-of-memory error, objects that are only softly referenced are included in the collection range for a second collection. If that collection still does not free enough memory, the out-of-memory error is thrown. The SoftReference class, available since JDK 1.2, implements soft references.

  • Weak reference: Weak references also describe non-essential objects, but they are weaker than soft references. Objects that are only weakly referenced survive only until the next garbage collection. When the garbage collector runs, it reclaims objects that are only weakly referenced, whether or not memory is sufficient. The WeakReference class, available since JDK 1.2, implements weak references.

  • Phantom reference: Also called a "ghost reference", this is the weakest kind of reference. Whether an object has a phantom reference does not affect its lifetime at all, and an object instance cannot be obtained through a phantom reference. The only purpose of attaching a phantom reference to an object is to receive a system notification when the object is reclaimed by the collector. The PhantomReference class, available since JDK 1.2, implements phantom references.
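The four reference types map onto classes in java.lang.ref. The sketch below only shows the parts that behave deterministically; the class ReferenceTypesDemo and its method names are made up for illustration. What happens after a collection is intentionally not asserted, because System.gc() is only a request and the JVM decides when soft and weak references are cleared.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Sketch of the java.lang.ref reference classes; ReferenceTypesDemo is a hypothetical name.
class ReferenceTypesDemo {
    // While a strong reference exists, soft and weak references still return the object.
    static boolean softAndWeakSeeLiveObject() {
        Object strong = new Object();
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        return soft.get() == strong && weak.get() == strong;
    }

    // A phantom reference never hands out its referent, even while the object is alive.
    static boolean phantomGetIsNull() {
        Object strong = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        System.out.println(softAndWeakSeeLiveObject()); // prints true
        System.out.println(phantomGetIsNull());         // prints true

        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        strong = null; // only the weak reference is left
        System.gc();   // a request only; the result below is not guaranteed
        System.out.println("weak referent after GC request: " + weak.get());
    }
}
```

On a typical HotSpot run the last line prints null, but that depends on whether the collector actually ran.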

(3) Garbage collection algorithms

1. Mark-sweep algorithm

The algorithm has two stages, "mark" and "sweep". First, all objects that need to be reclaimed are marked; after marking, all marked objects are reclaimed together. It can also work the other way around: mark the surviving objects and reclaim all unmarked objects together.

[Figure: mark-sweep algorithm]
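As a rough illustration of the sweep stage, the sketch below models the heap as an array of slots and takes the set of surviving objects as the result of the mark stage. It is a toy model with made-up names (MarkSweepDemo, sweep), not a real collector.

```java
import java.util.Arrays;
import java.util.Set;

// Toy model of mark-sweep on an array "heap"; all names are hypothetical.
class MarkSweepDemo {
    // Frees every slot whose object is not marked as live; survivors stay in place.
    static String[] sweep(String[] heap, Set<String> marked) {
        String[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (result[i] != null && !marked.contains(result[i])) {
                result[i] = null; // reclaimed slot
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        Set<String> marked = Set.of("A", "C", "F"); // outcome of the mark stage
        System.out.println(Arrays.toString(sweep(heap, marked)));
        // prints [A, null, C, null, null, F]
    }
}
```

The survivors do not move, so the free slots end up scattered between them. This fragmentation is what the copying and compacting algorithms avoid.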

2. Mark-copy algorithm

The mark-copy algorithm is often simply called the copying algorithm. To solve the low execution efficiency of the mark-sweep algorithm when there are many reclaimable objects, Fenichel proposed a garbage collection algorithm called "Semispace Copying" in 1969. It divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleaned up in one pass. If most objects are alive, the algorithm incurs a lot of memory-to-memory copying overhead. If most objects are reclaimable, only a few survivors need to be copied, and each collection reclaims an entire half of the memory. Allocation does not need to deal with fragmentation: it only moves the heap-top pointer and allocates in order. This is simple to implement and efficient to run, but the flaw is also obvious: the available memory is cut to half of the original, which wastes too much space.

HotSpot reduces this waste by splitting the new generation into one larger Eden space and two smaller Survivor spaces; allocation uses only Eden and one of the Survivor spaces at a time. The default ratio of Eden to Survivor is 8:1, so the usable space is 90% of the new generation capacity (80% for Eden plus 10% for one Survivor), and only the other Survivor space, 10% of the new generation, is "wasted".

Most current commercial Java virtual machines use this algorithm to collect the new generation.

[Figure: mark-copy algorithm]
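The copying step and the 8:1 arithmetic can be sketched as follows. This is a toy model with made-up names (SemispaceCopyDemo, copyLive, usableFraction): the "from" and "to" halves are plain arrays, and usableFraction assumes one Eden space plus two equally sized Survivor spaces, as described above.

```java
import java.util.Arrays;
import java.util.Set;

// Toy model of semispace copying; all names are hypothetical.
class SemispaceCopyDemo {
    // Copies live objects from the "from" half into a fresh "to" half, packed from index 0.
    static String[] copyLive(String[] from, Set<String> live) {
        String[] to = new String[from.length];
        int top = 0; // bump pointer in to-space
        for (String obj : from) {
            if (obj != null && live.contains(obj)) {
                to[top++] = obj;
            }
        }
        return to;
    }

    // Fraction of the new generation usable at once when Eden:Survivor = ratio:1
    // and there are two Survivor spaces, only one of which is in use.
    static double usableFraction(int ratio) {
        return (ratio + 1) / (double) (ratio + 2);
    }

    public static void main(String[] args) {
        String[] from = {"A", "B", "C", "D"};
        System.out.println(Arrays.toString(copyLive(from, Set.of("B", "D"))));
        // prints [B, D, null, null]
        System.out.println(usableFraction(8)); // prints 0.9
    }
}
```

After the copy, the survivors are contiguous and the next allocation can simply continue at the bump pointer. With a ratio of 8, nine of the ten equal parts of the new generation are usable at any time.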

3. Mark-compact algorithm

The mark-copy algorithm performs more copy operations when the object survival rate is high, and its efficiency drops. More importantly, if you do not want to waste 50% of the space, you need additional space as an allocation guarantee to handle the extreme case where all objects in the used memory survive, so this algorithm generally cannot be chosen directly for the old generation. Based on the survival characteristics of old-generation objects, Edward Lueders proposed the targeted "mark-compact" (Mark-Compact) algorithm in 1974. Its marking process is the same as in the mark-sweep algorithm, but the next step does not reclaim the dead objects in place. Instead, all surviving objects are moved to one end of the memory space, and the memory beyond the boundary is then cleaned up directly.

The essential difference between the mark-sweep algorithm and the mark-compact algorithm is that the former is a non-moving collection algorithm, while the latter is a moving one.

[Figure: mark-compact algorithm]
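Using the same toy array model as for mark-sweep, the compaction step could look like the sketch below. The names MarkCompactDemo and compact are made up, and a real collector also has to update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.Set;

// Toy model of mark-compact on an array "heap"; all names are hypothetical.
class MarkCompactDemo {
    // Slides marked objects toward index 0, then clears everything past the boundary.
    static String[] compact(String[] heap, Set<String> marked) {
        String[] result = heap.clone();
        int next = 0; // boundary: slots before it hold compacted survivors
        for (int i = 0; i < result.length; i++) {
            String obj = result[i];
            if (obj != null && marked.contains(obj)) {
                result[next++] = obj; // next <= i, so no live object is overwritten
            }
        }
        Arrays.fill(result, next, result.length, null);
        return result;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        Set<String> marked = Set.of("A", "C", "F"); // outcome of the mark stage
        System.out.println(Arrays.toString(compact(heap, marked)));
        // prints [A, C, F, null, null, null]
    }
}
```

Compared with the mark-sweep result for the same input, the free space is now one contiguous block at the end of the array.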


4. Differences between the Java virtual machine and the Android virtual machine

(1) The executed bytecode is different

The Java virtual machine executes class bytecode (usually packaged in jar files), while Android executes dex bytecode. A dex file removes much of the redundant information in class files, so it is smaller than a jar and class lookup is faster.

Class bytecode is converted to dex bytecode by the dx tool (replaced by d8 in newer Android build tools).

(2) The compilation method is different

The Java virtual machine mainly uses JIT (Just In Time) compilation. Android 5.0 and later use the ART virtual machine, which uses static AOT (Ahead Of Time) compilation; since Android 7.0, ART combines AOT with JIT and profile-guided compilation. Compared with JIT, AOT gives faster application startup and running speed and a smoother experience, but it takes up more storage space and makes application installation slower.


5. Android memory leak case analysis

(1) Memory leaks caused by inner classes and threads

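import android.os.Bundle;
import android.os.SystemClock;

import androidx.appcompat.app.AppCompatActivity; // assumes AndroidX

public class MainActivity extends AppCompatActivity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        loadData();
    }

    private void loadData() {
        // The anonymous Runnable holds an implicit reference to MainActivity,
        // so the activity cannot be collected until the thread finishes.
        new Thread(new Runnable() {
            @Override
            public void run() {
                SystemClock.sleep(30000); // simulates 30 seconds of background work
            }
        }).start();
    }
}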

Tip: Non-static inner classes (including anonymous inner classes) hold a reference to the outer class instance by default (check the bytecode to verify).
Solution: Turn the anonymous inner class Runnable into a named static nested class, and wrap the activity in a WeakReference when the task needs to use it. Alternatively, terminate the thread when the activity exits.
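The solution can be sketched in plain Java so that it runs outside Android. In the sketch below, Screen is a hypothetical stand-in for the Activity, and LeakFreeTaskDemo and LoadTask are made-up names. The point is the shape of the fix: a static nested class has no hidden reference to an enclosing instance, and the WeakReference does not keep the screen alive on its own.

```java
import java.lang.ref.WeakReference;

// Plain-Java sketch of the "static nested class + WeakReference" fix.
// Screen stands in for an Android Activity; all names are hypothetical.
class LeakFreeTaskDemo {
    static class Screen {
        String title = "main";
    }

    // Static nested class: it holds no implicit reference to an outer instance.
    static class LoadTask implements Runnable {
        private final WeakReference<Screen> screenRef;
        String lastResult = "not run";

        LoadTask(Screen screen) {
            this.screenRef = new WeakReference<>(screen);
        }

        @Override
        public void run() {
            // Re-check the referent every time; it may have been collected already.
            Screen screen = screenRef.get();
            if (screen == null) {
                lastResult = "screen gone, skipping UI update";
                return;
            }
            lastResult = "updated " + screen.title;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Screen screen = new Screen();
        LoadTask task = new LoadTask(screen);
        Thread worker = new Thread(task);
        worker.start();
        worker.join();
        System.out.println(task.lastResult + " (screen title: " + screen.title + ")");
    }
}
```

In an Android app the same pattern would take the Activity in the constructor and check both for null and for a finishing or destroyed activity before touching views.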

(2) Memory leak caused by Handler

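import android.os.Bundle;
import android.os.Handler;
import android.os.Looper;
import android.os.Message;

import androidx.annotation.NonNull; // assumes AndroidX
import androidx.appcompat.app.AppCompatActivity;

public class MainActivity extends AppCompatActivity {

    // The anonymous Handler subclass holds an implicit reference to MainActivity.
    private final Handler handler = new Handler(Looper.myLooper()) {
        @Override
        public void handleMessage(@NonNull Message msg) {
            super.handleMessage(msg);
        }
    };

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        // The delayed message stays in the MessageQueue for 30 seconds and references
        // both the handler and the anonymous Runnable, which keeps the activity alive.
        handler.postDelayed(new Runnable() {
            @Override
            public void run() {
            }
        }, 30000);
    }
}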

(3) Memory leaks caused by Timer and TimerTask

(4) Memory leak caused by AsyncTask

(5) Memory leak caused by property animation

(6) Memory leak caused by singleton

(7) Memory leak caused by static variable reference context

(8) Other


6. Android memory leak analysis tools

(1) LeakCanary

(2) Android Studio Profiler

(3) DDMS

(4) MAT


Answers to the questions:
Question 1: Two: the "abc" literal in the string constant pool (if it is not already there) and the String object created on the heap by new.
Question 2: false and true. For reference types, == compares references; for primitive types, == compares values. To compare the values of two Integer objects, convert them to int by calling intValue(), or use equals().
Question 3: Since JDK 7, the string constant pool is stored in heap memory instead of the method area. When intern() is called on a string that is not yet in the pool, the pool records a reference to that heap string instead of copying the string object. That is why the comparisons for str5 and str6 print true, while the one for str7 prints false: the literal "工程师" was already in the pool.
Question 4: Increasing the stack memory capacity is not recommended as a fix for StackOverflowError. A larger stack means each thread occupies more memory, so fewer threads can be created, which makes OutOfMemoryError more likely. It is better to solve the problem by optimizing the code.
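For question 2, the value comparison mentioned in the answer can be sketched as follows. The class name IntegerCompareDemo and the helper sameValue are made up for illustration. The result of the == line is left without an expected output on purpose, because it depends on the Integer cache range of the running JVM.

```java
// Sketch: comparing Integer objects by value instead of by reference.
// IntegerCompareDemo and sameValue are hypothetical names.
class IntegerCompareDemo {
    // Compares the wrapped int values, regardless of object identity.
    static boolean sameValue(Integer x, Integer y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        Integer a = 200, b = 200;
        // Reference comparison: false with the default cache range of -128 to 127.
        System.out.println(a == b);
        System.out.println(sameValue(a, b));              // prints true
        System.out.println(a.intValue() == b.intValue()); // prints true
    }
}
```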

Reference: Zhou Zhiming, "In-depth Understanding of the Java Virtual Machine"

Origin blog.csdn.net/devnn/article/details/121548116