JVM dynamic memory partition and garbage collection algorithm

Reading summary of "In-depth understanding of the Java Virtual Machine"

To manage runtime data dynamically, the Java virtual machine implements its own memory management mechanism: it divides runtime memory into areas according to a conceptual model and provides garbage collection algorithms for specific areas. This frees Java programmers from the details of memory management, but it comes at a cost: when a memory-related problem does occur, we need to understand the underlying implementation in order to locate and solve it quickly.

This article covers three parts:
- runtime memory area division
- object creation and access
- garbage collection algorithms


Runtime memory area division

It must be emphasized that the memory area division described in this article summarizes the conceptual model in the JVM specification. Concrete JVM implementations may differ; the HotSpot virtual machine, for example, implements the virtual machine stack and the native method stack as a single stack.

[Figure: JVM memory partition]
Contrary to the common simplification that memory is just a heap and a stack, the figure shows that the JVM runtime data area is divided into five main parts: the program counter, the virtual machine stack, the native method stack, the Java heap, and the method area. Each is described below.

  1. Program counter
    It can be thought of as a line number indicator for the code the current thread is executing. It drives control flow: basic sequential, branching and looping execution all depend on it. Since a processor executes instructions from only one thread at any moment, each thread needs its own program counter so that it can resume at the right position after a thread switch.

  2. VM stack
    Each method call corresponds to one stack frame: calling a method pushes its frame onto the stack, and returning from it pops the frame. A stack frame stores the local variable table, the operand stack, dynamic linking information, the method exit (return address), and similar data.

  3. Native method stack
    It plays the same role as the virtual machine stack, except that it serves native methods.

  4. Heap
    It is mainly used to store object instances and arrays. From the perspective of garbage collection strategy it is divided into the young generation and the old generation, and the young generation is further divided into the Eden, From Survivor and To Survivor spaces. Other classifications exist: from the perspective of memory allocation, for example, the heap may contain multiple thread-private allocation buffers (TLAB, Thread Local Allocation Buffer). In addition, the HotSpot virtual machine (before JDK 8) extended generational collection to the method area and managed it as a permanent generation. This saved writing separate memory management code for the method area, but the results were not satisfactory.

  5. Method area
    It stores class information loaded by the virtual machine, the constant pool, static variables, code produced by the just-in-time (JIT) compiler, and similar data.
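
The relationship between method calls and stack frames can be observed with a small experiment. The sketch below is illustrative only (the class name StackDepthDemo is made up for this example): it recurses without a base case until the thread's VM stack is exhausted, then reports how many frames were pushed. The actual number depends on the JVM, the frame size and the stack size setting.

```java
// Hypothetical demo class, not part of the original article.
class StackDepthDemo {
    private int depth = 0;

    // Every call pushes one more stack frame onto the current thread's VM stack.
    private void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the stack is exhausted and returns the depth that was reached.
    static int measureDepth() {
        StackDepthDemo demo = new StackDepthDemo();
        try {
            demo.recurse();
        } catch (StackOverflowError e) {
            // The thread's stack is full; frames are popped as the error propagates here.
        }
        return demo.depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames pushed before StackOverflowError: " + measureDepth());
    }
}
```

Running it with a different thread stack size (the HotSpot option -Xss) should change the reported depth, since each thread gets its own stack.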

In addition, the direct memory marked in the figure is not a memory area defined in the JVM specification. Through NIO, however, the JVM can call native libraries to allocate and access off-heap memory, which improves efficiency. This memory is not limited by the size of the Java heap, but it is still limited by the total memory of the machine. Therefore, when sizing the memory of a Java program, the JVM's own areas must not take up all of the machine's memory; room has to be left for direct memory.
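
As a minimal sketch of off-heap access through NIO, the example below allocates a direct buffer with ByteBuffer.allocateDirect and compares it with an ordinary heap buffer. The class and method names are invented for the illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of the original article.
class DirectMemoryDemo {
    // Writes a value into off-heap memory and reads it back.
    static long roundTrip(long value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(Long.BYTES);
        buffer.putLong(0, value);
        return buffer.getLong(0);
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // backed by native memory
        ByteBuffer onHeap = ByteBuffer.allocate(1024);       // backed by a byte[] in the Java heap
        System.out.println("direct.isDirect() = " + direct.isDirect());
        System.out.println("onHeap.isDirect() = " + onHeap.isDirect());
        System.out.println("round trip = " + roundTrip(42L));
    }
}
```

In HotSpot the amount of direct memory can be capped with -XX:MaxDirectMemorySize, which is one way to reserve room for it when sizing a process.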


Object creation and access

This part covers how objects are created, how they are laid out in memory, and how they are accessed.

Objects are mainly stored in heap memory. Creating one takes five main steps:
1. Check whether the class has been loaded. If not, run the class loading process first. Once class loading is complete, the memory size of the object is known.
2. Allocate memory. Depending on whether the free memory is contiguous, there are two approaches: "bump the pointer" and "free list". With bump the pointer, used and free memory are separated by a pointer, and allocation just moves that pointer by the size of the object. With a free list, the JVM keeps a table that records which memory blocks are available and picks a block that is large enough. Whether the memory is contiguous is determined by the garbage collection algorithm in use.
3. Initialize the allocated memory to zero values, which is why fields have default values once an object has been created.
4. Set the object header, filling in the object's runtime data according to its state.
5. Execute the <init> method of the class, that is, its constructor.
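
The two allocation strategies in step 2 can be sketched with toy allocators over a simulated address space. This is a simplified model for illustration, not how HotSpot is implemented: addresses are plain ints, the free list uses a first-fit policy chosen for brevity, and all class names are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Bump the pointer: free memory is one contiguous range starting at `top`.
class BumpPointerAllocator {
    private final int capacity;
    private int top = 0; // boundary between used and free memory

    BumpPointerAllocator(int capacity) {
        this.capacity = capacity;
    }

    // Returns the start address of the new object, or -1 if there is no room.
    int allocate(int size) {
        if (top + size > capacity) return -1;
        int address = top;
        top += size; // move ("bump") the pointer past the new object
        return address;
    }
}

// Free list: free memory is scattered, so the available blocks are recorded in a table.
class FreeListAllocator {
    // Each entry is {start, length} of one free block.
    private final List<int[]> freeBlocks = new ArrayList<>();

    FreeListAllocator(int[][] initialFreeBlocks) {
        for (int[] block : initialFreeBlocks) freeBlocks.add(block.clone());
    }

    // First fit: carve the object out of the first block that is large enough.
    int allocate(int size) {
        for (int i = 0; i < freeBlocks.size(); i++) {
            int[] block = freeBlocks.get(i);
            if (block[1] >= size) {
                int address = block[0];
                block[0] += size;
                block[1] -= size;
                if (block[1] == 0) freeBlocks.remove(i); // block fully used
                return address;
            }
        }
        return -1; // no single block is large enough (fragmentation)
    }
}

class AllocationDemo {
    public static void main(String[] args) {
        BumpPointerAllocator bump = new BumpPointerAllocator(16);
        System.out.println(bump.allocate(8) + ", " + bump.allocate(8) + ", " + bump.allocate(1));

        FreeListAllocator list = new FreeListAllocator(new int[][] {{0, 4}, {10, 8}});
        System.out.println(list.allocate(6) + ", " + list.allocate(4) + ", " + list.allocate(4));
    }
}
```

In the free-list run the last request fails even though some memory is still free, because no single block is large enough. This is the fragmentation cost that compacting collectors avoid.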

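After the object is created, its layout in memory (in HotSpot) consists of the object header, which holds the object's runtime data and a pointer to its class's type data, followed by the instance data and alignment padding. If the object is an array, the header also stores the array length.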

Objects are accessed either through handles or through direct pointers.
1. Handle access: a handle pool is set aside in the Java heap. The reference variable in the local variable table on the stack points to a handle, and the handle holds the address of the object's instance data and the address of its type data in the method area.
2. Direct pointer: the reference variable points directly to the object's address in the heap, and the object itself stores the address of its type data.
Both approaches have advantages and disadvantages. When an object is moved, handle access only requires updating the address stored in the handle, and the reference itself does not change; on the other hand, it uses more memory and adds one level of pointer indirection to every access. Direct pointer access is faster because it saves that indirection, but every reference to an object has to be updated when the object is moved. HotSpot uses direct pointer access.
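
The trade-off can be illustrated with a toy model in which the heap is a map from addresses to objects and the handle pool is a map from handle ids to addresses. Everything here (class name, maps, integer addresses) is an assumption made for the sketch and does not reflect real JVM data structures.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, not part of the original article.
class HandleAccessDemo {
    // Simulated heap: address -> object payload.
    final Map<Integer, String> heap = new HashMap<>();
    // Simulated handle pool: handle id -> current address of the object.
    final Map<Integer, Integer> handlePool = new HashMap<>();

    // Handle access costs two lookups: handle first, then the object.
    String readViaHandle(int handle) {
        return heap.get(handlePool.get(handle));
    }

    // Direct pointer access costs a single lookup.
    String readViaPointer(int address) {
        return heap.get(address);
    }

    // Moves an object the way a compacting collector might.
    // Only the handle entry is updated; holders of the handle id notice nothing.
    void move(int handle, int newAddress) {
        int oldAddress = handlePool.get(handle);
        heap.put(newAddress, heap.remove(oldAddress));
        handlePool.put(handle, newAddress);
    }

    public static void main(String[] args) {
        HandleAccessDemo vm = new HandleAccessDemo();
        vm.heap.put(100, "obj");
        vm.handlePool.put(1, 100);

        vm.move(1, 200);
        System.out.println("via handle 1:    " + vm.readViaHandle(1));    // still finds the object
        System.out.println("via pointer 100: " + vm.readViaPointer(100)); // stale address
        System.out.println("via pointer 200: " + vm.readViaPointer(200)); // needs the new address
    }
}
```

After the move, the handle still resolves to the object, while a holder of the old direct address has to be given the new one. That update is the extra work a direct-pointer implementation takes on in exchange for faster access.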


Garbage collection algorithm

Garbage collection predates Java. This part answers three questions: What memory is reclaimed? When is it reclaimed? How is it reclaimed?

What memory is reclaimed?

Java uses reachability analysis to decide whether an object is alive. The search starts from a set of objects called GC Roots and follows references from there; the path traversed is called a "reference chain". If no reference chain connects an object to any GC Root, the object is considered reclaimable.
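
Reachability analysis is essentially a graph traversal. The sketch below models objects as strings and references as an adjacency map (both assumptions for illustration) and marks everything reachable from the roots. Objects C and D reference each other but are not reachable from any root, so they are not marked.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not part of the original article.
class ReachabilityDemo {
    // Returns every object that can be reached from the GC roots by following references.
    static Set<String> reachable(Map<String, List<String>> references, List<String> gcRoots) {
        Set<String> marked = new HashSet<>(gcRoots);
        Deque<String> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String target : references.getOrDefault(current, Collections.<String>emptyList())) {
                // add() returns false for objects that are already marked, so cycles terminate.
                if (marked.add(target)) pending.push(target);
            }
        }
        return marked;
    }

    // root -> A -> B, plus an unreachable cycle C <-> D.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("A"));
        refs.put("A", Arrays.asList("B"));
        refs.put("C", Arrays.asList("D"));
        refs.put("D", Arrays.asList("C"));
        return refs;
    }

    public static void main(String[] args) {
        Set<String> live = reachable(sampleGraph(), Arrays.asList("root"));
        System.out.println("live objects: " + live);
    }
}
```
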
Besides "referenced" and "not referenced", there are intermediate states. Since JDK 1.2 the notion of a reference has been extended to describe objects that are not essential but are worth keeping while memory allows, which is how caches in many systems behave. From strongest to weakest there are four types of reference, and they are treated differently during reachability analysis and memory reclamation.
1. Strong reference: the ordinary kind of reference, as in Object obj = new Object();
2. Soft reference: describes non-essential objects. Before an out-of-memory error is thrown, softly referenced objects are included in a second round of collection. The JDK provides the SoftReference class for this;
3. Weak reference: also describes non-essential objects, but they are reclaimed at the next garbage collection whether or not memory is sufficient. The JDK provides the WeakReference class for this;
4. Phantom reference: has no effect on the object's lifetime. Its only purpose is to receive a system notification when the referenced object is garbage collected. The JDK provides the PhantomReference class for this.
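
The four reference types map onto classes in java.lang.ref. The sketch below shows how each is created and two properties that hold regardless of GC timing: a weak reference still returns its referent while a strong reference exists, and PhantomReference.get() always returns null. What happens after System.gc() is deliberately left without an expected output, because the call is only a hint to the JVM. The class and method names are made up for the example.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of the original article.
class ReferenceTypesDemo {
    // While a strong reference exists, the weak reference still returns the referent.
    static boolean weakSeesStronglyHeldReferent() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        return weak.get() == strong;
    }

    // A phantom reference never hands out its referent; it is only used for notification.
    static boolean phantomGetIsNull() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        Object strong = new Object();                                      // strong reference
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);  // kept until memory is tight
        WeakReference<Object> weak = new WeakReference<>(strong);          // cleared at the next GC once
                                                                           // no strong reference remains
        System.out.println("weak before GC: " + (weak.get() != null));
        System.out.println("phantom.get() is null: " + phantomGetIsNull());

        strong = null;  // drop the only strong reference
        System.gc();    // a request only; the JVM may ignore it
        System.out.println("weak after GC request: " + weak.get());
        System.out.println("soft after GC request: " + (soft.get() != null));
    }
}
```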

Unreachable objects still have one chance to survive. Reclaiming an object takes two marking passes, and after the first pass the objects are filtered. If an object overrides the finalize() method, it is placed in a queue called F-Queue, and a low-priority Finalizer thread created by the JVM later runs the method. If the object re-attaches itself to a reference chain from GC Roots inside finalize(), it escapes collection. However, the JVM does not guarantee that finalize() will run to completion. The method was a compromise made when Java was born, to make the language easier for C/C++ developers to accept.

When will it be recycled?

STW (Stop The World): the name sounds cool, but it is not a good thing. When the JVM enumerates the root nodes (determining the reference chains from GC Roots during reachability analysis), the references between objects must stay consistent for the whole duration. Otherwise, for example, an object that has just been created but not yet assigned to its reference variable could be identified as reclaimable. To guarantee this consistency, all Java threads have to be stopped, and this pause is called STW.
HotSpot uses a data structure called OopMap to record where references are stored, which makes exact GC possible (the JVM knows the type of the data at each memory location), so it does not need to scan the complete execution context and all global reference locations. Many instructions can change the contents of an OopMap, so GC can only start when the Java threads have reached a safepoint, where the OopMap data is temporarily stable. Safepoints are placed where the program may "run for a long time", such as method calls, loop jumps and exception jumps.
There are two ways to stop threads at a safepoint: preemptive suspension and voluntary suspension. With preemptive suspension the JVM stops all threads directly; if a thread turns out not to be at a safepoint, it is resumed, allowed to run to a safepoint, and stopped again. With voluntary suspension the JVM sets a flag that every thread polls while it runs; when a thread finds the flag set, it suspends itself.
A thread in the Sleep or Blocked state cannot respond to the JVM's request and run to a safepoint. This case is handled with safe regions: stretches of code in which reference relationships do not change. When a thread enters such a region it marks itself as being in a safe region, so a GC started by the JVM can ignore it. When the thread is about to leave the safe region, it checks whether root enumeration has finished, and it may leave only once it has.
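
Voluntary suspension can be modeled at the application level with a shared flag that worker threads poll at the end of each loop iteration. This is only an analogy, under the assumption that a volatile boolean stands in for the JVM's polling mechanism; real safepoint polls are emitted by the JVM itself, and the class below is hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model, not part of the original article.
class SafepointPollingDemo {
    private final Object monitor = new Object();
    private volatile boolean safepointRequested = false;
    private volatile boolean shutdown = false;
    private final AtomicLong work = new AtomicLong();

    // Worker: do one unit of work, then poll the flag at the loop back-edge.
    private void workerLoop(CountDownLatch arrived) throws InterruptedException {
        while (!shutdown) {
            work.incrementAndGet();
            if (safepointRequested) {      // the "safepoint poll"
                arrived.countDown();       // report arrival at the safepoint
                synchronized (monitor) {
                    while (safepointRequested) monitor.wait(); // suspend until released
                }
            }
        }
    }

    // Pauses all workers and returns true if none of them made progress during the pause.
    static boolean pauseAndCheck(int workers) {
        SafepointPollingDemo vm = new SafepointPollingDemo();
        CountDownLatch arrived = new CountDownLatch(workers);
        Thread[] threads = new Thread[workers];
        try {
            for (int i = 0; i < workers; i++) {
                threads[i] = new Thread(() -> {
                    try {
                        vm.workerLoop(arrived);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                threads[i].start();
            }
            vm.safepointRequested = true;  // request the pause
            arrived.await();               // wait until every worker has parked
            long before = vm.work.get();
            Thread.sleep(50);              // root enumeration would happen here
            long after = vm.work.get();

            vm.shutdown = true;            // end the demo once workers resume
            synchronized (vm.monitor) {
                vm.safepointRequested = false;
                vm.monitor.notifyAll();    // release the workers
            }
            for (Thread t : threads) t.join();
            return before == after;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("world stopped during pause: " + pauseAndCheck(4));
    }
}
```

The pause cannot begin until the slowest thread reaches its next poll, which is why the placement of safepoints matters for pause latency.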

How to recycle?

  1. Mark-sweep
    This is the most basic collection algorithm, and the other two build on it. It has two phases, marking and sweeping. The objects to be reclaimed are first marked, using the two-pass marking described under "What memory is reclaimed?", and once marking is complete they are cleared. Its drawbacks are low efficiency and a tendency to fragment memory.

  2. Copying
    To improve efficiency, the copying algorithm divides memory into one Eden space and two smaller Survivor spaces, and only Eden and one Survivor space are in use at any time. During garbage collection, live objects are copied into the other Survivor space. If the survivors need more room than that Survivor space provides, they are allocated in a guarantee space (in HotSpot, the old generation).
    The copying algorithm becomes inefficient when many objects survive, and it needs extra memory as an allocation guarantee. These characteristics make it suitable for collecting the young generation.

  3. Mark-compact
    The marking phase is the same as in mark-sweep. In the compaction phase, live objects are moved toward one end of the space and everything beyond the boundary is cleared directly. This solves the fragmentation problem and makes the algorithm suitable for the old generation.
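
The difference between the three algorithms is easiest to see on a tiny simulated heap. In the sketch below the heap is an array of named slots (null means free) and the set of live objects is assumed to come from a marking phase that has already run. All names are invented for the illustration, and the copying variant uses a single to-space rather than the Eden/Survivor layout.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model, not part of the original article.
class GcAlgorithmsDemo {
    // Mark-sweep: clear dead objects in place; survivors keep their positions.
    static String[] markSweep(String[] heap, Set<String> live) {
        String[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (result[i] != null && !live.contains(result[i])) result[i] = null;
        }
        return result;
    }

    // Copying: copy survivors into an empty to-space; the from-space is then reused as a whole.
    // Assumes the to-space is large enough; a real collector falls back to a guarantee space.
    static String[] copy(String[] fromSpace, Set<String> live, int toSpaceSize) {
        String[] toSpace = new String[toSpaceSize];
        int next = 0;
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) toSpace[next++] = obj;
        }
        return toSpace;
    }

    // Mark-compact: slide survivors to one end of the same space and clear the rest.
    static String[] markCompact(String[] heap, Set<String> live) {
        String[] result = heap.clone();
        int next = 0;
        for (int i = 0; i < result.length; i++) {
            String obj = result[i];
            if (obj != null && live.contains(obj)) result[next++] = obj;
        }
        Arrays.fill(result, next, result.length, null);
        return result;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", null, "E"};
        Set<String> live = new HashSet<>(Arrays.asList("A", "C", "E"));
        System.out.println("mark-sweep:   " + Arrays.toString(markSweep(heap, live)));
        System.out.println("copying:      " + Arrays.toString(copy(heap, live, 3)));
        System.out.println("mark-compact: " + Arrays.toString(markCompact(heap, live)));
    }
}
```

Mark-sweep leaves holes between the survivors, while the other two leave the free memory contiguous, which is what allows bump-the-pointer allocation afterwards.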

The garbage collectors in the HotSpot virtual machine are classified by garbage collection algorithm, concurrency and parallelism; they are not covered here.

Summary

This article introduced the conceptual model of the JVM's runtime memory, then described how objects are created, how they are laid out in heap memory, and how they are accessed. Finally, for garbage collection, it explained which objects are reclaimed, when, and how. With this we have a theoretical understanding of the JVM memory structure and garbage collection. How these principles play out is not fixed: it varies with the garbage collector, the JVM parameters and other conditions, so they need to be understood and applied according to the actual situation.

Origin blog.csdn.net/lyg673770712/article/details/80837046