JVM Runtime Data Areas: Heap Space

1. Core Overview

A JVM instance has exactly one heap, and the heap is the core area of Java memory management. The heap is created when the JVM starts, and its size is determined at that point. It is the largest memory area managed by the JVM.

The "Java Virtual Machine Specification" describes the Java heap as follows: all object instances and arrays should be allocated on the heap at runtime. (The heap is the run-time data area from which memory for all class instances and arrays is allocated.) From the perspective of actual use, it is more accurate to say that "almost" all object instances are allocated here.

  • Arrays and objects are normally not stored on the stack; a stack frame only holds a reference that points to the location of the object or array on the heap.
  • After a method ends, the objects it created in the heap are not removed immediately; they are only removed during garbage collection.
  • The heap is the key area in which GC (Garbage Collection) does its work.

2. Internal structure

Most modern garbage collectors are designed based on the generational collection theory. The internal structure of the heap space is subdivided into:

In Java 7 and earlier, the heap is logically divided into three parts: young generation + old generation + permanent generation

  • Young Generation Space (Young/New)
    • It is divided into the Eden area and the Survivor area.
  • Tenured Generation Space (Old/Tenured)
  • Permanent Space (Perm)

In Java 8 and later, the heap is logically divided into three parts: young generation + old generation + Metaspace

  • Young Generation Space (Young/New)
    • It is divided into the Eden area and the Survivor area.
  • Tenured Generation Space (Old/Tenured)
  • Metaspace (Meta), which is allocated from native memory rather than from the Java heap itself
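
To see how a running JVM reports these areas, one option is the standard `java.lang.management` API. The sketch below is only an illustration: the class name `MemoryPools` is made up for this example, and the pool names that are printed depend on the JVM version and the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.List;

// Illustrative helper; the class name is hypothetical.
final class MemoryPools {
    /** Counts the memory pools that the running JVM reports as heap memory. */
    static long heapPoolCount() {
        List<MemoryPoolMXBean> pools = ManagementFactory.getMemoryPoolMXBeans();
        return pools.stream().filter(p -> p.getType() == MemoryType.HEAP).count();
    }

    public static void main(String[] args) {
        // Each pool is tagged as heap or non-heap memory by the JVM itself.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getType() + "  " + pool.getName());
        }
        System.out.println("heap pools: " + heapPoolCount());
    }
}
```

On a Java 8 or later HotSpot JVM, the Eden, Survivor, and old generation pools would be expected to appear as heap memory and Metaspace as non-heap memory.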

[Figure: internal structure of the heap in JDK 7]

3. Young generation and old generation

Java objects stored in the JVM can be divided into two categories:

  • One type is transient objects with short life cycles. Such objects are created and destroyed very quickly.

  • The other type has a very long life cycle; in extreme cases such objects live as long as the JVM itself.

If the Java heap is subdivided further, it can be divided into the young generation (YoungGen) and the old generation (OldGen). The young generation can in turn be divided into the Eden space, the Survivor0 space, and the Survivor1 space (sometimes also called the from area and the to area).

3.1 Configuration

The following parameter configures the proportion of the young generation to the old generation in the heap. It is generally not adjusted during development:

  • The default is -XX:NewRatio=2, which means the young generation takes 1 part and the old generation takes 2 parts, so the young generation occupies 1/3 of the entire heap.

  • With -XX:NewRatio=4, the young generation takes 1 part and the old generation takes 4 parts, so the young generation occupies 1/5 of the entire heap.

In HotSpot, the default ratio of the Eden space to the two Survivor spaces is 8:1:1. Developers can adjust this ratio through the option -XX:SurvivorRatio, for example -XX:SurvivorRatio=8.
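
The ratios above are plain arithmetic, which the following sketch works through for a hypothetical 600 MB heap. The class and method names are made up for illustration, and the real sizes chosen by HotSpot can differ because of alignment and adaptive sizing.

```java
// Illustrative arithmetic only; HotSpot aligns and may adapt the real sizes.
final class GenerationSizes {
    /** Young generation size for a given -XX:NewRatio (old : young = newRatio : 1). */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    /** Size of one Survivor space for a given -XX:SurvivorRatio (Eden : one Survivor). */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Eden is whatever remains after the two Survivor spaces. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long heap = 600 * mb;                     // hypothetical heap, as if -Xms600m -Xmx600m
        long young = youngBytes(heap, 2);         // -XX:NewRatio=2
        long survivor = survivorBytes(young, 8);  // -XX:SurvivorRatio=8
        long eden = edenBytes(young, 8);
        System.out.println("young=" + young / mb + "MB old=" + (heap - young) / mb + "MB");
        System.out.println("eden=" + eden / mb + "MB survivor=" + survivor / mb + "MB each");
    }
}
```

With these inputs the young generation is 200 MB (1/3 of the heap), Eden is 160 MB, and each Survivor space is 20 MB, which matches the 8:1:1 split.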

Almost all Java objects are created with new in the Eden area, and most of them are also destroyed in the young generation. Research by IBM shows that about 80% of the objects in the young generation "die young": they become garbage shortly after they are created.

4. Object allocation process

  1. Allocating memory for new objects is a rigorous and complex task. JVM designers need to consider not only how and where to allocate memory, but also the GC, because the allocation algorithm is closely tied to the reclamation algorithm. In particular, they must consider whether reclamation leaves the memory space fragmented.
  2. A new object is first placed in Eden, which has a limited size.
  3. When Eden is full and the program needs to create another object, the JVM's garbage collector performs a Minor GC on Eden and destroys the objects in Eden that are no longer referenced. The new object is then placed in Eden.
  4. The objects that remain alive in Eden are moved to Survivor 0.
  5. If garbage collection is triggered again, the objects that survived last time and were placed in Survivor 0 are moved to Survivor 1 if they are still alive.
  6. On the next garbage collection the survivors are moved back to Survivor 0, on the one after that to Survivor 1 again, and so on.
  7. When is an object promoted to the old generation? After it has survived a configurable number of collections. The default is 15, and it can be changed with the parameter -XX:MaxTenuringThreshold=N.
  8. The old generation is comparatively quiet. When it runs out of memory, GC is triggered again: a Major GC cleans up the old generation.
  9. If the object still cannot be stored after the Major GC of the old generation, an OOM (OutOfMemoryError) occurs:


java.lang.OutOfMemoryError: Java heap space
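
The aging steps above can be sketched as a toy simulation. This is not how HotSpot is implemented: the class `AgingSimulation`, its lists, and the rule "promote once the age reaches the threshold" are simplifying assumptions made for illustration, and the demo uses a threshold of 3 instead of the default 15 to keep the output short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of object aging across Minor GCs; not HotSpot code. */
final class AgingSimulation {
    static final class Obj {
        final String name;
        int age;                               // number of Minor GCs survived so far
        Obj(String name) { this.name = name; }
    }

    final int maxTenuringThreshold;
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();        // Survivor space currently holding objects
    List<Obj> to = new ArrayList<>();          // Survivor space kept empty between GCs
    final List<Obj> old = new ArrayList<>();

    AgingSimulation(int maxTenuringThreshold) {
        this.maxTenuringThreshold = maxTenuringThreshold;
    }

    void allocate(String name) { eden.add(new Obj(name)); }

    /** One Minor GC; every name listed in 'live' is treated as still reachable. */
    void minorGc(List<String> live) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!live.contains(o.name)) continue;            // unreachable: reclaimed
            o.age++;
            if (o.age >= maxTenuringThreshold) old.add(o);   // promoted to the old generation
            else to.add(o);                                  // copied into the empty Survivor
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from; from = to; to = tmp;           // the two Survivors swap roles
    }

    public static void main(String[] args) {
        AgingSimulation heap = new AgingSimulation(3);       // small threshold for the demo
        heap.allocate("cache");
        heap.allocate("temp");
        for (int gc = 1; gc <= 3; gc++) {
            heap.minorGc(Arrays.asList("cache"));
            System.out.println("after GC " + gc + ": survivor=" + heap.from.size()
                    + " old=" + heap.old.size());
        }
    }
}
```

In this run "temp" is reclaimed by the first collection, while "cache" moves between the two Survivor lists twice and is promoted on the third collection.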

5. Memory allocation strategy

5.1 The idea behind generational heap space

  • Research shows that different objects have different life cycles, and 70% to 99% of objects are short-lived.
  • Young generation: consists of Eden and two Survivor spaces of the same size (also called from/to or s0/s1). The to space is always empty.
  • Old generation: stores the objects from the young generation that have survived multiple GCs.

By dividing the heap into generations, short-lived objects can be placed in the young generation and long-lived objects in the old generation, so the garbage collector can apply a different strategy to each generation. The young generation can be scanned and reclaimed quickly and often, while the old generation, whose objects have already survived several collections, is collected much less frequently. This generational approach makes garbage collection more efficient and improves program performance.

Without generations, all objects would sit in one area, and every collection would have to scan the entire heap to find the objects to reclaim, which is inefficient. Memory would also be wasted, because short-lived objects that are already dead would keep their space until the next whole-heap collection. Generational collection is therefore an important optimization strategy for Java garbage collection.

5.2 Allocation principles

  • Objects are allocated in Eden first.

  • Large objects are allocated directly in the old generation. Try to avoid creating too many large objects in a program.

  • Long-lived objects are promoted to the old generation.

  • Dynamic object age determination: if the total size of all objects of the same age in the Survivor area is greater than half of the Survivor space, objects whose age is greater than or equal to that age can enter the old generation directly, without waiting for the age required by MaxTenuringThreshold.
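
The dynamic age rule can be sketched as a small function. Note the assumption: the sketch accumulates sizes from age 1 upward and stops at the first age where the running total exceeds half of the Survivor space, which is how HotSpot's age table is commonly described and is slightly broader than the "same age" wording above. The class name, method name, and sample sizes are made up for illustration.

```java
/** Sketch of dynamic age determination; an assumption-laden model, not HotSpot code. */
final class DynamicTenuring {
    /**
     * bytesByAge[a] is the total size of Survivor objects whose age is a (index 0 is unused).
     * Returns the age at or above which objects would be promoted.
     */
    static int tenuringThreshold(long[] bytesByAge, long survivorCapacity, int maxTenuringThreshold) {
        long desired = survivorCapacity / 2;   // "half of the Survivor space"
        long total = 0;
        for (int age = 1; age < bytesByAge.length; age++) {
            total += bytesByAge[age];          // running total over ages 1..age
            if (total > desired) {
                return Math.min(age, maxTenuringThreshold);
            }
        }
        return maxTenuringThreshold;           // nothing exceeded the limit
    }

    public static void main(String[] args) {
        long[] bytesByAge = {0, 30, 30, 10};   // hypothetical sizes for ages 1 to 3
        System.out.println(tenuringThreshold(bytesByAge, 100, 15));
    }
}
```

With a Survivor capacity of 100 and 30 units each at ages 1 and 2, the running total passes 50 at age 2, so objects of age 2 and above would be promoted even though MaxTenuringThreshold is 15.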

6. TLAB

TLAB (Thread Local Allocation Buffer) is a memory allocation optimization of the Java virtual machine. The heap is shared by all threads, and object instances are created very frequently in the JVM, so carving memory out of the heap in a concurrent environment is not thread-safe without synchronization. To avoid that cost, the JVM gives each thread a private memory area, its TLAB. Because each thread allocates from its own buffer, threads do not compete or synchronize with one another on the common path, and memory allocation becomes more efficient.

The size of a TLAB is fixed. When a TLAB is full, the thread requests a new one; the objects in the old TLAB stay where they are, and nothing records whether an object was allocated from a TLAB. If an object still cannot be allocated after the TLAB has been replaced once, it is allocated directly in the Eden area.

The advantage of TLAB is more efficient allocation with less competition and synchronization between threads. However, because the size of a TLAB is fixed, some space may be wasted: the unused tail of each retired TLAB leaves a small gap in Eden, and these gaps add up. When a program creates a very large number of objects, consider adjusting the heap structure or using techniques such as object pooling to limit this drawback.

Although not all object instances can successfully allocate memory in TLAB, the JVM does use TLAB as the first choice for memory allocation.

Developers can control whether TLAB is enabled through the option "-XX:+UseTLAB" (or "-XX:-UseTLAB" to disable it).

By default, a TLAB is very small, occupying only 1% of the entire Eden space. The percentage of Eden space used for TLAB can be set through the option "-XX:TLABWasteTargetPercent".

When an object cannot be allocated in the TLAB, the JVM allocates it directly in Eden, using a locking mechanism to keep the operation atomic.
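
The fast path and slow path described in this section can be sketched over a simulated address range. Everything here is a toy model: the class `TlabSketch`, the 1 MiB Eden, and the 4 KiB TLAB size are assumptions, and a compare-and-set loop stands in for the locking mechanism mentioned above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of TLAB allocation over a simulated address range; not HotSpot code. */
final class TlabSketch {
    static final long EDEN_SIZE = 1 << 20;   // hypothetical 1 MiB Eden
    static final long TLAB_SIZE = 1 << 12;   // hypothetical 4 KiB per-thread buffer

    private final AtomicLong edenTop = new AtomicLong(0);   // shared by all threads

    long edenUsed() { return edenTop.get(); }

    /** Shared slow path: carve 'size' bytes out of Eden atomically. Returns -1 when Eden is full. */
    long allocateInEden(long size) {
        while (true) {
            long old = edenTop.get();
            if (old + size > EDEN_SIZE) return -1;           // a real JVM would run a Minor GC
            if (edenTop.compareAndSet(old, old + size)) return old;
        }
    }

    /** Per-thread buffer: plain fields, no synchronization on the fast path. */
    final class Tlab {
        long top = 0, end = 0;

        long allocate(long size) {
            if (top + size <= end) {                         // fast path: bump the private pointer
                long addr = top;
                top += size;
                return addr;
            }
            if (size > TLAB_SIZE) return allocateInEden(size);   // too big for any TLAB
            long start = allocateInEden(TLAB_SIZE);          // retire the old TLAB, request a new one
            if (start < 0) return -1;
            top = start + size;
            end = start + TLAB_SIZE;
            return start;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TlabSketch heap = new TlabSketch();
        Runnable worker = () -> {
            Tlab tlab = heap.new Tlab();
            for (int i = 0; i < 1000; i++) tlab.allocate(64);
        };
        Thread a = new Thread(worker), b = new Thread(worker);
        a.start(); b.start(); a.join(); b.join();
        System.out.println("eden used: " + heap.edenUsed());
    }
}
```

Each thread touches the shared pointer only once per 4 KiB buffer instead of once per object. The last buffer of each thread is only partly filled, which illustrates the wasted space discussed above.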

7. Commonly used parameter settings for heap space


Oracle official website configuration: 
https://docs.oracle.com/javase/8/docs/technotes/tools/unix/java.html


-XX:+PrintFlagsInitial: View the default initial values of all parameters
-XX:+PrintFlagsFinal: View the final values of all parameters (they may have been modified and may no longer equal the initial values)
-Xms: Initial heap size (default is 1/64 of physical memory)
-Xmx: Maximum heap size (default is 1/4 of physical memory)
-Xmn: Set the size of the young generation (both its initial and maximum value)
-XX:NewRatio: Configure the proportion of the young generation to the old generation in the heap
-XX:SurvivorRatio: Set the proportion of Eden to the S0/S1 spaces in the young generation
-XX:+PrintGCDetails: Output a detailed GC log
-XX:+PrintGC or -verbose:gc: Print brief GC information
-XX:HandlePromotionFailure: Whether to enable the space allocation guarantee
-XX:MaxTenuringThreshold: Set the maximum age of objects in the young generation
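
A quick way to check the effect of -Xms and -Xmx is to ask the running JVM through `java.lang.Runtime`. This is a minimal sketch with a made-up class name; the reported values are approximate, since some collectors leave part of the heap (for example one Survivor space) out of these figures.

```java
// Illustrative helper; the class name is hypothetical.
final class HeapSettings {
    static long toMb(long bytes) { return bytes / (1024 * 1024); }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the upper bound set by -Xmx (or the default).
        System.out.println("max heap:          " + toMb(rt.maxMemory()) + " MB");
        // totalMemory() is the heap currently committed; it starts near -Xms.
        System.out.println("committed heap:    " + toMb(rt.totalMemory()) + " MB");
        System.out.println("free in committed: " + toMb(rt.freeMemory()) + " MB");
    }
}
```

Running it once with the defaults and once with, say, -Xms600m -Xmx600m shows how the two options move the reported values.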

8. Briefly describe several types of GC

When the JVM performs GC, the three memory areas above (young generation, old generation, and method area) are not all collected together every time. Most collections only cover the young generation.

In the HotSpot VM, GC is divided into two broad types according to the area that is collected: partial collection (Partial GC) and full-heap collection (Full GC).

Partial GC: garbage collection that does not collect the entire Java heap. It is further divided into:

  • Young generation collection (Minor GC / Young GC): collects only the young generation
  • Old generation collection (Major GC / Old GC): collects only the old generation
  • Mixed collection (Mixed GC): collects the entire young generation and part of the old generation. Currently, only G1 GC behaves this way

Full GC: garbage collection that covers the entire Java heap and the method area

8.1 Minor GC (young generation GC)

  • A Minor GC is triggered when the young generation runs out of space. "Young generation" here means Eden: a full Eden triggers a Minor GC, whereas a full Survivor space does not (although every Minor GC cleans up the whole young generation, Survivor spaces included).

  • Because most Java objects are short-lived, Minor GC is very frequent and usually fast.

  • A Minor GC triggers STW (stop-the-world): the user threads are suspended and resume only after the collection has finished.

8.2 Major GC (old generation GC)

A Major GC is usually accompanied by at least one Minor GC, but not always: the collection strategy of the Parallel Scavenge collector can choose to go straight to a Major GC. In other words, when the old generation runs short of space, a Minor GC is normally tried first, and a Major GC is triggered only if there is still not enough space afterwards.

A Major GC is generally more than 10 times slower than a Minor GC, and its STW pause is longer. If there is still not enough memory after a Major GC, an OOM is reported.

8.3 Full GC

There are five situations that trigger a Full GC:

(1) System.gc() is called. This only suggests that the system run a Full GC; it is not necessarily executed
(2) There is insufficient space in the old generation
(3) There is insufficient space in the method area
(4) The average size of the objects promoted to the old generation by Minor GC is larger than the available memory of the old generation
(5) When objects are copied from the Eden area and Survivor space0 (From Space) to Survivor space1 (To Space), an object that is larger than the available memory of To Space is transferred to the old generation, but the available memory of the old generation is also smaller than the object

Note: Full GC should be avoided as far as possible during development and tuning, so that pause times stay shorter.
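
To observe how often the different collections run, the standard `GarbageCollectorMXBean` interface can be queried. The sketch below is a hedged illustration: the class name and the allocation loop are made up, and the collector names and counts that are printed depend entirely on the JVM and the GC in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative helper; the class name is hypothetical.
final class GcCounters {
    /** Sum of collection counts over all collectors; undefined (-1) counts are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    public static void main(String[] args) {
        // Churn through short-lived arrays so that young generation collections become likely.
        byte[][] garbage = new byte[64][];
        for (int i = 0; i < 10_000; i++) {
            garbage[i % garbage.length] = new byte[64 * 1024];
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + " time=" + gc.getCollectionTime() + "ms");
        }
    }
}
```

With a small heap, the young generation collector would typically report many more collections than the old generation collector, which is the pattern this section describes.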

9. Escape analysis

"In-depth Understanding of Java Virtual Machine" says this about Java heap memory: with the development of JIT compilers and the gradual maturing of escape analysis, stack allocation and scalar replacement will cause some subtle changes, and the rule that all objects are allocated on the heap gradually becomes less "absolute".

In the Java virtual machine, it is common knowledge that objects are allocated in the Java heap. There is one special case, however: if escape analysis finds that an object does not escape its method, the allocation may be optimized into a stack allocation. This removes the need to allocate memory on the heap and the need to garbage collect the object. It is also the most common off-heap storage technique.

Moving an allocation from the heap to the stack requires escape analysis. This is a cross-function, global data flow analysis algorithm that can effectively reduce the synchronization load and the heap allocation pressure of Java programs.
Through escape analysis, the Java HotSpot compiler can analyze the scope in which a reference to a new object is used and decide whether the object has to be allocated on the heap.

9.1 Code Example

The basic behavior of escape analysis is to analyze the dynamic scope of the object: when an object is defined in a method and the object is only used inside the method, it is considered that no escape has occurred.


// V stands for any class.
public void test_method1() {
    V v = new V();  // v is only used inside this method, so it does not escape
    v = null;
}

If an object does not escape, it can be allocated on the stack. When the method finishes executing, its stack space is released.

When an object is defined in a method and is then referenced outside that method, an escape occurs. For example, the object is passed as a call argument to another method, or returned to the caller.


public static StringBuffer createStringBuffer(String s1,String s2){
    StringBuffer stringBuffer = new StringBuffer();
    stringBuffer.append(s1);
    stringBuffer.append(s2);
    return stringBuffer;
}

If you do not want the StringBuffer in the code above to escape the method, you can write it like this:

public static String createStringBuffer(String s1,String s2){
    StringBuffer stringBuffer = new StringBuffer();
    stringBuffer.append(s1);
    stringBuffer.append(s2);
    return stringBuffer.toString();
}

9.2 Code optimization

9.2.1 Allocation on the stack

Based on the results of escape analysis during compilation, if the JIT compiler finds that an object does not escape its method, the allocation may be optimized into a stack allocation. Execution then continues in the call stack; when the method returns, its stack frame is released and the local object goes away with it, so no garbage collection is needed. (HotSpot achieves this effect through scalar replacement, described in 9.2.3, rather than by placing whole objects on the stack.)

The scenarios that decide whether stack allocation is possible were shown in the escape analysis examples: assigning the object to a member variable, returning it from a method, and passing the instance reference elsewhere all cause the object to escape.

9.2.2 Synchronization omission (elimination)

The cost of thread synchronization is quite high, and its consequence is reduced concurrency and performance.

When dynamically compiling a synchronized block, the JIT compiler can use escape analysis to determine whether the lock object used by the block is accessible to only one thread and has not been published to other threads. If that is the case, the JIT compiler removes the synchronization when it compiles the block, which can greatly improve concurrency and performance. This removal of synchronization is called synchronization omission, also known as lock elimination.


public void f(){
    Object o = new Object();
    synchronized (o){
        System.out.println(o);
    }
}

The code locks on the object o, but the life cycle of o is confined to the f() method and o will never be accessed by other threads, so the lock is optimized away during JIT compilation. The code is optimized into:


public void f(){
    Object o = new Object();
    System.out.println(o);
}

9.2.3 Scalar replacement

A scalar is a piece of data that cannot be broken down into smaller pieces. The primitive data types in Java are scalars.
In contrast, data that can be decomposed is called an aggregate. An object in Java is an aggregate because it can be decomposed into other aggregates and scalars.
In the JIT stage, if escape analysis finds that an object is not accessed from outside, the JIT optimization breaks the object up into the member variables it contains and uses those instead. This process is scalar replacement.


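// The wrapper class name ScalarDemo is illustrative.
public class ScalarDemo {
    public static void main(String[] args) {
        alloc();
    }

    private static void alloc() {
        Point point = new Point(1, 2);  // never leaves alloc(), so it does not escape
        System.out.println("point X:" + point.x);
    }
}

class Point {
    int x;  // package-private so that alloc() can read point.x
    int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}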

The above code, after scalar replacement, will become


private static void alloc(){
    int x = 1;
    int y = 2;
    System.out.println("point X:" + x);
}

As the example shows, escape analysis found that the aggregate Point does not escape, so it was replaced by two scalars. What is the benefit of scalar replacement? It can greatly reduce heap memory usage, because once the object no longer needs to be created, no heap memory has to be allocated for it.

Scalar replacement parameter settings:
Parameter -XX:+EliminateAllocations: Enables scalar replacement (on by default), allowing objects to be broken up into their fields, which are then allocated on the stack.

9.3 Parameter settings

Since JDK 6u23, escape analysis is enabled by default in HotSpot.

With an earlier version, developers can use the following options:

The option "-XX:+DoEscapeAnalysis" explicitly turns on escape analysis

The option "-XX:+PrintEscapeAnalysis" shows the results of escape analysis


Parameter -server: Start in server mode, because escape analysis can be enabled only in server mode.
Parameter -XX:+DoEscapeAnalysis: Enable escape analysis.
Parameter -Xmx10m: Specify a maximum heap space of 10 MB.
Parameter -XX:+PrintGC: Print the GC log.
Parameter -XX:+EliminateAllocations: Enables scalar replacement (on by default), allowing objects to be broken up and allocated on the stack. For example, if an object has two fields, id and name, these two fields are treated as two independent local variables.
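
One way to try these parameters is a small program that creates many objects that never leave their method. The sketch below is an illustration under assumptions: the class `EscapeDemo`, the `User` type with fields id and name, and the iteration count are made up, and whether the JIT actually removes the allocations depends on the JVM version and its inlining decisions.

```java
/** Allocation-heavy loop for experimenting with the escape analysis flags; illustrative only. */
final class EscapeDemo {
    static final class User {
        final int id;
        final String name;
        User(int id, String name) { this.id = id; this.name = name; }
    }

    /** The User never leaves this method, so the JIT may scalar-replace it. */
    static int alloc(int i) {
        User u = new User(i, "n");
        return u.id + u.name.length();
    }

    static long run(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += alloc(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        long sum = run(10_000_000);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("sum=" + sum + " elapsed=" + elapsed + "ms");
    }
}
```

Running it with -Xmx10m -XX:+PrintGC, first with -XX:-DoEscapeAnalysis and then with -XX:+DoEscapeAnalysis, should make the difference visible: without escape analysis one would expect a stream of GC log lines, and with it far fewer.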

10. Summary

  • The young generation is the area where objects are born, grow, and die. An object is generated and used here, and is finally collected by the garbage collector and ends its life.

  • Objects with long life cycles in the old generation are usually Java objects copied from the Survivor area. There are also special cases. Ordinary objects are allocated in a TLAB; if an object is larger, the JVM tries to allocate it directly elsewhere in Eden; if an object is so large that no sufficiently long contiguous free space can be found in the young generation, the JVM allocates it directly in the old generation.

  • When GC occurs only in the young generation, reclaiming young generation objects is called Minor GC. When GC occurs in the old generation, it is called Major GC or Full GC. Minor GC generally occurs much more often than Major GC; that is, the old generation is collected far less frequently than the young generation.


Origin blog.csdn.net/2301_78834737/article/details/131990542