[Dark Horse JVM (2)] Garbage Collection

How to determine if an object can be recycled

Reference counting

Each time a variable or field refers to an object, the object's reference count is incremented by 1; two references give a count of 2. When a reference is dropped, the count is decremented by 1. When the count reaches 0, nothing references the object and it can be reclaimed as garbage.
The drawback is circular references:
Object A refers to object B, so B's reference count is 1. Object B in turn refers to object A, so A's count is also 1. This is a circular reference: the two objects keep each other's counts above 0 even when nothing else refers to them, so their memory is never released, which is a memory leak.
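The cycle problem can be reproduced with a toy counter. This is only a sketch: the Node class and its count field are hypothetical and hand-maintained, and the JVM does not manage memory this way.

```java
import java.util.ArrayList;
import java.util.List;

class RefCountDemo {
    /** Toy object with a hand-maintained reference count (hypothetical, not a JVM mechanism). */
    static class Node {
        int count = 0;                              // number of incoming references
        final List<Node> fields = new ArrayList<>();

        /** Store a reference to target in one of this object's fields. */
        void pointTo(Node target) {
            fields.add(target);
            target.count++;
        }
    }

    /** Returns the counts of A and B after both external variables are dropped. */
    static int[] countsAfterDroppingLocals() {
        Node a = new Node();
        Node b = new Node();
        a.count++;        // variable a refers to A
        b.count++;        // variable b refers to B
        a.pointTo(b);     // A.field -> B
        b.pointTo(a);     // B.field -> A
        a.count--;        // variable a goes away
        b.count--;        // variable b goes away
        return new int[] { a.count, b.count };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingLocals();
        // Neither count reaches 0, so a pure reference counter never frees A or B.
        System.out.println("A=" + counts[0] + ", B=" + counts[1]);   // prints A=1, B=1
    }
}
```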

Reachability analysis algorithm

The garbage collector in the Java virtual machine uses reachability analysis to explore all live objects.

  • First, determine a set of objects that must never be reclaimed as garbage: the GC Root objects.
  • Before a collection, scan the objects in the heap and check whether each one is referenced, directly or indirectly, by a GC Root. Objects that are cannot be reclaimed; objects that are not can be reclaimed as garbage. (In other words, follow the reference chains starting from the GC Roots; any object that cannot be found this way can be reclaimed.)
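The tracing idea can be sketched on a tiny object graph. This is a simplified model, not the JVM's implementation: objects are plain strings, the refs map stands for their reference fields, and the names root, A, B, C are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    /** Returns every object reachable from the roots by following reference edges. */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> live = new HashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String target : refs.getOrDefault(current, Collections.<String>emptyList())) {
                if (live.add(target)) {          // first visit: mark it and keep tracing
                    pending.push(target);
                }
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("C"));
        refs.put("A", Arrays.asList("B"));       // A and B only reference each other
        refs.put("B", Arrays.asList("A"));
        Set<String> live = reachable(refs, Arrays.asList("root"));
        System.out.println("C live? " + live.contains("C"));   // true
        System.out.println("A live? " + live.contains("A"));   // false: the cycle is unreachable
    }
}
```

Unlike reference counting, the A/B cycle is not a problem here: neither object is on a chain that starts at a root, so both count as garbage.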

Which objects can be used as GC Root?
The four categories are:
System classes: classes loaded by the bootstrap class loader; core classes that are certain to be used at run time (Object, HashMap, ...).
Native method stack: when the Java virtual machine executes a method it may have to call operating system (native) methods; the Java objects referenced by those native methods are roots.
Active threads: objects referenced from the stack frames of running threads.
Objects being locked: objects that the synchronized keyword locks on. A locked object cannot be reclaimed.

In Java, the objects that can serve as GC Roots fall into the following types:

  • Objects referenced from the Java virtual machine stack (the local variable table in each stack frame), such as the parameters, local variables, and temporaries of the methods each thread is currently executing.
    Each time a method is called, the JVM creates a stack frame and pushes it onto the thread's stack. A frame holds a local variable table and an operand stack. Variables in the local variable table may be of reference type, and the objects they refer to act as GC Roots. These references disappear when the call returns and the frame is popped.
  • Objects referenced by static fields of classes in the method area.
    Reference-type fields declared static are set up when the class is loaded.
  • Objects referenced by constants in the method area, such as references in the string constant pool.
    Reference-type fields declared final.
  • Objects referenced through JNI (native methods) in the native method stack.
    That is, objects referenced by native methods in the program.
  • References held inside the Java virtual machine itself, such as the Class objects for primitive types, some resident exception objects (such as NullPointerException and OutOfMemoryError), and the system class loader.
  • All objects held by synchronized locks (the synchronized keyword).
  • JMXBeans that reflect the internal state of the Java virtual machine, callbacks registered in JVMTI, the native code cache, and so on.
    Note that local variables live in the active stack frame, while the objects they refer to (created with new) live on the heap.

Four kinds of references

By strength: strong reference > soft reference (SoftReference) > weak reference (WeakReference) > phantom reference (PhantomReference). Finalizer references (FinalReference) are used internally by the JVM.

  1. Strong reference: strong references are the default in Java. Create an Object with new and assign it to obj; obj is then a strong reference to that new Object().
Object obj = new Object();

In any case, as long as the strong reference relationship still exists, the garbage collector will never reclaim the referenced object.

  2. Soft reference: describes objects that are useful but not essential.
  • A SoftReference instance holds a soft reference to a Java object. The object is reclaimed only when memory is insufficient and no strong references to it remain. If memory is sufficient, it is not reclaimed.
  • A soft reference can be registered with a reference queue. After its referent is reclaimed, the soft reference is enqueued, so the program can release the memory occupied by the SoftReference object itself.
  3. Weak reference: also describes non-essential objects, but it is weaker than a soft reference.
  • An object that is only weakly referenced (no strong references) is reclaimed at the next garbage collection, whether or not memory is short.
  • Weak references can likewise be combined with a reference queue to release the WeakReference objects themselves.
  4. Phantom reference (PhantomReference): the weakest kind of reference.
  • The referent cannot be obtained through get(), which always returns null. When the referent is reclaimed, the phantom reference is placed in its ReferenceQueue; the enqueued reference is the signal that the object has been collected.
  • It must be used with a reference queue. Its main use is with ByteBuffer: when the referenced ByteBuffer is reclaimed, the phantom reference is enqueued and the Reference Handler thread calls a method on it to release the direct memory.
    • Whether an object has a phantom reference has no effect on its lifetime, and an object instance can never be obtained through one.
    • The only purpose of attaching a phantom reference to an object is to receive a notification when the collector reclaims that object.
    • A phantom reference is associated with a reference queue when it is created. When a ByteBuffer implementation class object is created, a Cleaner phantom reference is created with it. The ByteBuffer allocates a block of direct memory and passes its address to the Cleaner.
    • Once no strong reference to the ByteBuffer remains, it is garbage collected, but the direct memory is not managed by Java. So when the ByteBuffer is reclaimed, its phantom reference enters the reference queue (ReferenceQueue). The Reference Handler thread regularly checks the queue for newly added Cleaners; when it finds one, it calls the Cleaner's clean method, which calls Unsafe.freeMemory with the recorded address to release the direct memory.
  5. Finalizer reference (FinalReference)
  • No manual coding is needed; the JVM uses it internally together with a reference queue. During garbage collection the finalizer reference is enqueued (its referent is not yet reclaimed); the Finalizer thread then finds the referent through the finalizer reference and calls its finalize method. The referent can be reclaimed only at the second GC.
    • Every class inherits from Object, which declares a finalize method. When an object overrides finalize and has no strong references, it becomes eligible for collection.
    • The virtual machine creates a finalizer reference for such an object. During garbage collection the finalizer reference is added to the reference queue (the referent is not yet reclaimed). A low-priority thread, the Finalizer thread, regularly checks the queue for new finalizer references, finds each referent through its reference, and calls its finalize method. The referent is reclaimed at the second GC.
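Two properties of phantom references described in this list can be checked directly: get() never returns the referent, and the reference itself is what arrives in the queue. The sketch below calls enqueue() by hand as a stand-in for the collector, purely to keep the example deterministic; in real use the GC enqueues the reference after the referent becomes unreachable. The class and method names are made up for the example.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

class PhantomDemo {
    /** True if get() returns null even though the referent is still strongly reachable. */
    static boolean getIsAlwaysNull() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);
        return ref.get() == null && referent != null;
    }

    /** True if the reference object itself is what the queue hands back. */
    static boolean queueDeliversReference() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);
        ref.enqueue();                        // manual stand-in for what the GC normally does
        Reference<?> polled = queue.poll();   // a cleanup thread would poll or remove like this
        return polled == ref && referent != null;
    }

    public static void main(String[] args) {
        System.out.println("get() is null: " + getIsAlwaysNull());
        System.out.println("queue delivers the reference: " + queueDeliversReference());
    }
}
```

A cleanup thread that finds the reference in the queue can then release whatever resource it tracks, which is the role the Cleaner plays for direct memory.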

-Xmx20m -XX:+PrintGCDetails -verbose:gc (sets the maximum heap size to 20 MB and prints GC details)

With strong references only, the loop below exhausts the 20 MB heap and fails with an OutOfMemoryError:

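    // _4MB is 4 * 1024 * 1024, as defined in the later demos
    public static void main(String[] args) throws IOException {
        List<byte[]> list = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            list.add(new byte[_4MB]);   // every array stays strongly reachable through the list
        }

        System.in.read();
    }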

Here the list references SoftReference objects, which in turn reference the byte[] arrays:

    public static void main(String[] args) throws IOException {
        // list --> SoftReference --> byte[]
        List<SoftReference<byte[]>> list = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            SoftReference<byte[]> ref = new SoftReference<>(new byte[_4MB]);
            System.out.println(ref.get());
            list.add(ref);
            System.out.println(list.size());
        }
        System.out.println("Loop finished: " + list.size());
        for (SoftReference<byte[]> ref : list) {
            System.out.println(ref.get());   // null for arrays that were reclaimed
        }
    }

The GC log shows that memory ran out on the fifth addition. A full garbage collection did not free enough space, so another collection was triggered that reclaimed the softly referenced arrays.
Soft reference reclamation can be combined with a reference queue to clean up the SoftReference objects themselves:

import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.List;

public class Demo2_4 {
    private static final int _4MB = 4 * 1024 * 1024;

    public static void main(String[] args) {
        List<SoftReference<byte[]>> list = new ArrayList<>();

        // Reference queue used to clean up the soft references themselves
        ReferenceQueue<byte[]> queue = new ReferenceQueue<>();

        for (int i = 0; i < 5; i++) {
            // Associated with the queue: when the byte[] behind a soft reference
            // is reclaimed, the soft reference itself is added to the queue
            SoftReference<byte[]> ref = new SoftReference<>(new byte[_4MB], queue);
            System.out.println(ref.get());
            list.add(ref);
            System.out.println(list.size());
        }

        // Take the now-useless soft references from the queue and remove them from the list
        Reference<? extends byte[]> poll = queue.poll();
        while (poll != null) {
            list.remove(poll);
            poll = queue.poll();
        }

        System.out.println("===========================");
        for (SoftReference<byte[]> reference : list) {
            System.out.println(reference.get());
        }
    }
}

Weak reference example:

-Xmx20m -XX:+PrintGCDetails -verbose:gc

import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

public class Demo2_5 {
    private static final int _4MB = 4 * 1024 * 1024;

    public static void main(String[] args) {
        // list --> WeakReference --> byte[]
        List<WeakReference<byte[]>> list = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            WeakReference<byte[]> ref = new WeakReference<>(new byte[_4MB]);
            list.add(ref);
            // Print the referent of every weak reference added so far
            for (WeakReference<byte[]> w : list) {
                System.out.print(w.get() + " ");
            }
            System.out.println();
        }
        System.out.println("Loop finished: " + list.size());
    }
}

On the 5th addition memory ran out, so the 4th array was reclaimed before the 5th was added. On the 10th addition the WeakReference objects themselves, which also occupy memory, no longer fit, so a Full GC ran and cleared all the weak references.

Garbage collection algorithm

Mark-sweep

The mark-sweep algorithm has two stages, "mark" and "sweep". First, reachability analysis determines which objects need to be reclaimed and marks them; then all marked objects are reclaimed in one pass.

  • Advantages: No additional processing required, fast cleaning speed
  • Disadvantages: It will cause memory fragmentation, and subsequent problems may occur such that large objects cannot find available space.
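The fragmentation drawback shows up even in a toy model. In this sketch (an illustration only, not collector code) the heap is a char array, each letter is one cell of an object, '.' is a free cell, and the set of live objects is assumed to be already known from marking.

```java
class MarkSweepDemo {
    /** Frees every cell whose owner is not live; '.' denotes a free cell. */
    static char[] sweep(char[] heap, String liveObjects) {
        char[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (liveObjects.indexOf(result[i]) < 0) {
                result[i] = '.';
            }
        }
        return result;
    }

    /** Length of the longest run of free cells, i.e. the largest object that still fits. */
    static int largestFreeRun(char[] heap) {
        int best = 0;
        int run = 0;
        for (char cell : heap) {
            run = (cell == '.') ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        char[] heap = "AABBCCDD".toCharArray();
        char[] swept = sweep(heap, "AC");                 // B and D are garbage
        System.out.println(new String(swept));            // AA..CC..
        System.out.println(largestFreeRun(swept));        // 2, although 4 cells are free
    }
}
```

Four cells are free after the sweep, but a 3-cell object cannot be placed because no free run is longer than 2.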

Mark-compact

The "mark" stage of the mark-compact algorithm is the same as in mark-sweep, but the garbage is not cleaned up directly after marking. Instead, all live objects are moved to one end of memory, and everything beyond that boundary is then cleared.

  • Advantages: No memory fragmentation
  • Disadvantages: low efficiency, slow speed
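A toy version of the compaction step, using a simplified model assumed for illustration: the heap is a char array, each letter is one cell of an object, '.' is free, and the live set is already known.

```java
import java.util.Arrays;

class MarkCompactDemo {
    /** Slides live cells to the front of the heap, keeping their order; '.' marks free cells. */
    static char[] compact(char[] heap, String liveObjects) {
        char[] result = new char[heap.length];
        Arrays.fill(result, '.');
        int next = 0;                                   // next free slot at the compacted end
        for (char cell : heap) {
            if (liveObjects.indexOf(cell) >= 0) {
                result[next++] = cell;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        char[] compacted = compact("AABBCCDD".toCharArray(), "AC");
        System.out.println(new String(compacted));      // AACC....
    }
}
```

All free space ends up in one contiguous block. The model hides the expensive part: a real collector must also update every reference that points to a moved object, which is where the lower speed comes from.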

Mark-copy

Divide the memory into two blocks and allocate from only one of them at a time. When that block runs out, copy all surviving objects from it to the other block, then clear the entire used block in one go.


  • Advantages: No memory fragmentation
  • Disadvantages: Available space halved
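A minimal sketch of the two-block scheme, under the same toy assumptions as a char-array heap (each letter is one object, '.' is free, the live set is given). The class name and its methods are hypothetical.

```java
import java.util.Arrays;

class SemispaceDemo {
    private char[] from;   // block currently used for allocation
    private char[] to;     // idle block of the same size

    SemispaceDemo(String initialFrom) {
        from = initialFrom.toCharArray();
        to = new char[from.length];
        Arrays.fill(to, '.');
    }

    /** Copies live cells into the idle block, clears the used block, then swaps the two. */
    void collect(String liveObjects) {
        int next = 0;
        for (char cell : from) {
            if (liveObjects.indexOf(cell) >= 0) {
                to[next++] = cell;
            }
        }
        Arrays.fill(from, '.');        // the whole used block is released at once
        char[] tmp = from;
        from = to;
        to = tmp;
    }

    String activeSpace() { return new String(from); }

    String idleSpace() { return new String(to); }

    public static void main(String[] args) {
        SemispaceDemo heap = new SemispaceDemo("ABCD");
        heap.collect("BD");                              // A and C are garbage
        System.out.println(heap.activeSpace());          // BD..
        System.out.println(heap.idleSpace());            // ....
    }
}
```

Survivors land contiguously in the new block, so there is no fragmentation, but one of the two blocks is always idle.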

Generational garbage collection

Objects that are used for a long time are placed in the old generation, and objects that can be discarded after use are placed in the new generation. Garbage collection in the old generation occurs once in a long time, while garbage collection in the new generation occurs more frequently.

  • New objects are allocated in Eden by default. As objects keep being added and Eden runs out of memory, a minor GC is triggered.
  • A minor GC triggers stop the world: user threads are suspended and resume only after the collection finishes.
  • Reachability analysis follows the GC Root reference chains to decide what is garbage. Using the mark-copy algorithm, the surviving objects in Eden and Survivor From are copied to Survivor To and their age is increased by 1. Everything left in Eden and From is garbage and is reclaimed, and then From and To swap roles.
  • When new objects fill Eden again, a second garbage collection is triggered. The surviving objects in Eden are moved to Survivor To with their age increased by 1, and the surviving objects in Survivor From are moved to Survivor To with their age increased by 1 as well. Objects that are no longer needed are reclaimed, and From and To swap roles.


  • When an object's age exceeds the threshold (15 by default; the age is stored in 4 bits), the object is considered long-lived and is promoted to the old generation.
  • When the old generation runs short of space, a minor GC is tried first. If space is still insufficient afterwards, a full GC (a collection of the whole heap) is triggered, with a longer STW pause.

Related VM parameters

Meaning                      Parameter
Initial heap size            -Xms
Maximum heap size            -Xmx or -XX:MaxHeapSize=size
Young generation size        -Xmn, or -XX:NewSize=size together with -XX:MaxNewSize=size (-Xmn sets the initial and maximum size at once)
Survivor ratio (dynamic)     -XX:InitialSurvivorRatio=ratio and -XX:+UseAdaptiveSizePolicy
Survivor ratio               -XX:SurvivorRatio=ratio (default 8: if the young generation is 10, Eden is 8 and From and To are 1 each)
Promotion threshold          -XX:MaxTenuringThreshold=threshold
Promotion details            -XX:+PrintTenuringDistribution
GC details                   -XX:+PrintGCDetails -verbose:gc
Minor GC before Full GC      -XX:+ScavengeBeforeFullGC (on by default)
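The SurvivorRatio arithmetic can be worked through in a few lines. This is a sketch of the ratio only (Eden : one survivor space = ratio : 1, with two survivor spaces); the method names are made up, and a real JVM also aligns and may adapt these sizes, so actual numbers can differ slightly.

```java
class YoungGenLayout {
    /** Size of one survivor space when Eden : From : To = ratio : 1 : 1. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Size of Eden under the same split. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return survivorBytes(youngBytes, survivorRatio) * survivorRatio;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long young = 10 * mb;                                        // as with -Xmn10M
        System.out.println(edenBytes(young, 8) / mb + " MB Eden");               // 8 MB Eden
        System.out.println(survivorBytes(young, 8) / mb + " MB per survivor");   // 1 MB per survivor
    }
}
```

With a 10 MB young generation and the default ratio of 8, only Eden plus one survivor space (9 MB) is usable for objects at any time, since the other survivor space is kept empty as the copy target.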

GC case analysis
setting parameters:

-Xms20M -Xmx20M -Xmn10M -XX:+UseSerialGC -XX:+PrintGCDetails -verbose:gc

public class Demo2_1 {
    private static final int _512KB = 512 * 1024;
    private static final int _1MB = 1024 * 1024;
    private static final int _6MB = 6 * 1024 * 1024;
    private static final int _7MB = 7 * 1024 * 1024;
    private static final int _8MB = 8 * 1024 * 1024;

    public static void main(String[] args) throws InterruptedException {
    }
}


import java.util.ArrayList;

public class Demo2_1 {
    private static final int _512KB = 512 * 1024;
    private static final int _1MB = 1024 * 1024;
    private static final int _6MB = 6 * 1024 * 1024;
    private static final int _7MB = 7 * 1024 * 1024;
    private static final int _8MB = 8 * 1024 * 1024;

    public static void main(String[] args) throws InterruptedException {
            ArrayList<byte[]> list = new ArrayList<>();
            list.add(new byte[_7MB]);
            list.add(new byte[_512KB]);
    }
}


import java.util.ArrayList;

public class Demo2_1 {
    private static final int _512KB = 512 * 1024;
    private static final int _1MB = 1024 * 1024;
    private static final int _6MB = 6 * 1024 * 1024;
    private static final int _7MB = 7 * 1024 * 1024;
    private static final int _8MB = 8 * 1024 * 1024;

    public static void main(String[] args) throws InterruptedException {
            ArrayList<byte[]> list = new ArrayList<>();
            list.add(new byte[_8MB]);
    }
}


import java.util.ArrayList;

public class Demo2_1 {
    private static final int _512KB = 512 * 1024;
    private static final int _1MB = 1024 * 1024;
    private static final int _6MB = 6 * 1024 * 1024;
    private static final int _7MB = 7 * 1024 * 1024;
    private static final int _8MB = 8 * 1024 * 1024;

    public static void main(String[] args) throws InterruptedException {
            ArrayList<byte[]> list = new ArrayList<>();
            list.add(new byte[_8MB]);
            list.add(new byte[_8MB]);
    }
}


import java.util.ArrayList;

public class Demo2_1 {
    private static final int _512KB = 512 * 1024;
    private static final int _1MB = 1024 * 1024;
    private static final int _6MB = 6 * 1024 * 1024;
    private static final int _7MB = 7 * 1024 * 1024;
    private static final int _8MB = 8 * 1024 * 1024;

    public static void main(String[] args) throws InterruptedException {
        new Thread(() -> {
            ArrayList<byte[]> list = new ArrayList<>();
            list.add(new byte[_8MB]);
            list.add(new byte[_8MB]);
        }).start();

        System.out.println("sleep....");
        Thread.sleep(1000L);
    }
}

An OutOfMemoryError thrown inside a child thread terminates only that thread; it does not by itself end the Java process (here the main thread keeps running).

Garbage collector

Serial

A single-threaded collector: it uses only one CPU, or one collection thread, to do garbage collection, and while it collects, all other worker threads must be suspended until the collection finishes.
Advantages: simple and efficient (compared with other collectors running on a single thread). In an environment limited to one CPU, the Serial collector has no thread-interaction overhead and can concentrate on collecting, so it achieves the highest single-threaded collection efficiency.
Scenario: small heaps, personal computers.

-XX:+UseSerialGC = Serial + SerialOld: both the young generation and the old generation use serial collectors, that is, Serial GC (copying algorithm) in the young generation and Serial Old GC (mark-compact algorithm) in the old generation.

The user's work thread stops at a safe point

Throughput first

A multi-threaded collector whose goal is to minimize the total STW time per unit of time (for example, two pauses of 0.2 each: 0.2 + 0.2 = 0.4). CPU time is used efficiently and the program's computation finishes as soon as possible. Garbage collection takes the smallest share of the run time, which is why throughput is said to be high.
Scenario: large heaps on multi-core CPUs (on a single core, multiple threads just take turns competing for time slices, which is less efficient); suited to background tasks that need little interaction. "Parallel" means that multiple garbage collection threads run in parallel on different CPUs. User threads stay suspended during this time; only the garbage collection threads run.

-XX:+UseParallelGC: makes the young generation use the Parallel collector (copying algorithm).

-XX:+UseParallelOldGC: makes the old generation use the Parallel Old collector (mark-compact algorithm).
Both are on by default in JDK 8. Enabling either one enables the other as well.

-XX:+UseAdaptiveSizePolicy: adaptively adjusts the young generation (its size, the ratio between its areas, and the promotion threshold).

-XX:ParallelGCThreads=n: sets the number of threads of the young generation parallel collector. It is best kept equal to the number of CPUs, so that excess threads do not hurt collection performance. By default, when there are 8 or fewer CPUs, ParallelGCThreads equals the CPU count; when there are more than 8, it equals 3 + (5 * CPU_COUNT / 8).
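The default described above is easy to tabulate. The sketch below simply encodes the formula as stated in this text, with integer division assumed; the method name is hypothetical, and the real JVM's value can vary by version and platform, so treat the result as an estimate.

```java
class GcThreadDefaults {
    /** Default ParallelGCThreads per the rule in the text: CPUs up to 8, then 3 + 5*CPUs/8. */
    static int defaultParallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 3 + (5 * cpus / 8);   // integer division (an assumption)
    }

    public static void main(String[] args) {
        for (int cpus : new int[] { 4, 8, 16, 32 }) {
            System.out.println(cpus + " CPUs -> " + defaultParallelGcThreads(cpus) + " GC threads");
        }
        // 4 -> 4, 8 -> 8, 16 -> 13, 32 -> 23
    }
}
```

The value a running JVM actually chose can be checked with -XX:+PrintFlagsFinal.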

-XX:GCTimeRatio=N: the target share of garbage collection time in total time, computed as 1/(N+1); it is the measure of throughput. N ranges over (0, 100) and defaults to 99, meaning garbage collection may take no more than 1% of the time. That is hard to achieve, so N is commonly set to 19, which allows 5 minutes of garbage collection in every 100 minutes.
This goal conflicts to some extent with -XX:MaxGCPauseMillis: the longer the pauses, the more easily the GCTimeRatio share is exceeded.
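The 1/(N+1) relationship can be checked numerically. A small sketch, with a hypothetical helper name:

```java
import java.util.Locale;

class GcTimeRatioDemo {
    /** Largest fraction of total time that GC may take with -XX:GCTimeRatio=n, i.e. 1/(n+1). */
    static double maxGcFraction(int n) {
        return 1.0 / (n + 1);
    }

    public static void main(String[] args) {
        for (int n : new int[] { 99, 19 }) {
            double percent = maxGcFraction(n) * 100;
            System.out.println(String.format(Locale.ROOT, "GCTimeRatio=%d -> at most %.1f%% GC time", n, percent));
        }
    }
}
```

N = 99 gives a 1% budget and N = 19 gives 5%, which matches the "5 minutes in 100" reading.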

-XX:MaxGCPauseMillis=ms: sets the maximum pause time of the garbage collector (that is, the STW time) in milliseconds. Use this parameter with caution.
To keep pauses within MaxGCPauseMillis as far as possible, the collector adjusts the Java heap size and some other parameters while it runs.
For interactive users, the shorter the pause, the better the experience. On the server side the focus is on high concurrency and overall throughput, so the Parallel collector is a good fit there.


Response time priority

Multi-threaded.
Scenario: large heaps on multi-core CPUs.
The goal is to keep each single STW pause as short as possible (for example, five pauses of 0.1 each: 0.1 + 0.1 + 0.1 + 0.1 + 0.1 = 0.5).

-XX:+UseConcMarkSweepGC (old generation, mark-sweep algorithm) ~ -XX:+UseParNewGC (young generation, copying algorithm) ~ SerialOld (fallback for the old generation)
"Concurrent" means that user threads keep running while the collector works, with the garbage collection threads competing with them for the CPU. "Mark" and "sweep" refer to the mark-sweep algorithm.

-XX:ParallelGCThreads=n: the number of parallel garbage collection threads, generally equal to the number of CPUs.
-XX:ConcGCThreads=threads: the number of concurrent garbage collection threads, generally 1/4 of ParallelGCThreads; that is, one CPU does garbage collection and the remaining three are left to user threads.

-XX:CMSInitiatingOccupancyFraction=percent: the old generation occupancy at which a CMS collection starts. An early default was about 65: as soon as old generation usage reaches 65%, cleanup starts, leaving 35% of the space for newly produced floating garbage.

-XX:+CMSScavengeBeforeRemark Do a garbage collection on the new generation before remarking


When a concurrent mode failure occurs, the CMS collector degrades to SerialOld, a single-threaded mark-compact collector.

CMS old generation recycling process

  • When old generation space runs low, all threads run to a safepoint and pause, and the garbage collection thread performs the initial mark. The initial mark is fast because it marks only the root objects. This phase is Stop The World and blocks user threads.
  • After the initial mark, user threads resume. The garbage collection thread now performs concurrent marking, tracing the rest of the object graph while user threads run. This phase is not STW and does not interrupt the work of user threads.
  • At the next safepoint comes the remark. During concurrent marking, user threads were running too and may have created new objects and new references that interfere with the collector's results, so those changes must be marked again. This phase is STW.
  • After the remark, user threads resume and the garbage collection thread sweeps the garbage concurrently.

Across the whole cycle, only the initial mark and the remark are STW. The other phases run concurrently with user threads, so pauses are particularly short.

CMS garbage collector

  • CMS itself does not use much CPU, but user threads keep running while it collects. The garbage collection threads take CPU time away from user threads, so the throughput of the whole application drops.
  • During the final concurrent sweep, other threads are still running and produce new garbage that cannot be cleaned up until the next collection. This is called floating garbage, and some space has to be reserved for it.
  • During the remark phase, objects in the young generation may reference objects in the old generation, so the remark has to scan the whole heap. Reachability analysis follows every reference from the young generation into the old generation, whether or not the young object is still alive, which costs performance: the young generation holds many objects, and many of them are garbage. Tracing from a dead young object into the old generation is wasted work, because that young object will be reclaimed anyway. It therefore helps to collect the young generation before the remark (-XX:+CMSScavengeBeforeRemark). After that young collection (done by the ParNew collector, -XX:+UseParNewGC), far fewer young objects remain and the remark has less to do.
  • Because CMS is based on mark-sweep, it can leave a lot of memory fragmentation. Later, when an allocation finds too little space in the young generation after a minor GC and too little contiguous space in the old generation as well, a concurrent mode failure occurs. CMS then degrades to serial SerialOld collection, which recovers space by mark-compact. The compaction makes that collection take much longer, which users experience as a long pause.

G1

Definition: Garbage First
2004: paper published
2009: JDK 6u14, experimental
2012: JDK 7u4, officially supported
2017: JDK 9, becomes the default collector; CMS is deprecated
Applicable scenarios:

  • Balances throughput and low latency; the default pause target is 200 ms. The garbage collection threads run concurrently with user threads.
  • Suited to very large heaps: the heap is divided into multiple regions of equal size (each a power of two, for example 1, 2, 4, or 8 MB).
  • Overall it behaves as a mark-compact algorithm; between two regions it is a copying algorithm.

-XX:+UseG1GC: explicitly enables G1
-XX:G1HeapRegionSize=size: sets the region size
-XX:MaxGCPauseMillis=time: sets the pause target


Garbage collection phases


Young collection: also called Minor GC (Young GC); it happens when the Eden area is full.
Young collection + concurrent marking: when old generation usage exceeds a threshold, concurrent marking runs alongside young collection.
Mixed collection: cleans up not only the young generation but also part of the old generation.

Young Collection

Young GC mainly collects the Eden area and is triggered when Eden space is exhausted. Live data in Eden is moved to a Survivor space; if the Survivor space is too small, part of the Eden data is promoted directly to the old generation. Data in the old Survivor space is moved to the new Survivor space, and some of it is promoted to the old generation as well. At the end Eden is empty, the GC stops, and application threads continue.

Young Collection: cross-generation references

During a young collection, the collector first finds the GC Root objects, runs reachability analysis to find the surviving objects, and copies them to the survivor area.
How are all the root objects found? Some roots come from the old generation, which holds many live objects. Traversing the whole old generation to find them would take a long time, so G1 introduces the RSet. Its full name is Remembered Set, and its job is to track the object references that point into a given heap region.
In the course's figure, the RSet of region2 records the two references that point to objects in that region.

  • Each region has its own remembered set (RSet).

  • Every write to a reference-type field goes through a post-write barrier, a short piece of code that runs right after the write.

  • The barrier checks whether the object the new reference points to is in a different region from the object holding the field. (Other collectors check whether an old generation object references a young generation object and, if so, mark the card dirty.)

  • If the regions differ, the reference information is recorded, via the card table, in the RSet of the region that the reference points to.
    The old generation is maintained with card table technology: it is subdivided into cards of about 512 bytes each. If an old generation object references the young generation, its card is marked as a dirty card. GC Root traversal then does not need to search the whole old generation; it only looks at the dirty cards, which narrows the scan and speeds up the search.

  • When the young generation is collected, the dirty cards are found through the remembered set, and only the objects in those cards are traversed as GC Roots.
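The card-marking step of the barrier can be sketched with a boolean array. This is a toy model under stated assumptions: a card covers 512 bytes (the usual HotSpot card size), addresses are plain offsets into the old generation, and the class and method names are hypothetical. The per-region RSet bookkeeping that G1 adds on top is left out.

```java
import java.util.ArrayList;
import java.util.List;

class CardTableDemo {
    static final int CARD_SHIFT = 9;              // 2^9 = 512 bytes per card (assumed)
    private final boolean[] dirty;

    CardTableDemo(int oldGenBytes) {
        int cardSize = 1 << CARD_SHIFT;
        dirty = new boolean[(oldGenBytes + cardSize - 1) / cardSize];
    }

    /** Post-write barrier: an old-generation field at this offset now points to a young object. */
    void recordOldToYoungWrite(int fieldOffset) {
        dirty[fieldOffset >> CARD_SHIFT] = true;
    }

    /** Indexes of the cards a young collection has to scan for roots. */
    List<Integer> dirtyCards() {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < dirty.length; i++) {
            if (dirty[i]) {
                result.add(i);
            }
        }
        return result;
    }

    int cardCount() {
        return dirty.length;
    }

    public static void main(String[] args) {
        CardTableDemo table = new CardTableDemo(4096);   // 8 cards
        table.recordOldToYoungWrite(100);                // card 0
        table.recordOldToYoungWrite(600);                // card 1
        table.recordOldToYoungWrite(700);                // card 1 again
        System.out.println(table.dirtyCards() + " of " + table.cardCount() + " cards");  // [0, 1] of 8 cards
    }
}
```

Only 2 of the 8 cards need scanning, which is the saving the card table buys during root traversal.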

Young Collection + CM (concurrent marking)

  • When heap occupancy reaches the threshold (-XX:InitiatingHeapOccupancyPercent, default 45%), the old generation's concurrent marking cycle starts.
  • Initial mark: marks the objects directly reachable from GC Roots, that is, directly referenced objects. It is STW (the pause is very short because only directly reachable objects are marked) and is carried out together with a Young GC.
  • Root region scanning: G1 scans the old generation objects directly reachable from the Survivor area and marks them. This must finish before the next Young GC, because a Young GC moves objects in the Survivor area.
  • Concurrent marking: marks the whole heap concurrently with application threads; it may be interrupted by a Young GC. If all objects in a region turn out to be garbage, that region is reclaimed immediately. The phase also computes each region's liveness (the proportion of live objects in the region). Not every region takes part in a G1 collection; regions are prioritized by how much collecting them would gain.
  • Remark: because concurrent marking ran alongside application threads, a remark is needed to correct the previous result; think of it as an incremental catch-up marking. It is STW, with a short pause. G1 uses the snapshot-at-the-beginning (SATB) algorithm, which is faster than the approach used by CMS.
  • Cleanup (exclusive): computes each region's live objects and reclaimable proportion, sorts the regions by collection value, and identifies the regions eligible for mixed collection. It prepares for the next stage and is STW. Note that no garbage is actually reclaimed in this phase.
  • Concurrent cleanup: identifies and reclaims completely free regions.
    Insert image description here
Remark-SATB

SATB stands for Snapshot-At-The-Beginning: literally, a snapshot of the objects that were alive when the GC started. It is obtained through root tracing, and its purpose is to keep concurrent GC correct. How does it do that? In the three-color marking algorithm an object is in one of three states:

  • White: the object has not been marked. If it is still white when the marking phase finishes, it is collected as garbage.
  • Gray: the object has been marked, but its fields have not all been scanned yet.
  • Black: the object has been marked and all of its fields have been scanned.


SATB uses a write barrier to record (add to a queue) every old reference that is about to be deleted and marks the referenced object gray. At remark these old references are rescanned as roots, which avoids the missed-marking problem.
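The following sketch plays through the missed-marking scenario that the SATB barrier guards against. It is a simplified simulation, not G1's code: the names SatbSketch, Node, shade, step, deleteRef and remark are hypothetical, and marking is driven by hand so that the interleaving with the mutator is visible.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class SatbSketch {
    enum Color { WHITE, GRAY, BLACK }

    static final class Node {
        final String name;
        Color color = Color.WHITE;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    private final Deque<Node> grayStack = new ArrayDeque<>();
    private final Deque<Node> satbQueue = new ArrayDeque<>();

    /** WHITE to GRAY: the object is known to be live, its fields are not scanned yet. */
    void shade(Node n) {
        if (n.color == Color.WHITE) {
            n.color = Color.GRAY;
            grayStack.push(n);
        }
    }

    /** One concurrent-marking step: scan one gray object and turn it black. */
    boolean step() {
        Node n = grayStack.poll();
        if (n == null) {
            return false;
        }
        for (Node child : n.refs) {
            shade(child);
        }
        n.color = Color.BLACK;
        return true;
    }

    /** Mutator deletes the reference holder -> old; the barrier records the old referent. */
    void deleteRef(Node holder, Node old) {
        if (holder.refs.remove(old)) {
            satbQueue.add(old);
        }
    }

    /** Remark: treat every recorded old referent as a root and finish marking. */
    void remark() {
        Node n;
        while ((n = satbQueue.poll()) != null) {
            shade(n);
        }
        while (step()) {
            // drain the gray stack
        }
    }

    public static void main(String[] args) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        a.refs.add(b);
        b.refs.add(c);

        SatbSketch gc = new SatbSketch();
        gc.shade(a);        // initial mark: A is a root
        gc.step();          // A is BLACK, B is GRAY, C is still WHITE

        a.refs.add(c);      // mutator: black A now points to white C
        gc.deleteRef(b, c); // mutator: B -> C removed, barrier records C

        gc.step();          // B is BLACK, C was never reached through B
        System.out.println("before remark: " + c.color); // before remark: WHITE
        gc.remark();
        System.out.println("after remark:  " + c.color); // after remark:  BLACK
    }
}
```

Without the queue, C would stay white and be reclaimed although A still references it. With it, remark only has to process the recorded referents rather than the whole root set.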

The STW pause of G1's Remark phase therefore differs fundamentally from CMS's remark: G1 only needs to scan the objects recorded by the write barrier, treating them as roots, whereas CMS's remark must rescan the entire root set, which is why CMS remark can be very slow.

Mixed Collection

In mixed garbage collection, a collection may cover only young-generation regions (young collection), or it may collect some old-generation regions along with the young generation (mixed collection). Even with a large heap the scope of a collection can therefore be limited, which reduces pauses.

G1 has the parameter -XX:InitiatingHeapOccupancyPercent, with a default value of 45%. When the old generation occupies 45% of the heap, a mixed collection phase covering both the young and the old generation is triggered, involving Eden (E), Survivor (S), Old (O) and Humongous (H) regions.

Once this stage is triggered, the system enters STW and the final marking is performed:

  • Final marking phase: based on the object modifications recorded during concurrent marking, it is finally decided which objects are alive and which are garbage.

The old generation is then also collected with the mark-copy algorithm: the marked surviving objects are copied to new regions, which become old-generation regions:

  • Garbage collection begins as soon as marking completes. In a mixed collection, G1 moves live objects out of old-generation regions into free regions, which then become old-generation regions. As more and more objects are promoted to old-generation regions, a Mixed GC is triggered to keep the heap from being exhausted. This is neither an Old GC nor a Full GC: besides collecting all Young regions, it also collects some Old regions. Collecting only part of the regions is what lets G1 control the time spent on garbage collection.
  • After concurrent marking completes, old-generation regions confirmed to be entirely garbage are reclaimed, and the regions that are partly garbage are identified. By default these old-generation regions are collected over 8 rounds (configurable with -XX:G1MixedGCCountTarget).
  • The collection set of one mixed collection contains 1/8 of those old-generation regions plus the Eden and Survivor regions. The mixed collection algorithm is exactly the same as the young collection algorithm.
  • Mixed collection does not have to run all 8 times. The threshold -XX:G1HeapWastePercent, given here with a default of 10%, allows that share of the whole heap to be wasted: if the reclaimable garbage is less than 10% of the heap, no further mixed collection is performed, because the GC time spent is not worth the small amount reclaimed.
  • Because old-generation regions are collected over 8 rounds by default, G1 gives priority to the regions with the most garbage: the higher a region's proportion of garbage, the earlier it is collected. Another threshold decides whether a region is collected at all: -XX:G1MixedGCLiveThresholdPercent, given here with a default of 65% (the default differs between JDK versions). A region becomes a candidate only when its live objects occupy less than this percentage of the region. If too many objects survive, the copying algorithm spends too much time copying them.
  • When necessary (object allocation far outpaces collection), a Full GC is still triggered. Full GC is costly: it is single-threaded, performs poorly and has a long STW pause.
  • A heap that is too small, or allocation that is much faster than collection, can leave G1 with no free regions to copy surviving objects into, which eventually triggers a Full GC.
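How the three thresholds interact can be sketched as follows. This is a deliberately simplified model under stated assumptions: the class MixedGcPlanSketch, its Region type and all method names are hypothetical, regions are described only by a live percentage, and the real G1 policy also weighs pause-time predictions that are left out here.

```java
import java.util.ArrayList;
import java.util.List;

class MixedGcPlanSketch {
    static final class Region {
        final int id;
        final int livePercent;
        Region(int id, int livePercent) {
            this.id = id;
            this.livePercent = livePercent;
        }
        int garbagePercent() {
            return 100 - livePercent;
        }
    }

    /** Old regions whose live share is below the threshold, most garbage first. */
    static List<Region> candidates(List<Region> oldRegions, int liveThresholdPercent) {
        List<Region> result = new ArrayList<>();
        for (Region r : oldRegions) {
            if (r.livePercent < liveThresholdPercent) {
                result.add(r);
            }
        }
        result.sort((x, y) -> Integer.compare(y.garbagePercent(), x.garbagePercent()));
        return result;
    }

    /** Old regions per mixed GC when the candidates are spread over countTarget rounds. */
    static int regionsPerMixedGc(int candidateCount, int countTarget) {
        return (candidateCount + countTarget - 1) / countTarget; // ceiling division
    }

    /** Mixed GCs continue only while the reclaimable share reaches the waste percentage. */
    static boolean worthCollecting(long reclaimableBytes, long heapBytes, int heapWastePercent) {
        return reclaimableBytes * 100 >= heapBytes * heapWastePercent;
    }

    public static void main(String[] args) {
        List<Region> old = new ArrayList<>();
        old.add(new Region(0, 90));
        old.add(new Region(1, 20));
        old.add(new Region(2, 50));
        old.add(new Region(3, 70));

        for (Region r : candidates(old, 65)) {
            System.out.println("region " + r.id + " garbage=" + r.garbagePercent() + "%");
        }
        // region 1 garbage=80%
        // region 2 garbage=50%

        System.out.println(regionsPerMixedGc(16, 8));                   // 2
        System.out.println(worthCollecting(5L << 20, 100L << 20, 10));  // false
    }
}
```

Regions 0 and 3 are skipped because copying their many live objects would cost more than the space gained.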

Full GC

SerialGC
Garbage collection caused by insufficient memory in the young generation: minor GC
Garbage collection caused by insufficient memory in the old generation: full GC
ParallelGC
Garbage collection caused by insufficient memory in the young generation: minor GC
Garbage collection caused by insufficient memory in the old generation: full GC
CMS
Garbage collection caused by insufficient memory in the young generation: minor GC
Insufficient memory in the old generation: when collection is slower than the rate at which garbage is produced, concurrent collection fails and degrades to serial, single-threaded SerialGC execution, which is a full GC. Otherwise it is not a full GC.
G1
Garbage collection caused by insufficient memory in the young generation: minor GC
Insufficient memory in the old generation: once the threshold is exceeded, concurrent marking runs first, followed by mixed collection. As long as collection keeps up with the garbage produced by user threads, this is still concurrent garbage collection, not a full GC. When collection falls behind the rate at which new garbage is produced, it degrades to a full GC and response time gets longer.

G1 garbage collection optimization

JDK 8u20 string deduplication

-XX:+UseStringDeduplication turns on string deduplication. The feature is off by default and must be enabled explicitly.

All newly allocated strings are put into a queue. When the young generation is collected, G1 concurrently checks the queue for duplicate strings, and strings with the same value are made to reference the same char[]. s1 and s2 still refer to two different String objects on the heap, but both objects point to the same character array, so s1 != s2 remains true.
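A small check of the language-level behaviour described here: deduplication shares the backing array inside the JVM, but object identity is unaffected. The class name StringDedupDemo is only illustrative, and the sharing itself cannot be observed from plain Java code.

```java
class StringDedupDemo {
    public static void main(String[] args) {
        // Two distinct String objects with the same contents.
        String s1 = new String("hello");
        String s2 = new String("hello");

        System.out.println(s1 == s2);                   // false: different objects
        System.out.println(s1.equals(s2));              // true: same characters
        System.out.println(s1.intern() == s2.intern()); // true: one entry in the string table
    }
}
```

To let G1 deduplicate the backing arrays, the program would be started with something like java -XX:+UseG1GC -XX:+UseStringDeduplication StringDedupDemo. The three printed values are the same with or without the flag.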

Advantages: Saves a lot of memory.
Disadvantages: Slightly more CPU time is occupied, and the new generation recycling time is slightly increased.

Note: this is different from String.intern().
String.intern() deals with String objects, whereas string deduplication deals with the underlying char[]. Inside the JVM the two use different string tables.

JDK 8u40 class unloading with concurrent marking

In earlier JDK versions classes were generally not unloaded: once loaded, a class occupied memory permanently.
After concurrent marking of all objects completes, it is known which classes are no longer used. When none of the classes of a class loader is in use any more, all classes loaded by it are unloaded.

-XX:+ClassUnloadingWithConcurrentMark is enabled by default

Unloading conditions:
All instances of the class have been collected, and
all classes of the class loader that loaded the class are no longer used.

JDK 8u60 reclaiming humongous objects

When an object is larger than half a region, it is called a humongous object.
G1 does not copy humongous objects, and it gives them priority when reclaiming.
G1 tracks all incoming references from the old generation, so a humongous object whose incoming-reference count from the old generation is 0 can be disposed of during a young collection.
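The size rule can be written down directly. The sketch below assumes a hypothetical helper class HumongousSketch. The regionsNeeded method additionally assumes that a humongous object occupies whole regions, which goes beyond what the text above states.

```java
class HumongousSketch {
    /** An object counts as humongous when it is larger than half a region. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Assumption: a humongous object takes up whole regions, so round up. */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 1L << 20; // 1 MiB region size, chosen for the example

        System.out.println(isHumongous(600 * 1024, region));   // true
        System.out.println(isHumongous(512 * 1024, region));   // false: exactly half
        System.out.println(regionsNeeded(2621440L, region));   // 3 (2.5 MiB object)
    }
}
```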

JDK 9 Adjustment of concurrent marking start time

To reduce Full GCs, concurrent marking and mixed collection can be started earlier.
Before JDK 9 this required setting -XX:InitiatingHeapOccupancyPercent, the threshold for the share of the whole heap occupied by the old generation. When it is exceeded, concurrent marking starts. The default is 45%.

From JDK 9 the threshold is adjusted dynamically, and -XX:InitiatingHeapOccupancyPercent only sets the initial value. During garbage collection the JVM samples data and adjusts the threshold, always leaving a safety margin of free space, which lowers the probability of a Full GC.
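The static threshold check amounts to a single comparison. The sketch below shows only that fixed-percentage rule as described in the text. The class and method names are hypothetical, and the adaptive adjustment in JDK 9 is not modelled.

```java
class IhopSketch {
    /** True when old-generation occupancy exceeds ihopPercent of the whole heap. */
    static boolean shouldStartConcurrentMark(long oldGenUsedBytes, long heapBytes, int ihopPercent) {
        return oldGenUsedBytes * 100 > heapBytes * ihopPercent;
    }

    public static void main(String[] args) {
        long heap = 4L << 30; // 4 GiB heap, chosen for the example

        System.out.println(shouldStartConcurrentMark(1536L << 20, heap, 45)); // false (37.5%)
        System.out.println(shouldStartConcurrentMark(2L << 30, heap, 45));    // true  (50%)
    }
}
```

Lowering the percentage starts marking earlier, which trades more frequent marking cycles for a lower risk of running out of free regions.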

Garbage collection tuning

Reference: How to tune GC.
Check the virtual machine operating parameters: java -XX:+PrintFlagsFinal -version | findstr "GC" (check the parameters related to the local virtual machine and GC)

PS D:\java\idea\IdeaProject2\jvm\out> java -XX:+PrintFlagsFinal -version | findstr "GC"
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
    uintx AdaptiveSizeMajorGCDecayTimeScale         = 10                                  {product}
    uintx AutoGCSelectPauseMillis                   = 5000                                {product}
     bool BindGCTaskThreadsToCPUs                   = false                               {product}
    uintx CMSFullGCsBeforeCompaction                = 0                                   {product}
    uintx ConcGCThreads                             = 0                                   {product}   // number of concurrent CMS threads, default 0
     bool DisableExplicitGC                         = false                               {product}
     bool ExplicitGCInvokesConcurrent               = false                               {product}
     bool ExplicitGCInvokesConcurrentAndUnloadsClasses  = false                           {product}
    uintx G1MixedGCCountTarget                      = 8                                   {product}
    uintx GCDrainStackTargetSize                    = 64                                  {product}
    uintx GCHeapFreeLimit                           = 2                                   {product}
    uintx GCLockerEdenExpansionPercent              = 5                                   {product}
     bool GCLockerInvokesConcurrent                 = false                               {product}
    uintx GCLogFileSize                             = 8192                                {product}
    uintx GCPauseIntervalMillis                     = 0                                   {product}
    uintx GCTaskTimeStampEntries                    = 200                                 {product}
    uintx GCTimeLimit                               = 98                                  {product}
    uintx GCTimeRatio                               = 99                                  {product}   // GC time ratio
     bool HeapDumpAfterFullGC                       = false                               {manageable}
     bool HeapDumpBeforeFullGC                      = false                               {manageable}
    uintx HeapSizePerGCThread                       = 87241520                            {product}
    uintx MaxGCMinorPauseMillis                     = 4294967295                          {product}
    uintx MaxGCPauseMillis                          = 4294967295                          {product}   // maximum GC pause time target
    uintx NumberOfGCLogFiles                        = 0                                   {product}
     intx ParGCArrayScanChunk                       = 50                                  {product}
    uintx ParGCDesiredObjsFromOverflowList          = 20                                  {product}
     bool ParGCTrimOverflow                         = true                                {product}
     bool ParGCUseLocalOverflow                     = false                               {product}
    uintx ParallelGCBufferWastePct                  = 10                                  {product}
    uintx ParallelGCThreads                         = 13                                  {product}
     bool ParallelGCVerbose                         = false                               {product}
     bool PrintClassHistogramAfterFullGC            = false                               {manageable}
     bool PrintClassHistogramBeforeFullGC           = false                               {manageable}
     bool PrintGC                                   = false                               {manageable}
     bool PrintGCApplicationConcurrentTime          = false                               {product}
     bool PrintGCApplicationStoppedTime             = false                               {product}
     bool PrintGCCause                              = true                                {product}
     bool PrintGCDateStamps                         = false                               {manageable}
     bool PrintGCDetails                            = false                               {manageable}
     bool PrintGCID                                 = false                               {manageable}
     bool PrintGCTaskTimeStamps                     = false                               {product}
     bool PrintGCTimeStamps                         = false                               {manageable}
     bool PrintHeapAtGC                             = false                               {product rw}
     bool PrintHeapAtGCExtended                     = false                               {product rw}
     bool PrintJNIGCStalls                          = false                               {product}
     bool PrintParallelOldGCPhaseTimes              = false                               {product}
     bool UseGCOverheadLimit                        = true                                {product}
     bool UseGCTaskAffinity                         = false                               {product}
     bool UseMaximumCompactionOnSystemGC            = true                                {product}
     bool UseParNewGC                               = false                               {product}
     bool UseParallelGC                            := true                                {product}
     bool UseParallelOldGC                          = true                                {product}
     bool UseSerialGC                               = false                               {product}

Get familiar with the related tools jmap, jconsole and jstat, which show GC-related status.
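Besides the command-line tools, the same counters can be read from inside a running program through the standard java.lang.management API. The sketch below is one possible way to do that. The class name GcStatsSketch and the output format are made up for the example, and the collector names printed depend on which GC the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcStatsSketch {
    /** One line per collector: its name, how often it ran, and the accumulated time. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

Calling snapshot() before and after a load test gives a rough picture of how many collections the test caused and how long they took in total.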

Tuning should not focus on memory and GC alone. Thread lock contention, CPU usage, I/O calls, network latency, and both software and hardware factors also need to be considered.

Confirm target

For GC tuning, first determine the goal, which depends on what the application does: scientific computing calls for high throughput, whereas an Internet-facing project should pursue low latency to improve the user experience.

  • High throughput: ParallelGC
  • Low latency: CMS (not recommended), G1, ZGC (introduced as an experimental collector in JDK 11)

From a performance perspective, GC tuning usually focuses on three aspects: memory footprint, latency and throughput. In most cases tuning concentrates on one or two of these goals, and it is rare that all three can be satisfied at once. Other GC-related scenarios may also need attention: for example, an OOM can be related to unreasonable GC parameters, and GC is also a factor when application startup speed matters.

The fastest GC is no GC happening

Check the memory usage before and after FullGC and consider the following questions:

  • Is there too much data?
    • "select * from large table" will read all the data from mysql into java memory
    • "select * from large table limit n" limit quantity
  • Is the data representation too bloated?
    • Object graph: loading an object also pulls in associated data that is never used
    • Object size versus memory footprint: the smallest Object in Java occupies 16 bytes (an Integer takes 16 bytes, an int takes 4)
  • Is there a memory leak?
    • static Map map = ... Continuously adding objects to a static map eventually causes an out-of-memory error.
    • For long-lived objects, soft or weak references are recommended
    • Implementing caches inside the Java heap is not recommended. Use a third-party cache such as redis/memcached instead.
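For the soft-reference suggestion, a minimal in-heap cache might look like the sketch below. It is an illustration under assumptions rather than a recommended production design: the class SoftCacheSketch is hypothetical, and when the JVM clears soft references depends on memory pressure and cannot be controlled from code.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Cache whose values may be reclaimed by the GC when memory runs low. */
class SoftCacheSketch<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    /** Returns the cached value, or null if it is absent or was already collected. */
    V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            map.remove(key, ref); // referent was collected, drop the stale entry
        }
        return value;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        SoftCacheSketch<String, byte[]> cache = new SoftCacheSketch<>();
        byte[] payload = new byte[1024];
        cache.put("report", payload);

        System.out.println(cache.get("report") == payload); // true while payload is strongly held
        System.out.println(cache.get("missing"));           // null
    }
}
```

Unlike a plain static Map, the values here do not pin memory forever. The map entries themselves still need the cleanup shown in get, otherwise cleared references accumulate.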

Memory tuning

New Generation Tuning

Characteristics of the new generation:

  • The new object is first allocated in Eden, and the allocation speed is particularly fast.
    • Each thread is given a private area in Eden, the TLAB (thread-local allocation buffer). When a new object is created, the TLAB is checked first. If it has room, the object is allocated there, which avoids contention when multiple threads create objects at the same time.
  • The recycling cost of dead objects is zero
    • When the young generation is collected, the copying algorithm is used (Eden + survivor from -> survivor to). After copying, the memory in Eden and in survivor from is released.
  • Most objects die immediately after use
  • The time of Minor GC is much lower than that of Full GC

Is a larger young generation always better?
-Xmn
Sets the initial and maximum size (in bytes) of the heap for the young generation (nursery). GC is
performed in this region more often than in other regions. If the size for the young generation is
too small, then a lot of minor garbage collections are performed. If the size is too large, then only
full garbage collections are performed, which can take a long time to complete. Oracle
recommends that you keep the size for the young generation greater than 25% and less than
50% of the overall heap size.
[That is: -Xmn sets the initial and maximum size of the young generation in bytes. GC runs more often in this area than in others. If the young generation is too small, minor GCs are triggered frequently. If it is too large, only full GCs are performed, which can take a long time. The recommended share of the heap is 25% to 50%.]

Suggestions for sizing the young generation:

  • The young generation should be able to hold all the data of [concurrency * (request + response)]
  • The survivor space should be large enough to hold [currently active objects + objects waiting to be promoted]
    • Configure the promotion threshold properly so that long-lived objects are promoted as early as possible (otherwise they keep occupying survivor space and are copied again and again)
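The first rule of thumb is simple arithmetic, and the result can be checked against the 25% to 50% guidance quoted for -Xmn. In the sketch below the workload numbers (1000 concurrent requests, 256 KiB per request and per response, a 1.5 GiB heap) are invented purely for illustration, as are the class and method names.

```java
class YoungGenSizingSketch {
    /** concurrency * (request + response), in bytes. */
    static long estimateYoungBytes(int concurrentRequests, long requestBytes, long responseBytes) {
        return (long) concurrentRequests * (requestBytes + responseBytes);
    }

    /** True when the young generation is more than 25% and less than 50% of the heap. */
    static boolean withinRecommendedShare(long youngBytes, long heapBytes) {
        return youngBytes * 4 > heapBytes && youngBytes * 2 < heapBytes;
    }

    public static void main(String[] args) {
        long young = estimateYoungBytes(1000, 256 * 1024, 256 * 1024);
        long heap = 1536L * 1024 * 1024; // 1.5 GiB

        System.out.println(young / (1024 * 1024) + " MiB");     // 500 MiB
        System.out.println(withinRecommendedShare(young, heap)); // true
    }
}
```

An estimate like this would then be passed to the JVM with -Xmn. Real request and response sizes have to come from measurement.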

-XX:MaxTenuringThreshold=threshold adjusts the maximum promotion threshold
-XX:+PrintTenuringDistribution prints the age distribution of the surviving objects in the survivor space

Desired survivor size 48286924 bytes, new threshold 10 (max 10)
- age 1: 28992024 bytes, 28992024 total
- age 2: 1366864 bytes, 30358888 total
- age 3: 1425912 bytes, 31784800 total
...
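One way to read such a log: the threshold is the first age at which the running total exceeds the desired survivor size, capped at the maximum. The sketch below is a simplified reconstruction of that rule and rests on an assumption about how HotSpot computes it. The class and method names are hypothetical, and only the three ages printed above are used, since the remaining rows are elided in the log.

```java
class TenuringThresholdSketch {
    /**
     * bytesByAge[0] holds the bytes of age 1, bytesByAge[1] those of age 2, and so on.
     * Returns the first age whose running total exceeds the desired survivor size,
     * or maxThreshold if the total never exceeds it.
     */
    static int computeThreshold(long[] bytesByAge, long desiredSurvivorBytes, int maxThreshold) {
        long total = 0;
        for (int i = 0; i < bytesByAge.length; i++) {
            total += bytesByAge[i];
            if (total > desiredSurvivorBytes) {
                return Math.min(i + 1, maxThreshold);
            }
        }
        return maxThreshold;
    }

    public static void main(String[] args) {
        long[] ages = {28992024L, 1366864L, 1425912L};

        // The total stays below the desired size, so the threshold stays at the maximum.
        System.out.println(computeThreshold(ages, 48286924L, 10)); // 10

        // With a smaller survivor target the total is exceeded at age 2.
        System.out.println(computeThreshold(ages, 30000000L, 10)); // 2
    }
}
```

The first result matches the "new threshold 10 (max 10)" line of the log. A lowered threshold in such output is a sign that the survivor space is too small for the objects that are still alive.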

Old generation tuning

Take CMS as an example

  • For CMS, the larger the old generation, the better.
  • Do not start by tuning the old generation. If no Full GC occurs, garbage collection is not being caused by a shortage of old-generation memory. Even if Full GC does occur, try tuning the young generation first.
  • Observe how much of the old generation is in use when a Full GC occurs, and increase the old generation by 1/4 to 1/3 of that amount.

-XX:CMSInitiatingOccupancyFraction=percent

GC tuning case

Case 1: Frequent Full GC and Minor GC
When a business peak arrives, a large number of objects are created and fill up the young generation. The promotion threshold of the survivor space is lowered, so many short-lived objects are promoted to the old generation, which in turn triggers Full GC in the old generation.
First try increasing the size of the young generation: with enough memory, garbage collection becomes less frequent. Also enlarge the survivor space and raise the promotion threshold so that short-lived objects stay in the young generation as long as possible, which further reduces GCs in the old generation.

Case 2: Full GC occurs during peak request periods, and a single pause is particularly long (CMS).
Check the GC log to determine which CMS phase takes the most time (usually it is remark).

The fix is to collect the young generation before remark (set with -XX:+CMSScavengeBeforeRemark). Remark has to scan young-generation objects too, in order to find references from the young generation into the old generation. After a young collection (done with -XX:+UseParNewGC) far fewer young-generation objects remain, so the load on remark is reduced.
Case 3: Full GC occurs although the old generation has plenty of space (CMS, JDK 1.7)
JDK 1.8 implements the method area with metaspace, whereas JDK 1.7 implements it with the permanent generation. In 1.7, running out of permanent-generation space also triggers a Full GC.

Origin blog.csdn.net/weixin_43994244/article/details/129261112