[In-depth understanding of JVM] 6. GC algorithms: how to find garbage + how to clear garbage + JVM generational memory model + common garbage collectors [interview essential]

 Previous post: [In-depth understanding of JVM] 5. Run-time data areas + common instructions [required for interviews]

1. Basic knowledge of GC

1. The difference between Java and C++ garbage handling

1. Java

  • The GC handles garbage automatically

  • High development efficiency, lower execution efficiency

2. C++

  • Garbage is handled manually

  • Forgetting to free memory

    • Memory leak

  • Freeing the same memory more than once

    • Illegal access

  • Low development efficiency, high execution efficiency

2. How to locate garbage

1. RC: Reference Counting

  • Algorithm principle: attach a reference counter to each object. Whenever a new reference points to the object, the counter is increased by 1; whenever a reference goes away, the counter is decreased by 1. When the counter reaches 0, the object is no longer in use and can be reclaimed.
  • Application: Microsoft COM / ActionScript 3 / Python
  • Advantages: simple to implement and cheap to check; a good algorithm in most cases.
  • Disadvantage: circular references are hard to handle. The objects in an unreachable cycle are garbage, but their reference counts never drop to 0.
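
To make the circular-reference problem concrete, here is a minimal sketch of reference counting in Java. The Node class, its refCount field and the setField helper are hypothetical names invented for this illustration; the JVM itself does not count references this way.

```java
// Toy model of reference counting; all names here are made up for illustration.
class RefCountDemo {
    static class Node {
        final String name;
        int refCount = 0; // number of references currently pointing at this node
        Node field;       // the single outgoing reference of this node

        Node(String name) {
            this.name = name;
        }
    }

    // Point holder.field at target and keep both counters up to date.
    static void setField(Node holder, Node target) {
        if (target != null) {
            target.refCount++;
        }
        if (holder.field != null) {
            holder.field.refCount--;
        }
        holder.field = target;
    }

    // Builds a <-> b, drops the external references, and returns both counters.
    static int[] cycleCounts() {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refCount++;   // reference held by a local variable
        b.refCount++;   // reference held by a local variable
        setField(a, b); // a -> b
        setField(b, a); // b -> a
        a.refCount--;   // the local variable goes away
        b.refCount--;   // the local variable goes away
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        // Nothing outside the cycle refers to a or b, yet neither counter is 0.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]); // a=1, b=1
    }
}
```

Both counters stay at 1 because the two objects keep each other alive, so a collector that relies only on counting would never reclaim them.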

2. RS: Root Searching (root reachability algorithm)

  • Algorithm principle: start from a set of objects called "GC Roots" and search downwards. The path followed by the search is called a reference chain. When no reference chain connects an object to any GC Root, the object is no longer in use and can be reclaimed.
  • Application: Java/C#/Lisp
  • Advantages: solves the problem of circular references
  • Disadvantage: the algorithm is slightly more complicated
  • Root objects: GC roots (important, remember these)
    • Thread stack variables
    • Static variables
    • Constant pool
    • JNI pointers: objects referenced from native C/C++ code
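
The search from the roots can be sketched as a plain graph traversal. This is only an illustration under simplifying assumptions: objects are represented by strings, the reference graph is a map, and the names ReachabilityDemo, reachable and sampleGraph are invented here.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    // Returns every object reachable from the roots by following references.
    static Set<String> reachable(List<String> roots, Map<String, List<String>> refs) {
        Set<String> seen = new HashSet<>(roots);
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String obj = work.poll();
            for (String next : refs.getOrDefault(obj, Collections.<String>emptyList())) {
                if (seen.add(next)) { // add() returns false if already visited
                    work.add(next);
                }
            }
        }
        return seen;
    }

    // stackVar -> A -> B is reachable; C <-> D is an isolated cycle.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("stackVar", Arrays.asList("A"));
        refs.put("A", Arrays.asList("B"));
        refs.put("C", Arrays.asList("D"));
        refs.put("D", Arrays.asList("C"));
        return refs;
    }

    public static void main(String[] args) {
        Set<String> live = reachable(Arrays.asList("stackVar"), sampleGraph());
        System.out.println("B reachable: " + live.contains("B")); // true
        System.out.println("C reachable: " + live.contains("C")); // false
    }
}
```

C and D reference each other, but no path leads to them from a root, so they count as garbage. This is exactly the case that reference counting cannot handle.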

3. GC algorithms for clearing garbage

  • Mark-Sweep

  • Copying

  • Mark-Compact

1. Mark-Sweep (mark clearing)

 

  • First find the live objects and mark them, then find the unmarked (useless) objects and clear them

  • Features:

    • 1. The algorithm is relatively simple, and it is more efficient when many objects survive.

    • 2. Two scans are required (the first pass finds and marks the live objects, the second pass finds and clears the useless ones), so efficiency is relatively low, and fragmentation occurs easily (the cleared space is not compacted, so many holes are left behind)
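
The fragmentation effect can be seen in a small simulation. This sketch assumes a heap of equally sized slots and assumes the mark bits have already been computed by a reachability pass; the class and method names are invented for the example.

```java
import java.util.Arrays;

class MarkSweepDemo {
    // Sweep phase: a slot stays occupied only if it holds a marked (live) object.
    static boolean[] sweep(boolean[] occupied, boolean[] marked) {
        boolean[] after = new boolean[occupied.length];
        for (int i = 0; i < occupied.length; i++) {
            after[i] = occupied[i] && marked[i]; // unmarked objects are freed in place
        }
        return after;
    }

    static int freeSlots(boolean[] occupied) {
        int free = 0;
        for (boolean slot : occupied) {
            if (!slot) {
                free++;
            }
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] occupied) {
        int best = 0;
        int run = 0;
        for (boolean slot : occupied) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] occupied = new boolean[8];
        Arrays.fill(occupied, true); // the heap is completely full
        boolean[] marked = { true, false, true, false, true, false, true, false };
        boolean[] after = sweep(occupied, marked);
        // Prints: free=4, largest contiguous=1
        System.out.println("free=" + freeSlots(after)
                + ", largest contiguous=" + largestFreeRun(after));
    }
}
```

Four slots are free after the sweep, but no two of them are adjacent, so an object that needs two slots still cannot be allocated.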

2. Copying

 

 

  • Divide the memory into two halves and use only one of them at a time. At collection time, copy the live objects into the other half; once all copies are done, clear the first half in one go.

  • In other words: find the useful objects, copy them somewhere else, and then delete everything that was left behind.

  • Features:

    • 1. Suitable when few objects survive; only one scan is needed, which improves efficiency, and there is no fragmentation

    • 2. Space is wasted (only half of the memory is usable at a time), and moving objects means every reference to them has to be updated.
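
A rough sketch of the copying idea, in the style of Cheney's algorithm, is shown here. It assumes objects are identified by their index in the from-space and that refs[i] lists the indices object i points to; the forwarding array stands in for the reference adjustment mentioned above. All names are hypothetical.

```java
import java.util.Arrays;

class CopyingDemo {
    // Returns forwarding[i] = new to-space index of object i, or -1 if it was not copied.
    static int[] collect(int[][] refs, int[] roots) {
        int[] forwarding = new int[refs.length];
        Arrays.fill(forwarding, -1);
        int[] toSpace = new int[refs.length]; // toSpace[k] = from-space index of the k-th copy
        int free = 0; // next free slot in to-space
        int scan = 0; // next copied object whose references still need to be followed

        for (int root : roots) {
            if (forwarding[root] == -1) {
                forwarding[root] = free;
                toSpace[free++] = root;
            }
        }
        while (scan < free) {
            int obj = toSpace[scan++];
            for (int child : refs[obj]) {
                if (forwarding[child] == -1) { // copy each live object exactly once
                    forwarding[child] = free;
                    toSpace[free++] = child;
                }
            }
        }
        return forwarding;
    }

    public static void main(String[] args) {
        // 0 -> 2 is live; 1 is unreferenced; 3 <-> 4 is an unreachable cycle.
        int[][] refs = { { 2 }, {}, {}, { 4 }, { 3 } };
        int[] forwarding = collect(refs, new int[] { 0 });
        System.out.println(Arrays.toString(forwarding)); // [0, -1, 1, -1, -1]
    }
}
```

Only live objects are ever touched, which is why the approach pays off when few objects survive, and the copies end up packed together at the start of the to-space.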

3. Mark-Compact (mark compression)

 

 

  • First find the live objects, then move them toward the front so that they all sit together at one end; everything after them is garbage and is cleared in one go

  • Features:

    • 1. Two scans are required and objects have to be moved, so efficiency is low.

    • 2. No fragmentation, easy object allocation, and no halving of memory.
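
The compaction step can be sketched as sliding live objects toward index 0. This example assumes a slot-based heap whose mark bits are already known; reference updating is left out to keep it short, and the names are invented for the illustration.

```java
import java.util.Arrays;

class MarkCompactDemo {
    // Slides every marked object toward the front; returns the number of live objects.
    static int compact(String[] heap, boolean[] marked) {
        int free = 0; // next slot to fill, always <= i
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[free++] = heap[i];
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = null; // everything behind the live objects is cleared
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E" };
        boolean[] marked = { true, false, true, false, true };
        int live = compact(heap, marked);
        // Prints: 3 [A, C, E, null, null]
        System.out.println(live + " " + Arrays.toString(heap));
    }
}
```

After compaction the free space is one contiguous block, so a new object can be allocated simply by advancing a pointer.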

4. JVM generational memory model (used by generational garbage collection algorithms)

 

A new object is first allocated on the stack if possible; if it cannot go there, it is placed in the Eden area. Objects that survive a garbage collection move into the survivor1 area (also called the "from" area); those that survive the next collection move into the survivor2 area. After surviving multiple collections, an object enters the Old area.

Notes on GC:

  • MinorGC/YGC: triggered when the young generation runs out of space
  • MajorGC/FullGC: triggered when the old generation cannot allocate any more space; the young generation and the old generation are collected at the same time
  • -Xmn: -X marks a non-standard option, m = memory, n = new (size of the young generation)
  • -Xms: s = minimum (initial) heap size
  • -Xmx: x = maximum heap size
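
To see how these options shape a running JVM, the standard java.lang.management API can list the memory pools. This is a small sketch; the class name HeapLayoutDemo is made up, and the pool names printed depend on the JDK version and on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class HeapLayoutDemo {
    // Collects "type: name" for every memory pool the running JVM exposes.
    static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(pool.getType() + ": " + pool.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // maxMemory() reflects -Xmx (or the default chosen by the JVM).
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: " + maxMb + " MB");
        for (String name : poolNames()) {
            System.out.println(name);
        }
    }
}
```

Running it with different settings, for example java -Xms64m -Xmx64m -Xmn16m HeapLayoutDemo, is a quick way to check that the flags were picked up.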

  1. Generation model used by the garbage collectors

    All collectors except Epsilon, ZGC and Shenandoah use the logical generational model.

    G1 is generational logically, but not physically.

    The remaining collectors are generational both logically and physically.

  2. Heap space (young generation + old generation) + permanent generation (1.7, Perm Generation) / metaspace (1.8, Metaspace)

    1. The permanent generation / metaspace stores class metadata: Class information, compiled method and code information, bytecode, etc.
    2. The permanent generation has a size limit that is fixed at startup and cannot be changed afterwards (so it overflows easily). Setting a size for the metaspace is optional; without one there is no upper limit (it is bounded only by physical memory)
    3. String constants: stored in the permanent generation up to 1.6; moved to the heap from 1.7 onward
    4. Method Area: a logical concept. Its concrete implementation is the Perm Generation in 1.7 and the Metaspace in 1.8
  3. Young generation = Eden + 2 survivor areas

    1. After a YGC most objects are reclaimed; the survivors enter s0
    2. On the next YGC, the live objects in eden + s0 -> s1
    3. On the next YGC, eden + s1 -> s0
    4. Old enough -> old generation (threshold 15; 6 for CMS)
    5. Does not fit in the survivor area -> old generation
  4. Old generation

    1. Stubborn, long-lived objects
    2. When the old generation is full, a Full GC (FGC) is triggered
  5. GC Tuning (Generation)

    1. Minimize FGC
    2. MinorGC = YGC
    3. MajorGC = FGC
  6. Dynamic age: (not important)  https://www.jianshu.com/p/989d3b06a49d

  7. Allocation guarantee: (not important) During a YGC (young GC), if the survivor area does not have enough space, the objects go directly to the old generation under the space guarantee. Reference: https://cloud.tencent.com/developer/article/1082730

  8. Object allocation process and garbage collection execution sequence (TLAB: Thread Local Allocation Buffer)

[Figure: object allocation process diagram]

 

A general understanding is enough here.

Verifying the effect on performance with code:

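/**
 * A minus sign disables the option
 * -XX:-DoEscapeAnalysis disables escape analysis
 * -XX:-EliminateAllocations disables scalar replacement
 * -XX:-UseTLAB disables thread-local allocation (TLAB)
 * -Xlog:c5_gc*
 * Escape analysis, scalar replacement, thread-local object allocation
 */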
public class TestTLAB {
    //User u;
    class User {
        int id;
        String name;

        public User(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    void alloc(int i) {
        // The object does not escape this method
        new User(i, "name " + i);
    }

    public static void main(String[] args) {
        TestTLAB t = new TestTLAB();
        long start = System.currentTimeMillis();
        for (int i = 0; i < 1000_0000; i++) t.alloc(i);
        long end = System.currentTimeMillis();
        System.out.println(end - start);

        //for(;;);
    }
}

Run with default settings: 400 (ms)

Run with the three options disabled: 800 (ms)

Notes:

  • Eden + S1 --> S2: if the copied objects take up more than 50% of S2, the oldest of them are promoted to the Old area.

Supporting evidence (for reference if you want to dig deeper: https://www.jianshu.com/p/989d3b06a49d )

uint ageTable::compute_tenuring_threshold(size_t survivor_capacity) {
  // survivor_capacity is the size of the survivor space
  size_t desired_survivor_size = (size_t)((((double) survivor_capacity)*TargetSurvivorRatio)/100);
  size_t total = 0;
  uint age = 1;
  while (age < table_size) {
    total += sizes[age]; // sizes[] holds the total object size for each age
    if (total > desired_survivor_size) break;
    age++;
  }
  uint result = age < MaxTenuringThreshold ? age : MaxTenuringThreshold;
    ...
}

The snippet above is the HotSpot code that calculates the promotion age. Let's walk through the dynamic age calculation. The code uses a value called TargetSurvivorRatio.

-XX:TargetSurvivorRatio
The target usage of the survivor space after a young GC; the default is 50 (percent)

  1. Use this ratio to calculate an expected value, desired_survivor_size.
  2. Use a counter, total, to accumulate the total object size of each age group.
  3. Stop as soon as total is greater than desired_survivor_size.
  4. Take the smaller of the current age and MaxTenuringThreshold as the result.

Overall: object sizes are accumulated by age from youngest to oldest. As soon as adding one age group pushes the running total above survivor capacity × TargetSurvivorRatio, objects of that age and older are promoted.
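
To try the calculation with concrete numbers, here is a Java port of the loop quoted above. It is a sketch for experimenting, not HotSpot code: TargetSurvivorRatio and MaxTenuringThreshold are passed in as plain parameters, and the class name is invented.

```java
class TenuringThresholdDemo {
    // sizes[age] = total bytes of survivor objects with that age (index 0 is unused).
    static int computeTenuringThreshold(long[] sizes, long survivorCapacity,
                                        int targetSurvivorRatio, int maxTenuringThreshold) {
        long desiredSurvivorSize = (long) ((double) survivorCapacity * targetSurvivorRatio / 100);
        long total = 0;
        int age = 1;
        while (age < sizes.length) {
            total += sizes[age];
            if (total > desiredSurvivorSize) {
                break; // this age group pushed the running total over the target
            }
            age++;
        }
        return Math.min(age, maxTenuringThreshold);
    }

    public static void main(String[] args) {
        long[] sizes = new long[16];
        sizes[1] = 20;
        sizes[2] = 20;
        sizes[3] = 20;
        // Survivor capacity 100, ratio 50 -> desired size 50.
        // Running total: 20, 40, 60; 60 > 50 at age 3.
        System.out.println(computeTenuringThreshold(sizes, 100, 50, 15)); // 3
    }
}
```

With these numbers the threshold drops from the maximum of 15 to 3, so objects of age 3 and above are promoted at the next young GC even though they are far from the maximum age.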

5. Common garbage collectors

Notes:

  • 1-4: all of them are logically generational.
  • 5-6: not generational.
  • 10: used for debugging.
  • Serial refers to the single-threaded collectors.
  • Parallel refers to the multi-threaded collectors.
  • Common combinations: Serial + Serial Old; Parallel Scavenge + Parallel Old (the default on many production systems that have not been tuned); ParNew + CMS, which is the pairing drawn with a dotted line in the collector diagram.
  • CMS

     

  1. Serial arrived together with the first JDK. PS was created to improve efficiency, and PN was created to work together with CMS. CMS was introduced in late 1.4 versions; it is a milestone GC that started the era of concurrent collection. However, CMS has many problems, so no JDK version uses it by default. Concurrent garbage collection exists because long STW pauses could not be tolerated.
  2. Serial: serial collection of the young generation. It is the most efficient choice on a single CPU and is the default collector of the virtual machine in Client mode. Application threads are stopped (stop-the-world, STW) at a safe point while it runs. Rarely used now.
    Serial

     

  3. PS (Parallel Scavenge): parallel collection of the young generation. It cannot be combined with CMS; the variant built for that purpose is ParNew.
    Parallel Scavenge

     

  4. ParNew: parallel collection of the young generation, designed to work with CMS. PN prioritizes response time and is paired with CMS; PS prioritizes throughput.
    Parallel New

     

  5. SerialOld 
    Serial Old
  6. ParallelOld
  7. ConcurrentMarkSweep (CMS): old generation, concurrent. Garbage collection runs at the same time as the application, which reduces STW time (200ms). CMS has many problems, so no version uses it by default; it can only be enabled manually. Because CMS is a Mark-Sweep collector, it inevitably fragments memory. When fragmentation reaches the point where the old generation cannot satisfy an allocation, Serial Old is used to collect the old generation. Imagine the progression: PS + PO -> add memory and switch collectors -> PN + CMS + SerialOld (STW pauses of hours or even days, because tens of GB of memory are collected by a single thread) -> G1 + FGC for tens of GB -> ZGC for servers with terabytes of memory. CMS algorithm: three-color marking + Incremental Update
  8. G1 (10ms). Algorithm: three-color marking + SATB
  9. ZGC (1ms), comparable to C++. Algorithm: ColoredPointers + LoadBarrier
  10. Shenandoah. Algorithm: ColoredPointers + WriteBarrier
  11. Epsilon
  12. Extended reading on the difference between PS and PN: ▪ https://docs.oracle.com/en/java/javase/13/gctuning/ergonomics.html#GUID-3D0BB91E-9BFF-4EBB-B523-14493A860E73
  13. The relationship between garbage collector and memory size
    1. Serial: tens of MB
    2. PS: hundreds of MB to a few GB
    3. CMS: up to about 20 GB
    4. G1: hundreds of GB
    5. ZGC: 4 TB to 16 TB (JDK13)
  14. CMS
    1. Disadvantages:

Default garbage collectors in 1.8: PS + ParallelOld

Common garbage collector combination parameter settings: (1.8)

  • -XX:+UseSerialGC = Serial New (DefNew) + Serial Old

    • For small programs. This option is not selected by default; HotSpot chooses the collector automatically based on the machine, the configuration and the JDK version
  • -XX:+UseParNewGC = ParNew + SerialOld

  • -XX:+UseConc(urrent)MarkSweepGC = ParNew + CMS + Serial Old

  • -XX:+UseParallelGC = Parallel Scavenge + Parallel Old (default in 1.8) [PS + SerialOld]

  • -XX:+UseParallelOldGC = Parallel Scavenge + Parallel Old

  • -XX:+UseG1GC = G1

  • How to check the default GC: on Linux no collector flag shows up in the output, but on Windows UseParallelGC is printed

    • java -XX:+PrintCommandLineFlags -version
    • Or tell the collectors apart by the GC log
  • What exactly is the default garbage collector of version 1.8 under Linux?

    • 1.8.0_181: default (not visible): Copy + MarkCompact
    • 1.8.0_222: default PS + PO
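
Another way to check which collectors are active, independent of the platform, is to ask the running JVM through the standard java.lang.management API. This is a small sketch; the class name CollectorNamesDemo is made up, and the names mentioned in the comment are what HotSpot typically reports, so treat them as an expectation rather than a guarantee.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorNamesDemo {
    // Returns the names of the collectors registered in the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Typical HotSpot names: "Copy" / "MarkSweepCompact" for the serial collectors,
        // "PS Scavenge" / "PS MarkSweep" for Parallel Scavenge + Parallel Old.
        for (String name : collectorNames()) {
            System.out.println(name);
        }
    }
}
```

Running it once without flags and once with, for example, -XX:+UseSerialGC shows how each option changes the young and old generation collectors.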

Next: [In-depth understanding of JVM] 8. JVM actual combat tuning + arthas actual combat use + jvisualvm actual combat use + GC algorithm + JVM tuning how to locate problems + online troubleshooting + fixed bugs for small bugs without stopping the service + CMS+G1+ common essential parameters [interview]

Origin blog.csdn.net/zw764987243/article/details/109533242