JVM garbage collector

Between Java and C++ there is a high wall built from dynamic memory allocation and garbage collection technology. People outside the wall want to get in, and people inside the wall want to get out. (From "Understanding the Java Virtual Machine")

Garbage collection has to answer three questions:

  • Which memory needs to be reclaimed?
  • When should it be reclaimed?
  • How should it be reclaimed?

    Among the Java runtime memory areas, the program counter, the virtual machine stack, and the native method stack are created with a thread and destroyed with it; stack frames are pushed and popped in an orderly way as methods are entered and exited. Memory allocation and reclamation in these areas is therefore deterministic and needs little attention: when a method or a thread ends, the memory is reclaimed along with it.

    The Java heap and the method area are different. Only while the program is running is it known which objects will be created, so the allocation and reclamation of this memory are dynamic, and this is the memory the garbage collector focuses on.

    Garbage collection therefore mainly concerns the heap and the method area.

Is the object dead?

Reference counting algorithm (Reference Counting):

Many textbooks decide whether an object is alive like this: add a reference counter to the object; whenever a reference to it is created, the counter is incremented by 1; when a reference expires, the counter is decremented by 1; an object whose counter is 0 can no longer be used.
The Java language does not use reference counting to manage memory, mainly because it cannot easily handle objects that refer to each other in a cycle.

objA.instance = objB;
objB.instance = objA;
objA = null;
objB = null;

The two objects keep referring to each other, so even when nothing else in the program can reach them, their counts never drop to 0 and a reference-counting collector cannot reclaim them.
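
The cycle problem can be reproduced with a toy counter. This is only an illustrative sketch: the JVM does not keep such counters, and the names RefCountDemo, Node, refCount, and setInstance are made up for the example.

```java
final class RefCountDemo {
    /** Toy object with a hand-maintained reference counter (hypothetical, not how the JVM works). */
    static final class Node {
        int refCount = 0;   // number of references currently pointing at this object
        Node instance;      // a field that may point at another Node

        /** Re-points this.instance and adjusts both counters, as a counting collector would. */
        void setInstance(Node target) {
            if (target != null) target.refCount++;
            if (instance != null) instance.refCount--;
            instance = target;
        }
    }

    /** Builds a two-object cycle, drops the outside references, and returns the remaining counts. */
    static int[] countsAfterDroppingLocals() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++;        // the local variable objA refers to a
        b.refCount++;        // the local variable objB refers to b
        a.setInstance(b);    // objA.instance = objB
        b.setInstance(a);    // objB.instance = objA
        a.refCount--;        // objA = null
        b.refCount--;        // objB = null
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingLocals();
        // prints "objA count = 1, objB count = 1": neither counter ever reaches 0
        System.out.println("objA count = " + counts[0] + ", objB count = " + counts[1]);
    }
}
```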

Root search algorithm (GC Roots Tracing):

Java uses the root search algorithm (GC Roots Tracing) to decide whether an object is alive.
A set of objects called "GC Roots" serves as the starting point. The search proceeds from these nodes, and the path it travels is called a reference chain. When no reference chain connects an object to the GC Roots (in graph-theory terms, when the object is unreachable from the GC Roots), the object is considered unusable.
In Java, the following can serve as GC Roots:

  • Objects referenced from the virtual machine stack (the local variables in stack frames).
  • Objects referenced by static fields of classes in the method area.
  • Objects referenced by constants in the method area.
  • Objects referenced by JNI (native methods) in the native method stack.
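
The reachability test amounts to a graph traversal that starts at the roots. The sketch below assumes a hypothetical object graph stored as a map of names; a real collector walks actual object references, so treat this only as an illustration of the idea.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReachabilityDemo {
    /** Returns every node reachable from the given roots by following reference edges. */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> seen = new HashSet<>(roots);
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String current = work.poll();
            for (String next : refs.getOrDefault(current, List.of())) {
                if (seen.add(next)) {   // add() is false when the node was already visited
                    work.add(next);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical graph: objA and objB reference each other, but no root leads to them.
        Map<String, List<String>> refs = Map.of(
            "stackLocal", List.of("order"),
            "staticField", List.of("cache"),
            "order", List.of("customer"),
            "objA", List.of("objB"),
            "objB", List.of("objA"));
        Set<String> live = reachable(refs, List.of("stackLocal", "staticField"));
        System.out.println("customer reachable: " + live.contains("customer")); // true
        System.out.println("objA reachable: " + live.contains("objA"));         // false
    }
}
```

Unlike reference counting, the traversal never visits the objA/objB cycle, so both objects count as garbage.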

References revisited

Before JDK 1.2, Java defined a reference as follows: if the value stored in a piece of reference-type data is the starting address of another block of memory, that data is said to be a reference to that block.
Under this definition an object has only two states: referenced or not referenced.
Since JDK 1.2, Java has extended the concept and distinguishes four kinds of reference, in decreasing order of strength: Strong Reference, Soft Reference, Weak Reference, and Phantom Reference.

  • Strong references: ubiquitous in program code, for example Object obj = new Object(). As long as a strong reference exists, the garbage collector will never reclaim the referenced object.
  • Soft references: describe objects that are useful but not essential, represented in Java by the java.lang.ref.SoftReference class. Before the system throws an out-of-memory error, softly referenced objects are added to the collection scope and a second collection is performed; if that collection still does not free enough memory, the out-of-memory error is thrown. This makes soft references well suited to memory-sensitive caches such as web page or image caches.
  • Weak references: objects that are only weakly referenced have an even shorter lifetime. When the garbage collector scans the memory it manages and finds an object that has only weak references, it reclaims the object whether or not memory is currently sufficient. However, because the program does not control when a collection runs, such objects are not necessarily discovered right away.
  • Phantom references: whether an object has a phantom reference has no effect on its lifetime, and an object instance cannot be obtained through a phantom reference. Their only purpose is to receive a system notification when the object is garbage collected. Java implements them with the PhantomReference class.
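
The reference classes in java.lang.ref can be tried directly. The sketch below checks only behaviour that holds while a strong reference still exists; what happens after System.gc() is up to the JVM, so the last line prints a result without promising one. The class name ReferenceKindsDemo and its method are made up for the example.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class ReferenceKindsDemo {
    /** True when each reference kind behaves as documented while the referent is strongly held. */
    static boolean behavesAsDocumented() {
        Object strong = new Object();                          // strong reference
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        ReferenceQueue<Object> queue = new ReferenceQueue<>(); // receives the phantom reference later
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        boolean ok = soft.get() == strong      // soft: referent still available
                  && weak.get() == strong      // weak: referent still available
                  && phantom.get() == null;    // phantom: get() always returns null
        Reference.reachabilityFence(strong);   // keep the referent strongly reachable up to here
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("documented behaviour holds: " + behavesAsDocumented());

        WeakReference<Object> weak = new WeakReference<>(new Object()); // no strong reference kept
        System.gc();                                                    // only a request to the JVM
        System.out.println("weak referent after GC request: " + weak.get()); // often null, not guaranteed
    }
}
```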

To live or to die?

For an object that is unreachable in the root search, the GC checks whether the object overrides the finalize() method; if it does not, the object is reclaimed directly. If it does, and finalize() has not yet been executed for this object, the object is placed in a queue called the F-Queue, and a low-priority thread runs finalize() for the objects in that queue. finalize() is the object's chance to rescue itself: it only has to re-establish a link with any object on a reference chain, for example by assigning itself (the this keyword) to a class variable or to a member variable of another object. After finalize() has run, the GC checks again whether the object is reachable. If it is not, the object is reclaimed; otherwise the object is "resurrected".

public class TestGC {

    public static TestGC testgc = null;

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("finalize() executed");
        // Self-rescue: re-attach this object to a static field, which is a GC root
        testgc = this;
    }

    public static void main(String[] args) throws Exception {
        testgc = new TestGC();
        testgc = null;
        System.gc();
        // The finalizer thread has low priority, so pause 0.5 s to let finalize() run
        Thread.sleep(500);
        if (testgc != null) {
            System.out.println("self-rescue succeeded");
        } else {
            System.out.println("dead");
        }
        // Exactly the same code as above, but this time the self-rescue fails
        testgc = null;
        System.gc();
        Thread.sleep(500);
        if (testgc != null) {
            System.out.println("self-rescue succeeded");
        } else {
            System.out.println("dead");
        }
    }
}

result:

finalize() executed
self-rescue succeeded
dead

Note: the system automatically calls the finalize() method of any object at most once, which is why the second rescue attempt fails.

Garbage Collection Algorithms

mark-sweep algorithm

The algorithm has two phases, "mark" and "sweep": first all objects that need to be reclaimed are marked, and once marking is complete the marked objects are reclaimed together. Neither phase is very efficient, and sweeping leaves behind a large number of non-contiguous memory fragments, which can make it hard to find contiguous space for a large object later.
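
The fragmentation effect shows up even on a toy heap. In this sketch the heap is an array of slots, and the set passed in holds the survivors (an assumption made for brevity; the text above describes marking the objects to reclaim, which is the complementary set). All names are illustrative.

```java
import java.util.Arrays;
import java.util.Set;

final class MarkSweepDemo {
    /** Sweep phase on a toy heap: slots whose object is not a survivor become free (null). */
    static String[] sweep(String[] heap, Set<String> survivors) {
        String[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (result[i] != null && !survivors.contains(result[i])) {
                result[i] = null;   // reclaim the slot; survivors stay where they are
            }
        }
        return result;
    }

    /** Longest run of adjacent free slots, i.e. the largest allocation that still fits. */
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        String[] after = sweep(heap, Set.of("A", "C", "E"));
        System.out.println(Arrays.toString(after));                       // [A, null, C, null, E, null]
        System.out.println("largest free run: " + largestFreeRun(after)); // 1, although 3 slots are free
    }
}
```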

copying algorithm

The copying algorithm divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block and the used block is then cleaned up in one pass. Each collection therefore reclaims a whole half, and allocation does not have to deal with fragmentation. The price is that the usable memory is cut in half.
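
A minimal sketch of the two-block scheme, again on a toy array heap. The names CopyingDemo, fromSpace, and toSpace are illustrative, and the set of live objects is assumed to be known already.

```java
import java.util.Arrays;
import java.util.Set;

final class CopyingDemo {
    /** Copies the live objects of the from-space into a fresh to-space, packed from index 0. */
    static String[] copySurvivors(String[] fromSpace, Set<String> live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;   // next free slot in the to-space
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) {
                toSpace[next++] = obj;
            }
        }
        return toSpace;   // the from-space can now be discarded as a whole
    }

    public static void main(String[] args) {
        String[] fromSpace = {"A", "B", "C", "D"};
        String[] toSpace = copySurvivors(fromSpace, Set.of("A", "D"));
        System.out.println(Arrays.toString(toSpace));   // [A, D, null, null]
    }
}
```

The survivors end up contiguous, so the free space is a single block and new objects can be placed one after another.
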
Most virtual machines use this algorithm to collect the young generation. About 98% of the objects in the young generation die young, so there is no need to split the space 1:1; instead it is divided into one larger Eden space and two smaller Survivor spaces. Eden and one Survivor are in use at any time. During a collection, the surviving objects in Eden and the Survivor in use are copied to the other Survivor in one pass, and then Eden and the Survivor just used are cleaned up. The default size ratio of Eden to a Survivor is 8:1.
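
The 8:1 ratio can be turned into concrete sizes. This is a back-of-the-envelope sketch: the 100 MB young generation is an arbitrary assumption, the names are made up, and a real JVM may round the sizes for alignment.

```java
final class YoungGenLayout {
    /** Size of one Survivor space when Eden : Survivor = survivorRatio : 1 and there are two Survivors. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Eden gets whatever the two Survivor spaces leave over. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long young = 100 * mb;   // assumed young generation size
        int ratio = 8;           // Eden : Survivor = 8 : 1
        long eden = edenBytes(young, ratio);
        long survivor = survivorBytes(young, ratio);
        System.out.println("Eden: " + eden / mb + " MB");               // Eden: 80 MB
        System.out.println("each Survivor: " + survivor / mb + " MB");  // each Survivor: 10 MB
        // Eden plus one Survivor is usable at a time, so only 10% is held in reserve
        System.out.println("usable: " + 100.0 * (eden + survivor) / young + "%");  // usable: 90.0%
    }
}
```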

mark-compact algorithm

When the object survival rate is high, the copying algorithm has to perform many copy operations and becomes inefficient, so it is generally not used for the old generation.
For the characteristics of the old generation, the "mark-compact" algorithm was proposed: it first marks all objects that need to be reclaimed, then moves all surviving objects toward one end, and finally clears the memory beyond the end boundary.
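
The move-to-one-end step can be sketched in place on a toy array heap. As in any such illustration, the names are made up and the set of survivors is assumed to come from an earlier marking pass.

```java
import java.util.Arrays;
import java.util.Set;

final class MarkCompactDemo {
    /** Slides survivors toward index 0 in place and returns the boundary after the last survivor. */
    static int compact(String[] heap, Set<String> survivors) {
        int free = 0;   // next slot to fill at the compacted end
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            if (obj != null && survivors.contains(obj)) {
                heap[free++] = obj;   // free <= i, so no unvisited slot is overwritten
            }
        }
        Arrays.fill(heap, free, heap.length, null);   // clear everything beyond the boundary
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        int boundary = compact(heap, Set.of("A", "C", "E"));
        System.out.println(Arrays.toString(heap));       // [A, C, E, null, null, null]
        System.out.println("boundary: " + boundary);     // boundary: 3
    }
}
```

No second memory block is needed, and the free space ends up contiguous, at the cost of moving objects within the heap.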

Generational Collection Algorithm

The Java heap is divided into the young generation and the old generation. In the young generation, a large number of objects die at every collection and only a few survive, so the copying algorithm is used: a collection costs only the copying of the few survivors. In the old generation, objects have a high survival rate and there is no extra space to guarantee their allocation, so "mark-sweep" or "mark-compact" must be used.

Garbage collectors

Serial collector

For the young generation;
uses the copying algorithm; single-threaded collection;
while it collects garbage, all application threads must be suspended until it finishes;
this is "Stop The World";

ParNew collector

The ParNew collector is the multithreaded version of the Serial collector.
Apart from using multiple threads, its behavior and characteristics are the same as the Serial collector's,
including the available control parameters, the collection algorithm, Stop The World, object allocation rules, and collection strategy;
the two collectors share a lot of code;

Serial Old collector

Serial Old is the old-generation version of the Serial collector;
for the old generation;
uses the "mark-compact" algorithm (with compaction, Mark-Sweep-Compact);
single-threaded collection;

Difference Between Concurrent Garbage Collection and Parallel Garbage Collection

Parallel

Multiple garbage collection threads work in parallel while the user threads remain in a waiting state;
examples: ParNew, Parallel Scavenge, Parallel Old;

Concurrent

The user threads and the garbage collection threads execute at the same time (not necessarily in parallel; they may be interleaved);
the user program keeps running while the garbage collector runs on another CPU;
examples: CMS, G1 (which is also parallel);

Difference between Minor GC and Full GC

Minor GC

Also called young-generation GC: garbage collection that takes place in the young generation;
because most Java objects die young, Minor GC is very frequent and usually fast;

Full GC

Also called Major GC or old-generation GC: garbage collection that takes place in the old generation;
a Full GC is often accompanied by at least one Minor GC (not always: the Parallel Scavenge collector has a strategy option that goes straight to Major GC);
Major GC is generally more than 10 times slower than Minor GC;
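
Collection counts and accumulated collection time can be read from a running JVM through the standard java.lang.management API. The sketch below just lists what the beans report; the bean names depend on the collector in use and the JVM version, and the class name GcStats is made up for the example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcStats {
    /** One line per collector bean: name, collections so far, accumulated collection time. */
    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())    // -1 if undefined
              .append(", timeMs=").append(gc.getCollectionTime())    // -1 if undefined
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Allocate short-lived objects so that young-generation collections are likely to happen
        long total = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] garbage = new byte[1024];
            total += garbage.length;
        }
        System.out.println("allocated bytes: " + total);
        System.out.print(describe());
    }
}
```

On a typical run the bean covering the young generation reports far more collections than the one covering the old generation.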

Throughput and Collector Concerns Explained

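  (A) Throughput

          The ratio of the CPU time spent running user code to the total CPU time consumed;

          that is, throughput = time running user code / (time running user code + garbage collection time);

          high throughput means less time spent on garbage collection, so user code gets more running time;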
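
As a worked example of the throughput ratio: if a JVM runs for 100 minutes in total and garbage collection takes 1 of them, throughput is 99 / (99 + 1) = 99%. The helper below is only an illustration with made-up names.

```java
final class Throughput {
    /** throughput = time running user code / (time running user code + garbage collection time) */
    static double throughput(double userTime, double gcTime) {
        return userTime / (userTime + gcTime);
    }

    public static void main(String[] args) {
        double userMinutes = 99.0;   // assumed time spent in user code
        double gcMinutes = 1.0;      // assumed time spent in garbage collection
        System.out.println("throughput = " + throughput(userMinutes, gcMinutes));   // throughput = 0.99
    }
}
```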

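    (B) Goals the garbage collector aims for (concerns)

        (1) Pause time

              Shorter pauses suit programs that interact with users;

              good responsiveness improves the user experience;

        (2) Throughput

              High throughput makes efficient use of CPU time and finishes computation as soon as possible;

              it mainly suits background computation that needs little interaction;

        (3) Footprint

              With the first two goals met, keep the heap's memory footprint as small as possible;

              this can yield better spatial locality;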

