【JAVA Advanced Architect Guide】 Part 4: Garbage Collection (GC)

Foreword

  In parts two and three of the [JAVA Advanced Architect Guide] series, we covered the JVM memory model and the class loading mechanism. In the memory model, we saw that from a thread's perspective the JVM is divided into thread-private areas (virtual machine stack, native method stack, program counter) and thread-shared areas (method area and Java heap). Memory in the thread-private areas is reclaimed when the thread ends, so GC mainly concerns itself with the heap and the method area.

GC recovery algorithm

  How does GC determine which objects need to be recycled? Generally speaking, there are two algorithms: reference counting algorithm and reachability analysis algorithm.

Reference counting algorithm

  Each object holds a reference counter whose initial value is 0. The counter is incremented by one each time a new reference to the object is created and decremented by one each time a reference is released. When GC runs, an object whose counter equals 0 is reclaimed; otherwise it is kept. The obvious shortcoming of reference counting is that it cannot handle circular references. If object A refers to object B, object B refers to object C, and object C refers back to object A, none of the three counters ever drops to 0, so A, B, and C can never be reclaimed even when nothing else refers to them. This leads to the reachability analysis algorithm.
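
The A, B, C cycle described above can be reproduced with a toy model. This is only a sketch: the Node class, its refCount field, and the pointTo helper are hypothetical names made up for illustration, and the JVM does not manage objects this way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of reference counting; names here are illustrative assumptions. */
final class RefCountDemo {

    /** A hypothetical object that tracks how many references point at it. */
    static final class Node {
        final String name;
        int refCount = 0;   // starts at 0, as in the description above
        Node target;        // at most one outgoing reference, to keep things small

        Node(String name) { this.name = name; }

        /** Points this node at another node and adjusts both counters. */
        void pointTo(Node newTarget) {
            if (newTarget != null) newTarget.refCount++;
            if (target != null) target.refCount--;
            target = newTarget;
        }
    }

    /** Names of the nodes a pure reference-counting collector would reclaim. */
    static List<String> collectable(List<Node> heap) {
        List<String> result = new ArrayList<>();
        for (Node n : heap) {
            if (n.refCount == 0) result.add(n.name);
        }
        return result;
    }

    /** Builds A -> B -> C -> A plus an unreferenced D; nothing outside refers to them. */
    static List<Node> buildCycleHeap() {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C"), d = new Node("D");
        a.pointTo(b);
        b.pointTo(c);
        c.pointTo(a);
        return Arrays.asList(a, b, c, d);
    }

    public static void main(String[] args) {
        // All four objects are garbage, but only D has a counter of 0.
        System.out.println(collectable(buildCycleHeap())); // prints [D]
    }
}
```

A, B, and C each keep a count of 1 because of the cycle, so a collector that looks only at the counters never frees them.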

Reachability analysis algorithm

  The reachability analysis algorithm takes a set of objects called "GC Roots" as starting points and searches downward from these nodes. The path the search follows is called a reference chain. When an object is not connected to any GC Root by a reference chain, the object is unreachable and can be reclaimed; otherwise it cannot. In Java, the objects that can serve as GC Roots include the following:
  a. Objects referenced from the local variable table in a stack frame of the virtual machine stack
  b. Objects referenced by class static fields in the method area
  c. Objects referenced by constants in the method area
  d. Objects referenced by JNI (native methods) in the native method stack
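
Reachability analysis is essentially a graph traversal starting from the roots. The sketch below assumes a made-up heap described as a map from object name to the names it references; the names "root", "X", "Y" and the reachable method are illustrative only, not JVM internals.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy reachability analysis over a hypothetical object graph. */
final class ReachabilityDemo {

    /** Returns every object reachable from the given roots by following references. */
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) { // first visit: follow its outgoing references
                pending.addAll(refs.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    /** root -> X -> Y is reachable; the cycle A -> B -> C -> A hangs off nothing. */
    static Map<String, List<String>> sampleHeap() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("X"));
        refs.put("X", Arrays.asList("Y"));
        refs.put("A", Arrays.asList("B"));
        refs.put("B", Arrays.asList("C"));
        refs.put("C", Arrays.asList("A"));
        return refs;
    }

    public static void main(String[] args) {
        Set<String> live = reachable(sampleHeap(), Arrays.asList("root"));
        System.out.println("A reachable? " + live.contains("A")); // prints A reachable? false
    }
}
```

Unlike reference counting, the cycle A, B, C is never visited from a root, so all three objects are identified as garbage.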

Java language object reference type

  Both reference counting and reachability analysis revolve around references to objects. In Java, references are divided into four types, in decreasing order of strength: strong references, soft references, weak references, and phantom references (sometimes translated as virtual or ghost references). A strong reference is the ordinary kind we create most often with new, such as:

	Object object = new Object();

  An object that is still reachable through a strong reference is never reclaimed by GC. Even when memory is insufficient, the JVM would rather throw an OutOfMemoryError than reclaim such objects. This is why, when reading the source code of well-known frameworks, we often see a field or array slot being set to null with a // help gc comment after it. Setting the reference to null removes the strong reference so that the memory of an object that is no longer needed can be reclaimed. Interested readers can look through the source of projects such as the JDK and Spring, where this idiom can be found. The rigor and programming skill of those authors are worth learning from!
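
A minimal sketch of that idiom, using a hypothetical array-backed stack. SimpleStack and slotIsCleared are names invented for this example and are not taken from any framework.

```java
import java.util.Arrays;

/** Hypothetical array-backed stack showing the "help gc" idiom. */
final class SimpleStack<T> {
    private Object[] elements = new Object[8];
    private int size = 0;

    void push(T item) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = item;
    }

    @SuppressWarnings("unchecked")
    T pop() {
        if (size == 0) throw new IllegalStateException("stack is empty");
        T item = (T) elements[--size];
        elements[size] = null; // help gc: the array no longer holds a strong reference
        return item;
    }

    int size() { return size; }

    /** Exposed only so the sketch can be checked: true if the slot holds no reference. */
    boolean slotIsCleared(int index) { return elements[index] == null; }
}
```

Without the assignment to null, the backing array would keep the popped object strongly reachable for as long as the stack itself lives, even though callers can no longer get to it.
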
  Soft, weak, and phantom references each have a corresponding class in the JDK: SoftReference, WeakReference, and PhantomReference. Space in this post is limited, so not every point can be explained in detail; it is enough here to know that these reference types exist, and interested readers can study them further.
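
As a starting point for that further study, here is a small sketch of the three classes from java.lang.ref. The class and method names of the demo itself are assumptions. Whether and when a referent is actually cleared depends on the JVM, so the last line has no guaranteed output.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

/** Sketch of the JDK reference classes; GC timing is not deterministic. */
final class ReferenceKindsDemo {

    /**
     * Returns {soft sees referent, weak sees referent, phantom get() is null}
     * while each referent is still held by a strong reference.
     */
    static boolean[] whileStronglyReachable() {
        Object a = new Object();
        Object b = new Object();
        Object c = new Object();
        SoftReference<Object> soft = new SoftReference<Object>(a);
        WeakReference<Object> weak = new WeakReference<Object>(b);
        PhantomReference<Object> phantom =
                new PhantomReference<Object>(c, new ReferenceQueue<Object>());
        // PhantomReference.get() always returns null by specification.
        return new boolean[] { soft.get() == a, weak.get() == b, phantom.get() == null };
    }

    public static void main(String[] args) {
        boolean[] r = whileStronglyReachable();
        System.out.println("soft=" + r[0] + ", weak=" + r[1] + ", phantomGetIsNull=" + r[2]);

        // No strong reference is kept to this referent.
        WeakReference<Object> weak = new WeakReference<Object>(new Object());
        System.gc(); // only a request; the JVM is free to ignore it
        System.out.println("weak referent after GC request: " + weak.get());
    }
}
```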

GC recycling strategy

  Now that we have seen how GC determines which objects need to be recycled, let's look at the strategies GC uses to reclaim them. In general, there are three: the mark-sweep algorithm, the copying algorithm, and the mark-compact algorithm.

1. Mark-sweep algorithm

  Mark-sweep is the most basic collection algorithm. It has two phases, marking and sweeping: first, all objects that need to be recycled are marked, which happens during reachability analysis; after marking is complete, all marked objects are reclaimed in one pass.

  The shortcoming of this algorithm is obvious: it leaves behind a large number of discontinuous memory fragments. When a larger object later needs to be allocated, there may be no contiguous block big enough, which forces another garbage collection to be triggered early.
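
The fragmentation problem can be seen in a toy model. The sketch assumes a heap of eight equally sized slots and a precomputed live array standing in for the result of the marking phase; all names are illustrative.

```java
/** Toy mark-sweep over a slot array; a sketch, not how a real JVM lays out memory. */
final class MarkSweepDemo {

    /** Sweep phase: every slot not marked live becomes free (null). */
    static String[] sweep(String[] heap, boolean[] live) {
        String[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (!live[i]) after[i] = null;
        }
        return after;
    }

    /** Total number of free slots. */
    static int freeSlots(String[] heap) {
        int free = 0;
        for (String slot : heap) {
            if (slot == null) free++;
        }
        return free;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(String[] heap) {
        int best = 0, run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F", "G", "H"};
        boolean[] live = {true, false, true, false, true, false, true, false};
        String[] after = sweep(heap, live);
        // Half the heap is free, yet no object larger than one slot fits.
        System.out.println(freeSlots(after) + " free, largest run " + largestFreeRun(after));
        // prints 4 free, largest run 1
    }
}
```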

2. Copying algorithm

  Since mark-sweep cannot hand back contiguous memory, the copying algorithm divides memory into two equal areas from the start and uses only one of them at a time. When the area in use fills up and GC is triggered, the objects that must not be reclaimed are found and copied to the other, unused area, where they are laid out next to each other so that they occupy contiguous memory. The full area is then cleared in one go, and the two areas swap roles for the next cycle.

  The copying algorithm fixes the fragmentation problem of mark-sweep, but its own shortcoming is just as obvious: memory utilization is low, because only 50% of the memory can be used at any time.
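
A rough sketch of one copying cycle, again on a slot-array model. The copyLive method and the live array (standing in for the reachability result) are assumptions made for the example.

```java
import java.util.Arrays;

/** Toy copying collector: survivors move from one half to the other, packed together. */
final class CopyingDemo {

    /** Copies live objects from the "from" space into a fresh "to" space of equal size. */
    static String[] copyLive(String[] from, boolean[] live) {
        String[] to = new String[from.length];
        int next = 0; // next free slot in the to-space
        for (int i = 0; i < from.length; i++) {
            if (from[i] != null && live[i]) {
                to[next++] = from[i];
            }
        }
        return to; // the from-space can now be cleared as a whole
    }

    public static void main(String[] args) {
        String[] from = {"A", "B", "C", "D"};
        boolean[] live = {true, false, true, false};
        System.out.println(Arrays.toString(copyLive(from, live)));
        // prints [A, C, null, null]
    }
}
```

The survivors end up contiguous, but the model needs two arrays of the same size to hold one array's worth of objects, which is the 50% cost mentioned above.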

3. Mark-compact algorithm

  Since the copying algorithm can only use half of the memory at a time, the next optimization is to use the whole memory area, as mark-sweep does. The difference from mark-sweep is what happens during collection: once the objects that must not be reclaimed have been identified, they are moved toward one end of the memory area, and everything beyond the boundary is cleared.

  The advantage of mark-compact is that memory is used more fully and no large amount of fragmentation is left behind.
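
The same slot-array model can illustrate compaction in place. This is a sketch under the same assumptions as before: compact and the live array are made-up names, and a real collector must also update every reference to a moved object, which this model skips.

```java
import java.util.Arrays;

/** Toy mark-compact: live objects slide to the front of the same array. */
final class MarkCompactDemo {

    /**
     * Moves live objects toward index 0 and clears everything after them.
     * Returns the index of the first free slot after compaction.
     */
    static int compact(String[] heap, boolean[] live) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) {
                heap[next++] = heap[i]; // next <= i, so nothing live is overwritten
            }
        }
        Arrays.fill(heap, next, heap.length, null);
        return next;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        boolean[] live = {true, false, true, false, false, true};
        int firstFree = compact(heap, live);
        System.out.println(Arrays.toString(heap) + ", first free slot = " + firstFree);
        // prints [A, C, F, null, null, null], first free slot = 3
    }
}
```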

Collection algorithms in the heap

  The Java heap uses generational collection. First, why is the heap divided into generations, or in other words, why does the Java heap use generational collection? According to widely cited statistics, more than 80% of objects are short-lived: they are no longer used once the method that created them finishes, and can be reclaimed right away. The remaining 20% or so continue to be used and cannot be reclaimed. Generational collection is designed around this characteristic, which can be summed up in one sentence: different objects have different life cycles.
  Generational collection divides the Java heap into a young generation and an old generation. The old generation uses the mark-compact algorithm and the young generation uses the copying algorithm. The young generation is further divided into Eden and two Survivor spaces, S0 and S1 (also called Survivor From and Survivor To), with a default ratio of 8:1:1. The default ratio of young generation to old generation is 1:2, that is, the young generation accounts for 1/3 of the whole heap and the old generation for 2/3.
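
To make the ratios concrete, the arithmetic can be worked through for a given heap size. In this sketch the parameters newRatio (old : young) and survivorRatio (Eden : one Survivor space) are named after the corresponding HotSpot options -XX:NewRatio and -XX:SurvivorRatio, but the split method itself is hypothetical, and a real JVM aligns and adjusts region sizes, so actual figures will differ.

```java
/** Back-of-the-envelope generation sizes for the 1:2 and 8:1:1 ratios. */
final class GenerationSizes {

    /**
     * Returns {eden, s0, s1, old} in the same unit as heapSize.
     * newRatio is old : young, survivorRatio is eden : one survivor space.
     */
    static long[] split(long heapSize, int newRatio, int survivorRatio) {
        long young = heapSize / (newRatio + 1);        // 1/3 of the heap when newRatio = 2
        long old = heapSize - young;                   // the remaining 2/3
        long survivor = young / (survivorRatio + 2);   // 1/10 of young when survivorRatio = 8
        long eden = young - 2 * survivor;              // the remaining 8/10
        return new long[] { eden, survivor, survivor, old };
    }

    public static void main(String[] args) {
        long[] mb = split(600, 2, 8); // a 600 MB heap with the ratios from the text
        System.out.println("eden=" + mb[0] + " s0=" + mb[1] + " s1=" + mb[2] + " old=" + mb[3]);
        // prints eden=160 s0=20 s1=20 old=400
    }
}
```
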
  The detailed workflow of the young and old generations is not repeated here, since many blog posts already cover it. Note that a GC of the young generation is called Minor GC or Young GC, and a GC that occurs in the old generation is called Major GC or, loosely, Full GC. In general, a Full GC is more than ten times slower than a Minor GC!
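
One way to see how often each kind of collection has run, and how long it took, is the standard java.lang.management API. This is a sketch: GcStatsDemo and collectorSummaries are made-up names, and the collector names and numbers reported depend on the JVM and on which collector is selected, so no output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Reads collection counts and accumulated times from the running JVM. */
final class GcStatsDemo {

    /** One summary line per collector that the running JVM exposes. */
    static List<String> collectorSummaries() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()    // -1 if undefined for this collector
                    + ", timeMs=" + gc.getCollectionTime());  // accumulated time, approximate
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : collectorSummaries()) {
            System.out.println(line);
        }
    }
}
```

Comparing the accumulated time divided by the count for the young-generation and old-generation collectors gives a rough feel for the cost difference described above.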

  After reading this article, readers should have a basic understanding of JVM garbage collection. The next article covers the last and most important topic of the JVM series, JVM performance tuning, so stay tuned!

  If you found this post well written, you are welcome to follow the blogger's WeChat public account, where practical technical content is shared from time to time!

This article was published with OpenWrite, a multi-platform blog publishing tool.

Origin www.cnblogs.com/wukongbubai/p/12683510.html