JVM Series: An In-Depth Look at JVM Garbage Collection

Foreword

Once a Java object has been loaded into the JVM, its life cycle can be divided into seven stages: the creation stage, the application stage, the invisible stage, the unreachable stage, the collection stage, the end stage, and the object space reallocation stage.

  • Creation stage

The creation stage can be broken down into the following steps:

(1) allocate memory for the object; (2) begin constructing the object; (3) initialize static members, from superclass to subclass; (4) recursively call the superclass constructors; (5) call the subclass constructor.
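The ordering in steps (3) to (5) can be observed with a small experiment. The sketch below is only illustrative: the classes InitLog, Parent, Child and InitOrderDemo are hypothetical names that do not come from the original article. Note that static initialization happens once, when a class is first used, not every time an object is created.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that records initialization events in order.
class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

class Parent {
    static { InitLog.EVENTS.add("Parent static init"); }

    Parent() { InitLog.EVENTS.add("Parent constructor"); }
}

class Child extends Parent {
    static { InitLog.EVENTS.add("Child static init"); }

    Child() {
        // The compiler inserts super() here, so Parent() runs before this body.
        InitLog.EVENTS.add("Child constructor");
    }
}

class InitOrderDemo {
    // Creates one Child and returns a copy of the events recorded so far.
    static List<String> run() {
        new Child();
        return new ArrayList<>(InitLog.EVENTS);
    }

    public static void main(String[] args) {
        // On the first call the order is: Parent static init, Child static init,
        // Parent constructor, Child constructor.
        run().forEach(System.out::println);
    }
}
```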

  • Application stage

Once the object has been created and assigned to a variable, it enters the application stage. In this stage the object has at least one strong reference, or is explicitly referenced through a soft, weak, or phantom reference;

  • Invisible stage

No strong reference to the object can be found in the program any more, for example because execution has left the scope of the object. The object may still be held by certain special GC Roots, however, such as a reference from a native method stack (JNI) or from a running thread;

  • Unreachable stage

The object is no longer held by any strong reference, and the garbage collector has found it to be unreachable;

  • Collection stage

The garbage collector has found the object to be unreachable and is ready to reclaim its memory. If the collector discovers that the object overrides the finalize() method, it spares the object from this collection and calls finalize(). If the object does not override finalize(), it simply waits for the garbage collector to reclaim its memory.

  • End stage

At this point, either the object's finalize() method has been executed (the GC does not necessarily wait for finalize() to finish) or the object does not override finalize(). In both cases the object is waiting for the garbage collector to reclaim its memory.

  • Object space reallocation stage

Once the GC has reclaimed the object's memory, the life cycle of the object is completely over.

That is the life cycle of an object loaded into the JVM. In the Java virtual machine, reclaiming objects is invisible to the programmer: once an object is no longer referenced by other objects, the GC may mark it as unreachable, and it then waits to be reclaimed. To reclaim an unreferenced object, the Java virtual machine goes through two steps: marking the object, and having the garbage collector collect it.

Garbage marking algorithms

In the Java virtual machine, algorithms for marking garbage objects (an object that is no longer held by any other object is called garbage) fall into two categories: reference counting and reachability analysis (some articles also call reachability analysis the root search algorithm).

Reference counting

The book "In-Depth Understanding of the Java Virtual Machine" defines reference counting as follows: add a reference counter to each object; whenever a reference to the object is created, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1; an object whose counter is 0 at any moment can no longer be used. Mainstream commercial virtual machines do not use reference counting, however, because it has great difficulty handling objects that reference each other, as in the following code:

[Figures: reference counting test code, parts 1 and 2]

As the code shows, after executing

TestReferenceCountingGC gc_1 = new TestReferenceCountingGC();
TestReferenceCountingGC gc_2 = new TestReferenceCountingGC();

gc_1.instance = gc_2;
gc_2.instance = gc_1;

each of the two TestReferenceCountingGC objects is referenced twice: once by a local variable and once by the instance field of the other object. Under reference counting, both counters would therefore be 2. When gc_1 = null; and gc_2 = null; are executed, one reference to each object becomes invalid, which leaves each counter at 1. So if the Java virtual machine marked garbage objects by reference counting, the memory of these two objects would not be reclaimed by the garbage collector, and the GC log would look like this:

[GC (System.gc()) [PSYoungGen: 9339K->4872K(76288K)] 9339K->4880K(251392K), 0.0057164 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

In practice, however, the GC log looks like this:

[GC (System.gc()) [PSYoungGen: 9339K->776K(76288K)] 9339K->784K(251392K), 0.0015327 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 

This GC log shows that the Java virtual machine does not currently use reference counting as its garbage marking algorithm.
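The counter arithmetic described above can be replayed with a toy model. This is only a sketch of how a reference-counting scheme would behave, not of how the JVM works; CountedObject, retain and release are hypothetical names introduced for illustration.

```java
import java.util.Arrays;

// Toy object that carries its own reference counter.
class CountedObject {
    int refCount = 0;          // number of references currently pointing at this object
    CountedObject instance;    // mirrors the instance field used in the snippet above

    // Simulates creating a reference to o.
    static CountedObject retain(CountedObject o) {
        if (o != null) o.refCount++;
        return o;
    }

    // Simulates a reference to o becoming invalid.
    static void release(CountedObject o) {
        if (o != null) o.refCount--;
    }
}

class RefCountCycleDemo {
    // Returns the two counters after both local references have been dropped.
    static int[] run() {
        CountedObject a = CountedObject.retain(new CountedObject()); // gc_1 refers to A: count 1
        CountedObject b = CountedObject.retain(new CountedObject()); // gc_2 refers to B: count 1
        a.instance = CountedObject.retain(b);                        // B: count 2
        b.instance = CountedObject.retain(a);                        // A: count 2
        CountedObject.release(a);                                    // gc_1 = null, A: count 1
        CountedObject.release(b);                                    // gc_2 = null, B: count 1
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // prints [1, 1]
    }
}
```

Neither counter ever reaches 0, so a pure reference-counting collector would keep both objects even though the program can no longer reach them.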

Reachability analysis

The main idea of reachability analysis is to take a set of objects called GC Roots as starting points and search downward from these nodes. The path traversed by the search is called a reference chain (Reference Chain). When no reference chain connects an object to any GC Root (in graph-theory terms, the object is unreachable from the GC Roots), the object is proven to be unusable.

In Java, the objects that can serve as GC Roots include (in part):

  • Objects referenced from the virtual machine stack (the local variable table of a stack frame);
  • Objects referenced by static fields of classes in the method area;
  • Objects referenced by constants in the method area (fields modified by the final keyword);
  • Objects referenced by JNI from native method stacks;
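Reachability analysis amounts to a graph traversal that starts at the roots. The following is a minimal sketch over a hand-built object graph; GraphNode and ReachabilityDemo are hypothetical names, and a real collector walks the actual heap rather than a list of nodes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy heap object: a name plus the outgoing references it holds.
class GraphNode {
    final String name;
    final List<GraphNode> refs = new ArrayList<>();

    GraphNode(String name) { this.name = name; }
}

class ReachabilityDemo {
    // Marks every node that can be reached by following references from the roots.
    static Set<GraphNode> reachableFrom(Collection<GraphNode> roots) {
        Set<GraphNode> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<GraphNode> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            GraphNode node = pending.pop();
            if (marked.add(node)) {          // first visit: follow its references
                pending.addAll(node.refs);
            }
        }
        return marked;
    }

    // Builds root -> a, plus an isolated cycle b <-> c, and returns the reachable names.
    static Set<String> run() {
        GraphNode root = new GraphNode("root");
        GraphNode a = new GraphNode("a");
        GraphNode b = new GraphNode("b");
        GraphNode c = new GraphNode("c");
        root.refs.add(a);
        b.refs.add(c);
        c.refs.add(b);                       // b and c reference each other only
        Set<String> names = new TreeSet<>();
        for (GraphNode node : reachableFrom(Arrays.asList(root))) {
            names.add(node.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [a, root]
    }
}
```

The cycle between b and c is never marked because no chain of references leads to it from the root, which is exactly the case that defeats reference counting.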

Before JDK 1.2, a reference was defined as follows: if the value stored in reference-type data in the local variable table of the VM stack is the starting address of another block of memory, that block of memory is said to represent a reference. Under this definition an object has only two states, referenced and not referenced. It cannot describe objects of the following kind: when memory is sufficient they are kept in memory, but if memory is still tight after a garbage collection, they may be reclaimed.

For this reason, references were divided into strong references (Strong Reference), soft references (Soft Reference), weak references (Weak Reference), and phantom references (Phantom Reference):

  • A strong reference is the ordinary kind of reference, such as Object obj = new Object(). As long as a strong reference exists, the garbage collector will never reclaim the referenced object;

  • A soft reference is somewhat weaker than a strong reference and lets an object escape some garbage collections: the object it points to is reclaimed only when the JVM considers memory to be insufficient. The JVM ensures that softly referenced objects are cleared before it throws an OOM error;

  • A weak reference cannot exempt an object from garbage collection; it only provides a way to access the object while it is in the weakly referenced state. An object associated only with weak references survives only until the next garbage collection. When the garbage collector runs, it reclaims objects associated only with weak references, regardless of whether memory is currently sufficient;

  • A phantom reference is also called a ghost reference. The object cannot be accessed through it. A phantom reference only provides a mechanism for doing something after the object has been finalized, such as receiving a system notification when the object is garbage collected.
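The four kinds of reference map onto ordinary variables and the classes in java.lang.ref. The sketch below only checks behavior that does not depend on when the collector runs; ReferenceKindsDemo is a hypothetical name, and what happens to each reference after the strong reference is dropped depends on the collector and is not shown.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.Arrays;

class ReferenceKindsDemo {
    // Returns {soft sees the object, weak sees the object, phantom hides the object}.
    static boolean[] run() {
        Object strong = new Object();                       // strong reference
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        // A phantom reference must be registered with a queue; the queue is
        // where the reference is enqueued after the object has been collected.
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        // While the strong reference is alive, soft and weak references still
        // return the object. A phantom reference never returns it.
        boolean softSees = soft.get() == strong;
        boolean weakSees = weak.get() == strong;
        boolean phantomHides = phantom.get() == null;
        return new boolean[] { softSees, weakSees, phantomHides };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // prints [true, true, true]
    }
}
```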

Garbage collection algorithms

Mark - sweep algorithm

The mark-sweep algorithm is divided into two phases:

  • Mark phase: mark the objects that can be reclaimed;
  • Sweep phase: reclaim the memory of the marked objects;

Mark-sweep is the most basic collection algorithm: the collection algorithms described later are all refinements of it.

Mark-sweep has two main drawbacks. First, neither marking nor sweeping is very efficient. Second, sweeping the marked objects leaves behind a large number of discontiguous memory fragments. With too much fragmentation, a later allocation of a larger object may fail to find enough contiguous memory and trigger another round of garbage collection.
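The fragmentation problem can be made concrete with a toy heap of fixed-size slots. This is a minimal sketch, assuming each object occupies exactly one slot and that the mark phase has already produced the set of reclaimable objects; MarkSweepDemo and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkSweepDemo {
    // Sweep phase: free every slot whose object was marked as reclaimable.
    static String[] sweep(String[] heap, Set<String> markedForReclaim) {
        String[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] != null && markedForReclaim.contains(after[i])) {
                after[i] = null;
            }
        }
        return after;
    }

    // Total number of free slots.
    static int freeSlots(String[] heap) {
        int free = 0;
        for (String slot : heap) {
            if (slot == null) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        Set<String> garbage = new HashSet<>(Arrays.asList("B", "D", "F"));
        String[] after = sweep(heap, garbage);
        System.out.println(Arrays.toString(after)); // prints [A, null, C, null, E, null]
        System.out.println(freeSlots(after) + " free, largest run " + largestFreeRun(after));
    }
}
```

After the sweep, three slots are free but no two of them are adjacent, so an object that needs two contiguous slots cannot be allocated without another collection.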

Copying algorithm

The copying algorithm was proposed to solve the memory fragmentation caused by the mark-sweep algorithm. It divides the memory into two halves of equal size and uses only one of them at a time. When that half is used up, the surviving objects are copied to the other half, and the used half is then cleaned out in one pass.

The copying algorithm becomes inefficient when many objects survive, because more copying is needed. And if you do not want to waste 50% of the memory, you need extra space as an allocation guarantee to cope with the extreme case in which 100% of the objects in the used memory survive.
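The copying (replication) scheme can be sketched with two equally sized arrays standing in for the two halves of memory. This is only an illustration, assuming one slot per object and a precomputed set of live objects; CopyingDemo is a hypothetical name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingDemo {
    // Copies live objects from the used half into a fresh half of the same size,
    // packing them from index 0, then clears the used half in one pass.
    static String[] copyLive(String[] from, Set<String> live) {
        String[] to = new String[from.length];
        int next = 0;                       // next free slot in the target half
        for (String obj : from) {
            if (obj != null && live.contains(obj)) {
                to[next++] = obj;
            }
        }
        Arrays.fill(from, null);            // the old half is now entirely free
        return to;
    }

    public static void main(String[] args) {
        String[] from = { "A", "B", "C", "D", "E", "F" };
        Set<String> live = new HashSet<>(Arrays.asList("A", "C", "E"));
        String[] to = copyLive(from, live);
        System.out.println(Arrays.toString(to));   // prints [A, C, E, null, null, null]
        System.out.println(Arrays.toString(from)); // prints [null, null, null, null, null, null]
    }
}
```

The free space in the target half is contiguous, so there is no fragmentation, but at any moment only one of the two halves holds objects. The cost of the copy loop also grows with the number of survivors.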

Mark - compact algorithm

The copying algorithm is generally not used in the old generation: most objects there have a relatively high survival rate, so copying would require too many copy operations and be inefficient. Mark-sweep is not a good choice either, because it produces too much memory fragmentation and easily triggers further rounds of garbage collection. Hence the mark-compact algorithm (also called mark-compress). It differs from mark-sweep in that, after marking, it moves the surviving objects toward one end of the memory so that they sit together compactly, and then reclaims everything beyond the boundary of the live objects.
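The compaction step can be sketched on a toy heap of fixed-size slots. As before, this assumes one slot per object and a precomputed set of reclaimable objects; MarkCompactDemo is a hypothetical name, and a real collector would also have to update every reference to a moved object, which is omitted here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactDemo {
    // Slides every surviving object toward index 0, in place, and frees everything
    // past the last survivor. Returns the boundary (index of the first free slot).
    static int compact(String[] heap, Set<String> markedForReclaim) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            if (obj != null && !markedForReclaim.contains(obj)) {
                heap[boundary++] = obj;     // boundary never exceeds i, so this is safe
            }
        }
        Arrays.fill(heap, boundary, heap.length, null);
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        Set<String> garbage = new HashSet<>(Arrays.asList("B", "D", "F"));
        int boundary = compact(heap, garbage);
        System.out.println(Arrays.toString(heap)); // prints [A, C, E, null, null, null]
        System.out.println(boundary);              // prints 3
    }
}
```

Unlike the copying scheme, no second half of memory is needed, and unlike mark-sweep, the free slots end up in one contiguous block.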

Generational collection

Generational collection combines several different algorithms to handle different memory regions, so before studying it you first need to understand how the Java heap is divided. The Java heap is divided into the young generation (Young Generation) and the old generation (Tenured Generation), and the young generation is further divided into the Eden space, the From Survivor space, and the To Survivor space. Most objects in the Java heap die soon after they are created. Only a few live relatively long, and some live as long as the virtual machine itself. Using different garbage collection algorithms for objects with different life cycles is the idea behind generational collection.

Based on this division of the Java heap, garbage collection comes in two main types:

  • Minor GC: garbage collection in the young generation;
  • Full GC: also known as Major GC. A Full GC is usually accompanied by at least one Minor GC. It occurs less often and takes longer.

When a Minor GC runs, the virtual machine copies the surviving objects in Eden to the To Survivor space, and also copies the surviving objects in the From Survivor space to the To Survivor space. It then clears all objects in Eden and in the From Survivor space. Finally the two Survivor spaces swap roles: the former To Survivor space becomes the From Survivor space and waits for the next Minor GC. Not all new objects are allocated in Eden, however: when a new object needs much more memory than Eden has available, it is allocated directly in the old generation.

An object that is still alive after a certain number of Minor GCs in the young generation is promoted to the old generation. The virtual machine defines an age (Age) counter for each object. If an object born in Eden survives its first Minor GC and fits into a Survivor space, its age is set to 1. Each further Minor GC it survives increases its age by 1. When the age reaches the promotion threshold, the object is promoted to the old generation. The threshold is generally set to 15.
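The interplay of Eden, the two Survivor spaces and the age counter can be sketched as a small simulation. This is a simplified model under stated assumptions: lists stand in for memory regions, space limits are ignored, the set of live objects is supplied by the caller, and the promotion check follows the wording above (promote as soon as the age reaches the threshold). YoungObject and MinorGcDemo are hypothetical names, and the demo uses a threshold of 3 instead of 15 to keep the run short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

class YoungObject {
    final String name;
    int age = 0;                            // number of Minor GCs survived so far

    YoungObject(String name) { this.name = name; }
}

class MinorGcDemo {
    final int threshold;                    // promotion threshold
    List<YoungObject> eden = new ArrayList<>();
    List<YoungObject> from = new ArrayList<>();
    List<YoungObject> to = new ArrayList<>();
    final List<YoungObject> old = new ArrayList<>();

    MinorGcDemo(int threshold) { this.threshold = threshold; }

    YoungObject allocate(String name) {
        YoungObject obj = new YoungObject(name);
        eden.add(obj);
        return obj;
    }

    // One Minor GC: copy survivors of Eden and From into To (or promote them),
    // clear Eden and From, then swap the roles of the two Survivor spaces.
    void minorGc(Set<YoungObject> live) {
        for (List<YoungObject> space : Arrays.asList(eden, from)) {
            for (YoungObject obj : space) {
                if (!live.contains(obj)) continue;   // dead objects are simply dropped
                obj.age++;
                if (obj.age >= threshold) old.add(obj); else to.add(obj);
            }
            space.clear();
        }
        List<YoungObject> tmp = from;
        from = to;
        to = tmp;
    }

    // Allocates one long-lived and one short-lived object, then runs three Minor GCs.
    static MinorGcDemo run() {
        MinorGcDemo heap = new MinorGcDemo(3);
        YoungObject longLived = heap.allocate("longLived");
        heap.allocate("shortLived");
        Set<YoungObject> live = Collections.singleton(longLived);
        for (int i = 0; i < 3; i++) heap.minorGc(live);
        return heap;
    }

    public static void main(String[] args) {
        MinorGcDemo heap = run();
        System.out.println(heap.old.get(0).name + " promoted at age " + heap.old.get(0).age);
    }
}
```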

The virtual machine does not always require an object's age to reach the promotion threshold before promoting it. If the total size of all objects of the same age in the Survivor space exceeds half the size of the Survivor space, objects of that age or older can enter the old generation directly, without waiting for their age counters to reach the promotion threshold.
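The rule above can be written as a small function. The sketch follows the rule exactly as stated in the text; DynamicAgeDemo, effectiveThreshold and the sizes used are hypothetical, and a real virtual machine may compute the value differently (for example by accumulating sizes across ages).

```java
class DynamicAgeDemo {
    // sizeByAge[a] is the total size of Survivor objects whose age is a (index 0 unused).
    // Returns the age from which objects are promoted in the next Minor GC.
    static int effectiveThreshold(long[] sizeByAge, long survivorCapacity, int maxThreshold) {
        for (int age = 1; age < sizeByAge.length; age++) {
            // Objects of this one age occupy more than half of the Survivor space.
            if (sizeByAge[age] * 2 > survivorCapacity) {
                return Math.min(age, maxThreshold);
            }
        }
        return maxThreshold;                // no age dominates: keep the configured threshold
    }

    public static void main(String[] args) {
        long[] crowded = { 0, 10, 20, 60, 5 };   // age 3 alone takes 60 of 100 units
        long[] sparse = { 0, 10, 20, 30, 5 };
        System.out.println(effectiveThreshold(crowded, 100, 15)); // prints 3
        System.out.println(effectiveThreshold(sparse, 100, 15));  // prints 15
    }
}
```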

Summary

I had originally intended to cover memory allocation and reclamation strategies and GC log analysis here as well, but after a glance at the length of this chapter it already feels a bit too long, ha ha ha ... so I will not write them here.

Origin juejin.im/post/5e81e0075188257382097586