A Talk on Java's GC Mechanism

By someone Valar
If you reprint this article, please keep the original link.
Some images come from Baidu; if any of them infringe your rights, please contact me and I will remove them.

Contents

  • What is GC
  • JVM memory structure in brief
  • Reachability analysis and GC Roots
  • Common garbage collection algorithms

1. What is GC

GC stands for garbage collection (Garbage Collection). In computing, it means releasing dynamically allocated memory once it is no longer needed, so that the space can be reused. This form of storage resource management is known as garbage collection.

Some languages have no garbage collector. In C and C++, for example, programmers must release the memory of unneeded variables themselves.

Other languages, such as Java and C#, support garbage collection. When memory runs short, the Java Virtual Machine (JVM) or the .NET CLR automatically reclaims the memory occupied by useless objects (objects that are no longer referenced).

This article focuses on GC in Java.

As mentioned above, the JVM automatically cleans up useless objects, which raises three questions:

  • In which area of memory does the JVM clean up objects?
  • Which objects get cleaned up, and why is A cleaned up but not B?
  • How does the JVM clean them up?

The next three sections answer these three questions in turn.

2. JVM memory structure in brief

As we all know, Java code runs on a virtual machine. While executing a Java program, the virtual machine divides the memory it manages into several data areas, each with its own purpose. The "Java Virtual Machine Specification (Java SE 8)" describes the following runtime memory areas:

These areas are defined by the Java Virtual Machine Specification. Individual virtual machine implementations may differ in detail, but they generally follow the specification.

  • Method area: stores the class information loaded by the virtual machine, constants, and static variables.
  • Heap: the largest block of memory managed by the Java virtual machine. Its only purpose is to store object instances. The virtual machine stack holds only references, and those references point to objects in the heap. GC mainly works in this area.
  • VM stack: holds the local variable table, the operand stack, and so on. The virtual machine stack describes the memory model of Java method execution: each method invocation creates a stack frame (Stack Frame) that stores the local variable table, operand stack, dynamic link, method return address, and other information.
  • Native method stack: similar to the virtual machine stack, but it serves native methods.
  • Program counter: records how far the current thread has progressed in the method it is executing. If the thread is executing a Java method, the counter holds the address of the bytecode instruction being executed. If it is executing a native method, the counter's value is undefined.
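
To see how a concrete JVM carves up these areas, the standard java.lang.management API can list the memory pools of the running virtual machine. This is a minimal sketch: the class name MemoryAreas is made up for illustration, and the pool names it prints (Eden, Metaspace, and so on) depend on the JVM and on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original article.
class MemoryAreas {
    /** Returns one line per memory pool reported by the running JVM, prefixed with HEAP or NON_HEAP. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() is HEAP for pools inside the Java heap and NON_HEAP for
            // areas such as the method area implementation or the code cache.
            lines.add(pool.getType().name() + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

The heap pools are where the collector does most of its work; the non-heap pools correspond roughly to the method area and other JVM-internal regions.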

3. Reachability analysis and GC Roots

3.1 Reachability analysis

Java uses reachability analysis to decide whether an object is "garbage".

The basic idea is to start from a set of objects called "GC Roots" and search along their references. If no path of references connects the GC Roots to an object, that object is unreachable. Note, however, that an object judged unreachable does not necessarily become collectable right away. An unreachable object must go through at least two marking passes before it is collected; if it still has not escaped after both passes, it is basically certain to be collected.

Note that, in essence, the collector identifies all live objects and treats the rest of the space as "useless". It does not find all the dead objects and reclaim the space they occupy.

In the figure below, objects 5, 6, and 7 are not referenced from any GC Root. Because the GC Roots cannot reach them, they are judged unreachable.
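
The search described above can be sketched as a plain graph traversal. This is an illustration only, assuming objects are identified by integers and references are stored in a map; the class Reachability and the sample graph are hypothetical and do not reflect how a real JVM represents the heap.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: object ids are integers, refs maps an id to the ids it references.
class Reachability {
    /** Returns every object id reachable from the given roots. */
    static Set<Integer> reachable(Set<Integer> roots, Map<Integer, List<Integer>> refs) {
        Set<Integer> marked = new HashSet<>(roots);
        Deque<Integer> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            int current = pending.pop();
            for (int next : refs.getOrDefault(current, List.of())) {
                // add() returns false if the object was already marked,
                // so cycles in the graph do not cause an endless loop.
                if (marked.add(next)) {
                    pending.push(next);
                }
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        // Assumed sample graph: root 0 references 1 and 2, 2 references 3 and 4.
        // Object 5 references 6 and 7, but nothing reachable from the root references 5.
        Map<Integer, List<Integer>> refs = Map.of(
                0, List.of(1, 2),
                2, List.of(3, 4),
                5, List.of(6, 7));
        System.out.println(reachable(Set.of(0), refs));
    }
}
```

In this sample graph objects 5, 6, and 7 reference one another, yet none of them ends up in the reachable set, because references between unreachable objects do not count.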

3.2 GC Roots

Many claims circulate online about which objects can serve as GC Roots. Some are not authoritative enough and others are incomplete. I eventually found an official Eclipse document, Garbage Collection Roots, with the following description (the notes in parentheses are my own rough translation; please point out any inaccuracies):

A garbage collection root is an object that is accessible from outside the heap. The following reasons make an object a GC root:

1. System Class (a system class loaded by the bootstrap or system class loader)
    Class loaded by bootstrap/system class loader. For example, everything from the rt.jar like java.util.* .

2. JNI Local (a local variable in user-defined JNI code or JVM internal code)
    Local variable in native code, such as user defined JNI code or JVM internal code.

3. JNI Global (a global variable in JNI code)
    Global variable in native code, such as user defined JNI code or JVM internal code.

4. Thread Block (an object referenced from an active thread block)
    Object referred to from a currently active thread block.

5. Thread (a running thread)
    A started, but not stopped, thread.

6. Busy Monitor (an object used as a monitor by a waiting or synchronized thread)
    Everything that has called wait() or notify() or that is synchronized.
    For example, by calling synchronized(Object) or by entering a synchronized method.
    Static method means class, non-static method means object.

7. Java Local (a method parameter or locally created object still on a thread's stack)
    Local variable.
    For example, input parameters or locally created objects of methods that are still in the stack of a thread.

8. Native Stack (in or out parameters on the native method stack, for example parameters of file/network I/O methods or reflection)
    In or out parameters in native code, such as user defined JNI code or JVM internal code.
    This is often the case as many methods have native parts and the objects handled as method parameters become GC roots.
    For example, parameters used for file/network I/O methods or reflection.

9. Finalizable (an object in the finalizer queue)
    An object which is in a queue awaiting its finalizer to be run.

10. Unfinalized (an object that overrides finalize but has not yet been placed on the finalizer queue)
    An object which has a finalize method, but has not been finalized and is not yet on the finalizer queue.

11. Unreachable (an object unreachable from any other root, but marked as a root by the Memory Analyzer Tool so that it can be included in the analysis)
    An object which is unreachable from any other root,
    but has been marked as a root by MAT to retain objects which otherwise would not be included in the analysis.

12. Java Stack Frame
    A Java stack frame, holding local variables.
    Only generated when the dump is parsed with the preference set to treat Java stack frames as objects.

13. Unknown
    An object of unknown root type. Some dumps, such as IBM Portable Heap Dump files, do not have root information.
    For these dumps the MAT parser marks objects
    which have no inbound references or are unreachable from any other root as roots of this type.
    This ensures that MAT retains all the objects in the dump.

Here is a brief summary:

  • Objects loaded by the system class loader (system class loader)
  • Live threads, including threads that are waiting or blocked
  • Parameters and local variables of the methods currently being called (Java methods and native methods)
  • Objects referenced by static variables and constants in the method area
  • Held by JVM - objects that the JVM keeps away from GC for its own special purposes; which objects these are depends on the JVM implementation. Known cases may include the system class loader, some important exception classes known to the JVM, some objects pre-allocated for exception handling, some custom class loaders, and the like.
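
The effect of a root can be observed with a WeakReference, which does not keep its referent alive by itself. This is a minimal sketch with a hypothetical class GcRootDemo; it checks only the one thing that is guaranteed, namely that an object referenced from a static field is not collected. Once the field is cleared the object becomes eligible for collection, but when that happens is up to the JVM, because System.gc() is only a request.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of the original article.
class GcRootDemo {
    // A static field of a loaded class: objects it references are reachable from a root.
    static Object staticRoot;

    /** Returns true if the object is still alive after a GC request while the static field refers to it. */
    static boolean survivesWhileRooted() {
        staticRoot = new Object();
        WeakReference<Object> probe = new WeakReference<>(staticRoot);
        System.gc(); // only a hint; the JVM may or may not collect now
        boolean alive = probe.get() != null; // strongly reachable objects are never cleared
        staticRoot = null; // from here on the object is eligible for collection
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("alive while rooted: " + survivesWhileRooted());
    }
}
```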

4. Common garbage collection algorithms

4.1 Mark-Sweep algorithm

This is the most basic garbage collection algorithm, because it is the easiest to implement and the simplest in concept. Mark-Sweep has two phases: a mark phase and a sweep phase. The mark phase marks all the objects that need to be collected, and the sweep phase reclaims the space occupied by the marked objects. The figure below shows the process:

Advantages and disadvantages:

  • As the figure shows, Mark-Sweep is easy to implement.
  • But it has a serious problem: it tends to produce memory fragmentation. With too many fragments, a later allocation of a large object may fail to find enough contiguous space, which triggers another garbage collection prematurely.
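
The fragmentation problem is easy to reproduce on a toy heap. This is a sketch under stated assumptions: the heap is an int array in which each slot holds an object id and 0 means free, and the set of marked ids is taken as given. Here the marked set holds the live objects, so the sweep frees everything not in it. The class MarkSweep and its methods are made up for illustration.

```java
import java.util.Arrays;
import java.util.Set;

// Hypothetical toy model: one heap slot per object, 0 means a free slot.
class MarkSweep {
    static final int FREE = 0;

    /** Sweep phase: frees every slot whose object id is not in the set of live (marked) ids. */
    static int[] sweep(int[] heap, Set<Integer> marked) {
        int[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] != FREE && !marked.contains(after[i])) {
                after[i] = FREE;
            }
        }
        return after;
    }

    /** Length of the longest run of adjacent free slots, i.e. the largest object that still fits. */
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == FREE) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6, 7, 8};
        int[] after = sweep(heap, Set.of(1, 3, 5, 7)); // ids assumed to be live
        System.out.println(Arrays.toString(after));
        System.out.println("largest free run: " + largestFreeRun(after));
    }
}
```

In this run half of the heap is free after the sweep, yet no two free slots are adjacent, so an object that needs two slots cannot be allocated without another collection.
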

4.2 Copying algorithm

The Copying algorithm was proposed to address the shortcoming of Mark-Sweep. It divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is used up, the objects that are still alive are copied to the other block, and the used block is then cleaned up in one pass, so memory fragmentation does not easily arise. The figure below shows the process:

Advantages and disadvantages:

  • The algorithm is simple and efficient, and it does not easily produce memory fragmentation.
  • However, it pays a high price in memory usage, because the usable memory is cut in half.
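
The copy step can be sketched on the same kind of toy heap. This is an illustration only, assuming one int slot per object with 0 meaning free; the class SemiSpaceCopy is hypothetical.

```java
import java.util.Arrays;
import java.util.Set;

// Hypothetical toy model: one heap slot per object, 0 means a free slot.
class SemiSpaceCopy {
    static final int FREE = 0;

    /** Copies the live objects of fromSpace into a new, equally sized toSpace, packed from index 0. */
    static int[] copyLive(int[] fromSpace, Set<Integer> live) {
        int[] toSpace = new int[fromSpace.length]; // all slots start out as FREE (0)
        int next = 0;
        for (int id : fromSpace) {
            if (id != FREE && live.contains(id)) {
                toSpace[next++] = id;
            }
        }
        return toSpace;
    }

    public static void main(String[] args) {
        int[] toSpace = copyLive(new int[] {1, 2, 3, 4}, Set.of(1, 3)); // ids assumed to be live
        System.out.println(Arrays.toString(toSpace));
    }
}
```

After the copy, the whole from-space can be treated as free at once, and the free part of the to-space is one contiguous run. The cost is visible in the sketch too: two arrays are needed to hold what one array's worth of objects could occupy.
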

4.3 Mark-Compact algorithm

The mark phase of this algorithm is the same as in Mark-Sweep. After marking, however, it does not reclaim the dead objects directly. Instead it moves all live objects toward one end and then clears the memory beyond the boundary. The figure below shows the process:

Advantages and disadvantages:

  • Mark-Compact builds on Mark-Sweep and additionally moves objects, so it costs more.
  • But it solves the problem of memory fragmentation.
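
A sliding compaction can be sketched in place on the toy heap. As before this is only an illustration: one int slot per object, 0 for a free slot, the set of live ids taken as given, and a hypothetical class MarkCompact.

```java
import java.util.Arrays;
import java.util.Set;

// Hypothetical toy model: one heap slot per object, 0 means a free slot.
class MarkCompact {
    static final int FREE = 0;

    /**
     * Slides the live objects toward index 0 in place, frees everything past them,
     * and returns the boundary (the index of the first free slot).
     */
    static int compact(int[] heap, Set<Integer> marked) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            int id = heap[i];
            if (id != FREE && marked.contains(id)) {
                // boundary never exceeds i, so no unread slot is overwritten
                heap[boundary++] = id;
            }
        }
        Arrays.fill(heap, boundary, heap.length, FREE);
        return boundary;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6, 7, 8};
        int boundary = compact(heap, Set.of(1, 3, 5, 7)); // ids assumed to be live
        System.out.println(Arrays.toString(heap) + " boundary=" + boundary);
    }
}
```

Unlike the copying sketch, no second block is needed, but every surviving object may have to be moved. A real collector would also have to update the references that point at the moved objects.
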

4.4 Generational Collection algorithm

Generational collection is the approach used by most JVM garbage collectors today. Its core idea is to divide memory into several regions according to how long objects live. Typically the heap is divided into the old generation (Tenured Generation) and the young generation (Young Generation). In HotSpot, the designers also brought the method area into generational collection and gave it a name: the permanent generation (PermGen Space).

  • Old generation: only a few objects need to be collected in each garbage collection, so Mark-Compact or Mark-Sweep is generally used.
  • Young generation: a large number of objects need to be collected in each garbage collection, so the young generation uses the Copying algorithm.

Because most objects in the young generation are collected in each garbage collection, few objects need to be copied. In practice, though, the young generation is not split 1:1. It is generally divided into one larger Eden space and two smaller Survivor spaces. (Eden is named after the Garden of Eden, where Adam and Eve ate the forbidden fruit and had children; it is a fitting image for the area where objects are first allocated.) At any time, Eden and one of the Survivor spaces are in use. During a collection, the objects still alive in Eden and in that Survivor space are copied to the other Survivor space, and then Eden and the Survivor space just used are cleaned up. The usual split is 80% Eden, 10% Survivor 1, and 10% Survivor 2.
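
The arithmetic behind the 80/10/10 split can be written out as follows. This is a simplified sketch with a hypothetical class YoungGenLayout; the parameter survivorRatio is modeled on HotSpot's -XX:SurvivorRatio option, whose default of 8 means Eden is eight times the size of one Survivor space. A real JVM also aligns and may adapt these sizes, which the sketch ignores.

```java
// Hypothetical helper class, not part of the original article.
class YoungGenLayout {
    /** Splits the young generation into {eden, survivor1, survivor2} with eden : survivor = survivorRatio : 1. */
    static long[] split(long youngBytes, int survivorRatio) {
        long survivor = youngBytes / (survivorRatio + 2); // eden counts as survivorRatio parts, plus two survivors
        long eden = youngBytes - 2 * survivor;
        return new long[] {eden, survivor, survivor};
    }

    /** Space available for allocation at any time: Eden plus the one Survivor space in use. */
    static long usableBytes(long youngBytes, int survivorRatio) {
        long[] parts = split(youngBytes, survivorRatio);
        return parts[0] + parts[1];
    }

    public static void main(String[] args) {
        long young = 100L * 1024 * 1024; // assumed 100 MB young generation
        long[] parts = split(young, 8);
        System.out.println("eden=" + parts[0] + " survivor=" + parts[1]
                + " usable=" + usableBytes(young, 8));
    }
}
```

With a ratio of 8, 90% of the young generation is usable at any time, compared with 50% for a plain 1:1 copying scheme. Only the 10% held by the idle Survivor space is reserved.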

  • Permanent generation: the relationship between the method area and the permanent generation is much like that between an interface and a class in Java. A class implements an interface, and the permanent generation is the HotSpot virtual machine's implementation of the method area defined in the virtual machine specification.

Garbage collection in this area generally has a low payoff, because it mainly reclaims two things: obsolete constants and useless classes. Reclaiming obsolete constants works much like reclaiming objects in the other generations, but a class is judged useless only when all of the following conditions hold:

  1. All instances of the class have been collected; that is, no instance of the class exists in the Java heap.
  2. The Class object corresponding to the class is not referenced anywhere (that is, the class's methods cannot be accessed through reflection anywhere).
  3. The ClassLoader that loaded the class has been collected.

Even when all of these conditions hold, the class is not necessarily collected. HotSpot also provides the -Xnoclassgc parameter, which turns off garbage collection of classes.

Reference: cloud.tencent.com/developer/a...

Addendum: since JDK 1.8 the HotSpot virtual machine has removed the permanent generation and replaced it with the metaspace (Metaspace), which stores class metadata. The metaspace is not covered further here; interested readers can look into it on their own.

5. Conclusion

That covers the basic concepts of Java GC, the JVM memory structure, and the basic mechanisms of garbage collection. If you have questions or find an error in the article, please leave a comment below. Java's garbage collectors and the metaspace are not described in detail here; I will cover them separately when I have time.

Origin juejin.im/post/5df2ec71f265da33d7442433