JVM [02]: Understanding JVM garbage collection algorithms and collectors

1. Determining whether an object is alive

1.1. Reference counting algorithm (Reference Counting)

  • A reference counter is added to each object. Whenever something references the object, the counter is incremented; whenever a reference becomes invalid, the counter is decremented. An object whose counter is 0 at any moment can no longer be used.
  • Mainstream JVMs do not use reference counting to manage memory, mainly because it has difficulty handling objects that reference each other in a cycle.
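
The cycle problem can be illustrated with a toy model. This is only a sketch: the Node class and its refCount field are hypothetical and exist purely for illustration, since the JVM exposes no such counter.

```java
// Toy model only: a hypothetical per-object counter, not how HotSpot works.
class RefCountCycleDemo {
    static final class Node {
        int refCount;   // number of references currently pointing at this node
        Node field;     // the single outgoing reference this node holds

        // Points this.field at target and updates both counters.
        void setField(Node target) {
            if (field != null) field.refCount--;
            field = target;
            if (target != null) target.refCount++;
        }
    }

    // Builds a <-> b, then drops the two simulated local-variable references.
    static int[] countsAfterDroppingLocals() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++;    // simulated local variable referencing a
        b.refCount++;    // simulated local variable referencing b
        a.setField(b);   // a.field -> b
        b.setField(a);   // b.field -> a
        a.refCount--;    // the local variable for a becomes invalid
        b.refCount--;    // the local variable for b becomes invalid
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingLocals();
        // Prints "a=1, b=1": neither counter reaches 0, so a pure
        // reference counter would never reclaim the pair.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);
    }
}
```

Both objects are unusable by the program at the end, yet each still holds a count of 1 because of the other.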

1.2. Reachability analysis algorithm (Reachability Analysis)

  • A set of objects called "GC Roots" serves as the starting points. The search proceeds downward from these nodes, and the path it travels is called a reference chain (Reference Chain). When no reference chain connects an object to any GC Root (in graph-theory terms, the object is unreachable from the GC Roots), the object is proven to be unusable.
  • The following can serve as GC Roots:

    Objects referenced from the VM stack (the local variable table in a stack frame)

    Objects referenced by static class fields in the method area

    Objects referenced by constants in the method area

    Objects referenced by JNI (native methods) in the native method stack
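
Reachability analysis is a graph traversal, which a small sketch can make concrete. The object names ("root", "a", "b", "c", "d") and the map-based heap are assumptions made for illustration; a real collector walks actual object headers and reference fields.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    // Returns every object id reachable from the roots by following references.
    static Set<String> reachable(List<String> roots, Map<String, List<String>> refs) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (!marked.add(current)) continue; // already visited
            for (String next : refs.getOrDefault(current, Collections.<String>emptyList())) {
                pending.push(next);
            }
        }
        return marked;
    }

    // A hypothetical heap: root -> a -> b, plus an isolated cycle c <-> d.
    static Map<String, List<String>> sampleHeap() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("a"));
        refs.put("a", Arrays.asList("b"));
        refs.put("c", Arrays.asList("d"));
        refs.put("d", Arrays.asList("c"));
        return refs;
    }

    public static void main(String[] args) {
        Set<String> live = reachable(Arrays.asList("root"), sampleHeap());
        System.out.println("b reachable: " + live.contains("b")); // true
        System.out.println("c reachable: " + live.contains("c")); // false
    }
}
```

Unlike reference counting, the cycle between c and d causes no trouble here: no root leads to it, so both objects are simply never marked.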


1.3. Reference types (Reference)

  • Strong references: Strong Reference

    References of the kind Object object = new Object(). As long as a strong reference exists, the garbage collector will never reclaim the referenced object.

  • Soft references: Soft Reference

    Used to describe objects that are useful but not essential. The JDK provides the SoftReference class to implement soft references.

    Before the system is about to throw an out-of-memory error, these objects are included in the reclamation scope for a second collection. If that collection still does not free enough memory, the out-of-memory error is thrown.

  • Weak references: Weak Reference

    Also used to describe non-essential objects, but weaker than soft references: an object associated only with weak references survives only until the next garbage collection. The JDK provides the WeakReference class to implement weak references.

    When the garbage collector runs, objects associated only with weak references are reclaimed regardless of whether memory is currently sufficient.

  • Phantom references: Phantom Reference

    Also called ghost references, this is the weakest reference relationship. Whether an object has a phantom reference has no effect on its lifetime, and an object instance cannot be obtained through a phantom reference. The JDK provides the PhantomReference class to implement phantom references.

    The sole purpose of associating a phantom reference with an object is to receive a system notification when the object is reclaimed by the collector.
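
A minimal sketch of the java.lang.ref classes named above. The helper method names are made up for this example. Only the first two checks are deterministic; whether a weak reference is cleared after System.gc() depends on the collector actually running, so that result is printed without a claimed value.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceTypesDemo {
    // While a strong reference exists, soft and weak references still return the object.
    static boolean softAndWeakSeeStrongReferent() {
        Object strong = new Object();
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        return soft.get() == strong && weak.get() == strong;
    }

    // PhantomReference.get() always returns null, so the instance cannot be obtained.
    static boolean phantomGetIsNull() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        System.out.println(softAndWeakSeeStrongReferent()); // true
        System.out.println(phantomGetIsNull());             // true

        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        strong = null; // only the weak reference remains
        System.gc();   // a request only; the JVM may or may not collect now
        System.out.println("weak cleared after GC request: " + (weak.get() == null));
    }
}
```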


1.4. To live or to die: finalize()

  • An unreachable object is not dead immediately; it is temporarily in a "probation" stage. To truly declare an object dead, it must go through at least two marking passes:

    If reachability analysis finds that the object has no reference chain connecting it to the GC Roots, it is marked for the first time and then screened. The screening condition is whether it is necessary to execute the object's finalize() method. If the object does not override finalize(), or finalize() has already been invoked by the virtual machine, the virtual machine treats both cases as "not necessary to execute".

    The finalize() method is the object's last chance to escape death. Later, the GC performs a second, small-scale marking of the objects in the F-Queue. If an object rescues itself in finalize(), which only requires re-attaching itself to any object on a reference chain, for example by assigning itself (the this keyword) to a class variable or to a member variable of some object, then it is removed from the "about to be reclaimed" set during the second marking. If the object has not escaped by then, it is essentially reclaimed for real.

    Note: If the object is judged to need finalize() executed, it is placed in a queue called the F-Queue, and a low-priority Finalizer thread created by the JVM executes it later. The JVM triggers the method but does not promise to wait for it to finish, because if an object's finalize() runs slowly or loops forever, other objects in the F-Queue may wait forever, and the whole memory reclamation system may even crash.


2. Garbage collection algorithms

2.1 Mark-sweep algorithm (Mark-Sweep)

  • The algorithm has two phases, mark and sweep:

    1. Mark all objects that need to be reclaimed
    2. After marking completes, reclaim all marked objects together
  • The algorithm's main shortcomings:

    1. Efficiency: neither the marking nor the sweeping process is very efficient
    2. Space: after marking and sweeping, a large number of non-contiguous memory fragments is left behind

    Too much fragmentation means that later, when a large object must be allocated, the program may fail to find enough contiguous memory and have to trigger another garbage collection early.
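
The fragmentation problem can be seen in a toy heap. This is a sketch under simplifying assumptions: the heap is an array of equally sized slots, objects are plain strings, and the set of objects marked for reclamation is given rather than computed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkSweepDemo {
    // Sweep: free every slot whose object was marked for reclamation. Nothing moves.
    static void sweep(String[] heap, Set<String> markedForReclaim) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && markedForReclaim.contains(heap[i])) heap[i] = null;
        }
    }

    // Total number of free slots.
    static int totalFree(String[] heap) {
        int free = 0;
        for (String slot : heap) if (slot == null) free++;
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0, run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e", "f", "g", "h" };
        Set<String> marked = new HashSet<>(Arrays.asList("b", "d", "f", "h"));
        sweep(heap, marked);
        System.out.println("free slots: " + totalFree(heap));            // 4
        System.out.println("largest free run: " + largestFreeRun(heap)); // 1
    }
}
```

Half of the heap is free after the sweep, but an object needing two adjacent slots still cannot be allocated.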


2.2 Copying algorithm (Copying)

  • Available memory is divided by capacity into two equally sized halves, and only one half is used at a time. When the current half runs out, the surviving objects are copied to the other half, and the used half is then cleaned up in a single pass. Each collection therefore reclaims an entire half-region, and allocation need not consider complications such as memory fragmentation, so the algorithm is simple to implement and efficient to run. The cost is that usable memory is cut in half.
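
A minimal sketch of one copying collection, assuming an array-of-slots heap and a precomputed set of live objects (both are illustrative simplifications, not how a JVM represents memory):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingDemo {
    // Copies surviving objects from fromSpace into a fresh toSpace of equal
    // size, packing them from index 0. The old half is discarded wholesale.
    static String[] copySurvivors(String[] fromSpace, Set<String> live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0; // bump pointer into to-space
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) toSpace[next++] = obj;
        }
        return toSpace;
    }

    public static void main(String[] args) {
        String[] fromSpace = { "a", "b", "c", "d" };
        Set<String> live = new HashSet<>(Arrays.asList("b", "d"));
        String[] toSpace = copySurvivors(fromSpace, live);
        System.out.println(Arrays.toString(toSpace)); // [b, d, null, null]
    }
}
```

Survivors end up contiguous, so later allocation only has to advance a pointer through the remaining free slots.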

2.3 Mark-compact algorithm (Mark-Compact)

  • After marking, recyclable objects are not cleaned up directly. Instead, all surviving objects are moved toward one end, and the memory beyond the end boundary is then cleaned up directly.
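
The compaction step can be sketched in place on the same kind of toy heap. The slot array and the live set are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactDemo {
    // Slides surviving objects toward index 0, clears everything past the
    // boundary, and returns the boundary (index of the first free slot).
    static int compact(String[] heap, Set<String> live) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            if (obj != null && live.contains(obj)) heap[boundary++] = obj;
        }
        for (int i = boundary; i < heap.length; i++) heap[i] = null;
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e" };
        Set<String> live = new HashSet<>(Arrays.asList("a", "c", "e"));
        int boundary = compact(heap, live);
        System.out.println(boundary);              // 3
        System.out.println(Arrays.toString(heap)); // [a, c, e, null, null]
    }
}
```

Unlike copying, no second half of memory is needed, at the price of moving objects within the same space.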

2.4. Generational collection algorithm (Generational Collection)

  • The JVM heap is divided into a new generation and an old generation, and a different collection algorithm is used for each.

    In the new generation, every collection finds that a large number of objects have died and only a few survive, so the copying algorithm is chosen.

    In the old generation, object survival rates are high and there is no extra space to act as an allocation guarantee, so the "mark-sweep" or "mark-compact" algorithm must be used.


3. HotSpot algorithm implementation

3.1. Root enumeration

  • Reachability analysis finds reference chains starting from the GC Roots nodes. In many applications the method area alone is now hundreds of megabytes, so checking the references inside it one by one is time-consuming.
  • The sensitivity of reachability analysis to execution time also shows up in GC pauses, because the analysis must run against a snapshot that guarantees consistency.

    Consistency here means that, for the whole duration of the analysis, the executing system appears frozen at a single point in time. If this does not hold, the accuracy of the analysis cannot be guaranteed. This is an important reason why all Java threads must be stopped while GC runs. Even in the CMS collector, which claims to (almost) never pause, root enumeration requires a pause.

  • Mainstream JVMs use exact GC, so when the system pauses, the JVM does not need to check every execution context and global reference location without exception; it has a way to know directly where object references are stored. HotSpot achieves this with a set of data structures called OopMap.

    Once class loading completes, HotSpot computes what type of data sits at each offset within an object. During JIT compilation, it also records, at specific locations, which positions on the stack and in registers hold references. The GC can read this information directly when it scans.


3.2. Safepoint

  • HotSpot records which stack and register positions hold references only at "specific locations", and these locations are called safepoints. A program cannot stop anywhere during execution to start GC; it can pause only when it reaches a safepoint.

  • Safepoints can be neither too few nor too many: too few makes the GC wait too long, and too many increases the runtime load. Safepoints are therefore chosen by whether the code "has characteristics that let the program execute for a long time".

    Each instruction executes very quickly, so a program is unlikely to run for a long time just because its instruction stream is long. The most obvious characteristic of long execution is instruction sequence reuse, such as method calls, loop jumps, and exception jumps.

  • There are two options for making all threads run to a safepoint and pause when GC occurs:

    Preemptive suspension (Preemptive Suspension): when GC occurs, all threads are interrupted first. If a thread is found to have stopped somewhere other than a safepoint, it is resumed so that it can run to a safepoint.

    Voluntary suspension (Voluntary Suspension): when the GC needs to interrupt threads, it does not operate on them directly. It simply sets a flag, and each thread actively polls the flag during execution and suspends itself when it finds the flag set. Polling points coincide with safepoints, plus the places where objects are created and memory must be allocated.
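
Voluntary suspension can be imitated at the application level. This is only an analogy under stated assumptions: the suspendRequested flag and the worker loop are hypothetical, the worker exits its loop instead of parking, and HotSpot's real polling mechanism is internal to the VM.

```java
import java.util.concurrent.CountDownLatch;

class SafepointPollDemo {
    // Hypothetical flag that the "collector" sets; volatile so the worker sees it.
    private static volatile boolean suspendRequested;

    // Starts a worker, requests suspension, and reports whether the worker
    // noticed the flag at a poll point and stopped on its own.
    static boolean workerStopsAtPoll() {
        suspendRequested = false;
        CountDownLatch running = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            running.countDown();
            long work = 0;
            while (true) {
                work++;                      // stands in for user code
                if (suspendRequested) break; // poll at the loop back-edge
            }
        });
        worker.start();
        try {
            running.await();          // make sure the worker is executing
            suspendRequested = true;  // the collector only sets a flag
            worker.join(5000);        // the worker stops itself at its next poll
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            suspendRequested = true;  // let the worker exit before giving up
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped at poll: " + workerStopsAtPoll());
    }
}
```

The requesting thread never touches the worker directly; it sets the flag and waits, which is the essence of the voluntary scheme.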


3.3. Safe Region

  • A Safe Region is a code fragment in which reference relationships do not change. Starting GC anywhere in this region is safe.
  • When a thread reaches code in a Safe Region, it first marks itself as having entered the Safe Region. If the JVM starts a GC during this time, it can ignore threads that have marked themselves as being in a Safe Region. When a thread is about to leave the Safe Region, it checks whether the system has finished root enumeration (or the whole GC). If so, the thread continues; otherwise it must wait until it receives a signal that it is safe to leave the Safe Region.

4. Garbage collectors

4.1. Serial collector

  • A single-threaded garbage collector: it uses only one CPU or one collection thread to do the collection work.
  • While it collects garbage, all other worker threads must be suspended until the collection ends.
  • Suited to Client mode.
  • In the new generation it uses the copying algorithm and suspends all threads; in the old generation it uses the mark-compact algorithm and suspends all threads.

4.2. ParNew collector

  • The multi-threaded version of Serial. Apart from using multiple threads for garbage collection, its other behavior, including the control parameters available to the Serial collector, the collection algorithm, Stop The World, object allocation rules, and reclamation strategy, is exactly the same as the Serial collector's.

  • Apart from the Serial collector, it is currently the only new-generation collector that works with the CMS collector.

    ParNew is also the default new-generation collector once -XX:+UseConcMarkSweepGC is set; the -XX:+UseParNewGC option can be used to select it explicitly.

    On a single CPU, ParNew performs no better than the Serial collector.

    The -XX:ParallelGCThreads parameter limits the number of garbage collection threads.


4.3. Parallel Scavenge collector

  • A new-generation collector that uses the copying algorithm and is also a parallel, multi-threaded collector.

  • The goal of the Parallel Scavenge collector is to reach a controllable throughput.

    Throughput = time running user code / (time running user code + garbage collection time)

  • It provides two parameters to control throughput precisely: -XX:MaxGCPauseMillis, which controls the maximum garbage collection pause time, and -XX:GCTimeRatio, which sets the throughput directly.

    -XX:MaxGCPauseMillis: the allowed value is a number of milliseconds greater than 0. The collector does its best to keep the time spent reclaiming memory under the set value. A shorter GC pause is bought at the expense of throughput and new-generation space.

    -XX:GCTimeRatio: the value is an integer greater than 0 and less than 100 that determines the share of total time allowed for garbage collection. For example, with 19 the maximum allowed GC time is 1 / (1 + 19), or 5%; with 99 it is 1 / (1 + 99), or 1%.

  • Parallel Scavenge parameter: -XX:+UseAdaptiveSizePolicy

    With -XX:+UseAdaptiveSizePolicy turned on, there is no need to manually specify detail parameters such as the size of the new generation, the ratio of Eden to the Survivor areas, or the size of objects promoted to the old generation. The virtual machine collects performance monitoring data from the running system and dynamically adjusts these parameters to provide the most suitable pause time or the maximum throughput. This adjustment is called GC adaptive tuning (GC Ergonomics).
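
The throughput formula and the -XX:GCTimeRatio arithmetic from this section can be checked with a few lines. The method names are made up for this sketch, and the range check mirrors the "greater than 0 and less than 100" rule stated above.

```java
class GcTimeRatioDemo {
    // Maximum fraction of total time allowed for GC: 1 / (1 + ratio).
    static double maxGcFraction(int gcTimeRatio) {
        if (gcTimeRatio <= 0 || gcTimeRatio >= 100) {
            throw new IllegalArgumentException("ratio must be in 1..99: " + gcTimeRatio);
        }
        return 1.0 / (1 + gcTimeRatio);
    }

    // Throughput = user time / (user time + GC time).
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    public static void main(String[] args) {
        System.out.printf("ratio 19 -> max GC share %.2f%n", maxGcFraction(19)); // 0.05
        System.out.printf("ratio 99 -> max GC share %.2f%n", maxGcFraction(99)); // 0.01
        System.out.printf("99 ms user, 1 ms GC -> throughput %.2f%n", throughput(99, 1)); // 0.99
    }
}
```

A ratio of 99 thus corresponds to a throughput target of 99%, which matches the example of 99 ms of user code per 1 ms of collection.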


4.4. Serial Old collector

  • The old-generation version of the Serial collector: single-threaded, using the "mark-compact" algorithm.
  • Used as the fallback for the CMS collector when a Concurrent Mode Failure occurs during concurrent collection.

4.5. Parallel Old collector

  • The old-generation version of the Parallel Scavenge collector, using multiple threads and the "mark-compact" algorithm.
  • Where throughput matters and CPU resources are a sensitive concern, the Parallel Scavenge + Parallel Old combination deserves first consideration.

4.6 CMS (Concurrent Mark Sweep) collector

  • CMS (Concurrent Mark Sweep) is a collector whose goal is the shortest collection pause time. It suits the server side of Internet sites and B/S systems. It collects concurrently with low pauses.

  • CMS is based on the "mark-sweep" algorithm, and its process has four steps:

    Initial mark (CMS initial mark): only marks the objects that GC Roots can reach directly; very fast.

    Concurrent mark (CMS concurrent mark): the GC Roots tracing process.

    Remark (CMS remark): corrects the mark records of objects whose marks changed because the user program kept running during the concurrent mark. This pause is generally slightly longer than the initial mark but much shorter than the concurrent mark.

    Concurrent sweep (CMS concurrent sweep)

    Of these, the initial mark and the remark still need to "stop the world".

  • CMS has a few drawbacks:

    Very sensitive to CPU resources: although it does not pause user threads, its own threads consume CPU resources, which slows the application and lowers total throughput. By default CMS starts (number of CPUs + 3) / 4 collection threads. In other words, the fewer the CPUs, the larger the share they take and the greater the impact on the program. In response, the JVM provides an "incremental concurrent collector" (Incremental Concurrent Mark Sweep / i-CMS), which imitates preemptive multitasking: during concurrent mark and sweep, GC threads and user threads run alternately. This minimizes the time GC threads monopolize resources, so the whole collection takes longer but the impact on the user program is smaller.

    CMS cannot handle "floating garbage" (Floating Garbage), and a "Concurrent Mode Failure" may occur and cause another Full GC. Floating garbage is garbage that user threads keep producing while the CMS concurrent sweep is running. It appears after marking has finished, so CMS cannot deal with it in the current collection. Because user threads are still running, part of the memory must be set aside for them, so CMS lets you set the trigger percentage with -XX:CMSInitiatingOccupancyFraction=70 and -XX:+UseCMSInitiatingOccupancyOnly. The former sets the percentage; the latter tells the JVM to use only the set value and not adjust it automatically. If the latter is not set, 70 is used only the first time, and the JVM adjusts the value automatically afterwards. If the memory reserved while CMS runs cannot meet the program's needs, a "Concurrent Mode Failure" occurs, and the JVM falls back to its backup plan of using Serial Old to collect the old generation again. The percentage therefore must not be set too high, or Concurrent Mode Failures will occur easily and performance will drop instead.

    CMS is based on the "mark-sweep" algorithm, so a large amount of space fragmentation remains when a collection ends. Plenty of space may be free while no contiguous space can be found for a large object, which forces a Full GC. To solve this, CMS provides -XX:+UseCMSCompactAtFullCollection, which merges and compacts fragmented memory when CMS is about to perform a Full GC. This process cannot run concurrently; it solves the fragmentation problem but lengthens the pause. CMS also provides -XX:CMSFullGCsBeforeCompaction, which sets how many Full GCs without compaction are performed before one with compaction. The default is 0, meaning every Full GC compacts.
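
The default thread count formula quoted above, (number of CPUs + 3) / 4, explains why small machines suffer most. The sketch below just evaluates that formula with integer division; the method names are hypothetical, and the actual value chosen by a given JVM version may differ.

```java
class CmsThreadsDemo {
    // Default CMS collection threads per the formula in the text (integer division).
    static int defaultCmsThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    // Fraction of all CPUs occupied by those collection threads.
    static double cpuShare(int cpus) {
        return (double) defaultCmsThreads(cpus) / cpus;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] { 2, 4, 8, 16 }) {
            System.out.printf("%2d CPUs -> %d GC thread(s), %.0f%% of CPUs%n",
                    cpus, defaultCmsThreads(cpus), cpuShare(cpus) * 100);
        }
    }
}
```

With 2 CPUs a single collection thread already takes half of the machine, while with 8 CPUs two threads take a quarter.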


4.7 G1 (Garbage-First) collector

  • G1 is a garbage collector aimed at server applications. HotSpot developed it to replace CMS. It has the following features:

    Parallel and concurrent: G1 takes full advantage of multi-CPU, multi-core hardware, using multiple CPUs to shorten Stop-The-World pauses. Some actions that require other collectors to pause Java threads can run concurrently in G1, so the Java program keeps executing.

    Generational collection: the generational concept is preserved in G1. G1 can manage the whole GC heap by itself without cooperating with other collectors, and it handles newly created objects differently from old objects that have survived for some time and through several GCs, in order to get better collection results. G1 manages both the new generation and the old generation on its own.

    Predictable pauses: reducing pause time is a concern shared by G1 and CMS, but besides pursuing low pauses, G1 also builds a predictable pause-time model. It lets the user specify explicitly that, within any time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection, which is almost a feature of a real-time Java (RTSJ) garbage collector. G1 can do this because it avoids collecting the entire JVM heap in a planned way. It ranks each Region by its reclaim value (the ratio of the memory a collection would free to the time the collection would take), and in the screening and reclamation stage it collects only part of the Regions according to the pause time the user has configured. This is what puts the pause time under the user's control.

    Space integration, no memory fragmentation: because G1 uses the concept of independent regions (Region), G1 is based on the "mark-compact" algorithm when viewed as a whole and on the "copying" algorithm when viewed locally (between two Regions). Either way, G1 produces no memory fragmentation while it runs.

  • Before G1, other collectors collected the whole new generation or the whole old generation; G1 no longer does. With G1, the Java heap layout differs greatly from that of other collectors: the whole Java heap is divided into multiple independent regions (Region) of equal size. The concepts of new generation and old generation are retained, but the two are no longer physically separated; each is a collection of Regions, which need not be contiguous.

  • G1 can build a predictable pause-time model because it avoids, in a planned way, collecting garbage across the entire Java heap. G1 tracks the value of the garbage accumulated in each Region (an empirical estimate of the space a collection would free and the time it would take) and maintains a priority list in the background. Each time, within the allowed collection time, it collects the Region with the greatest value first (hence the name Garbage-First). Dividing memory into Regions and collecting them by priority ensures that G1 achieves the highest collection efficiency possible in a limited time.

  • Both for references between Regions in G1 and for references between the new generation and the old generation in other collectors, the virtual machine uses a Remembered Set to avoid scanning the whole heap. Every Region in G1 has a corresponding Remembered Set. When the virtual machine detects that the program is writing to data of Reference type, it raises a Write Barrier that briefly interrupts the write and checks whether the referenced object lives in a different Region (in the generational case, whether an old-generation object references a new-generation object). If so, the reference information is recorded through the CardTable into the Remembered Set of the Region that the referenced object belongs to. During memory reclamation, adding the Remembered Sets to the enumeration scope of the GC root nodes guarantees that nothing is missed even without scanning the whole heap.

  • Leaving aside the maintenance of Remembered Sets, the operation of the G1 collector can be roughly divided into the following steps:

    Initial marking (Initial Marking)

    Concurrent marking (Concurrent Marking)

    Final marking (Final Marking)

    Screening and reclamation (Live Data Counting and Evacuation)
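
The Garbage-First idea described in this section, ranking Regions by reclaim value and collecting the best ones within a pause budget, can be sketched as a greedy selection. Everything here is an assumption made for illustration: the Region fields, the sample numbers, and the skip-and-continue greedy rule. G1's real selection logic is considerably more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class G1RegionSelectionDemo {
    // Hypothetical per-Region estimates kept by the collector.
    static final class Region {
        final String name;
        final long reclaimableMb; // estimated space a collection would free
        final long costMillis;    // estimated time the collection would take

        Region(String name, long reclaimableMb, long costMillis) {
            this.name = name;
            this.reclaimableMb = reclaimableMb;
            this.costMillis = costMillis;
        }

        // Reclaim value: space freed per millisecond spent.
        double value() {
            return (double) reclaimableMb / costMillis;
        }
    }

    // Picks Regions in descending value order while they fit in the pause budget.
    static List<String> select(List<Region> regions, long pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        Comparator<Region> byValue = Comparator.comparingDouble(Region::value);
        sorted.sort(byValue.reversed());
        List<String> chosen = new ArrayList<>();
        long spent = 0;
        for (Region r : sorted) {
            if (spent + r.costMillis <= pauseBudgetMillis) {
                chosen.add(r.name);
                spent += r.costMillis;
            }
        }
        return chosen;
    }

    static List<Region> sampleRegions() {
        return Arrays.asList(
                new Region("R3", 60, 30),  // value 2
                new Region("R1", 80, 10),  // value 8
                new Region("R4", 5, 5),    // value 1
                new Region("R2", 30, 10)); // value 3
    }

    public static void main(String[] args) {
        // With a 25 ms budget: R1 (10 ms) and R2 (10 ms) fit, R3 (30 ms) does not, R4 (5 ms) fits.
        System.out.println(select(sampleRegions(), 25)); // [R1, R2, R4]
    }
}
```

Shrinking the budget shrinks the set of Regions collected, which is how a user-configured pause target translates into partial collection of the heap.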

Origin blog.csdn.net/ilo114/article/details/100776479