Garbage Collection Mechanism and Memory Allocation

Summary

  The program counter, the virtual machine stack, and the native method stack are created with a thread and destroyed with it. Stack frames are pushed and popped in an orderly way as methods are entered and exited. How much memory each stack frame needs is essentially known once the class structure is determined (although the JIT compiler may apply some optimizations at runtime), so memory allocation and reclamation in these areas are deterministic. There is little need to think about reclamation here, because the memory is released naturally when the method or the thread ends. The Java heap and the method area are different. The implementation classes of an interface may need different amounts of memory, and so may the different branches of a method. Only at runtime do we know which objects will be created, so allocation and reclamation of this memory are dynamic. This is the memory the garbage collector is concerned with.

I. Algorithms for Determining Whether an Object Is Alive

1. Reference counting algorithm

  Add a reference counter to the object. Whenever a reference to it is created, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 can no longer be used.
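
  The counting rule can be sketched in a few lines of Java. This is only an illustration: the `RefCountDemo.Node` class and its `retain`/`release` methods are hypothetical names, and HotSpot does not manage objects this way. The sketch also shows the well-known weakness of plain reference counting: two objects that refer to each other keep a non-zero count even after the program can no longer reach them.

```java
import java.util.ArrayList;
import java.util.List;

class RefCountDemo {
    // Hypothetical object with a hand-maintained reference counter.
    static class Node {
        int count = 0;                              // references currently pointing here
        final List<Node> fields = new ArrayList<>();

        void retain() { count++; }                  // a new reference was created
        void release() { count--; }                 // a reference became invalid
        boolean isGarbage() { return count == 0; }

        // Store a reference to 'target' in one of this object's fields.
        void pointTo(Node target) {
            fields.add(target);
            target.retain();
        }
    }

    // Builds a two-object cycle, drops the external references,
    // and returns the two counters that remain.
    static int[] cycleCounts() {
        Node a = new Node();
        Node b = new Node();
        a.retain();      // a local variable refers to a
        b.retain();      // a local variable refers to b
        a.pointTo(b);    // a.field -> b
        b.pointTo(a);    // b.field -> a
        a.release();     // the local variable for a goes away
        b.release();     // the local variable for b goes away
        return new int[] { a.count, b.count };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        // Neither counter reaches 0, so neither object would ever be reclaimed.
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```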

2. Reachability Analysis Algorithm

  The basic idea of this algorithm is to use a series of objects called "GC Roots" as starting points and search downward from these nodes. The path traversed by the search is called a reference chain. When no reference chain connects an object to the GC Roots (in graph-theory terms, the object is unreachable from the GC Roots), the object is proven to be unavailable.
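
  The search from the GC Roots is an ordinary graph traversal. The sketch below assumes a toy object graph stored as a map from object name to the names it references. The class `Reachability` and the string names are made up for illustration; a real collector walks actual object references.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Reachability {
    // Breadth-first search from the GC Roots. Everything in the returned set
    // is alive; every object outside it is unreachable.
    static Set<String> reachable(Map<String, List<String>> graph, List<String> gcRoots) {
        Set<String> marked = new HashSet<>(gcRoots);
        Deque<String> queue = new ArrayDeque<>(gcRoots);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String next : graph.getOrDefault(current, List.of())) {
                if (marked.add(next)) {   // add() returns true only the first time
                    queue.add(next);
                }
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
            "root", List.of("a"),
            "a", List.of("b"),
            "c", List.of("d"),   // c and d reference each other
            "d", List.of("c"));  // but nothing reachable from the root points to them
        System.out.println(reachable(graph, List.of("root")));
    }
}
```

  Here `c` and `d` form a cycle, yet neither is in the result because no reference chain leads to them from the root.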

II. Garbage Collection Algorithms

1. Mark-Sweep Algorithm

  

  As its name suggests, the algorithm has two stages, "mark" and "sweep": first, all objects that need to be reclaimed are marked, and after marking is complete, all marked objects are reclaimed together. It has two shortcomings. One is efficiency: neither the marking nor the sweeping process is efficient. The other is space: sweeping leaves a large number of discontinuous memory fragments. With too much fragmentation, a later allocation of a large object may fail to find enough contiguous memory and force another garbage collection to be triggered early.
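
  The fragmentation problem is easy to see on a toy heap. In the sketch below the heap is an array of equal-sized slots, and the class `MarkSweepDemo` and its method names are assumptions made for illustration. After sweeping every second object, half of the heap is free, but no two free slots are adjacent.

```java
import java.util.Arrays;

class MarkSweepDemo {
    // Sweep phase: reclaim, in place, every occupied slot that was marked for reclamation.
    static void sweep(boolean[] occupied, boolean[] markedForReclaim) {
        for (int i = 0; i < occupied.length; i++) {
            if (occupied[i] && markedForReclaim[i]) {
                occupied[i] = false;
            }
        }
    }

    // Total number of free slots.
    static int totalFree(boolean[] occupied) {
        int free = 0;
        for (boolean slot : occupied) {
            if (!slot) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] occupied) {
        int best = 0;
        int run = 0;
        for (boolean slot : occupied) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] occupied = new boolean[8];
        Arrays.fill(occupied, true);
        boolean[] dead = { false, true, false, true, false, true, false, true };
        sweep(occupied, dead);
        System.out.println(totalFree(occupied) + " free, largest run " + largestFreeRun(occupied));
    }
}
```

  With four slots free but a largest run of one, an object that needs two adjacent slots cannot be placed without another collection.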

2. Copying algorithm

  

  The copying algorithm divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleared in one pass. The entire half is reclaimed every time, so allocation never has to deal with complications such as fragmentation: it just moves the heap-top pointer and allocates memory sequentially. The implementation is simple and runs efficiently. The cost is that usable memory shrinks to half of the original, which is too expensive.
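
  A minimal sketch of the two halves and the heap-top pointer, assuming a toy heap of `int` slots. The class `SemiSpace` is hypothetical, and the caller tells `collect` which slots survived; a real collector finds the survivors itself and also updates every reference to the moved objects.

```java
import java.util.Arrays;

class SemiSpace {
    private int[] from;     // the half currently used for allocation
    private int[] to;       // the reserved half, empty between collections
    private int top = 0;    // heap-top pointer: next free slot in 'from'

    SemiSpace(int halfSize) {
        from = new int[halfSize];
        to = new int[halfSize];
    }

    // Sequential allocation: store the value and move the pointer.
    // Returns the slot index, or -1 when the half is used up.
    int allocate(int value) {
        if (top == from.length) {
            return -1;
        }
        from[top] = value;
        return top++;
    }

    // Copy the survivors into the other half, then swap halves and
    // clear the half that was just in use.
    void collect(int[] liveIndexes) {
        int newTop = 0;
        for (int index : liveIndexes) {
            to[newTop++] = from[index];
        }
        int[] old = from;
        from = to;
        to = old;
        Arrays.fill(to, 0);
        top = newTop;
    }

    int used() { return top; }
    int get(int index) { return from[index]; }

    public static void main(String[] args) {
        SemiSpace heap = new SemiSpace(4);
        for (int v = 10; v <= 40; v += 10) {
            heap.allocate(v);
        }
        heap.collect(new int[] { 1, 3 });   // only the objects in slots 1 and 3 survive
        System.out.println(heap.used() + " slots used after collection");
    }
}
```

  After the collection the survivors sit next to each other at the start of the new half, so the next allocation again only has to move the pointer.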

  IBM's research shows that 98% of the objects in the new generation are short-lived ("born in the morning, dead by evening"), so there is no need to divide the memory space 1:1. Instead, the memory is divided into one larger Eden space and two smaller Survivor spaces, and Eden plus one Survivor is used at a time. During collection, the surviving objects in Eden and that Survivor are copied to the other Survivor space in one pass, and then Eden and the Survivor space just used are cleared. The default Eden-to-Survivor size ratio in the HotSpot virtual machine is 8:1, so the usable memory in the new generation is 90% (80% + 10%) of its total capacity. There is no guarantee that no more than 10% of the objects survive each collection. When the Survivor space is not large enough, other memory (here, the old generation) is relied on as an allocation guarantee.
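
  The 80% + 10% arithmetic can be checked with a small calculation. The helper class `YoungGenLayout` below is hypothetical, and it ignores the alignment and rounding a real virtual machine applies to these sizes. The ratio is the size of Eden divided by the size of one Survivor space, so a ratio of 8 splits the new generation into 8 + 1 + 1 = 10 parts.

```java
class YoungGenLayout {
    // Size of ONE Survivor space for a given Eden:Survivor ratio.
    static long survivorSize(long youngGenBytes, int survivorRatio) {
        return youngGenBytes / (survivorRatio + 2);
    }

    // Eden takes whatever the two Survivor spaces leave over.
    static long edenSize(long youngGenBytes, int survivorRatio) {
        return youngGenBytes - 2 * survivorSize(youngGenBytes, survivorRatio);
    }

    // Memory usable for allocation at any time: Eden plus one Survivor space.
    static long usableSize(long youngGenBytes, int survivorRatio) {
        return edenSize(youngGenBytes, survivorRatio) + survivorSize(youngGenBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long youngGen = 10L * 1024 * 1024;   // a 10MB new generation
        System.out.println("Eden:     " + edenSize(youngGen, 8));
        System.out.println("Survivor: " + survivorSize(youngGen, 8));
        System.out.println("Usable:   " + usableSize(youngGen, 8));
    }
}
```

  For a 10MB new generation this gives 8MB of Eden, 1MB per Survivor space, and 9MB usable at any moment.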

3. Mark-Compact algorithm

  

  The copying algorithm has to perform more copy operations when the object survival rate is high, and its efficiency drops. More importantly, if you do not want to waste 50% of the space, you need extra space as an allocation guarantee to cope with the extreme case where 100% of the objects in the used memory survive, so this algorithm generally cannot be chosen directly for the old generation.
  Based on the characteristics of the old generation, a "mark-compact" algorithm was proposed. The marking process is the same as in the "mark-sweep" algorithm, but the next step does not reclaim the recyclable objects in place. Instead, all surviving objects are moved toward one end, and the memory beyond the end boundary is then cleaned up directly.
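
  The step of moving survivors toward one end can be sketched on a toy heap of named slots. The class `MarkCompactDemo` is an assumption for illustration, and `survives[i]` stands for the result of the marking phase. A real collector must also update every reference to an object it moves, which this sketch leaves out.

```java
import java.util.Arrays;

class MarkCompactDemo {
    // Slide every surviving object toward the start of the heap, keeping order,
    // then clear everything beyond the end boundary.
    // A null slot is free space. Returns the boundary index.
    static int compact(String[] heap, boolean[] survives) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && survives[i]) {
                heap[boundary] = heap[i];   // boundary <= i, so no live data is overwritten
                boundary++;
            }
        }
        for (int i = boundary; i < heap.length; i++) {
            heap[i] = null;                 // memory outside the boundary is reclaimed
        }
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e" };
        boolean[] survives = { true, false, true, false, true };
        int boundary = compact(heap, survives);
        System.out.println(Arrays.toString(heap) + " boundary=" + boundary);
    }
}
```

  All free space ends up in one contiguous block after the boundary, which is what the old generation needs for large allocations.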

4. Generational collection algorithm

  At present, the garbage collectors of commercial virtual machines all adopt the "generational collection" algorithm. The algorithm contains no new ideas. It only divides memory into several blocks according to the life cycles of objects. Generally the Java heap is divided into the new generation and the old generation, so that the most appropriate collection algorithm can be used for each generation. In the new generation, a large number of objects die in every garbage collection and only a few survive, so the copying algorithm is used, and collection completes at the small cost of copying the few survivors. In the old generation, the object survival rate is high and there is no extra space to provide an allocation guarantee, so it must be reclaimed with the "mark-sweep" or "mark-compact" algorithm.

III. Garbage Collectors

1. Serial collector

  This collector is single-threaded, but "single-threaded" does not only mean that it uses one CPU or one collection thread to do garbage collection. More importantly, while it collects, it must suspend all other worker threads until the collection finishes.

  The Serial collector is a good choice for virtual machines running in Client mode.

2. ParNew collector

  The ParNew collector is essentially a multi-threaded version of the Serial collector. Apart from using multiple threads for garbage collection, its behavior is the same as the Serial collector's, including all the control parameters available to the Serial collector (for example -XX:SurvivorRatio and -XX:PretenureSizeThreshold), the collection algorithm, Stop The World, object allocation rules, and reclamation strategy.
  ParNew is the preferred new generation collector for many virtual machines running in Server mode. One important reason, unrelated to performance, is that apart from the Serial collector, it is currently the only one that can work with the CMS collector. CMS is an old generation collector, and the new generation collector paired with it can only be ParNew or Serial. ParNew is also the default new generation collector once the -XX:+UseConcMarkSweepGC option is set, and the -XX:+UseParNewGC option can be used to force it.
  Because of the overhead of thread interaction, ParNew is not guaranteed to outperform the Serial collector even in an environment with two CPUs implemented by hyper-threading technology. As the number of available CPUs increases, however, it does help use system resources effectively during GC. The number of collection threads enabled by default equals the number of CPUs. In an environment with many CPUs, the -XX:ParallelGCThreads parameter can be used to limit the number of garbage collection threads.

3. Parallel Scavenge Collector

 

  This collector is a new generation collector. It also uses the copying algorithm and is a parallel, multi-threaded collector, so it looks the same as ParNew. The focus of collectors such as CMS is to shorten the pause time of user threads during garbage collection as much as possible, while the goal of the Parallel Scavenge collector is to achieve a controllable throughput. Throughput is the ratio of the time the CPU spends running user code to the total CPU time consumed, that is, throughput = time running user code / (time running user code + garbage collection time).
  The shorter the pause time, the better suited a collector is to programs that interact with users, where good response speed improves the user experience. High throughput uses CPU time efficiently and finishes the program's computing tasks as soon as possible, which mainly suits background work that needs little interaction.
  The Parallel Scavenge collector provides two parameters for precise throughput control: -XX:MaxGCPauseMillis, which controls the maximum garbage collection pause time, and -XX:GCTimeRatio, which sets the throughput directly. Shorter GC pauses are bought at the expense of throughput and new generation space. The system makes the new generation smaller, since collecting a 300MB new generation is certainly faster than collecting a 500MB one, but this also makes garbage collection more frequent. Where collection used to happen once every 10 seconds with a 100-millisecond pause, it now happens once every 5 seconds with a 70-millisecond pause. Pause times do go down, but so does throughput.
  Because throughput is its focus, this collector is often called the "throughput-first" collector. In addition to the two parameters above, there is the switch parameter -XX:+UseAdaptiveSizePolicy. When it is turned on, there is no need to manually specify the size of the new generation (-Xmn), the ratio of Eden to Survivor (-XX:SurvivorRatio), the size of objects promoted directly to the old generation (-XX:PretenureSizeThreshold), or other detailed parameters. The virtual machine collects performance monitoring information as the system runs and adjusts these parameters dynamically to provide the most appropriate pause time or the maximum throughput. This adjustment method is called the GC adaptive adjustment strategy.
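
  The throughput formula and the pause example from this section can be worked through numerically. The class `Throughput` below is a hypothetical helper, and it assumes that each quoted interval (10 seconds, 5 seconds) includes the pause itself. The last method reflects how -XX:GCTimeRatio=n is usually described: a goal of spending at most 1/(1+n) of total time in garbage collection.

```java
class Throughput {
    // throughput = time running user code / (time running user code + garbage collection time)
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    // Throughput when one pause of pauseMillis occurs in every periodMillis of elapsed time.
    static double fromPauses(double periodMillis, double pauseMillis) {
        return (periodMillis - pauseMillis) / periodMillis;
    }

    // Largest share of total time allowed for GC under a GCTimeRatio of n.
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println("100 ms pause every 10 s: " + fromPauses(10_000, 100));
        System.out.println("70 ms pause every 5 s:   " + fromPauses(5_000, 70));
        System.out.println("GC share for ratio 99:   " + maxGcFraction(99));
    }
}
```

  Under these assumptions throughput falls from about 99% to about 98.6% even though each individual pause became shorter, which is the trade-off described above.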

4. Serial Old collector

  Serial Old is the old generation version of the Serial collector. It is also a single-threaded collector and uses the "mark-compact" algorithm. Its main purpose is to serve virtual machines in Client mode. In Server mode it has two main uses: one is to pair with the Parallel Scavenge collector in JDK 1.5 and earlier versions, and the other is to serve as the fallback for the CMS collector when a Concurrent Mode Failure occurs during concurrent collection.

5. Parallel Old Collector

  Parallel Old is the old generation version of the Parallel Scavenge collector. It uses multiple threads and the "mark-compact" algorithm. In JDK 1.5 and earlier versions, if the new generation used the Parallel Scavenge collector, the old generation had no choice but the Serial Old collector. Because the Serial Old collector dragged down server-side application performance in the old generation, using Parallel Scavenge did not necessarily maximize the throughput of the application as a whole.

  Only with the arrival of the Parallel Old collector did the "throughput-first" collector get a combination worthy of the name. Where throughput matters and CPU resources are sensitive, the Parallel Scavenge plus Parallel Old combination can be given priority.

6. CMS collector

  The CMS collector aims for the shortest collection pause time. At present, a large share of Java applications run on the server side of Internet sites or B/S systems. Such applications care a great deal about service response speed and want the shortest possible pauses to give users a better experience, and the CMS collector suits them well. This collector is based on the "mark-sweep" algorithm. The whole process has four steps: initial mark, concurrent mark, remark, and concurrent sweep.
  Of these, initial mark and remark still require "Stop The World". Initial mark only marks the objects that GC Roots can reach directly, which is very fast. Concurrent mark is the GC Roots tracing process. Remark corrects the marks of objects whose state changed because the user program kept running during concurrent marking. Its pause is generally a little longer than that of initial mark, but much shorter than the concurrent mark phase.
  The concurrent mark and concurrent sweep phases take the longest in the whole process, and during both the collector thread works alongside the user threads. So, overall, the memory reclamation of the CMS collector runs concurrently with the user threads.

Advantages:

  Concurrent collection, low pause

Disadvantages:

  1. During the concurrent phases it does not stop the user threads, but it does slow the application down because it occupies some threads (or CPU resources), and total throughput drops.
  2. The CMS collector cannot handle floating garbage, so a "Concurrent Mode Failure" may occur and lead to another Full GC.
  3. Because CMS is based on the "mark-sweep" algorithm, a large amount of space fragmentation remains after collection. Too much fragmentation causes trouble when allocating large objects: often the old generation still has plenty of free space, but no contiguous block large enough can be found, so a Full GC has to be triggered early. To solve this, the CMS collector provides the switch parameter -XX:+UseCMSCompactAtFullCollection, which turns on merging and defragmenting of memory when the CMS collector can no longer cope and has to perform a Full GC. Memory defragmentation cannot run concurrently, so the fragmentation problem goes away but the pause time gets longer. There is also the parameter -XX:CMSFullGCsBeforeCompaction, which sets how many Full GCs without compaction are performed before one with compaction (the default value is 0, meaning defragmentation happens on every Full GC).

7. G1 collector

  The G1 collector is one of the most cutting-edge results in collector technology today. It is a garbage collector aimed at server-side applications.
  The characteristics of the G1 collector are:
  1. Parallel and concurrent
  2. Generational collection
  3. Space integration: viewed as a whole, G1 is based on the "mark-compact" algorithm; viewed locally, it is based on the "copying" algorithm.
  4. Predictable pauses: this is another advantage of G1 over CMS. Reducing pause time is a concern G1 and CMS share, but besides pursuing low pauses, G1 can also establish a predictable pause time model.
  With the G1 collector, the memory layout of the Java heap differs greatly from that of other collectors. It divides the entire Java heap into multiple independent regions (Regions) of equal size. The concepts of new generation and old generation are retained, but the two are no longer physically isolated: each is a collection of Regions, which need not be contiguous.
  The collection process has four steps: initial mark, concurrent mark, final mark, and screening and reclamation.

IV. Understanding the GC Log

V. Summary of Garbage Collector Parameters

VI. Memory Allocation

  Memory allocation for objects across the new generation, old generation, and permanent generation: broadly speaking, objects are allocated on the heap (although after JIT compilation they may also be broken down into scalar types and indirectly allocated on the stack), and they are mainly allocated in the Eden area of the new generation. The details depend on which combination of garbage collectors is in use and on the memory-related parameter settings of the virtual machine. The tests in this section use Client mode, so what is verified is the behavior of the Serial / Serial Old collector combination.

(1) Objects are first allocated in Eden

(2) Large objects directly enter the old generation

(3) Long-lived objects enter the old generation

  The virtual machine defines an object age (Age) counter for each object. Once the age grows to a certain threshold, the object is promoted to the old generation.

(4) Dynamic age determination

  To better adapt to the memory conditions of different programs, the virtual machine does not always require an object's age to reach the threshold before it is promoted to the old generation. If the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space, objects of that age or older can enter the old generation directly.
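
  The rule as stated above can be sketched as a small function. The class `DynamicAge`, the `sizeByAge` array, and the parameter names are assumptions made for illustration, and the sketch follows this article's wording rather than any particular virtual machine's source code.

```java
class DynamicAge {
    // sizeByAge[age] is the total size of the Survivor objects of that age (index 0 is unused).
    // Returns the age at or above which objects move to the old generation.
    static int effectiveThreshold(long[] sizeByAge, long survivorCapacity, int maxThreshold) {
        for (int age = 1; age < sizeByAge.length && age < maxThreshold; age++) {
            // "greater than half of the Survivor space", written without integer division
            if (2 * sizeByAge[age] > survivorCapacity) {
                return age;
            }
        }
        return maxThreshold;   // no age group is large enough, so the fixed threshold applies
    }

    public static void main(String[] args) {
        long[] sizeByAge = { 0, 100, 600, 50 };   // sizes for ages 1, 2, and 3
        System.out.println(effectiveThreshold(sizeByAge, 1000, 15));
    }
}
```

  In the example, objects of age 2 occupy 600 of the 1000 units in the Survivor space, so every object of age 2 or older is promoted without waiting for the fixed threshold of 15.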

(5) Space allocation guarantee

  Before a Minor GC, the virtual machine first checks whether the largest contiguous free space in the old generation is greater than the total size of all objects in the new generation. If so, the Minor GC is guaranteed to be safe. If not, the virtual machine checks whether the HandlePromotionFailure setting allows the guarantee to fail. If it does, the virtual machine goes on to check whether the largest contiguous free space in the old generation is greater than the average size of objects promoted to the old generation in previous collections. If it is, a Minor GC is attempted, although this Minor GC is risky. If it is smaller, or if HandlePromotionFailure does not allow the risk, a Full GC is performed instead.
  Minor GC: a garbage collection in the new generation. Because most Java objects are short-lived, Minor GC is very frequent and generally fast.
  Full GC/Major GC: a GC in the old generation. A Major GC is often accompanied by at least one Minor GC (but not always: the collection strategy of the Parallel Scavenge collector includes a policy that goes straight to a Major GC). A Major GC is generally more than 10 times slower than a Minor GC.
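
  The check performed before a Minor GC, as described at the start of this subsection, reduces to a short decision function. The class `PromotionGuarantee`, the `Decision` enum, and the parameter names are hypothetical; the sketch mirrors the prose and is not the virtual machine's actual code.

```java
class PromotionGuarantee {
    enum Decision { MINOR_GC, FULL_GC }

    static Decision decide(long oldGenLargestFreeBlock,
                           long newGenTotalObjectSize,
                           boolean handlePromotionFailure,
                           long averagePromotedSize) {
        // Even if every object survived, the old generation could take them all.
        if (oldGenLargestFreeBlock > newGenTotalObjectSize) {
            return Decision.MINOR_GC;
        }
        // Risky attempt: rely on the average amount promoted by earlier collections.
        if (handlePromotionFailure && oldGenLargestFreeBlock > averagePromotedSize) {
            return Decision.MINOR_GC;
        }
        // Too little space, or the risk is not allowed.
        return Decision.FULL_GC;
    }

    public static void main(String[] args) {
        System.out.println(decide(100, 200, true, 80));    // risky Minor GC
        System.out.println(decide(100, 200, false, 80));   // Full GC
    }
}
```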

VII. Summary

  Memory reclamation and garbage collectors are often among the main factors affecting system performance and concurrency. Virtual machines provide a variety of collectors and a large number of tuning parameters because only by choosing the collection method that best fits the actual application's requirements and implementation can the highest performance be obtained. There is no fixed collector or parameter combination, no single best tuning method, and no memory reclamation behavior that the virtual machine is bound to perform.

  This article was written with reference to "In-depth Understanding of JVM". The blogger recommends reading the book in detail. It covers a great deal of in-depth knowledge about the JVM, which helps us Java developers understand the programs we write and write better code in the future!

 

