1. GC

Garbage Collection, often abbreviated as "GC", originated in the Lisp language at MIT around 1960. After more than half a century of development, the technique is now very mature.

In the JVM, the program counter, the virtual machine stack, and the native method stack are created and destroyed together with their thread. Stack frames are pushed and popped as methods are entered and exited, so the memory in these areas is reclaimed automatically. Garbage collection therefore concentrates on the Java heap and the method area, where memory is allocated and used dynamically while the program runs.

2. Object survival judgment

There are generally two ways to determine whether an object is alive:

1. Reference counting: Each object has a reference count attribute. When a new reference to the object is added, the count is increased by 1. When a reference is released, the count is decreased by 1. When the count reaches 0, the object can be reclaimed. This method is simple, but it cannot handle objects that reference each other in a cycle.

2. Reachability analysis: The search starts from the GC Roots and works downward, and the path traversed by the search is called a reference chain. When no reference chain connects an object to any GC Root, the object is unreachable and is considered no longer usable.
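The difference between the two approaches can be sketched with a toy object graph. This is only an illustration, not how a JVM is implemented: the Node class, its refCount field, and the reachableFrom method are hypothetical names introduced here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model: Node stands in for a heap object, not a real JVM structure.
class ReachabilityDemo {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        int refCount = 0;                          // number of incoming references

        Node(String name) { this.name = name; }

        void pointTo(Node target) {
            refs.add(target);
            target.refCount++;
        }
    }

    // Reachability analysis: follow reference chains starting from the GC Roots.
    static Set<Node> reachableFrom(List<Node> roots) {
        Set<Node> seen = new HashSet<>(roots);
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            for (Node next : pending.pop().refs) {
                if (seen.add(next)) {
                    pending.push(next);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        root.pointTo(a);
        b.pointTo(c); // b and c reference each other ...
        c.pointTo(b); // ... but nothing reachable from root references them

        Set<Node> live = reachableFrom(List.of(root));
        System.out.println("b refCount = " + b.refCount);        // prints "b refCount = 1"
        System.out.println("b reachable = " + live.contains(b)); // prints "b reachable = false"
    }
}
```

In this model the cycle between b and c keeps both counts above zero, so reference counting would never free them, while the traversal from the root never visits them.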

In the Java language, GC Roots include:

1. Objects referenced in the virtual machine stack.

2. Objects referenced by static fields of classes in the method area.

3. Objects referenced by constants in the method area.

4. Objects referenced by JNI in the native method stack.

3. Garbage Collection Algorithm

3.1 Mark-Sweep Algorithm

The "Mark-Sweep" algorithm, as its name suggests, is divided into two stages, "marking" and "sweeping": first, all objects that need to be reclaimed are marked, and after marking is complete all marked objects are reclaimed together.

It has two main disadvantages. The first is efficiency: neither the marking nor the sweeping process is very efficient. The second is space: sweeping leaves a large number of discontinuous memory fragments. With too much fragmentation, a program that later needs to allocate a large object may fail to find enough contiguous memory and has to trigger another garbage collection early.
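The fragmentation problem can be illustrated with a minimal sketch. It assumes a toy heap made of equal-sized slots and takes the result of the marking phase as an input array. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Toy model (an assumption for illustration): the heap is an array of equal-sized slots,
// and live[i] is true when slot i holds an object that must survive the collection.
class MarkSweepDemo {
    // Sweep phase: free every occupied slot whose object does not survive.
    static boolean[] sweep(boolean[] occupied, boolean[] live) {
        boolean[] after = new boolean[occupied.length];
        for (int i = 0; i < occupied.length; i++) {
            after[i] = occupied[i] && live[i]; // survivors stay exactly where they are
        }
        return after;
    }

    static int freeSlots(boolean[] occupied) {
        int free = 0;
        for (boolean used : occupied) {
            if (!used) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] occupied) {
        int best = 0;
        int run = 0;
        for (boolean used : occupied) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] occupied = new boolean[8];
        Arrays.fill(occupied, true);            // the heap starts completely full
        boolean[] live = {true, false, true, false, true, false, true, false};

        boolean[] after = sweep(occupied, live);
        System.out.println("free slots = " + freeSlots(after));            // prints "free slots = 4"
        System.out.println("largest free run = " + largestFreeRun(after)); // prints "largest free run = 1"
    }
}
```

Half of this toy heap is free after the sweep, yet an object that needs two adjacent slots still cannot be allocated.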

3.2 Copying algorithm

"Copying" is a collection algorithm that divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleaned up in one pass.

In this way, a whole half of the memory is reclaimed each time, and allocation does not have to deal with complications such as fragmentation. Memory is allocated sequentially by simply moving the pointer at the top of the heap, which is simple to implement and efficient to run. The cost is that the usable memory is reduced to half of the original, and efficiency drops when many long-lived objects have to be copied over and over.
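A minimal sketch of the copying step, assuming a toy heap in which each array element is one object slot and the set of surviving objects is already known. The names fromSpace, toSpace, and copySurvivors are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;

// Toy semispace model (an assumption for illustration): each array element is one object slot.
class CopyingDemo {
    // Copies surviving objects into toSpace and returns the new allocation pointer.
    static int copySurvivors(String[] fromSpace, String[] toSpace, Set<String> live) {
        int top = 0; // bump pointer in the to-space
        for (int i = 0; i < fromSpace.length; i++) {
            String obj = fromSpace[i];
            if (obj != null && live.contains(obj)) {
                toSpace[top++] = obj; // survivors are packed contiguously
            }
            fromSpace[i] = null;      // the whole from-space is released in one pass
        }
        return top;
    }

    public static void main(String[] args) {
        String[] fromSpace = {"a", "b", "c", "d"};
        String[] toSpace = new String[4];

        int top = copySurvivors(fromSpace, toSpace, Set.of("a", "d"));
        System.out.println(Arrays.toString(toSpace)); // prints "[a, d, null, null]"
        System.out.println("next allocation at index " + top); // prints "next allocation at index 2"
    }
}
```

After the copy, new objects can be allocated by advancing the returned pointer, which is the sequential allocation described above.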

3.3 Mark-Compact algorithm

In the "Mark-Compact" algorithm, the marking process is the same as in the "mark-sweep" algorithm. The subsequent step, however, does not clean up the reclaimable objects directly. Instead, all surviving objects are moved toward one end of the memory, and the memory beyond the end boundary is then cleaned up directly.

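The compaction step of the Mark-Compact algorithm can be sketched on a toy heap as follows. This assumes equal-sized slots and a marking result supplied as an array. The names are hypothetical, and a real collector would also have to update every reference to a moved object.

```java
import java.util.Arrays;

// Toy model (an assumption for illustration): heap[i] is one object slot, live[i] its mark bit.
class MarkCompactDemo {
    // Slides surviving objects toward index 0, clears everything past the boundary,
    // and returns the boundary (the first free slot).
    static int compact(String[] heap, boolean[] live) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) {
                heap[boundary++] = heap[i]; // boundary never overtakes i, so nothing live is overwritten
            }
        }
        for (int i = boundary; i < heap.length; i++) {
            heap[i] = null;
        }
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d", "e"};
        boolean[] live = {true, false, true, false, true};

        int boundary = compact(heap, live);
        System.out.println(Arrays.toString(heap)); // prints "[a, c, e, null, null]"
        System.out.println("boundary = " + boundary); // prints "boundary = 3"
    }
}
```

Unlike the sweep-only approach, the free space ends up as one contiguous region after the boundary.
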
3.4 Generational Collection Algorithm

The "Generational Collection" algorithm divides the Java heap into the new generation and the old generation, so that the most appropriate collection algorithm can be used for the characteristics of each generation. In the new generation, a large number of objects die in every garbage collection and only a few survive, so the copying algorithm is used, and the collection completes at the small cost of copying the few surviving objects. In the old generation, objects have a high survival rate and there is no extra space to guarantee allocation, so the "mark-sweep" or "mark-compact" algorithm must be used.

4. Garbage collector

4.1 Serial garbage collector

The serial collector uses a single thread for garbage collection, so only one worker thread runs during each collection. On computers with weak parallel capabilities, the focus and exclusivity of the serial collector often give better performance. The serial collector can be used in both the new generation and the old generation, and it is accordingly divided into the new generation serial collector and the old generation serial collector.

The Serial collector is the oldest collector. Its disadvantage is that it must suspend all user threads while it performs garbage collection, which is known as "stop the world" (the service is suspended). It remains the default young generation collector for virtual machines running in client mode.

Parameter control: -XX:+UseSerialGC uses the serial collector.
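One way to check which collectors a running JVM is actually using is the standard java.lang.management API. The sketch below lists the names reported by the garbage collector MXBeans. The class name ActiveCollectors is hypothetical, and the reported names depend on the JVM vendor and version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the garbage collectors that the current JVM reports through its management beans.
class ActiveCollectors {
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        // The output varies with the collector selected on the command line.
        System.out.println(names());
    }
}
```

Running this class once with -XX:+UseSerialGC and once with another collector flag should show different collector names, which makes it a quick way to confirm that a flag took effect.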

4.2 Parallel garbage collectors

4.2.1 Parallel garbage collectors

The parallel collector is an improvement on the basis of the serial collector. It can use multiple threads to perform garbage collection at the same time. For computers with strong computing power, it can effectively shorten the actual time required for garbage collection.

4.2.2 Parallel Garbage Collector - ParNew

The ParNew collector is a garbage collector that works in the new generation. It is simply a multithreaded version of the serial collector, with the same collection strategy and algorithm. With ParNew, the new generation is collected in parallel with the copying algorithm, while the old generation is still collected serially with the mark-compact algorithm.

Parameter control: -XX:+UseParNewGC uses the ParNew collector; -XX:ParallelGCThreads limits the number of threads.

4.2.3 Parallel Garbage Collector - Parallel

Parallel (Parallel Scavenge) is a multithreaded new generation garbage collector that uses the copying algorithm, and it is more concerned with the throughput of the system. Throughput is the ratio of the CPU time spent running user code to the total CPU time consumed, that is, throughput = time to run user code / (time to run user code + garbage collection time).

The shorter the pause time, the better suited a collector is to programs that interact with users, because fast response improves the user experience.

High throughput, on the other hand, makes the most efficient use of CPU time and completes the program's computing tasks as soon as possible. It is mainly suitable for background tasks that do not require much interaction.

The adaptive sizing policy can be turned on through a parameter. The virtual machine then collects performance monitoring information as the system runs and dynamically adjusts its sizing parameters to provide the most suitable pause time or the maximum throughput. Parameters can also cap the GC pause time in milliseconds or set the share of time spent on GC. The young generation uses the copying algorithm and the old generation uses mark-compact.

Parameter control: -XX:MaxGCPauseMillis sets the maximum garbage collection pause time; -XX:GCTimeRatio sets the throughput target (default is 99); -XX:+UseAdaptiveSizePolicy turns on adaptive mode.
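The throughput formula and the meaning of -XX:GCTimeRatio can be worked through with a small calculation. This sketch assumes the HotSpot convention that a ratio of n targets a GC share of at most 1 / (1 + n) of total time. The class and method names are hypothetical.

```java
// Worked arithmetic for the throughput definition given in the text.
class ThroughputDemo {
    // throughput = time to run user code / (time to run user code + garbage collection time)
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    // Assumption: with -XX:GCTimeRatio=n the collector aims to spend
    // at most 1 / (1 + n) of the total time on garbage collection.
    static double maxGcShare(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 9900 ms of user code and 100 ms of GC give a throughput of 99%.
        System.out.println("throughput = " + throughput(9900, 100));
        // The default ratio of 99 corresponds to a GC share of 1%.
        System.out.println("max GC share = " + maxGcShare(99));
    }
}
```

Under this reading, the default of 99 asks for the 99% throughput shown in the first line of the example.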

4.2.4 Parallel Garbage Collector - Parallel Old

The Parallel Old collector is the old generation version of the Parallel Scavenge collector. It uses multiple threads and the "mark-compact" algorithm, and it is also more concerned with throughput. Where throughput matters and CPU resources are constrained, the combination of Parallel Scavenge and Parallel Old can be given priority.

Parameter control: -XX:+UseParallelOldGC uses the Parallel Old collector; -XX:ParallelGCThreads limits the number of threads.

4.3 CMS Garbage Collector

CMS (Concurrent Mark Sweep) is a concurrent collector based on the mark-sweep algorithm. It works in the old generation and mainly focuses on the pause time of the system.

CMS is not an exclusive collector: the application keeps running during a CMS collection and keeps producing new garbage, so enough free memory must remain available to the application while CMS is working. For that reason CMS does not wait until the old generation is full. It starts collecting when old generation usage reaches a threshold (68% by default in the JDK versions this description is based on; later JDK releases use a higher default). If memory usage grows so quickly that space runs out while CMS is still running, the CMS collection fails and the virtual machine falls back to the old generation serial collector. The application is then paused until that collection completes, and the pause may be long, so the threshold should be tuned to the actual workload.

The disadvantage of the mark-sweep approach is memory fragmentation. CMS provides settings to mitigate this: it can defragment (compact) memory after a collection completes, or compact only after a given number of collections.

Parameter control: -XX:CMSInitiatingOccupancyFraction sets the threshold; -XX:+UseConcMarkSweepGC uses the CMS garbage collector; -XX:ConcGCThreads limits the number of threads; -XX:+UseCMSCompactAtFullCollection runs a defragmentation after the CMS collection completes; -XX:CMSFullGCsBeforeCompaction sets how many CMS collections run before a defragmentation.

4.4 G1 (Garbage First) garbage collector

The G1 (Garbage First) garbage collector is one of the most advanced results of garbage collection technology. It joined the JVM's collector family as early as JDK 7 and has become the collector that HotSpot focuses on. Like the CMS collector, G1 aims at minimum latency, and it is also suitable for large heaps. Official guidance also recommends G1 over CMS. The biggest feature of G1 is the idea of partitioning the heap into regions, which weakens the concept of generations, makes good use of the resources available in each collection cycle, and addresses many defects of other collectors, including CMS.

Parallelism and concurrency: G1 takes full advantage of multi-CPU, multi-core hardware and uses multiple CPUs (or CPU cores) to shorten Stop-The-World pauses. Some GC work that would require other collectors to pause Java threads can be performed by G1 concurrently while the Java program continues to run.

Generational collection: As with other collectors, the generational concept is preserved in G1. G1 can manage the entire GC heap on its own, without the cooperation of other collectors, yet it still handles newly created objects differently from old objects that have survived multiple GCs in order to obtain better collection results.

Space compaction: Unlike the "mark-sweep" algorithm of CMS, G1 is based on the "mark-compact" algorithm when viewed as a whole and on the "copying" algorithm when viewed locally (between two regions). Either way, both algorithms mean that G1 does not produce memory fragmentation during operation and leaves regular, contiguous free memory after a collection. This is good for long-running programs, because allocating a large object will not trigger the next GC early for lack of contiguous memory.

Predictable pauses: This is another big advantage of G1 over CMS. Reducing pause time is a common concern of G1 and CMS, but in addition to pursuing low pauses, G1 can build a predictable pause time model. It lets the user explicitly specify that, within any time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection, which is almost a feature of the garbage collectors of real-time Java (RTSJ).

Parameter control: -XX:+UseG1GC uses the G1 garbage collector; -XX:ParallelGCThreads limits the number of threads; -XX:MaxGCPauseMillis specifies the maximum pause time.
