Core JVM learning (5): garbage collection algorithms and common garbage collectors

1. Garbage collection algorithm

First of all, we need to be clear that garbage collection mainly consists of the following stages:

1.1. Marking phase

For the marking phase, we look at two algorithms.
1. Reference counting algorithm
This is the marking algorithm used by Python.

What is the reference counting algorithm? A reference counter is added to each object. Whenever a new reference to the object is created, the counter is increased by 1; whenever a reference is dropped, it is decreased by 1. An object whose counter is 0 has no references pointing to it and can be considered "garbage".

This method is relatively simple to implement and efficient, but it cannot handle circular references, so it is not used in Java (it is used in Python).
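To make the circular reference problem concrete, here is a minimal sketch of a toy reference counter. The classes RefCounted and RefCountDemo and their methods are hypothetical and exist only for illustration; no real runtime is implemented this way.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; all names here are illustrative assumptions.
class RefCounted {
    final String name;
    int count = 0;                                       // number of incoming references
    final List<RefCounted> fields = new ArrayList<>();   // outgoing references

    RefCounted(String name) { this.name = name; }

    // Store a reference to target in this object: target gains one referrer.
    void addField(RefCounted target) {
        fields.add(target);
        target.count++;
    }
}

class RefCountDemo {
    // A root (for example a local variable) starts pointing at obj.
    static void retain(RefCounted obj) { obj.count++; }

    // A root stops pointing at obj. At zero the object is "freed" and
    // the references it holds are released in turn.
    static void release(RefCounted obj, List<String> freed) {
        obj.count--;
        if (obj.count == 0) {
            freed.add(obj.name);
            for (RefCounted child : obj.fields) release(child, freed);
        }
    }

    // A simple chain a -> b without a cycle: both objects are freed.
    static List<String> chainCase() {
        List<String> freed = new ArrayList<>();
        RefCounted a = new RefCounted("a"), b = new RefCounted("b");
        retain(a); retain(b);
        a.addField(b);
        release(b, freed);   // b is still referenced by a
        release(a, freed);   // a is freed, which then frees b
        return freed;
    }

    // Two objects that reference each other: the counters never reach 0.
    static List<String> cycleCase() {
        List<String> freed = new ArrayList<>();
        RefCounted a = new RefCounted("a"), b = new RefCounted("b");
        retain(a); retain(b);
        a.addField(b); b.addField(a);   // a <-> b
        release(a, freed); release(b, freed);
        return freed;                    // stays empty: both counters are stuck at 1
    }

    public static void main(String[] args) {
        System.out.println("chain freed: " + chainCase());
        System.out.println("cycle freed: " + cycleCase());
    }
}
```

In the cycle case both objects have lost their roots but still keep each other's counter at 1, so a pure reference counter never reclaims them.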

2. Reachability Analysis Algorithm
This is the marking algorithm used by Java.

The basic idea of the reachability analysis algorithm is to take a set of objects called "GC Roots" as starting points and search downward from these nodes. The path traveled by the search is called a reference chain. When an object is not connected to any GC Root by a reference chain (that is, the object is unreachable from the GC Roots), the object is considered no longer in use.
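The search from the GC Roots can be sketched as a plain graph traversal. This is only a model: objects are represented by strings, and the class ReachabilityDemo and the map of references are assumptions made for the example, not JVM internals.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    // Returns every object reachable from the roots by following reference chains.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> gcRoots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (!marked.add(obj)) continue;   // already visited
            for (String next : refs.getOrDefault(obj, List.of())) pending.push(next);
        }
        return marked;
    }

    // root -> a -> b is reachable; c and d only reference each other.
    static Map<String, List<String>> sampleHeap() {
        return Map.of(
            "root", List.of("a"),
            "a", List.of("b"),
            "c", List.of("d"),
            "d", List.of("c"));
    }

    public static void main(String[] args) {
        System.out.println(reachable(sampleHeap(), List.of("root")));
    }
}
```

Unlike reference counting, the cycle between c and d is no problem here: neither object is on a reference chain from a root, so neither is marked.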

1.1.1. Object finalization mechanism

I call this the redemption mechanism: when an object becomes unreachable, it gets exactly one chance of redemption. If the class overrides finalize() and the method has not yet been executed for this object, it will be executed.
If the object is determined to need its finalize() method run, it is placed in a queue called the F-Queue, and a low-priority Finalizer thread, created automatically by the virtual machine, executes the method later.
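The "only one chance" rule can be sketched as a small decision model. This does not call the real finalize() (whose timing depends on the collector); the class FinalizationModel and its fields are hypothetical and only mirror the rule described above.

```java
class FinalizationModel {
    // Hypothetical stand-in for a heap object; all fields are illustrative.
    static class Obj {
        final boolean overridesFinalize;   // does the class override finalize()?
        final boolean rescuesItself;       // does finalize() re-link the object to a reference chain?
        boolean reachable = true;
        boolean finalizerRan = false;

        Obj(boolean overridesFinalize, boolean rescuesItself) {
            this.overridesFinalize = overridesFinalize;
            this.rescuesItself = rescuesItself;
        }
    }

    // Returns true if the object survives this collection.
    static boolean survives(Obj o) {
        if (o.reachable) return true;
        // No finalize() to run, or the single chance was already used.
        if (!o.overridesFinalize || o.finalizerRan) return false;
        o.finalizerRan = true;             // models: queued in the F-Queue, run once by the Finalizer thread
        if (o.rescuesItself) o.reachable = true;
        return o.reachable;
    }

    // Makes a self-rescuing object unreachable twice and records both outcomes.
    static boolean[] twoCollections() {
        Obj o = new Obj(true, true);
        o.reachable = false;
        boolean first = survives(o);       // finalize() runs and rescues the object
        o.reachable = false;
        boolean second = survives(o);      // finalize() is not run a second time
        return new boolean[] { first, second };
    }
}
```

In this model the object survives the first collection and is reclaimed in the second one, because its finalize() has already been executed.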

1.2. Cleanup phase

1.2.1. Mark-sweep algorithm

When the available memory in the heap is exhausted, the entire program is paused (known as stop-the-world), and then two tasks are performed: first marking, then sweeping.

  • Mark: The collector traverses from the reference root nodes and marks every referenced object, usually by recording it as reachable in the object's header.
  • Sweep: The collector walks the heap linearly from beginning to end and reclaims every object whose header does not mark it as reachable.
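The two steps can be sketched on a toy heap. The class MarkSweepDemo, the Cell type, and the array used as a "heap" are assumptions for illustration; the boolean field stands in for the mark recorded in the object header.

```java
import java.util.ArrayList;
import java.util.List;

class MarkSweepDemo {
    // One heap cell; "marked" stands in for the mark bit in the object header.
    static class Cell {
        final String name;
        final List<Cell> refs = new ArrayList<>();
        boolean marked;
        Cell(String name) { this.name = name; }
    }

    // Mark: traverse from a root and flag everything reachable.
    static void mark(Cell cell) {
        if (cell == null || cell.marked) return;
        cell.marked = true;
        for (Cell ref : cell.refs) mark(ref);
    }

    // Sweep: walk the heap linearly, free unmarked cells, reset the mark bits.
    static int sweep(Cell[] heap) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) continue;
            if (heap[i].marked) heap[i].marked = false;
            else { heap[i] = null; freed++; }
        }
        return freed;
    }

    // Renders the heap as text, with "_" for a free slot.
    static String layout(Cell[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Cell c : heap) sb.append(c == null ? "_" : c.name);
        return sb.toString();
    }

    // Heap [A B C D E] with root -> A -> C -> E; B and D are garbage.
    static String run() {
        Cell a = new Cell("A"), b = new Cell("B"), c = new Cell("C"),
             d = new Cell("D"), e = new Cell("E");
        a.refs.add(c); c.refs.add(e); b.refs.add(d);
        Cell[] heap = {a, b, c, d, e};
        mark(a);
        sweep(heap);
        return layout(heap);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

After the sweep the free slots sit between the surviving cells, which is the scattered free memory that mark-sweep leaves behind.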

1.2.2. Mark-copy algorithm

  • Divide the memory into two blocks of the same size and use only one of them at a time. When that block is used up, copy the surviving objects to the other block, then clear the used block in one go.
    Advantages:
    1. No memory fragmentation
    2. Simple and efficient for areas with few surviving objects (the young generation)

    Disadvantages:
    1. Half of the space is wasted, and moving objects is expensive
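A minimal sketch of the copy step, assuming marking has already decided which objects are alive. The two arrays play the role of the two memory blocks; CopyingDemo and the live array are hypothetical, and a real collector would also update every reference to a moved object.

```java
import java.util.Arrays;

class CopyingDemo {
    // Copies live objects from the "from" block into the "to" block, packed from index 0.
    // live[i] says whether from[i] survived marking. Returns the new allocation pointer.
    static int copyLive(String[] from, boolean[] live, String[] to) {
        int top = 0;
        for (int i = 0; i < from.length; i++) {
            if (from[i] != null && live[i]) to[top++] = from[i];
            from[i] = null;   // the whole "from" block is released in one go
        }
        return top;
    }

    static String run() {
        String[] from = {"A", "B", "C", "D"};
        boolean[] live = {true, false, false, true};
        String[] to = new String[from.length];
        int top = copyLive(from, live, to);
        return top + " " + Arrays.toString(to) + " " + Arrays.toString(from);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The survivors end up contiguous at the start of the new block, so allocation can continue from the returned pointer without fragmentation. The price is that only half of the memory is usable at any time.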

1.2.3. Mark-compact algorithm

Mark-compact is a marking algorithm designed around the characteristics of the old generation. The marking process is the same as in the mark-sweep algorithm, but instead of reclaiming the garbage objects in place, the collector moves all surviving objects to one end of the memory and then clears everything beyond the end boundary.
Advantages:
1. Eliminates the scattered, discontinuous free memory left by the mark-sweep algorithm
2. Eliminates the cost of halving the memory in the copying algorithm

Disadvantages:
1. Moving objects is expensive
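The compaction step can be sketched as sliding the survivors toward one end of a single array. As before, MarkCompactDemo and the live array are assumptions for the example, and the reference updating that a real collector must do after moving objects is left out.

```java
import java.util.Arrays;

class MarkCompactDemo {
    // Slides live objects toward index 0 and clears everything past the boundary.
    // Returns the boundary, i.e. the index of the first free slot.
    static int compact(String[] heap, boolean[] live) {
        int top = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) heap[top++] = heap[i];   // top <= i, so nothing live is overwritten
        }
        for (int i = top; i < heap.length; i++) heap[i] = null;
        return top;
    }

    static String run() {
        String[] heap = {"A", "B", "C", "D", "E"};
        boolean[] live = {true, false, true, false, true};
        int top = compact(heap, live);
        return top + " " + Arrays.toString(heap);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Compared with the copying sketch, no second block is needed, but every survivor behind a gap has to be moved within the same space.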

1.3. References

The cause of memory overflow: a large number of objects that are no longer useful but are still strongly referenced and therefore cannot be reclaimed.

The following introduces the four types of references in Java:

1. Strong references
The references we normally use are all strong references. An object reachable through a strong reference is never reclaimed, even if the program runs out of memory and reports an error.

2. Soft references
Before a memory overflow occurs, softly referenced objects are included in a second round of collection. They are reclaimed only when the memory freed so far is still not enough to run the program. Generally used for caches.

3. Weak references
Reclaimed as soon as the collector discovers them, so they survive only until the next garbage collection. Generally used for dispensable caches.

4. Phantom references
Have no effect on the object's lifetime at all, but let you receive a notification after the object is reclaimed.
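The three non-strong reference types live in java.lang.ref. The sketch below only shows what can be observed deterministically while a strong reference still exists; when the collector actually clears a soft or weak reference depends on the JVM and is not shown. The class ReferenceKindsDemo is hypothetical, and Reference.reachabilityFence assumes Java 9 or later.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // Returns {softStillSet, weakStillSet, phantomGetIsNull, queueIsEmpty}.
    static boolean[] observe() {
        Object payload = new Object();   // strong reference
        SoftReference<Object> soft = new SoftReference<>(payload);
        WeakReference<Object> weak = new WeakReference<>(payload);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(payload, queue);

        boolean softStillSet = soft.get() == payload;   // not cleared while strongly reachable
        boolean weakStillSet = weak.get() == payload;   // same for the weak reference
        boolean phantomGetIsNull = phantom.get() == null;   // get() on a phantom reference always returns null
        boolean queueIsEmpty = queue.poll() == null;    // no notification before the object is reclaimed

        Reference.reachabilityFence(payload);   // keep payload strongly reachable up to this point
        return new boolean[] { softStillSet, weakStillSet, phantomGetIsNull, queueIsEmpty };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(observe()));
    }
}
```

The ReferenceQueue is where the "notification after recycling" arrives: once the object has been reclaimed, the phantom reference is enqueued there and can be polled.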

2. Common garbage collectors

Performance indicators of the garbage collector:

1: Throughput: the percentage of total time spent running the program, that is, program running time / (program running time + garbage collection time).
2: Low latency: the duration of a single pause.

The two goals conflict, and you can't have both.
Pursuing low latency shortens each collection, which means collections happen more often and more work goes into preparing for them, and this negatively impacts throughput.
Note: CMS has been deprecated since JDK 9. In the original figure, dotted lines mark collector combinations that were deprecated in later releases.

2.1. Serial collector

This collector is single-threaded. "Single-threaded" here does not only mean that it uses one CPU or one collection thread to do the garbage collection work; more importantly, while it collects garbage it must suspend all other worker threads until the collection finishes (stop-the-world).

  • The Serial collector is the most basic and oldest garbage collector. Before JDK 1.3 it was the only option for collecting the young generation.
  • The Serial collector is the default young generation garbage collector in HotSpot's client mode. It performs memory reclamation using the copying algorithm, serial collection and the stop-the-world mechanism.
  • In addition to the young generation, the Serial collector also provides the Serial Old collector for old generation garbage collection. The Serial Old collector also uses serial collection
    and the stop-the-world mechanism, but its memory reclamation algorithm is mark-compact.
  • Serial Old is the default old generation garbage collector in client mode.
  • Serial Old has two main uses in server mode:
    • ① Working together with the young generation Parallel Scavenge collector
    • ② Serving as the fallback garbage collection scheme for the old generation CMS collector
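To check which collectors a running JVM actually uses, the standard java.lang.management API can list them. The sketch below is a small helper (the class name ActiveCollectors is made up for this example); the names it prints depend on the JVM and its flags. On HotSpot, starting the program with -XX:+UseSerialGC typically reports one young generation and one old generation collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Returns the names of the garbage collectors registered in this JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time (ms) since JVM start.
            System.out.println(bean.getName()
                + ": collections=" + bean.getCollectionCount()
                + ", timeMs=" + bean.getCollectionTime());
        }
    }
}
```

Running the same program with different collector flags is a quick way to see the young and old generation pairing described in each section.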

2.2. ParNew collector


If Serial GC is the single-threaded garbage collector for the young generation, then the ParNew collector is the multi-threaded version of the Serial collector.

  • In ParNew, Par is short for Parallel, and New means it handles only the young generation.
  • Apart from using parallel collection to reclaim memory, ParNew is almost identical to the Serial collector.
  • The ParNew collector also uses the copying algorithm and the stop-the-world mechanism in the young generation.
  • ParNew is the default young generation garbage collector of many JVMs running in server mode.

2.3. Parallel collector

Besides ParNew, HotSpot's young generation has another collector based on parallel collection: the Parallel Scavenge collector, which also uses the copying algorithm, parallel collection and the stop-the-world mechanism.

Unlike ParNew, the goal of the Parallel Scavenge collector is to achieve a controllable throughput, so it is also known as the throughput-first garbage collector.
Its adaptive adjustment strategy is another important difference between Parallel Scavenge and ParNew.

2.4. CMS collector


  • Garbage collection is divided into five phases; of these, the initial mark and remark phases are stop-the-world (STW).
  • The remark phase uses the incremental update algorithm from tri-color marking.
  • Concurrent sweep: objects newly created during this phase are marked black and left untouched.
  • Main advantages: concurrent collection, low pauses.
  • -XX:+UseConcMarkSweepGC: enables CMS
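Why remark needs incremental update can be shown with a very small tri-color model. White means not yet visited, grey means visited but not fully scanned, black means fully scanned. The class TriColorDemo below is a simplified assumption: it re-greys a black object directly in the write barrier, whereas a real collector records such writes and rescans them during remark.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TriColorDemo {
    enum Color { WHITE, GREY, BLACK }

    static class Node {
        final String name;
        Color color = Color.WHITE;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    final Deque<Node> greyStack = new ArrayDeque<>();
    final boolean barrierEnabled;

    TriColorDemo(boolean barrierEnabled) { this.barrierEnabled = barrierEnabled; }

    // White -> grey: the object is known to be alive but not scanned yet.
    void shade(Node n) {
        if (n.color == Color.WHITE) { n.color = Color.GREY; greyStack.push(n); }
    }

    // Scan one grey object: shade its children, then turn it black.
    void markStep() {
        Node n = greyStack.pop();
        for (Node child : n.refs) shade(child);
        n.color = Color.BLACK;
    }

    // Mutator write "holder.field = target" with an incremental-update barrier:
    // a black holder that gains a reference becomes grey again so it is rescanned.
    void write(Node holder, Node target) {
        holder.refs.add(target);
        if (barrierEnabled && holder.color == Color.BLACK) {
            holder.color = Color.GREY;
            greyStack.push(holder);
        }
    }

    // Returns the color of C when marking finishes.
    static Color run(boolean barrierEnabled) {
        TriColorDemo gc = new TriColorDemo(barrierEnabled);
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        a.refs.add(b); b.refs.add(c);   // A -> B -> C
        gc.shade(a);                    // initial mark: A is a root
        gc.markStep();                  // concurrent mark: A is black, B is grey
        gc.write(a, c);                 // meanwhile the program does A.field = C ...
        b.refs.remove(c);               // ... and removes the reference B -> C
        while (!gc.greyStack.isEmpty()) gc.markStep();   // rest of marking, including the rescan
        return c.color;
    }

    public static void main(String[] args) {
        System.out.println("without barrier: C is " + run(false));
        System.out.println("with barrier:    C is " + run(true));
    }
}
```

Without the barrier C stays white even though the black object A references it, so a live object would be swept. With the barrier A is scanned again and C ends up black.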

Disadvantages of CMS:
1. It produces memory fragmentation, which can leave user threads without enough usable space after the concurrent sweep. When a large object cannot be allocated, a Full GC has to be triggered early.
2. The CMS collector is very sensitive to CPU resources. The concurrent phases do not pause user threads, but they occupy some of the threads, which slows the application down and reduces total throughput.
3. The CMS collector cannot handle floating garbage, and a "Concurrent Mode Failure" may occur and lead to another Full GC. During the concurrent marking phase the program's worker threads and the garbage collection threads run at the same time or interleaved, so if new garbage is produced during this phase, CMS cannot mark it. These newly created garbage objects are therefore not reclaimed in time, and the memory they occupy can only be released by the next GC.

2.5. G1 garbage collector

The G1 garbage collector is a cutting-edge achievement

The reason is that applications keep handling larger and more complex business with more and more users. Without GC, the normal operation of an application cannot be guaranteed, yet stop-the-world GC pauses often cannot keep up with actual demand, so there are continuous attempts to optimize GC. The G1 (Garbage-First) garbage collector is a new garbage collector introduced after Java 7 update 4 and is one of the most cutting-edge achievements in today's collector technology.

The official goal set for G1 is to obtain the highest possible throughput while keeping latency controllable, so it carries the responsibility and expectations of a "full-featured collector".

  • G1 is a parallel collector that divides the heap memory into many independent regions (Region), which are physically discontinuous. Different Regions are used to represent Eden, Survivor 0, Survivor 1, the old generation, and so on.

  • G1 deliberately avoids collecting the whole Java heap at once. It tracks the value of the garbage accumulated in each Region (the amount of space a collection would free and the empirically estimated time it would take), maintains a priority list in the background, and on each collection gives priority, within the allowed collection time, to the Regions with the highest value. Because this approach focuses on the Regions with the most garbage, G1 gets its name: Garbage First.
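The idea of picking the most valuable Regions within an allowed collection time can be sketched as a greedy selection. This is only an illustration: the class GarbageFirstDemo, the Region fields, and the "bytes freed per millisecond" value formula are assumptions, and G1's real prediction model is more involved.

```java
import java.util.ArrayList;
import java.util.List;

class GarbageFirstDemo {
    // Hypothetical per-region statistics; the names and the value formula are assumptions.
    static class Region {
        final String name;
        final long reclaimableBytes;   // space a collection of this region would free
        final double costMs;           // estimated time needed to collect this region

        Region(String name, long reclaimableBytes, double costMs) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.costMs = costMs;
        }

        double value() { return reclaimableBytes / costMs; }
    }

    // Greedy choice: take the regions with the highest value first,
    // skipping any region that no longer fits into the pause budget.
    static List<String> chooseCollectionSet(List<Region> regions, double pauseBudgetMs) {
        List<Region> byValue = new ArrayList<>(regions);
        byValue.sort((x, y) -> Double.compare(y.value(), x.value()));   // highest value first
        List<String> chosen = new ArrayList<>();
        double spentMs = 0;
        for (Region r : byValue) {
            if (spentMs + r.costMs <= pauseBudgetMs) {
                chosen.add(r.name);
                spentMs += r.costMs;
            }
        }
        return chosen;
    }

    static List<String> demo() {
        List<Region> regions = List.of(
            new Region("r1", 80, 4.0),   // value 20
            new Region("r2", 30, 1.0),   // value 30
            new Region("r3", 50, 5.0),   // value 10
            new Region("r4", 10, 2.0));  // value 5
        return chooseCollectionSet(regions, 6.0);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a budget of 6 ms the sketch picks r2 and r1 and leaves r3 and r4 for a later collection, which is how a bounded pause can still reclaim most of the garbage.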

ZGC collector (experimental)

The JVM's automatic garbage collection reduces developers' work and, to a certain extent, the risk of memory leaks. But because GC runs automatically, it can sometimes affect the application in unpredictable and harmful ways.

Increased latency hurts application throughput and performance

As hardware becomes cheaper over time, applications use more and more memory, but this must not come at the cost of higher latency and lower throughput.
ZGC aims to keep pause times under 10 milliseconds.

The most typical feature of ZGC is that it is a concurrent GC. Other features are as follows:

It marks memory and copies and relocates memory concurrently, and it has a concurrent reference processor. Other garbage collectors use store barriers, whereas ZGC uses load barriers to track memory.

lock -> unlock -> read -> load: reading memory
use -> assign -> store -> write: writing memory

  • ZGC allows more flexible configuration of sizes and strategies. Compared with G1, it handles the release of very large objects better.
  • ZGC has only one generation, with no young or old generation, but it supports partial compaction and still performs well in memory reclamation and relocation (reclaim and relocate).
  • ZGC is NUMA-aware (NUMA: non-uniform memory access); to benefit from this, the memory hardware must support the feature.

Most of the pictures and content are summarized from the complete JVM tutorial series by Song Hongkang of Shang Silicon Valley (a detailed explanation of the Java virtual machine).


Origin blog.csdn.net/faker1234546/article/details/128985770