HotSpot garbage collection algorithm implementation details

Root node enumeration

In reachability analysis, the set of GC Roots can be very large, so checking the candidates one by one to find where the reference chains start would take a long time.

So far, every collector must pause user threads during root node enumeration, so this step faces the same "Stop The World" problem as the memory defragmentation mentioned earlier.

The most time-consuming part of reachability analysis, tracing the reference chains, can now run concurrently with user threads, but root node enumeration must still be performed on a snapshot that guarantees consistency. ("Consistency" here means that the execution subsystem appears frozen at a single point in time for the whole enumeration, so the reference relationships of the root set cannot change while they are being analyzed.)

HotSpot uses a set of data structures called OopMap to record where object references are stored. Once a class has been loaded, HotSpot works out what type of data sits at each offset within its objects. During just-in-time compilation, it also records, at specific positions in the code, which stack slots and registers hold references. The collector can read this information directly while scanning, so it does not need to search exhaustively from the method area and the other GC Roots. With the help of OopMap, HotSpot can complete GC Roots enumeration quickly and accurately.
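The lookup idea can be sketched as follows. This is a minimal model, not HotSpot's actual data structure: the class `OopMapSketch`, its `record` and `enumerateRoots` methods, and the slot numbering are all hypothetical. It shows a table that maps a recorded code position to the stack slots holding references, so that root enumeration becomes a lookup rather than an inspection of every slot.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: for each recorded code position, a bitmap of the
// stack slots that hold object references at that position.
final class OopMapSketch {
    private final Map<Integer, BitSet> mapsByPosition = new HashMap<>();

    // Called by the (imaginary) compiler when it emits a recorded position.
    void record(int position, int... referenceSlots) {
        BitSet slots = new BitSet();
        for (int slot : referenceSlots) {
            slots.set(slot);
        }
        mapsByPosition.put(position, slots);
    }

    // Returns the reference-holding slots at a position without inspecting
    // the frame contents; fails if nothing was recorded for that position.
    List<Integer> enumerateRoots(int position) {
        BitSet slots = mapsByPosition.get(position);
        if (slots == null) {
            throw new IllegalStateException("no OopMap at position " + position);
        }
        List<Integer> roots = new ArrayList<>();
        for (int slot = slots.nextSetBit(0); slot >= 0; slot = slots.nextSetBit(slot + 1)) {
            roots.add(slot);
        }
        return roots;
    }

    public static void main(String[] args) {
        OopMapSketch map = new OopMapSketch();
        map.record(0x18, 1, 3);   // slots 1 and 3 hold references at position 0x18
        System.out.println(map.enumerateRoots(0x18));
    }
}
```

The lookup fails for positions that were never recorded, which is the motivation for stopping threads only at positions that do have a map.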

Safepoints

Many instructions can change reference relationships, and therefore the content of the OopMap. Generating an OopMap for every such instruction would require a large amount of extra storage, and the space cost of garbage collection would become unbearably high.

This is where safepoints come in. HotSpot records OopMap information only at specific positions, called safepoints. A safepoint works like a save point: it is a position at which the OopMap information is recorded and up to date. As a consequence, user code cannot be stopped for garbage collection at an arbitrary instruction; a thread must run to a safepoint before it can be paused.

Before a garbage collection can start, every thread must run to a safepoint and be suspended there. There are two ways to achieve this.

  • Preemptive suspension: This approach does not require the thread's code to cooperate. When garbage collection is triggered, the system first interrupts all user threads. If a thread turns out to have been interrupted somewhere other than a safepoint, it is resumed and interrupted again a little later, until it stops at a safepoint. Almost no virtual machine implementation uses preemptive suspension to pause threads in response to GC events today.
  • Voluntary suspension: The collector sets a flag indicating that a garbage collection is requested, and every thread polls this flag as it runs. When a thread finds the flag set, it suspends itself at the nearest safepoint. (Because the polling operation appears so frequently in the code, it must be very cheap. HotSpot uses a memory protection trap to reduce the poll to a single assembly instruction.)
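Voluntary suspension can be sketched as follows. This is only an illustration under stated assumptions: the class `SafepointPollSketch` and its methods are hypothetical, and the poll is a plain `volatile` read plus a monitor, not the memory protection trap that HotSpot uses. Worker threads call `poll()` at each loop iteration; the collector calls `begin()`, which returns only after every worker has parked.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of voluntary suspension; not HotSpot's implementation.
final class SafepointPollSketch {
    private final int workerCount;
    private volatile boolean requested;   // the flag every worker polls
    private int parked;                   // guarded by the monitor of "this"

    SafepointPollSketch(int workerCount) {
        this.workerCount = workerCount;
    }

    // Worker side, called at safepoints. The fast path is one volatile read.
    void poll() throws InterruptedException {
        if (!requested) {
            return;
        }
        synchronized (this) {
            parked++;
            notifyAll();                  // let the collector recount
            while (requested) {
                wait();                   // suspended at the safepoint
            }
            parked--;
        }
    }

    // Collector side: raise the flag, then wait until every worker has parked.
    synchronized void begin() throws InterruptedException {
        requested = true;
        while (parked < workerCount) {
            wait();
        }
    }

    // Collector side: lower the flag and resume the workers.
    synchronized void end() {
        requested = false;
        notifyAll();
    }

    // Returns true if no worker made progress while the safepoint was held.
    static boolean workersFrozenDuringSafepoint() {
        SafepointPollSketch safepoint = new SafepointPollSketch(2);
        AtomicLong progress = new AtomicLong();
        AtomicBoolean done = new AtomicBoolean();
        Runnable userCode = () -> {
            try {
                while (!done.get()) {
                    progress.incrementAndGet();   // stand-in for real work
                    safepoint.poll();             // poll at the loop back edge
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread first = new Thread(userCode);
        Thread second = new Thread(userCode);
        first.start();
        second.start();
        try {
            safepoint.begin();
            long before = progress.get();
            Thread.sleep(50);                     // stand-in for root enumeration
            long after = progress.get();
            safepoint.end();
            done.set(true);
            first.join();
            second.join();
            return before == after;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(workersFrozenDuringSafepoint());
    }
}
```

A worker that never calls `poll()` would keep `begin()` waiting forever, which is why the placement of polls matters.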

Safe regions

The safepoint mechanism ensures that a running thread reaches a safepoint shortly after the collector requests it. But what about a thread that is not running? A thread in the Sleep or Blocked state has no processor time, so it cannot respond to the virtual machine's suspension request and cannot walk to a safepoint to suspend itself, and the virtual machine cannot wait indefinitely for the thread to be reactivated and given processor time. Safe regions solve this case.

A safe region is a stretch of code within which reference relationships are guaranteed not to change, so it is safe to start garbage collection at any point inside it. A safe region can be thought of as a safepoint that has been stretched out.

When a user thread starts executing code in a safe region, it first marks itself as being inside the safe region. If the virtual machine initiates garbage collection during this period, it does not need to wait for threads that have declared themselves to be in a safe region. When the thread is about to leave the safe region, it checks whether the virtual machine has finished root node enumeration (or any other stage of garbage collection that requires user threads to be suspended). If so, the thread carries on as if nothing had happened; otherwise it must wait until it receives a signal that it may leave the safe region.
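The enter and leave protocol can be sketched as follows. The class `SafeRegionSketch` and its methods are hypothetical, and a `CountDownLatch` stands in for a sleeping or blocked thread; this is an illustration of the protocol, not the virtual machine's implementation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of a safe region; names and structure are illustrative.
final class SafeRegionSketch {
    private boolean collecting;     // guarded by the monitor of "this"
    private int threadsInRegion;    // guarded by the monitor of "this"

    // A thread declares that it will not touch references for a while.
    synchronized void enter() {
        threadsInRegion++;
    }

    // Before leaving, the thread waits if a collection is still running.
    synchronized void leave() throws InterruptedException {
        while (collecting) {
            wait();
        }
        threadsInRegion--;
    }

    // The collector does not wait for threads that are inside a safe region.
    synchronized int beginCollection() {
        collecting = true;
        return threadsInRegion;
    }

    synchronized void endCollection() {
        collecting = false;
        notifyAll();
    }

    // Returns true if the thread stayed in the region until the collection ended.
    static boolean leaveWaitsForCollection() {
        SafeRegionSketch region = new SafeRegionSketch();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch unblocked = new CountDownLatch(1);
        AtomicBoolean left = new AtomicBoolean();
        Thread user = new Thread(() -> {
            try {
                region.enter();
                entered.countDown();
                unblocked.await();        // stand-in for sleep or a blocking call
                region.leave();
                left.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        user.start();
        try {
            entered.await();
            int alreadySafe = region.beginCollection();
            unblocked.countDown();        // the thread wakes up during the collection
            Thread.sleep(50);             // stand-in for root enumeration
            boolean leftDuringCollection = left.get();
            region.endCollection();
            user.join();
            return alreadySafe == 1 && !leftDuringCollection && left.get();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(leaveWaitsForCollection());
    }
}
```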

Remembered set and card table

To deal with cross-generation references, the garbage collector keeps a data structure called a remembered set in the young generation, so that the entire old generation does not have to be added to the GC Roots scan range.

A remembered set is an abstract data structure that records the set of pointers from non-collected regions into the region being collected.

When implementing a remembered set, the collector can choose the precision of each record:

  • Word precision: Each record is precise to one machine word (the processor's addressing width, commonly 32 or 64 bits, which determines the length of the pointers the machine uses to access physical memory addresses). The word contains a cross-generation pointer.

  • Object precision: Each record is precise to one object, which has fields containing cross-generation pointers.

  • Card precision: Each record is precise to one memory region, which contains objects holding cross-generation pointers.

The third option, card precision, implements the remembered set with a structure called a "card table". This is currently the most common implementation.

In its simplest form, a card table is just a byte array, and this is indeed how the HotSpot virtual machine implements it. (A byte array is used rather than a bit array mainly for speed: modern hardware is byte-addressable at the finest granularity, so updating a single bit would cost extra shift and mask instructions.) The card-marking logic looks like this:

CARD_TABLE[this address >> 9] = 0;   // mark the card covering this address as dirty (HotSpot encodes dirty as 0)

Each element of the byte array CARD_TABLE corresponds to a memory block of a fixed size in the memory region it covers. This memory block is called a "card page". The card page size is generally a power of two in bytes; the shift by 9 in the line above implies card pages of 2^9 = 512 bytes.


A card page usually contains more than one object. As long as a field of at least one object in the card page holds a cross-generation pointer, the corresponding card table element is marked dirty; otherwise it stays clean. Conceptually, dirty is written as 1 and clean as 0. HotSpot's actual encoding uses 0 as the dirty value, which is why the code lines in this section write 0. When garbage collection occurs, the collector only has to pick out the dirty elements of the card table to find which card pages contain cross-generation pointers, and it adds those pages to the GC Roots to be scanned together.
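A card table along these lines can be sketched in a few lines of Java. The class `CardTableSketch` and its methods are hypothetical, addresses are modelled as plain `long` offsets into an imaginary heap, and the encoding follows the code line above in using 0 for dirty; the clean value of -1 is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical card table over an imaginary heap addressed from 0.
final class CardTableSketch {
    static final int CARD_SHIFT = 9;   // 2^9 = 512-byte card pages
    static final byte DIRTY = 0;       // matches the "= 0" in the code line above
    static final byte CLEAN = -1;      // assumption: any non-zero value would do here

    private final byte[] cards;

    CardTableSketch(long heapBytes) {
        // One byte per card page, rounding the heap size up to whole pages.
        cards = new byte[(int) ((heapBytes + (1L << CARD_SHIFT) - 1) >>> CARD_SHIFT)];
        Arrays.fill(cards, CLEAN);
    }

    // Marks the card page that covers the given address as dirty.
    void markDirty(long address) {
        cards[(int) (address >>> CARD_SHIFT)] = DIRTY;
    }

    // Start addresses of the card pages that must be scanned as extra roots.
    List<Long> dirtyCardPages() {
        List<Long> pages = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] == DIRTY) {
                pages.add((long) i << CARD_SHIFT);
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096);
        table.markDirty(513);    // falls in the page starting at 512
        table.markDirty(1000);   // same page, so no new entry appears
        table.markDirty(2048);
        System.out.println(table.dirtyCardPages());
    }
}
```

Only the dirty pages are returned, so the collector scans 512-byte pages instead of the whole old generation.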

Write barrier

The remembered set narrows the range that GC Roots scanning has to cover, but it leaves open how the card table elements are maintained: when do they become dirty, and who makes them dirty?

In the HotSpot virtual machine, the state of the card table is maintained through write barriers. A write barrier can be seen as an AOP aspect, at the virtual machine level, on the action "assign to a reference-type field". When a reference field is assigned, an around advice is generated so that the program can perform additional actions; in other words, both the moment before and the moment after the assignment fall within the scope of the write barrier. The part before the assignment is called the pre-write barrier, and the part after the assignment is called the post-write barrier.
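The "around advice" shape can be sketched as follows. Everything here is hypothetical: `WriteBarrierSketch`, `storeField` and the two hook methods only model the ordering of the hooks around the store, whereas a real virtual machine emits the barrier code directly next to each reference store.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: every reference store goes through storeField, which
// plays the role of the around advice.
final class WriteBarrierSketch {
    static final class Node {
        final String name;
        Node field;

        Node(String name) {
            this.name = name;
        }
    }

    final List<String> events = new ArrayList<>();

    private static String nameOf(Node node) {
        return node == null ? "null" : node.name;
    }

    // Runs before the assignment and can still see the old value.
    void preWriteBarrier(Node holder, Node oldValue) {
        events.add("pre " + holder.name + ".field was " + nameOf(oldValue));
    }

    // Runs after the assignment; a card-marking barrier would dirty the card here.
    void postWriteBarrier(Node holder, Node newValue) {
        events.add("post " + holder.name + ".field is " + nameOf(newValue));
    }

    void storeField(Node holder, Node newValue) {
        preWriteBarrier(holder, holder.field);
        holder.field = newValue;
        postWriteBarrier(holder, newValue);
    }

    public static void main(String[] args) {
        WriteBarrierSketch vm = new WriteBarrierSketch();
        Node a = new Node("a");
        vm.storeField(a, new Node("b"));
        vm.events.forEach(System.out::println);
    }
}
```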

Besides the cost of the write barrier itself, the card table can suffer from false sharing under high concurrency. The caches of modern CPUs store data in units of cache lines. When several threads modify independent variables that happen to share a cache line, the threads interfere with each other (through write-back, invalidation, or synchronization) and performance drops. With 64-byte cache lines, for example, 64 one-byte card table elements share a line, so threads updating objects anywhere in the same 32 KB of heap (64 × 512 bytes) write to the same cache line.

A simple way to reduce false sharing is to replace the unconditional write barrier with a conditional one: check the card table element first, and write the dirty mark only if the element is not already dirty. The card-marking logic then becomes:

if (CARD_TABLE[this address >> 9] != 0)   // not yet dirty (dirty is encoded as 0)
	CARD_TABLE[this address >> 9] = 0;    // mark the card dirty
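The effect of the check can be seen in a small sketch that counts how often the barrier actually stores to the card table. The class `ConditionalCardMarkSketch` and its counter are hypothetical, and the clean value of -1 is an assumption of this sketch; dirty is 0 as in the lines above.

```java
import java.util.Arrays;

// Hypothetical conditional post-write barrier over a small card table.
final class ConditionalCardMarkSketch {
    static final int CARD_SHIFT = 9;
    static final byte DIRTY = 0;
    static final byte CLEAN = -1;   // assumption of this sketch

    final byte[] cardTable;
    int cardTableWrites;            // how many times the barrier stored to the table

    ConditionalCardMarkSketch(int cardCount) {
        cardTable = new byte[cardCount];
        Arrays.fill(cardTable, CLEAN);
    }

    // Reads the card first and writes only when the card is not dirty yet.
    void postWriteBarrier(long address) {
        int index = (int) (address >>> CARD_SHIFT);
        if (cardTable[index] != DIRTY) {
            cardTable[index] = DIRTY;
            cardTableWrites++;
        }
    }

    public static void main(String[] args) {
        ConditionalCardMarkSketch barrier = new ConditionalCardMarkSketch(8);
        for (long address = 0; address < 512; address += 8) {
            barrier.postWriteBarrier(address);   // 64 stores into the same card page
        }
        System.out.println(barrier.cardTableWrites);
    }
}
```

The trade-off is one extra read and branch per reference store in exchange for fewer writes to shared cache lines. HotSpot makes this behaviour optional through the `-XX:+UseCondCardMark` flag.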

Tricolor marking

Tricolor marking is an abstraction used in the marking phase of tracing garbage collectors, such as mark-sweep collectors, to identify live objects. It uses three colors, white, gray, and black, to represent the state of each object, hence the name.

The basic principle and workflow of tricolor marking are as follows:

  1. Initial state:

    • All objects are white, meaning that the collector has not visited them yet.
  2. Marking process:

    • The object graph is traversed starting from the root nodes (the GC Roots, such as references held in thread stacks and static fields).
    • When an object is first visited, it is marked gray: the collector has reached it, but has not yet scanned all of the references it holds.
    • Every white object referenced by a gray object is also marked gray and added to the queue of objects to process. Once all references of a gray object have been scanned, that object is marked black.
  3. Iterative marking:

    • The marking process repeats until no gray objects remain. At that point every reachable object is black.
  4. Cleanup phase:

    • When marking ends, the objects that are still white were never reached, so they are considered unreachable.
    • The garbage collector can reclaim all white objects and the memory they occupy.
  5. Black objects:

    • All black objects are considered live. They stay in memory until a later garbage collection finds them unreachable.
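The workflow above can be sketched as a small single-threaded marker. The class `TricolorSketch`, the `Obj` type and the method names are hypothetical, and the "heap" is just a list of objects; the sketch shows only the color transitions, with no concurrency.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical single-threaded tricolor marker over a toy object graph.
final class TricolorSketch {
    enum Color { WHITE, GRAY, BLACK }

    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Color color = Color.WHITE;

        Obj(String name) {
            this.name = name;
        }
    }

    // Marks everything reachable from the roots black; the rest stays white.
    static void mark(List<Obj> roots) {
        Deque<Obj> gray = new ArrayDeque<>();
        for (Obj root : roots) {
            shade(root, gray);
        }
        while (!gray.isEmpty()) {
            Obj current = gray.poll();
            for (Obj child : current.refs) {
                shade(child, gray);
            }
            current.color = Color.BLACK;   // all of its references have been scanned
        }
    }

    // White objects turn gray and join the queue; other colors are left alone.
    private static void shade(Obj obj, Deque<Obj> gray) {
        if (obj.color == Color.WHITE) {
            obj.color = Color.GRAY;
            gray.add(obj);
        }
    }

    // Names of the objects that a sweep would reclaim.
    static List<String> unreachable(List<Obj> heap) {
        List<String> names = new ArrayList<>();
        for (Obj obj : heap) {
            if (obj.color == Color.WHITE) {
                names.add(obj.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b);
        b.refs.add(c);
        c.refs.add(a);   // a cycle does not confuse the marker
        d.refs.add(a);   // d points into the graph, but nothing points to d
        mark(Arrays.asList(a));
        System.out.println(unreachable(Arrays.asList(a, b, c, d)));
    }
}
```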

The key advantages of tricolor marking are that it can be made incremental and concurrent. Because marking proceeds one object at a time, it can be broken into small steps, with the program continuing to run between steps, which shortens the pauses caused by garbage collection. Marking can also run concurrently with user threads without a global pause for the whole phase, provided that the collector accounts for the changes those threads make to reference relationships while marking is in progress.

Those concurrent updates are where tricolor marking runs into trouble, because user threads may modify reference relationships while the collector is marking. There are two possible consequences. One is that an object that has died is wrongly marked as alive; this is tolerable, since it only produces floating garbage that the next collection will clean up. The other is that an object that is still alive is wrongly marked as dead, which is not tolerable.


Wilson proved in 1994 that the "object disappearance" problem, in which an object that should be black is wrongly left white, occurs if and only if both of the following conditions hold:

  • The mutator inserts one or more new references from a black object to the white object;
  • The mutator removes all direct or indirect references from gray objects to that white object.
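The two conditions can be reproduced with a toy marker that processes one gray object per step while the "mutator" changes the graph in between. The class `LostObjectSketch` and its methods are hypothetical, and everything runs on one thread so that the interleaving is fixed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stepwise marker with no barriers, used to show a lost object.
final class LostObjectSketch {
    enum Color { WHITE, GRAY, BLACK }

    static final class Obj {
        final List<Obj> refs = new ArrayList<>();
        Color color = Color.WHITE;
    }

    private final Deque<Obj> gray = new ArrayDeque<>();

    void shade(Obj obj) {
        if (obj.color == Color.WHITE) {
            obj.color = Color.GRAY;
            gray.add(obj);
        }
    }

    // One increment of marking: scans a single gray object.
    // Returns false when no gray object is left.
    boolean step() {
        Obj current = gray.poll();
        if (current == null) {
            return false;
        }
        for (Obj child : current.refs) {
            shade(child);
        }
        current.color = Color.BLACK;
        return true;
    }

    // Returns the final color of c, which stays reachable from the root a.
    static Color run() {
        LostObjectSketch marker = new LostObjectSketch();
        Obj a = new Obj(), b = new Obj(), c = new Obj();
        a.refs.add(b);
        b.refs.add(c);
        marker.shade(a);
        marker.step();             // a is black, b is gray, c is white
        a.refs.add(c);             // condition 1: black -> white reference inserted
        b.refs.remove(c);          // condition 2: gray -> white path removed
        while (marker.step()) { }  // marking finishes without ever visiting c
        return c.color;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The object `c` is still referenced by `a` at the end, yet it keeps its white color and would be reclaimed.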

Therefore, to solve the object disappearance problem during concurrent scanning, it is enough to break either one of these two conditions. This leads to two solutions: Incremental Update and Snapshot At The Beginning (SATB).

Incremental update breaks the first condition. When a black object gains a new reference to a white object, the newly inserted reference is recorded. After the concurrent scan finishes, the collector scans again, using the black objects in the recorded references as roots. Put simply, once a black object gains a new reference to a white object, it turns back into a gray object.
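A minimal sketch of this idea, under the assumption that every reference insert goes through a barrier method: the class `IncrementalUpdateSketch`, `insertReference` and `rescanRecorded` are hypothetical names, and the final rescan is assumed to run while user threads are paused.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stepwise marker with an incremental-update write barrier.
final class IncrementalUpdateSketch {
    enum Color { WHITE, GRAY, BLACK }

    static final class Obj {
        final List<Obj> refs = new ArrayList<>();
        Color color = Color.WHITE;
    }

    private final Deque<Obj> gray = new ArrayDeque<>();
    private final List<Obj> recorded = new ArrayList<>();

    void shade(Obj obj) {
        if (obj.color == Color.WHITE) {
            obj.color = Color.GRAY;
            gray.add(obj);
        }
    }

    // Scans a single gray object; returns false when none is left.
    boolean step() {
        Obj current = gray.poll();
        if (current == null) {
            return false;
        }
        for (Obj child : current.refs) {
            shade(child);
        }
        current.color = Color.BLACK;
        return true;
    }

    // Write barrier for inserts: remember a black holder that gains a white target.
    void insertReference(Obj holder, Obj target) {
        holder.refs.add(target);
        if (holder.color == Color.BLACK && target.color == Color.WHITE) {
            recorded.add(holder);
        }
    }

    // Final pass: recorded black holders become gray again and are rescanned.
    void rescanRecorded() {
        for (Obj holder : recorded) {
            holder.color = Color.GRAY;
            gray.add(holder);
        }
        recorded.clear();
        while (step()) { }
    }

    // Returns the final color of c, which stays reachable from the root a.
    static Color run() {
        IncrementalUpdateSketch marker = new IncrementalUpdateSketch();
        Obj a = new Obj(), b = new Obj(), c = new Obj();
        a.refs.add(b);
        b.refs.add(c);
        marker.shade(a);
        marker.step();                  // a is black, b is gray, c is white
        marker.insertReference(a, c);   // black -> white insert is recorded
        b.refs.remove(c);               // gray -> white path is removed
        while (marker.step()) { }       // concurrent marking ends, c is still white
        marker.rescanRecorded();        // a is scanned again, so c is reached
        return c.color;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```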


SATB breaks the second condition. When a gray object is about to delete a reference to a white object, the reference being deleted is recorded. After the concurrent scan finishes, the collector scans again, using the gray objects in the recorded references as roots. Put simply, whether or not a reference is deleted, the search follows the object graph as it was in the snapshot taken at the moment the scan started.
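A minimal sketch of this idea, under the assumption that every reference deletion goes through a barrier method: the class `SatbSketch`, `deleteReference` and `rescanRecorded` are hypothetical names. For simplicity the sketch records every deleted reference, whatever the color of the holder, and follows the recorded targets in a final pass that is assumed to run while user threads are paused.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stepwise marker with a snapshot-at-the-beginning write barrier.
final class SatbSketch {
    enum Color { WHITE, GRAY, BLACK }

    static final class Obj {
        final List<Obj> refs = new ArrayList<>();
        Color color = Color.WHITE;
    }

    private final Deque<Obj> gray = new ArrayDeque<>();
    private final List<Obj> recorded = new ArrayList<>();

    void shade(Obj obj) {
        if (obj.color == Color.WHITE) {
            obj.color = Color.GRAY;
            gray.add(obj);
        }
    }

    // Scans a single gray object; returns false when none is left.
    boolean step() {
        Obj current = gray.poll();
        if (current == null) {
            return false;
        }
        for (Obj child : current.refs) {
            shade(child);
        }
        current.color = Color.BLACK;
        return true;
    }

    // Write barrier for deletes: remember the old target when it is unlinked.
    void deleteReference(Obj holder, Obj target) {
        if (holder.refs.remove(target)) {
            recorded.add(target);
        }
    }

    // Final pass: deleted references are followed as if they still existed.
    void rescanRecorded() {
        for (Obj oldTarget : recorded) {
            shade(oldTarget);
        }
        recorded.clear();
        while (step()) { }
    }

    // Returns the final color of c, which stays reachable from the root a.
    static Color run() {
        SatbSketch marker = new SatbSketch();
        Obj a = new Obj(), b = new Obj(), c = new Obj();
        a.refs.add(b);
        b.refs.add(c);
        marker.shade(a);
        marker.step();                  // a is black, b is gray, c is white
        a.refs.add(c);                  // black -> white insert, not tracked by SATB
        marker.deleteReference(b, c);   // gray -> white delete is recorded
        while (marker.step()) { }       // concurrent marking ends, c is still white
        marker.rescanRecorded();        // the snapshot edge b -> c is honored
        return c.color;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The object `c` would be marked even if `a` had never gained a reference to it; that is the floating garbage that this approach tolerates.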

Origin blog.csdn.net/m0_51545690/article/details/132762656