JVM garbage collector (1)

Table of contents

1. How to consider GC

2. How to determine if an object is "dead"

3. Generational collection theory

4. Garbage collection algorithm

5. HotSpot algorithm implementation details

1. How to consider GC

Garbage collection (GC) is older than Java: Lisp, created at MIT in 1960, was the first language to use dynamic memory allocation and garbage collection.

Three questions to answer when thinking about GC

Which memory needs to be reclaimed
When to reclaim it
How to reclaim it

1. Which memory needs to be recycled?

Thread-private areas: the program counter, the virtual machine stack, and the native method stack are created and destroyed with their thread. Stack frames are pushed and popped as methods are entered and exited, and the amount of memory each frame needs is essentially known once the class structure is determined. When the method or thread ends, this memory is reclaimed naturally, so these areas need little attention.

The Java heap and the method area, which are shared by all threads, are far less predictable: which objects are created, and how many, is only known at runtime and changes constantly. The garbage collector focuses on this part of memory, allocating and reclaiming it dynamically.

2. Conservative GC and accurate GC

How objects are accessed and located depends on the kind of GC the virtual machine uses:

Conservative GC: uses handle access
Accurate GC: uses direct pointer access

1. Conservative GC

During GC, reachability analysis is generally used to mark surviving objects, which means traversing the reference chains that start from the GC Roots.

To do this, the collector has to decide whether a value on the stack is a reference address or primitive data.

A conservative GC relies on two checks:

Alignment check: objects are allocated at aligned addresses (for example, on a 4-byte boundary in a 32-bit JVM). A value that does not satisfy the alignment cannot be a reference address.
Upper and lower boundary check: if the value passes the alignment check, check whether it falls within the address range of the heap. If it lies outside the heap, it is not a reference address.

However, if a value is correctly aligned and falls within the heap range, the JVM still cannot tell whether it really is a reference address. A GC built on this idea is called a "conservative GC".
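The two checks can be sketched in a few lines of Java. This is only an illustration: the heap bounds, the 8-byte alignment, and the names ConservativeScan and looksLikeReference are assumptions made for the example and do not come from any real JVM.

```java
final class ConservativeScan {
    // Hypothetical heap bounds and alignment, chosen only for illustration.
    static final long HEAP_START = 0x1000_0000L;
    static final long HEAP_END = 0x2000_0000L;   // exclusive
    static final long ALIGNMENT = 8;

    /** Returns true if a raw stack word could be a reference into the heap. */
    static boolean looksLikeReference(long word) {
        boolean aligned = word % ALIGNMENT == 0;                   // alignment check
        boolean inHeap = word >= HEAP_START && word < HEAP_END;    // boundary check
        return aligned && inHeap;
    }
}
```

A value such as 0x10000040 passes both checks even if it is really an integer that happens to look like an address, which is why the collector has to treat it conservatively.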

Characteristics of conservative GC:

It cannot accurately determine whether a value is primitive data or a reference address.
False references: when it cannot tell, it conservatively treats the value as a reference, so some objects appear to be referenced when they are not and are never cleaned up.
It needs a handle pool as an intermediate layer.

Why must conservative GC introduce handle pools?

For example, suppose a reference A points to an object B, and there happens to be a primitive value C whose value equals the address of B.

If the collector moves B, it considers both A and C to be references to B, so both values would have to be updated.

The problem is that C is primitive data and must not be modified.

The virtual machine cannot tell them apart, so a handle pool is introduced. Every reference points into the handle pool, and the actual object is found through its handle.

This way, moving an object only requires updating its address in the handle pool, and nothing on the stack needs to change. This eliminates the risk of incorrectly modifying primitive data.
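A minimal sketch of the handle pool idea follows. Addresses are simulated with long values, and the class and method names (HandlePool, register, resolve, relocate) are hypothetical, introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class HandlePool {
    // handles.get(h) holds the current (simulated) address of the object behind handle h.
    private final List<Long> handles = new ArrayList<>();

    /** Registers an object address and returns the handle that references should store. */
    int register(long address) {
        handles.add(address);
        return handles.size() - 1;
    }

    /** Follows a handle to the current address of the object. */
    long resolve(int handle) {
        return handles.get(handle);
    }

    /** Called when the collector moves an object: only the pool entry changes. */
    void relocate(int handle, long newAddress) {
        handles.set(handle, newAddress);
    }
}
```

Because stack slots store handles rather than raw addresses, relocating B never rewrites a stack word, so a primitive value that merely equals the old address of B is left untouched.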

JDK 1.0 used handles to locate objects.

2. Accurate GC

The defining feature of accurate GC is that it knows exactly whether a value on the stack is primitive data or a reference address.

How is this done?

The type of each variable is known at compile time, so the type information can be recorded and stored in an OopMap.
Later, the collector consults the OopMap to know how to interpret each value.

Therefore, accurate GC can locate objects directly through pointers, which makes object access more efficient.

2. How to determine if an object is "dead"

Almost all object instances are stored in the heap. The garbage collector must first determine which objects in the heap are still "alive" (in use) and which are "dead" (can no longer be used by any means).

1. Reference counting algorithm

The reference counting algorithm is simple in principle and efficient at judging liveness, but no mainstream JVM uses it.

Reference counting attaches a counter to each object. Whenever a new reference to the object is created, the counter is incremented; when a reference goes away, the counter is decremented. Once the counter reaches 0, the object is dead.

Advantages:

The cost of memory management is amortized over every reference operation while the program runs.
Memory can be managed without knowing runtime details of the objects: there is no need to know where each object is, only to check its reference count.

Disadvantages:

Circular references between objects are hard to handle
- As long as two objects reference each other, neither counter ever reaches 0, even though both are disconnected from the program and should be reclaimed.
Thread safety must be guaranteed
- In a multi-threaded environment, several threads operate on the same objects, so updates to a reference count must be atomic and visible.
Each object needs extra memory to store its reference count
Reclaiming a large data structure can still cause a long pause, because a single release may cascade through a very large number of objects.
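The circular reference problem can be reproduced with a toy reference counter. The sketch below is an assumption-laden illustration: the counts are maintained by hand through hypothetical helpers (addRef, release, setField), which is not how any JVM works internally.

```java
final class RefCountDemo {
    static final class Node {
        int refCount;   // number of references currently pointing at this node
        Node field;     // a single outgoing reference, for simplicity
    }

    static void addRef(Node n) { n.refCount++; }
    static void release(Node n) { n.refCount--; }

    /** Simulates "holder.field = target" with count maintenance. */
    static void setField(Node holder, Node target) {
        if (target != null) addRef(target);
        if (holder.field != null) release(holder.field);
        holder.field = target;
    }

    /** Builds a <-> b, drops both external references, and returns the final counts. */
    static int[] cycleCounts() {
        Node a = new Node();
        Node b = new Node();
        addRef(a);          // a local variable references a
        addRef(b);          // a local variable references b
        setField(a, b);     // a.field = b
        setField(b, a);     // b.field = a
        release(a);         // the local variables go out of scope
        release(b);
        return new int[] { a.refCount, b.refCount };
    }
}
```

cycleCounts() returns {1, 1}: both objects are unreachable from the program, yet neither count drops to 0, so a pure reference counting collector would never reclaim them.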

2. Reachability analysis algorithm

The memory management subsystems of current mainstream commercial programming languages (such as Java and C#) all use reachability analysis to determine whether objects are alive or dead.

The basic idea is to take a set of root objects called "GC Roots" as the starting nodes and search downward from them along reference relationships. The path traveled by the search is called a "reference chain".

If no reference chain connects an object to the GC Roots (that is, the object is unreachable from them), the object can no longer be used and is eligible to be reclaimed.
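Reachability analysis is essentially a graph traversal. The following sketch models objects as strings and references as a map of outgoing edges; the class name Reachability and this representation are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class Reachability {
    /** Returns every object reachable from the roots by following the edges. */
    static Set<String> reachable(Set<String> roots, Map<String, List<String>> edges) {
        Set<String> marked = new HashSet<>(roots);
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String current = work.poll();
            for (String next : edges.getOrDefault(current, List.of())) {
                if (marked.add(next)) {   // add() returns false if already marked
                    work.add(next);
                }
            }
        }
        return marked;
    }
}
```

With edges root -> a -> b and an isolated cycle x <-> y, only root, a, and b are marked. Unlike reference counting, the cycle is correctly identified as garbage because nothing connects it to a root.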

1. Which objects can be used as GC Roots

Simply put, GC Roots all satisfy one condition: they cannot be reclaimed, either right now or ever.

References held inside the JVM (for example: Class objects corresponding to primitive types, resident exception objects, the system class loader)
Objects referenced from the local variable table of a stack frame
Objects referenced by static reference-type fields of classes in the method area
Objects referenced by constants in the method area (for example: references in the string constant pool)
Objects referenced by native methods (JNI) in the native method stack
Objects held by a synchronization lock (synchronized keyword)
During generational collection, cross-generation references from other areas into the area being collected must also be considered

In addition, depending on the garbage collector the user selects and the memory area currently being collected, other objects can be added temporarily to form the complete GC Roots set.

For example, with generational collection and partial collection, when only one area of the Java heap is collected, objects in that area may be referenced by objects in other areas. Those objects in the associated areas must be added to the GC Roots set so that objects still in use are not reclaimed.

To keep the GC Roots set from growing too large, different garbage collectors apply their own optimizations.

3. What is a reference?

Whether reference counting or reachability analysis is used, an object's liveness is decided by "whether the object is referenced".

1. Early definition

Before JDK 1.2, the definition of a reference was very narrow: if the value stored in reference-type data represents the starting address of another block of memory, that data is said to be a reference to that block of memory or object.

Under this definition an object has only two states: "referenced" and "unreferenced". This is reasonable, but not flexible enough.

Limitations of the early definition

In many scenarios (caches, for example) we want the virtual machine to treat some objects specially:

If memory is sufficient, keep them.
If memory is still tight after a garbage collection, reclaim them to free up space.

2. Four reference states

Strong Reference
- The most traditional kind of reference: an ordinary reference assignment in program code.
- As long as a strong reference to an object exists, the garbage collector will never reclaim it.
Soft Reference
- Describes objects that are useful but not essential
- Represented in Java by the java.lang.ref.SoftReference class
- Before the system throws an OutOfMemoryError, softly referenced objects are included in a second round of collection. Only if memory is still insufficient after that is the OutOfMemoryError actually thrown.
Weak Reference
- Similar to a soft reference, but weaker. An object that is only weakly referenced survives only until the next garbage collection.
- When the garbage collector runs, objects that are only weakly referenced are reclaimed regardless of whether memory is sufficient.
- A weak reference can be used together with a reference queue (ReferenceQueue). When the object it refers to is garbage collected, the Java virtual machine adds the weak reference to the associated queue.
- Represented in Java by the java.lang.ref.WeakReference class
Phantom Reference
- A phantom reference must be used together with a reference queue (ReferenceQueue)
- A phantom reference does not affect the lifetime of the object, and its get() method always returns null.
- When the collector determines that an object with a phantom reference is about to be reclaimed, it adds the phantom reference to the associated reference queue.
- By checking whether the phantom reference has been enqueued, the program can learn that the referenced object is being garbage collected and take action.
- Represented in Java by the java.lang.ref.PhantomReference class
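The snippet below shows how the reference classes are constructed and what get() returns while the object is still strongly reachable. It deliberately avoids calling System.gc(), because collection timing is not deterministic; the class name ReferenceKinds and the method whileStronglyHeld are made up for this example.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class ReferenceKinds {
    /** Returns {soft.get() == value, weak.get() == value, phantom.get() == null}. */
    static boolean[] whileStronglyHeld() {
        Object value = new Object();                        // strong reference
        ReferenceQueue<Object> queue = new ReferenceQueue<>();

        SoftReference<Object> soft = new SoftReference<>(value);
        WeakReference<Object> weak = new WeakReference<>(value, queue);       // queue is optional here
        PhantomReference<Object> phantom = new PhantomReference<>(value, queue);

        // While the strong reference is alive, soft and weak references still see the object.
        // A phantom reference never hands out its referent.
        return new boolean[] {
            soft.get() == value,
            weak.get() == value,
            phantom.get() == null
        };
    }
}
```

All three entries are true. What differs between the reference kinds is what happens after the last strong reference disappears, which depends on when the collector runs.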

4. Must an unreachable object die?

An object found unreachable by reachability analysis is not reclaimed immediately; it is "on probation" for a while.

To truly declare an object dead, it must go through two marking passes:

First pass: reachability analysis finds that the object has no reference chain connecting it to the GC Roots.
Second pass: the object is screened to decide whether its finalize() method needs to be executed. Executing finalize() is the object's last chance to escape death.

finalize()

If the object does not override finalize(), or its finalize() method has already been called by the virtual machine, then "there is no need to execute finalize()" and the object is reclaimed.

If the object is judged to "need to execute finalize()", it is placed in a queue called the F-Queue, and its finalize() method is later run by a Finalizer thread. This thread is created by the virtual machine and has a low scheduling priority.

The virtual machine only promises to start each finalize() method; it does not promise to wait for it to finish. If a finalize() method ran slowly or looped forever and the virtual machine waited for it, the other objects in the F-Queue could wait forever, and the whole memory reclamation subsystem could even collapse.

After finalize() has run, the collector marks the objects in the F-Queue a second time. If an object re-attached itself to a reference chain during finalize(), it survives.

An object gets this chance to save itself only once, because the virtual machine automatically calls an object's finalize() method at most once.

(Using this method is strongly discouraged, and finalize() has been deprecated since JDK 9.)
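The two-pass process described above can be simulated without relying on real finalization, whose timing is unpredictable. Everything in this sketch is a model made up for illustration: reachability is a boolean field, the finalizer is a Runnable, and one call to collect() compresses both marking passes into a single cycle.

```java
import java.util.ArrayList;
import java.util.List;

final class TwoPassMarking {
    static final class Obj {
        final boolean overridesFinalize;
        boolean finalizeCalled;          // finalize() runs at most once per object
        boolean reachable;               // result of the latest reachability analysis
        Runnable finalizer = () -> { };  // stands in for the body of finalize()

        Obj(boolean overridesFinalize) {
            this.overridesFinalize = overridesFinalize;
        }
    }

    /** Runs one simulated collection cycle and returns the objects that were reclaimed. */
    static List<Obj> collect(List<Obj> heap) {
        List<Obj> fQueue = new ArrayList<>();
        List<Obj> reclaimed = new ArrayList<>();
        for (Obj o : heap) {                         // first mark
            if (o.reachable) continue;
            if (o.overridesFinalize && !o.finalizeCalled) {
                fQueue.add(o);                       // needs finalize(): gets one chance
            } else {
                reclaimed.add(o);                    // no finalize() to run: dies now
            }
        }
        for (Obj o : fQueue) {                       // the Finalizer thread's work
            o.finalizeCalled = true;
            o.finalizer.run();
        }
        for (Obj o : fQueue) {                       // second mark
            if (!o.reachable) reclaimed.add(o);
        }
        heap.removeAll(reclaimed);
        return reclaimed;
    }
}
```

An object whose finalizer makes it reachable again survives the first cycle. If it becomes unreachable a second time, it is reclaimed directly, because its finalizer has already run.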

5. Recycling method area

The virtual machine specification does not require a virtual machine to implement garbage collection in the method area.

Garbage collection in the method area has a relatively poor payoff: a single collection of the heap, especially of the new generation, can typically reclaim 70% to 99% of the space, whereas the method area yields far less.

Garbage collection in the method area reclaims two kinds of content:

Obsolete constants
Types (classes) that are no longer used

How to recycle

An obsolete constant is easy to detect: as long as it is not referenced anywhere, it can be reclaimed.

Unloading a type is harder and requires three conditions to hold at the same time:

All instances of the class have been reclaimed
The class loader that loaded the class has been reclaimed (hard to satisfy in practice)
The Class object of the class is not referenced anywhere, so the class cannot be accessed through reflection

Why method area collection is still necessary

In scenarios that make heavy use of reflection, dynamic proxies, or bytecode frameworks such as CGLib, and in scenarios that frequently use custom class loaders, the virtual machine needs to be able to unload types. Otherwise the method area may run out of memory.

3. Generational collection theory

1. The first two generational hypotheses

Generational collection theory is built on two generational hypotheses:

Weak Generational Hypothesis: the vast majority of objects die young.
Strong Generational Hypothesis: the more garbage collections an object has survived, the less likely it is to die.

Note that one more important hypothesis has not been mentioned yet; it comes up later with cross-generation references.

2. The practical significance of the hypothesis

These two hypotheses establish a design principle shared by the commonly used garbage collectors:

The collector should divide the Java heap into different areas and place objects into them according to their age (the number of garbage collections they have survived).
- If most objects in an area die young, keep them together and collect that area frequently.
- If the objects in an area are hard to kill, keep them together and collect that area less often.

This balances the time cost of garbage collection against the effective use of memory.

3. Improvement of generational collection theory

Once the Java heap is divided into areas, the garbage collector can collect just one area or a few areas at a time. This gives rise to different collection types:

Minor GC: collects only the new generation
Major GC: collects only the old generation
Full GC: collects the entire heap

Each area is also paired with a garbage collection algorithm that suits it: "mark-copy", "mark-sweep", or "mark-compact".

4. New generation and old generation

Commercial JVMs that apply generational collection theory divide the Java heap into at least two parts:

Young Generation
Old Generation

The idea behind this split is that a large number of objects die in every new generation collection, and the few objects that survive each collection are gradually promoted to the old generation.

5. Generational collection and cross-generational reference

Generational collection is not as simple as dividing the heap into areas and collecting them separately, because objects are not isolated: there may be cross-generation references between them.

Taking this into account, a new generation GC would have to traverse the entire old generation to find the objects referenced from it, which is very time-consuming.

So a third hypothesis was added, the Intergenerational Reference Hypothesis: cross-generation references are a tiny minority compared with same-generation references.

This hypothesis is not made up out of thin air. It is an implicit corollary of the first two: two objects that reference each other tend to live and die together.

For example, if a new generation object is referenced by an old generation object, the old generation object is hard to kill, so the new generation object survives every collection. After a few collections it is promoted to the old generation, and the cross-generation reference disappears.

6. Garbage collection under cross-generation references

The theory tells us that cross-generation references are relatively rare.

Remembered Set

Instead of scanning the entire old generation for a handful of cross-generation references, a global data structure called a Remembered Set can be set up on the new generation.

It divides the old generation into small blocks and records which blocks contain cross-generation references. When a new generation GC occurs, only those blocks need to be traversed.

The old generation objects in those blocks are added to the GC Roots for reachability analysis, so that the new generation objects they reference are not reclaimed.

This is clearly cheaper than recording every single reference or traversing the entire old generation.

Card Table

The solution HotSpot uses is a technique called the Card Table.

It divides the heap into cards of 512 bytes each and maintains a card table that stores one flag per card.

The flag indicates whether the corresponding card may contain a reference to a new generation object. If it may, the card is considered dirty.

During a new generation GC, the collector looks up the dirty cards in the card table and adds the objects in those cards to the GC Roots of the Minor GC.

After all dirty cards have been scanned, the Java virtual machine clears their flags.

Note that a new generation GC copies objects, so their addresses change. The flag of the card that holds each reference must be updated accordingly, so that any card still holding a reference to a new generation object remains marked dirty.
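A card table boils down to an array indexed by address >> 9. The sketch below uses simulated addresses; the class name CardTable, the CLEAN and DIRTY values, and the idea of calling markDirty from a write hook are assumptions for illustration and do not reproduce HotSpot's actual encoding.

```java
import java.util.Arrays;

final class CardTable {
    static final int CARD_SHIFT = 9;                  // 2^9 = 512-byte cards
    static final byte CLEAN = 0;                      // illustrative values only
    static final byte DIRTY = 1;

    private final long heapStart;
    private final byte[] cards;

    CardTable(long heapStart, long heapSize) {
        this.heapStart = heapStart;
        this.cards = new byte[(int) ((heapSize + 511) >>> CARD_SHIFT)];
    }

    /** Maps a (simulated) heap address to the index of the card that covers it. */
    int cardIndex(long address) {
        return (int) ((address - heapStart) >>> CARD_SHIFT);
    }

    /** Hypothetical hook, called after a reference field at fieldAddress is written. */
    void markDirty(long fieldAddress) {
        cards[cardIndex(fieldAddress)] = DIRTY;
    }

    boolean isDirty(int index) {
        return cards[index] == DIRTY;
    }

    /** Resets every card once the dirty ones have been scanned. */
    void clearAll() {
        Arrays.fill(cards, CLEAN);
    }
}
```

With a heap starting at address 0, a write at address 1030 dirties card 2 (1030 >> 9 = 2), so a Minor GC only needs to scan that one 512-byte range of the old generation instead of all of it.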

7. Garbage collection for different generations

Partial GC: the target is not the entire Java heap. It comes in the following forms:
- Minor GC / Young GC: collects only the new generation (very common)
- Major GC / Old GC: collects only the old generation (only the CMS collector collects the old generation on its own)
- Mixed GC: collects the entire new generation and part of the old generation (only the G1 collector does this)
Full GC: collects the entire Java heap and the method area

8. Dynamic generational age judgment

The age threshold at which an object is promoted to the old generation is set with the -XX:MaxTenuringThreshold parameter. The default is 15.

HotSpot does not always wait for that age, though. When it traverses the surviving objects, it accumulates their sizes in ascending order of age.

When the accumulated size at some age exceeds half of the Survivor space, it takes the smaller of that age and MaxTenuringThreshold as the new promotion age threshold.

Objects whose age reaches or exceeds this new threshold are then promoted to the old generation, freeing Survivor space early.
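The threshold calculation can be written down as a short loop. This is a simplified sketch of the rule as described here, not HotSpot's source: the class name TenuringThreshold, the sizeByAge array, and the fixed "half of the Survivor space" target are assumptions of the example.

```java
final class TenuringThreshold {
    /**
     * sizeByAge[i] is the total size of surviving objects of age i (index 0 is unused).
     * Returns the age at which objects should be promoted.
     */
    static int compute(long[] sizeByAge, long survivorCapacity, int maxTenuringThreshold) {
        long desired = survivorCapacity / 2;      // half of the Survivor space
        long total = 0;
        for (int age = 1; age < sizeByAge.length; age++) {
            total += sizeByAge[age];              // accumulate in ascending order of age
            if (total > desired) {
                return Math.min(age, maxTenuringThreshold);
            }
        }
        return maxTenuringThreshold;              // never exceeded: keep the configured maximum
    }
}
```

For a Survivor space of 100 units holding 30 units each at ages 1, 2, and 3, the running total passes 50 at age 2, so compute(new long[] {0, 30, 30, 30}, 100, 15) returns 2.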

4. Garbage collection algorithm

1 Overview

From the perspective of how object death is determined, garbage collection algorithms fall into two categories:

Reference counting GC (direct garbage collection): not used by mainstream JVMs
Tracing GC (indirect garbage collection): the common garbage collection algorithms all belong to this category

2. Mark-sweep algorithm

The mark-sweep algorithm is the earliest and most basic GC algorithm.

It has two phases: mark and sweep.

It first marks all live objects, and then reclaims all unmarked objects in one pass.

Drawbacks of mark-sweep

Unstable execution efficiency: if the Java heap contains a large number of objects and most of them need to be reclaimed, a large amount of marking and sweeping work is required, so both phases become slower as the number of objects grows.
Memory fragmentation: marking and sweeping leaves many non-contiguous memory fragments. When a large object has to be allocated later and no contiguous block is big enough, another garbage collection is triggered early.

Most later garbage collection algorithms build on mark-sweep and improve on its drawbacks.
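The fragmentation problem is easy to see on a toy heap. In this sketch the heap is an array of equally sized slots and the mark phase is assumed to have already produced the set of live objects; the class and method names are hypothetical.

```java
import java.util.Set;

final class MarkSweepHeap {
    /** Sweep phase: every slot whose object was not marked becomes free (null). */
    static void sweep(String[] slots, Set<String> marked) {
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] != null && !marked.contains(slots[i])) {
                slots[i] = null;
            }
        }
    }

    /** Total number of free slots. */
    static int freeSlots(String[] slots) {
        int free = 0;
        for (String s : slots) {
            if (s == null) free++;
        }
        return free;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(String[] slots) {
        int best = 0;
        int run = 0;
        for (String s : slots) {
            run = (s == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }
}
```

Sweeping the heap {a, b, c, d, e, f} with a, c, and e marked frees three slots, but the largest contiguous run is a single slot. An object that needs two adjacent slots cannot be allocated even though half the heap is free.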

3. Mark-copy algorithm

1. The original copying algorithm

To address the poor efficiency of mark-sweep when a large number of objects must be reclaimed, the copying algorithm works as follows:

"Semispace copying": divide the available memory into two halves of equal size and use only one of them at a time.
When that half is used up, copy the objects that are still alive to the other half, and then clear the used half in one go.

Advantages and Disadvantages

Advantages:

If most objects in memory can be reclaimed, only a few survivors need to be copied, which is very efficient.
An entire half is reclaimed each time, so there is no fragmentation.

Disadvantages:

If most objects in memory survive, a large amount of copying is needed, which is inefficient.
The usable memory is cut to half of the total, which wastes too much space.

2. Eden and Survivor

Most JVMs use a copying algorithm to collect the new generation.

IBM research found that 98% of new generation objects do not survive their first garbage collection, so there is no need to split the memory 1:1.

Instead, the new generation is divided into one larger Eden space and two smaller Survivor spaces.

Memory is allocated only from Eden and one of the Survivor spaces. When a garbage collection occurs, the surviving objects in Eden and that Survivor space are copied to the other Survivor space, and then Eden and the Survivor space just used are cleared directly.

HotSpot's default Eden to Survivor ratio is 8:1, so the usable space in the new generation is 90% of its total capacity (80% for Eden plus 10% for one Survivor space). Only one Survivor space, 10% of the new generation, is kept in reserve as the copy target, which is an acceptable cost.

If too many objects survive to fit in the Survivor space, other memory areas (in practice, the old generation) are used as an "allocation guarantee".

3. Details of new generation garbage collection

When the JVM triggers a Minor GC, the surviving objects in Eden and in the Survivor "from" space are copied to the Survivor "to" space.
The from and to pointers are then swapped, so that the Survivor space pointed to by to is empty again at the next Minor GC.
The Java virtual machine records how many times each object has been copied between the Survivor spaces.
- When an object has been copied 15 times (the default threshold), it is promoted to the old generation.
- In addition, if a single Survivor space is already 50% occupied, the objects that have been copied the most times are promoted to the old generation early.
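The copy, age, and swap steps above can be modeled with a few lists. This is a deliberately simplified sketch: capacity is counted in objects rather than bytes, liveness is supplied by the caller as a predicate, and the class name YoungGenSketch and its fields are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class YoungGenSketch {
    static final class Obj {
        final String name;
        int age;                                   // number of Minor GCs survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int survivorCapacity;                    // counted in objects, for simplicity
    final int tenuringThreshold;

    YoungGenSketch(int survivorCapacity, int tenuringThreshold) {
        this.survivorCapacity = survivorCapacity;
        this.tenuringThreshold = tenuringThreshold;
    }

    void minorGc(Predicate<Obj> isLive) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!isLive.test(o)) continue;         // dead objects are simply not copied
            o.age++;
            if (o.age >= tenuringThreshold || to.size() >= survivorCapacity) {
                old.add(o);                        // old enough, or "to" is full
            } else {
                to.add(o);                         // copy into the empty Survivor space
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;                      // swap roles: "to" is empty again
        from = to;
        to = tmp;
    }
}
```

With a Survivor capacity of 2 and a threshold of 3, a first Minor GC over four Eden objects (one of them dead) leaves two survivors in from and sends the third to the old generation because the Survivor space is full. Two more collections age the remaining two until they are promoted as well.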

4. Allocation guarantee mechanism

Because a small memory area is used as the copy target, it may not be able to hold all the surviving objects. In that case, space in the old generation is used as an allocation guarantee.

Concretely, the surviving objects that do not fit are promoted directly to the old generation.

Before JDK 6 Update 24:
- Before a Minor GC, the virtual machine checks whether the largest contiguous free space in the old generation is greater than the total size of all objects in the new generation. If it is, the Minor GC is guaranteed to be safe.
- If it is not, the virtual machine checks whether the "allow guarantee failure" parameter (-XX:HandlePromotionFailure) is enabled.
- If the parameter is enabled, the virtual machine checks whether the largest contiguous free space in the old generation is greater than the average size of objects promoted in previous collections. If it is, a Minor GC is attempted, although it carries some risk; if it is not, a Full GC is performed.
- If the parameter is not enabled, a Full GC is performed.
From JDK 6 Update 24 onward:
- As long as the largest contiguous free space in the old generation is greater than the total size of the new generation objects or the average size of previous promotions, a Minor GC is performed; otherwise a Full GC is performed.
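The later, simpler rule amounts to a single boolean expression. The sketch below encodes it with hypothetical names (PromotionGuarantee, decide, and the three size parameters); it illustrates the decision only and is not the virtual machine's own code.

```java
final class PromotionGuarantee {
    enum Decision { MINOR_GC, FULL_GC }

    /**
     * Decides which collection to run, given the largest contiguous free space in the
     * old generation, the total size of new generation objects, and the average size
     * promoted by previous collections.
     */
    static Decision decide(long oldGenMaxContiguousFree, long newGenUsed, long avgPromotedSize) {
        boolean safe = oldGenMaxContiguousFree > newGenUsed
                || oldGenMaxContiguousFree > avgPromotedSize;
        return safe ? Decision.MINOR_GC : Decision.FULL_GC;
    }
}
```

For example, with 100 units free in the old generation, 200 units in use in the new generation, and an average promotion of 50 units, a Minor GC is attempted. If the average promotion were 150 units, a Full GC would be chosen instead.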

4. Mark-compact algorithm

In the old generation, a high proportion of objects survive each collection, so an algorithm that copies survivors is a poor fit. There is also no additional memory space that could act as an allocation guarantee for the old generation.

The mark-compact algorithm was proposed for the old generation. It marks live objects in the same way, but instead of reclaiming dead objects in place, it moves all surviving objects toward one end of the memory and then clears everything beyond the boundary, which leaves the memory compacted.

Moving live objects has both advantages and disadvantages:

Disadvantages:

A large number of objects survive each old generation collection, and moving them all is time-consuming.
Moving an object changes its address, so every reference to it must be updated, which is also expensive.
Moving objects requires pausing the user application for the whole operation (collectors that use read barriers can avoid this)

Advantages:

No memory fragmentation

The only difference from mark-sweep is how objects are handled after marking: mark-sweep reclaims dead objects in place, whereas mark-compact moves the live ones.
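A sliding compaction over a toy heap shows both the benefit and the extra bookkeeping. As in any such model, this is a sketch under assumptions: slots are equally sized, the live set is given, and the returned forwarding map stands in for the reference updates a real collector would have to perform.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class MarkCompactHeap {
    /** Slides marked objects to the low end of the heap and returns their new slot indexes. */
    static Map<String, Integer> compact(String[] slots, Set<String> marked) {
        Map<String, Integer> forwarding = new LinkedHashMap<>();
        int free = 0;                                   // next free slot at the low end
        for (int i = 0; i < slots.length; i++) {
            String obj = slots[i];
            if (obj != null && marked.contains(obj)) {
                slots[free] = obj;                      // move the object (no-op when i == free)
                forwarding.put(obj, free);              // references must be updated to this index
                free++;
            }
        }
        for (int i = free; i < slots.length; i++) {
            slots[i] = null;                            // everything past the boundary is free
        }
        return forwarding;
    }
}
```

Compacting {a, b, c, d, e, f} with a, c, and e marked yields {a, c, e, free, free, free}: the three free slots are now contiguous, at the cost of moving c and e and updating every reference to them.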

Moving objects makes collection more expensive, but allocation and access stay simple. Not moving them makes collection cheaper, but memory becomes fragmented, and handling fragmented memory requires a more complicated allocation design. Because allocation and access are the most frequent operations, that extra cost lowers overall efficiency.

A compromise is to use mark-sweep most of the time and tolerate fragmentation, and to run mark-compact only when fragmentation starts to affect object allocation, in order to obtain contiguous memory again. The CMS collector uses this strategy.

5. Summary

Mark-sweep:
- Mark live objects, then reclaim the unmarked ones.
- The drawback is heavy memory fragmentation. When a larger object has to be allocated later, the next garbage collection may be triggered early.
Mark-copy:
- Divide the heap into two parts and use only one at a time.
- Mark the surviving objects, copy them to the other half, and clear the whole of the previously used half.
- The advantage is that fragmentation is not a concern and memory can be allocated sequentially.
- The drawback is poor memory utilization. The heap is split in half so that the other half can always hold the survivors, no matter how many there are.
- The idea behind Eden and the Survivor spaces is that if a Survivor space cannot hold the survivors, the old generation acts as a guarantee. This lets the Survivor spaces be small, which improves memory utilization.
Mark-compact:
- Mark all live objects, move them to one end of the memory, and clear the memory beyond the boundary.
- The advantage is contiguous free memory.
- The drawback is that moving is time-consuming when many objects survive.

6. What algorithm is generally used in the new generation and the old generation?

The new generation generally uses the mark-copy algorithm, and the old generation generally uses the mark-sweep and mark-compact algorithms.

1. Why does the new generation not use mark-sweep?

In the new generation, a large number of objects die in each garbage collection and only a few survive.

Reclaiming a large number of dead objects one by one is certainly less efficient than copying a small number of survivors.

Mark-sweep also produces a lot of memory fragmentation. The new generation allocates memory for new objects very frequently, so heavy fragmentation would trigger garbage collection again and again.

Moreover, the improved copying algorithm keeps the Survivor spaces small, so memory utilization is not low. If space runs out during copying, the old generation acts as the guarantee.

2. Why does the old generation not rely on mark-sweep alone?

The old generation has many surviving objects, so marking and then sweeping only the dead ones is actually efficient during collection.

However, mark-sweep produces a lot of memory fragmentation. Allocation then has to go through a free list, which slows down normal memory allocation.

This is why mark-compact is used even though it is slower during GC: it yields contiguous memory, which makes day-to-day allocation simpler and faster.

3. Why does the old generation not use the copying algorithm?

On the one hand, dividing the old generation into two equal halves is out of the question, because it would waste far too much memory. The new generation layout with a large Eden and small Survivor spaces cannot be used either, because there is no additional space to act as an allocation guarantee.

On the other hand, the survival rate of old generation objects is high, so copying is inefficient.

So the old generation is collected with the mark-sweep or mark-compact algorithm.

5. HotSpot algorithm implementation details

1. Root node enumeration and OOPMap

Before the virtual machine performs a GC, it must first mark the surviving objects through reachability analysis, which has two stages:

Root node enumeration, to determine all GC Roots
Tracing the reference chains from the root nodes

Root node enumeration requires STW

Every garbage collector must pause user threads (Stop The World, STW) while enumerating the root nodes. The purpose is to guarantee consistency so that the collection is accurate.
The references held by the root nodes change constantly while user threads run, so the analysis must work on a snapshot frozen at one point in time.
Enumerating the roots takes time, and the time spent with user threads paused should be kept as short as possible.

How to shorten the STW time of root node enumeration

The fixed nodes that can serve as GC Roots fall into two main categories:

Global references (constants, static fields of classes)
Execution contexts (references in the local variable tables of stack frames)

The naive approach is to scan the whole method area and all stacks to find these references, which makes the STW pause very long.

A better idea is to trade space for time, which is the idea behind OopMap.

OopMap

Record which locations hold references in a dedicated lookup structure, so that during GC the collector can read it directly instead of scanning area by area.

HotSpot does this with a set of data structures called OopMaps.

Once class loading completes, HotSpot records which type of data sits at which offset within objects of that class. During just-in-time compilation, it also records which stack locations and registers hold references at specific positions in the code.

This way, root node enumeration can be completed quickly from the OopMaps during GC, which shortens the STW time.

2. Safe point

With the help of OopMaps, HotSpot can enumerate the root nodes quickly.

However, many operations in a running program change reference relationships, and each such change would require an updated OopMap; otherwise the GC would work from stale information.

There are a great many such operations. Generating an OopMap for every one of them would cost far too much extra space.

HotSpot does not do this. It only records this information at specific positions called "safe points".

This means that a user thread can only be paused for garbage collection once it has reached a safe point. The OopMap for that safe point is then used to enumerate the root nodes and start the collection.

When the Java virtual machine receives a Stop-the-world request, it waits for all threads to reach a safe point before letting the thread that requested the Stop-the-world do its exclusive work.

A thread stopped at a safe point executes no bytecode.

Where are safe points placed?

Safe points must not be too sparse, or the collector would wait too long before it can start.
They must not be too dense either, or the cost of maintaining the extra OopMaps would be very high, and GC does not need to run that frequently anyway.

Safe points are selected by asking:

Does this instruction have the characteristic of letting the program run for a long time?
The most obvious form of "running for a long time" is the reuse of instruction sequences, such as method calls, loop jumps, and exception jumps. Only instructions of this kind generate safe points.

How safepoints work

How does the virtual machine make all user threads run to the nearest safe point and stop there when a GC occurs?

There are two ways to pause user threads:

Preemptive interruption
- When a GC occurs, the system first interrupts all user threads.
- If some thread turns out to have been interrupted somewhere other than a safe point, it is resumed, runs to a safe point, and is interrupted again.
Active interruption
- Threads are not interrupted directly. Instead, a global flag is set.
- Each user thread polls this flag as it runs. When it finds the flag set to true, it suspends itself at the nearest safe point.

Comparison of the two approaches:

Preemptive interruption performs unpredictably: the time needed to pause the user threads is hard to control, and threads may be interrupted repeatedly, which is inefficient. Almost no virtual machine is designed this way.
Active interruption is the more reasonable design, and it is what virtual machines use.
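Active interruption can be imitated at the Java level with a volatile flag that a worker thread polls at the end of each loop iteration. This is only an analogy: a real virtual machine inserts the poll into compiled code and parks the thread, whereas this sketch (class SafepointPolling, flag pauseRequested, both hypothetical) simply lets the worker return when it sees the flag.

```java
import java.util.concurrent.CountDownLatch;

final class SafepointPolling {
    // Global flag set by the (simulated) VM when it wants user threads to pause.
    static volatile boolean pauseRequested;

    /** Starts a worker, requests a pause, and returns how many loop iterations the worker ran. */
    static long runUntilPaused() {
        pauseRequested = false;
        long[] iterations = {0};
        CountDownLatch atSafepoint = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            while (true) {
                iterations[0]++;                  // stands in for ordinary user code
                if (pauseRequested) {             // poll at the loop back-edge
                    atSafepoint.countDown();      // report that a safe point was reached
                    return;                       // the sketch ends the thread instead of parking it
                }
            }
        });
        worker.start();
        pauseRequested = true;                    // the VM requests the pause
        try {
            atSafepoint.await();                  // the VM waits for the thread to stop
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return iterations[0];
    }
}
```

The worker is never interrupted from outside. It notices the request by itself, and only at the one place in its loop where it polls, which is why the placement of safe points determines how quickly a pause can begin.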

3. Safe area

1. Why is a safe area needed?

With active interruption, user threads normally suspend themselves when they reach a safe point.

However, some user threads are not running when a GC starts, for example because they are blocked or suspended. A thread that is not running cannot poll the flag, so it cannot suspend itself at a safe point.

This causes a problem: such a thread might resume after the GC has started root node enumeration and modify references, making the collection inaccurate.

Safe areas are introduced to solve this problem.

A safe area is a stretch of code in which reference relationships are guaranteed not to change, so it is safe to start a garbage collection at any point within it.

In other words, a safe area is like a safe point stretched out over a range of code.

2. How the safe area works

When a user thread enters the code of a safe area, it first marks itself as having entered the safe area.
If the virtual machine starts a GC after that, it sets the global interruption flag and does not need to wait for threads that have marked themselves as being in a safe area.
When a thread is about to leave the safe area, it first checks the interruption flag.
The same applies to a thread that was blocked or suspended inside a safe area: when it resumes running, it first checks the interruption flag.
- If the flag is true, a GC is about to start or is in progress, so the thread keeps waiting.
- If the flag is false, the thread resumes normal execution.