Java GC - Principle of Garbage Collector

Java can allocate objects on the heap at a speed comparable to stack allocation in other languages.

 

Think of the C++ heap as a yard in which each object stakes out its own plot. Objects are destroyed after a while, and their plots must then be found and reused. In the JVM the heap works differently: it is more like a conveyor belt that moves forward by one slot each time a new object is allocated, which makes allocation very fast. The Java "heap pointer" is simply advanced to the next unallocated region, so allocation is about as cheap as stack allocation in C++. In practice there is a small amount of extra bookkeeping overhead, but it is minor compared to searching for free space.
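The "conveyor belt" idea can be sketched as a bump-pointer allocator. This is only an illustration under simplifying assumptions: the class name `BumpAllocator`, the byte-array heap, and the `-1` failure value are made up for the example and are not how any particular JVM is implemented.

```java
/** Toy bump-pointer allocator over a fixed-size byte array (illustration only). */
final class BumpAllocator {
    private final byte[] heap;   // stands in for the managed heap
    private int top = 0;         // the "heap pointer": first unallocated offset

    BumpAllocator(int capacity) {
        heap = new byte[capacity];
    }

    /** Returns the start offset of the new object, or -1 if the heap is full. */
    int allocate(int size) {
        if (size < 0 || size > heap.length - top) {
            return -1;           // a real collector would run a GC cycle here
        }
        int address = top;
        top += size;             // allocation is just a pointer bump
        return address;
    }

    public static void main(String[] args) {
        BumpAllocator allocator = new BumpAllocator(64);
        System.out.println(allocator.allocate(16)); // 0
        System.out.println(allocator.allocate(16)); // 16
        System.out.println(allocator.allocate(64)); // -1 (only 32 bytes left)
    }
}
```

There is no search for a free slot: each allocation costs one bounds check and one addition, which is why it compares well with stack allocation.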

 

As you may have realized, the heap is not really a conveyor belt. If it were treated as one, the ever-advancing heap pointer would keep touching new memory, causing frequent paging (moving memory to and from disk) and making the program appear to need far more memory than it actually uses. Paging significantly hurts performance, and after enough objects are created memory would eventually be exhausted. This is where the key component of the JVM comes in: the garbage collector.

When the garbage collector runs, it reclaims space and also compacts the surviving objects in the heap. The "heap pointer" is thereby moved back toward the beginning of the conveyor belt, which avoids page faults as far as possible. By rearranging objects, the garbage collector provides a high-speed heap model that always appears to have free space to allocate.

 

Let's first look at a garbage collection mechanism used in other systems: reference counting, a simple but slow technique. Each object contains a reference counter. Whenever a reference is attached to the object the count is incremented, and whenever a reference goes out of scope or is set to null the count is decremented. The overhead of managing reference counts is small, but it persists for the entire lifetime of the program. The garbage collector walks the list of all objects and releases the storage of any object whose count is 0 (many reference counting schemes instead release an object immediately, as soon as its count reaches 0). An obvious flaw is that when objects refer to each other in a cycle, they can have nonzero reference counts even though they are garbage and should be reclaimed. Locating such self-referential groups adds significant extra work for the garbage collector. Reference counting is often used to explain one kind of garbage collection, but it does not appear to be used by any JVM implementation.
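The circular-reference flaw can be reproduced with a hand-rolled counter. The sketch below is hypothetical: `RcNode`, `retain`, `release`, and `setNext` are names invented for the illustration, and Java itself does not manage objects this way.

```java
/** Toy manually reference-counted node (illustration only). */
final class RcNode {
    final String name;
    int refCount = 0;        // number of references currently pointing at this node
    RcNode next;             // a single outgoing reference
    boolean freed = false;

    RcNode(String name) { this.name = name; }

    /** Records a new reference to this node. */
    void retain() { refCount++; }

    /** Drops one reference; at zero the node is freed and releases what it points to. */
    void release() {
        if (--refCount == 0) {
            freed = true;
            if (next != null) {
                next.release();
                next = null;
            }
        }
    }

    /** Points this node at target, adjusting the counts of the new and old targets. */
    void setNext(RcNode target) {
        if (target != null) target.retain();
        RcNode old = next;
        next = target;
        if (old != null) old.release();
    }

    public static void main(String[] args) {
        RcNode a = new RcNode("a");
        RcNode b = new RcNode("b");
        a.retain();          // references held by the program ("on the stack")
        b.retain();
        a.setNext(b);        // a -> b
        b.setNext(a);        // b -> a: a cycle
        a.release();         // the program drops both of its references
        b.release();
        // Both counts stay at 1 because of the cycle, so neither node is freed.
        System.out.println(a.refCount + " " + b.refCount + " " + a.freed + " " + b.freed);
        // prints: 1 1 false false
    }
}
```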

 

Faster schemes are not based on reference counting. The central idea is that any live object must ultimately be traceable back to a reference that lives on the stack or in static storage. The chain of references may pass through several layers of objects. Thus, if you start from the stack and static storage and walk through all the references, you will find every live object. For each reference you find, you must trace into the object it points to, then follow all the references in that object, and so on, until you have visited the entire web of references rooted in the stack and static storage. Every object you pass through must be live. This also solves the problem of mutually self-referential groups of objects: they are simply never found, and so they are reclaimed automatically.
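A minimal sketch of this tracing idea follows. The `Obj` type and the `reachable` method are assumptions made for the illustration; the roots list stands in for references on the stack and in static storage.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class Tracer {
    /** Hypothetical heap object: a name plus a list of outgoing references. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    /** Returns every object reachable from the given (non-null) roots. */
    static Set<Obj> reachable(List<Obj> roots) {
        Set<Obj> live = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (live.add(current)) {          // first visit: follow its references
                pending.addAll(current.refs);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Obj root = new Obj("root");
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        root.refs.add(a);
        a.refs.add(b);

        Obj c = new Obj("c");                 // c and d refer to each other,
        Obj d = new Obj("d");                 // but nothing reachable refers to them
        c.refs.add(d);
        d.refs.add(c);

        Set<Obj> live = reachable(Collections.singletonList(root));
        System.out.println(live.size());      // 3
        System.out.println(live.contains(c)); // false
    }
}
```

The cycle between `c` and `d` needs no special handling: the traversal never reaches it, so it is never counted as live.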

 

Building on this idea, the Java virtual machine uses an "adaptive" garbage collection scheme. What it does with the live objects it finds depends on the JVM implementation and on the variant currently in use.

1. Stop-and-copy

The program is first stopped (this is not a background collection scheme), and then every live object is copied from the current heap to a new heap. Whatever is not copied is garbage and is reclaimed. As objects are copied into the new heap they are packed end to end, so the new heap is compact and new storage can again be allocated simply by advancing the heap pointer, as described earlier.

When an object is moved from one place to another, all references to it must be corrected. References from the stack or static storage can be corrected right away, but other references to the object may exist that are only encountered later in the traversal. These are fixed as they are found (imagine a table that maps old addresses to new addresses).
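The copy step and the old-address-to-new-address table can be sketched as follows. All names here (`CopyingCollector`, `Obj`, `forwarding`, `toSpace`) are hypothetical, object identity stands in for a memory address, and the recursion is kept for brevity even though a real collector would avoid deep recursion.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class CopyingCollector {
    /** Hypothetical heap object: a name plus a list of outgoing references. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    /** Old object -> its copy in the new heap (the "old address -> new address" table). */
    private final Map<Obj, Obj> forwarding = new IdentityHashMap<>();

    /** The new heap: copies are appended one after another, so it is compact. */
    final List<Obj> toSpace = new ArrayList<>();

    /** Copies everything reachable from the roots and returns the relocated roots. */
    List<Obj> collect(List<Obj> roots) {
        List<Obj> newRoots = new ArrayList<>();
        for (Obj root : roots) {
            newRoots.add(copy(root));
        }
        return newRoots;
    }

    private Obj copy(Obj old) {
        Obj moved = forwarding.get(old);
        if (moved != null) {
            return moved;                 // already copied: reuse the new address
        }
        moved = new Obj(old.name);
        forwarding.put(old, moved);       // record before recursing so cycles terminate
        toSpace.add(moved);
        for (Obj ref : old.refs) {
            moved.refs.add(copy(ref));    // fix up each reference as it is traversed
        }
        return moved;
    }

    public static void main(String[] args) {
        Obj root = new Obj("root");
        Obj a = new Obj("a");
        root.refs.add(a);
        a.refs.add(root);                 // cycle back to the root
        new Obj("garbage");               // never referenced, so never copied

        CopyingCollector gc = new CopyingCollector();
        Obj newRoot = gc.collect(List.of(root)).get(0);
        System.out.println(gc.toSpace.size());                      // 2
        System.out.println(newRoot.refs.get(0).refs.get(0) == newRoot); // true
    }
}
```

Anything that never enters `toSpace` is garbage by definition; the old heap can be discarded as a whole once the copy finishes.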

 

Disadvantages: two issues make a "copying" collector inefficient. 1) It needs two heaps and moves all memory back and forth between them, so it maintains twice as much memory as the program actually needs. 2) The copying itself can be wasted work: once the program reaches a stable state it may produce little or no garbage, yet the collector still copies all live memory from one heap to the other.

How JVMs deal with this: 1) For the first issue, memory is allocated from the heap in several larger blocks as needed, and copying happens between these blocks, which avoids maintaining a whole second heap. 2) For the second issue, the JVM checks whether new garbage is being produced; if not, it switches to a different mode (this is the "adaptive" part) called mark-and-sweep.

 

2. Mark-and-sweep

The idea is again to start from the stack and static storage and traverse all references to find every live object. Each time a live object is found it is marked by setting a flag, and nothing is reclaimed during this phase.

Only when marking is complete does the sweep begin. Unmarked objects are garbage and their memory is released; no copying takes place.

Disadvantages: the free memory left behind is fragmented, so the collector must rearrange the remaining objects if it wants contiguous space. In general use this mode is also fairly slow, although it is fast when the program is producing little or no garbage.
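The two phases can be sketched like this. The names `MarkSweep`, `Obj`, and `heap` are assumptions for the example; freed slots are set to `null` rather than removed so that the holes left by sweeping stay visible.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class MarkSweep {
    /** Hypothetical heap object with a mark flag. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked = false;
        Obj(String name) { this.name = name; }
    }

    /** Heap slots in allocation order; a null slot is a hole left by sweeping. */
    final List<Obj> heap = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        heap.add(o);
        return o;
    }

    /** Runs one mark phase and one sweep phase; returns the number of objects freed. */
    int collect(List<Obj> roots) {
        // Mark: flag everything reachable from the roots; nothing is freed yet.
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (!current.marked) {
                current.marked = true;
                pending.addAll(current.refs);
            }
        }
        // Sweep: free unmarked objects in place and clear marks for the next cycle.
        int freed = 0;
        for (int i = 0; i < heap.size(); i++) {
            Obj o = heap.get(i);
            if (o == null) {
                continue;                 // hole from an earlier cycle
            }
            if (o.marked) {
                o.marked = false;
            } else {
                heap.set(i, null);        // survivors do not move, so a hole remains
                freed++;
            }
        }
        return freed;
    }

    public static void main(String[] args) {
        MarkSweep gc = new MarkSweep();
        Obj a = gc.allocate("a");
        gc.allocate("b");                 // never referenced
        Obj c = gc.allocate("c");
        a.refs.add(c);
        System.out.println(gc.collect(List.of(a)));   // 1
        System.out.println(gc.heap.get(1) == null);   // true: fragmentation
    }
}
```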

 

Java GC theory:

Both of the schemes above run while the program is stopped.

As mentioned above, in the JVM described here memory is allocated in larger "blocks", and a large object gets a block of its own. Strict stop-and-copy requires every live object to be copied from the old heap to a new heap before the old one can be freed, which means a great deal of copying. With blocks, the collector can usually copy objects into blocks that are already dead as it collects.

Each block has a generation count that records whether it is still alive. Normally only the blocks created since the last collection are compacted; every other block simply has its generation count incremented if it is referenced from somewhere. This handles the common case of large numbers of short-lived temporary objects especially well.

Periodically the collector does a full sweep. Large objects are still not copied (their generation count is just incremented), while blocks containing small objects are copied and compacted. The JVM monitors the efficiency of collection: if all objects are long-lived and stable, so that copying is wasted effort, it switches to mark-and-sweep mode. Likewise, the JVM tracks how well mark-and-sweep is doing and switches back to stop-and-copy mode if the heap becomes heavily fragmented.
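The switching rule can be pictured as a small decision function. This is purely a sketch: the names `AdaptivePolicy`, `garbageRatio`, and `fragmentation`, and both threshold values, are invented for the illustration, and the text above gives no actual numbers or heuristics.

```java
/** Toy mode-switching policy (illustration only; thresholds are made up). */
final class AdaptivePolicy {
    enum Mode { STOP_AND_COPY, MARK_AND_SWEEP }

    // Assumed thresholds, not taken from any JVM.
    static final double MIN_GARBAGE_RATIO = 0.10;   // below this, copying is mostly wasted work
    static final double MAX_FRAGMENTATION = 0.30;   // above this, the heap needs compacting

    /** Chooses the mode for the next cycle from statistics of the last one. */
    static Mode next(Mode current, double garbageRatio, double fragmentation) {
        if (current == Mode.STOP_AND_COPY && garbageRatio < MIN_GARBAGE_RATIO) {
            return Mode.MARK_AND_SWEEP;   // objects are stable: stop copying them around
        }
        if (current == Mode.MARK_AND_SWEEP && fragmentation > MAX_FRAGMENTATION) {
            return Mode.STOP_AND_COPY;    // too many holes: compact by copying
        }
        return current;
    }

    public static void main(String[] args) {
        Mode mode = Mode.STOP_AND_COPY;
        mode = next(mode, 0.02, 0.00);
        System.out.println(mode);         // MARK_AND_SWEEP
        mode = next(mode, 0.02, 0.45);
        System.out.println(mode);         // STOP_AND_COPY
    }
}
```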

 

Putting it all together, the scheme is an adaptive, generational, stop-and-copy, mark-and-sweep garbage collector.

 
