The G1 garbage collector in detail

G1 (Garbage-First collector) is one of the newer JVM garbage collectors. It addresses the CMS "concurrent mode failure" problem and tries to shorten pauses when collecting large heaps. G1 compacts memory as part of garbage collection, which reduces memory fragmentation. On relatively large heaps it delivers relatively high throughput together with short pause times, and it became the default collector in Java 9.

How G1 GC works


G1's memory layout differs considerably from the traditional division of the heap. G1 splits the heap into many equally sized Regions (the size is a power of two from 1 MB upward, derived from the heap size unless set with -XX:G1HeapRegionSize). Regions that belong together logically do not have to be contiguous in physical memory. Each Region is marked as E, S, O, or H, for Eden, Survivor, Old, and Humongous. E and S belong to the young generation; O and H belong to the old generation.

[Figure: G1 heap layout, with Regions marked E, S, O, and H]

H stands for Humongous. Taken literally, it means large objects (referred to below as H objects). An object whose size is greater than or equal to half a Region is treated as a humongous object. By default H objects are allocated directly in the old generation, which avoids copying large objects during GC. If the heap has no room for an H object, a GC is triggered.

Cross-generational references

During a Young GC, objects in the young generation may also be referenced from the old generation. This is the cross-generational reference problem. To avoid scanning the entire old generation during a Young GC, G1 introduces the concepts of Card Table and Remembered Set; the basic idea is to trade space for time. Both data structures exist specifically to handle references from Old regions to Young regions. References from Young regions to Old regions need no separate handling: objects in the young generation change so often that recording them would waste space.

  • RSet: short for Remembered Set. It records the references that point into this Region from outside it. Each Region maintains its own RSet
  • Card: the JVM divides memory into fixed-size Cards. The concept is comparable to a page of physical memory

The following figure shows the relationship between RSet and Card. Each Region is divided into multiple Cards. A green Card is one that contains an object referencing an object in another Card; the reference is drawn as a solid blue line. An RSet is in effect a hash table: the key is the start address of a Region and the value is a Card Table (a byte array). Each index of the byte array stands for the address range of one Card, and when that range is referenced the entry is marked as dirty_card.

[Figure: relationship between RSet and Card]
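The RSet structure described above can be sketched as a map from Region start address to a byte array of card states. This is only an illustration: the class name, the method names, the CLEAN/DIRTY constants, and the 512-byte card size are assumptions made for the example, not HotSpot's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: names and sizes are assumptions for illustration, not HotSpot code.
final class RememberedSetSketch {
    static final int CARD_SIZE = 512;       // assumed Card size in bytes
    static final byte CLEAN = 0, DIRTY = 1; // hypothetical card states

    private final int regionSize;
    // Key: start address of a Region holding references into the owner Region.
    // Value: that Region's card table, one byte per Card.
    private final Map<Long, byte[]> cardTables = new HashMap<>();

    RememberedSetSketch(int regionSize) {
        this.regionSize = regionSize;
    }

    // Record that the field at fromAddress (in another Region) points into the owner Region.
    void recordReference(long fromAddress) {
        long regionStart = fromAddress - (fromAddress % regionSize);
        byte[] cards = cardTables.computeIfAbsent(regionStart, k -> new byte[regionSize / CARD_SIZE]);
        cards[(int) ((fromAddress - regionStart) / CARD_SIZE)] = DIRTY;
    }

    // True if the Card covering this address has been marked dirty.
    boolean isDirty(long address) {
        long regionStart = address - (address % regionSize);
        byte[] cards = cardTables.get(regionStart);
        return cards != null && cards[(int) ((address - regionStart) / CARD_SIZE)] == DIRTY;
    }

    // Number of Regions that currently hold references into the owner Region.
    int trackedRegions() {
        return cardTables.size();
    }
}
```

With a structure like this, a Young GC only needs to scan the dirty Cards listed in a Region's RSet instead of walking the whole old generation.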


SATB stands for Snapshot At The Beginning: literally, a snapshot of the live objects taken before GC starts. Its role is to guarantee the correctness of the concurrent marking phase. What does that mean?

First, an introduction to the tri-color marking algorithm.

[Figure: tri-color marking]

  • Black: a root object, or an object that has been scanned together with its child objects
  • Gray: the object itself has been scanned, but its child objects have not all been scanned yet
  • White: an object that has not been scanned. Once all objects have been scanned, any object that is still white is unreachable, that is, garbage

Before the GC scans C, the colors are as follows:

[Figure: object colors before C is scanned]

During the concurrent marking phase, the application threads change these reference relationships, with the following result:

[Figure: object graph after the application threads change the references]

In the remark phase, the scan result is as follows:

[Figure: scan result in the remark phase]

In this case C would be collected as garbage. The snapshot of live objects was originally A, B, C and has now become A, B. The integrity of the snapshot is destroyed, which is clearly unacceptable.

G1 solves this problem with a pre-write barrier. Put simply, during the concurrent marking phase, whenever a reference relationship changes, the pre-write barrier records the old reference and stores it in a queue, which is called satb_mark_queue in the JVM source code. The remark phase scans this queue. The objects that the old references pointed to are marked this way, and their descendants are marked recursively, so no object is missed and the integrity of the snapshot is preserved.
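The A, B, C scenario and the effect of the pre-write barrier can be replayed with a toy marker. This is a minimal sketch under stated assumptions: the class, the two-field Node, and the method names are all hypothetical, marking is single-threaded, and the "concurrent" application writes are simply interleaved by hand. Only the name satb_mark_queue comes from the text above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch only: a toy tri-color marker, not HotSpot code. All names are hypothetical.
final class SatbSketch {
    enum Color { WHITE, GRAY, BLACK }

    static final class Node {
        final Node[] fields = new Node[2]; // two reference fields are enough here
        Color color = Color.WHITE;
    }

    private final Deque<Node> grayStack = new ArrayDeque<>();
    private final Deque<Node> satbQueue = new ArrayDeque<>(); // plays the role of satb_mark_queue
    private final boolean barrierEnabled;

    SatbSketch(boolean barrierEnabled) {
        this.barrierEnabled = barrierEnabled;
    }

    // Turn a white object gray and remember that its fields still need scanning.
    void shade(Node n) {
        if (n != null && n.color == Color.WHITE) {
            n.color = Color.GRAY;
            grayStack.push(n);
        }
    }

    // Application write during marking: the pre-write barrier logs the OLD value first.
    void write(Node owner, int field, Node newValue) {
        Node old = owner.fields[field];
        if (barrierEnabled && old != null) {
            satbQueue.add(old);
        }
        owner.fields[field] = newValue;
    }

    // Scan one gray object: shade its children, then blacken it.
    void markStep() {
        Node n = grayStack.poll();
        if (n == null) return;
        for (Node child : n.fields) shade(child);
        n.color = Color.BLACK;
    }

    // Remark: drain the SATB queue, then finish marking everything reachable from it.
    void remark() {
        Node old;
        while ((old = satbQueue.poll()) != null) shade(old);
        while (!grayStack.isEmpty()) markStep();
    }

    // Replays the A, B, C scenario and returns C's final color.
    static Color run(boolean barrierEnabled) {
        SatbSketch gc = new SatbSketch(barrierEnabled);
        Node a = new Node(), b = new Node(), c = new Node();
        a.fields[0] = b;        // snapshot at the beginning: A -> B -> C
        b.fields[0] = c;
        gc.shade(a);            // A is the root
        gc.markStep();          // A is now black, B gray, C still white
        gc.write(a, 1, c);      // application: A gains a reference to C
        gc.write(b, 0, null);   // application: B drops its reference to C (old value C is logged)
        gc.remark();
        return c.color;
    }

    public static void main(String[] args) {
        System.out.println("without barrier: C is " + run(false));
        System.out.println("with barrier:    C is " + run(true));
    }
}
```

Without the barrier C stays WHITE even though A still references it, because A is already black and is never rescanned. With the barrier the old value C lands in the queue, is shaded during remark, and ends up BLACK.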

Here is R大's explanation of SATB:

"In fact, it is enough to use a pre-write barrier to record the old reference value every time a reference relationship changes. That way, by the time the concurrent marker reaches an object, every change to that object's reference-type fields is on record, and no object that was live in the snapshot is missed. Of course, an object may well be live in the snapshot and yet already be dead as the concurrent GC proceeds, but SATB still lets it survive this GC. CMS's incremental update design forces it to rescan all thread stacks and the entire young gen as roots during the remark phase; G1's SATB design only needs to scan what is left in the satb_mark_queue during remark, which removes the potential risk of a long STW pause in the CMS remark phase."

SATB records live objects as a snapshot of one moment in time. Objects in that snapshot may become garbage afterwards; this is called floating garbage, and such objects can only be reclaimed in the next collection. Objects newly allocated while the GC is running are all treated as live, and any other unreachable object is treated as dead.

How does G1 know which objects were newly allocated after the GC started?

Each Region uses two top-at-mark-start (TAMS) pointers, prevTAMS and nextTAMS, to keep track of newly allocated objects. The diagram is as follows:

[Figure: prevTAMS and nextTAMS pointers within a Region]

Each Region records two top-at-mark-start (TAMS) pointers, prevTAMS and nextTAMS. Objects above TAMS are newly allocated and are therefore treated as implicitly marked. R大's explanation:

G1's concurrent marking uses two bitmaps. prevBitmap records the object liveness produced by the (n-1)th round of concurrent marking. Because that round has already completed, the information in this bitmap can be used. nextBitmap records the result of the nth round of concurrent marking. It holds the result of a marking that is in progress or about to start and has not finished, so it cannot be used yet.

Here top is the Region's current allocation pointer, [bottom, top) is the part of the Region already in use (used), and [top, end) is the unused space still available for allocation (unused).

  • [bottom, prevTAMS): liveness of the objects in this range can be read from prevBitmap
  • [prevTAMS, nextTAMS): objects in this range are implicitly live in the (n-1)th round of concurrent marking
  • [nextTAMS, top): objects in this range are implicitly live in the nth round of concurrent marking
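The three ranges above can be modelled with a small Region class. This is a simplified sketch, not HotSpot code: addresses are plain integers, one bitmap bit stands for one address, and every name other than prevTAMS, nextTAMS, prevBitmap, and nextBitmap is an assumption for the example.

```java
import java.util.BitSet;

// Sketch only: a simplified Region with TAMS pointers; names are hypothetical.
final class RegionTamsSketch {
    final int bottom, end;
    int top, prevTams, nextTams;
    BitSet prevBitmap = new BitSet(); // result of the completed round n-1
    BitSet nextBitmap = new BitSet(); // being built by round n, not usable yet

    RegionTamsSketch(int bottom, int end) {
        this.bottom = bottom;
        this.end = end;
        this.top = this.prevTams = this.nextTams = bottom;
    }

    // Bump-pointer allocation: returns the new object's address.
    int allocate(int size) {
        if (top + size > end) throw new IllegalStateException("region full");
        int addr = top;
        top += size;
        return addr;
    }

    // Round n starts: everything allocated from here on is implicitly live.
    void startMarking() {
        nextTams = top;
    }

    // The marker found the object at addr reachable during round n.
    void mark(int addr) {
        nextBitmap.set(addr - bottom);
    }

    // Round n is complete: its results become the "previous" marking.
    void finishMarking() {
        prevBitmap = nextBitmap;
        nextBitmap = new BitSet();
        prevTams = nextTams;
    }

    // Liveness according to the last completed marking.
    boolean isLiveByPrevMarking(int addr) {
        if (addr >= prevTams) return addr < top; // allocated after that marking began
        return prevBitmap.get(addr - bottom);    // otherwise consult the bitmap
    }

    // Objects in [nextTAMS, top) are implicitly live for the current round.
    boolean isImplicitlyLiveInCurrentMarking(int addr) {
        return addr >= nextTams && addr < top;
    }
}
```

The point of the design is that objects allocated during marking never need a bitmap entry: comparing their address with the TAMS pointer is enough.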

G1 GC modes

Young GC

A Young GC collects all young-generation Regions. It is triggered when the E area cannot allocate a new object. Objects in the E area are moved to the S area; when the S area does not have enough space, objects in the E area are promoted directly to the O area. At the same time, the data in the S area is moved to a new S area, and objects in the S area that have reached a certain age are promoted to the O area.

The Young GC process is shown below:

[Figure: Young GC process]

Mixed GC

Mixed GC means mixed collection. It is called mixed because it collects all young-generation Regions plus some of the old-generation Regions.

1. Why only some of the old-generation Regions?

2. When is a Mixed GC triggered?

These two questions can be answered together. Only part of the old generation is collected because the -XX:MaxGCPauseMillis parameter specifies a target pause time for a G1 collection. Its default value is 200 ms, and it is only a goal, not a guarantee. G1's strength is its pause prediction model, which selects a subset of the Regions to collect so as to try to meet the pause-time target.

The trigger for a Mixed GC is also controlled by several parameters. For example, -XX:InitiatingHeapOccupancyPercent expresses the old generation's occupancy as a percentage of the total heap size. Its default value is 45%, and reaching this threshold triggers a Mixed GC.
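As a rough illustration of the threshold described above, the check amounts to a percentage comparison. The class and method names below are hypothetical, and the real collector's bookkeeping is more involved; the flag names in the comment are the two parameters already mentioned in the text.

```java
// Sketch only: illustrates the occupancy threshold, not G1's actual trigger logic.
// The corresponding JVM options would be passed on the command line, for example:
//   java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45 ...
final class IhopSketch {
    // True once old-generation usage reaches ihopPercent of the whole heap.
    static boolean thresholdReached(long oldGenUsedBytes, long heapBytes, int ihopPercent) {
        return oldGenUsedBytes * 100 >= heapBytes * ihopPercent;
    }

    public static void main(String[] args) {
        long heap = 4L << 30; // assume a 4 GB heap for the example
        System.out.println(thresholdReached(1L << 30, heap, 45)); // 25% occupied
        System.out.println(thresholdReached(2L << 30, heap, 45)); // 50% occupied
    }
}
```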

Mixed GC can be divided into two stages:

1. Global concurrent marking

Global concurrent marking can be further divided into the following steps:

  • Initial mark (STW). Marks the objects directly reachable from the GC roots. This phase piggybacks on a Young GC pause, so it has no separate pause of its own
  • Concurrent marking. Starting from the GC roots, this phase marks the objects in the heap. The marking threads run concurrently with the application threads and collect liveness information for each Region. The references recorded by the SATB write barrier described above are also scanned during this phase
  • Final mark (Remark, STW). Marks the objects whose references changed during the concurrent marking phase, which settles what will be reclaimed
  • Cleanup (partly STW). Regions found to contain no live objects at this stage are reclaimed whole and returned to the list of allocatable Regions; that is, empty Regions are cleared

2. Copying live objects (Evacuation)

The Evacuation phase is fully stop-the-world. It copies the live objects from a subset of Regions into empty Regions (copying in parallel) and then reclaims the space of the original Regions. The Evacuation phase is free to choose any number of Regions to collect together as the collection set (CSet). Which Regions go into the CSet depends on the pause prediction model mentioned above. The phase does not evacuate every Region that contains live objects; it picks only a few high-yield Regions, which keeps the pause cost controllable (within limits).
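One way to picture CSet selection is a greedy pick of the highest-yield Regions until the predicted cost would exceed the pause target. This is a hedged sketch, not G1's real prediction model: the Region fields, the efficiency measure (reclaimable bytes per predicted millisecond), and the greedy cut-off are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a greedy stand-in for CSet selection; names and model are hypothetical.
final class CSetSketch {
    static final class Region {
        final String name;
        final long reclaimableBytes; // garbage expected to be freed
        final double predictedMs;    // predicted evacuation cost

        Region(String name, long reclaimableBytes, double predictedMs) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.predictedMs = predictedMs;
        }

        // Yield per unit of pause time: higher is better.
        double efficiency() {
            return reclaimableBytes / predictedMs;
        }
    }

    // Pick the most efficient Regions while the predicted total stays within the budget.
    static List<Region> choose(List<Region> candidates, double pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort((x, y) -> Double.compare(y.efficiency(), x.efficiency()));
        List<Region> cset = new ArrayList<>();
        double spentMs = 0;
        for (Region r : sorted) {
            if (spentMs + r.predictedMs > pauseBudgetMs) break;
            cset.add(r);
            spentMs += r.predictedMs;
        }
        return cset;
    }
}
```

Lowering the pause budget shrinks the CSet, which is why a smaller -XX:MaxGCPauseMillis means fewer old-generation Regions are reclaimed per collection.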

The Mixed GC cleanup process is shown below:

[Figure: Mixed GC cleanup process]

Full GC

G1's garbage collection runs concurrently with the application. When Mixed GC cannot keep up with the rate at which the application requests memory, G1 falls back to a Full GC, which uses a Serial-style, single-threaded collection (G1's Full GC has been parallel since JDK 10). A Full GC causes a long STW pause and should be avoided.

A G1 Full GC can have two causes:

  • During Evacuation there is not enough to-space to hold the objects being promoted
  • Space runs out before concurrent processing completes