Analysis of GC technology

Introduction

    G1 GC, short for Garbage-First Garbage Collector, is enabled with the -XX:+UseG1GC option. G1 works on separate partitions of the heap. A partition can belong to either the young generation or the old generation, partitions of the same generation do not need to be contiguous, and the number of partitions in each generation is adjusted dynamically. The old generation is partitioned because some of its partitions contain a lot of garbage and others very little, so the collector can concentrate on the partitions with the most garbage first, which is where the name Garbage-First comes from. This selective approach does not apply to the young generation, which is collected with a copying algorithm; the young generation is partitioned as well mainly because that makes it easy to resize the generation.
    G1 GC was designed to replace CMS. Compared with CMS, G1 has the following advantages:
1. A predictable pause model
2. It avoids the heap fragmentation that CMS suffers from
3. Better performance on very large heaps
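
As a quick check that -XX:+UseG1GC actually took effect, the JVM's standard management API can list the collectors in use. The sketch below is an illustration only; the class name ActiveCollectors is hypothetical, and matching on the "G1" name prefix is an assumption based on how HotSpot conventionally names its G1 collector beans (for example "G1 Young Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    /** Returns the names of the collectors the running JVM reports. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    /** Heuristic (assumption): G1 collector beans have names starting with "G1". */
    static boolean looksLikeG1(List<String> names) {
        return names.stream().anyMatch(n -> n.startsWith("G1"));
    }

    public static void main(String[] args) {
        List<String> names = names();
        System.out.println(names + " -> G1 in use: " + looksLikeG1(names));
    }
}
```

Running it once with -XX:+UseG1GC and once with another collector should make the difference visible.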

G1 Key Concepts

Region

    A Region in G1 is different from a generation space in traditional garbage collectors. G1 divides the heap into equally sized Regions (by default the JVM aims for roughly 2048 of them), and all subsequent garbage collection works in units of Regions. The Region is the foundation of the G1 algorithm. The Region size can be set with the -XX:G1HeapRegionSize parameter. In the accompanying figure, E represents an Eden Region, S a Survivor Region, O an Old Region, and H a Humongous Region, which holds a giant object (an object that is at least half the size of a Region). The figure shows that the Regions of one generation are not physically contiguous. A Region that is Eden at one moment may belong to the old generation at another. When G1 collects garbage, it copies the live objects of one Region into another Region.
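
To make the Region arithmetic concrete, here is a minimal sketch of how a Region size could be derived from the heap size. It is a simplification and not the JVM's actual code: the class RegionSizing is hypothetical, the target Region count is passed in as a parameter (the 2048 used in main is an assumption), the 1 MB and 32 MB bounds are the limits of -XX:G1HeapRegionSize, and treating an object of exactly half a Region as humongous is also an assumption.

```java
final class RegionSizing {
    static final long MIN_REGION_BYTES = 1L << 20;   // 1 MB lower bound
    static final long MAX_REGION_BYTES = 32L << 20;  // 32 MB upper bound

    /** heap / targetRegions, rounded down to a power of two, then clamped to the bounds. */
    static long regionSize(long heapBytes, long targetRegions) {
        long raw = Math.max(heapBytes / targetRegions, MIN_REGION_BYTES);
        long powerOfTwo = Long.highestOneBit(raw);
        return Math.min(powerOfTwo, MAX_REGION_BYTES);
    }

    /** Assumption: an object needing at least half a Region is treated as humongous. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        long region = regionSize(4L << 30, 2048);           // 4 GB heap
        System.out.println(region);                          // prints 2097152 (2 MB)
        System.out.println(isHumongous(3L << 20, region));   // prints true: 3 MB exceeds half of 2 MB
    }
}
```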

SATB

The full name of SATB is Snapshot-At-The-Beginning. SATB is the mechanism that keeps concurrent GC correct, and it is the foundation of G1's concurrency. SATB can be understood as taking a snapshot of the objects in the heap when the GC starts: objects that are live at that moment are treated as live, which forms an object graph. Objects newly allocated during the GC are also treated as live, and all other unreachable objects are treated as garbage.
How does G1 find the objects allocated while the GC is running? Each Region records two top-at-mark-start (TAMS) pointers, prevTAMS and nextTAMS. Objects above TAMS are newly allocated and are therefore considered implicitly marked. This is how G1 identifies the objects allocated during the GC and treats them as live.
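
The TAMS rule can be illustrated with a toy model of a single Region. This is a sketch under simplifying assumptions, not G1's implementation: the class TamsRegionSketch is hypothetical, objects are modeled as numbered slots, and only one TAMS pointer is kept.

```java
import java.util.BitSet;

final class TamsRegionSketch {
    private final BitSet markBitmap = new BitSet(); // one bit per object slot (a simplification)
    private int top;       // next free slot; moves up as objects are allocated
    private int nextTams;  // where top was when the current marking cycle began

    /** Allocates one object slot and returns its index. */
    int allocate() {
        return top++;
    }

    /** Starts a marking cycle: everything allocated from now on sits at or above TAMS. */
    void startMarking() {
        nextTams = top;
        markBitmap.clear();
    }

    /** Called by the marker when it reaches an object that existed at the snapshot. */
    void mark(int slot) {
        markBitmap.set(slot);
    }

    /** Live if explicitly marked, or implicitly marked by sitting at or above TAMS. */
    boolean isLive(int slot) {
        return slot >= nextTams || markBitmap.get(slot);
    }

    public static void main(String[] args) {
        TamsRegionSketch region = new TamsRegionSketch();
        int a = region.allocate();
        int b = region.allocate();
        region.startMarking();
        region.mark(a);              // a is reachable in the snapshot, b is not
        int c = region.allocate();   // allocated while marking is running
        System.out.println(region.isLive(a) + " " + region.isLive(b) + " " + region.isLive(c));
    }
}
```

In this model only b is garbage: a was marked explicitly, and c is live simply because it was allocated after marking started.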
That solves object allocation during GC, but what about references that change during GC? G1's answer is the write barrier. A write barrier wraps extra logic around every assignment to a reference field, much like around advice in AOP. Through the write barrier, G1 knows which references have changed.
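
A write barrier of this kind can be modeled in plain Java as a method that every reference store goes through. The sketch below is only an illustration of the idea: the names SatbBarrierSketch, Node, and writeRef are hypothetical, and a real JVM emits the barrier in compiled code rather than calling a method.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SatbBarrierSketch {
    static final class Node {
        final String name;
        Node ref; // the single reference field guarded by the barrier
        Node(String name) { this.name = name; }
    }

    private boolean markingActive;
    private final Deque<Node> satbQueue = new ArrayDeque<>();

    void setMarkingActive(boolean active) {
        markingActive = active;
    }

    /** Stand-in for "holder.ref = newValue": logs the overwritten value while marking runs. */
    void writeRef(Node holder, Node newValue) {
        Node oldValue = holder.ref;
        if (markingActive && oldValue != null) {
            satbQueue.add(oldValue); // the marker will still visit the snapshot's referent
        }
        holder.ref = newValue;
    }

    Deque<Node> loggedReferences() {
        return satbQueue;
    }

    public static void main(String[] args) {
        SatbBarrierSketch heap = new SatbBarrierSketch();
        Node a = new Node("a");
        Node b = new Node("b");
        heap.writeRef(a, b);           // marking is off, nothing is logged
        heap.setMarkingActive(true);
        heap.writeRef(a, null);        // b was reachable at the snapshot, so it is logged
        System.out.println(heap.loggedReferences().peek().name);
    }
}
```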

RSet

The full name of RSet is Remembered Set. Every Region has an RSet, which records references from objects in other Regions into this Region (who refers to my objects). G1 has another data structure, the Collection Set (CSet), which records the set of Regions to be collected by a GC. Regions in a CSet can be of any generation. During a GC, old->young and old->old cross-Region references are found by scanning only the RSets of the Regions in the CSet.
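
The relationship between RSet and CSet can be sketched as a map from each Region to the Regions that point into it. This is a deliberately coarse model: the class RememberedSets and its methods are hypothetical, and Regions are identified by plain integers, whereas a real RSet tracks finer-grained locations than whole Regions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RememberedSets {
    // region id -> ids of other regions holding references into it ("who refers to me")
    private final Map<Integer, Set<Integer>> rsets = new HashMap<>();

    /** Records a reference stored in fromRegion that points into toRegion. */
    void recordReference(int fromRegion, int toRegion) {
        if (fromRegion != toRegion) { // references inside one region need no RSet entry
            rsets.computeIfAbsent(toRegion, k -> new TreeSet<>()).add(fromRegion);
        }
    }

    /** Regions outside the CSet that must be scanned as extra roots when collecting it. */
    Set<Integer> regionsToScan(Set<Integer> cset) {
        Set<Integer> result = new TreeSet<>();
        for (int region : cset) {
            result.addAll(rsets.getOrDefault(region, Collections.<Integer>emptySet()));
        }
        result.removeAll(cset); // regions inside the CSet are processed anyway
        return result;
    }

    public static void main(String[] args) {
        RememberedSets rs = new RememberedSets();
        rs.recordReference(5, 1);
        rs.recordReference(6, 1);
        rs.recordReference(1, 2);
        Set<Integer> cset = new TreeSet<>();
        cset.add(1);
        cset.add(2);
        System.out.println(rs.regionsToScan(cset)); // prints [5, 6]
    }
}
```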

Pause prediction model

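A distinguishing feature of the G1 collector is its pause prediction model, which chooses the size of the CSet according to the pause time configured by the user so that application pauses meet the user's expectation. The target is set with the -XX:MaxGCPauseMillis parameter. This is somewhat similar to the Parallel Scavenge collector. A shorter pause target is not always better. The shorter the target, the smaller the CSet of each collection, so garbage gradually accumulates until G1 has to fall back to a Serial GC; if the target is too long, every collection causes a long pause and hurts the application's external response time.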
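
The link between the pause target and the CSet size can be sketched as a greedy selection under a time budget. This is an illustrative simplification rather than G1's actual policy: the class CSetChooserSketch, the per-Region cost predictions, and the efficiency measure (reclaimable bytes per predicted millisecond) are all assumptions of this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CSetChooserSketch {
    static final class OldRegion {
        final String name;
        final double predictedCopyMs;  // predicted cost of evacuating this Region
        final long reclaimableBytes;   // garbage expected to be freed

        OldRegion(String name, double predictedCopyMs, long reclaimableBytes) {
            this.name = name;
            this.predictedCopyMs = predictedCopyMs;
            this.reclaimableBytes = reclaimableBytes;
        }

        double efficiency() {
            return reclaimableBytes / predictedCopyMs;
        }
    }

    /** Greedily adds the most profitable old Regions while the predicted pause fits the target. */
    static List<OldRegion> chooseOldRegions(double predictedYoungMs,
                                            List<OldRegion> candidates,
                                            double pauseTargetMs) {
        List<OldRegion> byEfficiency = new ArrayList<>(candidates);
        Comparator<OldRegion> ascending = Comparator.comparingDouble(OldRegion::efficiency);
        byEfficiency.sort(ascending.reversed());

        List<OldRegion> chosen = new ArrayList<>();
        double remainingMs = pauseTargetMs - predictedYoungMs; // young Regions are always collected
        for (OldRegion region : byEfficiency) {
            if (region.predictedCopyMs > remainingMs) {
                break; // adding more would exceed the pause target
            }
            chosen.add(region);
            remainingMs -= region.predictedCopyMs;
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<OldRegion> candidates = new ArrayList<>();
        candidates.add(new OldRegion("A", 30, 300));
        candidates.add(new OldRegion("B", 30, 600));
        candidates.add(new OldRegion("C", 30, 150));
        for (OldRegion r : chooseOldRegions(50, candidates, 100)) {
            System.out.println(r.name); // prints B: only one old Region fits the remaining 50 ms
        }
    }
}
```

The sketch also shows why a very short target is risky: with a small budget few or no old Regions are selected, so old-generation garbage keeps accumulating.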

G1 collection process
G1 garbage collection has two phases:
1. Global concurrent marking
2. Copying live objects (evacuation)

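Global concurrent marking phase

    Global concurrent marking is based on SATB. It is similar to CMS in some ways but also differs from it. The main steps are:
Initial mark: This step is STW. It scans the root set and pushes all objects directly reachable from the roots onto the scan stack for later processing. In G1 the initial mark piggybacks on the pause of a Young GC, so no extra pause is needed. Although this lengthens the Young GC pause, it improves GC efficiency overall.
Concurrent mark: This step does not need STW. It repeatedly pops an object from the scan stack, scans it, and pushes the objects referenced by its fields onto the scan stack, recursing until the scan stack is empty, which means that all objects reachable from the GC roots have been traced. This step also scans the references recorded by the SATB write barrier.
Final mark: Also called Remark; this step is STW too. It processes the references that the write barrier recorded during concurrent marking and also processes weak references. The biggest difference from CMS here is that CMS scans the entire root set in this step, with Eden scanned as part of the root set, so it can take a long time.
Cleanup: This step is STW. It takes stock of and resets the marking state. It is a little like the sweep phase of mark-sweep, but it does not actually collect garbage; it only uses the pause model to predict the CSet, which is then reclaimed by the evacuation phase.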
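
The marking steps above can be sketched with an explicit scan stack. This is a single-threaded toy model with hypothetical names (MarkingSketch, Obj); it shows only the order of work, and ignores the concurrency with application threads that makes the real phases hard.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class MarkingSketch {
    static final class Obj {
        final String name;
        final List<Obj> fields = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    private final Set<Obj> marked = Collections.newSetFromMap(new IdentityHashMap<Obj, Boolean>());
    private final Deque<Obj> scanStack = new ArrayDeque<>();

    /** Initial mark: push the objects directly reachable from the roots. */
    void initialMark(List<Obj> roots) {
        for (Obj root : roots) {
            push(root);
        }
    }

    /** Concurrent mark: pop, scan fields, push unmarked referents until the stack is empty. */
    void drain() {
        while (!scanStack.isEmpty()) {
            Obj current = scanStack.pop();
            for (Obj referent : current.fields) {
                push(referent);
            }
        }
    }

    /** Final mark: also trace from the references logged by the SATB write barrier. */
    void remark(List<Obj> satbLog) {
        for (Obj logged : satbLog) {
            push(logged);
        }
        drain();
    }

    boolean isMarked(Obj o) {
        return marked.contains(o);
    }

    private void push(Obj o) {
        if (marked.add(o)) { // mark on push so each object is scanned once
            scanStack.push(o);
        }
    }
}
```

A typical call sequence is initialMark(roots), then drain(), then remark(satbLog); anything still unmarked afterwards and allocated before the snapshot is garbage.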

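Copying live objects phase

    The evacuation phase is fully stop-the-world. It copies the live objects in one set of Regions into another set of Regions, which reclaims the garbage. From the Regions identified in the first phase, evacuation picks any number of Regions as collection targets; these Regions are called the CSet, and this is made possible by the RSets.
After the CSet is chosen, G1 copies the live objects in these Regions to other Regions in parallel, similar to the copying done by Parallel Scavenge, and the application is paused for the whole process. Pause time is controlled by choosing how many Regions go into the CSet.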
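
Evacuation itself can be pictured as copying survivors out of the CSet and then freeing the source Regions wholesale. The following toy model uses hypothetical names (EvacuationSketch, Region) and represents objects as strings; it is single-threaded, whereas G1 performs this copy with several worker threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class EvacuationSketch {
    static final class Region {
        final String name;
        final List<String> objects = new ArrayList<>();
        Region(String name) { this.name = name; }
    }

    /** Copies live objects from every CSet Region into 'to', then empties the CSet Regions. */
    static int evacuate(List<Region> cset, Set<String> live, Region to) {
        int copied = 0;
        for (Region from : cset) {
            for (String object : from.objects) {
                if (live.contains(object)) {
                    to.objects.add(object);
                    copied++;
                }
            }
            from.objects.clear(); // the whole Region is free again, garbage included
        }
        return copied;
    }
}
```

Because only live objects are touched, the cost of a pause depends on how much survives in the chosen Regions, which is why limiting the CSet limits the pause.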

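G1 collection modes:

Young GC: collects the Regions of the young generation
Mixed GC: all Regions of the young generation plus the high-payoff Regions selected by global concurrent marking
Both Young GC and Mixed GC consist only of the copying (evacuation) phase, which runs in parallel inside a pause.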

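In generational G1 there are two sub-modes for choosing the CSet, corresponding to Young GC and Mixed GC:
Young GC: the CSet is all Regions of the young generation
Mixed GC: the CSet is all Regions of the young generation plus the high-payoff Regions identified by global concurrent marking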

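    G1 runs as follows: it keeps switching between Young GC and Mixed GC and periodically performs global concurrent marking; when it really cannot keep up, it uses a Full GC (Serial GC). The initial mark piggybacks on a Young GC. No Mixed GC is done while global concurrent marking is running, and the initial mark is not started while Mixed GCs are being done. When Mixed GC cannot keep up with the rate at which objects are created, G1 falls back to a Full GC, which is the main thing to tune against.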
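
The switching rules in this paragraph can be written down as a small decision function. This is a loose sketch of the description above and nothing more: the class G1CycleSketch, the Action names, and the boolean inputs are hypothetical, and the occupancy threshold stands in for -XX:InitiatingHeapOccupancyPercent.

```java
final class G1CycleSketch {
    enum Action { YOUNG_GC, YOUNG_GC_WITH_INITIAL_MARK, MIXED_GC, FULL_GC }

    /** Picks the next collection according to the rules described in the text. */
    static Action next(boolean cannotKeepUp,
                       boolean markingInProgress,
                       boolean mixedCandidatesLeft,
                       int heapOccupancyPercent,
                       int initiatingOccupancyPercent) {
        if (cannotKeepUp) {
            return Action.FULL_GC;                    // last resort
        }
        if (markingInProgress) {
            return Action.YOUNG_GC;                   // no Mixed GC while marking runs
        }
        if (mixedCandidatesLeft) {
            return Action.MIXED_GC;                   // no initial mark during Mixed GCs
        }
        if (heapOccupancyPercent >= initiatingOccupancyPercent) {
            return Action.YOUNG_GC_WITH_INITIAL_MARK; // initial mark rides on a Young GC
        }
        return Action.YOUNG_GC;
    }

    public static void main(String[] args) {
        System.out.println(next(false, false, false, 50, 45)); // prints YOUNG_GC_WITH_INITIAL_MARK
    }
}
```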

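G1 best practices

    Following these practices when using the G1 collector avoids many detours:

Keep tuning the pause time target

    -XX:MaxGCPauseMillis=x sets the target pause time for the application; at run time G1 chooses the CSet according to this parameter to meet the response-time goal. In general 100 ms or 200 ms is a reasonable value (it varies by situation), but 50 ms is usually unreasonable. If the pause target is too short, G1 cannot keep up with the rate at which garbage is produced and eventually falls back to a Full GC. Tuning this parameter is therefore a continuous process of adjusting step by step toward the best value.

Do not set the sizes of the young and old generations

    G1 adjusts the sizes of the young and old generations at run time. By changing the generation sizes it adjusts the promotion rate and tenuring age of objects in order to meet the pause time target we set. Setting the young generation size means giving up the automatic tuning G1 does for us. All we need to set is the size of the whole heap; G1 allocates the size of each generation itself.

Watch for Evacuation Failure

Evacuation Failure is similar to promotion failure in CMS: there is too much garbage in the heap for the copy between Regions to complete, so G1 has to fall back to a Full GC to collect the whole heap.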

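Common G1 parameters

Parameter / default value    Meaning

-XX:+UseG1GC    Use the G1 garbage collector.
-XX:MaxGCPauseMillis=200    Target for the maximum GC pause time (the JVM makes a best effort but does not guarantee it).
-XX:InitiatingHeapOccupancyPercent=45    Heap occupancy percentage at which a concurrent GC cycle is started. Collectors such as G1 use it to trigger a concurrent GC cycle based on the occupancy of the entire heap, not just one generation. A value of 0 means "run GC cycles continuously". The default is 45.
-XX:NewRatio=n    Ratio between the old and young generation sizes (old/young). The default is 2.
-XX:SurvivorRatio=n    Ratio of eden to survivor space size. The default is 8.
-XX:MaxTenuringThreshold=n    Maximum tenuring threshold for promotion to the old generation. The default is 15.
-XX:ParallelGCThreads=n    Number of threads the collector uses in parallel phases. The default depends on the platform the JVM runs on.
-XX:ConcGCThreads=n    Number of threads used for concurrent collection work. The default depends on the platform the JVM runs on.
-XX:G1ReservePercent=n    Percentage of the heap kept in reserve as a false ceiling to reduce the likelihood of promotion failure. The default is 10.
-XX:G1HeapRegionSize=n    With G1 the Java heap is divided into uniformly sized regions. This parameter sets the size of each region. The default is computed from the heap size. The minimum is 1 MB and the maximum is 32 MB.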
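
To see which values are actually in effect for these flags in a running JVM, the HotSpot diagnostic MXBean can be queried. The sketch below assumes a HotSpot-based JDK, since com.sun.management.HotSpotDiagnosticMXBean is a HotSpot-specific interface; the class name G1FlagDump and the choice of flags are just for illustration, and flags unknown to the running JVM are reported as unavailable rather than guessed.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

final class G1FlagDump {
    static final String[] FLAGS = {
        "UseG1GC", "MaxGCPauseMillis", "InitiatingHeapOccupancyPercent",
        "G1ReservePercent", "G1HeapRegionSize", "ParallelGCThreads", "ConcGCThreads"
    };

    /** Returns flag name -> effective value, or a placeholder when the flag cannot be read. */
    static Map<String, String> read() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        Map<String, String> values = new LinkedHashMap<>();
        for (String flag : FLAGS) {
            String value = "(not available on this JVM)";
            if (bean != null) {
                try {
                    value = bean.getVMOption(flag).getValue();
                } catch (IllegalArgumentException e) {
                    // the running JVM does not know this flag; keep the placeholder
                }
            }
            values.put(flag, value);
        }
        return values;
    }

    public static void main(String[] args) {
        read().forEach((flag, value) -> System.out.println("-XX:" + flag + " = " + value));
    }
}
```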

G1 log analysis

//Young generation GC
2018-05-03T10:21:43.209-0800: [GC pause (G1 Humongous Allocation) (young) (initial-mark), 0.0035356 secs]  //initial mark, took 0.0035 s
   [Parallel Time: 2.4 ms, GC Workers: 8]  //8 parallel worker threads, took 2.4 ms
      [GC Worker Start (ms): Min: 813.1, Avg: 813.7, Max: 813.9, Diff: 0.7]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.5, Sum: 9.1]   //time each worker thread spent scanning roots
      [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]   //time to update the RS; in G1 every region has its own RS, which records the objects in that region that are referenced from other regions. During collection the RS is treated as part of the root set, which speeds up collection
         [Processed Buffers: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0]  //Processed Buffers are the buffers that record reference changes
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]   //time to scan the RS
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]  //time to scan code roots
      [Object Copy (ms): Min: 0.0, Avg: 0.5, Max: 1.3, Diff: 1.3, Sum: 3.6] //time to copy objects
      [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.2, Diff: 0.2, Sum: 1.2]   
         [Termination Attempts: Min: 1, Avg: 1.8, Max: 4, Diff: 3, Sum: 14]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 1.6, Avg: 1.8, Max: 2.3, Diff: 0.8, Sum: 14.1]   //total time per GC worker thread
      [GC Worker End (ms): Min: 815.4, Avg: 815.4, Max: 815.4, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.1 ms]   //time to clear the card table; the RS relies on the card table to locate the referencing objects
   [Other: 1.1 ms]
      [Choose CSet: 0.0 ms]   //time to choose the CSet
      [Ref Proc: 0.9 ms]  //time to process weak and soft references
      [Ref Enq: 0.0 ms]   //time to enqueue weak and soft references
      [Redirty Cards: 0.1 ms]
      [Humongous Register: 0.0 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.0 ms]   //time to free the collected regions (including their RSs)
   [Eden: 5120.0K(24.0M)->0.0B(12.0M) Survivors: 0.0B->2048.0K Heap: 16.0M(50.0M)->12.4M(50.0M)]
 [Times: user=0.01 sys=0.00, real=0.01 secs] 
//Root region scan
2018-05-03T10:21:43.213-0800: [GC concurrent-root-region-scan-start]
2018-05-03T10:21:43.214-0800: [GC concurrent-root-region-scan-end, 0.0012422 secs]
//Concurrent mark
2018-05-03T10:21:43.214-0800: [GC concurrent-mark-start]
2018-05-03T10:21:43.214-0800: [GC concurrent-mark-end, 0.0004063 secs]
//Remark, also called final mark
2018-05-03T10:21:43.214-0800: [GC remark 2018-05-03T10:21:43.215-0800: [Finalize Marking, 0.0003736 secs] 2018-05-03T10:21:43.215-0800: [GC ref-proc, 0.0000533 secs] 2018-05-03T10:21:43.215-0800: [Unloading, 0.0007439 secs], 0.0013442 secs]
 [Times: user=0.00 sys=0.00, real=0.00 secs] 
//Exclusive (stop-the-world) cleanup
2018-05-03T10:21:43.216-0800: [GC cleanup 13M->13M(50M), 0.0004002 secs]
 [Times: user=0.01 sys=0.00, real=0.00 secs]


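    This is a complete GC log. Overall, young generation GCs may be interleaved before and after both the concurrent marking cycle and the mixed collections. The concurrent marking cycle mainly serves to reclaim old generation space, and of course it also includes one young generation GC.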
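
When many such logs have to be checked, the pause lines can be extracted programmatically. The sketch below is a hypothetical helper (G1PauseParser), written only against the pause-line format shown in the log above; other JVM versions and logging options print different formats, so the pattern is an assumption that would need adjusting.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class G1PauseParser {
    // Matches e.g. "[GC pause (G1 Humongous Allocation) (young) (initial-mark), 0.0035356 secs]"
    private static final Pattern PAUSE = Pattern.compile(
            "\\[GC pause \\((.+?)\\) \\((young|mixed)\\)( \\(initial-mark\\))?, ([0-9.]+) secs\\]");

    static final class Pause {
        final String cause;        // e.g. "G1 Humongous Allocation"
        final String kind;         // "young" or "mixed"
        final boolean initialMark; // true when the pause also carried the initial mark
        final double millis;       // pause duration in milliseconds

        Pause(String cause, String kind, boolean initialMark, double millis) {
            this.cause = cause;
            this.kind = kind;
            this.initialMark = initialMark;
            this.millis = millis;
        }
    }

    /** Parses one log line; returns empty when the line is not a pause entry. */
    static Optional<Pause> parse(String line) {
        Matcher m = PAUSE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        double millis = Double.parseDouble(m.group(4)) * 1000.0;
        return Optional.of(new Pause(m.group(1), m.group(2), m.group(3) != null, millis));
    }

    public static void main(String[] args) {
        String line = "2018-05-03T10:21:43.209-0800: [GC pause (G1 Humongous Allocation) "
                + "(young) (initial-mark), 0.0035356 secs]";
        parse(line).ifPresent(p ->
                System.out.println(p.kind + " pause, cause=" + p.cause + ", ms=" + p.millis));
    }
}
```

Comparing the parsed pause times against the configured -XX:MaxGCPauseMillis is one simple way to see whether the target is being met.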
