Garbage collection (2) CMS

One, CMS

CMS stands for Concurrent Mark Sweep, a concurrent garbage collector built on the mark-sweep algorithm.

1, collection runs concurrently with the application, so STW (stop-the-world) pauses are short.
2, there is no compaction, so memory fragmentation builds up.

During marking, objects are divided into three categories according to their mark state:

1, white objects: the object itself has not been marked;
2, gray objects: the object itself is marked, but its internal references have not been processed yet;
3, black objects: the object itself is marked and its internal references have all been processed.
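The three colors can be sketched as a small worklist algorithm. This is only a toy model: `TriColorDemo`, `Node` and `Color` are made-up names for illustration and are not HotSpot types, and real CMS marking works on mark bitmaps rather than a color field.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Toy tri-color marker; all names here are illustrative, not HotSpot types. */
final class TriColorDemo {
    enum Color { WHITE, GRAY, BLACK }

    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Color color = Color.WHITE; // every object starts unmarked
        Node(String name) { this.name = name; }
    }

    /** Marks everything reachable from the roots; whatever stays WHITE is garbage. */
    static void mark(List<Node> roots) {
        Deque<Node> gray = new ArrayDeque<>();
        for (Node root : roots) {
            if (root.color == Color.WHITE) {
                root.color = Color.GRAY; // marked, references not processed yet
                gray.push(root);
            }
        }
        while (!gray.isEmpty()) {
            Node n = gray.pop();
            for (Node ref : n.refs) {
                if (ref.color == Color.WHITE) {
                    ref.color = Color.GRAY;
                    gray.push(ref);
                }
            }
            n.color = Color.BLACK; // marked, all references processed
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c"); // not referenced by anything
        a.refs.add(b);
        mark(Collections.singletonList(a));
        System.out.println(a.color + " " + b.color + " " + c.color);
    }
}
```

In this sketch `a` and `b` end up black and `c` stays white, so only `c` would be swept.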


Two, the CMS collection process is divided into five steps.

Assume the heap looks as follows before the CMS GC:
[Figure: heap structure before the CMS GC]

1, initial mark (InitialMarking)

This is an STW process with two main steps:
1, mark the old-generation objects directly reachable from GC Roots;
2, traverse the young-generation objects and mark the old-generation objects reachable from them.
Note that this phase does not scan any further from the old-generation objects it marks.

Result:
[Figure: heap after the initial mark]

2, concurrent mark and preclean (Marking & Precleaning & AbortablePreclean)

In this phase the application threads and the GC threads run concurrently. The GC threads start from the live objects marked in the InitialMarking phase and recursively mark every object reachable from them.
Because the application threads keep running, and a Young GC may also occur, the following changes can happen:
1, young-generation objects are promoted to the old generation;
2, objects are allocated directly in the old generation;
3, references held by old-generation and young-generation objects change.

Result:
[Figure: heap after the concurrent mark]

2.1, how are object changes during the concurrent mark handled?

CMS uses the Card Table discussed in the previous article to solve this problem.
When an object reference changes during the concurrent mark, the Card containing that object is recorded in the Card Table as a "dirty card", so that the later remark phase also traverses the objects in these cards as GC Roots.

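However, a Young GC can interfere. For example:
1, the concurrent mark has not yet scanned dirty card 1.
2, a Young GC finishes scanning the dirty card and changes it from dirty to clean.
3, the concurrent mark reaches card 1, finds that it is no longer dirty, and skips it, which causes a missed mark.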

2.2, how is the problem above solved?

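CMS has another data structure for this: the Mod Union Table.
The Mod Union Table is a bit vector. Each entry is only 1 bit, and each entry corresponds to one Card (a Card covers 512 bytes; each entry in the Card Table is 1 byte).
Before a young-generation GC processes a dirty card, it first sets that card's bit in the Mod Union Table.
This way, when CMS runs the remark phase, it scans the entries marked in both the Mod Union Table and the card table.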
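The interplay between the two tables can be sketched as follows. This is a toy model under stated assumptions: the class name `ModUnionDemo`, its methods, and the `CLEAN`/`DIRTY` byte values are invented for illustration and do not reflect HotSpot's actual encoding. Only the 512-byte card size, the 1-byte card table entry, and the 1-bit Mod Union entry come from the text.

```java
import java.util.BitSet;

/** Toy model of a card table plus Mod Union Table; names and values are illustrative. */
final class ModUnionDemo {
    static final int CARD_SIZE = 512;        // bytes of heap covered by one card
    static final byte CLEAN = 0, DIRTY = 1;  // made-up values, not HotSpot's encoding

    final byte[] cardTable; // one byte per card
    final BitSet modUnion;  // one bit per card

    ModUnionDemo(int heapBytes) {
        int cards = (heapBytes + CARD_SIZE - 1) / CARD_SIZE; // round up
        cardTable = new byte[cards];
        modUnion = new BitSet(cards);
    }

    static int cardIndex(int heapOffset) {
        return heapOffset / CARD_SIZE;
    }

    /** Stand-in for the write barrier: a reference at heapOffset was updated. */
    void recordWrite(int heapOffset) {
        cardTable[cardIndex(heapOffset)] = DIRTY;
    }

    /** Young GC: remember each dirty card in the Mod Union Table before cleaning it. */
    void youngGc() {
        for (int i = 0; i < cardTable.length; i++) {
            if (cardTable[i] == DIRTY) {
                modUnion.set(i);
                cardTable[i] = CLEAN;
            }
        }
    }

    /** Remark: a card is rescanned if either table has it marked. */
    boolean needsRescan(int card) {
        return cardTable[card] == DIRTY || modUnion.get(card);
    }

    public static void main(String[] args) {
        ModUnionDemo heap = new ModUnionDemo(4096); // 8 cards
        heap.recordWrite(700);                      // falls into card 1
        heap.youngGc();                             // card table entry is cleaned
        System.out.println("card table dirty: " + (heap.cardTable[1] == DIRTY));
        System.out.println("needs rescan:     " + heap.needsRescan(1));
    }
}
```

After the simulated Young GC the card table no longer shows card 1 as dirty, but the Mod Union bit still makes the remark phase rescan it, which is how the missed mark is avoided.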

3, remark (an STW process)

1, traverse the young-generation objects and remark;
2, remark starting from GC Roots;
3, traverse the old generation's dirty cards and the Mod Union Table and remark.

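Step 1 has to traverse every object in the young generation. If young-generation occupancy is high, there are many objects to process, which is a disaster for the total time of this phase: many of those objects may be only temporarily alive, and they may reference many old-generation objects, so old-generation objects that should be collected are kept, and the amount of recursive traversal also rises considerably. If a YGC runs just before this phase, scanning those dead objects can be avoided.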

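CMS provides a parameter for this: CMSScavengeBeforeRemark. It is disabled by default. When it is enabled, a YGC is forced before this phase runs, which reduces the time spent traversing young-generation objects and makes the collection a little more thorough.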
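On JDK versions that still ship CMS, the option is passed on the command line as `-XX:+CMSScavengeBeforeRemark`, typically alongside `-XX:+UseConcMarkSweepGC`. The sketch below checks whether the running JVM was started with it. `RemarkFlagCheck` and `isEnabled` are hypothetical helper names, and the check assumes that a later occurrence of the flag overrides an earlier one. It only inspects the launch arguments, not the JVM's effective setting.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical helper: inspects JVM launch arguments for a boolean -XX flag. */
final class RemarkFlagCheck {
    static final String FLAG = "CMSScavengeBeforeRemark";

    /** True if the last -XX:+flag / -XX:-flag occurrence in jvmArgs turns the flag on. */
    static boolean isEnabled(List<String> jvmArgs, String flag) {
        boolean enabled = false; // the option is off unless requested
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:+" + flag)) {
                enabled = true;
            } else if (arg.equals("-XX:-" + flag)) {
                enabled = false;
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        // Arguments the current JVM was launched with (excluding main-class arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println(FLAG + " requested: " + isEnabled(jvmArgs, FLAG));
    }
}
```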

4, concurrent sweep

Sweep the objects that the marking phases identified as unreachable.

5, reset

Clear the data structures and prepare for the next concurrent collection.

Origin blog.51cto.com/janephp/2427903