What You Don't Know About the CMS GC

Before G1 arrived, CMS was the de facto standard for OLTP systems. Even several years after G1's release, many production JVM instances still run the ParNew + CMS combination. Yet despite being so widely used, it is still deeply misunderstood by many people. This article focuses on the classic ParNew + CMS combination and uses its garbage collection modes and their triggers to correct a few concepts.

Background

Most people know only "CMS" and have never heard of Background CMS. In fact, the CMS we usually talk about, the one with the 5 CMS phases, is Background CMS, as shown below:

[Figure: the phases of a Background CMS collection]

Notes:

  • The figure shows the initial mark phase as serial, which is the JDK 7 behavior. From JDK 8 on it is parallel by default, controlled by the parameter -XX:+CMSParallelInitialMarkEnabled.

  • As the figure shows, two CMS phases are fully STW (Stop The World): initial mark and final mark (remark).

  • The other phases are concurrent, which is why CMS is called Concurrent Mark & Sweep. I think it is more accurate to put "Mostly" in front: CMS is a Mostly Concurrent Mark and Sweep Garbage Collector, because it cannot be completely concurrent.

This is not unique to CMS: neither G1 nor the ZGC of JDK 11 is completely concurrent. As far as I understand, among all current GCs only Azul's C4 is fully concurrent.

Why is it called Background? When configuring CMS garbage collection, two parameters matter most: -XX:CMSInitiatingOccupancyFraction=75 and -XX:+UseCMSInitiatingOccupancyOnly. Together they mean that a CMS GC is started only once the old generation is 75% occupied. Note that this is only the condition a CMS GC must satisfy; when the CMS GC actually starts is decided by a background thread. By default the CMS thread checks every 2 seconds whether a collection should be triggered. The interval can be changed with -XX:CMSWaitDuration, for example -XX:CMSWaitDuration=5000. You can see this thread in jstack output:

"Concurrent Mark-Sweep GC Thread" os_prio=2 tid=0x000000001870f800 nid=0x0f4 waiting on condition
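To make the trigger condition concrete, here is a minimal Java sketch of the occupancy check described above. It is not HotSpot's code: the class CmsTriggerSketch and the method exceedsThreshold are hypothetical names, and matching the pool name on "Old Gen" is an assumption about how the JVM reports the old generation. Only the java.lang.management calls are standard API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of the JVM: models the occupancy test that
// -XX:CMSInitiatingOccupancyFraction=75 describes.
class CmsTriggerSketch {

    // True when used/capacity has reached the given percentage.
    static boolean exceedsThreshold(long usedBytes, long capacityBytes, int fractionPercent) {
        if (capacityBytes <= 0) {
            return false; // capacity unknown, nothing to compare against
        }
        return usedBytes * 100.0 / capacityBytes >= fractionPercent;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: the old generation pool name contains "Old Gen".
            if (!pool.getName().contains("Old Gen")) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            // getMax() may be -1 (undefined); fall back to the committed size.
            long capacity = usage.getMax() > 0 ? usage.getMax() : usage.getCommitted();
            System.out.println(pool.getName() + " past 75%: "
                    + exceedsThreshold(usage.getUsed(), capacity, 75));
        }
    }
}
```

Running this inside an application only reports whether the condition currently holds; the decision to start a cycle remains with the CMS background thread.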

Foreground

I first heard this term from Ben Shen (笨神). Of course, it is not a word he made up: the term comes from the OpenJDK source, see concurrentMarkSweepGeneration.cpp:

void CMSCollector::collect_in_foreground(bool clear_all_soft_refs, GCCause::Cause cause) {
  ... ...
  switch (_collectorState) {
    ... ...
    case Resizing: {
        // nothing to be done in this state
        _collectorState = Resetting;
        break;
    }
    case Precleaning:
        // precleaning does no work here and falls through
    case AbortablePreclean:
        // Elide the preclean phase (that is, skip it; no work is done in this phase either)
        _collectorState = FinalMarking;
        break;
    default:
        ShouldNotReachHere();
  }
  ... ...
}

The source is long, so I have not pasted all of it; interested readers can download the source code and read it.

When does this happen? For example, an application thread requests memory but there is not enough, which may trigger a CMS GC. The application thread cannot continue until the allocation succeeds, so the whole collection has to be STW. To be more efficient, this kind of CMS GC does not go through every phase, only some of them. As the source above shows, the skipped phases belong to the concurrent cycle: Precleaning, AbortablePreclean, and Resizing. Either way, while a foreground CMS GC is running all application threads are stopped, so the cost is high.
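As a rough illustration of the phase skipping described above, the following Java sketch models which collector states do work in each mode. It is only a model, not the JVM's implementation: the class and method names are hypothetical, and the state names InitialMarking, Marking, and Sweeping are assumed here to complete the cycle (only Precleaning, AbortablePreclean, FinalMarking, Resizing, and Resetting appear in the excerpt).

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Illustrative model only; the real logic lives in CMSCollector (C++).
class ForegroundPhasesSketch {

    // Collector states, named after the HotSpot states quoted above.
    // InitialMarking, Marking and Sweeping are assumed for completeness.
    enum Phase {
        InitialMarking, Marking, Precleaning, AbortablePreclean,
        FinalMarking, Sweeping, Resizing, Resetting
    }

    // Phases in which the foreground collector does no work, per the excerpt.
    static final Set<Phase> SKIPPED_IN_FOREGROUND =
            EnumSet.of(Phase.Precleaning, Phase.AbortablePreclean, Phase.Resizing);

    // Returns the phases that actually do work in the given mode.
    static List<Phase> phasesRun(boolean foreground) {
        List<Phase> run = new ArrayList<>();
        for (Phase p : Phase.values()) {
            if (foreground && SKIPPED_IN_FOREGROUND.contains(p)) {
                continue;
            }
            run.add(p);
        }
        return run;
    }

    public static void main(String[] args) {
        System.out.println("background: " + phasesRun(false));
        System.out.println("foreground: " + phasesRun(true));
    }
}
```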

Foreground collection mode is in effect a Full GC. The comparison above shows that the gap between a Full GC and CMS Background collection mode is very large.

A Full GC is the worst case for the ParNew + CMS combination. At that point concurrent mode can no longer cope, and the whole collection is single-threaded and fully STW; it may even compact the heap (whether it does is covered in the MSC section below). It really cannot get any worse. Imagine traffic is heavy at that moment: a Full GC that pauses the service for several seconds, or even 10 seconds or more, has a large impact on user experience.

Also, do not assume that G1 is much better. G1 has Full GCs too, and they are just as bad:

The G1 garbage collector is designed to avoid full collections, but when the concurrent collections can't reclaim memory fast enough a fall back full GC will occur. The current implementation of the full GC for G1 uses a single threaded mark-sweep-compact algorithm.

Source: http://openjdk.java.net/jeps/307

MSC

MSC stands for Mark Sweep Compact. MSC is an algorithm; note the word Compact, meaning it compacts the heap. This is important.

It is the garbage collection algorithm that foreground CMS uses in specific situations. As for when MSC compaction is used, look at the source again, still in concurrentMarkSweepGeneration.cpp:

//a method used by foreground collection to determine what type of collection 
//(compacting or not, continuing or fresh)it should do.
void CMSCollector::decide_foreground_collection_type(){
  ... ... 
  *should_compact =
    UseCMSCompactAtFullCollection &&
    ((_full_gcs_since_conc_gc >= CMSFullGCsBeforeCompaction) ||
     GCCause::is_user_requested_gc(gch->gc_cause()) ||
     gch->incremental_collection_will_fail(true /* consult_young */));    
  ... ... 
}

From this source we can see that, provided -XX:+UseCMSCompactAtFullCollection is set, foreground collection uses the compacting MSC algorithm in any of three cases:

  1. Since the last concurrent CMS GC, the number of Full GCs has reached the value of -XX:CMSFullGCsBeforeCompaction. A value of 0 means every Full GC compacts, and 0 is the default;

  2. System.gc() was called, which of course requires -XX:-DisableExplicitGC;

  3. Promotion guarantee failure: the old generation is not expected to have enough space for the objects the next young GC will promote.
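The three cases map directly onto the boolean expression in the C++ excerpt. Below is a small Java mirror of that expression for experimenting with the combinations. The class, method, and parameter names are illustrative only and are not a JVM API.

```java
// Hypothetical Java mirror of the should_compact expression quoted above.
class CompactionDecisionSketch {

    static boolean shouldCompact(boolean useCmsCompactAtFullCollection,
                                 int fullGcsSinceConcGc,
                                 int cmsFullGcsBeforeCompaction,
                                 boolean userRequestedGc,
                                 boolean incrementalCollectionWillFail) {
        return useCmsCompactAtFullCollection
                && (fullGcsSinceConcGc >= cmsFullGcsBeforeCompaction
                    || userRequestedGc
                    || incrementalCollectionWillFail);
    }

    public static void main(String[] args) {
        // Defaults (flag on, threshold 0): every foreground collection compacts.
        System.out.println(shouldCompact(true, 0, 0, false, false)); // true
        // Threshold 3, one Full GC so far, no other trigger: no compaction.
        System.out.println(shouldCompact(true, 1, 3, false, false)); // false
        // Same counters, but the GC was user requested: compaction.
        System.out.println(shouldCompact(true, 1, 3, true, false));  // true
        // Flag off: this path never compacts.
        System.out.println(shouldCompact(false, 5, 0, true, true));  // false
    }
}
```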

What can be done?

Fragmentation is the most criticized aspect of the mark-sweep algorithm that CMS uses: the mark-sweep algorithm of Background CMS leads to memory fragmentation, which sets the stage for a Full GC and the long STW pause that comes with it.

If Full GC is this bad, is there a way to mitigate it, or at least to keep it from happening during the day or at peak times? There is. Let me share a folk remedy. I no longer remember how many years ago or where I picked it up, but some Alibaba services are said to have used it too. Wherever it came from, it certainly works. The recipe is simple: force a Full GC during the lowest-traffic period (for many businesses in China, 2 or 3 a.m.) to compact the heap and clean up fragmentation, which lowers the probability of a Full GC at peak times (it can only lower it, not eliminate it). This relies on the parameter combination -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0, both of which are defaults, meaning a triggered Full GC compacts the heap.

Some readers may not know how to force a Full GC, so let me see this through to the end:

# Calling System.gc() causes a Full GC, provided -XX:+DisableExplicitGC is not enabled
System.gc();

Or trigger it with the jmap command:
# jmap -histo:live pid

Summary

As usual, a summary to finish:

  • Under normal circumstances a CMS GC is triggered in Background mode, that is, concurrent collection mode, with little impact on the business. Everyone is happy.

  • When concurrent mode cannot cope, CMS degenerates into Foreground mode. Application threads are unavailable during the collection, and what happens at this point is a Full GC.

  • In Foreground mode, the conditions listed above decide whether the MSC algorithm is used to compact the heap.

  • CMSFullGCsBeforeCompaction decides after how many Full GCs the heap is compacted. The exact value is up to you, but it should not be too large; otherwise, before MSC compaction gets to run, memory fragmentation can cause promotion failures. In short, it is a trade-off.


Origin blog.51cto.com/14230003/2437694