Reference about GC space size and GC timing issues

http://www.importnew.com/13954.html

Quoted below:


[HotSpot VM] How to reduce the GC time of the young generation?


whb1984's blog

whb1984 2014-10-21

Hello everyone. Let me briefly describe the situation:

The system has been online for more than 10 hours, and CMS has only run once, which is acceptable. The problem is that the minor GC time keeps growing the longer the system runs; most pauses now exceed 0.1s. Roughly speaking, as old-generation memory usage increases, minor GC times gradually get longer. As I understand it, young-generation collection time should have nothing to do with how much of the old generation is occupied. So why do minor GCs take longer and longer?


Machine configuration: 
cpu: Intel(R) Xeon(R) CPU E5430 @ 2.66GHz × 8
memory: 16G              

jvm startup parameters:
-server -Xloggc:./log/gcviewer.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -verbosegc -Xms2048M -Xmx2048M -Xmn190M -XX:+DisableExplicitGC -XX:+ScavengeBeforeFullGC -XX:ParallelGCThreads=8 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=31 -XX:+AggressiveOpts -XX:MaxGCPauseMillis=100 -XX:+HeapDumpOnOutOfMemoryError -XX:PermSize=128M -XX:MaxPermSize=256M


gcviewer screenshot: The gray part is the GC time, and the blue part is the memory growth.


Cut out part of the gc log:
at the beginning: the beginning is acceptable, 0.02~0.03s
2014-10-21T16:12:00.390+0800: 333.022: [GC 333.022: [ParNew: 169158K->17105K(175104K), 0.0196470 secs ] 853779K->701725K(2077696K), 0.0203990 secs] [Times: user=0.13 sys=0.00, real=0.02 secs]
2014-10-21T16:12:02.701+0800: 335.333: [GC 335.333: [ParNew: 172753K->14227K(175104K), 0.0230520 secs] 857373K->698847K(2077696K), 0.0238350 secs] [Times: user=0.15 sys=0.00, real=0.03 secs]
2014-10-21T16:12:04.020+0800: 336.652: [GC 336.652: [ParNew: 169875K->16849K(175104K), 0.0217180 secs] 854495K->701470K(2077696K), 0.0224960 secs] [Times: user=0.14 sys=0.00, real=0.02 secs]
2014-10-21T16:12:05.425+0800: 338.057: [GC 338.057: [ParNew: 172497K->17403K(175104K), 0.0208420 secs] 857118K->702024K(2077696K), 0.0215930 secs] [Times: user=0.16 sys=0.00, real=0.03 secs]
2014-10-21T16:12:06.712+0800: 339.344: [GC 339.345: [ParNew: 173051K->15236K(175104K), 0.0201730 secs] 857672K->699857K(2077696K), 0.0209250 secs] [Times: user=0.15 sys=0.00, real=0.02 secs]
2014-10-21T16:12:08.034+0800: 340.666: [GC 340.667: [ParNew: 170884K->17870K(175104K), 0.0200380 secs] 855505K->702491K(2077696K), 0.0208800 secs] [Times: user=0.15 sys=0.01, real=0.02 secs]
2014-10-21T16:12:09.602+0800: 342.234: [GC 342.235: [ParNew: 173518K->16237K(175104K), 0.0223470 secs] 858139K->700858K(2077696K), 0.0230970 secs] [Times: user=0.17 sys=0.00, real=0.03 secs]
2014-10-21T16:12:11.041+0800: 343.673: [GC 343.674: [ParNew: 171885K->19456K(175104K), 0.0208920 secs] 856506K->704101K(2077696K), 0.0216500 secs] [Times: user=0.15 sys=0.00, real=0.02 secs]

After running for a while: slowly rising, now up to 0.12s
2014-10-21T20:58:10.920+0800: 17503.552: [GC 17503.553: [ParNew: 175104K->18907K(175104K), 0.1173460 secs] 1065519K->909322K(2077696K), 0.1181020 secs] [Times: user=0.91 sys=0.01, real=0.12 secs]
2014-10-21T20:58:17.784+0800: 17510.416: [GC 17510.417: [ParNew: 174555K->18403K(175104K), 0.1167540 secs] 1064970K->908818K(2077696K), 0.1175530 secs] [Times: user=0.92 sys=0.00, real=0.12 secs]
2014-10-21T20:58:23.240+0800: 17515.872: [GC 17515.872: [ParNew: 174051K->16445K(175104K), 0.1189100 secs] 1064466K->906861K(2077696K), 0.1197100 secs] [Times: user=0.92 sys=0.01, real=0.12 secs]
2014-10-21T20:58:31.390+0800: 17524.022: [GC 17524.022: [ParNew: 172093K->19456K(175104K), 0.1188790 secs] 1062509K->909974K(2077696K), 0.1196710 secs] [Times: user=0.93 sys=0.00, real=0.12 secs]
2014-10-21T20:58:35.824+0800: 17528.456: [GC 17528.457: [ParNew: 175104K->18279K(175104K), 0.1181890 secs] 1065622K->908798K(2077696K), 0.1189490 secs] [Times: user=0.93 sys=0.00, real=0.12 secs]
2014-10-21T20:58:44.207+0800: 17536.839: [GC 17536.840: [ParNew: 173927K->16366K(175104K), 0.1174270 secs] 1064446K->906884K(2077696K), 0.1182030 secs] [Times: user=0.92 sys=0.00, real=0.12 secs]

The highest period of time: about 0.2s.
2014-10-22T03:36:12.582+0800: 41385.215: [GC 41385.215: [ParNew: 175104K->19456K(175104K), 0.1745820 secs] 692711K->537299K(2077696K), 0.1754410 secs] [Times: user=1.37 sys=0.00, real=0.18 secs]
2014-10-22T03:36:19.549+0800: 41392.181: [GC 41392.182: [ParNew: 175104K->19456K(175104K), 0.1753550 secs] 692947K->537562K(2077696K), 0.1761940 secs] [Times: user=1.38 sys=0.00, real=0.18 secs]
2014-10-22T03:36:25.216+0800: 41397.848: [GC 41397.848: [ParNew: 175104K->19456K(175104K), 0.1746610 secs] 693210K->537878K(2077696K), 0.1754360 secs] [Times: user=1.37 sys=0.01, real=0.18 secs]
2014-10-22T03:36:29.907+0800: 41402.539: [GC 41402.539: [ParNew: 175104K->19456K(175104K), 0.1751190 secs] 693526K->538168K(2077696K), 0.1759340 secs] [Times: user=1.38 sys=0.01, real=0.17 secs]
2014-10-22T03:36:35.684+0800: 41408.317: [GC 41408.317: [ParNew: 175104K->19456K(175104K), 0.1750010 secs] 693816K->538480K(2077696K), 0.1758160 secs] [Times: user=1.38 sys=0.00, real=0.18 secs]
2014-10-22T03:36:40.335+0800: 41412.967: [GC 41412.967: [ParNew: 175104K->19456K(175104K), 0.1753520 secs] 694128K->538801K(2077696K), 0.1761410 secs] [Times: user=1.38 sys=0.01, real=0.18 secs]
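The growth across these three log excerpts can be checked mechanically. Below is a minimal sketch of pulling the wall-clock pause out of lines like the ones above; the helper name `real_pause_secs` and the regex are my own, not from the post:

```python
import re

# Matches the "real=0.12 secs" field of a -XX:+PrintGCDetails log line.
PAUSE_RE = re.compile(r"real\s*=\s*(\d+\.\d+)\s*secs")

def real_pause_secs(line: str) -> float:
    """Return the wall-clock ('real') pause reported in one GC log line."""
    m = PAUSE_RE.search(line)
    if m is None:
        raise ValueError("no [Times: ...] section found")
    return float(m.group(1))

early = "[Times: user=0.13 sys=0.00, real=0.02 secs]"
late = "[Times: user=0.91 sys=0.01, real=0.12 secs]"
# The pause grew sixfold between the first and second excerpt.
print(real_pause_secs(early), real_pause_secs(late))
```

Feeding the whole log through such a filter is one quick way to confirm the trend the OP describes without eyeballing gcviewer.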

Finally, to sum up my questions:
1. Are the JVM parameters and garbage collector settings reasonable?
2. Why does the minor GC time keep getting longer? (With a 190M young generation, what should I make of 0.1s~0.2s per GC? Is that too long?)
3. How can the minor GC time be reduced? (Right now the CMS GC time is negligible; the main problem is that minor GC takes too long.)




youlong699's blog

youlong699 2014-10-22

-Xmn190M is too small. From the logs, your survivor space looks full after every collection, so every minor GC promotes a large number of objects to the old generation. You can use jstat -gcutil to check the occupancy of S0/S1.

Also, as the available memory in the old generation decreases and fragmentation increases (after a few CMS GCs), allocating space in the old generation takes longer and longer.

My suggestion: try increasing Xmn. If you have a large number of long-lived objects, such as caches, the old generation should indeed take the larger share, but your current ratio is far too lopsided :)
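For the jstat -gcutil suggestion, survivor occupancy can also be read off programmatically. A sketch, where the sample line and the helper `survivor_usage` are hypothetical and only the column layout follows JDK 7's jstat -gcutil:

```python
# Hypothetical one-sample output of `jstat -gcutil <pid>` (JDK 7 layout);
# S0/S1/E/O/P are percentages used, YGC/FGC are counts, *GCT are seconds.
header = "S0     S1     E      O      P      YGC   YGCT    FGC  FGCT   GCT"
sample = "0.00  99.87  63.42  43.71  97.22  1204  142.80    1   0.93  143.73"

def survivor_usage(header_line: str, sample_line: str):
    """Return (S0%, S1%) from one jstat -gcutil sample."""
    cols = dict(zip(header_line.split(), sample_line.split()))
    return float(cols["S0"]), float(cols["S1"])

s0, s1 = survivor_usage(header, sample)
# The active survivor space sitting near 100% after each minor GC is the
# overflow/promotion symptom described above.
print(s0, s1)
```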



whb1984's blog

whb1984 2014-10-22


youlong699 wrote

-Xmn190M is too small. From the logs, your survivor space looks full after every collection, so every minor GC promotes a large number of objects to the old generation. You can use jstat -gcutil to check the occupancy of S0/S1.

Also, as the available memory in the old generation decreases and fragmentation increases (after a few CMS GCs), allocating space in the old generation takes longer and longer.

My suggestion: try increasing Xmn. If you have a large number of long-lived objects, such as caches, the old generation should indeed take the larger share, but your current ratio is far too lopsided :)


Thanks for the reply above. Let me explain my remaining questions about this part of the reasoning:

Quote

As the available memory in the old generation decreases and fragmentation increases (after a few CMS GCs), allocating space in the old generation takes longer and longer.

1. This matches the gcviewer graph and GC log above. But even before the JVM's first CMS cycle, minor GC time had already reached about 0.12s. Since no CMS had run at that point, can the connection to CMS be ruled out?
2. "As the available memory in the old generation decreases, allocating space there takes longer." What is the cause of this? Is it the promotion into the old generation performed by minor GC? But take the minor GC below: the amount of memory freed from the young generation equals the amount freed from the whole heap (175104 - 18907 = 1065519 - 909322 = 156197K), which means no promotion occurred, yet the GC still took 0.12s. There are many such minor GCs without promotion whose times are just as long.

quote

2014-10-21T20:58:10.920+0800: 17503.552: [GC 17503.553: [ParNew: 175104K->18907K(175104K), 0.1173460 secs] 1065519K->909322K(2077696K), 0.1181020 secs] [Times: user=0.91 sys=0.01, real=0.12 secs]
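The arithmetic in point 2 above can be written out directly. A small sketch (the helper name `promoted_kb` is my own) using the quoted line's numbers:

```python
def promoted_kb(young_before, young_after, heap_before, heap_after):
    """KB promoted to the old gen in one minor GC: the drop in young-gen
    usage minus the drop in whole-heap usage (heap = young + old)."""
    return (young_before - young_after) - (heap_before - heap_after)

# ParNew: 175104K->18907K, whole heap: 1065519K->909322K
print(promoted_kb(175104, 18907, 1065519, 909322))  # 0 -> nothing promoted
```

A zero result here supports the OP's point: this 0.12s pause cannot be explained by promotion cost, since nothing was promoted.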

3. About increasing the young generation size: I have tried this too. With a larger young generation, minor GC frequency drops and each GC takes longer, but overall throughput improves. Right now, though, throughput is secondary; we just want individual minor GC pauses reduced, even at the cost of throughput. As for the 190M young generation, it could actually be shrunk further, but a lower value makes GC too frequent and hurts throughput too much, which is undesirable.
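The throughput-versus-pause trade-off in point 3 can be made concrete. A rough sketch with illustrative numbers: only the 0.12s-pause-every-~6s pair is loosely taken from the logs above; the larger-young-gen pair is hypothetical:

```python
def gc_overhead(pause_secs: float, interval_secs: float) -> float:
    """Fraction of wall-clock time spent paused in minor GC."""
    return pause_secs / interval_secs

# Small young gen: ~0.12 s pause roughly every 6 s (from the logs above).
# Larger young gen (hypothetical): ~0.25 s pause every 20 s.
print(gc_overhead(0.12, 6.0))   # ~2% overhead, but frequent short pauses
print(gc_overhead(0.25, 20.0))  # lower total overhead, longer single pauses
```

This is exactly the tension the OP describes: a bigger young gen lowers total GC overhead (throughput) while making each individual pause worse.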

This is my personal understanding; I hope you can offer some advice.


pulsar_lxl's blog

pulsar_lxl 2014-11-05

The problem I ran into is similar to the OP's. My basic configuration is Xmx=3g, NewRatio=8, ParNew + CMS. As old-generation memory usage increases, response times gradually get longer and system throughput also drops drastically.

youlong699 wrote


As the available memory in the old generation decreases and fragmentation increases (after a few CMS GCs), allocating space in the old generation takes longer and longer.



Is there any theoretical basis for this claim? I only know that CMS allocates memory from a free list, and free-list allocation is slower than bump-the-pointer allocation. Is the point that it degrades even further as fragmentation grows?



xxd82329's blog

xxd82329 2014-11-13

Simply put, when doing a minor GC the JVM must also scan the old generation, because objects in the old gen may hold references into the young gen. So as the old gen grows, this scan takes more and more time.
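As background (my own addition, and a simplification): HotSpot does not literally scan every old-gen object on each minor GC; it keeps a card table and scans only the dirty cards. But the card table grows linearly with the old gen, and more live old-gen data generally means more dirty cards, so the scanning cost can still climb with old-gen footprint. A toy sketch of that scaling:

```python
CARD_SIZE = 512  # bytes covered by one card-table entry in HotSpot

def card_count(old_gen_bytes: int) -> int:
    """Number of card-table entries covering an old gen of this size."""
    return old_gen_bytes // CARD_SIZE

# A 1.5 GB old gen has 3x the cards of a 512 MB one; if the fraction of
# dirty cards stays similar, there is roughly 3x more to examine at YGC.
print(card_count(512 * 1024 * 1024))    # 1048576
print(card_count(1536 * 1024 * 1024))   # 3145728
```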

I also agree with the first reply that the young gen is too small. In general I tend to set the young gen to one third of the heap, and sometimes to one half.

Hope this is useful to the OP.



youlong699's blog

youlong699 2015-02-03

I saw this topic mentioned on the GC mailing list, remembered this post, and am adding the link here.
In summary: growth of the perm gen and code cache, plus the need to scan their references into the young gen, leads to longer scan times:
http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2015-January/002100.html



rink1969's blog

rink1969 2015-02-13

I also ran into a similar problem recently, so let me add my case for reference.
I eventually found the cause was a JVMTI agent. The agent kept creating JVMTI objects and JNI handles, which are also strong GC roots and must be scanned during every YGC.
But after a CMS GC they were all reclaimed, and YGC times returned to normal.



fh63045's blog

fh63045 2015-03-24

-server -Xloggc:./log/gcviewer.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -verbosegc -Xms2048M -Xmx2048M -Xmn190M -XX:+DisableExplicitGC -XX:+ScavengeBeforeFullGC -XX:ParallelGCThreads=8 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=31 -XX:+AggressiveOpts -XX:MaxGCPauseMillis=100 -XX:+HeapDumpOnOutOfMemoryError -XX:PermSize=128M -XX:MaxPermSize=256M

1. The young generation is too small; a common guideline is 1 to 1.5 times the size of the old generation after a full GC.
2. The tenuring threshold is set too high; use -XX:+PrintTenuringDistribution to see the actual age distribution.
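To illustrate point 2, -XX:+PrintTenuringDistribution output can be summarized like this. The sample text and byte counts below are made up; only the line format follows HotSpot's:

```python
import re

# Fabricated sample of -XX:+PrintTenuringDistribution output:
sample = """\
Desired survivor size 9830400 bytes, new threshold 31 (max 31)
- age   1:    6291456 bytes,    6291456 total
- age   2:    1048576 bytes,    7340032 total
- age   3:     524288 bytes,    7864320 total
"""

AGE_RE = re.compile(r"- age\s+(\d+):\s+(\d+) bytes")

def bytes_by_age(text: str) -> dict:
    """Map tenuring age -> bytes still alive at that age."""
    return {int(age): int(b) for age, b in AGE_RE.findall(text)}

dist = bytes_by_age(sample)
# Nothing survives past age 3 in this sample, so a MaxTenuringThreshold
# of 31 buys nothing and just keeps soon-dead objects copying between
# the survivor spaces on every minor GC.
print(max(dist))  # 3
```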

