A deeply hidden production JVM disaster: analysis, troubleshooting, and resolution

1, Background

This article walks through a rather special JVM tuning case. A novice engineer, who knew only a little about JVM tuning, picked up an unusual JVM parameter from somewhere and set it incorrectly, which caused frequent Full GCs in a production system.

Many of our later tuning cases are odd scenarios of this kind as well. It is exactly these odd scenarios that gradually build up practical JVM tuning experience.

The more scenarios you study, the more comfortable you will be when you face JVM performance problems yourself.

2, How the problem started

The incident went roughly like this. One day a novice engineer on the team, probably on a whim, found a JVM parameter online and thought he had discovered a secret martial-arts manual. So when a system went live that day, he took the liberty of setting that JVM parameter.

What was the parameter?

Don't worry, it will come out in the analysis below. For now, just know that he set an unusual parameter, and then the incident happened.

Larger companies usually connect their systems to a monitoring system such as Zabbix, OpenFalcon, or one developed in-house. Once your system is connected, you can see the load on each machine: CPU, disk, memory, and network.

You can also see line charts of JVM memory usage and of how often the JVM runs GC. If you report business metrics, they show up in the monitoring system as well.

You normally also set alarms for production machines and systems. For example, you can require that if a system's JVM runs more than 3 Full GCs within 10 minutes, an alert is sent to you by SMS, email, or an IM tool such as DingTalk.
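The alarm rule described above (more than 3 Full GCs within 10 minutes) can be sketched as a sliding-window counter. This is only an illustration: the FullGcAlarm class and its record method are hypothetical names, not part of Zabbix, OpenFalcon, or any other monitoring system.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: signals when too many Full GCs fall inside one time window. */
public class FullGcAlarm {
    private final long windowMillis;
    private final int maxEvents;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public FullGcAlarm(long windowMillis, int maxEvents) {
        this.windowMillis = windowMillis;
        this.maxEvents = maxEvents;
    }

    /** Records one Full GC at nowMillis; returns true if the alarm should fire. */
    public boolean record(long nowMillis) {
        timestamps.addLast(nowMillis);
        // Drop events that have fallen out of the window.
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size() > maxEvents;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        FullGcAlarm alarm = new FullGcAlarm(10 * minute, 3);
        System.out.println(alarm.record(0));          // 1st Full GC in the window
        System.out.println(alarm.record(2 * minute)); // 2nd
        System.out.println(alarm.record(4 * minute)); // 3rd
        System.out.println(alarm.record(6 * minute)); // 4th: more than 3, alarm fires
    }
}
```

A real monitoring system would feed this kind of rule from GC logs or JMX data; the timestamps here are passed in by hand so the behavior is easy to follow.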

Monitoring systems are outside the scope of this column, so please look them up yourself. In this column we rely on command-line tools such as jstat. On Linux you can have jstat monitor the JVM automatically and write the results to a file on that machine.

The next day you can open the file and see the GC statistics of the JVM on that machine.

So even without visualization tools, the simplest command-line tool can achieve a similar effect.

On the day the engineer set that JVM parameter, the production JVM started running Full GC frequently and the alarms came in. Everyone was puzzled, and the investigation began.

3, Check the GC log

We explained earlier how to enable GC log output when starting a system. So once an alarm fires, you log in to the production machine and look at the corresponding GC log.

This time the log contained a large number of Full GC records.

What was causing the Full GCs?

The log contained the phrase "Metadata GC Threshold", in entries similar to the following:

[Full GC (Metadata GC Threshold) xxxxx, xxxxx]

From this we know that the frequent Full GCs were caused by the Metaspace, the metadata area introduced in JDK 1.8, which is similar to the permanent generation we discussed earlier.

This area generally holds the classes that are loaded into the JVM.

That is what makes this case strange: why was the Metaspace being filled up again and again, triggering Full GC? And as we all know, a Full GC brings with it a CMS collection of the old generation and also collects the Metaspace itself.

Let's take a look at the following figure:


[Figure]

4, Check Metaspace memory usage

Next we naturally want to look at the memory usage of the Metaspace. You can observe it simply with jstat. If you have a monitoring system, it will show a chart of how Metaspace usage fluctuates, similar to the following.


[Figure]

Metaspace usage shows a fluctuating pattern. It keeps rising until it peaks and the Metaspace is full, which naturally triggers a Full GC. The Full GC collects garbage in the Metaspace, so its usage then drops to a very small value.
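If neither jstat nor a monitoring chart is at hand, the same figure can be read from inside the process through the standard java.lang.management API. The sketch below assumes a HotSpot JVM on JDK 8 or later, where the memory pool is named "Metaspace"; the MetaspaceUsage class name is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class MetaspaceUsage {
    /** Returns the used bytes of the pool named "Metaspace", or -1 if there is no such pool. */
    public static long usedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: HotSpot names this pool exactly "Metaspace".
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1L : usage.getUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        System.out.println("Metaspace used bytes: " + usedBytes());
    }
}
```

Sampling this value periodically and plotting it would give the same sawtooth curve described above: a steady rise, then a sharp drop after each Full GC.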

5, Putting the analysis together

By now you probably have a feel for what is going on. Clearly, while the system runs, new classes keep being generated and loaded into the Metaspace. The Metaspace keeps filling up, which triggers a Full GC that reclaims some of the classes in it.

This process then repeats over and over: the Metaspace fills up again and again, and each time it causes another Full GC, as shown in the figure below.


[Figure]

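6, Which classes keep being loaded?

Now it gets puzzling: which classes keep being loaded into the JVM's Metaspace?

To find out, add the following two parameters to the JVM startup parameters:

"-XX:+TraceClassLoading -XX:+TraceClassUnloading"

As the names suggest, these two parameters trace class loading and class unloading: the JVM logs which classes it has loaded and which it has unloaded.

With these two parameters added, a pile of log lines appears in Tomcat's catalina.out log file, with content similar to the following: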

[Loaded sun.reflect.GeneratedSerializationConstructorAccessor from __JVM_DefineClass__]

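You can clearly see that, while running, the JVM kept loading a large number of so-called "GeneratedSerializationConstructorAccessor" classes into the Metaspace,

as shown in the figure below.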


[Figure]

Evidently, it is because the JVM kept loading these odd classes at run time and kept filling up the Metaspace that Full GC ran again and again.
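Besides the trace log, the overall amount of class loading and unloading can be read as plain counters through the standard ClassLoadingMXBean. This is a minimal sketch, not the method used in the original investigation; the ClassLoadingCounters class name is made up for this example.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

public class ClassLoadingCounters {
    /** Returns {currently loaded, total loaded since JVM start, total unloaded}. */
    public static long[] snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return new long[] {
            bean.getLoadedClassCount(),
            bean.getTotalLoadedClassCount(),
            bean.getUnloadedClassCount()
        };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.println("loaded now:     " + s[0]);
        System.out.println("loaded total:   " + s[1]);
        System.out.println("unloaded total: " + s[2]);
    }
}
```

If the total loaded and unloaded counters both keep climbing while the system is under steady load, classes are being generated and discarded continuously, which matches the pattern in this case.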

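This is a very practical point that everyone should master: frequent Full GC is not only triggered by the old generation. Sometimes it is triggered by too many classes in the Metaspace.

At this point we are slowly getting closer to the truth.

7, Why are the odd classes loaded so frequently?

When you hit a problem like this, it is time to search Google or Baidu (Google is recommended). Look at the classes that keep being loaded and work out what they are: classes you wrote yourself, or classes built into the JDK?

Take the class above. If you look it up, you will quickly find out that it is loaded when you use reflection in Java. Reflection code looks roughly like this.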

Method method = XXX.class.getDeclaredMethod(xx,xx);

method.invoke(target,params);

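A friendly reminder: reflection is one of the most basic concepts in Java. If you are not familiar with it, look it up.

Put simply, XXX.class gets a class, and getDeclaredMethod gets a method of that class.

That method is a Method object, and Method.invoke then calls that method on an object of the class. That is roughly the idea.

When this kind of reflection code runs, the JVM dynamically generates some classes after a certain number of reflective calls. These are the odd classes we saw earlier.

The next time you run the reflective call, it goes directly through the methods of these generated classes. This is a low-level optimization mechanism of the JVM.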
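The reflection pattern above can be made concrete with a small runnable sketch. The Greeter class and the callReflectively helper are hypothetical names for illustration. As far as we know, on a JDK 8 HotSpot JVM the generated accessor classes start to appear after roughly 15 calls to the same method (the sun.reflect.inflationThreshold setting); newer JDK releases implement reflection differently, so treat that number as an assumption.

```java
import java.lang.reflect.Method;

public class ReflectionDemo {
    /** Hypothetical target class for the reflective call. */
    public static class Greeter {
        public String greet(String name) {
            return "hello " + name;
        }
    }

    /** Looks up a one-argument String method by name and invokes it reflectively. */
    public static String callReflectively(Object target, String methodName, String arg) {
        try {
            Method method = target.getClass().getDeclaredMethod(methodName, String.class);
            return (String) method.invoke(target, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        Greeter greeter = new Greeter();
        String last = null;
        // Repeated reflective calls are what may make the JVM generate accessor classes.
        for (int i = 0; i < 20; i++) {
            last = callReflectively(greeter, "greet", "jvm");
        }
        System.out.println(last); // prints "hello jvm"
    }
}
```

Running this on JDK 8 with -XX:+TraceClassLoading is one way to watch for the generated classes in the log.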

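Is anyone a bit confused at this point?

It does not really matter. Being confused by this passage will not affect your JVM tuning at all.

Just remember one conclusion: if your code makes heavy use of reflection like the above, the JVM will dynamically generate classes and put them into the Metaspace.

So the odd classes seen above were generated by reflection code being executed over and over, as shown in the figure below.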


[Figure]

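8, What is special about the odd classes the JVM creates?

The next puzzle is why the JVM keeps creating those odd classes and putting them into the Metaspace.

To analyze this, start from one point: the Class objects of the odd classes that the JVM creates by itself are all held through SoftReference, that is, soft references.

Surely you have heard of a class's Class object? Put simply, every class is itself an object, a Class object, and one Class object represents one class. The class it represents can in turn have many instance objects.

For example, class Student is a class, and it is itself represented by an object of type Class.

But when you write Student student = new Student(), you instantiate an object of the Student class, which is an instance object of type Student.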
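The difference between the single Class object and the many instance objects can be shown in a few lines. This is a minimal sketch; ClassObjectDemo and its nested Student class are illustrative names only.

```java
public class ClassObjectDemo {
    /** Stand-in for the Student class from the text. */
    static class Student {}

    /** True if two distinct instances share one and the same Class object. */
    public static boolean sameClassObject() {
        Student a = new Student();
        Student b = new Student();
        Class<?> viaLiteral = Student.class; // the one Class object for Student
        return a != b && a.getClass() == viaLiteral && b.getClass() == viaLiteral;
    }

    public static void main(String[] args) {
        System.out.println(sameClassObject()); // prints "true"
    }
}
```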

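So the Class objects we are talking about here are the Class objects of the classes that the JVM generates dynamically during reflection, and they are all softly referenced through SoftReference.

As for soft references, we covered them in one of the earliest articles: normally they are not collected, but when memory gets tight they are.

So what formula decides whether a SoftReference object is collected during GC?

It is the following formula: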

clock - timestamp <= freespace * SoftRefLRUPolicyMSPerMB

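In this formula, "clock - timestamp" is how long a soft reference object has gone without being accessed, freespace is the free memory in the JVM in MB, and SoftRefLRUPolicyMSPerMB is how many milliseconds a SoftReference object may survive for each MB of free memory. As long as the inequality holds, the object is kept.

For example, suppose the JVM has created a pile of odd classes whose Class objects are all softly referenced through SoftReference.

Suppose the free memory in the JVM is 3000 MB and SoftRefLRUPolicyMSPerMB has its default value of 1000 milliseconds. Then those softly referenced Class objects can survive 3000 * 1000 ms = 3000 seconds, which is about 50 minutes.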
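The arithmetic above can be checked with a direct transcription of the formula. This sketch only mirrors the inequality as written in this article; it is not the JVM's internal code, and the class and method names are made up for the example.

```java
public class SoftRefPolicyExample {
    /** Longest idle time in ms before a soft reference becomes collectable. */
    public static long maxIdleMillis(long freeSpaceMb, long softRefLruPolicyMsPerMb) {
        return freeSpaceMb * softRefLruPolicyMsPerMb;
    }

    /** The formula from the text: kept while clock - timestamp <= freespace * SoftRefLRUPolicyMSPerMB. */
    public static boolean isKept(long clockMs, long timestampMs,
                                 long freeSpaceMb, long softRefLruPolicyMsPerMb) {
        return clockMs - timestampMs <= maxIdleMillis(freeSpaceMb, softRefLruPolicyMsPerMb);
    }

    public static void main(String[] args) {
        long limitMs = maxIdleMillis(3000, 1000);
        System.out.println(limitMs + " ms = " + (limitMs / 1000) + " s"); // 3000000 ms = 3000 s
        // Idle for 10 minutes: still within the limit, so the reference is kept.
        System.out.println(isKept(10 * 60_000L, 0, 3000, 1000)); // true
        // Idle for 60 minutes: past the 50-minute limit, so it may be collected.
        System.out.println(isKept(60 * 60_000L, 0, 3000, 1000)); // false
    }
}
```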

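This is only an example, of course. As we all know, when a GC happens there is usually at least some free memory inside the JVM, so unless an OOM is imminent, soft references are generally not collected.

So, in principle, as reflection code runs the JVM dynamically creates some odd classes whose Class objects are softly referenced. Normally they are not collected, but their number should not grow rapidly either.

9, Why do the odd classes created by the JVM keep multiplying?

So why exactly do the odd classes created by the JVM keep multiplying?

The reason is simple. The novice engineer from the beginning of the article dug up the SoftRefLRUPolicyMSPerMB startup parameter from somewhere and set it straight to 0.

His thinking was: with this parameter at 0, every soft reference object is released as soon as possible instead of lingering, freeing up as much memory as possible. Wouldn't that improve memory utilization?

That was naive.

In fact, once this parameter is set to 0, the right-hand side of clock - timestamp <= freespace * SoftRefLRUPolicyMSPerMB becomes 0. As a result, all soft reference objects, such as the odd Class objects the JVM generates, may be partly swept away by a Young GC right after they are created.

As shown in the figure below.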


[Figure]

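For example, the JVM goes to the trouble of generating 100 odd classes for you, and because of your careless soft reference setting a single GC suddenly reclaims dozens of them.

Then, as the reflection code keeps running, the JVM goes on creating these odd classes, and under the JVM's mechanisms their number keeps growing.

Perhaps the next GC reclaims some of them, but the JVM immediately generates more. Eventually the Metaspace fills up, which triggers a Full GC that reclaims many classes, and then the cycle repeats, as shown in the figure below.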


[Figure]

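Many people will ask: why exactly does the fast collection of softly referenced classes, caused by the wrong parameter setting, make the JVM keep creating more new classes?

You do not need to dig into the details here. They involve a lot of low-level JDK source code and are extremely complex; explaining them properly would take several articles.

Just remember the conclusion and understand the principle.

10, How to solve the problem

Although we did not analyze the low-level JDK implementation details, we have worked out the overall picture, and the problem and its cause are clear.

The solution is simple. In scenarios with a lot of reflection code, just change the parameter

-XX:SoftRefLRUPolicyMSPerMB=0

to a larger value. Never let a novice set it to 0. Values of 1000, 2000, 3000, or 5000 milliseconds are all fine.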

Raising this value keeps the softly referenced Class objects that the JVM creates automatically during reflection from being collected so easily. After we tuned this parameter, the system ran stably.

Metaspace memory usage was basically stable and no longer swung up and down.
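To guard against the same mistake in the future, a service could check the effective value of this parameter at startup. The sketch below assumes a HotSpot JVM, where the com.sun.management.HotSpotDiagnosticMXBean is available; the SoftRefFlagCheck class name and the warning text are made up for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class SoftRefFlagCheck {
    /** Reads the effective value of SoftRefLRUPolicyMSPerMB from the running HotSpot JVM. */
    public static long currentValue() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return Long.parseLong(bean.getVMOption("SoftRefLRUPolicyMSPerMB").getValue());
    }

    public static void main(String[] args) {
        long value = currentValue();
        System.out.println("SoftRefLRUPolicyMSPerMB = " + value);
        if (value == 0) {
            // Hypothetical warning; a real system might log this or raise an alert.
            System.out.println("WARNING: soft references may be collected almost immediately");
        }
    }
}
```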



Origin blog.51cto.com/14480698/2437111