A JVM Crash Troubleshooting Retrospective

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/singgel/article/details/89233860

Preface:

Some of the links in this article require a proxy to get across the Great Firewall.

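1. Background

Date: 2019-04-03

Scenario: fifth round of load testing for the market-data backend service stock-api

Description: While the testers kept ramping up the load, at a QPS of 500-600 the stock-api service deployed on machine 10.10.21.27, which runs with the SkyWalking agent component, hit a JVM crash.

Detailed test report: fifth-round load test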

2. Investigation

2.1 Crash log summary

The JVM crash log yields the following summary.

JVM information:

      • Java VM: Java HotSpot(TM) 64-Bit Server VM (25.45-b02 mixed mode linux-amd64 compressed oops)
      • JRE version: Java(TM) SE Runtime Environment (8.0_45-b14) (build 1.8.0_45-b14)

System information:

      • OS: DISTRIB_ID=Ubuntu
      • DISTRIB_RELEASE=14.04
      • DISTRIB_CODENAME=trusty
      • DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"

Runtime CPU and memory information:

      • CPU: total 40 (10 cores per cpu, 2 threads per core) family 6 model 63 stepping 2, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2
      • Memory: 4k page, physical 65733504k(5283224k free), swap 0k(0k free)
      • load average: 21.87 16.41 12.45
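To put the CPU and memory figures above in proportion, the following is a minimal Python sketch that extracts them from the log lines and computes the free-memory fraction and the 1-minute load per logical CPU. The function names are hypothetical, the regular expressions only cover the line shapes quoted in this article, and what counts as a healthy value is left to the reader.

```python
import re

# Line text taken from the crash log summary above.
CPU_LINE = "CPU:total 40 (10 cores per cpu, 2 threads per core) family 6 model 63 stepping 2"
MEMORY_LINE = "Memory: 4k page, physical 65733504k(5283224k free), swap 0k(0k free)"
LOAD_LINE = "load average:21.87 16.41 12.45"


def parse_logical_cpus(line):
    """Return the logical CPU count from the 'CPU:total N' line."""
    m = re.search(r"CPU:\s*total (\d+)", line)
    if m is None:
        raise ValueError("unrecognised CPU line: " + line)
    return int(m.group(1))


def parse_memory_kb(line):
    """Return (physical_kb, free_kb) from the 'Memory:' line."""
    m = re.search(r"physical (\d+)k\((\d+)k free\)", line)
    if m is None:
        raise ValueError("unrecognised Memory line: " + line)
    return int(m.group(1)), int(m.group(2))


def parse_load_average(line):
    """Return the 1, 5 and 15 minute load averages as floats."""
    m = re.search(r"load average:\s*([\d.]+) ([\d.]+) ([\d.]+)", line)
    if m is None:
        raise ValueError("unrecognised load average line: " + line)
    return tuple(float(x) for x in m.groups())


cpus = parse_logical_cpus(CPU_LINE)
physical_kb, free_kb = parse_memory_kb(MEMORY_LINE)
load_1m, load_5m, load_15m = parse_load_average(LOAD_LINE)

free_fraction = free_kb / physical_kb   # share of physical memory reported free
load_per_cpu = load_1m / cpus           # 1-minute load per logical CPU

print("free memory: {:.1%}".format(free_fraction))
print("1-minute load per logical CPU: {:.2f}".format(load_per_cpu))
```

Note that the "free" figure does not include memory the OS could reclaim from the page cache, so it understates what is actually available.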

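2.2 Crash log analysis

2.2.1 Manual analysis

Summary of the exception:

  • Signal type: SIGSEGV
  • Signal number: 11 (0xb)
  • IP/PC register value (address of the instruction being executed): 0x00007f9801d60f9b
  • Process ID: 2944
  • Thread ID: 140287005390592
  • Problematic frame (the function in the shared library where the fault occurred): V [libjvm.so+0x4c0f9b] oopDesc::size()+0x2b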

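V denotes a VM frame. The other frame types are:

    • C: native frame
    • j: interpreted Java frame
    • v: VM-generated stub frame
    • J: other frame types, including compiled Java frames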
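The fields in the summary and the frame type legend above can be read out of the log mechanically. The sketch below is a minimal Python example with hypothetical helper names; the regular expressions are assumptions that fit the two lines quoted in this article, not a full hs_err parser. It also subtracts the frame offset from the pc to recover the address at which libjvm.so was mapped in the crashed process.

```python
import re

# Both lines are copied from the crash log quoted in this article.
SIGNAL_LINE = "# SIGSEGV (0xb) at pc=0x00007f9801d60f9b, pid=2944, tid=140287005390592"
PROBLEM_FRAME = "V [libjvm.so+0x4c0f9b] oopDesc::size()+0x2b"

# Frame type letters, as listed in the legend above.
FRAME_TYPES = {
    "C": "native frame",
    "j": "interpreted Java frame",
    "V": "VM frame",
    "v": "VM-generated stub frame",
    "J": "other frame types, including compiled Java frames",
}


def parse_signal_line(line):
    """Split the header line into signal name, number, pc, pid and tid."""
    m = re.search(
        r"(SIG[A-Z]+) \((0x[0-9a-f]+)\) at pc=(0x[0-9a-f]+), pid=(\d+), tid=(\d+)",
        line,
    )
    if m is None:
        raise ValueError("unrecognised signal line: " + line)
    return {
        "signal": m.group(1),
        "signo": int(m.group(2), 16),
        "pc": int(m.group(3), 16),
        "pid": int(m.group(4)),
        "tid": int(m.group(5)),
    }


def parse_problem_frame(line):
    """Split 'V [lib+0xOFFSET] symbol+0xN' into kind, library, offset, symbol."""
    m = re.match(r"([CjVvJ])\s+\[([^+\]]+)\+(0x[0-9a-f]+)\]\s+(.+)", line)
    if m is None:
        raise ValueError("unrecognised frame line: " + line)
    return {
        "kind": m.group(1),
        "library": m.group(2),
        "offset": int(m.group(3), 16),
        "symbol": m.group(4),
    }


header = parse_signal_line(SIGNAL_LINE)
frame = parse_problem_frame(PROBLEM_FRAME)

# The frame offset is relative to where the library was mapped, so
# pc - offset is the address libjvm.so was loaded at in this process.
load_base = header["pc"] - frame["offset"]

print(header["signal"], header["signo"], "in a", FRAME_TYPES[frame["kind"]])
print(frame["library"], "mapped at", hex(load_base))
```

The recovered base is page-aligned, which is a quick sanity check that the pc and the frame offset belong together.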

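Analysis of the crashing thread:

The thread details (registers, stack frames, and so on) and the thread stack show that the instruction that made the VM terminate unexpectedly was executing inside the JVM itself.

Top of the stack that caused the problem: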

Stack: [0x00007f971d0f7000,0x00007f971d1f8000], sp=0x00007f971d1f6a30, free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x4c0f9b] oopDesc::size()+0x2b
V [libjvm.so+0x4c2830] CMTask::deal_with_reference(oopDesc*)+0x180
V [libjvm.so+0x462164] ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)+0x64
V [libjvm.so+0x648d40] InstanceMirrorKlass::oop_oop_iterate_nv(oopDesc*, G1CMOopClosure*)+0x40
V [libjvm.so+0x4c40a7] CMBitMapClosure::do_bit(unsigned long)+0xa7
V [libjvm.so+0x4bd6c4] CMTask::do_marking_step(double, bool, bool)+0x914
V [libjvm.so+0x4c3d8d] CMConcurrentMarkingTask::work(unsigned int)+0xdd
V [libjvm.so+0xacc17f] GangWorker::loop()+0xcf
V [libjvm.so+0x910de8] java_start(Thread*)+0x108
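The stack dump above can be checked in two ways: whether every frame really belongs to the JVM's own library, and whether the reported free stack space agrees with the addresses. The following Python sketch does both under simple assumptions: the helper name and regular expressions are hypothetical and only cover the lines shown here, and free space is taken to be the distance from the stack pointer down to the low end of the stack.

```python
import re
from collections import Counter

# Copied from the stack dump above.
STACK_LINE = ("Stack: [0x00007f971d0f7000,0x00007f971d1f8000], "
              "sp=0x00007f971d1f6a30, free space=1022k")
FRAMES = """\
V [libjvm.so+0x4c0f9b] oopDesc::size()+0x2b
V [libjvm.so+0x4c2830] CMTask::deal_with_reference(oopDesc*)+0x180
V [libjvm.so+0x462164] ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)+0x64
V [libjvm.so+0x648d40] InstanceMirrorKlass::oop_oop_iterate_nv(oopDesc*, G1CMOopClosure*)+0x40
V [libjvm.so+0x4c40a7] CMBitMapClosure::do_bit(unsigned long)+0xa7
V [libjvm.so+0x4bd6c4] CMTask::do_marking_step(double, bool, bool)+0x914
V [libjvm.so+0x4c3d8d] CMConcurrentMarkingTask::work(unsigned int)+0xdd
V [libjvm.so+0xacc17f] GangWorker::loop()+0xcf
V [libjvm.so+0x910de8] java_start(Thread*)+0x108
"""

FRAME_RE = re.compile(r"([CjVvJ])\s+\[([^+\]]+)\+0x[0-9a-f]+\]\s+(.+)")


def parse_stack_line(line):
    """Return (low, high, sp, reported_free_kb) from the 'Stack:' line."""
    m = re.search(
        r"Stack: \[(0x[0-9a-f]+),(0x[0-9a-f]+)\],\s+sp=(0x[0-9a-f]+),"
        r"\s+free space=(\d+)k",
        line,
    )
    if m is None:
        raise ValueError("unrecognised Stack line: " + line)
    low, high, sp = (int(g, 16) for g in m.group(1, 2, 3))
    return low, high, sp, int(m.group(4))


# Each parsed frame is a (kind, library, symbol) tuple.
frames = [FRAME_RE.match(line).groups() for line in FRAMES.splitlines()]
kinds = Counter(kind for kind, _, _ in frames)
libraries = {library for _, library, _ in frames}

low, high, sp, reported_free_kb = parse_stack_line(STACK_LINE)
computed_free_kb = (sp - low) // 1024  # the stack grows down towards `low`

print("frame kinds:", dict(kinds))
print("libraries:", sorted(libraries))
print("free stack: computed {}k, reported {}k".format(computed_free_kb, reported_free_kb))
```

Every frame is a VM frame in libjvm.so and the stack still has about 1 MB free, which supports reading this as a fault inside the JVM's concurrent marking code rather than a stack overflow or a third-party native library.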

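2.2.2 Tool-assisted analysis

Running a third-party helper tool, CrashAnalysis (a crash-file analysis tool on GitHub), produced the following:

Diagnosis:

This problem was caused by an error in the JVM.
Analyze it using the problem points given below, with the OpenJDK implementation as a reference.
The context in the thread information also tells you where the code was executing when the error occurred.
See the runtime information section for internal error details.
There are two main directions to investigate for this kind of error:
1. Operating system: whether system resources or parameters caused the problem
2. Calls into third-party native libraries that caused the error
If neither applies, it may be a JDK bug; try a different OS or a different JDK.

Possible problem points:

Problem module: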
# V [libjvm.so+0x4c0f9b] oopDesc::size()+0x2b

-------------------------------------------------------
Exception module:
# SIGSEGV (0xb) at pc=0x00007f9801d60f9b, pid=2944, tid=140287005390592
-------------------------------------------------------

Thread information:

Currently executing thread:
Current thread (0x00007f97fc0a6000): ConcurrentGCThread [stack: 0x00007f971d0f7000,0x00007f971d1f8000] [id=3005]

-------------------------------------------------------
Corresponding stack:
Stack: [0x00007f971d0f7000,0x00007f971d1f8000], sp=0x00007f971d1f6a30, free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x4c0f9b] oopDesc::size()+0x2b
V [libjvm.so+0x4c2830] CMTask::deal_with_reference(oopDesc*)+0x180
V [libjvm.so+0x462164] ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)+0x64
V [libjvm.so+0x648d40] InstanceMirrorKlass::oop_oop_iterate_nv(oopDesc*, G1CMOopClosure*)+0x40
V [libjvm.so+0x4c40a7] CMBitMapClosure::do_bit(unsigned long)+0xa7
V [libjvm.so+0x4bd6c4] CMTask::do_marking_step(double, bool, bool)+0x914
V [libjvm.so+0x4c3d8d] CMConcurrentMarkingTask::work(unsigned int)+0xdd
V [libjvm.so+0xacc17f] GangWorker::loop()+0xcf
V [libjvm.so+0x910de8] java_start(Thread*)+0x108

-------------------------------------------------------

Runtime information:

JVM exception events:
Event: 1065.499 Thread 0x00007f92c80cd800 Exception <a 'java/lang/ClassNotFoundException': java.util.stream.Collectors$$Lambda$30/1653198164> (0x00000000eee2d518) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.499 Thread 0x00007f92c80cd800 Exception <a 'java/lang/ClassNotFoundException': java.util.stream.Collectors$$Lambda$30/1653198164> (0x00000000eeef34b0) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.500 Thread 0x00007f92d400c000 Exception <a 'java/lang/ClassNotFoundException': java.util.stream.Collectors$$Lambda$30/1653198164> (0x00000000ee887148) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.501 Thread 0x00007f92d400c000 Exception <a 'java/lang/ClassNotFoundException': java.util.stream.Collectors$$Lambda$30/1653198164> (0x00000000ee888e28) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.501 Thread 0x00007f92d4060000 Exception <a 'java/lang/ClassNotFoundException': java.util.stream.Collectors$$Lambda$30/1653198164> (0x00000000ef4140c0) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.501 Thread 0x00007f92d4058800 Exception <a 'java/lang/ClassNotFoundException': java.util.stream.Collectors$$Lambda$30/1653198164> (0x00000000ee532ab0) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.502 Thread 0x00007f92d4058800 Exception <a 'java/lang/ClassNotFoundException': java.util.stream.Collectors$$Lambda$30/1653198164> (0x00000000ee534790) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1067.101 Thread 0x00007f95e933a800 Exception <a 'java/net/ConnectException'> (0x00000000f891e000) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 709]
Event: 1093.851 Thread 0x00007f92c8034800 Implicit null exception at 0x00007f97f2817963 to 0x00007f97f2817b11
Event: 1100.826 Thread 0x00007f95e933a800 Exception <a 'java/net/ConnectException'> (0x00000000fa3bd060) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 709]

-------------------------------------------------------
Compilation events:
Event: 1100.733 Thread 0x00007f97fd6cb800 nmethod 46822 0x00007f97f0ceee50 code [0x00007f97f0cef100, 0x00007f97f0cf0408]
Event: 1100.735 Thread 0x00007f97fd6c9800 nmethod 46823 0x00007f97f0cddd90 code [0x00007f97f0cde1a0, 0x00007f97f0ce20c8]
Event: 1100.805 Thread 0x00007f97fd6b4000 46824 4 java.util.concurrent.LinkedTransferQueue::xfer (292 bytes)
Event: 1100.824 Thread 0x00007f97fd6b4000 nmethod 46824 0x00007f97ee2c4350 code [0x00007f97ee2c44e0, 0x00007f97ee2c52f8]
Event: 1103.449 Thread 0x00007f97fd6d0000 46826 3 org.apache.catalina.util.LifecycleSupport::fireLifecycleEvent (59 bytes)
Event: 1103.449 Thread 0x00007f97fd6cd800 46825 3 org.apache.catalina.util.LifecycleBase::fireLifecycleEvent (10 bytes)
Event: 1103.449 Thread 0x00007f97fd6c7800 46827 3 org.apache.catalina.LifecycleEvent::<init> (16 bytes)
Event: 1103.451 Thread 0x00007f97fd6cd800 nmethod 46825 0x00007f97f0c7bd50 code [0x00007f97f0c7bec0, 0x00007f97f0c7c0e8]
Event: 1103.451 Thread 0x00007f97fd6c7800 nmethod 46827 0x00007f97f0d10550 code [0x00007f97f0d106e0, 0x00007f97f0d10b28]
Event: 1103.452 Thread 0x00007f97fd6d0000 nmethod 46826 0x00007f97efd957d0 code [0x00007f97efd959e0, 0x00007f97efd963a8]

-------------------------------------------------------
Events:
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT PACKING pc=0x00007f97f2a081f5 sp=0x00007f92b2cc5300
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT UNPACKING pc=0x00007f97ed047633 sp=0x00007f92b2cc50c8 mode 1
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT PACKING pc=0x00007f97f293d004 sp=0x00007f92b2cc55e0
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT UNPACKING pc=0x00007f97ed047633 sp=0x00007f92b2cc5340 mode 1
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT PACKING pc=0x00007f97f3bc3280 sp=0x00007f92b2cc5650
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT UNPACKING pc=0x00007f97ed047633 sp=0x00007f92b2cc5238 mode 1
Event: 1106.690 Thread 0x00007f9648e56000 DEOPT PACKING pc=0x00007f97f41d4d8c sp=0x00007f92aa53f9f0
Event: 1106.690 Thread 0x00007f9648e56000 DEOPT UNPACKING pc=0x00007f97ed047633 sp=0x00007f92aa53f438 mode 1
Event: 1106.690 Thread 0x00007f9648e56000 DEOPT PACKING pc=0x00007f97f41d394c sp=0x00007f92aa53f900
Event: 1106.690 Thread 0x00007f9648e56000 DEOPT UNPACKING pc=0x00007f97ed047633 sp=0x00007f92aa53f3d0 mode 1

-------------------------------------------------------

System information:

Machine memory information:
Memory: 4k page, physical 65733504k(5283224k free), swap 0k(0k free)

-------------------------------------------------------

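2.3 Root-cause investigation

The analysis in 2.2.1 and 2.2.2 leads to a basic conclusion.

The JVM parameters of the running project are fine, and the CPU load and memory of the host are fine (load average 21.87 on 40 logical CPUs; 5283224k of 65733504k physical memory free).

Checking the JVM source code:

I read through g1ConcurrentMark.cpp (without fully understanding it) and found that the source has changed since this version; I could not find an issue for the bug fix there.

Searching the Oracle Java Bug Database for the G1 garbage collector's deal_with_reference method turned up the following reply: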

Comments
This issue is duplicate of JDK-8168914 as reported. Issue observed on 8u144 b01.
This issue is already fixed in 8u152 b04.

Kindly update to latest Java version to avoid this issue - http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Please do let us know if you still observe the issue.

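The problem is caused by the JDK version and had already been reported by others. In JDK 8 it is fixed as of update 152.

Looking through the JDK bug-fix history shows that the issue, JDK-8168914: Crash in ClassLoaderData/JNIHandleBlock::oops_do during concurrent marking, is recorded in the backports.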

3. Current conclusion

The JVM crash was caused by the JDK version in use, 8u45 b14, being too old; upgrading to 8u152 b04 or later should resolve it.
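As a rough way to check other machines against this conclusion, the sketch below compares a JDK 8 version string with the fixed build 8u152 b04 named in the bug report. The helper names are hypothetical, and the check assumes that JDK 8 builds can be ordered simply by (update, build); that is a simplification and does not account for fixes delivered through separate update lines.

```python
import re


def parse_jdk8_build(text):
    """Parse strings like '1.8.0_45-b14' or '8u152 b04' into (update, build)."""
    m = re.search(r"(?:1\.8\.0_|8u)(\d+)[- ]b(\d+)", text)
    if m is None:
        raise ValueError("not a recognised JDK 8 version: " + text)
    return int(m.group(1)), int(m.group(2))


# Fixed build for JDK-8168914 according to the bug database reply quoted above.
FIXED_IN = parse_jdk8_build("8u152 b04")


def has_fix(version_text):
    """True if the version is at or after the fixed build (assumed linear order)."""
    return parse_jdk8_build(version_text) >= FIXED_IN


# The first string is the build from the crash log; the second is the build
# on which the bug database reply says the issue was observed.
for version in ("1.8.0_45-b14", "8u144 b01", "8u152 b04"):
    print(version, "->", "has fix" if has_fix(version) else "needs upgrade")
```

The string for the running JVM can be taken from the "JRE version" line of the crash log or from `java -version`.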

As of 2019-04-11 the latest release is 8u201. JDK download: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

---------------------------------------------------------------------------------------------------------------------------------------------

The above is the result of the investigation as of 2019-04-11. Please get in touch if you have questions, and follow along for updates.

References:

1.https://blog.csdn.net/chenssy/article/details/78271744

2.http://www.raychase.net/1459

3.https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8191009

4.https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8168914

5.https://hg.openjdk.java.net/jdk/jdk

6.https://github.com/openjdk

7.https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
