Analyzing and Fixing an Off-Heap Memory Leak OOM in the Gateway


1. Introduction

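In the previous article we covered some common problems that come up when using the Gateway and how to handle them.

Today we look at a different one: an off-heap memory leak in the Gateway that ends in an OOM.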

2. The OOM Problem

  • Check the error log:

[Screenshot: error log]

  • Use top to check the process's memory usage:

[Screenshot: top output for the gateway process]

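As shown, RES is 3.4G. RES is the physical memory the process actually occupies.

The situation arose like this:

After running 9 million interface requests through it, the gateway kept holding on to memory without releasing it.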

3. Finding the Cause

On GitHub, plenty of people have reported this problem; see issue 2090, issue 2245, and others.

The error thrown is OutOfDirectMemoryError, which points to an off-heap memory leak.
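
One way to watch direct memory from inside the JVM is the standard `BufferPoolMXBean`. The sketch below is only an illustration and is not part of the original troubleshooting: the class name `DirectMemoryProbe` and its method are made up here, and it reads the JDK's own "direct" buffer pool. OutOfDirectMemoryError comes from Netty's own accounting, and depending on how Netty allocates its buffers, that memory may not be counted in this JDK pool, so treat the number as a partial view.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class DirectMemoryProbe {

    /** Returns the bytes used by the JDK "direct" buffer pool, or -1 if the pool is not found. */
    static long directPoolUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directPoolUsed();
        // Allocate 16 MB outside the Java heap through the JDK API.
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        long after = directPoolUsed();
        System.out.println("direct pool grew by " + (after - before) + " bytes");
        // Touch the buffer so it stays reachable until here.
        System.out.println("buffer capacity: " + buffer.capacity());
    }
}
```

Logging this value periodically alongside RES from top can help show whether growth happens outside the heap.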

We can confirm this by analyzing the JVM's memory (steps as follows):

  • Use jps to find the pid of the java process
jps

# Output:
20859 gateway.jar
  • Use jmap to inspect heap memory usage
jmap -heap 20859

# Output:
Attaching to process ID 20859, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.281-b09

using thread-local object allocation.
Parallel GC with 8 thread(s)

Heap Configuration:
   MinHeapFreeRatio         = 0
   MaxHeapFreeRatio         = 100
   MaxHeapSize              = 6442450944 (6144.0MB)
   NewSize                  = 357564416 (341.0MB)
   MaxNewSize               = 2147483648 (2048.0MB)
   OldSize                  = 716177408 (683.0MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 0 (0.0MB)

Heap Usage:
PS Young Generation
Eden Space:
   capacity = 2042101760 (1947.5MB)
   used     = 367695456 (350.6617126464844MB)
   free     = 1674406304 (1596.8382873535156MB)
   18.0057362077784% used
From Space:
   capacity = 19922944 (19.0MB)
   used     = 0 (0.0MB)
   free     = 19922944 (19.0MB)
   0.0% used
To Space:
   capacity = 19398656 (18.5MB)
   used     = 0 (0.0MB)
   free     = 19398656 (18.5MB)
   0.0% used
PS Old Generation
   capacity = 726138880 (692.5MB)
   used     = 53896160 (51.399383544921875MB)
   free     = 672242720 (641.1006164550781MB)
   7.4222936526962995% used

33813 interned Strings occupying 3327048 bytes.

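1. The maximum heap size, MaxHeapSize, is set to 6G.

2. The garbage collector in use by default is Parallel GC.

UseParallelGC: Parallel Scavenge (young generation) + Parallel Old (old generation)

3. Eden Space used: about 350MB. PS Old Generation used: about 51MB.

From the above, Eden Space + PS Old Generation comes nowhere near 3.4G.

This shows that heap usage is normal. Combined with the OutOfDirectMemoryError, we can conclude that off-heap memory is leaking.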
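
The comparison above can be written out as a small calculation. This is just a sketch using the "used" byte counts from the jmap output and the 3.4G RES value from top (taken here as roughly 3.4 × 1024 MB); the class and method names are made up for illustration.

```java
public class HeapVsRes {

    static final double MB = 1024.0 * 1024.0;

    /** Sums the "used" byte counts of the heap regions reported by jmap -heap. */
    static long usedHeapBytes(long... regionUsedBytes) {
        long total = 0;
        for (long bytes : regionUsedBytes) {
            total += bytes;
        }
        return total;
    }

    public static void main(String[] args) {
        // Eden, From, To, PS Old Generation "used" values from the jmap output.
        long used = usedHeapBytes(367695456L, 0L, 0L, 53896160L);
        double usedMb = used / MB;
        double resMb = 3.4 * 1024; // RES from top, approximate

        System.out.printf("heap used: %.1f MB%n", usedMb);
        System.out.printf("RES:       %.1f MB%n", resMb);
        System.out.printf("not explained by used heap: %.1f MB%n", resMb - usedMb);
    }
}
```

Note that RES can also include heap pages that are committed but currently free, so this gap is only a rough indicator. The OutOfDirectMemoryError remains the deciding clue.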

  • The default garbage collector on jdk8 is UseParallelGC

We can switch to CMS and test again. The command is:

java -Xms6144m -Xmx6144m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/logs/gateway.hprof -XX:SurvivorRatio=8 -XX:+UseConcMarkSweepGC -XX:+PrintCommandLineFlags -jar gateway.jar

Now look at the jmap result:

jmap -heap 10010

# Output:
Attaching to process ID 10010, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.281-b09

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio         = 40
   MaxHeapFreeRatio         = 70
   MaxHeapSize              = 6442450944 (6144.0MB)
   NewSize                  = 697892864 (665.5625MB)
   MaxNewSize               = 697892864 (665.5625MB)
   OldSize                  = 5744558080 (5478.4375MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 0 (0.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 628162560 (599.0625MB)
   used     = 159850792 (152.44559478759766MB)
   free     = 468311768 (446.61690521240234MB)
   25.447360632254174% used
Eden Space:
   capacity = 558432256 (532.5625MB)
   used     = 112419424 (107.21151733398438MB)
   free     = 446012832 (425.3509826660156MB)
   20.131255455272267% used
From Space:
   capacity = 69730304 (66.5MB)
   used     = 47431368 (45.23407745361328MB)
   free     = 22298936 (21.26592254638672MB)
   68.02116910317787% used
To Space:
   capacity = 69730304 (66.5MB)
   used     = 0 (0.0MB)
   free     = 69730304 (66.5MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 5744558080 (5478.4375MB)
   used     = 22242664 (21.212257385253906MB)
   free     = 5722315416 (5457.225242614746MB)
   0.3871953889271148% used

33105 interned Strings occupying 3208384 bytes.

Young generation collector: UseParNewGC

Old generation collector: UseConcMarkSweepGC
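
Besides reading the jmap header, the active collectors can also be listed from inside the process with the standard `GarbageCollectorMXBean` API. The following is a minimal sketch (the class name `GcNames` is made up here); the exact names printed depend on the JVM and its flags, so no particular output is assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcNames {

    /** Returns the names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Typically one entry for the young generation and one for the old generation.
        System.out.println(collectorNames());
    }
}
```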

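From the above, Eden Space + concurrent mark-sweep generation again comes nowhere near 3.4G.

So heap usage is normal under CMS as well. Combined with the OutOfDirectMemoryError, we can conclude that off-heap memory is leaking.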

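For an off-heap memory leak in the gateway, the thing to look at is where DataBuffer objects are used.

In our code, that is the following class:

[Screenshot: the class that uses DataBuffer]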

GitHub issue 2467 also reports an OOM of this kind.

Looking at our CacheBodyGlobalFilter code, it indeed never releases the DataBuffer.

Now that we have found the cause, let's fix it:

[Screenshot: the fixed code]

The key part of the fix is releasing the DataBuffer:

// Release the off-heap memory
DataBufferUtils.release(dataBuffer);
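
To illustrate why a missing release adds up over millions of requests, here is a toy model of a reference-counted buffer. It is a deliberately simplified sketch: `ToyBuffer` and `simulate` are hypothetical names, not Spring or Netty classes, and the real pooling logic is considerably more involved.

```java
import java.util.concurrent.atomic.AtomicLong;

public class RefCountDemo {

    /** Hypothetical stand-in for a pooled off-heap buffer. */
    static final class ToyBuffer {
        private final AtomicLong outstanding;
        private final int size;
        private boolean released;

        ToyBuffer(AtomicLong outstanding, int size) {
            this.outstanding = outstanding;
            this.size = size;
            outstanding.addAndGet(size); // memory is taken from the pool
        }

        /** Gives the bytes back to the pool; returns false if already released. */
        boolean release() {
            if (released) {
                return false;
            }
            released = true;
            outstanding.addAndGet(-size);
            return true;
        }
    }

    /** Handles the given number of request bodies and returns the bytes still held afterwards. */
    static long simulate(int requests, int bodySize, boolean releaseAfterUse) {
        AtomicLong outstanding = new AtomicLong();
        for (int i = 0; i < requests; i++) {
            ToyBuffer body = new ToyBuffer(outstanding, bodySize);
            // ... the body would be read or cached here ...
            if (releaseAfterUse) {
                body.release();
            }
        }
        return outstanding.get();
    }

    public static void main(String[] args) {
        System.out.println("without release: " + simulate(1000, 1024, false) + " bytes held");
        System.out.println("with release:    " + simulate(1000, 1024, true) + " bytes held");
    }
}
```

In the toy model the garbage collector never returns the bytes on its own, which mirrors why the gateway's memory kept climbing until the explicit release was added.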

After testing the fix, memory basically no longer climbs continuously without coming back down.

Problem solved!!!


That's all for today!!! ^_^

Reposted from juejin.im/post/7106858929200005134