Netty 4 in practice: optimization by switching from the unpooled to the pooled buffer allocator

With modern JVMs and JIT compilation, allocating and reclaiming ordinary objects is a lightweight operation. Buffers are different: allocating and releasing direct (off-heap) memory in particular is expensive. To reuse buffers as much as possible, Netty provides a buffer reuse mechanism based on memory pools. Performance tests show that a pooled ByteBuf can perform roughly 23x better than a short-lived, unpooled ByteBuf (the exact numbers depend heavily on the usage scenario).
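The cost gap between fresh direct allocation and buffer reuse can be illustrated with plain JDK NIO, independent of Netty. This is a minimal sketch, not a rigorous benchmark; the buffer size and iteration count are arbitrary choices for illustration:

```java
import java.nio.ByteBuffer;

public class DirectAllocDemo {
    static final int SIZE = 64 * 1024;   // 64 KiB per buffer
    static final int ITERATIONS = 10_000;

    // Allocate a fresh direct buffer on every use (what an unpooled allocator does).
    static long allocateEachTimeNanos() {
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            ByteBuffer buf = ByteBuffer.allocateDirect(SIZE);
            buf.putInt(0, i);            // touch the memory so it is really used
        }
        return System.nanoTime() - start;
    }

    // Reuse one direct buffer for all iterations (the idea behind a pooled allocator).
    static long reuseOneBufferNanos() {
        ByteBuffer buf = ByteBuffer.allocateDirect(SIZE);
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            buf.clear();
            buf.putInt(0, i);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        System.out.printf("fresh allocation: %d ms, reused buffer: %d ms%n",
                allocateEachTimeNanos() / 1_000_000,
                reuseOneBufferNanos() / 1_000_000);
    }
}
```

On most JVMs the reuse loop is dramatically faster, because each `allocateDirect` call involves native memory allocation and zeroing, and reclaiming it depends on GC of the owning buffer object.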

In version 4.x, UnpooledByteBufAllocator is the default allocator, although it has some limitations. Now that PooledByteBufAllocator has been widely used for some time, and we have an enhanced buffer leak tracking mechanism, it's time to make PooledByteBufAllocator the default.
Before optimization:

When the number of players reached about 1100, direct memory usage grew rapidly and CPU usage soared along with it.
At the peak, the total server heap was 3993 MB with 3476 MB used; the direct memory limit was 2048 MB with 715.8125 MB used.

A heap dump showed that Netty's internal objects were retaining a large amount of memory:


Class Name                                                                        Shallow Heap  Retained Heap
class com.lingyu.game.service.stage.StageManager @ 0x738778950                               8    166,381,728
com.lingyu.game.service.equip.EquipDataTemplateManager @ 0x738051240                        64     61,389,640
io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask @ 0x7a8a1b698              40     56,363,384
com.lingyu.game.service.map.MapDataTemplateManager @ 0x738709b70                            64     48,234,856
com.lingyu.game.service.item.ItemRepository @ 0x7387965e0                                   24     45,883,384
io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask @ 0x7d0b4ad08              40     45,730,344
io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask @ 0x7dba870e8              40     43,118,248
io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask @ 0x76289b300              40     41,260,728
io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask @ 0x796226f90              40     33,083,800
io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask @ 0x7ec9f19a0              40     32,922,432
io.netty.channel.ChannelOutboundBuffer @ 0x754207b68                                        72     25,823,800
Total: 11 entries (sizes in bytes)

Optimization: our guess was that direct memory was insufficient, so buffer space was being allocated over and over, driving up CPU usage, while the direct memory already in use could never be reclaimed. After switching to the pooled allocator, with 1380 players online the CPU usage stayed around 100/1200, performance was very stable, and the full GC count was 0.
The total server heap was 3993 MB with 2150 MB used; the direct memory limit was 2048 MB with 400.00098 MB used.
 S0     S1     E      O      P      YGC   YGCT   FGC  FGCT   GCT
 0.00   0.71   8.87   15.05  71.87  175   2.638  0    0.000  2.638

The fix comes down to adding these two lines:
bootstrap.option(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT);
bootstrap.childOption(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT); // this is the key line
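In context, those two options slot into the server bootstrap roughly as follows. This is a hedged sketch: the port, thread-group sizes, and empty channel initializer are placeholders, not details from the original setup. Note that `option` configures the listening server channel itself, while `childOption` configures every accepted connection, which is where almost all of the buffer traffic actually happens:

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.buffer.PooledByteBufAllocator;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.NioServerSocketChannel;
import io.netty.channel.socket.SocketChannel;

public class PooledServer {
    public static void main(String[] args) throws InterruptedException {
        NioEventLoopGroup boss = new NioEventLoopGroup(1);
        NioEventLoopGroup workers = new NioEventLoopGroup();
        try {
            ServerBootstrap bootstrap = new ServerBootstrap()
                    .group(boss, workers)
                    .channel(NioServerSocketChannel.class)
                    // Pooled allocator for the server channel's own buffers...
                    .option(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT)
                    // ...and, crucially, for every accepted child channel.
                    .childOption(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            // application codecs and handlers would be added here
                        }
                    });
            bootstrap.bind(8080).sync().channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
        }
    }
}
```

One caveat worth remembering with the pooled allocator: pooled buffers must be released (reference-counted) correctly, otherwise pool chunks leak; Netty's leak detector helps catch this during testing.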

Class Name                                                                        Shallow Heap  Retained Heap
class com.lingyu.game.service.stage.StageManager @ 0x738977238                               8    126,628,072
com.lingyu.game.service.equip.EquipDataTemplateManager @ 0x7380c53f8                        64     61,391,800
com.lingyu.game.service.map.MapDataTemplateManager @ 0x738ce20d8                            64     48,234,856
com.xianling.stage.configure.entity.map.PathInfoTemplate @ 0x7389e7c60                      40      8,975,440
sun.misc.Launcher$AppClassLoader @ 0x738024e80                                              80      8,652,528
com.lmax.disruptor.RingBuffer @ 0x7382e4408                                                 32      7,340,056
com.lingyu.game.service.item.ItemDataTemplateManager @ 0x738bc4a30                          56      5,910,288
com.xianling.stage.configure.entity.map.PathInfoTemplate @ 0x73ac19aa8                      40      5,231,256
org.springframework.beans.factory.support.DefaultListableBeanFactory @ 0x7381979b8         200      5,172,192
com.xianling.stage.configure.entity.map.PathInfoTemplate @ 0x73addc8b8                      40      4,572,560
Total: 10 entries (sizes in bytes)

Summary: this optimization saved about 1.7 GB of memory and left direct memory usage at roughly 300 MB, with stable performance. The number of online players increased by about 300 (possibly limited by bandwidth; otherwise the gain should be larger), and CPU usage dropped from 100% to about 10%.
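The direct-memory figures quoted above can be observed from inside the JVM without a profiler, via the platform `BufferPoolMXBean` for NIO direct buffers. A minimal sketch (this tracks `ByteBuffer.allocateDirect` usage; Netty's own native allocations via `Unsafe` may not all appear in this pool):

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class DirectMemoryMonitor {
    // Bytes currently used by NIO direct buffers, or -1 if the pool is not found.
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();
        ByteBuffer buf = ByteBuffer.allocateDirect(1024 * 1024); // allocate 1 MiB
        long after = directMemoryUsed();
        System.out.printf("direct memory used: %d -> %d bytes%n", before, after);
        buf.putInt(0, 42); // keep the buffer live so it is not reclaimed early
    }
}
```

Logging this value periodically alongside `jstat` output gives the same before/after picture the article uses to judge the optimization.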
