[HBase from Beginner to Expert series] How to avoid the problems caused by writing to HBase too fast

Abstract: First, a brief review of the whole write path: client api ==> RPC ==> server IPC ==> RPC queue ==> RPC handler ==> write WAL ==> write memstore ==> flush to filesystem. The write path starts with the client calling the API; the data is encoded into a request with protobuf and sent by the socket-based IPC module into the server's RPC queue.
 
First, let's briefly review the entire writing process
 
client api ==> RPC ==> server IPC ==> RPC queue ==> RPC handler ==> write WAL ==> write memstore ==> flush to filesystem
 
The whole write path starts with the client calling the API. The data is encoded into a request with protobuf and sent by the socket-based IPC module into the server's RPC queue. Eventually a handler responsible for processing RPCs takes the request off the queue and performs the write. A write first goes to the WAL file, then a copy is written to memory, that is, the memstore; when the right conditions are met, the memstore is flushed to the underlying filesystem, forming an HFile.
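To make the first hop of this path concrete, here is a minimal Java client write. It is only a sketch: the table name "test_table", column family "cf", qualifier "q", and row key are made up for the example.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SimpleWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("test_table"))) {
            // The Put is serialized with protobuf and sent through the socket-based
            // IPC layer into the RegionServer's RPC queue; a handler then writes the
            // WAL and the memstore before the call returns.
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
            table.put(put);
        }
    }
}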
 
What problems do you encounter when writing too fast?
 
When writes come in too fast, the memstore water level is pushed up quickly.
You may see logs similar to the following:
 
RegionTooBusyException: Above memstore limit, regionName=xxxxx ...
 
This means the region's memstore has exceeded 4 times its normal size; at that point the exception is thrown, the write request is rejected, and the client starts retrying. A memstore triggers a flush when it reaches 128 MB; if it grows to 128 MB * 4 before the flush can happen, the exception is thrown and writes are rejected. The defaults for the two related parameters are as follows:
 
hbase.hregion.memstore.flush.size = 128M
hbase.hregion.memstore.block.multiplier = 4
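As a rough illustration of the arithmetic, the sketch below reads these two parameters and prints the per-region blocking threshold (with the defaults, 128 MB * 4 = 512 MB). It only demonstrates how the two settings combine; it is not code from HBase itself.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class MemstoreBlockThreshold {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        long flushSize = conf.getLong("hbase.hregion.memstore.flush.size", 128L * 1024 * 1024);
        long multiplier = conf.getLong("hbase.hregion.memstore.block.multiplier", 4L);
        // A region flushes once its memstore reaches flushSize; if it still grows to
        // flushSize * multiplier, writes are rejected with RegionTooBusyException.
        System.out.println("per-region blocking threshold = " + (flushSize * multiplier) + " bytes");
    }
}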
 
You may also see a log like this:
 
regionserver.MemStoreFlusher: Blocking updates on hbase.example.host.com,16020,1522286703886: the global memstore size 1.3 G is >= than blocking 1.3 G size
regionserver.MemStoreFlusher: Memstore is above high water mark and block 528 ms
 
This means the combined memstore memory of all regions on the RegionServer has exceeded the configured limit, which by default is 40% of the heap, and writes are therefore blocked. The goal is to wait for the flush threads to get the in-memory data flushed out; otherwise, continuing to accept writes into the memstore would blow the memory up.
 
hbase.regionserver.global.memstore.upperLimit = 0.4  # older versions; still honored by newer versions
hbase.regionserver.global.memstore.size = 0.4        # newer versions
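The check behind that log line is simply a fraction of the RegionServer heap. The standalone sketch below reproduces the arithmetic with an assumed heap of about 3.25 GB, so that 0.4 * heap is roughly 1.3 GB as in the log above; all numbers are illustrative, not taken from a real server.

public class GlobalMemstoreLimit {
    public static void main(String[] args) {
        long heapBytes = 3_489_660_928L;              // assumed ~3.25 GB RegionServer heap
        double globalMemstoreFraction = 0.4;          // hbase.regionserver.global.memstore.size
        long blockingLimit = (long) (heapBytes * globalMemstoreFraction);
        long currentGlobalMemstore = 1_400_000_000L;  // pretend combined memstore usage
        if (currentGlobalMemstore >= blockingLimit) {
            // Corresponds to "Blocking updates ... the global memstore size ... is >= than blocking ... size"
            System.out.println("updates blocked until flushes bring usage back under " + blockingLimit + " bytes");
        }
    }
}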
 
When writes are blocked, the request queue starts to back up, and with bad luck this ends in an OOM. You may find the JVM crashing with an OOM, or see logs like the following:
 
ipc.RpcServer: /192.168.x.x:16020 is unable to read call parameter from client 10.47.x.x
java.lang.OutOfMemoryError: Java heap space
 
In my view HBase has a rather bad design here: it catches the OOM exception but does not terminate the process. By then the process may no longer be able to run properly, and you will see many other threads throwing OOM exceptions in the log as well. For example, stop may simply not work, and the RegionServer can end up in a zombie state.
 
How to avoid RegionServer OOM?
 
One approach is to speed up flushing:
 
hbase.hstore.blockingWaitTime = 90000  # ms
hbase.hstore.flusher.count = 2
hbase.hstore.blockingStoreFiles = 10
 
When the hbase.hstore.blockingStoreFiles limit is reached, flushes are blocked until the compaction work finishes; the blocking time is hbase.hstore.blockingWaitTime, which can be lowered. hbase.hstore.flusher.count can be sized according to the machine type; unfortunately the count does not adjust dynamically with write pressure, setting it higher is of little use outside heavy data-import scenarios, and changing it requires a restart.
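The following is a deliberately simplified sketch of that blocking behavior, not HBase's actual flush code: when a store already holds blockingStoreFiles HFiles, a flush waits up to blockingWaitTime for compaction to shrink the file count before proceeding.

public class FlushBlockSketch {
    static final int BLOCKING_STORE_FILES = 10;   // hbase.hstore.blockingStoreFiles
    static final long BLOCKING_WAIT_MS = 90_000L; // hbase.hstore.blockingWaitTime

    static void maybeDelayFlush(java.util.concurrent.atomic.AtomicInteger storeFileCount)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + BLOCKING_WAIT_MS;
        while (storeFileCount.get() >= BLOCKING_STORE_FILES
                && System.currentTimeMillis() < deadline) {
            Thread.sleep(100); // in the real server this waits for compactions to finish
        }
        // Either the file count dropped below the threshold or the wait timed out;
        // the flush proceeds in both cases, which is why compaction must keep up.
    }

    public static void main(String[] args) throws InterruptedException {
        maybeDelayFlush(new java.util.concurrent.atomic.AtomicInteger(5)); // below threshold: no wait
    }
}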
 
By the same logic, if flushing speeds up, compaction has to keep up as well; otherwise the number of files keeps growing, scan performance drops, and overhead increases.
 
hbase.regionserver.thread.compaction.small = 1
hbase.regionserver.thread.compaction.large = 1
 
Adding compaction threads increases CPU and bandwidth overhead and may affect normal requests. Unless you are importing data, the defaults are generally enough. Fortunately, in Cloud HBase this setting can be adjusted dynamically without a restart.
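If you do raise the thread counts, the values would normally go into hbase-site.xml; the short Java sketch below just shows the property names being set programmatically, e.g. for an embedded or test setup, and the value 2 is only an example.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CompactionThreads {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // More compaction threads help keep the HFile count down when flushes speed up,
        // at the cost of extra CPU and disk/network bandwidth on the RegionServer.
        conf.setInt("hbase.regionserver.thread.compaction.small", 2);
        conf.setInt("hbase.regionserver.thread.compaction.large", 2);
        System.out.println("small pool = " + conf.getInt("hbase.regionserver.thread.compaction.small", 1));
    }
}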
 
All of the settings above require manual intervention, and if the intervention comes too late the server may already have run out of memory. Is there a better way to keep this under control?
 
hbase.ipc.server.max.callqueue.size = 1024 * 1024 * 1024  # 1G
 
This directly caps how much the queue can accumulate. Once a large backlog builds up, later requests cannot be processed before the client times out anyway, and an unbounded backlog eventually leads to OOM; the default of 1 GB assumes a relatively large-memory instance type. When the queue hits the limit, the client receives a CallQueueTooBigException and retries automatically. This prevents the server from being overwhelmed when writes come in too fast and provides a degree of back-pressure. In production it has worked well for keeping smaller instance types stable.
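On the client side this back-pressure mostly just means more retries. The sketch below shows a write with an explicitly tuned retry budget; the table and column names are made up, and catching plain IOException is a simplification (the client surfaces CallQueueTooBigException through its retry machinery).

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class BackpressureAwareWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.setInt("hbase.client.retries.number", 10); // give the automatic retries some room
        conf.setLong("hbase.client.pause", 100);        // base back-off between retries, in ms
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("test_table"))) {
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
            try {
                table.put(put);
            } catch (IOException e) {
                // Retries exhausted while the server queue stayed full; the application
                // decides whether to back off further or fail the write.
                System.err.println("write failed after retries: " + e);
            }
        }
    }
}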
 
 