1. Problems encountered
- HBase data is written frequently, so MemStore flushes are triggered many times, which takes regions and RegionServers (RS) offline and causes writes to fail
- HBase compaction takes the RS offline and causes queries to fail
2. HBase read and write process
- Storage structure of HBase data: https://blog.csdn.net/eyeofeagle/article/details/105506606
- HBase read and write flow chart: https://sematext.com/blog/hbase-memstore-what-you-should
- https://blog.cloudera.com/guide-to-using-apache-hbase-ports/
- Table data read and write:
1 (get the server address of the meta region through zk)
2 (find the regionserver that hosts the table's region, then start reading and writing)
- Manage HBase tables (memstore flushing, HFile compaction):
1 (get the server address of the meta region through zk)
3 (create/modify the table through the master node)
4 (the master node notifies the regionserver to open/close/move/split/flush/compact regions)
5 (the master node stores its data, such as the server address of the meta region and log-splitting tasks, in zk)
6, 7, 8 (the regionserver reads the zk data, performs tasks such as log splitting, and reports to the master node)
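The lookup in steps 1 and 2 of the table read/write path can be sketched as a small simulation. This is not the HBase client API: the `ZK` and `META` dictionaries, the `locate_region` function, the `hadoop2` host, and the split at row key `m` are all hypothetical stand-ins, used only to show the order of the lookups.

```python
import bisect

# Hypothetical stand-in for ZooKeeper: only the znode that points at the
# regionserver hosting the meta region.
ZK = {"/hbase/meta-region-server": "hadoop1:60020"}

# Hypothetical stand-in for the meta region on that server:
# table -> list of (region start key, regionserver address), sorted by start key.
META = {
    "hadoop1:60020": {
        "t1": [("", "hadoop1:60020"), ("m", "hadoop2:60020")],
    }
}


def locate_region(table, row_key):
    """Return the address of the regionserver that serves row_key."""
    # Step 1: ask zk where the meta region lives.
    meta_server = ZK["/hbase/meta-region-server"]
    # Step 2: ask the meta region which regionserver hosts the row's region.
    regions = META[meta_server][table]
    start_keys = [start for start, _ in regions]
    # The owning region is the last one whose start key is <= row_key.
    idx = bisect.bisect_right(start_keys, row_key) - 1
    return regions[idx][1]


print(locate_region("t1", "apple"))
print(locate_region("t1", "zebra"))
```

The point of the sketch is that the master node does not appear anywhere in the read/write path: the client only needs zk and the meta region to find the right regionserver.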
Log in through the zookeeper-client command line to view the data in zk:
[zk: localhost:2181(CONNECTED) 2] ls /hbase
[ meta-region-server, rs, master, namespace, hbaseid, table .... ]
[zk: localhost:2181(CONNECTED) 3] get /hbase/master
�master:60000-��3�gNNPBUF
hadoop1�����.��
...
[zk: localhost:2181(CONNECTED) 4] ls /hbase/rs
[hadoop1,60020,1605056703882]
[zk: localhost:2181(CONNECTED) 5] get /hbase/meta-region-server
�regionserver:60020]�i~p�<PBUF
hadoop1�����.
...
[zk: localhost:2181(CONNECTED) 6] ls /hbase/namespace
[default, t1, hbase]
[zk: localhost:2181(CONNECTED) 7] get /hbase/namespace/default
�master:60000J3�g�e
default
...
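The child znode listed under /hbase/rs above follows HBase's `host,port,startcode` server-name format, where the start code is the regionserver's start time in milliseconds since the epoch. A minimal sketch for decoding it, assuming that format (the `parse_server_name` helper is a hypothetical name, not part of any HBase tool):

```python
from datetime import datetime, timezone


def parse_server_name(znode):
    """Split 'host,port,startcode' into host, port and start time (UTC)."""
    host, port, startcode = znode.split(",")
    # The start code is in milliseconds; datetime expects seconds.
    started = datetime.fromtimestamp(int(startcode) / 1000, tz=timezone.utc)
    return host, int(port), started


host, port, started = parse_server_name("hadoop1,60020,1605056703882")
print(host, port, started.isoformat())
```

Comparing the decoded start time with the current time is a quick way to notice that a regionserver has restarted recently, which is relevant to the offline problems described in section 1.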
3. RegionServer memory adjustment
Approach: adjust the memory of the HBase RegionServers managed by CDH
- The RegionServer is what holds the data, so increase its Java heap (4G); the Master node's heap can be smaller (1G)
- RegionServer memory is consumed mainly by two caches. Both settings are fractions of the heap, and their sum cannot exceed 0.8, otherwise an error is reported:
hfile.block.cache.size (read cache for HFile blocks loaded from disk), default 0.4
hbase.regionserver.global.memstore.size (write cache for data written to HBase), default 0.4
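To make the 0.8 limit concrete, here is a minimal sketch that checks the two fractions and converts them into sizes for a given heap. It only illustrates the rule stated above and is not HBase's own validation code; the function names are hypothetical.

```python
def check_heap_fractions(block_cache=0.4, memstore=0.4, max_total=0.8):
    """Raise ValueError if the two cache fractions together exceed max_total."""
    # Compare in whole percent to avoid floating-point rounding surprises.
    total_pct = round(block_cache * 100) + round(memstore * 100)
    if total_pct > round(max_total * 100):
        raise ValueError(
            f"hfile.block.cache.size + hbase.regionserver.global.memstore.size "
            f"= {total_pct}% of heap, limit is {round(max_total * 100)}%"
        )
    return total_pct / 100


def cache_sizes_mb(heap_mb, block_cache=0.4, memstore=0.4):
    """Return the memory each cache may use for a heap of heap_mb megabytes."""
    check_heap_fractions(block_cache, memstore)
    return {
        "block_cache_mb": heap_mb * block_cache,
        "memstore_mb": heap_mb * memstore,
    }


# With the 4G heap and default fractions mentioned above.
print(cache_sizes_mb(4096))
```

With the defaults, each cache gets 40% of a 4G heap, which leaves 20% for everything else the RegionServer does.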
MemStore flush trigger conditions:
- hbase.hregion.memstore.flush.size, default 128M: checked every 10s; a memstore is flushed when its size exceeds this value
- hbase.regionserver.global.memstore.size: memstores are flushed when the total size of all memstores managed by the RegionServer exceeds this fraction of the heap
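The two trigger conditions can be expressed as a small check. This is a simplified sketch of the rules as listed above, not the RegionServer's actual flush logic; `flush_reasons` and the region names in the usage lines are hypothetical.

```python
MB = 1024 * 1024


def flush_reasons(memstore_bytes, heap_bytes,
                  flush_size=128 * MB, global_fraction=0.4):
    """Return which of the two flush conditions currently hold.

    memstore_bytes maps a region name to its current memstore size in bytes.
    """
    reasons = []
    # Condition 1: a single memstore is larger than the per-region flush size.
    for region, size in sorted(memstore_bytes.items()):
        if size > flush_size:
            reasons.append(
                f"{region}: exceeds hbase.hregion.memstore.flush.size")
    # Condition 2: all memstores together exceed the global share of the heap.
    if sum(memstore_bytes.values()) > heap_bytes * global_fraction:
        reasons.append(
            "regionserver: exceeds hbase.regionserver.global.memstore.size")
    return reasons


heap = 4096 * MB
# One hot region crosses the 128M per-region limit.
print(flush_reasons({"r1": 130 * MB, "r2": 10 * MB}, heap))
# Twenty regions at 100M each stay under 128M individually but exceed 40% of 4G.
print(flush_reasons({f"r{i}": 100 * MB for i in range(20)}, heap))
```

The second case matches the first problem in section 1: with many regions receiving writes at once, the global limit can be hit even though no single memstore has reached its own flush size.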