HBase read and write process, RegionServer memory adjustment

1. Problems encountered

  • HBase data is written frequently, so MemStore flushes are triggered many times; regions and RegionServers (RS) go offline and writes fail
  • HBase compactions cause RegionServers to go offline and queries to fail

2. HBase read and write process

  • Storage structure of HBase data: https://blog.csdn.net/eyeofeagle/article/details/105506606
  • HBase read and write flow chart: https://sematext.com/blog/hbase-memstore-what-you-should
  • https://blog.cloudera.com/guide-to-using-apache-hbase-ports/
  • Table data read and write (the numbers appear to refer to the numbered steps in the diagram from the link above):
    1 (get the server address of the meta region through ZooKeeper)
    2 (find the RegionServer that hosts the table's region, then read or write there)
  • Managing HBase tables (MemStore flush, HFile compaction):
    1 (get the server address of the meta region through ZooKeeper)
    3 (create or modify the table through the master node)
    4 (the master node tells the RegionServers to open/close/move/split/flush/compact regions)
    5 (the master node stores its data in ZooKeeper: the server address of the meta region, the log-splitting tasks it creates, etc.)
    6, 7, 8 (the RegionServers read the data in ZooKeeper, perform tasks such as log splitting, and report back to the master node)
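
The two read/write steps above (ask ZooKeeper for the meta region's server, then use the meta information to find the RegionServer for a row) can be pictured with a toy model. This is only a sketch in plain Python, not the HBase client: the dictionaries `ZK` and `META`, the table name `t1:orders`, the host `hadoop2` and the function `locate_region_server` are all made up for illustration.

```python
from bisect import bisect_right

# Hypothetical stand-in for ZooKeeper: znode path -> server hosting the meta region.
ZK = {"/hbase/meta-region-server": "hadoop1:60020"}

# Hypothetical stand-in for the meta table: for each table, a list of
# (region start key, RegionServer address) sorted by start key.
META = {
    "t1:orders": [("", "hadoop1:60020"), ("m", "hadoop2:60020")],
}


def locate_region_server(table, row_key):
    """Return (server hosting meta, RegionServer hosting row_key)."""
    # Step 1: ZooKeeper tells the client where the meta region lives.
    meta_server = ZK["/hbase/meta-region-server"]
    # Step 2: pick the region with the greatest start key <= row_key.
    regions = META[table]
    start_keys = [start for start, _ in regions]
    idx = bisect_right(start_keys, row_key) - 1
    return meta_server, regions[idx][1]


print(locate_region_server("t1:orders", "apple"))
print(locate_region_server("t1:orders", "zebra"))
```

In this toy model a row key sorts into the region whose start key is the closest one at or before it, which is why "apple" and "zebra" land on different servers.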

Connect with the zookeeper-client command line to view the data HBase keeps in ZooKeeper:

[zk: localhost:2181(CONNECTED) 2] ls /hbase
[ meta-region-server, rs,  master, namespace, hbaseid, table .... ]

[zk: localhost:2181(CONNECTED) 3] get /hbase/master
�master:60000-��3�gNNPBUF

hadoop1�����.��
...

[zk: localhost:2181(CONNECTED) 4] ls /hbase/rs
[hadoop1,60020,1605056703882]


[zk: localhost:2181(CONNECTED) 5] get /hbase/meta-region-server
�regionserver:60020]�i~p�<PBUF

hadoop1�����.
...


[zk: localhost:2181(CONNECTED) 6] ls /hbase/namespace
[default, t1, hbase]


[zk: localhost:2181(CONNECTED) 7] get /hbase/namespace/default
�master:60000J3�g�e

default
...
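
The child names under /hbase/rs, such as `hadoop1,60020,1605056703882` above, follow the pattern host,port,startcode, where the start code is the RegionServer's start time in milliseconds since the epoch. A minimal sketch for splitting such a name follows; the helper name `parse_server_name` is an assumption for illustration, not an HBase API.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_server_name(znode_name):
    """Split a /hbase/rs child name into host, port and start time (UTC)."""
    host, port, start_code = znode_name.split(",")
    # The start code is interpreted as milliseconds since the Unix epoch.
    started = EPOCH + timedelta(milliseconds=int(start_code))
    return host, int(port), started


host, port, started = parse_server_name("hadoop1,60020,1605056703882")
print(host, port, started.isoformat())
```

A start time that is much more recent than expected is a quick hint that the RegionServer has restarted.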

3. RegionServer memory adjustment

Idea: adjust the memory of the HBase RegionServers managed by CDH

  • The RegionServers are what hold the data, so increase their Java heap (4G); the master node's heap can be smaller (1G)
  • RegionServer memory is mainly consumed by two caches (both settings are fractions of the heap; their sum cannot be greater than 0.8, otherwise an error is reported):
    hfile.block.cache.size (block cache for HFile data read from disk), default 0.4
    hbase.regionserver.global.memstore.size (MemStore, which buffers data being written to HBase), default 0.4
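
To make the arithmetic concrete, here is a small sketch that turns the two fractions into bytes for a given heap and rejects combinations whose sum exceeds 0.8. It only mirrors the rule described above; the function name `cache_budget` and the returned keys are assumptions for illustration, and HBase's own startup check is not reproduced here.

```python
GIB = 1024 ** 3


def cache_budget(heap_bytes, block_cache_fraction=0.4, memstore_fraction=0.4):
    """Split the heap into block cache, memstore and remaining bytes."""
    total = block_cache_fraction + memstore_fraction
    # Small tolerance so that 0.4 + 0.4 is not rejected by float rounding.
    if total > 0.8 + 1e-9:
        raise ValueError(
            f"hfile.block.cache.size + hbase.regionserver.global.memstore.size "
            f"= {total:.2f}, which is greater than 0.8"
        )
    block_cache = int(heap_bytes * block_cache_fraction)
    memstore = int(heap_bytes * memstore_fraction)
    return {
        "block_cache_bytes": block_cache,
        "memstore_bytes": memstore,
        "remaining_bytes": heap_bytes - block_cache - memstore,
    }


budget = cache_budget(4 * GIB)
for name, value in budget.items():
    print(f"{name}: {value / GIB:.2f} GiB")
```

With a 4G heap and the defaults, each cache gets about 1.6G and roughly 0.8G is left for everything else, which is why raising one fraction usually means lowering the other.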

MemStore flush trigger conditions:

  • hbase.hregion.memstore.flush.size, default 128M: checked every 10s; a memstore is flushed when its size exceeds this value
  • hbase.regionserver.global.memstore.size: flushes are triggered when the total size of all memstores managed by the RegionServer exceeds this fraction of the heap
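
The two trigger conditions can be written down as a simplified check. This is a sketch of the rules as stated above, not HBase's actual flush logic (which has further thresholds); the function `flush_triggers` and the region names are hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def flush_triggers(memstore_sizes, heap_bytes,
                   flush_size=128 * MIB, global_fraction=0.4):
    """Return (regions over the per-region limit, whether the global limit is exceeded).

    memstore_sizes maps a region name to its current memstore size in bytes.
    """
    # Condition 1: a single memstore exceeds hbase.hregion.memstore.flush.size.
    over_region_limit = [
        region for region, size in memstore_sizes.items() if size > flush_size
    ]
    # Condition 2: all memstores together exceed
    # hbase.regionserver.global.memstore.size (a fraction of the heap).
    over_global_limit = sum(memstore_sizes.values()) > heap_bytes * global_fraction
    return over_region_limit, over_global_limit


sizes = {"region-a": 200 * MIB, "region-b": 64 * MIB}
print(flush_triggers(sizes, heap_bytes=4 * GIB))
```

With many regions on a small heap, the global condition can fire even when no single memstore has reached 128M, which matches the frequent flushes described in section 1.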

Origin blog.csdn.net/eyeofeagle/article/details/109626508