HBase architecture: what you need to ace the interview

HBase architecture

HBase is built on Hadoop, and its storage depends on HDFS.
Main components: client, ZooKeeper, HMaster, HRegionServer, HLog, HRegion, Store, MemStore, StoreFile, HFile

client
The HBase client provides the interfaces for accessing HBase (shell, Java API)
The client keeps a cache, for example of region location information, to speed up access to HBase

ZooKeeper
Monitors the state of the master and ensures there is exactly one active master, which gives high availability
Stores the addressing entry point for all regions: the location (server) of the root table
Monitors the state of each HRegionServer in real time and notifies the HMaster when a region server goes online or offline
Stores the HBase table information (schema data): which tables exist and which column families each table has

HMaster (the boss of HBase)
Assigns regions to region servers (when a table is created)
Reassigns regions (when a region server fails, or when a region grows and splits in two)
Balances load across the region servers
Collects garbage files on HDFS
Handles schema update requests

HRegionServer (the younger brother of HBase)
Maintains (manages) the regions that the master assigns to it
Processes client IO requests for these regions
Interacts with HDFS

HRegionServer:
Manages its regions
Processes client IO requests
Splits regions that grow too large during operation

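HLog
Records the operations applied to HBase. Writes use a WAL (write-ahead log): data is written to the HLog first and then to the MemStore.
This prevents data loss: if a region server fails, the log can be replayed to recover the edits that were only in memory.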
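
The write order described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not the HBase client API: the class TinyRegionServer and its methods are hypothetical names, and a Python list stands in for the durable log on HDFS.

```python
class TinyRegionServer:
    """Toy model: one WAL (HLog) plus one in-memory MemStore."""

    def __init__(self):
        self.wal = []        # stands in for the durable HLog on HDFS
        self.memstore = {}   # volatile; lost if the process crashes

    def put(self, row, column, value):
        # 1. Append to the log first, so the edit survives a crash.
        self.wal.append((row, column, value))
        # 2. Only then apply it to the in-memory buffer.
        self.memstore[(row, column)] = value

    def crash(self):
        # Simulate losing everything held in memory.
        self.memstore = {}

    def recover(self):
        # Replay the log in order to rebuild the MemStore.
        for row, column, value in self.wal:
            self.memstore[(row, column)] = value


rs = TinyRegionServer()
rs.put("row1", "cf:name", "alice")
rs.put("row1", "cf:name", "bob")
rs.crash()
rs.recover()
print(rs.memstore[("row1", "cf:name")])  # bob
```

Because the log is replayed in the order it was written, the later edit wins, which is why the recovered value is the second one.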

HRegion
The smallest unit of distributed storage and load balancing in HBase. A region holds a whole table or a part of a table.

Store
Corresponds to one column family

MemStore
In-memory write buffer. Data accumulates here and is flushed to HDFS in batches (the default flush threshold is 128 MB)

StoreFile
HBase data is stored on HDFS in the form of HFiles
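
How a MemStore relates to the store files can be sketched as follows. This is a toy model under stated assumptions, not HBase code: TinyStore is a hypothetical class, sorted Python lists stand in for HFiles, and the threshold is set to 16 bytes instead of 128 MB so the flush is easy to see.

```python
class TinyStore:
    """Toy model of one store: one MemStore plus a list of immutable store files."""

    def __init__(self, flush_threshold_bytes):
        self.flush_threshold_bytes = flush_threshold_bytes
        self.memstore = {}
        self.memstore_bytes = 0
        self.store_files = []  # each entry stands in for one HFile on HDFS

    def put(self, key, value):
        self.memstore[key] = value
        self.memstore_bytes += len(key) + len(value)
        if self.memstore_bytes >= self.flush_threshold_bytes:
            self.flush()

    def flush(self):
        # Write the buffer out as one sorted, immutable file, then clear it.
        self.store_files.append(sorted(self.memstore.items()))
        self.memstore = {}
        self.memstore_bytes = 0

    def get(self, key):
        # Newest data first: MemStore, then store files from newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for store_file in reversed(self.store_files):
            for k, v in store_file:
                if k == key:
                    return v
        return None


store = TinyStore(flush_threshold_bytes=16)  # tiny threshold for the demo
store.put(b"row1", b"aaaa")   # 8 bytes buffered
store.put(b"row2", b"bbbb")   # 16 bytes buffered, which triggers a flush
store.put(b"row1", b"cccc")   # newer value, still in the MemStore
print(len(store.store_files), store.get(b"row1"), store.get(b"row2"))
```

Each flush adds one more file, which is why a store has exactly one MemStore but may have many store files, and why a read has to check the MemStore before the files.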

master : regionserver = 1 : n
regionserver : log = 1 : 1
regionserver : region = 1 : n
region : store = 1 : n
store : memstore = 1 : 1
store : storefile = 1 : n
storefile : hfile = 1 : 1

Keywords
rowkey: row key
columnfamily: column family
column: column
timestamp: timestamp (the version with the latest timestamp is returned by default)
version: version number
cell: cell

Features of HBase:

Linear and modular scalability.
Strictly consistent reads and writes.
Automatic and configurable sharding of tables.
Automatic failover support between region servers.
Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
Easy-to-use Java API for client access.
Block cache and Bloom filters for real-time queries.
Query predicate push-down via server-side filters.
Extensible JRuby-based (JIRB) shell.
Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX.

Schema: schema-less
Data type: a single type; every value is stored as byte[]
Multiple versions: each value can have multiple versions
Storage: sparse. A cell whose value is null is simply not stored, so it takes up no space
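
The data model described above (row key, column, timestamp, byte values, multiple versions, sparse storage) can be pictured as nested maps. The sketch below is a toy model in plain Python, not the HBase API: TinyTable, its methods and the max_versions parameter are hypothetical names chosen for illustration.

```python
import time


class TinyTable:
    """Toy model: row key -> {"family:qualifier" -> {timestamp -> bytes}}."""

    def __init__(self):
        self.rows = {}

    def put(self, row_key, column, value, timestamp=None):
        if timestamp is None:
            timestamp = int(time.time() * 1000)
        versions = self.rows.setdefault(row_key, {}).setdefault(column, {})
        versions[timestamp] = value

    def get(self, row_key, column, max_versions=1):
        # Versions are returned newest first; by default only the latest one.
        versions = self.rows.get(row_key, {}).get(column, {})
        newest_first = sorted(versions.items(), reverse=True)
        return newest_first[:max_versions]


table = TinyTable()
table.put("user1", "info:name", b"alice", timestamp=100)
table.put("user1", "info:name", b"alicia", timestamp=200)
table.put("user2", "info:email", b"bob@example.com", timestamp=150)

print(table.get("user1", "info:name"))                  # latest version only
print(table.get("user1", "info:name", max_versions=2))  # both versions
print(table.get("user2", "info:name"))                  # never written, so nothing is stored
```

Note that user2 has no "info:name" entry at all. There is no empty placeholder for it, which is what sparse storage means here.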

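HBase addressing mechanism
1. Go to ZooKeeper to find the location of the root table
2. Go to the server that hosts the root table to find the location of the meta table
3. Go to the server that hosts the meta table to find the region server that actually manages the data
4. Send the IO request to that region server
5. The region server interacts with HDFS
6. Write to the MemStore
7. Return the result
(This is the three-level lookup of older versions. Since HBase 0.96 the root table has been removed, and ZooKeeper stores the location of the meta table directly.)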
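
The first three lookup steps, together with the client-side cache of region locations, can be sketched as below. This is a toy model under stated assumptions: the dictionaries, server names (rs1 to rs4) and the TinyClient class are all hypothetical, and real HBase clients do this through the client library rather than by hand.

```python
import bisect

# Hypothetical cluster state, for illustration only.
ZOOKEEPER = {"root_location": "rs1"}
ROOT_TABLE = {"rs1": {"meta_location": "rs2"}}
# The meta table maps each region's start row key to its hosting server,
# sorted by start key. "" means "from the beginning of the table".
META_TABLE = {"rs2": [("", "rs3"), ("m", "rs4")]}


class TinyClient:
    def __init__(self):
        self.region_cache = None  # cached region locations
        self.lookups = 0          # how many times the full lookup ran

    def _load_regions(self):
        self.lookups += 1
        root_server = ZOOKEEPER["root_location"]                # step 1
        meta_server = ROOT_TABLE[root_server]["meta_location"]  # step 2
        return META_TABLE[meta_server]                          # step 3

    def locate(self, row_key):
        # Only the first call walks ZooKeeper -> root -> meta.
        if self.region_cache is None:
            self.region_cache = self._load_regions()
        start_keys = [start for start, _ in self.region_cache]
        # Pick the last region whose start key is <= row_key.
        index = bisect.bisect_right(start_keys, row_key) - 1
        return self.region_cache[index][1]


client = TinyClient()
print(client.locate("apple"), client.locate("zebra"), client.lookups)
```

The second call is answered from the cache, so the full lookup runs only once. A real client also has to refresh its cache when a region moves, which this sketch leaves out.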

Origin blog.csdn.net/qq_42706464/article/details/108846721