Big data course G2 - the basic structure of HBase

E-mail of the author of the article: [email protected] Address: Huizhou, Guangdong

 ▲ Purpose of this chapter

⚪ Master the basic structure of HBase;

⚪ Master the read and write process of HBase;

⚪ Master the design and optimization of HBase;

1. Basic structure

1. HRegion

1. In HBase, a table is split along the row key direction into one or more HRegions.

2. After splitting, each HRegion will be handed over to a certain HRegionServer for management.

3. A table will contain at least one HRegion, and can contain multiple HRegions.

4. In HBase, row keys are stored in sorted order and a table is split along the row key direction, so each HRegion covers its own range of row keys and the data in different HRegions never overlaps.

5. Because each HRegion is handed over to an HRegionServer for management, and the data in different HRegions does not overlap, requests are spread across different nodes instead of being concentrated on a single node.

6. Over time, the data managed by an HRegion keeps growing. When the specified conditions are met, the HRegion is split automatically.

7. Each HRegion contains one or more HStores, and the number of HStores is determined by the number of column families.

8. Each HStore contains 1 memStore and 0 or more StoreFiles/HFiles.
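
The points above can be pictured with a small in-memory model. This is only a sketch under stated assumptions, not HBase's implementation: the class names Region and Table, the half-open range convention [start, end), and the manual split method are all hypothetical, and the sketch does not move any data during a split. It shows how sorted, non-overlapping row key ranges let a row key be routed to exactly one HRegion, and how each HRegion holds one store per column family.

```python
import bisect


class Region:
    """Hypothetical model of an HRegion covering row keys in [start, end)."""

    def __init__(self, start, end, families):
        self.start = start  # inclusive; "" means "from the first possible key"
        self.end = end      # exclusive; None means "up to the last possible key"
        # One store per column family: 1 memStore plus 0 or more flushed files.
        self.stores = {f: {"memstore": {}, "hfiles": []} for f in families}


class Table:
    """Hypothetical model of a table: a sorted list of non-overlapping regions."""

    def __init__(self, families):
        self.families = list(families)
        # A new table starts with a single region covering every row key.
        self.regions = [Region("", None, self.families)]

    def locate(self, row_key):
        """Return the one region whose range contains row_key."""
        starts = [r.start for r in self.regions]
        return self.regions[bisect.bisect_right(starts, row_key) - 1]

    def split(self, split_key):
        """Replace the region containing split_key with two adjacent regions."""
        old = self.locate(split_key)
        if split_key == old.start:
            raise ValueError("split key must fall strictly inside the region")
        i = self.regions.index(old)
        left = Region(old.start, split_key, self.families)
        right = Region(split_key, old.end, self.families)
        self.regions[i:i + 1] = [left, right]


table = Table(["cf1", "cf2"])
table.split("m")  # in HBase the split is triggered automatically, not by hand
for key in ("apple", "m", "zebra"):
    region = table.locate(key)
    print(key, "->", (region.start, region.end), sorted(region.stores))
```

Because the ranges are sorted and never overlap, a lookup only needs a binary search over the region start keys, which is why the sketch uses bisect.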

2. The role of Zookeeper

1. In HBase, Zookeeper acts as a registration center.

2. When HBase starts, it will automatically register a /hbase node on Zookeeper.

3. When the Active HMaster starts, it automatically registers a temporary (ephemeral) node /hbase/master on Zookeeper. When the Active HMaster goes down, this temporary node disappears, and the earliest registered Backup HMaster is selected and switched to the Active state.

4. When a Backup HMaster starts, it automatically registers a temporary child node under the /hbase/backup-masters node on Zookeeper.

5. When HRegionServer starts, it will also automatically register child nodes under Zookeeper's /hbase/rs node.
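
The registration and failover steps above can be sketched with an in-memory stand-in for Zookeeper. Everything here is an assumption made for illustration: FakeZooKeeper, start_master, start_region_server and fail_over are hypothetical names, not the real Zookeeper or HBase API, and the failover rule simply follows the description in this section (the earliest registered Backup HMaster takes over) rather than HBase's actual code.

```python
class FakeZooKeeper:
    """In-memory stand-in for Zookeeper; NOT the real client API."""

    def __init__(self):
        # path -> owner; owner is None for persistent nodes.
        # A dict keeps insertion order, so earlier registrations come first.
        self.nodes = {}

    def create(self, path, owner=None):
        if path in self.nodes:
            raise KeyError(path + " already exists")
        self.nodes[path] = owner

    def children(self, parent):
        """Direct children of parent, in registration order."""
        prefix = parent.rstrip("/") + "/"
        names = [p[len(prefix):] for p in self.nodes if p.startswith(prefix)]
        return [n for n in names if "/" not in n]

    def expire(self, owner):
        """Simulate a process dying: its temporary nodes disappear."""
        for path in [p for p, o in self.nodes.items() if o == owner]:
            del self.nodes[path]


def start_master(zk, host):
    """The first master to register becomes Active, later ones become Backup."""
    if "/hbase/master" not in zk.nodes:
        zk.create("/hbase/master", owner="master:" + host)
        return "active"
    zk.create("/hbase/backup-masters/" + host, owner="master:" + host)
    return "backup"


def start_region_server(zk, host):
    zk.create("/hbase/rs/" + host, owner="rs:" + host)


def fail_over(zk):
    """Promote the earliest registered backup; return its host, or None."""
    backups = zk.children("/hbase/backup-masters")
    if not backups:
        return None
    host = backups[0]
    del zk.nodes["/hbase/backup-masters/" + host]
    zk.create("/hbase/master", owner="master:" + host)
    return host


zk = FakeZooKeeper()
for path in ("/hbase", "/hbase/backup-masters", "/hbase/rs"):
    zk.create(path)  # persistent parent nodes
for host in ("node1", "node2", "node3"):
    print(host, start_master(zk, host))
    start_region_server(zk, host)

zk.expire("master:node1")  # the Active HMaster goes down
print("new active:", fail_over(zk))
```

Tagging each temporary node with the process that created it is what lets the sketch remove only the failed master's node while the HRegionServer on the same host stays registered under /hbase/rs.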

3. HMaster

1. In HBase, users are allowed to start HMaster on any node where HBase is installed, and the number of HMasters is theoretically unlimited.

2. HMaster startup command:

hbase-daemon.sh start master

3. In HBase, if multiple HMasters are started, each HMaster is in one of two states: Active or Backup.

4. If multiple HMasters are started, the HMaster registered first with Zookeeper will become Active, and the HMaster registered with Zookeeper later will become Backup.

5. After the Active HMaster receives a request, it needs to consider synchronizing the data to the Backup HMasters. The more nodes it has to synchronize, the lower the efficiency.

6. Therefore, although the number of HMasters is not limited in theory, in practice it generally does not exceed 3: 1 Active HMaster + 2 Backup HMasters.

7. Active HMaster will monitor /h on Zookeeper in real time

Origin blog.csdn.net/u013955758/article/details/132032331