HBase of Internet of Things Architecture

1. HBase overview

    HBase is a column-oriented distributed database built on HDFS. Because HDFS is designed for streaming data access, it is not suited to real-time random reads and writes on its own. If you need real-time random access to very large data sets, HBase is the better choice.
    HBase is a distributed database based on Hadoop that uses HDFS as its file storage system. It is a NoSQL database that stores data by column in the form of key-value pairs.
Overall, NoSQL databases can be divided into the following four categories:
    Column-oriented stores
    Document stores
    Key-value stores
    Graph databases
    HBase essentially supports only insert operations: updates and deletes are both carried out as inserts, a consequence of the write-once, read-many streaming access model of the underlying HDFS. An update inserts a new row with a new timestamp, and a delete inserts a new row carrying a delete marker. Every insert is timestamped and creates a new version, and HBase keeps a configurable number of versions. If a timestamp is supplied at query time, the version closest to that time is returned; otherwise, the most recent version is returned.
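
The insert-only behavior described above can be illustrated with a small in-memory model. This is a minimal sketch, not HBase's actual implementation or client API: the class `VersionedStore`, the `DELETE` marker, and the `max_versions` parameter are hypothetical names chosen for illustration, and "closest version" is interpreted here as the newest version at or before the requested timestamp, which is an assumption.

```python
from collections import defaultdict

DELETE = object()  # hypothetical delete marker (tombstone), not an HBase constant


class VersionedStore:
    """Toy append-only store: every put or delete appends a timestamped version."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # (row, column) -> list of (timestamp, value), kept newest first
        self._cells = defaultdict(list)

    def _append(self, row, column, timestamp, value):
        versions = self._cells[(row, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda v: v[0], reverse=True)
        del versions[self.max_versions:]  # retain only the newest N versions

    def put(self, row, column, value, timestamp):
        # An update is just another insert with a newer timestamp.
        self._append(row, column, timestamp, value)

    def delete(self, row, column, timestamp):
        # A delete is an insert of a marker; older versions are not rewritten.
        self._append(row, column, timestamp, DELETE)

    def get(self, row, column, timestamp=None):
        # Assumption: return the newest version at or before `timestamp`,
        # or the newest version overall when no timestamp is given.
        for ts, value in self._cells.get((row, column), []):
            if timestamp is None or ts <= timestamp:
                return None if value is DELETE else value
        return None
```

For example, after `put(..., 20.5, timestamp=1)`, `put(..., 21.0, timestamp=2)` and `delete(..., timestamp=3)` on the same cell, a plain `get` returns `None` because the delete marker is the newest version, while `get(..., timestamp=2)` still returns `21.0`.
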
2. HBase architecture
    The HBase server architecture follows the Master/Slave model. It consists of one HMaster server and multiple HRegionServer servers. All of these servers are coordinated through ZooKeeper, which also handles errors that may occur while they are running. HMaster manages all HRegionServers; each HRegionServer stores many HRegions, and each HRegion is a block of an HBase logical table. Figure 7.1 shows all members of the HBase cluster.

![Figure 7.1: members of the HBase cluster](https://img-blog.csdnimg.cn/img_convert/b426867c0e5d3bebe0b8e0db29c4a995.png#pic_center)
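
To make the division of responsibilities concrete, here is a minimal sketch of a master that assigns regions to region servers and reassigns them when a server is reported as failed. It is a toy model under stated assumptions: `ToyMaster`, `assign`, and `handle_server_failure` are hypothetical names, the least-loaded placement rule is an assumption, and failure detection (ZooKeeper's job in a real cluster) is reduced to a direct method call.

```python
class ToyMaster:
    """Toy stand-in for HMaster: tracks which server hosts which region."""

    def __init__(self, servers):
        # server name -> list of region names hosted on that server
        self.assignments = {name: [] for name in servers}

    def assign(self, region):
        if not self.assignments:
            raise RuntimeError("no region servers available")
        # Assumed policy: pick the least-loaded server, breaking ties by name.
        target = min(self.assignments,
                     key=lambda s: (len(self.assignments[s]), s))
        self.assignments[target].append(region)
        return target

    def handle_server_failure(self, server):
        # In a real cluster the coordination service reports the failure;
        # here the caller reports it directly.
        orphaned = self.assignments.pop(server)
        return {region: self.assign(region) for region in orphaned}
```

With two servers `rs1` and `rs2` and three regions assigned in order, `rs1` ends up with two regions and `rs2` with one; reporting `rs1` as failed moves both of its regions to `rs2`.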

(1) HRegion

    HBase stores data sets in tables made up of rows and columns, much like a relational database. However, when a table grows beyond a configured size, HBase automatically divides it into separate regions. Each region is called an HRegion, which is the smallest unit of distributed storage and load balancing in an HBase cluster. In this respect, tables and HRegions are analogous to files and file blocks in HDFS. Each HRegion stores a contiguous range of a table's rows and is identified by the table name and its primary key range (start primary key to end primary key). A table starts with a single HRegion. As that HRegion grows and exceeds the configured size threshold, it is split at a row boundary into two HRegions of nearly equal size. This is called HRegion splitting, as shown in Figure 7.2.
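
The splitting behavior described above can be sketched as follows. This is an illustrative model only: `ToyTable` and `ToyRegion` are hypothetical names, the threshold is counted in rows rather than bytes for simplicity, and the split point is assumed to be the middle row key. Each region covers a key range with an inclusive start and exclusive end.

```python
import bisect


class ToyRegion:
    def __init__(self, start_key, end_key, rows=None):
        self.start_key = start_key  # inclusive; "" means start of the table
        self.end_key = end_key      # exclusive; None means end of the table
        self.rows = dict(rows or {})


class ToyTable:
    def __init__(self, name, max_rows_per_region=4):
        self.name = name
        self.max_rows = max_rows_per_region  # assumed threshold, in rows
        self.regions = [ToyRegion("", None)]  # a table starts with one region

    def locate(self, row_key):
        # Regions are kept sorted by start key, so a binary search finds
        # the region whose range contains row_key.
        starts = [r.start_key for r in self.regions]
        return bisect.bisect_right(starts, row_key) - 1

    def put(self, row_key, value):
        idx = self.locate(row_key)
        region = self.regions[idx]
        region.rows[row_key] = value
        if len(region.rows) > self.max_rows:
            self._split(idx)

    def _split(self, idx):
        region = self.regions[idx]
        keys = sorted(region.rows)
        mid_key = keys[len(keys) // 2]  # split on a row boundary near the middle
        left = ToyRegion(region.start_key, mid_key,
                         {k: region.rows[k] for k in keys if k < mid_key})
        right = ToyRegion(mid_key, region.end_key,
                          {k: region.rows[k] for k in keys if k >= mid_key})
        self.regions[idx:idx + 1] = [left, right]
```

Inserting five rows `row-01` through `row-05` with a threshold of four rows triggers one split at `row-03`: the first region then holds `row-01` and `row-02`, and the second holds `row-03` through `row-05`.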

Origin blog.csdn.net/qq_53195102/article/details/115625226