Chapter 4 - HBase distributed database

HBase Overview

Google BigTable is a distributed storage system that uses the MapReduce distributed parallel computing model to process massive amounts of data and the Google GFS distributed file system as its underlying storage. It can scale to petabytes of data across thousands of machines and is noted for its wide applicability, scalability, high performance, and high availability.

HBase is a highly reliable, high-performance, column-oriented, scalable distributed database; it is an open-source implementation of Google BigTable and is mainly used to store massive amounts of unstructured and semi-structured data. HBase is designed to handle very large tables: by scaling out horizontally on clusters of inexpensive machines, it can serve tables with billions of rows and millions of columns.

Within the Hadoop ecosystem, HBase relies on several other components:

  • MapReduce is used to perform high-performance computation over the massive data stored in HBase
  • HDFS provides highly reliable underlying storage, giving inexpensive clusters the capacity to store massive data
  • Zookeeper provides coordination services, supporting stable operation and failure recovery
  • Sqoop provides an efficient and convenient way to import data from an RDBMS into HBase
  • Pig and Hive provide high-level language support

Differences between HBase and traditional relational databases

  • Data types: a traditional relational database uses the relational model and offers rich data types and storage mechanisms; HBase uses a much simpler data model in which data is stored as uninterpreted strings, and applications must write their own code to parse them.
  • Data manipulation: a relational database offers a rich set of operations, including complex multi-table joins; in HBase there are no complex relationships between tables, only simple operations such as insert, query, delete and truncate, and join queries across tables cannot be performed.
  • Storage model: a relational database stores data row by row; HBase stores data by column, keeping different column families in separate files, which reduces I/O overhead and supports large numbers of concurrent user queries.
  • Data indexing: a relational database can generally build multiple complex indexes on different columns to improve access performance; HBase has only one index, the row key, and every access either goes through the row key or scans over row keys.
  • Data maintenance: in a relational database, an update replaces the old value with the latest value and the old value no longer exists; in HBase, an update does not delete the old version of the data but generates a new version, and the old version is retained.
  • Scalability: relational databases are difficult to scale out horizontally, and the room for vertical scaling is limited; HBase scales flexibly by adding or removing machines in the cluster.

HBase also has its own limitations; for example, it does not support transactions and therefore cannot guarantee atomicity across rows (atomicity is guaranteed only within a single row).
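As an illustration of the single-row atomicity that HBase does guarantee, the hedged sketch below writes several columns of one row in a single Put, which HBase applies atomically; writes to different rows are independent operations with no cross-row transaction. The table name students, the Info column family, and the values are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class SingleRowAtomicity {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("students"))) {   // hypothetical table

            // All columns below belong to the SAME row ("201505003"), so HBase
            // applies them as one atomic unit.
            Put put = new Put(Bytes.toBytes("201505003"));
            put.addColumn(Bytes.toBytes("Info"), Bytes.toBytes("name"), Bytes.toBytes("Zhang San"));
            put.addColumn(Bytes.toBytes("Info"), Bytes.toBytes("email"), Bytes.toBytes("s3@example.com"));
            table.put(put);

            // A write to a DIFFERENT row is a separate operation; HBase offers
            // no cross-row transaction, so the two puts succeed or fail
            // independently of each other.
            table.put(new Put(Bytes.toBytes("201505004"))
                    .addColumn(Bytes.toBytes("Info"), Bytes.toBytes("name"), Bytes.toBytes("Li Si")));
        }
    }
}
```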

HBase data model

Conceptual data model

HBase is a sparse, multi-dimensional, sorted mapping table, indexed by row key, column family, column qualifier, and timestamp. Each value in the table is an uninterpreted string with no data type. Users store data in tables; each row has a sortable row key and an arbitrary number of columns.

In the horizontal direction a table is composed of one or more column families; a column family can contain any number of columns, and data in the same column family is stored together. Column families can be extended dynamically: new rows or column families can be added easily without pre-defining the number or type of columns. All columns are stored as strings, and users must perform their own data type conversion. When HBase performs an update, it does not delete the old version of the data; it generates a new version, and the old version is retained.

HBase data model concepts

  • Table: HBase organizes data into tables; a table consists of rows and columns, and the columns are divided into several column families.
  • Row: each HBase table consists of a number of rows, and each row is identified by its row key.
  • Column family: the columns of an HBase table are grouped into collections called column families (Column Family), which are the basic unit of access control.
  • Column qualifier: data within a column family is located by column qualifiers (columns).
  • Cell: in an HBase table, a row key, a column family and a column qualifier together determine a cell (Cell). The data stored in a cell has no data type and is always treated as a byte array byte[]. Each cell can hold multiple versions of a value, each corresponding to a different timestamp.
  • Timestamp: each cell holds multiple versions of the same data, indexed by timestamp. The versions of a cell are stored in descending timestamp order, so the newest version is read first.
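The sketch below is a hedged illustration of these concepts in the Java client API: it writes two timestamped versions of the same cell and reads them back newest-first. The table name students is hypothetical, and the example assumes the Info column family was created to retain more than one version (VERSIONS >= 2); readVersions is the HBase 2.x call, older clients use setMaxVersions.

```java
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class CellVersions {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("students"))) {   // hypothetical table

            byte[] row = Bytes.toBytes("201505003");
            byte[] fam = Bytes.toBytes("Info");
            byte[] col = Bytes.toBytes("email");

            // Two writes to the same [row, family, qualifier] with different
            // timestamps create two versions of the same cell.
            table.put(new Put(row).addColumn(fam, col, 1174184619081L, Bytes.toBytes("old@example.com")));
            table.put(new Put(row).addColumn(fam, col, 1174184620720L, Bytes.toBytes("new@example.com")));

            Get get = new Get(row);
            get.addColumn(fam, col);
            get.readVersions(2);                 // ask for up to two versions (HBase 2.x API)
            Result result = table.get(get);

            // Versions come back in descending timestamp order, newest first.
            for (Cell cell : result.getColumnCells(fam, col)) {
                System.out.println(cell.getTimestamp() + " -> " + Bytes.toString(CellUtil.cloneValue(cell)));
            }
        }
    }
}
```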


To locate data in an HBase table, the row key, the column family, the column qualifier, and the timestamp must all be specified. A cell can therefore be regarded as being addressed by a "four-dimensional coordinate", i.e. [row key, column family, column qualifier, timestamp].

For example, the following two key/value pairs differ only in their timestamps; they are two versions of the same cell:

["201505003", "Info", "email", 1174184619081]  ->  [email protected]
["201505003", "Info", "email", 1174184620720]  ->  [email protected]
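Following the same example, the hedged sketch below retrieves exactly one of those versions by supplying all four coordinates; setTimeRange narrows the read to the single timestamp of interest. The table and family names are again hypothetical.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class FourDimensionalLookup {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("students"))) {   // hypothetical table

            long ts = 1174184619081L;                       // timestamp of the wanted version

            // Four coordinates: row key, column family, column qualifier, timestamp.
            Get get = new Get(Bytes.toBytes("201505003"));
            get.addColumn(Bytes.toBytes("Info"), Bytes.toBytes("email"));
            get.setTimeRange(ts, ts + 1);                   // [ts, ts+1) selects exactly that version

            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("Info"), Bytes.toBytes("email"));
            System.out.println(value == null ? "no such version" : Bytes.toString(value));
        }
    }
}
```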

Conceptual view

In the conceptual view, an HBase table can be seen as a sparse, multi-dimensional mapping. Every row has the same set of column families, although a row need not store data in every column family.

Consider a fragment of an HBase table used to store web pages.

  • The row key is a reversed URL (e.g. com.cnn.www). HBase stores rows in lexicographic order of the row key, so reversing the URL keeps content from the same site adjacent in storage; when the table is partitioned horizontally by row-key ranges, data from the same site tends to fall into the same partition (Region). A hedged scan sketch illustrating this follows the list.
  • The contents column family stores the content of the web pages.
  • The anchor column family stores the anchor text of any links that reference this page.
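Here is a hedged sketch of how the reversed-URL row key pays off in practice: because rows are sorted lexicographically, all pages of one site share a row-key prefix and can be fetched with a single contiguous scan. The table name webtable is hypothetical.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class ReversedUrlScan {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("webtable"))) {   // hypothetical table

            // Reversed URLs from cnn.com all start with "com.cnn.", so they sit
            // next to each other in row-key order and one scan covers the site.
            Scan scan = new Scan();
            scan.setFilter(new PrefixFilter(Bytes.toBytes("com.cnn.")));
            scan.addFamily(Bytes.toBytes("contents"));

            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
            }
        }
    }
}
```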

Physical View

At the conceptual level, each HBase table is composed of many rows; at the physical storage level, however, data is stored by column family, which is an important distinction between HBase and traditional relational databases.

When the conceptual view above is stored physically, the table is stored separately by its two column families, contents and anchor: data belonging to the same column family is kept together. Each stored value also carries its timestamp and its row key.

In the conceptual view, some columns are empty, i.e. no values exist for them. In the physical view, these empty columns are not stored as null; they are simply not stored at all. When such an empty cell is requested, a null result is returned.
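A small hedged sketch of this sparseness, reusing the hypothetical webtable: asking for a column this row never wrote simply returns null, because nothing was ever stored for it.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class SparseCellRead {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("webtable"))) {   // hypothetical table

            Result result = table.get(new Get(Bytes.toBytes("com.cnn.www")));

            // An empty column occupies no storage at all; the read just comes
            // back as null rather than returning a stored null marker.
            byte[] missing = result.getValue(Bytes.toBytes("anchor"), Bytes.toBytes("no-such-link"));
            System.out.println(missing == null ? "cell not stored" : Bytes.toString(missing));
        }
    }
}
```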

Column-oriented storage

Row-oriented databases use the NSM (N-ary Storage Model): the attributes of a tuple (row) are stored contiguously in a disk page. When data is read from disk, each entire tuple must be scanned sequentially, and the attributes needed by the query are then filtered out of each tuple. If only a few attribute values of each tuple are useful to the query, NSM wastes a great deal of disk space and memory bandwidth.

Column-oriented databases use the DSM (Decomposition Storage Model), whose goal is to minimize unnecessary I/O. DSM decomposes a relation vertically, assigning each attribute its own sub-relation; each sub-relation is stored separately and is accessed only when its corresponding attribute is requested. The drawback of DSM is that expensive tuple reconstruction is needed when performing join operations.

Row-oriented databases are mainly suited to processing small batches of data, such as online transaction processing; the familiar relational databases such as Oracle and MySQL are all row-oriented.

Column-oriented databases are mainly suited to batch data processing and ad-hoc queries: they reduce I/O overhead, support large numbers of concurrent user queries, and can process data as much as 100 times faster than traditional approaches. Columnar databases are mainly used in query-intensive systems such as data mining, decision support, and geographic information systems.
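In HBase terms, this benefit shows up when a scan is restricted to a single column family: only that family's store files need to be read. The hedged sketch below assumes the hypothetical webtable with contents and anchor families.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class FamilyRestrictedScan {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("webtable"))) {   // hypothetical table

            // Only the "anchor" column family is requested, so the files of the
            // "contents" family never need to be touched for this query.
            Scan scan = new Scan();
            scan.addFamily(Bytes.toBytes("anchor"));

            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    r.getFamilyMap(Bytes.toBytes("anchor")).forEach((qualifier, value) ->
                            System.out.println(Bytes.toString(qualifier) + " = " + Bytes.toString(value)));
                }
            }
        }
    }
}
```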

HBase principles

HBase functional components

The HBase implementation consists of three major functional components:

  • A library of functions that is linked into every client
  • One Master server
  • Many Region servers

Region servers are responsible for storing and maintaining the Regions assigned to them and for handling read and write requests from clients.

The Master server is responsible for managing and maintaining the partition information of HBase tables, maintaining the list of Region servers, detecting failed Region servers in the cluster, distributing Regions across Region servers for load balancing, and handling schema changes such as the creation of tables and column families.

Clients do not read data through the Master; after obtaining the location information of a Region, they read data directly from the corresponding Region server.

Tables and Regions

For each HBase table, the rows are maintained in lexicographic order of their row keys. For distributed storage, a table therefore needs to be partitioned by row-key ranges; each partition forms a contiguous interval of rows and is called a Region. A Region contains all the data within a certain row-key range; it is the basic unit of load balancing and data distribution, and Regions are distributed across different Region servers.

Initially each table contains only one Region. As data is continuously inserted, the Region keeps growing; when the number of rows in a Region reaches a threshold, it is automatically split into two new Regions. As the number of rows in the table continues to grow, more and more Regions are created by splitting.

The Master server assigns different Regions to different Region servers, but a single Region is never split across multiple servers. The optimal size of each Region depends on what a single server can handle effectively; 1 GB to 2 GB per Region is generally recommended. Each Region server is responsible for managing a set of Regions, usually 10 to 1000 of them.
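The recommended Region size can be enforced per table. The hedged sketch below uses the Admin API to cap one hypothetical table's Regions at roughly 2 GB, the upper end of the guideline above; the cluster-wide default comes from the hbase.hregion.max.filesize setting.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;

public class RegionSplitThreshold {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {

            TableName tn = TableName.valueOf("webtable");          // hypothetical table

            // Override the region split threshold for this table only; Regions
            // whose store files grow beyond ~2 GB become candidates for a split.
            TableDescriptor updated = TableDescriptorBuilder
                    .newBuilder(admin.getDescriptor(tn))
                    .setMaxFileSize(2L * 1024 * 1024 * 1024)
                    .build();
            admin.modifyTable(updated);
        }
    }
}
```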

Region location

Each Region has a RegionID that identifies it uniquely. To locate where each Region resides, a mapping table can be built. Each row of the mapping table contains two items, a Region identifier and a Region server identifier, and thus records the correspondence between a Region and the Region server holding it. This mapping table is known as the metadata table, or the .META. table. For faster access, all Regions of the .META. table are kept in memory.

When HBase tables contain a very large number of Regions, the .META. table itself is split into multiple Regions. To locate these Regions, a new mapping table is constructed that records the locations of all the metadata; it is called the root data table, or the -ROOT- table. The -ROOT- table cannot be split: it always consists of exactly one Region, whose name is hard-coded, so the Master always knows its location.

Before a client can access user data, it first accesses Zookeeper to obtain the location of the -ROOT- table, then reads the -ROOT- table to find the relevant .META. Region, then reads the .META. table to find which Region server holds the Region containing the data, and finally goes to that Region server to read the data.

To speed up addressing, the location information that has been looked up is usually cached on the client. Subsequent accesses to the same data can then obtain the Region location directly from the client cache, without repeating the full "three-tier addressing" process.

In summary, HBase uses a three-tier structure, similar to a B+ tree, to store the location information of Regions:

  • Level 1, the Zookeeper file: records the location of the -ROOT- table.
  • Level 2, the -ROOT- table: records the location of the .META. table's Region; the -ROOT- table has only one Region, and through it the .META. table can be reached.
  • Level 3, the .META. table: records the location of every user-data Region; the .META. table may consist of several Regions and holds the location information of all user data tables in HBase.
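In the current Java client this lookup (and its caching) is hidden inside the connection, but it can be observed. The hedged sketch below, again using the hypothetical webtable, asks the client which Region and which Region server currently hold a given row.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class WhereIsMyRow {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             RegionLocator locator = conn.getRegionLocator(TableName.valueOf("webtable"))) {  // hypothetical

            // The client resolves (and caches) the Region location for this row
            // key; the Master is never contacted for ordinary reads and writes.
            HRegionLocation loc = locator.getRegionLocation(Bytes.toBytes("com.cnn.www"));
            System.out.println("Region: " + loc.getRegion().getRegionNameAsString());
            System.out.println("Server: " + loc.getServerName());
        }
    }
}
```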

HBase operating mechanism

HBase System Architecture

The HBase system architecture includes clients, a Zookeeper server, a Master server, and Region servers. HBase generally uses HDFS as its underlying data store.

Client: includes interfaces for accessing HBase, and caches the location information of Regions it has already visited to speed up subsequent data access. HBase clients communicate with the Master server and the Region servers via RPC.

Zookeeper: stores the address of the -ROOT- table and the address of the Master. Region servers register themselves with Zookeeper, so that the Master can always perceive the health status of every Region server. Zookeeper also helps the cluster elect one Master as its coordinator and ensures that only one Master is running at any time, thereby avoiding a Master "single point of failure".
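From the client's point of view, Zookeeper is also the only address it has to know in advance. A minimal, hedged sketch, assuming placeholder Zookeeper host names: the client bootstraps through the quorum and discovers the Master and Region servers from there.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ZookeeperBootstrap {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Placeholder hosts; the client needs only the Zookeeper quorum and
        // learns the metadata and Master locations from it.
        conf.set("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com,zk3.example.com");
        conf.set("hbase.zookeeper.property.clientPort", "2181");

        try (Connection conn = ConnectionFactory.createConnection(conf)) {
            System.out.println("connected: " + !conn.isClosed());
        }
    }
}
```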

Master server: the Master is mainly responsible for managing tables and Regions (a hedged Admin API sketch follows this list):

  • managing user operations that create, delete or modify tables
  • balancing the load across different Region servers
  • re-adjusting the distribution of Regions after a Region split or merge
  • migrating the Regions of a failed Region server to other servers
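The table-management work listed above is driven through the Admin API, whose requests are served by the Master. A hedged sketch, with a hypothetical table name and column families:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;

public class TableAdmin {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {

            TableName tn = TableName.valueOf("webtable");   // hypothetical table

            // Creating a table is handled by the Master, which also assigns the
            // new table's Region(s) to Region servers.
            admin.createTable(TableDescriptorBuilder.newBuilder(tn)
                    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("contents"))
                    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("anchor"))
                    .build());

            // Schema changes and deletions also go through the Master; a table
            // must be disabled before it can be dropped.
            admin.disableTable(tn);
            admin.deleteTable(tn);
        }
    }
}
```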

Region server: the core module of HBase, responsible for maintaining the Regions assigned to it and for serving users' read and write requests.

How Region servers work

A Region server internally manages a set of Region objects and a shared HLog file. The HLog is a log file on disk that records all update operations. Each Region object in turn consists of several Stores, and each Store contains one MemStore and several StoreFiles. The MemStore is an in-memory cache that holds the most recently updated data; StoreFiles are files on disk.

These files use a B-tree structure for fast lookup. A StoreFile is implemented at the bottom layer as an HFile in the HDFS file system; HFile data blocks are usually stored compressed, which greatly reduces network I/O and disk I/O.

The Store is the core of a Region server; each Store corresponds to the storage of one column family of a table. When a user writes data, the system first puts the data into the MemStore cache; when the cache fills up, it is flushed to a StoreFile on disk. Once the number of StoreFiles reaches a certain threshold, a compaction is triggered and several StoreFiles are merged into one large StoreFile, which keeps growing. When the size of a StoreFile exceeds a certain threshold, a split is triggered: the current parent Region is split into two child Regions, the parent Region goes offline, and the two new child Regions are assigned by the Master to appropriate Region servers.
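The thresholds mentioned above are configuration settings rather than fixed constants. The hedged sketch below merely prints the values the loaded configuration carries for three of them; they are normally set server-side in hbase-site.xml, so changing them on a client has no effect.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class StoreTuningKnobs {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // MemStore size at which a flush to a new StoreFile is triggered.
        System.out.println("hbase.hregion.memstore.flush.size = " + conf.get("hbase.hregion.memstore.flush.size"));
        // Number of StoreFiles in a Store that triggers a compaction.
        System.out.println("hbase.hstore.compactionThreshold  = " + conf.get("hbase.hstore.compactionThreshold"));
        // StoreFile size beyond which a Region becomes a split candidate.
        System.out.println("hbase.hregion.max.filesize        = " + conf.get("hbase.hregion.max.filesize"));
    }
}
```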

HBase uses the HLog to ensure that the system can be restored to a consistent state after a failure. Each Region server in an HBase system is configured with one HLog file, which is a write-ahead log (Write Ahead Log): a user's updates must first be written to the log before they can be written to the MemStore cache, and the contents of the MemStore are flushed to disk only after the corresponding log entries have been written to disk.

Zookeeper monitors the status of every Region server in real time; when a Region server fails, Zookeeper notifies the Master. The Master first processes the HLog file left behind on the failed server; this HLog contains log records belonging to several Region objects. The system splits the HLog data according to the Region each log record belongs to and places it into that Region's directory; the failed Regions are then reassigned to available Region servers, and the HLog records associated with each Region are also shipped to the corresponding Region server.

After a Region server receives the Regions assigned to it together with the related HLog records, it replays the operations in the log, writes the data into the MemStore cache, and then flushes it to StoreFiles on disk, completing data recovery.

In an HBase system, each Region server maintains only one HLog file, which is shared by all of its Regions. This reduces the number of disk seeks and improves write performance for tables.
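The write-ahead behaviour is automatic, but the client API exposes a per-write durability hint. A hedged sketch, reusing the hypothetical webtable: SYNC_WAL keeps the crash safety described above, while SKIP_WAL trades it away for write speed.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class WalDurability {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("webtable"))) {   // hypothetical table

            // Default path: the edit is appended to the shared HLog (WAL) before
            // it reaches the MemStore, so it survives a Region server crash.
            Put safe = new Put(Bytes.toBytes("com.cnn.www"));
            safe.addColumn(Bytes.toBytes("contents"), Bytes.toBytes("html"), Bytes.toBytes("<html>...</html>"));
            safe.setDurability(Durability.SYNC_WAL);
            table.put(safe);

            // SKIP_WAL bypasses the log: faster, but edits still only in the
            // MemStore are lost if the Region server fails.
            Put risky = new Put(Bytes.toBytes("com.cnn.www2"));
            risky.addColumn(Bytes.toBytes("contents"), Bytes.toBytes("html"), Bytes.toBytes("..."));
            risky.setDurability(Durability.SKIP_WAL);
            table.put(risky);
        }
    }
}
```

Most applications leave the default durability (USE_DEFAULT) in place and simply rely on the server-side write-ahead logging described above.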
