Summary of HBase Principles

HBase is a highly available, high-performance, column-oriented, scalable, distributed database built on Hadoop HDFS. It supports real-time reads and writes and mainly stores unstructured and semi-structured data. HBase uses Hadoop HDFS as its file storage system, Hadoop MapReduce to process the massive data it holds, and ZooKeeper as its distributed coordination service.

1 HBase Overview

1.1 HBase advantages and disadvantages

Advantages
High availability: the Write-Ahead Log (WAL) mechanism ensures that data being written is not lost when a machine fails abnormally; the replication mechanism ensures that data is not lost when a machine goes down; and because HBase stores its data on HDFS, HDFS keeps its own backup replicas as well.
High performance: the underlying LSM (Log-Structured Merge Tree) data structure and the ordered layout of row keys give HBase very high write performance; Region splitting, the primary-key index, and the caching mechanism give HBase reasonable random-read performance over massive data, with queries by row key returning within milliseconds.
Column-oriented: storage and access control are column-oriented, columns can be retrieved independently, and fast data lookup is supported.
Sparse: empty columns take up no storage space.
Large: a single table can hold billions of rows and a very large number of columns.
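Multi-version: multiple versions of the same data can be kept, distinguished by timestamp.
Single data type: all data in HBase is stored as uninterpreted byte arrays (effectively strings); there are no typed columns.
Schemaless: each row has a sortable row key and any number of columns; columns can be added dynamically as needed, and different rows in the same table can have entirely different columns.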
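Several of the properties above (log before apply, rows ordered by row key, sparse columns, multiple versions per cell, no fixed schema) can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not HBase's implementation: `ToyTable`, its methods, the `max_versions` parameter, and the logical clock are all hypothetical names invented for this example, and nothing is persisted to disk or HDFS.

```python
import bisect
import itertools


class ToyTable:
    """In-memory stand-in for one HBase-style table (illustration only)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.wal = []          # append-only log of every mutation
        self.rows = {}         # row key -> {column -> [(ts, value), ...]}
        self.sorted_keys = []  # row keys kept in byte order
        self._clock = itertools.count(1)  # logical timestamps (assumption)

    def put(self, row_key, column, value, ts=None):
        ts = next(self._clock) if ts is None else ts
        # Log first, then apply: this ordering is the point of a WAL.
        self.wal.append((row_key, column, value, ts))
        self._apply(row_key, column, value, ts)
        return ts

    def _apply(self, row_key, column, value, ts):
        if row_key not in self.rows:
            bisect.insort(self.sorted_keys, row_key)
            self.rows[row_key] = {}
        # Only columns that were actually written get an entry (sparse).
        versions = self.rows[row_key].setdefault(column, [])
        versions.append((ts, value))
        versions.sort(reverse=True)       # newest version first
        del versions[self.max_versions:]  # drop versions beyond the limit

    def get(self, row_key, column, versions=1):
        cells = self.rows.get(row_key, {}).get(column, [])
        return [value for _, value in cells[:versions]]

    def scan(self):
        """Yield (row_key, {column: latest value}) in row key order."""
        for key in self.sorted_keys:
            latest = {col: cells[0][1] for col, cells in self.rows[key].items()}
            yield key, latest

    @classmethod
    def recover(cls, wal, max_versions=3):
        """Rebuild a table by replaying a surviving log."""
        table = cls(max_versions)
        for row_key, column, value, ts in wal:
            table.put(row_key, column, value, ts)
        return table


table = ToyTable(max_versions=2)
table.put(b"user2", b"info:name", b"Bob")
table.put(b"user1", b"info:name", b"Ann")
table.put(b"user1", b"info:name", b"Anna")             # second version of the cell
table.put(b"user1", b"info:email", b"anna@example.com")  # column only user1 has
```

Here `user2` never stores an `info:email` entry at all, reading `info:name` for `user1` with `versions=2` returns both values newest first, and `ToyTable.recover(table.wal, 2)` rebuilds the same rows from the log alone.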
Disadvantages
The inherent limitation of a single row key means that multi-condition queries cannot be supported efficiently.
Query performance is poor for joins and for merging data across multiple tables.
Heavy write and delete activity during updates triggers frequent merging and splitting, which reduces storage efficiency.
It is not suitable for large-range scan queries.
SQL queries are not supported directly, support for the relational model is weak, and designing partitioning and indexing schemes is comparatively difficult.
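The first limitation can be illustrated with a toy comparison. Because rows are ordered only by row key, a condition that matches a row key prefix can be answered with a binary search plus a short walk, while a condition on any other column has to examine every row. The sketch below is a simplified model, not HBase client code: the composite key layout `city#date#order_id`, the `#` separator, and the helper names `make_key`, `prefix_scan`, and `filter_scan` are all assumptions made up for this example.

```python
import bisect


def make_key(city, date, order_id):
    """Build a hypothetical composite row key: city#date#order_id."""
    return f"{city}#{date}#{order_id}".encode()


def prefix_scan(sorted_keys, prefix):
    """Find keys that start with prefix: binary search, then walk forward."""
    i = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        matches.append(sorted_keys[i])
        i += 1
    return matches


def filter_scan(rows, column, value):
    """Find rows by a non-key column: every row has to be examined."""
    examined = 0
    matches = []
    for key in sorted(rows):
        examined += 1
        if rows[key].get(column) == value:
            matches.append(key)
    return matches, examined


rows = {
    make_key("beijing", "20190921", "001"): {b"status": b"paid"},
    make_key("beijing", "20190922", "002"): {b"status": b"open"},
    make_key("shanghai", "20190921", "003"): {b"status": b"paid"},
}
sorted_keys = sorted(rows)

by_city = prefix_scan(sorted_keys, b"beijing#")               # key prefix: cheap
paid, examined = filter_scan(rows, b"status", b"paid")        # other column: full pass
```

In this model a query by city, or by city and date, follows the key order, but a query by date alone or by status does not. Which conditions become cheap is fixed by the order of fields in the row key, which is one reason row key design takes more effort than adding an index in a relational database.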

1.2 HBase usage scenarios

2 HBase data model

3 HBase principle

Origin www.cnblogs.com/eugene0/p/11574884.html