Principles of the LSM-Tree in HBase

HBase can serve real-time workloads mainly because of its architecture and underlying data structures. It combines the LSM-Tree (Log-Structured Merge-Tree), HTable (partitioned into regions), and caching to keep queries fast.

1. Principle of LSM-tree

The LSM-tree originated in the 1996 paper "The Log-Structured Merge-Tree (LSM-Tree)". It is now very common in NoSQL systems and has become close to a default choice. This article introduces the main ideas behind the LSM-tree.

The LSM-tree is designed for key-value storage systems. Such a system has two main operations: put(k, v), which writes a (k, v) pair, and get(k), which finds v for a given k. The biggest advantage of the LSM-tree is its fast writes: it relies mainly on sequential disk writes, which lets it beat the B-tree, whose updates require random writes.

The figure below shows the components of an LSM-tree, which is a multi-layer structure. The first layer, C0, lives in memory and holds all recently written (k, v) pairs. This in-memory structure is ordered, can be updated in place at any time, and can be queried at any time. The remaining layers, C1 to Ck, are on disk, and each is a structure sorted by key.

Writing process: When a put(k, v) operation arrives, it is first appended to the write-ahead log (WAL, the log that is recorded before the write is actually applied) and then added to the C0 layer. When the data in C0 reaches a certain size, C0 is merged with C1, much like the merge step of merge sort; this process is called compaction. The merged result, new-C1, is written to disk sequentially and replaces old-C1. When C1 reaches a certain size, it is in turn merged with the layer below it. After each merge, the old files can be deleted, leaving only the new ones.
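The write path described above can be sketched as a toy two-layer store. This is a minimal illustration under stated assumptions, not how any real system implements it: the class name TinyLSM, the file names, the JSON encoding, and the entry-count threshold c0_limit are all hypothetical, and the merge loads C1 into a dict for brevity where a real LSM-tree would stream it.

```python
import json
import os


class TinyLSM:
    """Toy two-layer LSM-tree: C0 in memory, C1 as one sorted file on disk."""

    def __init__(self, directory, c0_limit=4):
        self.wal_path = os.path.join(directory, "wal.log")  # hypothetical name
        self.c1_path = os.path.join(directory, "c1.json")   # hypothetical name
        self.c0_limit = c0_limit  # assumed threshold: number of entries in C0
        self.c0 = {}              # in-memory layer, updated in place
        self.wal = open(self.wal_path, "a")

    def put(self, key, value):
        # 1. Append to the write-ahead log first (a sequential write).
        self.wal.write(json.dumps([key, value]) + "\n")
        self.wal.flush()
        # 2. Then update the in-memory C0 layer.
        self.c0[key] = value
        # 3. Once C0 is large enough, merge it into C1.
        if len(self.c0) >= self.c0_limit:
            self.compact()

    def read_c1(self):
        """Return C1 as a key-sorted list of [key, value] pairs."""
        if not os.path.exists(self.c1_path):
            return []
        with open(self.c1_path) as f:
            return json.load(f)

    def compact(self):
        merged = dict(self.read_c1())  # old-C1
        merged.update(self.c0)         # C0 entries are newer, so they win
        new_c1 = sorted(merged.items())
        tmp_path = self.c1_path + ".new"
        with open(tmp_path, "w") as f:      # new-C1 is written sequentially
            json.dump(new_c1, f)
        os.replace(tmp_path, self.c1_path)  # new-C1 replaces old-C1
        self.c0.clear()
        # Everything in the log is now covered by C1, so truncate the log.
        self.wal.close()
        self.wal = open(self.wal_path, "w")
```

In this sketch compaction runs inline inside put; a real system would run it in the background so that writes are not blocked.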

Note that the same key may be written more than once, and the new version must override the old one. For example, if I write (a=1) first and then (a=233), then 233 is the new version. Suppose the old version of a has already reached the Ck layer when a new version arrives in C0. The write does not check whether the files below contain an old version; old versions are cleaned up during merges.
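How a merge discards old versions can be shown with a small streaming merge of two sorted runs. This is only a sketch: the function name merge_runs and the representation of a layer as a key-sorted list of (key, value) pairs are assumptions made for illustration.

```python
import heapq


def merge_runs(newer, older):
    """Merge two key-sorted lists of (key, value); on equal keys the newer run wins."""
    # Tag every entry with a rank so that, for equal keys, the entry from
    # the newer run (rank 0) is yielded before the one from the older run.
    tagged = heapq.merge(
        ((key, 0, value) for key, value in newer),
        ((key, 1, value) for key, value in older),
    )
    merged = []
    for key, _rank, value in tagged:
        if merged and merged[-1][0] == key:
            continue  # a newer version of this key was already kept
        merged.append((key, value))
    return merged


# The example from the text: a=1 is already on disk, a=233 arrives later.
result = merge_runs([("a", 233)], [("a", 1), ("b", 2)])
```

Here result holds a single entry for a, carrying the value 233; the old value 1 disappears as a side effect of the merge, without any separate lookup at write time.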

The write path itself only appends to the log and updates the in-memory structure, and compaction can run asynchronously in the background without blocking writes.

Query process: As the write process shows, the newest data is in the C0 layer and the oldest data is in the Ck layer. A query therefore checks C0 first; if the key k is not found there, it checks C1, and so on, layer by layer.
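The layer-by-layer lookup can be sketched as follows. The function name get and the data layout (a dict for C0, key-sorted lists of (key, value) tuples for the disk layers) are assumptions for illustration only.

```python
import bisect


def get(key, c0, disk_layers):
    """Look up key in C0 first, then in C1..Ck, returning the newest value found.

    c0 is a dict; disk_layers is a list of key-sorted lists of (key, value)
    tuples, ordered from C1 (newest) to Ck (oldest).
    """
    if key in c0:
        return c0[key]
    for layer in disk_layers:
        # (key,) sorts just before any (key, value) tuple with the same key,
        # so bisect_left finds the first entry whose key is >= key.
        i = bisect.bisect_left(layer, (key,))
        if i < len(layer) and layer[i][0] == key:
            return layer[i][1]
    return None  # not present in any layer


c0 = {"a": 233}
layers = [[("b", 2)], [("a", 1), ("c", 3)]]
value_a = get("a", c0, layers)  # found in C0, the old a=1 is never reached
value_c = get("c", c0, layers)  # falls through C0 and C1 to the last layer
```

Each layer costs one binary search, which is why a lookup for an old or missing key touches every layer.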

A single query may therefore need several point lookups, one per layer, which makes reads somewhat slower. For this reason the LSM-tree is mainly aimed at write-intensive workloads with relatively few queries.

The LSM-tree is used in many key-value databases, such as LevelDB and RocksDB. The distributed wide-column database Cassandra also uses an LSM-tree storage architecture.

2. Use of LSM-Tree in HBase

2.1 HBase write process

Data is first written to memory. To avoid losing in-memory data, each write is also persisted to disk at the same time. These two parts correspond to HBase's MemStore and HLog (the write-ahead log).

Once the data in a MemStore reaches a certain threshold, it is flushed to disk, producing an HFile (a sorted file whose block index makes it behave like a small B+ tree).

HBase then runs compactions: a minor compaction merges several small HFiles, and a major compaction merges all HFiles in a region. Compaction also removes invalid data (expired and deleted data). At this point multiple small trees are merged into one large tree, which improves read performance.
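The difference between the two kinds of compaction can be illustrated with a toy model. This is not HBase code: files are modeled as plain dicts ordered oldest first, TOMBSTONE is a hypothetical delete marker, and the sketch assumes that delete markers are dropped only by the major pass, since only then is it certain that no older version survives in a file that was left out.

```python
TOMBSTONE = object()  # hypothetical marker meaning "this key was deleted"


def merge(files, drop_deletes):
    """Merge dict-based files (oldest first) into one key-sorted dict."""
    merged = {}
    for f in files:
        merged.update(f)  # later files are newer and overwrite older entries
    if drop_deletes:
        merged = {k: v for k, v in merged.items() if v is not TOMBSTONE}
    return dict(sorted(merged.items()))


def minor_compaction(store_files, count):
    """Merge only the `count` newest files; older files stay untouched."""
    split = len(store_files) - count
    kept, picked = store_files[:split], store_files[split:]
    # Markers must survive: an untouched older file may still hold the key.
    return kept + [merge(picked, drop_deletes=False)]


def major_compaction(store_files):
    """Merge every file into one and physically drop deleted entries."""
    return [merge(store_files, drop_deletes=True)]


files = [{"a": 1, "b": 2}, {"a": 233}, {"b": TOMBSTONE, "c": 3}]
after_minor = minor_compaction(files, 2)  # two files remain
after_major = major_compaction(files)     # one file remains, b is gone
```

After the minor pass the marker for b is still present, because the oldest file still contains b=2; after the major pass both the marker and the old value have been removed.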

2.2 Optimization of HBase for LSM-Tree

Bloom filter: A probabilistic bitmap that can quickly tell you whether a given key may be present in a small ordered structure. If the filter says no, the key is definitely absent, so you can skip that structure without a binary search, after only a few simple calculations. If it says yes, the key is probably present (false positives are possible). This improves efficiency at the cost of some extra space.
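A minimal Bloom filter makes the idea concrete. This is a generic sketch, not HBase's implementation: the class name, the default sizes, and the use of seeded SHA-256 digests as hash functions are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # the bitmap

    def _positions(self, key):
        # Derive num_hashes bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


bloom = BloomFilter()
bloom.add("row-1")
```

A reader would call might_contain before opening a file and skip the file entirely on a False answer; on True it still has to search the file, because the answer may be a false positive.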

Compaction: Small trees are merged into a large tree. Because querying many small trees is slow, a background process keeps merging them into larger ones, so that most queries for old data can be answered with a single search of about log2(N) steps instead of searching each small tree in turn, at a cost of about (N/m)*log2(n), where N/m is the number of small trees and n is the size of each.
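A quick calculation shows the size of the gap. The helper names below are hypothetical, and the sketch assumes that every small tree holds the same number of entries and that a missing or old key forces a binary search in each of them.

```python
import math


def cost_small_trees(total_entries, entries_per_tree):
    """Approximate comparisons when every small tree must be searched."""
    trees = math.ceil(total_entries / entries_per_tree)
    return trees * math.log2(entries_per_tree)


def cost_one_tree(total_entries):
    """Approximate comparisons for one binary search over all entries."""
    return math.log2(total_entries)


# Example: about one million entries, split into trees of 1024 entries each.
before = cost_small_trees(2**20, 2**10)  # 1024 trees * 10 comparisons
after = cost_one_tree(2**20)             # 20 comparisons
```

With these numbers the unmerged layout needs roughly 10240 comparisons in the worst case, against 20 for the single merged tree, which is the motivation for continuous compaction.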

 

 

 

 

 


Origin blog.csdn.net/yuan1164345228/article/details/108693761