LSM tree - efficient storage

Reposted from http://bofang.iteye.com/blog/1676698

 

The paper The Log-Structured Merge-Tree (LSM-Tree) ( http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.44.2782&rep=rep1&type=pdf ) describes the details of these algorithms and data structures.

 

The main goal of the LSM-tree is to build indexes quickly. The B-tree is the general-purpose technique for building indexes. However, under heavy concurrent insertion, a B-tree requires a large amount of random disk IO, and that random IO seriously slows index construction. This matters most when the index data is large (for example, a composite index on two columns) and reads are relatively infrequent, so that insert speed dominates overall performance. The LSM-tree gets its write performance from sequential writes to disk: these greatly reduce the number of disk seeks, and a single disk IO can write multiple index blocks.

 

The main idea of the LSM-tree is to split the index into trees at different levels. Taking a two-level tree as an example, the index consists of two trees: one lives in memory and the other lives on disk. The in-memory tree does not have to be a B-tree; it can be another kind of tree, such as an AVL tree. Because the in-memory tree is not organized into disk pages, there is no need to sacrifice CPU efficiency to minimize the tree height. The tree on disk is a B-tree.

 

 

Data is first inserted into the in-memory tree. When the amount of data in the in-memory tree exceeds a certain threshold, a merge operation is performed. The merge traverses the leaf nodes of the in-memory tree from left to right and merges them with the leaf nodes of the on-disk tree. When the amount of merged data reaches the size of a disk storage page, the merged data is persisted to disk and the parent node's pointer to the leaf node is updated.
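
The merge step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: both trees are reduced to key-sorted lists of (key, value) pairs standing in for their leaf nodes, and PAGE_SIZE, merge_runs and to_pages are hypothetical names chosen for this example. When the same key appears on both sides, the in-memory entry is assumed to be the newer one.

```python
PAGE_SIZE = 4  # hypothetical: number of entries that fit in one disk page


def merge_runs(mem, disk):
    """Merge two key-sorted lists of (key, value) pairs.

    On equal keys the in-memory entry wins, since it is assumed to be newer.
    """
    out = []
    i = j = 0
    while i < len(mem) and j < len(disk):
        if mem[i][0] < disk[j][0]:
            out.append(mem[i])
            i += 1
        elif mem[i][0] > disk[j][0]:
            out.append(disk[j])
            j += 1
        else:
            out.append(mem[i])  # same key: keep the in-memory version
            i += 1
            j += 1
    out.extend(mem[i:])
    out.extend(disk[j:])
    return out


def to_pages(entries, page_size=PAGE_SIZE):
    """Cut the merged run into page-sized chunks, each written in one go."""
    return [entries[k:k + page_size] for k in range(0, len(entries), page_size)]


mem_leaves = [(1, "a"), (3, "c-new"), (5, "e")]
disk_leaves = [(2, "b"), (3, "c-old"), (4, "d"), (6, "f")]
merged = merge_runs(mem_leaves, disk_leaves)
pages = to_pages(merged)
```

Each element of pages corresponds to one full leaf page that would be written out sequentially; updating the parent pointers is left out of the sketch.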

 

 

After leaf nodes that already exist on disk have been merged, the old data is not deleted in place. Instead, a new copy of that data is written sequentially to disk together with the data from memory. This wastes some space, but the LSM-tree provides mechanisms to reclaim it.
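
A minimal sketch of this "write a new copy, leave the old one behind" behaviour, assuming the disk is modelled as an append-only list of pages. AppendOnlyStore and its methods are hypothetical names for illustration; the actual reclamation mechanism is not described here, so the sketch only counts the pages that have become garbage.

```python
class AppendOnlyStore:
    """Simulated disk where pages are only ever appended, never overwritten."""

    def __init__(self):
        self.pages = []    # every page ever written, in write order
        self.live = set()  # indexes of pages still referenced by the tree

    def append(self, page):
        """Write a page at the end of the store and return its index."""
        self.pages.append(page)
        idx = len(self.pages) - 1
        self.live.add(idx)
        return idx

    def replace(self, old_idx, merged_page):
        """Write the merged copy at the end; the old page becomes garbage."""
        self.live.discard(old_idx)
        return self.append(merged_page)

    def garbage(self):
        """Number of pages that are no longer referenced and could be reclaimed."""
        return len(self.pages) - len(self.live)


store = AppendOnlyStore()
old = store.append([(2, "b"), (4, "d")])
new = store.replace(old, [(1, "a"), (2, "b"), (4, "d")])
```

After the replace call the old page is still physically present in store.pages, which is the wasted space mentioned above.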

 

The non-leaf node data of the tree on disk is also cached in memory.

 

A lookup first searches the in-memory tree; if the key is not found there, it then searches the tree on disk.
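
The lookup order can be sketched as below, assuming for illustration that the in-memory tree is a plain dict and the on-disk tree is a pair of parallel sorted lists searched with binary search. The function name lookup and its parameters are hypothetical.

```python
import bisect


def lookup(key, memtable, disk_keys, disk_values):
    """Search the in-memory structure first, then fall back to the disk run."""
    if key in memtable:
        return memtable[key]
    # disk_keys is sorted, so binary search stands in for a B-tree descent
    pos = bisect.bisect_left(disk_keys, key)
    if pos < len(disk_keys) and disk_keys[pos] == key:
        return disk_values[pos]
    return None


memtable = {5: "new"}
disk_keys = [1, 5, 9]
disk_values = ["a", "old", "c"]
```

With this data, key 5 is answered from memory without touching the disk lists, while key 9 is only found on the disk side.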

 

There is an obvious problem: if the amount of data is very large, the tree on disk becomes correspondingly large, and merging gets slower. One solution is to build trees at multiple levels, with each lower-level tree holding a larger dataset than the tree above it. Suppose the tree in memory is c0 and the trees on disk are, in order of level, c1, c2, c3, ... ck-1, ck. Merging then proceeds in the order (c0, c1), (c1, c2) ... (ck-1, ck).
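
The cascade (c0, c1), (c1, c2) ... can be sketched as follows. This is an illustration under simplifying assumptions: every level is a plain dict rather than a sorted tree, a level is merged whole into the next one when it exceeds its limit, and the class name MultiLevelLSM and the parameters c0_limit, growth and levels are hypothetical.

```python
class MultiLevelLSM:
    """Toy multi-level structure: levels[0] is c0, levels[i] is ci."""

    def __init__(self, c0_limit=2, growth=2, levels=2):
        self.levels = [dict() for _ in range(levels + 1)]
        # Each level may hold `growth` times more entries than the one above;
        # the last level has no limit.
        self.limits = [c0_limit * growth ** i for i in range(levels)]

    def insert(self, key, value):
        self.levels[0][key] = value
        # Cascade: merge (c0, c1), then (c1, c2), ... while a level overflows.
        for i, limit in enumerate(self.limits):
            if len(self.levels[i]) <= limit:
                break
            self.levels[i + 1].update(self.levels[i])  # newer entries overwrite
            self.levels[i].clear()

    def get(self, key):
        # Upper levels hold newer data, so search from c0 downwards.
        for level in self.levels:
            if key in level:
                return level[key]
        return None


lsm = MultiLevelLSM()
for k in range(7):
    lsm.insert(k, f"v{k}")
sizes = [len(level) for level in lsm.levels]
```

Most inserts touch only c0; the larger, slower levels are involved only when the level above them overflows.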

 

Why is LSM-tree insertion fast?

 

1. An insert first acts only on memory, and the in-memory tree is never very large, so the insert itself is very fast.

2. A merge operation writes one or more disk pages sequentially, which is much faster than random writes.
