How B-trees are used in databases

Preface

First of all, the B-tree should not be confused with the binary tree. In computer science, the B-tree is a self-balancing tree data structure that maintains ordered data and allows search, sequential access, insertion and deletion in logarithmic time.

The B-tree generalizes the binary search tree in that a node can have more than two child nodes. [1] Unlike self-balancing binary search trees, the B-tree is well suited to storage systems that read and write relatively large blocks of data, such as disks.

It is commonly used in databases and file systems.

First understand disk storage


Computers store data on disks. A disk is partitioned by tracks and blocks, which divides it into distinct areas.

To find a piece of data on the disk we need its address. The address has the form [track + block] and identifies one specific area.

Once the data has been located, it must be copied from the disk into RAM, because data cannot be processed directly on the disk.

Find information on disk

Assume each block is 512 bytes. We need to store 100 records, each with five fields totalling 128 bytes (the id field takes 10 bytes).

Each block can therefore hold 4 records (512 / 128), so the table needs 25 blocks (100 / 4).

The time it takes to find a record depends on the number of blocks that have to be read. Searching the entire table means reading all 25 blocks.
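The arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation; the constant names are made up for illustration, and the values are the ones assumed in the text.

```python
import math

BLOCK_SIZE = 512    # bytes per disk block (from the text)
RECORD_SIZE = 128   # bytes per record (from the text)
NUM_RECORDS = 100   # number of records in the table (from the text)

# How many whole records fit into one block
records_per_block = BLOCK_SIZE // RECORD_SIZE

# How many blocks are needed to hold every record
data_blocks = math.ceil(NUM_RECORDS / records_per_block)

print(records_per_block)  # 4
print(data_blocks)        # 25
```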

Index

The database index is designed to reduce access time.

When we create an index, we build a separate table in other blocks. Each entry in that table holds an [index value] and a [pointer] to the record's location on disk. When every record in the data table has its own index entry, this is called a dense index.

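Assume the index value (id) takes 10 bytes and the pointer takes 6 bytes, so one index entry is 16 bytes.

A block can then hold 32 index entries (512 / 16), so the 100 records above need 4 blocks for the index. A lookup now reads at most 4 index blocks plus 1 data block, instead of 25 data blocks.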
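The same calculation for the index, as a small sketch. It assumes the index is scanned linearly, one block at a time, which is the worst case; the constant names are illustrative.

```python
import math

BLOCK_SIZE = 512    # bytes per disk block (from the text)
KEY_SIZE = 10       # bytes for the id (from the text)
POINTER_SIZE = 6    # bytes for the disk pointer (from the text)
NUM_RECORDS = 100   # one index entry per record (dense index)

entry_size = KEY_SIZE + POINTER_SIZE
entries_per_block = BLOCK_SIZE // entry_size
index_blocks = math.ceil(NUM_RECORDS / entries_per_block)

# Worst case under the linear-scan assumption:
# read every index block, then the single data block the pointer leads to.
worst_case_reads = index_blocks + 1

print(entry_size, entries_per_block, index_blocks, worst_case_reads)
```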

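If the number of records grows very large, the index itself becomes expensive to scan. We can then build an index on the index, also called a sparse index, which holds one entry per block of the lower-level index instead of one entry per record.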


Now suppose we have several levels of sparse indexes stacked on top of the dense index.


Rotate these index levels by 90° and the result looks like a tree structure.

This is how the B-tree is applied in databases.
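A two-level lookup can be sketched as follows. This is a toy in-memory model, not how any particular database lays out its index: the keys 0..99, the block numbering, and the names `dense_blocks`, `sparse` and `lookup` are all assumptions made for illustration.

```python
import bisect

ENTRIES_PER_BLOCK = 32  # index entries per block (from the text)
RECORDS_PER_BLOCK = 4   # records per data block (from the text)

# Dense index: one (key, data_block_number) entry per record, sorted by key.
# Keys 0..99 stored in key order are a made-up data set.
dense = [(key, key // RECORDS_PER_BLOCK) for key in range(100)]

# The dense index itself occupies several blocks.
dense_blocks = [dense[i:i + ENTRIES_PER_BLOCK]
                for i in range(0, len(dense), ENTRIES_PER_BLOCK)]

# Sparse index: one entry per dense-index block (the first key in that block).
sparse = [block[0][0] for block in dense_blocks]

def lookup(key):
    """Return the data block number that holds key, or None if it is absent."""
    # Level 1: pick the dense-index block whose first key is <= key.
    i = bisect.bisect_right(sparse, key) - 1
    if i < 0:
        return None
    # Level 2: binary search inside that single dense-index block.
    block = dense_blocks[i]
    keys = [k for k, _ in block]
    j = bisect.bisect_left(keys, key)
    if j < len(keys) and keys[j] == key:
        return block[j][1]
    return None

print(lookup(57))   # 14
print(lookup(100))  # None
```

Each level narrows the search to one block of the level below, which is the same step a tree search takes when it follows a child pointer.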

B-tree (B- tree)

Motivation

Properties

  1. Each node has at most m branches (subtrees). The minimum number of branches depends on whether the node is the root: a root that is not a leaf has at least 2 branches, and every other non-leaf node has at least ⌈m/2⌉ branches.

  2. A node with n branches (k ≤ n ≤ m) has n - 1 keys, sorted in ascending order, where k = 2 for the root and k = ⌈m/2⌉ for non-root nodes.

  3. No key appears in more than one node.

  4. All leaf nodes are on the same level. They can be represented by null pointers, which mark the positions where an unsuccessful search ends.
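Properties 1 and 2 together fix how many keys a node may hold. A minimal sketch, assuming m is the order of the tree; the function name `key_bounds` is made up for illustration.

```python
import math

def key_bounds(m, is_root):
    """Return (min_keys, max_keys) for a node in a B-tree of order m.

    A node with n branches holds n - 1 keys, so the bounds on keys are
    the bounds on branches minus one.
    """
    min_branches = 2 if is_root else math.ceil(m / 2)
    max_branches = m
    return min_branches - 1, max_branches - 1

print(key_bounds(5, is_root=True))   # (1, 4)
print(key_bounds(5, is_root=False))  # (2, 4)
```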

Basic knowledge

Root node: The root node is the topmost node. It has the same upper limit on the number of child nodes as an internal node, but it is exempt from the lower limit that applies to internal nodes.

Internal nodes: Internal nodes are all nodes other than leaf nodes and the root node. They have both a parent node and child nodes.

Leaf nodes: Leaf nodes have the same restriction on the number of elements, but they have no child nodes and no pointers to child nodes.

Order: If each node can hold at most four elements, it has up to five pointers, and the order is 5.
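To make the terms concrete, here is a small hand-built tree of order 5 and a search over it. It is a sketch under simple assumptions: nodes live in memory rather than in disk blocks, and the `Node` class and `search` function are illustrative names, not part of any library.

```python
import bisect

class Node:
    """A B-tree node: sorted keys plus child pointers (none for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    while True:
        # Position of the first key >= the one we are looking for
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:
            return False  # reached a leaf without finding the key
        node = node.children[i]  # descend into the i-th subtree

# Order 5: every node holds at most 4 keys and has at most 5 children.
root = Node([20, 40], [
    Node([5, 10]),
    Node([25, 30, 35]),
    Node([45, 50]),
])

print(search(root, 30))  # True
print(search(root, 33))  # False
```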

Insert and delete

Watching the following videos is recommended for a better understanding of insertion and deletion.
https://www.bilibili.com/video/BV1UC4y1p7zm?from=search&seid=2406874848049670294

https://www.bilibili.com/video/BV1Aa4y1j7a4?from=search&seid=2406874848049670294
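
For readers who prefer code, insertion can be sketched roughly as follows. This is one common way to do it (insert into a leaf, then split any node that overflows and push its median key up to the parent) and is not taken from the videos. It assumes an in-memory tree of order m, ignores duplicates, and leaves deletion out; all names are illustrative.

```python
import bisect

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys if keys is not None else []
        self.children = children if children is not None else []  # empty for a leaf

def _insert(node, key, m):
    """Insert key below node. Return (median, right_node) if node was split, else None."""
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return None  # key already present, nothing to do
    if not node.children:
        node.keys.insert(i, key)  # leaf: place the key directly
    else:
        split = _insert(node.children[i], key, m)
        if split is not None:
            median, right = split
            node.keys.insert(i, median)  # child split: take over its median
            node.children.insert(i + 1, right)
    if len(node.keys) <= m - 1:
        return None  # still within the limit of m - 1 keys
    # Overflow: split around the median key.
    mid = len(node.keys) // 2
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right

def insert(root, key, m):
    """Insert key into the tree and return the (possibly new) root."""
    split = _insert(root, key, m)
    if split is None:
        return root
    median, right = split
    return Node([median], [root, right])  # the tree grows one level taller

def inorder(node):
    """Return all keys in ascending order."""
    if not node.children:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out.extend(inorder(child))
        if i < len(node.keys):
            out.append(node.keys[i])
    return out

def leaf_depths(node, depth=0):
    """Return the set of depths at which leaves occur."""
    if not node.children:
        return {depth}
    depths = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths

root = Node()
for k in [50, 10, 40, 20, 30, 60, 80, 70, 90, 100, 25, 35, 5, 15]:
    root = insert(root, k, 5)

print(inorder(root))      # the keys in ascending order
print(leaf_depths(root))  # a single depth: all leaves are on the same level
```

Because the tree only grows taller when the root itself splits, every leaf stays on the same level, which is property 4 above.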


Origin blog.csdn.net/Matcha_/article/details/113862741