MySQL database index implementation

Currently, the indexes supported by MySQL mainly include hash indexes, B+ tree indexes, fulltext indexes, and spatial indexes. The most commonly used is the B+ tree index. Today we will take a look at the index implementation of the InnoDB and MyISAM storage engines.

Introduction to InnoDB Indexes

The B+ tree is a multi-way balanced search tree. It is a variant of the B-tree, which extends the idea of a balanced binary search tree (such as the AVL tree) by allowing each node to have many children instead of two.

In MySQL's InnoDB, B+ tree indexes are divided into clustered indexes and non-clustered (secondary) indexes. The clustered index builds a B+ tree on the table's primary key, and its leaf nodes store the entire row. The leaf nodes of a non-clustered index store only the indexed column values and the corresponding primary key value. The following figure is a simplified structure of a typical B+ tree clustered index:

Let's take a look at the non-clustered index:

The difference is obvious: the data stored in the leaf nodes is different. This explains why a search that uses the covering index mentioned earlier is so fast (if this is unclear, read on, and then revisit the previous article).

A covering index is fast because it eliminates the second lookup. When the selected columns, the condition columns, and the sort columns are all part of one composite index, every field the query needs can be read from that index alone. The non-clustered index is also much smaller than the clustered index, so searching it is very fast. If the requested columns are not all in the non-clustered index, the storage engine can only get the primary key from it, and must then use that primary key to fetch the row from the clustered index. With a large amount of data this is much slower.
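The two lookup paths can be sketched in a few lines of Python. This is a toy model, not how InnoDB is implemented: dictionaries and a sorted list stand in for B+ trees, and the table, its columns, and the function name `find_by_name` are all made up for illustration.

```python
from bisect import bisect_left

# Clustered index: primary key -> full row (leaf nodes hold the whole record).
clustered = {
    1: {"id": 1, "name": "alice", "age": 30, "city": "Paris"},
    2: {"id": 2, "name": "bob", "age": 25, "city": "Rome"},
    3: {"id": 3, "name": "carol", "age": 35, "city": "Oslo"},
}

# Hypothetical secondary index on (name, age): each entry holds the indexed
# column values plus the primary key, kept in sorted order.
secondary = sorted((r["name"], r["age"], pk) for pk, r in clustered.items())


def find_by_name(name, columns):
    """Return (result rows, number of clustered-index lookups performed)."""
    results, clustered_lookups = [], 0
    # Binary search stands in for descending the secondary B+ tree.
    i = bisect_left(secondary, (name,))
    while i < len(secondary) and secondary[i][0] == name:
        entry_name, entry_age, pk = secondary[i]
        in_index = {"name": entry_name, "age": entry_age, "id": pk}
        if all(c in in_index for c in columns):
            # Covering case: the index entry alone answers the query.
            results.append({c: in_index[c] for c in columns})
        else:
            # Non-covering case: go back to the clustered index by primary key.
            clustered_lookups += 1
            row = clustered[pk]
            results.append({c: row[c] for c in columns})
        i += 1
    return results, clustered_lookups


print(find_by_name("bob", ["name", "age"]))   # answered from the index only
print(find_by_name("bob", ["name", "city"]))  # needs one clustered lookup
```

Asking for `name` and `age` never touches `clustered`, while asking for `city` costs one extra lookup per matching entry. That extra lookup per row is the cost a covering index avoids.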

Next, let's look at how MyISAM is implemented:

In MyISAM, whatever the kind of index, the value stored with each key is the physical address of the row on disk. The only difference is that key values in the primary key index cannot repeat.
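As a rough illustration of this layout, the sketch below keeps the rows in a separate "data file" and lets both indexes point at row addresses. It is a simplified model under stated assumptions: a list position stands in for a disk offset, dictionaries stand in for B+ trees, and the table contents and the helper names `by_id` and `by_name` are hypothetical.

```python
# Data file (.MYD in MyISAM): rows in insertion order. The list position
# stands in for the physical address of the row on disk.
data_file = [
    {"id": 10, "name": "alice"},
    {"id": 20, "name": "bob"},
    {"id": 30, "name": "alice"},
]

# Index file (.MYI): both indexes map a key to a row address, not to a row.
primary_index = {}  # unique: exactly one address per key
name_index = {}     # non-unique: a list of addresses per key

for address, row in enumerate(data_file):
    if row["id"] in primary_index:
        # The primary key index rejects repeated key values.
        raise ValueError("duplicate primary key: %r" % row["id"])
    primary_index[row["id"]] = address
    name_index.setdefault(row["name"], []).append(address)


def by_id(pk):
    """Primary key lookup: index gives the address, data file gives the row."""
    return data_file[primary_index[pk]]


def by_name(name):
    """Secondary lookup: same mechanism, but several addresses may match."""
    return [data_file[a] for a in name_index.get(name, [])]


print(by_id(20))
print(by_name("alice"))
```

Unlike the InnoDB case, a secondary index lookup here does not pass through the primary key index: both indexes lead straight to the row address.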

A B+ tree index is usually very efficient: the tree is shallow, so a lookup needs only a few I/O operations. It is not without shortcomings, however. The more data the tree holds, the more work each insert or update may cause to keep the tree balanced. Beyond a certain size, performance can drop sharply, which is one reason large tables tend to perform poorly.
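A back-of-the-envelope calculation shows why the tree stays shallow. The sketch below assumes InnoDB's default 16 KB page, an 8-byte key, a 6-byte child pointer, and rows of about 1 KB. The key, pointer, and row sizes are illustrative assumptions, and page headers and fill factor are ignored, so the results are rough upper bounds rather than exact figures.

```python
def max_rows(height, page_size=16 * 1024, key_bytes=8, pointer_bytes=6,
             row_bytes=1024):
    """Rough upper bound on rows in a clustered B+ tree of the given height."""
    # Each internal page holds (key, pointer) pairs, one per child page.
    fanout = page_size // (key_bytes + pointer_bytes)
    # Each leaf page holds full rows.
    rows_per_leaf = page_size // row_bytes
    # height - 1 internal levels fan out to the leaf level.
    return fanout ** (height - 1) * rows_per_leaf


for h in (1, 2, 3, 4):
    print(h, max_rows(h))
```

Under these assumptions a tree of height 3 can hold about 21.9 million rows, so a primary key lookup reads at most three pages. Each extra level multiplies the capacity by the fanout (1170 here), which is why height grows so slowly with table size.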
