MySQL clustered index and non-clustered index

Analysis & Answer

When a table row contains multiple fields, a single B+ tree can be ordered by only one key, usually the primary key. A query on a non-primary-key field cannot use the primary key index and falls back to a sequential scan. To avoid this, a second index is built on the column being searched, and this index is organized as its own independent B+ tree. There are two common ways for multiple B+ trees to access the same set of table data: one is called the clustered index and the other the non-clustered index (secondary index). Although both names contain the word "index", they describe not separate index types but two ways of storing the data.
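The need for a second index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a sorted list with binary search stands in for a B+ tree, the rows are hypothetical sample data, and names are assumed to be unique.

```python
from bisect import bisect_left

# Hypothetical rows (id, name), kept sorted by primary key.
rows = [(14, "Ann"), (20, "Bob"), (27, "Cid"), (35, "Dee")]
ids = [row[0] for row in rows]


def find_by_id(pk):
    """Binary search on the sorted primary key."""
    i = bisect_left(ids, pk)
    return rows[i] if i < len(rows) and ids[i] == pk else None


def scan_by_name(name):
    """No index on name: every row has to be inspected in turn."""
    for row in rows:
        if row[1] == name:
            return row
    return None


# A second, independent sorted structure on name restores binary search.
name_entries = sorted((name, pk) for pk, name in rows)
names = [entry[0] for entry in name_entries]


def lookup_name_index(name):
    """Binary search on the name index; returns the matching primary key."""
    i = bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return name_entries[i][1]
    return None
```

What the name index stores next to each name (a primary key here) is exactly where the two storage methods differ.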

clustered index

With clustered index storage, the row data is stored together with the primary key B+ tree, while a secondary key B+ tree stores only the secondary key and the primary key. The primary key tree and the secondary key trees are therefore two quite different kinds of tree.

InnoDB uses a clustered index: it organizes the primary key into a B+ tree and stores the row data in the leaf nodes. A search on the primary key with the condition "where id = 14" follows the B+ tree search algorithm to the corresponding leaf node, where the row data is obtained. A search on the Name column takes two steps. First, look up Name in the auxiliary index B+ tree and reach its leaf node to obtain the corresponding primary key. Second, use that primary key to perform another B+ tree search on the primary index, finally reaching the leaf node that holds the entire row.
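The two lookup paths can be modeled roughly as follows. This is a toy sketch, not InnoDB's actual implementation: the SortedIndex class is a hypothetical stand-in for a B+ tree (sorted keys plus binary search), the rows are made-up sample data, and names are assumed to be unique.

```python
from bisect import bisect_left


class SortedIndex:
    """Toy stand-in for a B+ tree: sorted keys searched with binary search."""

    def __init__(self, entries):
        pairs = sorted(entries, key=lambda pair: pair[0])
        self.keys = [key for key, _ in pairs]
        self.payloads = [payload for _, payload in pairs]

    def search(self, key):
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.payloads[i]
        return None


# Hypothetical rows; the values are illustrative only.
rows = [
    {"id": 14, "name": "Ann"},
    {"id": 20, "name": "Bob"},
    {"id": 27, "name": "Cid"},
    {"id": 35, "name": "Dee"},
]

# Primary (clustered) index: the "leaf" payload is the whole row.
primary_index = SortedIndex((row["id"], row) for row in rows)
# Auxiliary index: the "leaf" payload is only the primary key.
name_index = SortedIndex((row["name"], row["id"]) for row in rows)


def find_by_id(pk):
    """One tree search: the leaf already holds the row."""
    return primary_index.search(pk)


def find_by_name(name):
    """Two tree searches: name -> primary key, then primary key -> row."""
    pk = name_index.search(name)
    if pk is None:
        return None
    return primary_index.search(pk)
```

Calling find_by_id(14) touches only primary_index, while find_by_name("Cid") touches name_index first and primary_index second, mirroring the two steps described above.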

non-clustered index

With non-clustered index storage, the leaf nodes of the primary key B+ tree store pointers to the real data rows, not the row data itself.

MyISAM uses a non-clustered index. The two B+ trees of a non-clustered index look the same: the node structure is identical and only the stored content differs. The nodes of the primary key index B+ tree store the primary key, and the nodes of the auxiliary index B+ tree store the auxiliary key. The table data is stored in a separate place, and the leaf nodes of both B+ trees use an address to point to the real table data. From the point of view of the table data, there is no difference between the two keys. Since each index tree is independent, retrieval by the auxiliary key does not need to access the primary key index tree.
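A comparable toy model of the non-clustered layout is sketched below. It is an assumption-laden illustration rather than MyISAM's real file format: a Python list plays the role of the separate data area, list positions play the role of addresses, plain dictionaries stand in for the two B+ trees, and the rows are hypothetical.

```python
# The table data lives in its own place; the list position acts as the address.
data_file = [
    {"id": 14, "name": "Ann"},
    {"id": 20, "name": "Bob"},
    {"id": 27, "name": "Cid"},
    {"id": 35, "name": "Dee"},
]

# Both indexes have the same shape: key -> address of the row.
# (Dictionaries stand in for B+ trees here; names are assumed unique.)
primary_index = {row["id"]: addr for addr, row in enumerate(data_file)}
name_index = {row["name"]: addr for addr, row in enumerate(data_file)}


def find_by_id(pk):
    """Primary key -> address -> row."""
    addr = primary_index.get(pk)
    return None if addr is None else data_file[addr]


def find_by_name(name):
    """Auxiliary key -> address -> row; the primary index is never consulted."""
    addr = name_index.get(name)
    return None if addr is None else data_file[addr]
```

Both functions follow the same path of one index lookup plus one address dereference, which is why the two keys are equivalent from the point of view of the table data.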

The difference between a clustered index and a non-clustered index

We assume that a table stores 4 rows of data, as shown in the figure below, where Id is the primary index and Name is the secondary index. The diagram clearly shows the difference between clustered and non-clustered indexes.

We focus on the clustered index. At first glance, the clustered index appears less efficient than the non-clustered index, because every search through an auxiliary index requires two B+ tree searches. Isn't this superfluous? What are the advantages of clustered indexes?

  1. Since the row data and the leaf nodes are stored together, the primary key and the row data are loaded into memory together, and the row data can be returned as soon as the leaf node is found. When the data is organized by the primary key Id, a search by Id therefore reaches the data faster.
  2. Using the primary key as a "pointer" in the auxiliary index, instead of an address value, reduces the maintenance work on the auxiliary index when rows are moved or data pages are split. Using the primary key value as a pointer makes the auxiliary index take up more space; the benefit in exchange is that InnoDB does not need to update this "pointer" in the auxiliary index when it moves a row. That is to say, the position of a row (located by the 16K Page in the implementation, which will be covered later) changes as the data in the database is modified (through B+ tree node splits and Page splits), and the clustered index guarantees that no matter how the nodes of the primary key B+ tree change, the auxiliary index tree is not affected.
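The second point can be illustrated with a small sketch that relocates one row under each layout. Everything here is hypothetical: the addresses, page numbers, and rows are invented, and dictionaries stand in for the B+ trees and the data area.

```python
# Non-clustered layout: every index stores the row's physical address.
heap = {100: {"id": 14, "name": "Ann"}, 200: {"id": 20, "name": "Bob"}}
addr_indexes = {
    "id": {14: 100, 20: 200},
    "name": {"Ann": 100, "Bob": 200},
}


def move_row_nonclustered(old_addr, new_addr):
    """Move a row and repair every index entry that pointed at its old address."""
    row = heap.pop(old_addr)
    heap[new_addr] = row
    rewritten = 0
    for column, index in addr_indexes.items():
        index[row[column]] = new_addr
        rewritten += 1
    return rewritten


# Clustered layout: the primary tree owns the row and its page;
# the auxiliary index stores only the primary key.
clustered = {
    14: {"page": 1, "row": {"id": 14, "name": "Ann"}},
    20: {"page": 1, "row": {"id": 20, "name": "Bob"}},
}
name_to_pk = {"Ann": 14, "Bob": 20}


def move_row_clustered(pk, new_page):
    """Move a row to another page; only the primary tree is modified."""
    clustered[pk]["page"] = new_page


def find_by_name_clustered(name):
    """Auxiliary index -> primary key -> row, wherever the row now lives."""
    pk = name_to_pk.get(name)
    return None if pk is None else clustered[pk]["row"]


rewritten = move_row_nonclustered(100, 300)  # both address-based indexes change

name_index_before = dict(name_to_pk)
move_row_clustered(14, new_page=2)
aux_untouched = name_to_pk == name_index_before  # auxiliary index is unchanged
```

In the address-based layout every index on the table has to be rewritten when a row moves, whereas in the clustered layout the auxiliary entry "Ann" -> 14 remains valid after the move.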


Origin blog.csdn.net/jjclove/article/details/127391069