Why auto-increment columns are recommended as primary keys for InnoDB tables in MySQL

Features of InnoDB Engine Tables

1. An InnoDB table is an index-organized table (IOT) built on a B+ tree.

About B+ Trees

Features of a B+ tree:

(1) All keys appear in the linked list of leaf nodes (a dense index), and the keys in that list are kept in sorted order;

(2) A lookup never ends at a non-leaf node; every search descends to a leaf;

(3) The non-leaf nodes act as an index over the leaf nodes (a sparse index), while the leaf nodes form the data layer that stores the keys and their data.
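
The three properties above can be illustrated with a minimal sketch. It is a toy two-level tree, not InnoDB's page format: the names `Leaf`, `Internal`, `build`, `lookup`, and `range_scan` are hypothetical, and the leaf capacity of 3 is chosen only to keep the example small.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list                      # sorted keys stored in this leaf
    values: list                    # row data for each key
    next: Optional["Leaf"] = None   # link to the next leaf in key order


@dataclass
class Internal:
    separators: list   # separators[i] is the smallest key of children[i + 1]
    children: list     # Leaf objects (two levels keep the sketch small)


def build(rows, leaf_capacity=3):
    """Build a two-level tree from (key, value) pairs."""
    rows = sorted(rows)
    leaves = []
    for i in range(0, len(rows), leaf_capacity):
        chunk = rows[i:i + leaf_capacity]
        leaves.append(Leaf([k for k, _ in chunk], [v for _, v in chunk]))
    # Chain the leaves so that they form a sorted linked list.
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return Internal([leaf.keys[0] for leaf in leaves[1:]], leaves)


def find_leaf(root, key):
    # The internal node only routes the search; it never holds the row itself.
    return root.children[bisect_right(root.separators, key)]


def lookup(root, key):
    leaf = find_leaf(root, key)
    if key in leaf.keys:
        return leaf.values[leaf.keys.index(key)]
    return None


def range_scan(root, low, high):
    """Yield values with low <= key <= high by walking the leaf linked list."""
    leaf = find_leaf(root, low)
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > high:
                return
            if k >= low:
                yield v
        leaf = leaf.next


tree = build((k, f"row-{k}") for k in range(1, 11))
print(lookup(tree, 5))
print(list(range_scan(tree, 3, 7)))
```

Every lookup passes through the internal node and finishes in a leaf, and the range scan touches the internal node only once before following the `next` links, which is why ordered leaf pages matter so much for a clustered index.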

 

2. If a primary key (PRIMARY KEY) is defined, InnoDB uses it as the clustered index. If no primary key is explicitly defined, InnoDB uses the first unique index whose columns are all NOT NULL as the clustered index. If there is no such unique index, InnoDB falls back to a built-in 6-byte ROWID as an implicit clustered index (this ROWID increases as rows are written; unlike Oracle's ROWID, it is hidden and cannot be referenced in queries).

3. The data records themselves are stored in the leaf nodes of the primary index (a B+ tree). The records within a leaf node (whose size is one memory page or disk page) must therefore be stored in primary key order, so whenever a new record is inserted, MySQL places it in the appropriate node and position according to its primary key. When a page reaches its fill factor (15/16 by default in InnoDB), a new page (node) is opened.

4. If the table uses an auto-increment primary key, each new record is appended right after the last record of the current index node, and when a page is full, a new page is opened automatically.

5. If a non-auto-increment primary key is used (such as an ID card number or a student number), the primary key values arrive in approximately random order, so each new record has to be inserted somewhere in the middle of an existing index page. MySQL then has to move data to make room for the new record at the right position. The target page may even have been written back to disk and evicted from the cache, in which case it must first be read back from disk, which adds considerable overhead. At the same time, the frequent data movement and page splits cause heavy fragmentation and leave the index structure less compact than it should be, so OPTIMIZE TABLE eventually has to be run to rebuild the table and repack the pages.
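
The difference between points 4 and 5 can be made concrete with a small simulation. This is a simplified model under stated assumptions, not InnoDB's actual split algorithm: pages are plain sorted lists, `PAGE_CAPACITY = 16` is a hypothetical record count, the 15/16 fill factor and page merging are ignored, and the names `simulate` and `fill_factor` are illustrative.

```python
import random
from bisect import bisect_right, insort

PAGE_CAPACITY = 16  # hypothetical records per page, chosen for readability


def simulate(keys, capacity=PAGE_CAPACITY):
    """Insert unique keys into sorted pages; return (mid_page_splits, pages)."""
    pages = [[]]      # pages kept in key order; each page is a sorted list
    first_keys = []   # first_keys[i] is the smallest key of pages[i + 1]
    splits = 0
    for key in keys:
        idx = bisect_right(first_keys, key)
        page = pages[idx]
        if len(page) < capacity:
            insort(page, key)
            continue
        is_last = idx == len(pages) - 1
        if is_last and key > page[-1]:
            # Append pattern: leave the full page untouched and open a new one.
            pages.append([key])
            first_keys.append(key)
            continue
        # Insert into the middle of a full page: move half of its records out.
        splits += 1
        mid = len(page) // 2
        right = page[mid:]
        del page[mid:]
        pages.insert(idx + 1, right)
        first_keys.insert(idx, right[0])
        insort(right if key >= right[0] else page, key)
    return splits, pages


def fill_factor(pages, capacity=PAGE_CAPACITY):
    """Fraction of the allocated record slots that are actually used."""
    return sum(len(p) for p in pages) / (len(pages) * capacity)


N = 1600
seq_splits, seq_pages = simulate(range(1, N + 1))

rng = random.Random(42)  # fixed seed so that repeated runs agree
rand_splits, rand_pages = simulate(rng.sample(range(1, N + 1), N))

print("sequential:", seq_splits, "splits,", len(seq_pages), "pages,",
      f"fill {fill_factor(seq_pages):.2f}")
print("random:    ", rand_splits, "splits,", len(rand_pages), "pages,",
      f"fill {fill_factor(rand_pages):.2f}")
```

In this model, increasing keys never split a page and leave every page full, while the shuffled keys trigger many mid-page splits and spread the same rows over more, partly empty pages. That is the fragmentation described in point 5.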

 

To sum up, access is most efficient when rows are written to an InnoDB table in the same order as the leaf nodes of its B+ tree index. This is the case in the following situations:

1. An auto-increment column (INT/BIGINT type) is used as the primary key. Rows are then written in increasing key order, which matches the order in which the B+ tree leaf nodes are filled and split;

2. The table defines no primary key, and there is no unique index that can be chosen in its place (the conditions described above). InnoDB then uses the built-in ROWID as the primary key, and rows are written in the order in which the ROWID grows.

In addition, if an InnoDB table does not explicitly define a primary key but has a unique index that can be chosen as the primary key, and the values of that unique index are not monotonically increasing (for example a string, a UUID, or a composite unique index over several columns), access to the table will be less efficient.
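
The selection rule behind these cases can be written out as a short decision function. This is only a sketch of the rule as described in this article: the `Table` and `Index` classes and the function `clustered_index_source` are hypothetical and do not correspond to any MySQL API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Index:
    name: str
    columns: list
    unique: bool = False


@dataclass
class Table:
    primary_key: Optional[list] = None                  # PRIMARY KEY columns
    not_null_columns: set = field(default_factory=set)  # columns declared NOT NULL
    indexes: list = field(default_factory=list)         # in definition order


def clustered_index_source(table):
    """Return a description of what InnoDB would cluster the table on."""
    # Case 1: an explicitly defined primary key always wins.
    if table.primary_key:
        return "PRIMARY KEY(" + ", ".join(table.primary_key) + ")"
    # Case 2: the first unique index whose columns are all NOT NULL.
    for index in table.indexes:
        if index.unique and all(c in table.not_null_columns for c in index.columns):
            return "UNIQUE " + index.name
    # Case 3: fall back to the hidden, monotonically increasing row ID.
    return "hidden 6-byte row ID"


with_pk = Table(primary_key=["id"])
with_unique = Table(
    not_null_columns={"email"},
    indexes=[
        Index("uk_nick", ["nick"], unique=True),    # nullable column, skipped
        Index("uk_email", ["email"], unique=True),  # all columns NOT NULL
    ],
)
bare = Table(indexes=[Index("idx_name", ["name"])])

for t in (with_pk, with_unique, bare):
    print(clustered_index_source(t))
```

Only the second case is risky for write performance: if the chosen unique index holds values such as UUIDs, the clustered index is ordered by those values and inserts land in the middle of existing pages.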

 

The original words in "High Performance MySQL"

 
