MySQL (InnoDB analysis): B+ tree indexes (clustered index and non-clustered/auxiliary index) and B+ tree index splits

Hello everyone! Today is New Year's Eve, so let me wish you an early Happy New Year! I wish all of you a happy family reunion, good health, and smooth work in the new year!

1. Overview of B+ Tree Index

  • A B+ tree index is, in essence, the implementation of a B+ tree inside the database. A B+ tree index in a database has a high fan-out, so its height is generally only 2~4 levels; that is, finding the row record for a given key value takes at most 2~4 IOs. Since a typical mechanical disk can do at least 100 IOs per second, 2~4 IOs mean a lookup takes only about 0.02~0.04 seconds
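The arithmetic above can be checked with a small sketch. The fan-out of 1000 and the 100 rows per leaf page used in the example call are illustrative assumptions, not InnoDB figures; the 100 IOs per second is the disk rate quoted in the text.

```python
import math

def btree_height(total_rows, rows_per_leaf, fan_out):
    """Levels needed to index total_rows in a simplified B+ tree model."""
    pages = math.ceil(total_rows / rows_per_leaf)  # pages at the leaf level
    height = 1
    while pages > 1:
        pages = math.ceil(pages / fan_out)  # one more level of pointer pages
        height += 1
    return height

def lookup_seconds(height, ios_per_second=100):
    """One IO per level, at the disk rate quoted in the text."""
    return height / ios_per_second

if __name__ == "__main__":
    # Hypothetical table: 10 million rows, 100 rows per leaf, fan-out 1000.
    h = btree_height(10_000_000, rows_per_leaf=100, fan_out=1000)
    print(h, lookup_seconds(h))
```

Even with ten million rows the modeled tree stays within the 2~4 level range, which is why a point lookup needs only a handful of IOs.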

The general working principle of B+ tree

  • Setting aside the implementation details, here is the general working principle of a B+ tree index
  • It works as follows:
    • Assume a table with a primary key index that contains nine primary key values: 0, 1, 2, 3, 4, 5, 6, 7, and 8
    • In the B+ tree, the leaf nodes store these primary keys
    • When we run select * from table where id > 1 and id < 7, the search starts from the root node of the B+ tree and goes down. After it finds the leaf entry for 1, it follows the leaf level to the right, because the leaf nodes are organized as a linked list, and finds entries 2, 3, 4, 5, and 6
    • insert, update, and delete locate their target in the same way
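The range lookup described above can be sketched as follows. This is a toy model, not InnoDB code: the class Leaf, the helper names, and the choice of three keys per leaf are assumptions made for illustration, and the "root" is reduced to a list of the first key of each leaf.

```python
from bisect import bisect_right

class Leaf:
    """A leaf node holding sorted keys and a pointer to its right neighbor."""
    def __init__(self, keys):
        self.keys = keys
        self.next = None

def build_leaves(keys, per_leaf):
    """Pack sorted keys into leaves and chain them left to right."""
    leaves = [Leaf(keys[i:i + per_leaf]) for i in range(0, len(keys), per_leaf)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves

def range_scan(leaves, low, high):
    """Return all keys with low < key < high."""
    # Descend: pick the leaf whose key range may contain `low`.
    separators = [leaf.keys[0] for leaf in leaves]
    leaf = leaves[max(bisect_right(separators, low) - 1, 0)]
    result = []
    # Walk right along the leaf chain until the upper bound is reached.
    while leaf is not None:
        for key in leaf.keys:
            if key >= high:
                return result
            if key > low:
                result.append(key)
        leaf = leaf.next
    return result

if __name__ == "__main__":
    leaves = build_leaves(list(range(9)), per_leaf=3)
    print(range_scan(leaves, 1, 7))
```

Only one descent from the root is needed; everything after that is a sequential walk along the leaf chain.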

What is the difference between the InnoDB and MyISAM B+ trees

  • Both InnoDB and MyISAM support B+ tree indexes, so what is the difference between them?
    • InnoDB: a leaf node stores not only the value of the primary key but also the row data that belongs to it. Each leaf entry is therefore primary key + the entire row
    • MyISAM: a leaf node also stores the value of the primary key, but instead of the row data it stores the address of the row data. Once the primary key value is found, the row is fetched from that address through the pointer
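A toy model of the two leaf layouts, with Python dicts standing in for the B+ tree leaf level. The sample rows and all names here are made up for illustration.

```python
# Sample rows keyed by primary key (hypothetical data).
rows = {1: ("alice", 30), 2: ("bob", 25), 3: ("carol", 41)}

# InnoDB-style leaf: primary key -> the whole row.
innodb_leaf = dict(rows)

# MyISAM-style: rows live in a separate data file,
# and the leaf maps primary key -> address of the row in that file.
myisam_data_file = list(rows.values())
myisam_leaf = {pk: address for address, pk in enumerate(rows)}

def innodb_lookup(pk):
    """The row comes straight out of the leaf entry."""
    return innodb_leaf[pk]

def myisam_lookup(pk):
    """The leaf yields an address; a second step fetches the row."""
    address = myisam_leaf[pk]
    return myisam_data_file[address]

if __name__ == "__main__":
    print(innodb_lookup(2), myisam_lookup(2))
```

Both lookups return the same row; the MyISAM-style path simply needs one extra hop from the index to the data file.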

Interview question: MyISAM and InnoDB both operate on data through B+ tree indexes; which one is faster?

  • The rule of thumb:
    • When the amount of data involved is small, there may be no noticeable difference between the two
    • When the amount of data is large, InnoDB is faster than MyISAM
  • The explanation:
    • InnoDB reads data from disk and builds the B+ tree in memory. Since the leaf nodes of its B+ tree store the row data itself, the data is already in memory once the leaf is loaded.
    • MyISAM also reads from disk and builds a B+ tree in memory, but its leaf nodes store only pointers to the row data, not the values, so the row data is not brought into memory along with the index.
    • So when the amount of data is small, there is little difference between the two: InnoDB fetches the data directly from memory, MyISAM follows pointers to find the data on disk, and with few rows the efficiency is about the same.
    • When the amount of data is large, InnoDB still fetches the data directly from memory and is therefore faster, while MyISAM has to keep following pointers to fetch data from disk, which makes it slower

B+ tree index classification

  • B+ tree indexes are divided into:
    • Clustered index: a B+ tree built on the primary key of the table; a table can have only one clustered index
    • Auxiliary index (secondary index): a B+ tree built on non-primary-key columns; a table can have multiple auxiliary indexes
  • Whether clustered or auxiliary, the index is internally a B+ tree, that is, height-balanced, with all the data stored in the leaf nodes
  • The difference between a clustered index and an auxiliary index is whether the leaf node stores the whole row

2. Clustered index

Clustered index structure

  • As mentioned earlier, InnoDB tables are index-organized, that is, the data in a table is stored in primary key order
  • A clustered index is a B+ tree built on the primary key of the table, with the row records of the entire table stored in its leaf nodes. The leaf nodes of the clustered index are therefore also called data pages. This property means that the data of an index-organized table is itself part of the index. As in any B+ tree, the data pages are linked to each other through a doubly linked list
  • Since the actual data pages can be sorted by only one B+ tree, each table can have only one clustered index
  • In most cases the query optimizer tends to use the clustered index, because it can find the data directly in the leaf nodes of the B+ tree. In addition, because the clustered index defines the logical order of the data, it serves range queries particularly quickly: the query optimizer can quickly determine which range of data pages needs to be scanned

B+ tree structure analysis case

  • Create the table below, arranged so that each page can hold only two rows
create table t(
    a int not null,
    b varchar(8000),
    primary key(a)
)engine=innodb;
  • Insert data. Column b of each inserted row is 7000 characters long, which artificially ensures that each page can currently hold only two rows
insert into t select 1,repeat('a',7000);
insert into t select 2,repeat('a',7000);
insert into t select 3,repeat('a',7000);
insert into t select 4,repeat('a',7000);
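A quick check of why 7000-character rows leave room for only two rows per page. This sketch assumes InnoDB's default 16 KB page size and one byte per character, and it ignores page and record header overhead, so it gives an upper bound only.

```python
PAGE_SIZE = 16 * 1024  # InnoDB's default page size in bytes

def rows_per_page(row_bytes, page_size=PAGE_SIZE):
    """Upper bound on rows per page, ignoring header overhead (assumption)."""
    return page_size // row_bytes

if __name__ == "__main__":
    print(rows_per_page(7000))
```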
  • Analyzing the tablespace with the py_innodb_page_info tool gives:
    • Pages with page level 0000 are data pages. Data pages were analyzed in the previous chapter, so we do not focus on them here
    • The page with page level 0001 is the root of the B+ tree, since the current height of the tree is 2

  • Look at the data stored in the root page with the hexdump tool, then analyze the page starting from the Page Directory at the end of the page:
    • The slot value 00 63 shows where the row records start within the page
    • Then analyze the Record Header:
      • The value starting at 0xc063 is 69 6e 66 69 6d 75 6d 00, which means this row record is the infimum record
      • The preceding 5 bytes, 01 00 02 00 1b, are the Record Header. Bits 4 to 8 hold the value 1, which means this record owns only one record in its Page Directory slot (remember that InnoDB's Page Directory is sparse), namely the infimum record itself
      • The last two bytes of the Record Header, 00 1b, give the position of the next record: c063+1b=c07e. Reading the key value there gives 80 00 00 01, which is the key value of primary key 1 (the INT column is signed, and InnoDB stores signed integers with the sign bit flipped, so 1 is stored as 0x80 00 00 01 rather than 0x00 00 00 01)
    • The value 00 00 00 04 that follows 80 00 00 01 is the page number of the data page it points to
    • In the same way, you can find the two key values 80 00 00 02 and 80 00 00 04 and the data pages they point to
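The header walk above can be reproduced with a short sketch. The field layout used here (4 flag bits, 4 bits of owned-record count, 13 bits of heap number, 3 bits of record type, 16 bits of relative next-record offset) is my understanding of the COMPACT row format and should be treated as an assumption; the function name is hypothetical.

```python
def parse_record_header(header):
    """Split a 5-byte record header into fields (COMPACT layout assumed)."""
    if len(header) != 5:
        raise ValueError("expected a 5-byte record header")
    n_owned = header[0] & 0x0F                    # low 4 bits of byte 1
    middle = int.from_bytes(header[1:3], "big")   # bytes 2-3
    heap_no = middle >> 3                         # upper 13 bits
    record_type = middle & 0x07                   # lower 3 bits
    # Last two bytes: offset of the next record, relative to this one.
    next_offset = int.from_bytes(header[3:5], "big", signed=True)
    return {
        "n_owned": n_owned,
        "heap_no": heap_no,
        "record_type": record_type,
        "next_offset": next_offset,
    }

if __name__ == "__main__":
    fields = parse_record_header(bytes.fromhex("010002001b"))
    infimum_pos = 0xC063  # position of the infimum record, from the hexdump
    print(fields, hex(infimum_pos + fields["next_offset"]))
```

Adding the relative offset 0x1b to 0xc063 lands on 0xc07e, the position where the key 80 00 00 01 was read.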

  • The analysis of the non-data page above shows that data pages store the complete record of each row, while index pages that are not data pages store only the key value and the page number (offset) of the data page it points to, not a complete row record. The structure of this clustered index is therefore roughly as shown in the figure below

The storage of the clustered index is not physically continuous

  • Many database documents and online blogs say that the clustered index stores data physically in order. But as the figure above suggests, if the clustered index had to store the records physically in a specific order, the maintenance cost would be very high
  • So the storage of a clustered index is not physically contiguous, only logically contiguous
  • This shows in two ways:
    • First, the pages mentioned above are linked through a doubly linked list and ordered by primary key
    • Second, the records inside each page are also maintained through a linked list (each record header points to the next record), so within a page the physical storage need not follow primary key order either
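The difference between physical and logical order inside a page can be shown with a toy model. The class Page below is hypothetical: records are appended in arrival order, and a "next" link per record keeps them in key order, which is enough to illustrate the idea.

```python
class Page:
    """Toy page: records sit in arrival order; 'next' links give key order."""

    def __init__(self):
        self.heap = []     # physical order: [key, index of next record]
        self.first = None  # index of the record with the smallest key

    def insert(self, key):
        new_index = len(self.heap)
        # Find the logical predecessor by walking the linked list.
        prev, cur = None, self.first
        while cur is not None and self.heap[cur][0] < key:
            prev, cur = cur, self.heap[cur][1]
        # Append physically at the end, then relink logically.
        self.heap.append([key, cur])
        if prev is None:
            self.first = new_index
        else:
            self.heap[prev][1] = new_index

    def physical_order(self):
        return [key for key, _ in self.heap]

    def logical_order(self):
        keys, cur = [], self.first
        while cur is not None:
            keys.append(self.heap[cur][0])
            cur = self.heap[cur][1]
        return keys

if __name__ == "__main__":
    page = Page()
    for key in (3, 1, 2):
        page.insert(key)
    print(page.physical_order(), page.logical_order())
```

No record ever moves after it is written; only the links change, which is what keeps the maintenance cost low.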

"Quick query" advantages of clustered index

  • Another advantage of the clustered index is that sorting and range lookups on the primary key are very fast, because the data in the leaf nodes is exactly the data the user wants to query
  • Suppose the user wants to query a table of registered users for the 10 most recently registered users. Because the leaf pages of the B+ tree index are linked in a doubly linked list, the last data page can be found quickly and 10 records read from it. Analyzing the query with EXPLAIN gives:
    • Although order by is used to sort the records, no so-called filesort operation is performed during execution, thanks to the characteristics of the clustered index

  • The other case is the range query: to find the data in a certain primary key range, the page range can be obtained from the intermediate nodes above the leaf level, and the data pages are then read directly. Another example:
    • Running explain returns the execution plan of the MySQL database, and the rows column gives an estimate of the number of rows the query returns. Note that rows is an estimate, not an exact value. Actually executing this SQL query shows that there are in fact only 9946 rows

3. Auxiliary index (non-clustered index)

Auxiliary index structure

  • For an auxiliary index (secondary index, also called a non-clustered index), the leaf nodes do not contain all the data of the row record
  • Besides the key value, each index row in a leaf node also contains a bookmark. The bookmark tells the InnoDB storage engine where to find the row data corresponding to the index entry. Since InnoDB tables are index-organized, the bookmark of an InnoDB auxiliary index is the clustered index key of the corresponding row
  • The following figure shows the relationship between the auxiliary index and the clustered index in the InnoDB storage engine:

Working principle

  • The existence of auxiliary indexes does not affect how data is organized in the clustered index, so a table can have multiple auxiliary indexes
  • When looking up data through an auxiliary index, the InnoDB storage engine traverses the auxiliary index, obtains the primary key from the leaf-level entry, and then finds the complete row record through the primary key (clustered) index
  • For example: in an auxiliary index tree of height 3, finding the specified primary key takes 3 lookups. If the clustered index tree also has height 3, another 3 lookups are needed there to find the page holding the complete row, so a total of 6 logical IOs are required to reach the final data page
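The two-step lookup and its cost can be sketched with dicts standing in for the two trees. The sample rows, the index name idx_c, and the function name are illustrative assumptions; the heights of 3 are the example values from the text.

```python
# Clustered index: primary key -> full row (a, b, c). Hypothetical rows.
clustered = {1: (1, "a", -1), 2: (2, "b", -2)}

# Auxiliary index on column c: c value -> primary key (the bookmark).
idx_c = {-1: 1, -2: 2}

def lookup_by_c(c_value, secondary_height=3, clustered_height=3):
    """Return the full row and the number of logical IOs in this model."""
    primary_key = idx_c[c_value]   # step 1: walk the auxiliary index
    row = clustered[primary_key]   # step 2: walk the clustered index
    logical_ios = secondary_height + clustered_height
    return row, logical_ios

if __name__ == "__main__":
    print(lookup_by_c(-2))
```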

Auxiliary index structure analysis

  • Take the above table t as an example, and then add another column
create table t(
    a int not null,
    b varchar(8000),
    primary key(a)
)engine=innodb;

 

alter table t add c int not null;
  • Update the value of the newly added column c for each row
update t set c=0-a;
  • Create a non-clustered index on column c
alter table t add key idx_c(c);
  • Check the current index

  • Check the data of the current table

  • Using the py_innodb_page_info tool to analyze the table space, you can get:

  • Compared with the clustered index case above, there is one more page this time. The page whose page offset is 4 is the page that holds the non-clustered index; analyzing it with the hexdump tool gives:

Since there are only 4 rows and column c is only 4 bytes wide, the whole non-clustered index fits in a single page. The analysis yields the relationship between the auxiliary index idx_c and the clustered index of table t shown in the following figure:

  • You can see that the leaf node of the auxiliary index contains the value of column c and the value of the primary key
  • Because the key values here were deliberately set to negative numbers, you will find that -1 is stored internally as 7f ff ff ff
  • 7 is 0111 in binary, so the highest bit is 0, which marks a negative value in this stored form. Flipping the highest bit back gives ff ff ff ff, which is -1 in two's complement (invert the bits and add 1 to get the magnitude 1)
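The stored forms seen in the hexdumps (80 00 00 01 for 1, 7f ff ff ff for -1) are consistent with a big-endian 32-bit integer whose sign bit is flipped. A minimal sketch of that encoding follows; the function names are hypothetical.

```python
import struct

SIGN_BIT = 0x80000000

def encode_int(value):
    """Signed 32-bit int -> 4 stored bytes (big-endian, sign bit flipped)."""
    return struct.pack(">I", (value & 0xFFFFFFFF) ^ SIGN_BIT)

def decode_int(raw):
    """4 stored bytes -> signed 32-bit int."""
    (stored,) = struct.unpack(">I", raw)
    unsigned = stored ^ SIGN_BIT  # flip the sign bit back
    # Interpret the result as two's complement.
    return unsigned - (1 << 32) if unsigned & SIGN_BIT else unsigned

if __name__ == "__main__":
    print(encode_int(1).hex(), encode_int(-1).hex())
    print(decode_int(bytes.fromhex("7fffffff")))
```

A side effect of this form is that plain byte-wise comparison of the stored values matches numeric order, so negative keys sort before positive ones.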

4. B+ tree index split

  • The split of a B+ tree index page differs from the textbook B+ tree insertion operation. An index page is not always split at its middle record, because doing so can waste page space

Demonstration description

  • Records are inserted in increasing order; suppose 10 records have been inserted so far

  • If another record is then inserted and a page split is needed, a middle split uses record 5 as the split point, and the following two pages are obtained after the split

  • Since the inserts are sequential, no more records will ever be inserted into page P1, which wastes space, while P2 will have to split again
  • The Page Header of the InnoDB storage engine has the following fields that record insertion-order information:
    • PAGE_LAST_INSERT
    • PAGE_DIRECTION
    • PAGE_N_DIRECTION

  • With this information, the InnoDB storage engine can decide whether to split to the left or to the right, and which record to use as the split point
    • If the inserts are random, the middle record of the page is taken as the split point, as described above
    • If 5 records have been inserted in the same direction and there are 3 records after the located (cursor) record, then the split point is the third record after the located record; otherwise the split point is the record being inserted. (When inserting, InnoDB first has to position a cursor; the located record is the one immediately before the record to be inserted)
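The rule above can be written down as a small decision function. This is a simplified model of the rule as stated in this article, not InnoDB's actual implementation; the function name, its parameters, and the use of ">=" for the thresholds 5 and 3 are assumptions.

```python
from bisect import bisect_left

def choose_split_point(page_keys, new_key, same_direction_inserts, sequential):
    """Return the key that becomes the split point of a full page.

    page_keys: sorted keys currently in the page.
    same_direction_inserts: how many recent inserts went in one direction.
    sequential: False models random inserts.
    """
    if not sequential:
        # Random inserts: split at the middle record of the page.
        return page_keys[(len(page_keys) - 1) // 2]
    # The located (cursor) record is the one just before the insert position.
    cursor = bisect_left(page_keys, new_key) - 1
    records_after = len(page_keys) - 1 - cursor
    if same_direction_inserts >= 5 and records_after >= 3:
        # Third record after the located record.
        return page_keys[cursor + 3]
    # Otherwise the record being inserted is the split point.
    return new_key

if __name__ == "__main__":
    print(choose_split_point(list(range(1, 11)), 11, 10, sequential=True))
```

For a page holding keys 1 to 10 and an ascending insert of 11, no records follow the located record, so the inserted record itself becomes the split point.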

Demo case

  • Now look at an example of a split to the right where there are 3 records after the located record. The split point record is shown in the following figure:

  • The figure above shows a split to the right with 3 records after the located record; the split record marks the split point. The split to the right finally yields the situation shown in the figure below

Demo case

  • In this demonstration case the split point is the inserted record itself, so after the split to the right the new page contains only that record. This is the common situation for auto-increment inserts.
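To see why splitting at the inserted record helps with auto-increment keys, the following sketch compares the two strategies on a single chain of leaf pages. The page capacity of 10 and the function name are assumptions for illustration; real pages hold far more records.

```python
def simulate(n_keys, capacity, split_at_insert):
    """Insert keys 1..n_keys in ascending order and return the leaf pages."""
    pages = [[]]
    for key in range(1, n_keys + 1):
        last = pages[-1]
        if len(last) < capacity:
            last.append(key)
        elif split_at_insert:
            # Split point is the inserted record: the old page stays full.
            pages.append([key])
        else:
            # Middle split: the upper part moves to the new page,
            # and the old page is left partly empty for good.
            mid = (len(last) - 1) // 2
            pages.append(last[mid:] + [key])
            del last[mid:]
    return pages

if __name__ == "__main__":
    for strategy in (False, True):
        pages = simulate(100, capacity=10, split_at_insert=strategy)
        print(strategy, len(pages), [len(p) for p in pages][:5])
```

In this model the middle split needs more than twice as many pages for the same 100 keys, because every page left behind stays less than half full.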

Origin blog.csdn.net/m0_46405589/article/details/113788256