Advanced MySQL - Index

For the basics, see the companion article "MySQL Basics: MySQL Command Encyclopedia".

What is an index: an index is a data structure that helps MySQL retrieve data efficiently.

1. Index data structure selection:

1.1 Hash table

[Figure: hash table index]

1.2 Binary tree/red-black tree index format

Disadvantages: each node holds only one key, so the tree gets deep as the data grows, and every extra level costs another disk I/O. A red-black tree also has to rotate to rebalance itself as data is inserted, which hurts efficiency further.

1.3 B-tree

[Figure: example B-tree with root keys 16 and 34]

Figure description:

Each node occupies one disk block. A node holds two keys in ascending order and three pointers to the root nodes of its subtrees; each pointer stores the address of the disk block where the child node lives. The two keys split the key space into three ranges, one for each of the three subtrees. Taking the root node as an example, the keys are 16 and 34: the subtree under pointer P1 holds values less than 16, the subtree under P2 holds values between 16 and 34, and the subtree under P3 holds values greater than 34.

Process of finding key 28:

  1. Locate disk block 1 from the root node and read it into memory. [1st disk I/O]
  2. Key 28 falls in the interval (16, 34), so take pointer P2 of disk block 1.
  3. Follow P2 to disk block 3 and read it into memory. [2nd disk I/O]
  4. Key 28 falls in the interval (27, 29), so take pointer P2 of disk block 3.
  5. Follow P2 to disk block 8 and read it into memory. [3rd disk I/O]
  6. Find key 28 in the key list of disk block 8.
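
The lookup steps above can be sketched in Python. This is only a minimal model: blocks 1, 3 and 8 follow the walk-through, while every other block, its keys, and the name `search` are made up for illustration. Each block read stands for one disk I/O.

```python
from bisect import bisect_left

# block id -> (sorted keys, child block ids); an empty child list marks a leaf.
# Blocks 1, 3 and 8 follow the walk-through; all other blocks are invented,
# and the tree is trimmed, so it is not a fully balanced B-tree.
BLOCKS = {
    1: ([16, 34], [2, 3, 4]),
    2: ([5, 9], []),
    3: ([27, 29], [7, 8, 9]),
    4: ([60, 80], []),
    7: ([20, 25], []),
    8: ([28], []),
    9: ([31, 33], []),
}

def search(key, root=1):
    """Return (found, blocks read); each block read models one disk I/O."""
    path = []
    block = root
    while True:
        keys, children = BLOCKS[block]
        path.append(block)
        i = bisect_left(keys, key)          # position of key among this node's keys
        if i < len(keys) and keys[i] == key:
            return True, path               # key is stored in this node
        if not children:
            return False, path              # reached a leaf without finding it
        block = children[i]                 # children[1] is pointer P2

print(search(28))  # (True, [1, 3, 8])
```

Searching for 28 reads blocks 1, 3 and 8, which matches the three disk I/Os counted in the steps.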

Disadvantages:

  1. Every node stores data along with its keys, and the space in each page is limited. If the rows are large, fewer keys fit in each node.
  2. With a large amount of data the tree becomes deep, which increases the number of disk I/Os per query and hurts query performance.

1.4 B+ tree

Non-leaf nodes store only keys and pointers, and the data lives in the leaf nodes, so each disk I/O reads as many keys as possible. As a result, about 3 levels are usually enough to hold tens of millions of rows.
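
A rough back-of-the-envelope check of that claim, as a Python sketch. The 16 KB page is InnoDB's default; the key size, pointer size, and average row size below are assumptions chosen for illustration, not measured values.

```python
PAGE_BYTES = 16 * 1024   # InnoDB default page size
KEY_BYTES = 8            # assumption: BIGINT primary key
POINTER_BYTES = 6        # assumption: size of a child-page pointer
ROW_BYTES = 1024         # assumption: average row size of 1 KB

fanout = PAGE_BYTES // (KEY_BYTES + POINTER_BYTES)   # keys per non-leaf page
rows_per_leaf = PAGE_BYTES // ROW_BYTES              # rows per leaf page

def max_rows(levels):
    """Rows addressable by a B+ tree with the given number of levels."""
    return fanout ** (levels - 1) * rows_per_leaf

for levels in (1, 2, 3):
    print(levels, max_rows(levels))
# 1 16
# 2 18720
# 3 21902400
```

Under these assumptions a 3-level tree addresses about 21.9 million rows; larger keys or rows lower the figure.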

  • InnoDB's B+ tree implementation:

In the primary key index, the leaf nodes store the row data. If the index is created on other (non-primary-key) columns, the leaf nodes store the primary key of the record, and the record is then found through that primary key.

  • MyISAM's B+ tree implementation:

The leaf nodes store the address of the data, and the row is then read from that address. This corresponds to MyISAM's two files: the .MYI file stores the index and the .MYD file stores the data.

2. Index classification:

Adding indexes speeds up reads, which improves the application's concurrency and its ability to handle load.

  • Primary key index (primary key): a unique index that must be declared as the primary key; its values cannot be NULL
  • Unique index (unique): every value in the indexed column must be unique, but NULL is allowed
  • Ordinary index: the basic index type; NULL is allowed and there is no uniqueness restriction
  • Full-text index: declared with the fulltext type; it can be created on varchar, char, and text columns
  • Composite index: a single index built over multiple columns

Example: create a composite index: alter table staffs add index idx_nap(name, age, pos);

Usage of the composite index (a, b, c) (index columns after a range condition are no longer used):

[Figure: usage of the composite index (a, b, c)]

How indexes are matched:

[Figure: index matching methods]

3. Comparison of storage engines:

InnoDB reads data one page at a time. A page is 16 KB by default, which equals 4 operating-system pages of 4 KB.

Feature              MyISAM                 InnoDB
Index type           Non-clustered index    Clustered index
Transactions         No                     Yes
Table locks          Yes                    Yes
Row locks            No                     Yes
Foreign keys         No                     Yes
Full-text indexes    Yes                    Yes (since 5.6)
Best suited for      Mostly SELECT          Heavy INSERT, DELETE, UPDATE
Data structure       B+ tree                B+ tree

The MEMORY storage engine uses a hash table as its index data structure.

4. MySQL interview terminology:

  • Back to the table (happens only with ordinary, non-primary-key indexes): when an index is created on a non-primary-key column, the leaf nodes of that index's B+ tree store only the primary key of each row. To get the rest of the row, MySQL must search the primary key index's B+ tree again using that primary key. This second lookup is called going back to the table.

  • Covering index: the index already contains every column the query needs, so there is no need to go back to the table. For example, a query that selects only id and filters on an indexed column can read id directly from that index's leaf nodes, without going back to the table for other details.

  • Leftmost match: a composite index (name, age) follows the leftmost-prefix rule, so it can match (name) or (name, age). Of the queries below, 1, 2 and 4 can use the index (4 works because the MySQL optimizer moves name to the front when matching); 3 cannot, because it skips name.

  1. select  * from t1 where name = "z";
  2. select  * from t1 where name = "z" and age=10;
  3. select  * from t1 where age=10;
  4. select  * from t1 where age=10 and name = "z";
  • Predicate pushdown: do the filtering as early as possible. For select t1.name, t2.name from t1 join t2 on t1.id=t2.id, first take out only the 4 columns needed from the two tables (t1.id, t1.name, t2.id, t2.name), and then join the tables.

  • Index push-down (the storage engine filters first, then returns): for a composite index (name, age), the storage engine checks both name and age against the index entries while reading, and returns only the matching rows to the MySQL server (available since 5.6).
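
Index push-down can be modeled with a small Python sketch. The index entries, the rows, and the query shape (a prefix match on name plus an equality on age) are assumptions made up for illustration; the point is only to count how many full rows the storage engine has to fetch.

```python
# Hypothetical composite index (name, age) -> primary key, sorted by (name, age).
INDEX = [("zack", 10, 1), ("zed", 25, 2), ("zoe", 10, 3), ("zoe", 31, 4)]
ROWS = {1: "row-1", 2: "row-2", 3: "row-3", 4: "row-4"}  # primary key -> full row

def query(prefix, age, pushdown):
    """Model: name LIKE 'prefix%' AND age = <age>. Return (rows, row fetches)."""
    fetched, result = 0, []
    for name, entry_age, pk in INDEX:
        if not name.startswith(prefix):
            continue
        if pushdown and entry_age != age:
            continue                # engine rejects the entry using the index alone
        row = ROWS[pk]              # back to the table for the full row
        fetched += 1
        if entry_age == age:        # server-side check; only rejects rows without push-down
            result.append(row)
    return result, fetched

print(query("z", 10, pushdown=False))  # (['row-1', 'row-3'], 4)
print(query("z", 10, pushdown=True))   # (['row-1', 'row-3'], 2)
```

Both runs return the same rows, but with push-down the engine fetches 2 full rows instead of 4.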

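Note: index columns that come after a range condition cannot be used to narrow the index scan. With an index on (age, name), select * from t where age>10 and name = "z"; uses the index for age but not for name. With an index on (name, age), the same query can use both columns, because the equality on name comes before the range on age.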
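
The leftmost-prefix rule and the range cut-off can be expressed as a short Python sketch. The function name `usable_prefix` and its arguments are hypothetical; it only models which index columns a query can seek on, not how the optimizer actually works.

```python
def usable_prefix(index_cols, equals, ranges=()):
    """Return the index columns a query can seek on under the leftmost-prefix rule."""
    used = []
    for col in index_cols:
        if col in equals:
            used.append(col)        # equality: keep matching the next column
        elif col in ranges:
            used.append(col)
            break                   # columns after a range condition are not used
        else:
            break                   # gap in the prefix: stop matching
    return used

idx = ("name", "age")
print(usable_prefix(idx, {"name"}))          # ['name']
print(usable_prefix(idx, {"name", "age"}))   # ['name', 'age']
print(usable_prefix(idx, {"age"}))           # []
print(usable_prefix(("age", "name"), {"name"}, ranges={"age"}))  # ['age']
```

The order of conditions in the where clause does not matter here, which is why `equals` is passed as a set.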

Clustered index (InnoDB): not a separate index type but a way of storing data, in which the data rows are stored together with their key values, in key order, in the leaf nodes of the index.

Non-clustered index (MyISAM): the data file and the index file are stored separately.
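
To make the InnoDB layout concrete, here is a minimal Python model of a clustered index plus one ordinary index on name. The rows and the `select` helper are invented for illustration; dictionaries stand in for the B+ trees.

```python
# Clustered (primary key) index: id -> full row. Rows are made up.
CLUSTERED = {
    1: {"id": 1, "name": "z", "age": 10},
    2: {"id": 2, "name": "y", "age": 20},
}
# Ordinary index on name: name -> primary keys stored in its leaf nodes.
SECONDARY_NAME = {"z": [1], "y": [2]}

def select(columns, name):
    """Model: select <columns> where name = <name>. Return (rows, back-to-table lookups)."""
    lookups, rows = 0, []
    for pk in SECONDARY_NAME.get(name, []):
        if set(columns) <= {"id", "name"}:
            row = {"id": pk, "name": name}   # covering: the index holds everything needed
        else:
            row = CLUSTERED[pk]              # back to the table via the primary key
            lookups += 1
        rows.append({c: row[c] for c in columns})
    return rows, lookups

print(select(["id"], "z"))         # ([{'id': 1}], 0)
print(select(["id", "age"], "z"))  # ([{'id': 1, 'age': 10}], 1)
```

The first query is covered by the index on name; the second needs age, so it goes back to the clustered index once per matching row.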

[Figure: advantages and disadvantages of the clustered index]

6. Hash index

Used by the MEMORY storage engine.

Data structure: hash table

Use case: when indexing the full value would take up a lot of space.

For example, when a large number of URLs must be stored and searched by URL, a B+ tree index on the url column itself would be very large:

select id from url where url=""

You can also hash the url with the CRC32 cyclic redundancy check, store the result in an indexed url_crc column, and query like this:

select id from url where url="" and url_crc=CRC32("")

This query performs well because the lookup goes through a small index on url_crc rather than an index on the full url.
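
The same idea as a Python sketch, using `zlib.crc32` as a stand-in for MySQL's CRC32() function. The URLs and ids are made up, and a dictionary plays the role of the index on url_crc. The full URL is still compared after the checksum lookup, because different URLs can share a checksum.

```python
import zlib
from collections import defaultdict

URLS = {1: "https://example.com/a", 2: "https://example.com/b"}  # id -> url, made up

# Small index keyed on the 4-byte checksum instead of the full URL string.
crc_index = defaultdict(list)
for row_id, url in URLS.items():
    crc_index[zlib.crc32(url.encode())].append(row_id)

def find_id(url):
    """Look up by checksum first, then compare the full URL to rule out collisions."""
    for row_id in crc_index.get(zlib.crc32(url.encode()), []):
        if URLS[row_id] == url:
            return row_id
    return None

print(find_id("https://example.com/b"))  # 2
```

This is why the query keeps the url="" condition alongside url_crc: the checksum narrows the search, and the string comparison confirms the match.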

7. Optimization details:

Join algorithms:

7.1 Nested-Loop Join

7.2 Index Nested-Loop Join (with index)

Distinguish the driving table from the driven table: for A join B, MySQL does not necessarily read A first and then B (the optimizer chooses the order), but you can write A straight_join B to force A to be read first and then B.
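
A small Python sketch of the two algorithms so far, with A as the driving table and B as the driven table. The rows are made up, and a dictionary stands in for an existing index on B's join column; the counters show how much work each approach does.

```python
A = [(1, "a1"), (2, "a2"), (3, "a3")]   # driving table: (id, value)
B = [(2, "b2"), (3, "b3"), (4, "b4")]   # driven table: (id, value)

def nested_loop_join(outer, inner):
    """Simple nested-loop join: compare every outer row with every inner row."""
    comparisons, result = 0, []
    for o in outer:
        for i in inner:
            comparisons += 1
            if o[0] == i[0]:
                result.append((o, i))
    return result, comparisons

def index_nested_loop_join(outer, inner):
    """Index nested-loop join: probe an index on the driven table's join column."""
    index = {}                           # stands in for an index that already exists
    for i in inner:
        index.setdefault(i[0], []).append(i)
    probes, result = 0, []
    for o in outer:
        probes += 1                      # one index probe per driving-table row
        for i in index.get(o[0], []):
            result.append((o, i))
    return result, probes

print(nested_loop_join(A, B)[1])        # 9
print(index_nested_loop_join(A, B)[1])  # 3
```

Both functions return the same two matched pairs; the indexed version replaces 9 row comparisons with 3 index probes.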

7.3 Block Nested-Loop Join (without index)

Disadvantages: the join buffer size is limited (join_buffer_size defaults to 256 KB).
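
Why the buffer size matters can be seen in a minimal Python sketch of a block nested-loop join. The tables are made up and `buffer_rows` is a hypothetical stand-in for how many driving-table rows fit in the join buffer.

```python
A = [(n, f"a{n}") for n in range(1, 6)]   # driving table, ids 1..5
B = [(n, f"b{n}") for n in range(3, 8)]   # driven table, ids 3..7, no index

def block_nested_loop_join(outer, inner, buffer_rows):
    """Load outer rows into a buffer, then scan the inner table once per buffer load."""
    inner_scans, result = 0, []
    for start in range(0, len(outer), buffer_rows):
        join_buffer = outer[start:start + buffer_rows]
        inner_scans += 1
        for i in inner:                   # one full scan of the driven table
            for o in join_buffer:         # compare against everything in the buffer
                if o[0] == i[0]:
                    result.append((o, i))
    return result, inner_scans

print(block_nested_loop_join(A, B, buffer_rows=2)[1])  # 3
print(block_nested_loop_join(A, B, buffer_rows=5)[1])  # 1
```

A smaller buffer means more full scans of the driven table, which is the cost of the size limit noted above.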

The difference between and (inside on) and where in a join:

- select * from t1 join t2 on t1.id=t2.id and t1.name="z";
- select * from t1 join t2 on t1.id=t2.id where t1.name="z";

A condition added with and inside on is part of the join condition: it decides which records of table A and table B are matched while the join is performed, and it takes the join type into account. In a left join, a record in the left table that does not meet the on condition is not joined to anything, but it still stays in the result set (its right-side columns are NULL).

In other words, the on condition is applied while the temporary joined table is generated, and a left join returns every record of the left table whether or not the on condition is true. The where condition filters that temporary table afterwards, so records that fail it are removed. For an inner join, the two queries return the same rows.
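
The left join case can be checked with a short Python sketch. The tables, the city column, and the `left_join` helper are made up for illustration; the two results model the same filter placed in on versus in where.

```python
T1 = [{"id": 1, "name": "z"}, {"id": 2, "name": "y"}]
T2 = [{"id": 1, "city": "A"}, {"id": 2, "city": "B"}]

def left_join(left, right, on):
    """Left join: every left row survives; unmatched rows get None for the right side."""
    out = []
    for lrow in left:
        matches = [rrow for rrow in right if on(lrow, rrow)]
        if matches:
            out.extend((lrow, rrow) for rrow in matches)
        else:
            out.append((lrow, None))
    return out

# Filter inside on: t1 left join t2 on t1.id = t2.id and t1.name = "z"
in_on = left_join(T1, T2, lambda a, b: a["id"] == b["id"] and a["name"] == "z")

# Filter in where: t1 left join t2 on t1.id = t2.id where t1.name = "z"
joined = left_join(T1, T2, lambda a, b: a["id"] == b["id"])
in_where = [(a, b) for a, b in joined if a["name"] == "z"]

print(len(in_on), len(in_where))  # 2 1
```

With the filter in on, the row with name "y" is kept with a NULL right side; with the filter in where, it is removed from the result.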

8. Index monitoring:

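Index usage can be monitored through the Handler_read% status counters, for example with show status like 'Handler_read%';

  • Handler_read_key: the number of times a row was read through an index. The higher the better, since it means indexes are being used to find data.
  • Handler_read_rnd_next: the number of requests to read the next row in the data file. A high value means many table scans, so the lower the better.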

LIMIT tuning for large data volumes (when the offset is large and only a few rows are fetched):

select * from rental a join (select id from rental limit 100000,5) b on a.id = b.id;      -- searching the primary key index tree is relatively fast
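
A rough Python model of why the rewritten query helps, with small numbers instead of 100000. The table contents and both helper names are invented, and the model assumes that a plain limit has to read every skipped row in full while the subquery only walks ids.

```python
ROWS = {i: {"id": i, "payload": "x" * 10} for i in range(1, 1001)}  # pk -> full row
ID_INDEX = sorted(ROWS)   # stands in for an index that holds only ids

def plain_limit(offset, count):
    """Model of select * ... limit offset, count: reads offset + count full rows."""
    full_reads, out = 0, []
    for pos, pk in enumerate(ID_INDEX):
        if pos >= offset + count:
            break
        row = ROWS[pk]                     # full row read, even for skipped rows
        full_reads += 1
        if pos >= offset:
            out.append(row)
    return out, full_reads

def deferred_join(offset, count):
    """Model of the join form: skip over ids only, then fetch just the rows needed."""
    ids = ID_INDEX[offset:offset + count]  # the skipped entries are ids, not rows
    return [ROWS[pk] for pk in ids], len(ids)

print(plain_limit(100, 5)[1])    # 105
print(deferred_join(100, 5)[1])  # 5
```

Both versions return the same 5 rows; the difference is how many full rows are touched along the way.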

Origin blog.csdn.net/The_xiaoke/article/details/124584623