Advanced MySQL: Indexes

Today I mainly study MySQL indexes. I use MySQL in a Linux environment to demonstrate the related index operations, and I cover the underlying structure of indexes, index classification and syntax, SQL performance analysis, index usage rules, and index design principles.

Table of contents

1. Index overview

2. Index structure

3. Index classification

4. Index syntax

5. SQL performance analysis

6. Index usage

7. Index Design Principles


To install MySQL on Linux, follow the installation guide for your distribution. After the installation is complete and MySQL has been started, you can use a visual management tool to connect to the remote MySQL server.

1. Index overview

Without an index, a query has to scan the whole table. With an index, the matching rows can be located quickly.

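Advantages and disadvantages of indexing: an index speeds up retrieval and lowers the cost of sorting, but it takes up storage space and slows down INSERT, UPDATE, and DELETE, because the index has to be maintained along with the table.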

2. Index structure

MySQL indexes are implemented in the storage engine layer, so different storage engines support different index structures.

When we talk about the index structure, we generally mean the B+ tree. All three common storage engines (InnoDB, MyISAM, and Memory) support B+ tree indexes. Memory also supports hash indexes, and MyISAM supports spatial indexes and full-text indexes. InnoDB has also supported full-text indexes since MySQL 5.6.

Let's take a look at the binary search tree first. In the extreme case where keys are inserted in sorted order, it degenerates into a linked list and retrieval becomes slow. A red-black tree, which is an approximately balanced binary tree, avoids this degeneration and improves retrieval speed, but it is still a binary tree: with a large amount of data the tree becomes very deep, so the performance is still not enough.

Before looking at the B+ tree, let's take a look at the B-tree. The order (maximum degree) of a B-tree is the maximum number of child pointers in a node, so each node can store up to (order - 1) keys. The keys in a node are sorted: the subtree to the left of a key holds smaller keys, and the subtree to its right holds larger keys. When an insertion leaves a node with more than (order - 1) keys, the node splits and its middle key moves up into the parent node.

The B+ tree differs from the B-tree in that every key appears in the leaf nodes, and the leaf nodes are linked together as a linked list. The non-leaf nodes only act as an index to guide the search, and the data is stored in the leaf nodes. When an insertion fills a node beyond its order, the node splits and the middle key moves up, but a copy of that key is kept in the leaf node, and the split leaf nodes remain linked by pointers.

Let's take a look at the B+ tree structure used by MySQL indexes. On the basis of the classic B+ tree, MySQL adds pointers between adjacent leaf nodes in both directions, forming a sorted, doubly linked list of leaf nodes that improves the performance of range access.

 

Next is the hash index. A hash function converts the key into a hash value, which maps to a slot in the hash table where the entry is stored. If two keys map to the same slot (a hash collision), the collision is resolved with a linked list: the new entry is appended to the list at that slot.

The characteristics of the hash index are as follows. It supports only equality comparison and does not support range queries. Its entries are unordered, so it cannot be used for sorting. Retrieval is generally very efficient because a lookup usually needs a single probe, but when there is a hash collision the linked list has to be searched, so the efficiency is not always higher than that of a B+ tree.
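As a small illustration of these characteristics, the Memory engine lets you choose the index structure explicitly. This is only a sketch: the table demo_session and its columns are hypothetical names, not taken from the article.

```sql
-- Hypothetical lookup table on the Memory engine, which supports hash indexes.
CREATE TABLE IF NOT EXISTS demo_session (
  token   CHAR(32) NOT NULL,
  user_id INT      NOT NULL,
  INDEX idx_token (token) USING HASH
) ENGINE = MEMORY;

-- Equality comparison: the hash index can be used.
EXPLAIN SELECT user_id FROM demo_session WHERE token = 'abc';

-- Range condition: a hash index cannot serve it, so expect a full scan.
EXPLAIN SELECT user_id FROM demo_session WHERE token > 'abc';
```

Comparing the key column of the two EXPLAIN results should show the difference between the equality lookup and the range condition.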

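Finally, let's look at an interview question: why does the InnoDB engine choose a B+ tree index instead of a binary tree, a red-black tree, or a B-tree? Compared with a binary tree or a red-black tree, a B+ tree has far fewer levels for the same amount of data, so a lookup needs fewer disk reads. Compared with a B-tree, its non-leaf nodes store only keys and pointers, so one page holds more keys and the tree stays shallower, and because all data sits in the linked leaf nodes, range queries and sorting are efficient. Compared with a hash index, a B+ tree supports range matching and sorting.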

3. Index classification

Indexes fall mainly into four categories: primary key index, unique index, regular index, and full-text index.

The following classification is also frequently asked about in interviews. In InnoDB, indexes can be divided by storage form into the clustered index and non-clustered (secondary) indexes. The clustered index stores the data together with the index: the leaf nodes of the index structure hold the row data. A non-clustered index stores the data separately from the index: its leaf nodes hold the corresponding primary key value.

Let's look at the clustered index and the non-clustered (secondary) indexes together with the figure. The primary key index is the clustered index, and its leaf nodes store the full row data. All other indexes are non-clustered indexes: the index is stored separately from the data, and each leaf node holds the primary key id of the corresponding row.

 

Index retrieval with a back-to-table lookup: the query first searches the non-clustered index to find the primary key value, and then uses that primary key value to fetch the row data from the clustered index.

Let's look at a thinking question: which is more efficient, querying by id or querying by name? Querying by id is more efficient, because it searches the clustered index directly and reads the row from the leaf node. Querying by name must first search the non-clustered index to find the id, and then go back to the table (the clustered index) to fetch the row data.
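The two query paths can be sketched as follows. The table demo_lookup and the index idx_name are hypothetical and only serve the illustration.

```sql
-- Hypothetical table: the primary key id is the clustered index,
-- idx_name is a secondary index whose leaf nodes hold (name, id).
CREATE TABLE IF NOT EXISTS demo_lookup (
  id   INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL,
  age  INT,
  KEY idx_name (name)
);

-- One lookup: search the clustered index and read the row from its leaf node.
SELECT * FROM demo_lookup WHERE id = 10;

-- Two lookups: find the id in idx_name, then fetch the row
-- from the clustered index (the back-to-table lookup).
SELECT * FROM demo_lookup WHERE name = 'Arm';
```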

4. Index syntax

Let's take a look at the basic syntax of indexes. Create an index with CREATE INDEX, delete an index with DROP INDEX, and view the indexes of a table with SHOW INDEX.
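A minimal sketch of the three statements is shown here. The table demo_syntax and all index names are made up for the example.

```sql
-- Hypothetical table used only to show the syntax.
CREATE TABLE IF NOT EXISTS demo_syntax (
  id    INT PRIMARY KEY AUTO_INCREMENT,
  name  VARCHAR(50),
  phone VARCHAR(11),
  age   INT
);

-- General form:
--   CREATE [UNIQUE | FULLTEXT] INDEX index_name ON table_name (col_name, ...);
CREATE INDEX idx_demo_name ON demo_syntax (name);            -- regular index
CREATE UNIQUE INDEX idx_demo_phone ON demo_syntax (phone);   -- unique index
CREATE INDEX idx_demo_name_age ON demo_syntax (name, age);   -- composite index

-- View the indexes of a table.
SHOW INDEX FROM demo_syntax;

-- Delete an index.
DROP INDEX idx_demo_name_age ON demo_syntax;
```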

5. SQL performance analysis

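1) SQL execution frequency: the statement SHOW GLOBAL STATUS LIKE 'Com_______' (Com followed by seven underscores) reports how often INSERT, UPDATE, DELETE, and SELECT have been executed on the server. Once you know whether the database is dominated by reads or by writes, you can choose an appropriate SQL optimization strategy.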
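For reference, the statement looks like this. The pattern relies on the fact that an underscore in LIKE matches exactly one character.

```sql
-- "Com" plus seven underscores: the first underscore matches the literal "_",
-- the other six match six-letter names such as select, insert, update, delete.
SHOW GLOBAL STATUS LIKE 'Com_______';

-- The same counters for the current session only.
SHOW SESSION STATUS LIKE 'Com_______';
```

The rows Com_select, Com_insert, Com_update, and Com_delete are the ones of interest. Other counters with six-letter names also match the pattern and can be ignored.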

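2) You can also locate inefficient SQL by enabling the slow query log, which is off by default. Enable it in the MySQL configuration file (for example /etc/my.cnf) by setting slow_query_log=1 and setting long_query_time to the threshold in seconds. Statements that run longer than the threshold are written to the slow query log.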
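The same settings can be inspected and, assuming sufficient privileges, changed at runtime. The threshold of 2 seconds below is only an example value.

```sql
-- Check whether the slow query log is enabled and where it is written.
SHOW VARIABLES LIKE 'slow_query_log%';
SHOW VARIABLES LIKE 'long_query_time';

-- Enable it at runtime. These settings are lost on restart unless they
-- are also written to the configuration file.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;   -- seconds; applies to new connections
```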

3) You can also use profiling to see where each SQL statement spends its time. SHOW PROFILES lists the time taken by recent statements, and SHOW PROFILE FOR QUERY query_id shows the time spent in each stage of the specified statement.
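A typical profiling session might look like the sketch below. It assumes a MySQL version in which the profiling feature is still available, since it is deprecated in recent releases, and the profiled query is an arbitrary example.

```sql
SELECT @@have_profiling;   -- YES if profiling is supported
SET profiling = 1;         -- enable profiling for the current session

-- Any statement you want to measure.
SELECT COUNT(*) FROM information_schema.tables;

SHOW PROFILES;                  -- recent statements with Query_ID and Duration
SHOW PROFILE FOR QUERY 1;       -- time per stage; replace 1 with a Query_ID
SHOW PROFILE CPU FOR QUERY 1;   -- also show CPU usage per stage
```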

4) Use EXPLAIN to view the execution plan of a statement, focusing on the type column. From best to worst, the access types are NULL, system, const, eq_ref, ref, range, index, and ALL. When optimizing, try to move the access type toward the better end and avoid ALL, which means a full table scan. Access through the primary key or a unique index gives const, and access through a non-unique index gives ref.
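The following sketch shows the kind of statements you would compare. The table demo_explain is hypothetical, and the access types in the comments are what one would expect on a populated table. On an empty or very small table the optimizer may report something different.

```sql
-- Hypothetical table with a primary key, a unique index, and a regular index.
CREATE TABLE IF NOT EXISTS demo_explain (
  id    INT PRIMARY KEY,
  phone VARCHAR(11) NOT NULL,
  name  VARCHAR(50),
  age   INT,
  UNIQUE KEY idx_phone (phone),
  KEY idx_name (name)
);

EXPLAIN SELECT * FROM demo_explain WHERE id = 1;                 -- expected: const
EXPLAIN SELECT * FROM demo_explain WHERE phone = '17799990015';  -- expected: const
EXPLAIN SELECT * FROM demo_explain WHERE name = 'Arm';           -- expected: ref
EXPLAIN SELECT * FROM demo_explain WHERE age = 30;               -- no index: ALL
```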

6. Index usage

Before relying on an index, first verify that performance actually improves after adding it: compare the query time before and after, and use the EXPLAIN keyword to check the execution plan of the specific SQL statement.

A composite index follows the leftmost prefix rule: the query must start from the leftmost column of the index. If the leftmost column is missing from the conditions, the composite index cannot be used at all. If a column in the middle is skipped, the index columns after the skipped column cannot be used.
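The rule can be checked with EXPLAIN, as in this sketch. The table demo_leftmost and its index are hypothetical, and the comments describe the behavior expected on a populated table.

```sql
-- Hypothetical table with a composite index on (profession, age, status).
CREATE TABLE IF NOT EXISTS demo_leftmost (
  id         INT PRIMARY KEY AUTO_INCREMENT,
  name       VARCHAR(50),
  profession VARCHAR(20),
  age        INT,
  status     CHAR(1),
  KEY idx_pro_age_sta (profession, age, status)
);

-- All three columns present: the whole index can be used.
EXPLAIN SELECT * FROM demo_leftmost
WHERE profession = 'Engineering' AND age = 31 AND status = '0';

-- Leftmost column only: the profession part of the index can be used.
EXPLAIN SELECT * FROM demo_leftmost WHERE profession = 'Engineering';

-- Leftmost column missing: the index cannot be used for the lookup.
EXPLAIN SELECT * FROM demo_leftmost WHERE age = 31 AND status = '0';

-- Middle column skipped: only the profession part can be used.
EXPLAIN SELECT * FROM demo_leftmost
WHERE profession = 'Engineering' AND status = '0';
```

The key_len column of the EXPLAIN output indicates how many of the index columns were actually used. The order of the conditions in the WHERE clause does not matter, only which columns are present.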

In a composite index, a range condition (> or <) on one column prevents the index columns to the right of that column from being used.
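A sketch of how to observe this, again with a hypothetical table (demo_range) and index:

```sql
-- Hypothetical table with a composite index on (profession, age, status).
CREATE TABLE IF NOT EXISTS demo_range (
  id         INT PRIMARY KEY AUTO_INCREMENT,
  name       VARCHAR(50),
  profession VARCHAR(20),
  age        INT,
  status     CHAR(1),
  KEY idx_pro_age_sta (profession, age, status)
);

-- Range on age: profession and age can use the index, status cannot.
EXPLAIN SELECT * FROM demo_range
WHERE profession = 'Engineering' AND age > 30 AND status = '0';

-- Equality on age: all three index columns can be used.
EXPLAIN SELECT * FROM demo_range
WHERE profession = 'Engineering' AND age = 30 AND status = '0';
```

Comparing key_len between the two plans should show a shorter used prefix for the range query.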

Avoid applying functions or calculations to an indexed column in a condition, because doing so prevents the index from being used.
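For example, under the assumption of a hypothetical table demo_func with an index on phone:

```sql
-- Hypothetical table with a regular index on phone.
CREATE TABLE IF NOT EXISTS demo_func (
  id    INT PRIMARY KEY,
  name  VARCHAR(50),
  phone VARCHAR(11) NOT NULL,
  KEY idx_phone (phone)
);

-- Plain comparison on the column: idx_phone can be used.
EXPLAIN SELECT * FROM demo_func WHERE phone = '17799990015';

-- Function applied to the column: idx_phone cannot be used for the lookup.
EXPLAIN SELECT * FROM demo_func WHERE SUBSTRING(phone, 10, 2) = '15';
```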

If a string column is compared with a value that is not enclosed in single quotes, MySQL performs an implicit type conversion and the index cannot be used.
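A minimal sketch of the difference, using a hypothetical table demo_quote:

```sql
-- Hypothetical table: phone is a string column with an index.
CREATE TABLE IF NOT EXISTS demo_quote (
  id    INT PRIMARY KEY,
  name  VARCHAR(50),
  phone VARCHAR(11) NOT NULL,
  KEY idx_phone (phone)
);

-- String literal: idx_phone can be used.
EXPLAIN SELECT * FROM demo_quote WHERE phone = '17799990015';

-- Numeric literal: the column is implicitly converted, so expect a full scan.
EXPLAIN SELECT * FROM demo_quote WHERE phone = 17799990015;
```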

With LIKE, a pattern that only ends with % can still use the index, but a pattern that starts with % causes the index to fail.
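The three typical patterns can be compared like this. The table demo_like is hypothetical.

```sql
-- Hypothetical table with an index on profession.
CREATE TABLE IF NOT EXISTS demo_like (
  id         INT PRIMARY KEY,
  name       VARCHAR(50),
  profession VARCHAR(20),
  KEY idx_profession (profession)
);

-- Trailing wildcard: the index can be used as a range scan.
EXPLAIN SELECT * FROM demo_like WHERE profession LIKE 'Soft%';

-- Leading wildcard: the index cannot be used.
EXPLAIN SELECT * FROM demo_like WHERE profession LIKE '%ware';
EXPLAIN SELECT * FROM demo_like WHERE profession LIKE '%oft%';
```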

With conditions joined by OR, if the column on either side of the OR has no index, none of the indexes involved will be used. The indexes take effect only when the columns on both sides are indexed.
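A sketch of the effect, assuming a hypothetical table demo_or in which age initially has no index:

```sql
-- Hypothetical table: id is the primary key, age has no index yet.
CREATE TABLE IF NOT EXISTS demo_or (
  id   INT PRIMARY KEY,
  name VARCHAR(50),
  age  INT
);

-- age is not indexed, so the primary key cannot help either: expect ALL.
EXPLAIN SELECT * FROM demo_or WHERE id = 10 OR age = 23;

-- After indexing age, both sides of the OR are indexed.
CREATE INDEX idx_age ON demo_or (age);
EXPLAIN SELECT * FROM demo_or WHERE id = 10 OR age = 23;
```

On a populated table, the second plan can combine both indexes instead of scanning the whole table.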

If MySQL estimates that a full table scan is faster than using the index, for example when the condition matches most of the rows in the table, it will not use the index.

Let's take a look at SQL hints, which are an important means of optimizing the database. By adding a hint to a statement you can suggest an index (USE INDEX), exclude an index (IGNORE INDEX), or force the use of an index (FORCE INDEX).
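The three hints are written directly after the table name. The table demo_hint and its indexes below are hypothetical.

```sql
-- Hypothetical table with two candidate indexes for the same condition.
CREATE TABLE IF NOT EXISTS demo_hint (
  id         INT PRIMARY KEY,
  profession VARCHAR(20),
  age        INT,
  KEY idx_pro (profession),
  KEY idx_pro_age (profession, age)
);

-- Suggest an index (MySQL may still decide otherwise).
EXPLAIN SELECT * FROM demo_hint USE INDEX (idx_pro)
WHERE profession = 'Engineering';

-- Exclude an index from consideration.
EXPLAIN SELECT * FROM demo_hint IGNORE INDEX (idx_pro)
WHERE profession = 'Engineering';

-- Force the use of an index.
EXPLAIN SELECT * FROM demo_hint FORCE INDEX (idx_pro)
WHERE profession = 'Engineering';
```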

Let's take a look at a thinking question. How should we optimize the statement SELECT id, username, password FROM table_name WHERE username = 'Zhang San'? The optimal solution is to build a composite index on (username, password). The leaf nodes of a secondary index store the primary key id, so this index already contains all three selected columns. It is a covering index for the query, which avoids the back-to-table lookup.
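This can be verified with EXPLAIN, as sketched here. The table demo_account, the extra email column, and the index name are assumptions made for the example.

```sql
-- Hypothetical table with a composite index on (username, password).
CREATE TABLE IF NOT EXISTS demo_account (
  id       INT PRIMARY KEY AUTO_INCREMENT,
  username VARCHAR(50) NOT NULL,
  password VARCHAR(64) NOT NULL,
  email    VARCHAR(100),
  KEY idx_username_password (username, password)
);

-- All selected columns are in the index (id is stored in its leaf nodes),
-- so the Extra column is expected to show "Using index".
EXPLAIN SELECT id, username, password
FROM demo_account WHERE username = 'Zhang San';

-- email is not in the index, so this query needs a back-to-table lookup.
EXPLAIN SELECT id, username, password, email
FROM demo_account WHERE username = 'Zhang San';
```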

For long string columns, use a prefix index, which indexes only the first n characters, to save index space and improve index efficiency. The prefix length is chosen according to index selectivity: selectivity = number of distinct index values / total number of records in the table. The higher the selectivity (the maximum is 1), the higher the query efficiency of the index.
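One way to choose the prefix length is to compare the selectivity of several candidate lengths with that of the full column. The table demo_prefix, the email column, and the lengths 5 and 10 are only illustrative.

```sql
-- Hypothetical table with a long string column.
CREATE TABLE IF NOT EXISTS demo_prefix (
  id    INT PRIMARY KEY AUTO_INCREMENT,
  email VARCHAR(100) NOT NULL
);

-- Selectivity of the full column.
SELECT COUNT(DISTINCT email) / COUNT(*) AS sel_full FROM demo_prefix;

-- Selectivity of two candidate prefix lengths.
SELECT COUNT(DISTINCT SUBSTRING(email, 1, 5))  / COUNT(*) AS sel_5,
       COUNT(DISTINCT SUBSTRING(email, 1, 10)) / COUNT(*) AS sel_10
FROM demo_prefix;

-- Create the prefix index once a short prefix is selective enough.
CREATE INDEX idx_email_5 ON demo_prefix (email(5));
```

The shortest prefix whose selectivity is close to that of the full column is usually a reasonable choice.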

If a query has multiple conditions, it is recommended to build one composite index over the queried columns instead of a separate single-column index on each of them. With several single-column indexes, MySQL evaluates which index is the most efficient and typically uses only that one, so the remaining conditions cannot benefit from an index.

 

7. Index Design Principles

Seven principles of index design:

1) Build indexes on tables that hold a large amount of data and are queried frequently.

2) Build indexes on columns that are often used as query, grouping, and sorting conditions.

3) Build indexes on columns with high selectivity.

4) For long string columns, use prefix indexes.

5) Prefer composite indexes to single-column indexes.

6) Control the number of indexes, because every extra index adds maintenance cost to inserts, updates, and deletes.

7) If an indexed column cannot hold NULL values, declare it with a NOT NULL constraint when creating the table.


Origin blog.csdn.net/nuist_NJUPT/article/details/129151105