Performance Optimization Topics - MySQL Performance Optimization - 01 - MySQL Index Mechanism

Preface

The performance optimization topic consists of four parts.

This section belongs to the second part of the topic, MySQL performance optimization, which has four subsections:

  1. MySQL index mechanism
  2. MySQL operating mechanism
  3. In-depth understanding of InnoDB
  4. MySQL tuning

Highlights of this section:

➢ Who implements the index
➢ The definition of the index
➢ Why choose B+Tree
➢ How B+Tree is implemented in the two engines

Introduction to MySQL Indexes

Use of indexes

For the basic syntax of indexes (creating, dropping, modifying, and querying them), see this blog post:

The use of mysql index [reproduced]

Indexes are divided into:

  • Single-column index: the key in each node holds one column, e.g. [name]
  • Joint index: the key in each node holds several columns, e.g. [name, phone]

A single-column index is a special case of a joint index.

Covering index: if every column a query needs can be returned directly from the keys in the index nodes, the index is called a covering index for that query. A covering index avoids the extra read of the full row, which reduces database IO and can turn random IO into sequential IO, improving query performance.

Joint index selection principles:

  1. Frequently used columns come first [leftmost matching principle]
  2. Columns with high selectivity (dispersion) are preferred [principle of high dispersion]
  3. Columns with smaller width are preferred [minimum space principle]
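The high-dispersion principle can be made concrete by measuring selectivity, that is, the number of distinct values in a column divided by the number of rows. The following is a minimal Python sketch rather than MySQL code; the sample rows and the `selectivity` helper are hypothetical and only illustrate the ratio.

```python
def selectivity(rows, column):
    """Return distinct values / total rows for one column (1.0 = all unique)."""
    if not rows:
        return 0.0
    distinct = {row[column] for row in rows}
    return len(distinct) / len(rows)


# Hypothetical sample data, for illustration only.
rows = [
    {"name": "zhangsan", "phone": "13500000001", "sex": "male"},
    {"name": "lisi", "phone": "13500000002", "sex": "female"},
    {"name": "wangwu", "phone": "13500000003", "sex": "male"},
    {"name": "zhangsan", "phone": "13500000004", "sex": "female"},
]

# Rank the columns so that the most selective one comes first.
ranked = sorted(["name", "phone", "sex"],
                key=lambda c: selectivity(rows, c),
                reverse=True)
print(ranked)
```

In MySQL itself the same ratio is commonly estimated with count(distinct col) / count(*) on the table in question.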

Data structure tree

For tree-related data structures, here is a collection of reposted articles on binary trees, balanced binary trees, red-black trees, B+ trees, and binary tree traversal algorithms. Readers who are not familiar with these topics should master them first in order to understand the MySQL index mechanism that follows.

1. Binary tree traversal algorithm [Reprinted]

  • Preorder traversal
  • In-order traversal
  • Post-order traversal

2. Binary Tree and Balanced Binary Tree [Reprinted]

Defects of binary tree storage:

  • Sequential storage may waste space (when the tree is not a complete binary tree), but reading a node at a known position is efficient: O(1)
  • Chained storage wastes less space than sequential storage, but reading a specified node is less efficient because the links must be followed from the root: O(log n) in a balanced tree, O(n) in the worst case
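The space trade-off of sequential storage can be seen in a small Python sketch. It assumes the usual array layout in which the children of the node at index i sit at indices 2i+1 and 2i+2; the helper names `children` and `parent` are illustrative only.

```python
def children(i):
    """Indices of the left and right child of node i in array storage."""
    return 2 * i + 1, 2 * i + 2


def parent(i):
    """Index of the parent of node i (the root, index 0, has none)."""
    return (i - 1) // 2 if i > 0 else None


# A degenerate, right-leaning tree 1 -> 2 -> 3 stored sequentially.
# None marks an empty slot: 7 slots are reserved for only 3 nodes.
tree = [1, None, 2, None, None, None, 3]

right_of_root = children(0)[1]            # index arithmetic, no traversal
right_right = children(right_of_root)[1]
wasted = tree.count(None)
print(tree[right_of_root], tree[right_right], wasted)
```

Any slot is reached by index arithmetic alone, which is why the read is constant time, while the empty slots show the space that a non-complete tree wastes.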

3. Balanced binary tree and red-black tree [reproduced]

4. B tree and B+ tree [reproduced]

5. [Graphic] The essential difference and application scenarios of red-black tree, B tree, and B+ tree [Reproduced]

The leaf node is a concept from discrete mathematics. A node in a tree that has no child nodes (that is, whose degree is 0) is called a leaf node, or "leaf" for short. It is also called a terminal node.

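#include <stddef.h>	/* NULL */

/* Count the leaf nodes of a binary tree. BiTree is assumed to be a pointer
   to a node type with lchild and rchild members. */
int leaf(BiTree root) {
	if (root == NULL)
		return 0;	/* an empty subtree has no leaves */
	if (root->lchild == NULL && root->rchild == NULL)
		return 1;	/* no children: this node is a leaf */
	/* Otherwise the leaves are those of the left and right subtrees. */
	return leaf(root->lchild) + leaf(root->rchild);
}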

Index

Creating appropriate indexes correctly is the basis for improving database query performance.

What is an index?

An index is a separately stored data structure created to speed up the retrieval of data rows in a table.

Advantages of indexing:

  • Greatly reduces the amount of data that the storage engine needs to scan
  • Can turn random IO into sequential IO
  • Using an index for grouping and sorting can avoid the use of temporary tables

MySQL default data structure

A hash lookup has O(1) time complexity, while a B+Tree lookup is O(log N). So why does MySQL choose B+Tree as its default index data structure?

An online data structure visualization tool shows how a binary tree is laid out:
[Figure: layout of a binary tree]

For an equality query such as select * from table where id=45, a hash lookup works well. For a range query such as select * from table where id<6, a hash is of little use: hashed keys are not kept in order, so the search becomes something like a "full table scan". A plain binary tree is a poor fit as well, because its height is uncontrollable (as shown above). The height of a B+Tree is controllable, and in MySQL it is usually 3 to 5 levels. Note: a B+Tree stores data only in its leaf nodes, and the leaf nodes point to each other in a linked list.
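Why the height stays small can be checked with rough arithmetic. The Python sketch below rests on simplifying assumptions that are not from this article: a 16 KB page (InnoDB's default page size), an internal entry of about 14 bytes (an 8-byte key plus a 6-byte child pointer), and rows of about 1 KB. It ignores page headers and fill factor, so the numbers are only an upper-bound estimate.

```python
PAGE_SIZE = 16 * 1024   # InnoDB's default page size is 16 KB
ENTRY_SIZE = 8 + 6      # assumed: 8-byte key + 6-byte child pointer
ROW_SIZE = 1024         # assumed: about 1 KB per row


def max_rows(height):
    """Rough upper bound on the rows a B+Tree of the given height can hold."""
    fanout = PAGE_SIZE // ENTRY_SIZE        # child pointers per internal page
    rows_per_leaf = PAGE_SIZE // ROW_SIZE   # rows per leaf page
    return fanout ** (height - 1) * rows_per_leaf


for height in (1, 2, 3, 4):
    print(height, max_rows(height))
```

Under these assumptions a tree of height 3 already covers roughly twenty million rows, so a lookup touches only a handful of pages.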

  • B+ trees are better at scanning whole databases and tables
  • B+ trees have stronger disk read and write capabilities
  • B+ trees have stronger sorting ability
  • B+ tree query efficiency is more stable
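The difference between a hash and an ordered structure for range queries can be sketched in Python. This is an analogy, not MySQL internals: a dict stands in for a hash index, and a sorted list stands in for the ordered, linked leaf level of a B+Tree. All names are illustrative.

```python
from bisect import bisect_left

ids = [45, 3, 17, 5, 1, 60, 4]

# Hash-style index: key -> row position. Equality is a single probe.
hash_index = {key: pos for pos, key in enumerate(ids)}

# Ordered index: keys kept sorted, like the linked leaf level of a B+Tree.
ordered_index = sorted(ids)


def hash_range_less_than(limit):
    """A hash keeps no key order, so every entry must be examined."""
    return sorted(key for key in hash_index if key < limit)


def ordered_range_less_than(limit):
    """Binary-search the boundary, then read the keys before it in order."""
    return ordered_index[:bisect_left(ordered_index, limit)]


print(hash_index[45])               # equality lookup: one probe
print(ordered_range_less_than(6))   # range lookup: one search + ordered read
```

Both range functions return the same keys, but only the ordered version avoids visiting every entry and returns the result already sorted.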

How B+Tree indexes are implemented in MySQL

MyISAM engine

MyISAM engine (non-clustered index)

[Figure: how the B+ tree index is represented in MySQL]
If a table is created with the MyISAM engine, Create table user (…..), three files are actually generated:

  • user.myi index file
  • user.myd data file
  • user.frm table structure definition

When we execute select * from user where id = 1, the process is as follows:

  1. Check whether the myi file of the table has an index tree built on id.

  2. Search this index tree down to the leaf node for the id value and read the data address stored there. (The leaf node stores the index key and the data address.)

  3. Find the corresponding row in the myd file according to the data address and return it.
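The three steps above can be sketched in Python. This is a toy model under stated assumptions: a list stands in for the myd data file, list positions stand in for data addresses, and a dict stands in for the id index tree in the myi file (a real index is a B+Tree). The names are hypothetical.

```python
# Stand-in for user.myd: rows stored at "addresses" (list positions).
myd_rows = [
    {"id": 3, "name": "wangwu"},
    {"id": 1, "name": "zhangsan"},
    {"id": 2, "name": "lisi"},
]

# Stand-in for the id index in user.myi: key -> address in the data file.
myi_id_index = {row["id"]: address for address, row in enumerate(myd_rows)}


def myisam_select_by_id(user_id):
    # Steps 1-2: search the index and read the data address from the leaf.
    address = myi_id_index.get(user_id)
    if address is None:
        return None
    # Step 3: fetch the row from the data file at that address.
    return myd_rows[address]


print(myisam_select_by_id(1))
```

The point of the model is that the index holds only addresses, so the row itself always lives in a separate file.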

InnoDB engine

InnoDB engine (clustered index)

If a table is created with the InnoDB engine, Create table user (…..), two files are actually generated:

  • user.ibd data and index file
  • user.frm table structure definition

Because InnoDB organizes the table itself as an index on the primary key, no separate index file such as myi is required.

The biggest difference from MyISAM is that the leaf nodes store the entire row rather than an address. (The leaf nodes store the primary key and the row data.)

If you then create an index on another column, such as name, InnoDB builds a separate index tree keyed by name (its leaf nodes store the name value and the primary key value).

When you execute select * from user where name = 'zhangsan', the process is as follows:

  1. Find the name index tree

  2. Search the tree by the value of name and read the name value and primary key value from the leaf node

  3. Use the primary key value to search the primary key index tree down to the leaf node and read the row data
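The two-tree lookup above can be sketched in Python. This is a toy model, not InnoDB code: dicts stand in for the two B+Trees, name values are assumed to be unique, and the function and variable names are hypothetical. The sketch also shows the covering-index case, where the name index alone already contains every requested column and step 3 is skipped.

```python
# Clustered (primary key) index: the leaf holds the whole row.
primary_index = {
    1: {"id": 1, "name": "zhangsan", "phone": "13500000001"},
    2: {"id": 2, "name": "lisi", "phone": "13500000002"},
}

# Secondary index on name: the leaf holds the primary key, not the row.
name_index = {"zhangsan": 1, "lisi": 2}


def innodb_select_by_name(name, columns=("id", "name", "phone")):
    """Return (row, list of index trees visited)."""
    visited = ["name_index"]
    pk = name_index.get(name)                 # steps 1-2
    if pk is None:
        return None, visited
    from_index = {"id": pk, "name": name}
    if set(columns) <= set(from_index):
        # Covering case: the secondary index already has all the columns.
        return {c: from_index[c] for c in columns}, visited
    visited.append("primary_index")           # step 3
    row = primary_index[pk]
    return {c: row[c] for c in columns}, visited


print(innodb_select_by_name("zhangsan"))
print(innodb_select_by_name("lisi", columns=("id",)))
```

The second call never touches the primary key tree, which is the saving a covering index provides.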

The difference between the MyISAM engine and the InnoDB engine

  • MyISAM: supports full-text indexes; does not support transactions; uses table-level locks; stores the exact number of rows in the table.
  • InnoDB: full-text indexes are available only from 5.6 onward; supports transactions; uses row-level locks; does not store the exact number of rows in the table.

Without transactions and with many count queries, MyISAM is a suitable choice. When high reliability is required, use InnoDB. In general, the InnoDB engine is recommended.

Adding an index can greatly improve query speed, but more indexes are not always better: they take up storage space, and they slow down write operations. Usually, indexes are built only on columns that are queried frequently and that have many distinct values.

For example, select * from user where sex = "female" does not need an index, because gender has only two values and the index would filter out little.
select * from user where user_id = 1995 does need an index, because user_id has a very large number of distinct values.

Closing notes

Tips in this section:

  1. Keep the data length of indexed columns as small as possible.
  2. More indexes, or more comprehensive indexes, are not necessarily better; create them where appropriate.
  3. Matching a column prefix can use the index (like '999%'); like '%999%' and like '%999' cannot use the index.
  4. The not in and <> operations in the where condition generally cannot use the index.
  5. Matching a range of values and order by can also use the index.
  6. Query specific columns and return only the data you need; avoid select *.
  7. If a query on a joint index does not start from the leftmost column of the index, the index cannot be used.
  8. In a joint index, an exact match on the leftmost column combined with a range match on the next column can use the index.
  9. In a joint index, if the query has a range condition on one column, the columns to its right cannot use the index.
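Tips 7 and 8 follow from how a joint index is ordered. The Python sketch below is an analogy rather than MySQL code: a sorted list of (name, phone) tuples stands in for a joint index on [name, phone], and the function names are hypothetical.

```python
from bisect import bisect_left

# Joint index on (name, phone): entries sorted by name, then by phone.
index = sorted([
    ("lisi", "13500000002"),
    ("wangwu", "13500000003"),
    ("zhangsan", "13500000001"),
    ("zhangsan", "13500000004"),
])


def scan_from(start, name):
    """Read consecutive entries from position start while the name matches."""
    matches = []
    while start < len(index) and index[start][0] == name:
        matches.append(index[start])
        start += 1
    return matches


def find_by_name(name):
    """Leftmost column given: binary search to the first match, then scan."""
    return scan_from(bisect_left(index, (name,)), name)


def find_by_name_and_phone_from(name, min_phone):
    """Exact match on name plus a range on phone still uses the sort order."""
    return scan_from(bisect_left(index, (name, min_phone)), name)


def find_by_phone(phone):
    """Leftmost column missing: phones are not globally ordered, so scan all."""
    return [entry for entry in index if entry[1] == phone]


print(find_by_name("zhangsan"))
print(find_by_name_and_phone_from("zhangsan", "13500000002"))
print(find_by_phone("13500000003"))
```

The first two functions can jump straight to the right position because the list is sorted by name first; the third has no such starting point and must look at every entry.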

Reference link:

Understand the underlying B+tree index mechanism of Mysql

For more architectural knowledge, please pay attention to this series of articles : The growth path of Java architects


Origin blog.csdn.net/qq_34361283/article/details/112237736