MySQL Performance Optimization (2) Index

optimization means

  1. Is it true that the more indexes a table has, the better?
  2. Why is a gender column usually not indexed?
  3. Why is it not recommended to use an ID card number as the primary key?
  4. Do the fuzzy matches like 'xx%', like '%xx%' and like '%xx' use an index?
  5. Why is select * discouraged?

Preparation

create table user_innodb
(
    id int not null primary key,
    username varchar(255) null,
    gender char(1) null,
    phone char(11) null
) ENGINE=InnoDB;

create table user_myisam
(
    id int not null primary key,
    username varchar(255) null,
    gender char(1) null,
    phone char(11) null
) ENGINE=MyISAM;

create table user_memory
(
    id int not null primary key,
    username varchar(255) null,
    gender char(1) null,
    phone char(11) null
) ENGINE=MEMORY;

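-- Generate about 5 million rows. The cross join only serves as a row source;
-- join INFORMATION_SCHEMA.TABLES a third time if it yields too few rows.
SET @i = 1;
INSERT INTO user_innodb (id, username, gender, phone)
SELECT @i := @i + 1 AS id,
       CONCAT('user', LPAD(@i, 7, '0')) AS username,  -- width 7: LPAD truncates values longer than the width
       IF(FLOOR(RAND() * 2) = 0, '1', '0') AS gender,
       CONCAT('1', LPAD(FLOOR(RAND() * 10000000000), 10, '0')) AS phone
FROM   INFORMATION_SCHEMA.TABLES,
       INFORMATION_SCHEMA.TABLES AS t2
WHERE  @i < 5000000;

select max(id) from user_innodb;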

Example

-- Query time without an index
select * from user_innodb where username = 'huathy';
> OK
> Time: 5.872s
-- Add an index on the username column
alter table user_innodb add index idx_user_innodb_name(username);
-- Query time for the same lookup once it uses the index
select * from user_innodb where username = 'huathy';
> OK
> Time: 0.017s

The nature of indexing

Database index: a sorted data structure maintained by the database management system to speed up queries.

  • By number of columns: single-column index, joint (composite) index
  • Index types: normal; spatial; unique (NULL values allowed); primary key (unique and NOT NULL); fulltext (for large text fields; word segmentation works poorly for Chinese, so Elasticsearch is the usual replacement)
  • Index methods: B+ tree, hash

Index data structures

  1. Binary search tree: a linked structure that supports binary search.
    The nodes of the left subtree are smaller than the parent node, and the nodes of the right subtree are larger than the parent node.

    The binary search tree has an extreme case. When every inserted node is larger than its parent (for example, keys arrive in ascending order), the tree degenerates into a linked list and lookups become linear.
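
The degeneration described above can be reproduced with a small sketch. The `Node`, `insert`, `height` and `build` names are made up for this illustration, and the tree deliberately does no rebalancing:

```python
# Minimal, unbalanced binary search tree used only to illustrate degeneration.
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key without any rebalancing and return the root."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                return root
            cur = cur.right


def height(root):
    """Number of levels in the tree, counted level by level."""
    levels, frontier = 0, [root] if root else []
    while frontier:
        levels += 1
        frontier = [c for n in frontier for c in (n.left, n.right) if c]
    return levels


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


# 100 keys in ascending order: every node becomes a right child.
sorted_height = height(build(range(1, 101)))
# 7 keys inserted level by level: the tree stays perfectly balanced.
mixed_height = height(build([50, 25, 75, 12, 37, 62, 87]))
print(sorted_height, mixed_height)
```

With ascending keys the tree is as deep as the number of keys, which is exactly the linked-list case; a balanced tree avoids this at the cost of rotations.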


  2. Balanced binary tree (AVL tree): the absolute difference in depth between the left and right subtrees of any node cannot exceed 1.
    Left-left imbalance -> right rotation; right-right imbalance -> left rotation.


  3. Multi-way balanced search tree (B-tree):
    It stays balanced by splitting and merging nodes. In InnoDB these are page splits and page merges.
    If keys are inserted out of order, the resulting splits fragment the data on disk, so an ordered id is preferable as the key.

  4. B+ tree: an enhanced version of the multi-way balanced search tree.
    All data is stored in the leaf nodes, and the leaf nodes are connected by two-way pointers to form a linked list.

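Advantages of the B+ tree:

  • Like the B-tree, it solves the AVL tree's problem of nodes that hold too little data and a tree that is too deep: each node holds many keys, so the tree stays shallow.
  • Faster full-table scans: only the leaf level has to be traversed.
  • Fewer I/O operations per lookup, so better disk read and write performance.
  • Better sorting and range queries, because the leaf nodes are linked in key order.
  • More stable efficiency: every lookup goes from the root to a leaf, so the number of I/Os is the same.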
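
A back-of-the-envelope calculation shows why a shallow B+ tree is enough. The 16 KB page size is the InnoDB default; the key, pointer and row sizes below are assumptions chosen for illustration, and page headers are ignored:

```python
import math

PAGE_SIZE = 16 * 1024      # InnoDB default page size in bytes
KEY_BYTES = 8              # assumed: BIGINT primary key
POINTER_BYTES = 6          # assumed: size of a child-page pointer
ROW_BYTES = 1024           # assumed: average row size

fanout = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)   # children per internal page
rows_per_leaf = PAGE_SIZE // ROW_BYTES              # rows per leaf page


def capacity(levels):
    """Rows reachable in a tree with `levels` levels (the last level holds rows)."""
    return fanout ** (levels - 1) * rows_per_leaf


three_level_rows = capacity(3)
# Levels a perfectly balanced binary tree would need for the same number of keys.
binary_levels = math.ceil(math.log2(three_level_rows + 1))

print(fanout, rows_per_leaf, three_level_rows, binary_levels)
```

Under these assumptions three page reads reach any of roughly 21.9 million rows, while a balanced binary tree over the same keys would need about 25 levels.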


Why doesn't MySQL use a red-black tree as the index data structure? A red-black tree only guarantees that the maximum depth is no more than twice the minimum depth, so it is not balanced enough, and as a binary tree it is too deep for an on-disk data structure. It is suitable for in-memory use. Its rules are:

  • Nodes are classified as red or black.
  • The root node must be black.
  • The leaf nodes are all black NULL nodes.
  • Both children of a red node are black (two adjacent red nodes are not allowed).
  • Starting from any node, the path to each leaf node contains the same number of black nodes.

  5. Hash index: an equality lookup takes O(1) time on average.
    Hashed data is inherently unordered, so range comparisons and sorting are expensive, and hash collisions are inevitable.
    This index type cannot be created explicitly in InnoDB, but it can be used in other engines such as the MEMORY engine.
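
The trade-off between a hash index and an ordered index can be sketched with a `dict` and a sorted list. This is an analogy, not how MySQL stores either structure, and the phone numbers are sample values:

```python
import bisect

phones = ["13100000003", "13100000001", "13100000004", "13100000002"]

# Hash-style index: value -> row position. An equality lookup is a single probe.
hash_index = {phone: pos for pos, phone in enumerate(phones)}
equal_hit = hash_index["13100000004"]

# A range query gets no help from the hash: every key has to be checked.
range_by_hash = sorted(p for p in hash_index if "13100000002" <= p <= "13100000003")

# Ordered index (what the B+ tree leaf level provides): binary search for the
# lower and upper bounds, then read the slice in between.
ordered = sorted(phones)
lo = bisect.bisect_left(ordered, "13100000002")
hi = bisect.bisect_right(ordered, "13100000003")
range_by_order = ordered[lo:hi]

print(equal_hit, range_by_hash, range_by_order)
```

Both range queries return the same values, but the hash version touches every key while the ordered version touches only the matching slice.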

Practice of Indexing in Different Storage Engines

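MyISAM (primary key and other indexes have the same structure; indexes are stored in the .MYI file, data in the .MYD file, and index leaf nodes hold the address of the row in the data file)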

primary key index


other indexes


InnoDB (data is index, index is data)

Index and data are stored in one file. The leaf nodes of its B+ tree store data directly.

Primary key index - clustered index

Leaf nodes store data

clustered index

If the order of the index key values is the same as the physical storage order of the data rows, the index is called a clustered index.

other indexes

Leaf nodes store primary keys.

image.png
Question: Why does a secondary index store the primary key of the row instead of its address?
Because rows are added and deleted and the B+ tree splits and merges pages, the address of a row can change, while its primary key does not.
Back to the table: after finding an entry in the secondary index, MySQL has to look up the full row in the primary key index using the primary key. The longest red line in the figure indicates this back-to-table operation.

What if there is no primary key?

Official answer: MySQL :: MySQL 5.7 Reference Manual :: 14.6.2.1 Clustered and Secondary Indexes

If the table has a primary key, InnoDB uses it as the clustered index. If there is no primary key, it uses the first unique index whose columns are all NOT NULL. If there is neither, InnoDB generates a hidden clustered index on a row ID.

-- However, when I ran this query I got the following error:
-- 1054 - Unknown column '_rowid' in 'field list'
select _rowid from test ;

An explanation is found here: https://blog.csdn.net/u011196295/article/details/88030451

When a table is created without an explicitly defined primary key:

  1. InnoDB first checks whether the table has a NOT NULL unique index. If so, that index becomes the clustered index; when it is on a single integer column, select _rowid from table returns that column.
  2. If there is no qualifying index, a 6-byte hidden primary key is created automatically (it cannot be queried).

Index Creation and Usage Principles

Are more indexes better?

No. Indexes take up disk space (they trade space for time), and every index has to be maintained on each insert, update and delete.

Selectivity (dispersion) of a column: count(distinct(column_name)) / count(*)

Which of gender and phone has the higher selectivity? The phone column.
There is no need to build an index on a column with low selectivity. Such an index matches a large share of the rows, and each match needs a lookup back to the table, so using it can reduce performance.
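
To make the ratio concrete, here is a small sketch that computes it over generated sample data shaped like user_innodb. The `rows` list and the `selectivity` helper are hypothetical; on the real table the same ratio would come from the count(distinct ...) / count(*) expression above:

```python
import random

random.seed(42)  # fixed seed so the sample is reproducible

# Hypothetical sample: gender is '0' or '1', phone is '1' plus 10 random digits.
rows = [
    {
        "gender": random.choice("01"),
        "phone": "1" + "".join(random.choices("0123456789", k=10)),
    }
    for _ in range(10_000)
]


def selectivity(rows, column):
    """count(distinct column) / count(*), a value between 0 and 1."""
    return len({r[column] for r in rows}) / len(rows)


gender_sel = selectivity(rows, "gender")
phone_sel = selectivity(rows, "phone")
print(f"gender: {gender_sel:.4f}, phone: {phone_sel:.4f}")
```

A value close to 1 means almost every row has its own key, so an index narrows the search well; a value close to 0 means each key matches a large part of the table.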

The leftmost matching principle of the joint index

A joint index can only be used starting from its first column, and the columns used must be contiguous (no column may be skipped). It is recommended to put the most frequently queried column on the left.

alter table user_innodb add index comidx_name_phone(username,phone);

EXPLAIN select * from user_innodb t where t.phone = '13603108202' and t.username='huathy';  -- uses the index (the optimizer reorders the conditions)
EXPLAIN select * from user_innodb t where t.username='huathy' and t.phone = '13603108202';  -- uses the index
EXPLAIN select * from user_innodb t where t.username='huathy';      -- uses the index
EXPLAIN select * from user_innodb t where t.phone = '13603108202';  -- does not use the index
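
Why the last query cannot seek in the index becomes clear when the (username, phone) index is pictured as a sorted list of tuples. The sketch below uses made-up entries and helper names and stands in for the B+ tree leaf level:

```python
import bisect

# Entries of a hypothetical (username, phone) index, kept in sorted order.
index = sorted([
    ("alice", "13600000001"),
    ("bob", "13600000003"),
    ("bob", "13600000007"),
    ("carol", "13600000002"),
    ("dave", "13600000003"),
])


def seek_username(name):
    """Binary search works: entries with the same username are adjacent."""
    lo = bisect.bisect_left(index, (name,))
    hi = bisect.bisect_left(index, (name, chr(0x10FFFF)))
    return index[lo:hi]


def seek_username_phone(name, phone):
    """Both columns given: one binary search on the full key."""
    pos = bisect.bisect_left(index, (name, phone))
    if pos < len(index) and index[pos] == (name, phone):
        return [index[pos]]
    return []


def scan_phone(phone):
    """Phone alone: matches are scattered, so every entry is examined."""
    return [entry for entry in index if entry[1] == phone]


by_name = seek_username("bob")
by_both = seek_username_phone("bob", "13600000007")
by_phone = scan_phone("13600000003")
print(by_name, by_both, by_phone)
```

The two entries with phone 13600000003 sit under different usernames and are not adjacent, which is why a condition on phone alone falls back to scanning.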


Usage scenario:
A joint index suits columns that are always queried together, such as an ID number and an examination number.

redundant index

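With the joint index above in place, is it necessary to build a separate index on username for the query below? No. The joint index already starts with username, so the single-column index is redundant.

select * from user_innodb t where t.username='huathy';
-- Redundant: comidx_name_phone(username,phone) already serves this query
alter table user_innodb add index idx_user_innodb_name(username);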

covering index

If all the queried columns are already contained in the index being used, there is no need to go back to the table. This is called a covering index. A covering index is a way of using an index, not a separate index type.
How to tell whether a covering index is used: if Extra in the execution plan shows Using index, a covering index is used.

EXPLAIN select username,phone from user_innodb t where  t.username='huathy';  -- uses a covering index
EXPLAIN select username from user_innodb t where t.username='huathy' and t.phone = '13603108202';  -- uses a covering index
EXPLAIN select username from user_innodb t where t.phone = '13603108202';  -- uses a covering index (the whole index is scanned)
EXPLAIN select * from user_innodb t where t.username='huathy';  -- no covering index, has to go back to the table
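
The difference between a covered query and one that goes back to the table can be counted in a small simulation. The `clustered` and `secondary` structures and the `query` helper are hypothetical stand-ins for the primary key index and the (username, phone) index; the linear loop keeps the sketch short where a real index would seek:

```python
# Clustered index: primary key -> full row (hypothetical sample rows).
clustered = {
    1: {"id": 1, "username": "huathy", "gender": "1", "phone": "13603108202"},
    2: {"id": 2, "username": "huathy", "gender": "0", "phone": "13700000000"},
    3: {"id": 3, "username": "other", "gender": "1", "phone": "13800000000"},
}

# Secondary index on (username, phone): each entry also carries the primary key.
secondary = sorted((r["username"], r["phone"], r["id"]) for r in clustered.values())


def query(columns, username):
    """Return (result rows, number of lookups back to the clustered index)."""
    in_index = {"username": 0, "phone": 1, "id": 2}
    results, table_lookups = [], 0
    for entry in secondary:
        if entry[0] != username:
            continue
        if all(c in in_index for c in columns):
            # Every requested column is in the index entry: covering index.
            results.append({c: entry[in_index[c]] for c in columns})
        else:
            # A requested column is missing from the index: back to the table.
            table_lookups += 1
            row = clustered[entry[2]]
            results.append({c: row[c] for c in columns})
    return results, table_lookups


covered_rows, covered_lookups = query(["username", "phone"], "huathy")
full_rows, full_lookups = query(["id", "username", "gender", "phone"], "huathy")
print(covered_lookups, full_lookups)
```

Selecting only indexed columns needs no table lookups, while asking for gender as well costs one lookup per matching entry. This is one reason select * is discouraged.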

Index Condition Pushdown (ICP)

In InnoDB, ICP is enabled by default and applied automatically by the optimizer.
Indexes are implemented in the storage engine, which is responsible for storing data, while data filtering and calculation are done in the server layer. With ICP, conditions on indexed columns that cannot be used to narrow the index search are still checked inside the storage engine, on the index entries, before the full rows are read. This action is index condition pushdown.

How to tell whether index condition pushdown is used: if Extra in the execution plan shows Using index condition, index condition pushdown is used. The full name is Index Condition Pushdown.

-- Create the employees table
CREATE TABLE `employees` (
 `emp_no` int(11) NOT NULL,
 `birth_date` date  NULL,
 `first_name` varchar(14) NOT NULL,
 `last_name` varchar(16) NOT NULL,
 `gender` enum('M','F') NOT NULL,
 `hire_date` date  NULL,
 PRIMARY KEY (`emp_no`)
) ENGINE=InnoDB ;
-- Add a joint index on the last name and first name columns
alter table employees add index idx_lastname_firstname(last_name,first_name);

-- Run the query
EXPLAIN SELECT * FROM employees t WHERE t.last_name = 'Wu'  AND t.first_name like '%x';
-- "Using index condition" in Extra shows that index condition pushdown is used.

-- Check whether index condition pushdown is switched on
show global variables like '%optimizer_switch%';
-- index_condition_pushdown=on
-- Switch index condition pushdown off for the current session
set optimizer_switch = 'index_condition_pushdown=off';
-- Check again whether index condition pushdown is used
EXPLAIN SELECT * FROM employees t WHERE t.last_name = 'Wu'  AND t.first_name like '%x';
-- Extra now shows "Using where": the filtering is done in the server layer

For the query above, the process is as follows:

  • Without index condition pushdown: the secondary index is searched for last_name = 'Wu' --> back to the table --> the complete records are read from the leaf nodes of the primary key index --> the server layer filters the data (the records that fail the like condition have to be discarded by the server layer itself).
  • With index condition pushdown: the secondary index is searched for last_name = 'Wu' --> the first_name like '%x' condition is checked on the index entries --> back to the table only for the matching entries --> the complete records are read from the leaf nodes of the primary key index --> they are returned to the server layer (they already meet the like condition, so no rows were fetched in vain).
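
The two flows can be compared by counting reads of the primary key index in a simplified model. The sample rows and the `run` function are invented for illustration and only mimic the order of the steps:

```python
# Hypothetical employees: emp_no -> (last_name, first_name, hire_date).
table = {
    1: ("Wu", "Alex", "1990-01-01"),
    2: ("Wu", "Bob", "1991-02-02"),
    3: ("Wu", "Max", "1992-03-03"),
    4: ("Li", "Rex", "1993-04-04"),
}

# Secondary index on (last_name, first_name) with the primary key attached.
index = sorted((last, first, emp_no) for emp_no, (last, first, _) in table.items())


def run(last_name, suffix, pushdown):
    """Return (matching rows, number of reads of the primary key index)."""
    rows, table_reads = [], 0
    for last, first, emp_no in index:
        if last != last_name:
            continue
        if pushdown and not first.endswith(suffix):
            continue  # rejected on the index entry, no table read needed
        table_reads += 1
        row = table[emp_no]
        if row[1].endswith(suffix):  # check in the server layer
            rows.append(row)
    return rows, table_reads


rows_off, reads_off = run("Wu", "x", pushdown=False)
rows_on, reads_on = run("Wu", "x", pushdown=True)
print(reads_off, reads_on)
```

Both runs return the same rows; with pushdown the entry for Bob is rejected while still in the index, so one read of the full row is saved.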

Principles of indexing:

  1. Create indexes on columns used in WHERE conditions, ORDER BY sorting, JOIN conditions and GROUP BY grouping.
  2. Do not create too many indexes.
  3. Columns with low selectivity (low dispersion) do not need an index.
  4. Frequently updated values should not be used as primary keys or indexes.
  5. It is not recommended to use unordered values (ID card number, UUID) as indexes. They cause a large number of structural adjustments of the B+ tree and consume computing resources.
  6. In a composite index, put the columns with high selectivity first.
  7. Create composite indexes rather than modifying single-column indexes.
  8. If the column is too long, create a prefix index.

prefix index

When a text column is long and matching by its beginning is enough, index only a prefix of the string. If the prefix is too long, the index takes up storage space; if it is too short, it no longer distinguishes the rows. The appropriate length has to be calculated.

-- Prefix index:
CREATE TABLE `pre_test` (
  `content` varchar(20) DEFAULT NULL,
  KEY `pre_idx` (`content`(6))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
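
One way to calculate the length is to compare the selectivity of each prefix length with the selectivity of the full column. The sketch below does this over a made-up list of values; `prefix_selectivity` is a hypothetical helper that mirrors count(distinct left(content, n)) / count(*):

```python
def prefix_selectivity(values, length):
    """count(distinct left(value, length)) / count(*)."""
    return len({v[:length] for v in values}) / len(values)


# Hypothetical column contents.
contents = [
    "alpha-report-final",
    "alpine-notes-draft",
    "beta-report-final",
    "betamax-archive-old",
    "gamma-notes-draft",
    "gamut-report-final",
]

full = len(set(contents)) / len(contents)
by_length = {n: prefix_selectivity(contents, n) for n in (3, 4, 5)}

# Shortest prefix whose selectivity reaches that of the full column.
longest = max(len(v) for v in contents)
best = min(n for n in range(1, longest + 1) if prefix_selectivity(contents, n) == full)

print(by_length, best)
```

On real data the selectivity rarely reaches the full-column value exactly, so in practice one would pick the length where the curve flattens out rather than insist on equality.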

Principles of using indexes

When the index is not used

  1. Functions (replace, substr, concat, sum, count, avg) or expressions are applied to the indexed column.
  2. A string value is written without quotes, so an implicit type conversion occurs.
  3. The like pattern starts with %. This violates the leftmost matching principle. The exception is when the condition can be handled by index condition pushdown.
  4. Negative conditions (<>, !=, not in, not exists): whether the index is used cannot be determined in advance. It depends on the optimizer, the database version and so on.
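
Item 2 can be illustrated with a sorted list standing in for an index on a string column. This is an analogy under the assumption that comparing the column with a number forces every stored value to be converted first; the values are made up:

```python
import bisect

# Index on a string column: entries are sorted as strings.
phones = sorted(["9", "10", "100", "25"])

# Comparing with a quoted string: binary search works, the order matches.
pos = bisect.bisect_left(phones, "25")
found_by_seek = pos < len(phones) and phones[pos] == "25"

# Comparing with a number: every stored value has to be converted, and the
# converted values are no longer in index order, so a seek is impossible.
as_numbers = [int(p) for p in phones]
still_sorted = as_numbers == sorted(as_numbers)
found_by_scan = any(int(p) == 25 for p in phones)

print(phones, as_numbers, still_sorted)
```

The same reasoning applies to item 1: once a function is applied to the column, the order of the results no longer follows the order of the index.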

optimizer

  1. Cost-based optimizer (used by MySQL): estimates IO and CPU cost
  2. Rule-based optimizer (earlier versions of Oracle)

Origin blog.csdn.net/qq_40366738/article/details/130009562