MySQL performance [index optimization analysis]

SQL performance degradation

Common reasons why SQL performance degrades:

  • Badly written SQL
  • Index failure: an index exists, but the query cannot use it
  • Too many joins in related queries (design flaws or unavoidable requirements)
  • Server tuning and parameter settings (buffers, thread counts, and so on)

Join query

MySQL SQL statement execution order
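
As a rough sketch, the comments below mark the commonly described logical order in which the clauses of a SELECT are evaluated, as opposed to the order in which they are written. The query reuses the emp and dept tables from the join examples in this article; the filter values are made up for illustration, and the optimizer is free to reorder the physical work as long as the result stays the same.

```sql
-- Written order: SELECT, FROM, JOIN, ON, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT
-- Logical order: the numbers in the comments below
select distinct               -- (8) DISTINCT removes duplicate rows
       a.deptId,              -- (7) SELECT computes the output columns
       count(*) as cnt
from emp a                    -- (1) FROM identifies the source tables
inner join dept b             -- (3) JOIN type decides whether unmatched rows are added back
        on a.deptId = b.id    -- (2) ON matches rows between the tables
where b.id > 0                -- (4) WHERE filters rows
group by a.deptId             -- (5) GROUP BY forms groups
having count(*) >= 1          -- (6) HAVING filters groups
order by cnt desc             -- (9) ORDER BY sorts the result
limit 10;                     -- (10) LIMIT trims the result
```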

Seven kinds of JOIN queries

-- Inner join: rows common to both tables
select * from emp a inner join dept b on a.deptId=b.id;
-- Left join: all rows of the left table plus the rows common to both
select * from emp a left join dept b on a.deptId=b.id;
-- Right join: all rows of the right table plus the rows common to both
select * from emp a right join dept b on a.deptId=b.id;
-- Left join: rows that exist only in the left table
select * from emp a left join dept b on a.deptId=b.id where b.id is null;
-- Right join: rows that exist only in the right table
select * from emp a right join dept b on a.deptId=b.id where a.deptId is null;
-- Full join: all rows of both tables (MySQL has no FULL OUTER JOIN; union removes duplicate rows)
select * from emp a left join dept b on a.deptId=b.id
union
select * from emp a right join dept b on a.deptId=b.id;
-- Full join: rows that exist in only one of the two tables
select * from emp a left join dept b on a.deptId=b.id where b.id is null
union
select * from emp a right join dept b on a.deptId=b.id where a.deptId is null;
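
To try the queries above, a small schema with a few rows is enough. The sketch below is an assumption: only emp.deptId and dept.id appear in the original queries, while the other columns and all sample values are invented for illustration.

```sql
-- Hypothetical schema: only emp.deptId and dept.id come from the queries above.
create table dept (
  id       int primary key,
  deptName varchar(30) not null
);
create table emp (
  id     int primary key,
  name   varchar(20) not null,
  deptId int null
);

insert into dept (id, deptName) values (1, 'RD'), (2, 'HR'), (3, 'Ops');
insert into emp (id, name, deptId) values
  (1, 'Ann', 1),   -- matches dept 1
  (2, 'Bob', 2),   -- matches dept 2
  (3, 'Cid', 9);   -- deptId 9 has no matching dept row

-- Left-only rows: employees whose department does not exist (Cid).
select a.* from emp a left join dept b on a.deptId = b.id where b.id is null;

-- Right-only rows: departments without employees (Ops).
select b.* from emp a right join dept b on a.deptId = b.id where a.deptId is null;
```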

Introduction to Index

Official definition: an index is a data structure that helps MySQL retrieve data efficiently. So we can draw a conclusion: **the essence of an index is a data structure.**

What is an index?

A sorted data structure for fast lookup.

In addition to the data itself, the database also maintains a data structure that satisfies a specific search algorithm. These data structures point to the data in a certain way, so that advanced search algorithms can be implemented on the basis of these data structures. This data structure is an index.

Generally speaking, an index is itself large and cannot be kept entirely in memory, so it is usually stored on disk in the form of an index file.

Unless otherwise specified, the indexes we usually talk about are organized as a B-tree (a multiway search tree, not necessarily binary). Clustered, secondary, covering, composite, prefix, and unique indexes all use a B+ tree by default and are collectively referred to as indexes. Besides B+ tree indexes, there are also hash indexes and others.

Advantages of indexing

  • Like the catalog index of a university library, an index improves the efficiency of data retrieval and reduces the IO cost of the database.
  • Because the index keeps the indexed column sorted, it reduces the cost of sorting data and lowers CPU consumption.

Index disadvantages

In effect, the index is also a table: it stores the primary key and the indexed fields and points to the records of the underlying table, so indexes also take up space.

Although an index greatly improves query speed, it slows down updates to the table, such as insert, update, and delete. When the table is updated, MySQL has to save not only the data but also the index file: every time a change touches indexed columns, MySQL adjusts the index entries whose key values changed.

Indexes are only one factor in improving efficiency. If your MySQL tables hold a large amount of data, you need to spend time working out the best indexes or optimizing the queries.

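MySQL index classification

  • Single-column index

An index that contains only one column. A table can have multiple single-column indexes.

  • Unique index

The index values must be unique, but NULL values are allowed.

  • Composite index

An index that contains multiple columns.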

  • Basic syntax
-- Create an index on its own
create [unique] index indexName on mytable(columnName(length));
-- Add an index to an existing table
alter table mytable add [unique] index [indexName] (columnName(length));
-- Drop an index
drop index indexName on mytable;
-- Show the indexes of a table
show index from table_name;

Four ways to add an index to a table:

-- Adds a primary key: the index values must be unique and cannot be NULL
ALTER TABLE table_name ADD PRIMARY KEY (column_list);
-- Adds a unique index: the values must be unique (except NULL, which may appear more than once)
ALTER TABLE table_name ADD UNIQUE index_name (column_list);
-- Adds an ordinary index: values may appear more than once
ALTER TABLE table_name ADD INDEX index_name (column_list);
-- Adds a FULLTEXT index, used for full-text search
ALTER TABLE table_name ADD FULLTEXT index_name (column_list);
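
The templates above can be made concrete as follows. This is a minimal sketch: the table t_user, its columns, and the index names are hypothetical and serve only to show each statement with real identifiers.

```sql
-- Hypothetical table used only for illustration.
create table t_user (
  id    int          not null,
  name  varchar(50)  not null,
  email varchar(100) null,
  age   int          not null,
  bio   text
);

alter table t_user add primary key (id);                     -- unique, NULL not allowed
alter table t_user add unique idx_user_email (email);        -- unique, NULL may repeat
alter table t_user add index idx_user_name_age (name, age);  -- composite, duplicates allowed
alter table t_user add index idx_user_bio (bio(20));         -- prefix index on the first 20 characters
alter table t_user add fulltext ft_user_bio (bio);           -- full-text index

show index from t_user;              -- list the indexes just created
drop index idx_user_bio on t_user;   -- remove one of them
```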

MySQL index structure

BTree index

Hash index

Full-text index

R-Tree index

When to create an index

  1. The primary key automatically gets a unique index.
  2. Fields frequently used as query conditions should be indexed.
  3. Fields that join to other tables in queries should be indexed on the foreign-key relationship.
  4. Single-column versus composite index: under high concurrency, prefer a composite index.
  5. Sort fields in queries: sorting is much faster when the sort field is accessed through an index.
  6. Fields used for statistics or grouping in queries.

When not to create an index

  1. Frequently updated fields are not suitable for indexing.
  2. Fields not used in the where condition should not be indexed.
  3. Tables with too few rows.
  4. Fields whose data is repetitive and evenly distributed. Build indexes only for the columns that are queried and sorted most often; if a column contains many duplicate values, indexing it has little practical effect.

Performance analysis

MySQL Query Optimizer

  1. MySQL has an optimizer module dedicated to optimizing SELECT statements. Its main job is to use the statistics collected by the system to work out what it considers the best execution plan for the query the client requested. This part is the most time-consuming.
  2. When a client sends a query to MySQL, the command parser classifies the request. If it is a SELECT, it is forwarded to the MySQL Query Optimizer, which first optimizes the query as a whole: it pre-computes constant expressions and converts them directly to constant values, and it simplifies and transforms the query conditions, for example by removing useless or obvious conditions and adjusting the structure. It then analyzes any hints in the query to see whether the explicit hints fully determine the execution plan. If there are no hints, or the hints are not enough to fully determine the plan, it reads the statistics of the objects involved, performs the corresponding calculation and analysis for the query, and arrives at the final execution plan.

MySQL common bottlenecks

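  1. CPU: CPU saturation generally occurs when data is being loaded into memory or read from disk.
  2. IO: a disk I/O bottleneck occurs when the data to be loaded is much larger than the memory capacity.
  3. Server hardware bottlenecks: use top, free, iostat, and vmstat to check the performance state of the system.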

Explain

Explain is also known as "viewing the execution plan".

Use the Explain keyword to simulate how the optimizer executes a SQL query, so you can see how MySQL processes your SQL statement and analyze the performance bottlenecks of the query or table structure.

How to use Explain

explain + SQL statement

Information contained:

id select_type table type possible_keys key key_len ref rows Extra
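
A minimal usage sketch, assuming the emp and dept tables from the join examples exist. The FORMAT option in the second statement assumes MySQL 5.6 or later; newer versions may also print extra columns such as partitions and filtered.

```sql
-- Put EXPLAIN in front of the statement to see its plan instead of its result.
explain
select *
from emp a
inner join dept b on a.deptId = b.id
where b.id = 1;

-- Assumption: MySQL 5.6 or later, where EXPLAIN also accepts a FORMAT option.
explain format=json
select * from emp where deptId = 1;
```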

The role of Explain

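  1. See the order in which tables are read
  2. See the access type of each data read operation
  3. See which indexes can be used
  4. See which indexes are actually used
  5. See the references between tables
  6. See how many rows of each table the optimizer examines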

Explain's id field function

The sequence number of the select query: a set of numbers indicating the order in which select clauses or table operations are executed in the query.

There are three situations for the id field:

  1. The ids are the same: the execution order is from top to bottom.
  2. The ids are different: for a subquery the id number increases; the larger the id value, the higher the priority and the earlier it is executed.
  3. Some ids are the same and some are different: both cases occur at once.

Rows with the same id can be treated as a group and are executed in order from top to bottom; across groups, the larger the id value, the higher the priority and the earlier the execution. A derived table is marked DERIVED.
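
The two basic cases can be reproduced with the emp and dept tables from the join examples. This is only a sketch: the ids actually reported depend on the MySQL version and on optimizer rewrites (for example, some subqueries are turned into joins and then share one id).

```sql
-- Same id: a plain join; the rows of the plan are read from top to bottom.
explain
select * from emp a inner join dept b on a.deptId = b.id;

-- Different ids: a scalar subquery in WHERE usually gets the larger id
-- and is evaluated before the outer query that depends on it.
explain
select * from emp
where deptId = (select id from dept order by id desc limit 1);
```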

Explain's select_type and table field function

Common values of select_type:

id select_type
1 SIMPLE
2 PRIMARY
3 SUBQUERY
4 DERIVED
5 UNION
6 UNION RESULT

select_type is the type of the query. It is mainly used to distinguish ordinary queries, union queries, subqueries, and other complex queries.

  1. SIMPLE: a simple select query that contains no subqueries or unions.
  2. PRIMARY: if the query contains any complex subparts, the outermost query is marked as PRIMARY.
  3. SUBQUERY: a subquery in the select or where list.
  4. DERIVED: subqueries in the from list are marked as DERIVED. MySQL executes these subqueries recursively and places the results in a temporary table.
  5. UNION: a second select that appears after union is marked as UNION; if the union is inside a subquery of the from clause, the outer select is marked as DERIVED.
  6. UNION RESULT: the select that reads the result from the union temporary table.

table: shows which table the data in this row refers to.
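
Two short queries that typically produce the less common select_type values, again on the emp and dept tables from the join examples. Treat the comments as expectations rather than guaranteed output: newer MySQL versions merge simple derived tables into the outer query, which is why the derived table here uses group by.

```sql
-- UNION: the first select is typically PRIMARY, the second UNION,
-- and a final UNION RESULT row reads the merged temporary table.
explain
select id from dept where id = 1
union
select deptId from emp where deptId = 2;

-- DERIVED: a subquery in the FROM list. The aggregation keeps it from
-- being merged into the outer query.
explain
select t.deptId, t.cnt
from (select deptId, count(*) as cnt from emp group by deptId) t
where t.cnt > 1;
```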

Explain's type field function

type shows the access type and is one of the more important indicators. From best to worst, the possible values are:

system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

The values seen most often are:

system > const > eq_ref > ref > range > index > ALL

type can also be NULL, when MySQL resolves the statement without accessing a table or index.

Generally speaking, a query should reach at least the range level, and preferably ref.

  1. system: the table has only one row (a system table). This is a special case of the const type; it rarely appears and can be ignored.
  2. const: the row is found with a single index lookup. const is used when a primary key or unique index is compared with a constant. Because only one row matches, it is very fast; for example, when the primary key appears in the where list, MySQL can convert the query into a constant.
  3. eq_ref: unique index scan. For each index key, exactly one record in the table matches. Common when joining on a primary key or unique index.
  4. ref: non-unique index scan that returns all rows matching a single value. It is essentially an index access, but because it may find multiple matching rows it is a mix of lookup and scan.
  5. range: retrieves only rows in a given range, using an index to select them. The key column shows which index is used. It typically appears when the where clause uses between, <, >, or in. A range scan is faster than a full table scan because it starts at one point of the index and ends at another, without scanning the whole index.
  6. index: Full Index Scan. The difference between index and ALL is that index traverses only the index tree. This is usually faster than ALL because index files are usually smaller than data files. (In other words, both ALL and index read the whole table, but index reads from the index while ALL reads the data rows from disk.)
  7. ALL: Full Table Scan. The entire table is traversed to find matching rows.
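
The queries below are meant to land on the most common access types. The staff table and its index are hypothetical, and the comments describe what a populated table would typically show; on an empty or very small table the optimizer may pick something else.

```sql
-- Hypothetical table used only for illustration.
create table staff (
  id   int primary key,
  name varchar(20) not null,
  age  int not null,
  key idx_staff_name (name)
);

explain select * from staff where id = 1;              -- const: primary key equals a constant
explain select * from staff where name = 'Ann';        -- ref: lookup in a non-unique index
explain select * from staff where id between 1 and 9;  -- range: bounded scan of an index
explain select name from staff;                        -- index: full scan of idx_staff_name only
explain select * from staff where age = 30;            -- ALL: age has no index, full table scan
```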

Explain's possible_keys and key field function

  • possible_keys

Shows one or more indexes that could be applied to this table.
If an index exists on a field involved in the query, it is listed here, but the query does not necessarily use it.

  • key

The index actually used. If it is NULL, no index is used.
If the query uses a covering index, that index appears only in the key column.

Explain's key_len field function

Indicates the number of bytes used in the index. This column can be used to work out how much of the index the query uses. Without losing accuracy, the shorter the length, the better.
The value shown in key_len is the maximum possible length of the index fields, not the length actually used: key_len is calculated from the table definition, not retrieved from the table.
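
A worked example of how key_len follows from the table definition. The table kl_demo is hypothetical, and the arithmetic assumes the utf8mb4 character set (up to 4 bytes per character), 2 length bytes for a varchar, and 1 extra byte for a nullable column.

```sql
-- Hypothetical table used only for illustration.
create table kl_demo (
  id   int primary key,
  age  int not null,
  name varchar(20) not null,
  nick varchar(20) null,
  key idx_age_name_nick (age, name, nick)
) default charset = utf8mb4;

-- age only: 4 bytes for a NOT NULL int, so key_len should be 4
explain select * from kl_demo where age = 30;

-- age + name: 4 + (20 * 4 + 2) = 86
explain select * from kl_demo where age = 30 and name = 'Ann';

-- age + name + nick: 86 + (20 * 4 + 2 + 1) = 169
explain select * from kl_demo where age = 30 and name = 'Ann' and nick = 'A';
```

Comparing the reported key_len with these sums shows how many leading columns of a composite index a query actually uses.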

Explain's ref field function

Shows which columns or constants are compared with the index named in the key column to look up rows; where possible, it is a constant.

As noted earlier, fields that join to other tables in a query should be indexed on the foreign-key relationship.

Explain's rows field function

Based on table statistics and the chosen index, a rough estimate of the number of rows that must be read to find the required records.

Explain's Extra field function

Records additional important information.

Common values are as follows:

  1. Using filesort: MySQL sorts the data with an extra sorting pass instead of reading it in index order. A sort that MySQL cannot perform through an index is called a "file sort".
  2. Using temporary: a temporary table is used to hold intermediate results. MySQL uses a temporary table when sorting query results; common with order by and group by.
  3. Using index: the select uses a covering index (Covering Index), which avoids accessing the table's data rows, so efficiency is good. If Using where appears at the same time, the index is used to look up key values; if Using where does not appear, the index is used to read the data rather than to perform a lookup.

Covering index (Covering Index)

Covering index: the selected columns can be obtained from the index alone, without reading the data rows. MySQL can use the index to return the fields in the select list without going back to the data file; in other words, the queried columns must be covered by the index that was created.

Note: to use a covering index, list only the required columns in the select list rather than select *, because indexing all fields together would make the index file too large and reduce query performance.

  4. Using where: a where filter is applied.
  5. Using join buffer: the join buffer is used.
  6. Impossible WHERE: the where clause is always false and cannot select any rows.
  7. Select tables optimized away: without a group by clause, min/max is optimized through an index, or count(*) is optimized for the MyISAM storage engine. The calculation does not wait for the execution stage; it is completed while the execution plan is generated.
  8. Distinct: optimizes the distinct operation by stopping the search for rows with the same value once the first matching row is found.
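
Several of these Extra values can be provoked on purpose. The sketch below uses a hypothetical table extra_demo; the comments state what is likely to appear, and the exact wording varies between MySQL versions.

```sql
-- Hypothetical table used only for illustration.
create table extra_demo (
  id   int primary key,
  name varchar(20) not null,
  age  int not null,
  city varchar(20) not null,
  key idx_name_age (name, age)
);

-- Likely "Using filesort": city is not in any index, so rows are sorted after reading.
explain select * from extra_demo where name = 'Ann' order by city;

-- No filesort expected: for a fixed name, idx_name_age already returns rows ordered by age.
explain select * from extra_demo where name = 'Ann' order by age;

-- Likely "Using temporary": grouping on a column without an index.
explain select city, count(*) from extra_demo group by city;

-- Likely "Using index": name and age are both stored in idx_name_age (covering index).
explain select name, age from extra_demo where name = 'Ann';

-- "Impossible WHERE": the condition can never be true.
explain select * from extra_demo where 1 = 0;
```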

Index optimization

Index analysis

  • Single-table analysis: build indexes on the fields commonly used in queries.
  • Two-table analysis: for a left join, build the index on the right table; for a right join, build it on the left table. In other words, index the join column of the opposite table.
  • Three-table analysis: same as two tables, index the join columns of the opposite tables.

Index failure

  1. Full-value match: give every column of the index an equality condition.
  2. Leftmost-prefix rule

If the index contains multiple columns, follow the leftmost-prefix rule: the query must start from the leftmost column of the index and must not skip columns in the index.

  3. Do not perform any operation (calculation, function, automatic or manual type conversion) on an indexed column; doing so makes the index fail and falls back to a full table scan.
  4. The storage engine cannot use index columns to the right of a range condition.
  5. Prefer covering indexes (queries that touch only the index, with the queried columns matching the indexed columns) and reduce select *.
  6. Not-equal conditions (!= or <>) often prevent MySQL from using the index and lead to a full table scan.
  7. is null and is not null may also prevent index use, depending on the data and the MySQL version.
  8. like with a leading wildcard ('%abc...') makes the index fail and becomes a full table scan.
  9. A string value without single quotes makes the index fail, because of the implicit type conversion.
  10. Use or sparingly; conditions joined with or can make the index fail.
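
Several of the rules above can be checked side by side with Explain. The table idx_demo, its columns, and its indexes are hypothetical; the comments describe the usual outcome, which should be confirmed on your own MySQL version and data.

```sql
-- Hypothetical table used only for illustration.
create table idx_demo (
  id    int primary key,
  name  varchar(20) not null,
  age   int not null,
  pos   varchar(20) not null,
  phone varchar(20) not null,
  note  varchar(50) null,
  key idx_name_age_pos (name, age, pos),
  key idx_phone (phone)
);

-- Leftmost prefix respected: all three key parts can be used.
explain select * from idx_demo where name = 'Ann' and age = 30 and pos = 'dev';
-- Leftmost column skipped: idx_name_age_pos is not usable for the lookup.
explain select * from idx_demo where age = 30 and pos = 'dev';
-- Middle column skipped: only the name part is used.
explain select * from idx_demo where name = 'Ann' and pos = 'dev';
-- Range on age: pos, to the right of the range, is not used as a key part.
explain select * from idx_demo where name = 'Ann' and age > 30 and pos = 'dev';
-- Function on the indexed column: the index cannot be used for the lookup.
explain select * from idx_demo where left(name, 3) = 'Ann';
-- Leading wildcard: no usable prefix.
explain select * from idx_demo where name like '%nn';
-- Unquoted number compared with a string column: implicit conversion, idx_phone is skipped.
explain select * from idx_demo where phone = 13800000000;
-- Quoted value: idx_phone is usable again.
explain select * from idx_demo where phone = '13800000000';
```

Comparing the key and key_len columns across these statements shows which part of the composite index each query can still use.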

Interview question: how can you keep the index from failing with like '%abc%'?
We all know that for a like query to keep using the index, the % must be on the right of the string, as in 'abc%'. But suppose we have to run a full fuzzy query such as '%abc%', which makes the index fail. How can this be solved?

We can use a covering index. For example, the query condition is name and there is an index on that field:
select * from mytable where name like '%abc%';
Explain shows that this SQL does not use the index. We can change it to:
select name from mytable where name like '%abc%';
Analyzing this SQL again shows that the index is now used. So even a full fuzzy like query can avoid index failure this way.
**Note: covering-index optimization only works when the fields in the select list are covered by the index. Put simply, the select fields must be the same as the index fields or a subset of them.**
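
The effect can be observed with a small experiment. The table like_demo and its index are hypothetical, and the comments state the expected plan rather than guaranteed output.

```sql
-- Hypothetical table used only for illustration.
create table like_demo (
  id   int primary key,
  name varchar(20) not null,
  age  int not null,
  note varchar(50) null,
  key idx_name_age (name, age)
);

-- Needs the note column, which is outside the index: expect type ALL (full table scan).
explain select * from like_demo where name like '%abc%';

-- Only indexed columns, plus the primary key that InnoDB stores in every secondary index:
-- expect type index with "Using index" in Extra.
explain select id, name, age from like_demo where name like '%abc%';
```

Note that the second plan still reads the whole index; it is cheaper than a full table scan because the index is smaller than the table, not because MySQL can seek directly to the matching rows.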

General advice

  1. For single-column indexes, choose the index that filters best for the current query.
  2. In a composite index, the field that filters best in the current query should come as early as possible in the index field order.
  3. When choosing a composite index, prefer one that contains as many of the fields in the current query's where clause as possible.
  4. Where possible, analyze the statistics and adjust how the query is written so that a suitable index is chosen.

Origin blog.csdn.net/qq_43647359/article/details/105822524