MySQL database optimization steps

1. From what aspects can database tuning be performed?

Indexes that fail or are not fully used: create appropriate indexes

Too many JOINs in associated queries (design flaws or unavoidable requirements): optimize the SQL

Server tuning and parameter settings (buffers, number of threads, etc.): adjust my.cnf

Too much data: split databases and tables

SQL query optimization involves many techniques, but they fall broadly into two groups: physical query optimization and logical query optimization.

Physical query optimization works through techniques such as indexes and table join methods. The key point here is learning how to use indexes well.

Logical query optimization improves query efficiency through equivalent SQL transformations. Put simply, writing the same query a different way may make it run more efficiently.

2. Optimization steps

1. View system performance parameters

Connections: The number of connections to the MySQL server.

Uptime: How long the MySQL server has been running, in seconds.

Slow_queries: The number of slow queries.

Innodb_rows_read: The number of rows read from InnoDB tables.

Innodb_rows_inserted: The number of rows inserted by INSERT operations.

Innodb_rows_updated: The number of rows updated by UPDATE operations.

Innodb_rows_deleted: The number of rows deleted by DELETE operations.

Com_select: The number of query operations.

Com_insert: The number of insert operations. A batch INSERT is counted only once.

Com_update: The number of update operations.

Com_delete: The number of delete operations.

# Replace variable_name with one of the status variables above
SHOW STATUS LIKE 'variable_name';
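As a minimal sketch of how the counters listed above might be inspected, the statements below query a few of them. The GLOBAL keyword asks for server-wide totals since startup; without it, SHOW STATUS returns session values where a session value exists.

```sql
-- Server-wide counters accumulated since the server started
SHOW GLOBAL STATUS LIKE 'Connections';
SHOW GLOBAL STATUS LIKE 'Uptime';
SHOW GLOBAL STATUS LIKE 'Slow_queries';

-- A pattern with % returns a whole family of counters at once
SHOW GLOBAL STATUS LIKE 'Innodb_rows_%';

-- Statement counters for the four basic operations
SHOW GLOBAL STATUS WHERE Variable_name IN
  ('Com_select', 'Com_insert', 'Com_update', 'Com_delete');
```

Comparing Com_select with the write counters gives a rough idea of whether the workload is read-heavy or write-heavy.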

2. Compare page overhead

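Last_query_cost: The optimizer's estimated cost of the most recently compiled query. It roughly tracks the number of data pages the query has to read, so it is useful for comparing the page overhead of different queries.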
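A minimal sketch of such a comparison follows. The table student_info and the id values are hypothetical and exist only for illustration; Last_query_cost is a session status variable, so it must be read in the same session, right after the query.

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE IF NOT EXISTS student_info (
  id   INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(50) NOT NULL
);

-- Point lookup on the primary key
SELECT * FROM student_info WHERE id = 900001;
SHOW STATUS LIKE 'Last_query_cost';

-- Range lookup that touches more rows
SELECT * FROM student_info WHERE id BETWEEN 900001 AND 900100;
SHOW STATUS LIKE 'Last_query_cost';
```

On a populated table the second query would normally report a higher cost, even if both return almost instantly.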

3. Locate the SQL statement that executes slowly: query the slow log

long_query_time: When the running time of the SQL statement exceeds the value of this parameter, it is called a slow query.

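3.1 Turn on the slow query log (it is off by default)
# Temporary change (lost when the server restarts)
# Turn on the slow query log; slow_query_log is a global variable
SET GLOBAL slow_query_log = ON;

# Change the slow query threshold, in seconds (1 is an example value)
SET GLOBAL long_query_time = 1;
SET long_query_time = 1;

# Permanent change
# Edit the my.cnf configuration file, set the parameters under [mysqld], and then restart the server.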
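After changing these settings it may help to confirm what the server is actually using. The sketch below checks the relevant variables; the SET PERSIST statements are an alternative to editing my.cnf that assumes MySQL 8.0 or later, and the threshold of 1 second is only an example value.

```sql
-- Shows slow_query_log (ON/OFF) and slow_query_log_file (log path)
SHOW VARIABLES LIKE 'slow_query_log%';

-- Threshold for the current session and for new sessions
SHOW SESSION VARIABLES LIKE 'long_query_time';
SHOW GLOBAL VARIABLES LIKE 'long_query_time';

-- Assumption: MySQL 8.0+. SET PERSIST changes the running value and
-- also saves it so that it survives a restart.
SET PERSIST slow_query_log = ON;
SET PERSIST long_query_time = 1;
```

Note that SET GLOBAL long_query_time only affects connections opened afterwards, which is why the session value is set separately.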
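3.2 Analyzing slow query statements
# Check how many slow queries have been recorded (Slow_queries is a status variable)
SHOW GLOBAL STATUS LIKE 'Slow_queries';
# Use mysqldumpslow to view the slow queries, sorted by query time (-s t);
# -a keeps the original numbers and strings instead of abstracting them to N and 'S'
mysqldumpslow -a -s t /var/lib/mysql/table-slow.log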

4. View SQL execution cost: Show Profile

# Turn on the SHOW PROFILE feature
SET profiling = 1;

# List the most recently executed statements
SHOW PROFILES;

# Show the profile of one statement
SHOW PROFILE FOR QUERY 1;
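A short session sketch of the whole workflow is shown below. The profiled query against information_schema is just a stand-in, and the query number 1 assumes it is the first statement profiled in the session. SHOW PROFILE is deprecated in current MySQL versions in favor of the Performance Schema, so treat this as a quick diagnostic rather than a long-term tool.

```sql
-- Enable profiling for the current session only
SET profiling = 1;

-- Any statement to be profiled; this one is only a stand-in
SELECT COUNT(*) FROM information_schema.tables;

-- Find the Query_ID of the statement of interest
SHOW PROFILES;

-- Break the statement down by stage, adding CPU and block I/O columns
SHOW PROFILE CPU, BLOCK IO FOR QUERY 1;

-- Turn profiling off again
SET profiling = 0;
```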

5. Analyze the query statement: EXPLAIN

1. Basic syntax
EXPLAIN SELECT * FROM table1;

# DESCRIBE is a synonym for EXPLAIN
DESCRIBE SELECT * FROM table1;
2. What each column of the EXPLAIN output means

id: The unique ID assigned to each SELECT in the statement

select_type: The type of the query

table: The table name

partitions: The matching partitions

type: The access method used for the table

possible_keys: The indexes that could be used

key: The index actually used

key_len: The length of the index actually used, in bytes

ref: When an index equality lookup is used, the column or constant that is compared with the index column

rows: The estimated number of rows to be read

filtered: The estimated percentage of rows that remain after the table is filtered by the search conditions

Extra: Additional information

3. Four output formats of EXPLAIN

Traditional format

JSON format

TREE format

MySQL Workbench visual output

# The format name is TRADITIONAL, JSON, or TREE, written without quotes
EXPLAIN FORMAT=JSON SELECT * FROM table1;
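To see how the text formats differ, the same query can be explained three ways. The table definition is a hypothetical minimal one so that the statements can run on their own; FORMAT=TREE assumes MySQL 8.0.16 or later.

```sql
-- Hypothetical table so that the EXPLAIN statements have a target
CREATE TABLE IF NOT EXISTS table1 (
  id   INT PRIMARY KEY,
  name VARCHAR(50)
);

-- Default tabular output, one row per table access
EXPLAIN FORMAT=TRADITIONAL SELECT * FROM table1 WHERE id = 1;

-- JSON output, which adds cost information
EXPLAIN FORMAT=JSON SELECT * FROM table1 WHERE id = 1;

-- Tree output (assumption: MySQL 8.0.16+), showing the plan as nested steps
EXPLAIN FORMAT=TREE SELECT * FROM table1 WHERE id = 1;
```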

Detailed explanation of id

Rows with the same id can be treated as one group and are executed in order from top to bottom.

Across groups, the larger the id, the higher the priority and the earlier it is executed.

Each distinct id value represents an independent query; the fewer queries, the better.

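Detailed explanation of select_type

SIMPLE: A simple SELECT that uses neither UNION nor subqueries

PRIMARY: The outermost SELECT

UNION: The second or later SELECT in a UNION

UNION RESULT: The result of a UNION

SUBQUERY: The first SELECT in a subquery

DEPENDENT SUBQUERY: The first SELECT in a subquery that depends on the outer query

DEPENDENT UNION: The second or later SELECT in a UNION that depends on the outer query

MATERIALIZED: A materialized subquery

UNCACHEABLE SUBQUERY: A subquery whose result cannot be cached and must be re-evaluated for each row of the outer query

UNCACHEABLE UNION: The second or later SELECT in a UNION that belongs to an uncacheable subquery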

Detailed explanation of type

The possible values, from best to worst, are: system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

A few of these matter most in practice (they were highlighted in blue in the original figure). The goal for SQL performance is to reach at least the range level, ideally the ref level, and preferably the const level. (Required by the Alibaba Development Manual)

Detailed explanation of key_len

Mainly useful for joint indexes.

For the same joint index, a larger key_len means that more of its columns are being used, which is generally better.

Detailed explanation of Extra

Omitted.

3. Index optimization and query optimization

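1. Cases of index failure

Performing a calculation on the indexed column

Applying a function to the indexed column

LIKE with a leading wildcard ('%XXX'): MySQL matches index entries from the left, so a trailing wildcard ('XXX%') can use the index, but a leading wildcard breaks the leftmost-prefix rule and cannot.

Range and negation conditions: NOT IN and != often lead to a full scan, and in a joint index the columns to the right of a range condition (>, <) cannot use the index.

The queried field is not the leftmost field of the joint index, again because of the leftmost-prefix rule.

The field type does not match, most often through implicit type conversion: if mobile is a string column, mobile=1356 will not use the index because the stored values have to be converted to numbers for the comparison (the query still returns results), whereas mobile='1356' will use the index.

With OR, if the left-hand condition is on an indexed field and the right-hand one is not, the index is not used, because rows matching the unindexed condition can only be found with a full scan anyway.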
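The cases above can be checked with EXPLAIN. The sketch below uses a hypothetical table user_demo with single-column indexes; each pair contrasts a form that typically prevents an index lookup with a rewrite that allows one. Actual plans depend on data volume and MySQL version, so the comments describe what is typically seen rather than guaranteed output.

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE IF NOT EXISTS user_demo (
  id         INT PRIMARY KEY AUTO_INCREMENT,
  mobile     VARCHAR(20) NOT NULL,
  name       VARCHAR(50) NOT NULL,
  age        INT NOT NULL,
  created_at DATETIME NOT NULL,
  KEY idx_mobile (mobile),
  KEY idx_name (name),
  KEY idx_age (age),
  KEY idx_created (created_at)
);

-- Calculation on the column: idx_age cannot be used for a lookup
EXPLAIN SELECT * FROM user_demo WHERE age + 1 = 30;
-- Move the calculation to the constant side instead
EXPLAIN SELECT * FROM user_demo WHERE age = 29;

-- Function on the column: idx_created cannot be used for a lookup
EXPLAIN SELECT * FROM user_demo WHERE DATE(created_at) = '2024-01-01';
-- Equivalent half-open range on the bare column
EXPLAIN SELECT * FROM user_demo
WHERE created_at >= '2024-01-01' AND created_at < '2024-01-02';

-- Leading wildcard versus trailing wildcard
EXPLAIN SELECT * FROM user_demo WHERE name LIKE '%son';
EXPLAIN SELECT * FROM user_demo WHERE name LIKE 'John%';

-- Implicit conversion: number compared with a string column
EXPLAIN SELECT * FROM user_demo WHERE mobile = 1356;
EXPLAIN SELECT * FROM user_demo WHERE mobile = '1356';
```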

General advice:

For single-column indexes, try to choose the index with the best filterability for the current query.

When choosing a joint index, place the field with the best filterability in the current query as early as possible in the index field order.

When choosing a joint index, try to choose one that covers as many of the fields in the current query's WHERE clause as possible.

When choosing a joint index, if a field may be used in a range query, try to put that field last in the index order.

In short, when writing SQL statements, try to avoid causing index failure.
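As a sketch of the joint-index advice, the hypothetical table order_demo below puts the two equality columns first and the range column last. The column names and values are assumptions made for illustration.

```sql
-- Hypothetical table: equality columns first, range column last
CREATE TABLE IF NOT EXISTS order_demo (
  id         INT PRIMARY KEY AUTO_INCREMENT,
  user_id    INT NOT NULL,
  status     TINYINT NOT NULL,
  created_at DATETIME NOT NULL,
  KEY idx_user_status_created (user_id, status, created_at)
);

-- All three index columns can be used: two equalities, then one range
EXPLAIN SELECT id FROM order_demo
WHERE user_id = 42 AND status = 1 AND created_at >= '2024-01-01';

-- The leftmost column is missing, so the index cannot be used for a
-- lookup; the server may scan the whole index or the whole table
EXPLAIN SELECT id FROM order_demo WHERE status = 1;
```

In the first plan, key_len should reflect all three columns; if created_at were placed in the middle of the index, the column after it could not be used once the range condition applies.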

2. Correlation query optimization

When using JOIN, add indexes to the driven table first.

For inner joins, the query optimizer decides which table is the driving table and which is the driven table (usually the small table drives the large one).

Where possible, use a multi-table join directly instead of a subquery (this reduces the number of queries).

Subqueries are not recommended: either split the subquery out and run several queries from the application, or replace the subquery with a JOIN.
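A minimal sketch of replacing a subquery with a JOIN is given below, using two hypothetical tables. The rewrite is equivalent here only because class_demo.id is a primary key, so the join cannot duplicate student rows. Recent MySQL versions often perform a similar transformation automatically for IN subqueries, so the gain depends on the version.

```sql
-- Hypothetical tables used only for this illustration
CREATE TABLE IF NOT EXISTS class_demo (
  id   INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL
);
CREATE TABLE IF NOT EXISTS student_demo (
  id       INT PRIMARY KEY,
  class_id INT NOT NULL,
  name     VARCHAR(50) NOT NULL,
  KEY idx_class (class_id)
);

-- Subquery form
SELECT s.*
FROM student_demo s
WHERE s.class_id IN (SELECT c.id FROM class_demo c WHERE c.name = 'Math');

-- Equivalent JOIN form; idx_class on the driven table supports the match
SELECT s.*
FROM student_demo s
JOIN class_demo c ON c.id = s.class_id
WHERE c.name = 'Math';
```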

3. The underlying principle of the JOIN statement

Use small tables to drive large tables (the essence is to reduce the amount of data in the outer loop)
-- Recommended form (straight_join forces tb1 to be the driving table)
select tb1.b, tb2.* from tb1 straight_join tb2 on (tb1.b=tb2.b)
where tb2.id <= 100;

Add indexes to the join columns of the driven table (this reduces the number of matching loops in the inner table)

Increase join_buffer_size (the more data is cached at one time, the fewer times the inner table is scanned)

Reduce unnecessary fields selected from the driving table (the fewer fields, the more rows fit in the join buffer)

Use Hash Join (available from MySQL 8.0.18)
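The points above might be applied as in the following sketch. The tables tb1_demo and tb2_demo are hypothetical, and the 8 MB buffer size is only an example value.

```sql
-- Hypothetical tables used only for this illustration
CREATE TABLE IF NOT EXISTS tb1_demo (id INT PRIMARY KEY, b INT NOT NULL);
CREATE TABLE IF NOT EXISTS tb2_demo (id INT PRIMARY KEY, b INT NOT NULL);

-- Index the join column of the driven table
-- (run once; repeating it fails with a duplicate key name error)
CREATE INDEX idx_b ON tb2_demo (b);

-- tb1_demo drives, tb2_demo is looked up through idx_b
EXPLAIN
SELECT tb1_demo.b, tb2_demo.*
FROM tb1_demo STRAIGHT_JOIN tb2_demo ON (tb1_demo.b = tb2_demo.b)
WHERE tb2_demo.id <= 100;

-- Inspect the join buffer and enlarge it for this session only (8 MB)
SHOW VARIABLES LIKE 'join_buffer_size';
SET SESSION join_buffer_size = 8388608;
```

The join buffer only matters when the driven table has no usable index on the join column, so adding the index is usually the first thing to try.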

4. Subquery optimization

Try to use JOIN instead of subquery

5. Sorting optimization

There are two sorting methods: FileSort and index sorting.

With index sorting, the index already guarantees the order of the data, so no extra sort is needed; this is efficient and uses few resources.

FileSort is generally performed in memory and uses a lot of CPU. If the result to be sorted is large, it spills to temporary files on disk, which is inefficient.

Optimization suggestions

Indexes can be used in both the WHERE clause and the ORDER BY clause: in WHERE to avoid a full table scan, and in ORDER BY to avoid FileSort.

Try to let an index satisfy the ORDER BY. If WHERE and ORDER BY use the same column, a single-column index is enough; if they use different columns, use a joint index.

When an index cannot be used, tune the FileSort method:

Increase sort_buffer_size

Increase max_length_for_sort_data (deprecated and without effect from MySQL 8.0.20)

Do not use SELECT * with ORDER BY

Avoid index failure in ORDER BY, for example by mixing ascending and descending order, omitting the leftmost index column, skipping a middle index column, sorting on a non-indexed column, or using range conditions such as IN().
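The difference between index sorting and FileSort shows up in the Extra column of EXPLAIN. The sketch below uses a hypothetical table emp_demo with a joint index on the filter column followed by the sort column; the buffer size is an example value.

```sql
-- Hypothetical table: filter column first, sort column second
CREATE TABLE IF NOT EXISTS emp_demo (
  id       INT PRIMARY KEY AUTO_INCREMENT,
  dept_id  INT NOT NULL,
  hired_at DATE NOT NULL,
  name     VARCHAR(50) NOT NULL,
  KEY idx_dept_hired (dept_id, hired_at)
);

-- Rows for one dept_id are already ordered by hired_at inside the index,
-- so Extra normally does not show "Using filesort"
EXPLAIN SELECT id, dept_id, hired_at FROM emp_demo
WHERE dept_id = 3 ORDER BY hired_at;

-- name is not part of the index, so Extra normally shows "Using filesort"
EXPLAIN SELECT id, dept_id, hired_at FROM emp_demo
WHERE dept_id = 3 ORDER BY name;

-- When FileSort cannot be avoided, give this session a larger sort buffer (1 MB)
SHOW VARIABLES LIKE 'sort_buffer_size';
SET SESSION sort_buffer_size = 1048576;
```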

6. GROUP BY optimization, LIMIT pagination optimization

GROUP BY optimization

GROUP BY uses indexes on almost the same principle as ORDER BY. GROUP BY can use an index directly even when there is no filter condition that uses it. GROUP BY sorts first and then groups, following the leftmost-prefix rule of the index.

When index columns cannot be used, increase max_length_for_sort_data and sort_buffer_size.

WHERE is more efficient than HAVING: if a condition can be written in WHERE, do not write it in HAVING.

Reduce the use of ORDER BY: agree with the business side to skip sorting where possible, or do the sorting on the client. Statements such as ORDER BY, GROUP BY, and DISTINCT consume a lot of CPU, and the database's CPU resources are extremely precious.

For statements that include ORDER BY, GROUP BY, or DISTINCT, keep the result set filtered by the WHERE condition within 1000 rows; otherwise the SQL will be very slow.
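The WHERE versus HAVING advice can be sketched with a hypothetical table sales_demo. Both queries return the same result, but the second discards rows before grouping instead of after.

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE IF NOT EXISTS sales_demo (
  id      INT PRIMARY KEY AUTO_INCREMENT,
  dept_id INT NOT NULL,
  amount  DECIMAL(10, 2) NOT NULL,
  KEY idx_dept (dept_id)
);

-- Filter written in HAVING: every department is grouped first,
-- then all groups except one are thrown away
SELECT dept_id, SUM(amount) AS total
FROM sales_demo
GROUP BY dept_id
HAVING dept_id = 3;

-- Filter written in WHERE: only the matching rows are grouped,
-- and idx_dept can be used to find them
SELECT dept_id, SUM(amount) AS total
FROM sales_demo
WHERE dept_id = 3
GROUP BY dept_id;
```

HAVING is still the right place for conditions on aggregates, such as SUM(amount) > 1000, which cannot be expressed in WHERE.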

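LIMIT pagination optimization
# Before optimization, not recommended
SELECT * FROM tb1 LIMIT 2000000, 10;

# Optimization 1: do the paging on the index, then go back to the table by primary key
SELECT * FROM tb1 t1,
(SELECT id FROM tb1 ORDER BY id LIMIT 2000000, 10) t2
WHERE t1.id = t2.id;

# Optimization 2: if the primary key is auto-increment (and has no gaps), use WHERE to jump straight to the position
SELECT * FROM tb1 WHERE id > 2000000 ORDER BY id LIMIT 10;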
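When the primary key has gaps (for example after deletes), a common variation is to remember the last id of the previous page instead of computing a position. The sketch below assumes a hypothetical table page_demo, and the value 10 stands for whatever id the previous page ended on.

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE IF NOT EXISTS page_demo (
  id    INT PRIMARY KEY AUTO_INCREMENT,
  title VARCHAR(100) NOT NULL
);

-- First page
SELECT * FROM page_demo ORDER BY id LIMIT 10;

-- Next page: start after the last id seen on the previous page
-- (10 is a placeholder for that value)
SELECT * FROM page_demo WHERE id > 10 ORDER BY id LIMIT 10;
```

This approach reads only the rows it returns, but it supports "next page" navigation rather than jumping to an arbitrary page number.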

7. Covering index

Definition: When the index already contains all the information the query needs, there is no need to go back to the table.

Benefits:

Avoids the second lookup that InnoDB makes in the table after searching a secondary index (going back to the table)

Random IO can be turned into sequential IO, which speeds up the query
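A covering index can be recognized in EXPLAIN by "Using index" in the Extra column. The sketch below uses a hypothetical table cover_demo; in InnoDB a secondary index also stores the primary key, so id counts as covered.

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE IF NOT EXISTS cover_demo (
  id    INT PRIMARY KEY AUTO_INCREMENT,
  name  VARCHAR(50) NOT NULL,
  age   INT NOT NULL,
  email VARCHAR(100) NOT NULL,
  KEY idx_name_age (name, age)
);

-- id, name and age are all in idx_name_age, so the table is not visited;
-- Extra typically shows "Using index"
EXPLAIN SELECT id, name, age FROM cover_demo WHERE name = 'Alice';

-- email is not in the index, so each match requires going back to the table
EXPLAIN SELECT * FROM cover_demo WHERE name = 'Alice';
```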

8. Index condition push down (ICP)

Explanation: When a non-clustered (secondary) index is used, the parts of the WHERE condition that refer only to index columns are checked against the index entries first, before going back to the table, which reduces the number of rows fetched from the table.

Conditions of use:

ICP can be used when the table access type is range, ref, eq_ref, or ref_or_null

ICP can be used for InnoDB and MyISAM tables, including partitioned InnoDB and MyISAM tables

For InnoDB tables, ICP is only used for secondary indexes. The goal of ICP is to reduce the number of full-row reads and thereby reduce I/O operations.

When the SQL uses a covering index, ICP is not supported, because using ICP in that case would not reduce I/O.

Conditions that refer to subqueries cannot use ICP
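The effect of ICP can be observed by toggling it through optimizer_switch. The table people_demo and its values are hypothetical; when ICP is applied, the Extra column of EXPLAIN shows "Using index condition".

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE IF NOT EXISTS people_demo (
  id       INT PRIMARY KEY AUTO_INCREMENT,
  zipcode  VARCHAR(10) NOT NULL,
  lastname VARCHAR(50) NOT NULL,
  address  VARCHAR(100) NOT NULL,
  KEY idx_zip_last (zipcode, lastname)
);

-- zipcode is used for the index lookup; the lastname condition cannot
-- narrow the lookup but can be checked on the index entry before the
-- full row is read; address can only be checked on the full row
EXPLAIN SELECT * FROM people_demo
WHERE zipcode = '95054'
  AND lastname LIKE '%etrunia%'
  AND address LIKE '%Main Street%';

-- Turn ICP off for this session and compare the plan
SET optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT * FROM people_demo
WHERE zipcode = '95054'
  AND lastname LIKE '%etrunia%'
  AND address LIKE '%Main Street%';

-- Restore the default
SET optimizer_switch = 'index_condition_pushdown=on';
```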

9. Other query optimization strategies
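The difference between EXISTS and IN
# When table B is small, use IN
SELECT * FROM A WHERE cc IN (SELECT cc FROM B);

# When table A is small, use EXISTS
SELECT * FROM A WHERE EXISTS (SELECT cc FROM B WHERE B.cc = A.cc);

COUNT(*) and COUNT(column) differ, and so do MyISAM and InnoDB: COUNT(*) counts every row while COUNT(column) skips rows where the column is NULL; MyISAM stores the table's exact row count, whereas InnoDB has to count rows by scanning an index.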

4. Other tuning strategies for the database

1. The goal of tuning

Save system resources as much as possible so that the system can serve a greater load. (higher throughput)

Design the structure and adjust parameters sensibly to improve the response speed of user operations. (faster response)

Reduce system bottlenecks and improve the overall performance of the MySQL database.

2. How to locate the tuning problem

User feedback (main)

Log analysis (main)

Server resource usage monitoring

Database internal status monitoring

3. Dimensions and steps of tuning

First choose an appropriate database.

Optimize table design

The table structure should follow the three normal forms as far as possible

If there are many queries, especially multi-table join queries, denormalization can be used to improve query efficiency.

Choose appropriate data types.

Optimize logical queries

Optimize physical queries

Use Redis or Memcached as a cache

Database-level optimization

Read-write separation

Data sharding

4. Optimize the MySQL server

Optimize server hardware

Configure more memory to reduce the number of disk IOs, or increase the buffer capacity.

Configure a high-speed disk system

Distribute disk IO sensibly

Configure multiple processors

Optimize MySQL parameters

innodb_buffer_pool_size: the size of the InnoDB buffer pool, which caches table and index data

key_buffer_size: the size of the index buffer (used for MyISAM tables)

table_cache: the number of tables that can be open at the same time (named table_open_cache in current versions)

query_cache_size: the size of the query cache (the query cache was removed in MySQL 8.0)

query_cache_type: whether the query cache is used

sort_buffer_size: the size of the buffer allocated to each thread that needs to sort

join_buffer_size = 8M: the size of the buffer available to join operations

read_buffer_size: the size of the buffer allocated for each table that a thread scans sequentially

innodb_flush_log_at_trx_commit: when the log buffer is written and flushed to the log file

innodb_log_buffer_size: the size of the buffer used by the transaction log

max_connections: the maximum number of connections allowed to MySQL

back_log: the size of the backlog queue for incoming requests when listening on the TCP port

thread_cache_size: the number of threads the server keeps cached for reuse

wait_timeout: the number of seconds the server waits for activity on a non-interactive connection before closing it

interactive_timeout: the number of seconds the server waits for activity on an interactive connection before closing it
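Before changing any of these parameters it is worth checking their current values. The sketch below inspects a few of them and changes two at runtime; the numbers are example values only, and SET GLOBAL requires administrative privileges. Runtime changes are lost on restart unless they are also written to my.cnf.

```sql
-- Inspect current values
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'max_connections';
SHOW VARIABLES WHERE Variable_name IN
  ('sort_buffer_size', 'join_buffer_size', 'read_buffer_size');

-- Change a server-wide limit at runtime (example value)
SET GLOBAL max_connections = 500;

-- Change a per-thread buffer for the current session only (8 MB)
SET SESSION join_buffer_size = 8388608;
```

Per-thread buffers such as sort_buffer_size and join_buffer_size are allocated per connection, so raising them globally multiplies memory use by the number of active connections.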

5. Optimize the database structure
Split tables: separate hot and cold data

Add intermediate tables

Add redundant fields

Optimize data types

Optimizations for integer types

When choosing between a text type and an integer type, prefer the integer type

Avoid TEXT and BLOB data types

Avoid ENUM, because ORDER BY on an ENUM column is inefficient

Use timestamps to store time

Use DECIMAL fixed-point numbers instead of floating-point numbers

Optimize the speed of inserting records

Disable indexes beforehand

Disable uniqueness checks beforehand

Use bulk inserts

Try to use LOAD DATA INFILE instead of INSERT

Disable foreign key checks beforehand

Disable autocommit beforehand

Use NOT NULL constraints
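Several of the insert-speed suggestions can be combined in one loading session, roughly as sketched below. The table load_demo and its rows are hypothetical. Disabling the checks is only safe when the incoming data is already known to be free of duplicate keys and broken references.

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE IF NOT EXISTS load_demo (
  id  INT PRIMARY KEY,
  val VARCHAR(20) NOT NULL
);

-- Relax per-row work for this session before loading
SET autocommit = 0;
SET unique_checks = 0;
SET foreign_key_checks = 0;

-- One multi-row INSERT instead of many single-row statements
INSERT INTO load_demo (id, val) VALUES
  (1, 'a'),
  (2, 'b'),
  (3, 'c');

-- Commit once, then restore the defaults
COMMIT;
SET foreign_key_checks = 1;
SET unique_checks = 1;
SET autocommit = 1;
```

For MyISAM tables, non-unique indexes can additionally be suspended around the load with ALTER TABLE ... DISABLE KEYS and ALTER TABLE ... ENABLE KEYS.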
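Analyze, check, and optimize tables
# Analyze the table; this immediately refreshes the selectivity statistics of its indexes
ANALYZE TABLE tb1;

# Check the table for errors
CHECK TABLE tb1;

# Optimize the table; this only helps tables with large variable-length columns such as VARCHAR, BLOB, or TEXT
OPTIMIZE TABLE tb1;

The above methods each have advantages and disadvantages, so weigh them carefully before optimizing.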

6. Large table optimization

Limit the scope of the query

Read-write separation

Vertical splitting of databases and tables

Horizontal splitting

7. Other tuning operations

Server statement timeout handling

Create a general tablespace

Invisible indexes
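Two of these operations can be sketched briefly, assuming MySQL 8.0 and a hypothetical table inv_demo. An invisible index is still maintained but ignored by the optimizer, which makes it possible to test the effect of dropping an index without actually dropping it. The statement timeout applies to read-only SELECT statements and is given in milliseconds.

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE IF NOT EXISTS inv_demo (
  id INT PRIMARY KEY,
  c  INT NOT NULL,
  KEY idx_c (c)
);

-- Hide the index from the optimizer, check the plan, then restore it
ALTER TABLE inv_demo ALTER INDEX idx_c INVISIBLE;
EXPLAIN SELECT * FROM inv_demo WHERE c = 5;
ALTER TABLE inv_demo ALTER INDEX idx_c VISIBLE;

-- Abort SELECT statements in this session that run longer than 2 seconds
SET SESSION max_execution_time = 2000;

-- Or set the limit for a single statement with an optimizer hint (1 second)
SELECT /*+ MAX_EXECUTION_TIME(1000) */ * FROM inv_demo;
```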
