Commonly used MySQL optimization


Preface

MySQL Database Service is a fully managed database service for deploying cloud-native applications on the world's most popular open source database. Its advanced features, management tools and technical support are aimed at the highest levels of MySQL scalability, security, reliability and uptime.


Tip: The body of the article follows; the cases below are for reference.

1. What is MySQL optimization?

When an SQL statement is poorly optimized or performs very badly, the usual approach is to rewrite it so that the query returns the same result set as before while running faster. There are established methods and techniques to draw on when refactoring SQL. Today, let's talk about why MySQL should be optimized and how to optimize SQL in MySQL.

2. Optimization steps

1. EXPLAIN

To optimize MySQL, make good use of EXPLAIN to view the SQL execution plan.
In the EXPLAIN output, these are the five columns to focus on:

type column, join type: a good SQL statement should reach at least the range level; avoid the ALL level.
key column, the index name used: the value is NULL if no index is chosen. An index can be forced if necessary.
key_len column, index length.
rows column, number of rows scanned: this value is an estimate.
Extra column, details: common unfriendly values are Using filesort and Using temporary.
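
As a minimal sketch of reading a plan, any query can be prefixed with EXPLAIN. The table `table_name` and its columns here are placeholders for illustration, not a real schema.

```sql
-- Hypothetical table and columns, for illustration only.
-- In the output, look at the type, key, key_len, rows and Extra columns.
explain select id, name from table_name where id = 1;
```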

The common values of type are:

system : the table has only one row (a system table); little data, and often no disk IO is needed;

const : at most one matching row, found by comparing a primary key or unique index with constants;

eq_ref : equality lookup on a primary key or a non-null unique index (unique not null) in a join;

ref : equality lookup on a non-unique index;

range : index range scan;

index : full index scan;

ALL : full table scan

From fast to slow, the access types above rank: system > const > eq_ref > ref > range > index > ALL

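Be careful with null checks on fields in the where clause. Depending on the data and the index, a null check can lead the engine to give up the index and perform a full table scan; MySQL is able to use an index for IS NULL, so confirm with EXPLAIN.

When only one row is needed, use limit 1, so that MySQL can stop scanning as soon as it finds a match. The const type in EXPLAIN is only reached when the lookup is on a primary key or unique index.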

2. The IN list in an SQL statement should not contain too many values

MySQL optimizes IN by storing all the constants in the list in an array and sorting that array. However, if the list is long, the cost is still relatively high.
For example:

select id from table_name where num in(1,2,3)

For continuous values, use between instead of in where possible, or replace the in list with a join.
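
A minimal sketch of the between form of the query above, assuming `num` is an integer column (so that 1, 2 and 3 are the only values in the range):

```sql
-- Returns the same rows as "num in (1,2,3)" when num is an integer column.
select id from table_name where num between 1 and 3;
```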

3. The SELECT statement must specify the field names

SELECT * adds a lot of unnecessary cost (CPU, IO, memory, network bandwidth) and reduces the chance of using a covering index; when the table structure changes, the front end also needs to be updated. So list the field names directly after select.
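
To illustrate the covering-index point, here is a rough sketch. The index `idx_name_age` and the columns `name` and `age` are assumptions made up for this example.

```sql
-- Hypothetical index that contains every column the query below needs.
alter table table_name add index idx_name_age (name, age);

-- Only indexed columns are selected, so the rows can be read from the index alone;
-- EXPLAIN typically reports "Using index" in the Extra column for such a query.
select name, age from table_name where name = 'zhangsan';

-- select * would need columns outside the index and lose this benefit.
```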

4. If the sort field is not indexed, sort as little as possible

5. If the other fields in the where conditions are not indexed, use or as little as possible

If either of the fields on the two sides of or is not indexed, and no other condition uses an indexed field, the query will not use an index. Replacing or with union all, or with union when duplicates must be removed, often gives better results.
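
A sketch of the rewrite, assuming separate (hypothetical) indexes exist on `num` and on `user_name`. union is used here rather than union all because a single row could satisfy both conditions.

```sql
-- Original form:
--   select id from table_name where num = 10 or user_name = 'zhangsan';
-- Rewritten so that each branch can use its own index:
select id from table_name where num = 10
union
select id from table_name where user_name = 'zhangsan';
```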

6. Try to use union all instead of union

The main difference between union and union all is that union has to merge the result sets and then filter out duplicates, which involves sorting, adds a lot of CPU work, and increases resource consumption and latency. Of course, union all is only appropriate when the two result sets contain no duplicate data.

7. Do not use ORDER BY RAND()

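select id from `table_name` order by rand() limit 1000;

The above sql statement can be optimized as:

select id from `table_name` t1 join
(select rand() * (select max(id) from `table_name`) as nid) t2
on t1.id > t2.nid limit 1000;

Note that the rewritten query picks one random starting id and returns up to 1000 rows with larger ids, so the result is not 1000 independently random rows.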

8. Distinguish between in and exists, not in and not exists

select * from A where id in (select id from B)

The above sql statement is equivalent to

select * from A where exists
(select * from B where B.id=A.id)

The main reason for distinguishing between in and exists is the change in driving order (this is the key to the performance difference). With exists, the outer table is the driving table and is accessed first. With in, the subquery is executed first. So in suits the case where the outer table is large and the inner table is small; exists suits the case where the outer table is small and the inner table is large.

Regarding not in and not exists, not exists is recommended, not only for efficiency: not in can also give logically unexpected results (if the subquery returns a NULL, not in matches no rows). How can the statement be written efficiently with the same result as not exists? Original sql statement:

select colname … from A a where a.id not in (select b.id from B b)

Efficient sql statement:

select colname … from A a left join B b
on a.id = b.id where b.id is null

The result set contains the rows of table A that are not in table B.
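
The logic problem with not in can be reproduced with a small sketch. The two tables and their contents are made up for this example; the key assumption is that `B.id` is nullable and holds a NULL.

```sql
-- Hypothetical tables, for illustration only.
create table A (id int);
create table B (id int);
insert into A values (1), (2);
insert into B values (1), (null);

-- Returns no rows: "2 not in (1, null)" evaluates to NULL rather than true.
select id from A where id not in (select id from B);

-- Returns the row with id = 2.
select a.id from A a where not exists (select 1 from B b where b.id = a.id);
```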

9. Use reasonable paging methods to improve the efficiency of paging

select id,name from table_name limit 866613, 20

With the above sql statement for paging, you may find that as the table grows, paging directly with limit gets slower and slower, because MySQL has to read and discard all the rows before the offset.
The optimization is as follows: take the largest id on the previous page and use it as the starting point of the next page. For example, if the largest id on the previous page is 866612, the sql can be written as follows:

select id,name from table_name where id > 866612 order by id limit 20

10. Segmented query

On pages where users choose a time range, some users may pick a range that is too large, causing slow queries. The main cause is that too many rows are scanned. In this case the program can query segment by segment in a loop and merge the results for display.
For a sql statement that scans millions of rows or more, a segmented query is worth considering.
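
As a rough sketch of segmenting, assume a hypothetical table `orders` with an indexed `created_at` column, and a user who selected a range of two months. Instead of one query over the whole range, the program issues one query per month and merges the results itself.

```sql
-- Hypothetical table and columns, for illustration only.
-- Segment 1
select id, amount from orders
where created_at >= '2020-01-01' and created_at < '2020-02-01';

-- Segment 2
select id, amount from orders
where created_at >= '2020-02-01' and created_at < '2020-03-01';
```

The half-open ranges (>= start, < end) keep a row from appearing in two segments.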

11. Avoid fuzzy queries with a leading %

For example, LIKE "%name" or LIKE "%name%" makes the index unusable and causes a full table scan. LIKE "name%", however, can use the index.
How, then, to query %name%?
Even when an index is added to the field, EXPLAIN shows that it is not used for such a query.

So how do we solve this problem? The answer: use a full-text index.
Queries like the following are common:

select id,fnum,fdst from table_name where user_name like '%zhangsan%'

For such statements, ordinary indexes cannot meet the query requirements. Fortunately, MySQL has full-text indexes to help.
The sql syntax for creating a full-text index is:

ALTER TABLE `table_name` ADD FULLTEXT INDEX `idx_user_name` (`user_name`);

The sql statement that uses the full-text index is:

select id,fnum,fdst from table_name
where match(user_name) against('zhangsan' in boolean mode);

Note that full-text search matches whole words, so its results can differ from like '%zhangsan%', which matches any substring.

12. Avoid expression operations on fields in the where clause

For example:

select user_id,user_project from table_name where age * 2 = 18;

This performs an arithmetic operation on the field itself, which causes the engine to give up using the index. It is recommended to change it to:

select user_id,user_project from table_name where age = 18 / 2;

13. Avoid implicit type conversion

Implicit type conversion occurs when the type of a column in the where clause does not match the type of the parameter passed in, and it can prevent the index from being used. It is recommended to determine the parameter type first and pass a value of the same type as the column.
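
A small sketch of the problem, assuming a hypothetical `phone` column of type varchar with an index on it:

```sql
-- Numeric literal against a string column: values are converted for the comparison,
-- so the index on phone cannot be used to look up the row.
select id from table_name where phone = 13800000000;

-- String literal matches the column type, so the index can be used.
select id from table_name where phone = '13800000000';
```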

14. Joint indexes must follow the leftmost prefix rule

For example, if the index contains the fields id, name and school, a query can use the id field alone, or id and name in that order, but a query on name and school alone cannot use this index. So pay attention to the order of the fields when creating a joint index, and put the most frequently queried fields first.
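
The rule can be sketched with the index from the example above. The index name `idx_id_name_school` and the literal values are assumptions for illustration.

```sql
-- Hypothetical joint index on (id, name, school).
alter table table_name add index idx_id_name_school (id, name, school);

-- Can look up rows through the index: the conditions start from the leftmost column.
select id, name, school from table_name where id = 1;
select id, name, school from table_name where id = 1 and name = 'zhangsan';

-- Cannot look up rows through the index: the leftmost column id is missing,
-- so at best the whole index is scanned.
select id, name, school from table_name where name = 'zhangsan' and school = 'xx';
```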

15. If necessary, use force index to make the query use a particular index

Sometimes the MySQL optimizer picks the index it considers appropriate for a SQL statement, but that index may not be the one we want. In that case, force index can be used to make the optimizer use the index we specify.
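
A minimal sketch of the syntax, assuming a hypothetical ordinary index `idx_num` on the `num` column:

```sql
-- force index goes right after the table name it applies to.
select id, num from table_name force index (idx_num)
where num between 1 and 100;
```

Comparing the EXPLAIN output with and without the hint is a reasonable way to check whether forcing the index actually helps.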

16. Pay attention to range conditions

For a joint index, if there is a range condition such as between, > or <, the index fields after the range field cannot be used.
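
As an illustration, assume a hypothetical joint index on (id, name, school):

```sql
-- The range on id uses the first index column; name comes after the range column,
-- so it cannot narrow the index range further and is only checked as a filter.
select id, name, school from table_name where id > 100 and name = 'zhangsan';

-- With equality on id, the index can be used for name as well.
select id, name, school from table_name where id = 100 and name = 'zhangsan';
```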

17. About JOIN optimization

In a join with table A on the left and table B on the right:

LEFT JOIN: table A is the driving table
INNER JOIN: MySQL automatically picks the table with less data as the driving table
RIGHT JOIN: table B is the driving table
Note: MySQL has no full join; it can be emulated as follows:

select * from A left join B on B.name = A.name
union all
select * from A right join B on B.name = A.name
where A.name is null;

Try to use inner join and avoid left join

A join query involves at least two tables, which generally differ in size. With an inner join and no other filter conditions, MySQL automatically chooses the small table as the driving table, but a left join follows the principle that the left drives the right: the table on the left of left join is the driving table.

Reasonable use of indexes

Use an indexed field of the driven table as the condition field in on.

Use small tables to drive large tables

Intuitively, if the driving table is smaller, the nested loop runs fewer iterations, which reduces the total IO and the number of CPU operations.

Clever use of STRAIGHT_JOIN

With an inner join, MySQL chooses the driving table, but in some special cases another table needs to be chosen as the driving table, for example when group by or order by results in "Using filesort" or "Using temporary".
STRAIGHT_JOIN forces the join order: the table on the left of STRAIGHT_JOIN is the driving table and the table on the right is the driven table. A prerequisite for using STRAIGHT_JOIN is that the query is an inner join.
STRAIGHT_JOIN is not recommended for other join types, otherwise the query results may be inaccurate.

This method can sometimes make the query about three times faster.
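
A minimal sketch of the syntax. The tables `small_table` and `big_table` and the column `small_id` are hypothetical names chosen for this example.

```sql
-- small_table is on the left, so it is read first (driving table);
-- big_table is the driven table and should have an index on small_id.
select s.id, b.name
from small_table s straight_join big_table b on b.small_id = s.id;
```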

Summary

That is all for today. This article introduced some methods of MySQL optimization.

Origin blog.csdn.net/qq_37823979/article/details/108982700