SQL performance optimization summary


1. Performance optimization strategy

1. The IN list in a SQL statement should not contain too many values
MySQL stores the constants of an IN list in a sorted array, but the cost still grows as the list gets longer. For continuous values, use BETWEEN instead of IN whenever you can.
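
As a minimal sketch of this tip, the two queries below return the same rows for an integer column. The table `orders_demo` and its columns are hypothetical names used only for illustration.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE orders_demo (
    id     INT PRIMARY KEY,
    amount DECIMAL(10, 2) NOT NULL
);

-- Enumerating a continuous range with IN:
SELECT id, amount
FROM orders_demo
WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

-- The same rows expressed as a range:
SELECT id, amount
FROM orders_demo
WHERE id BETWEEN 1 AND 10;
```

The equivalence holds because `id` is an integer column; for non-integer types a range can match values that the enumerated list does not.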

2. The SELECT statement must specify the field name
SELECT * adds a lot of unnecessary overhead, so list the required field names explicitly after SELECT.

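3. When only one row of data is needed, use LIMIT 1
LIMIT 1 lets the engine stop as soon as it finds the first matching row. Note that the const access type in the type column of EXPLAIN comes from an equality lookup on a primary key or unique index, not from LIMIT 1 itself.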

4. If the sorting field does not use an index, sort as little as possible

5. If some of the fields in the condition are not indexed, use OR as little as possible
If the field on either side of an OR is not indexed, and none of the other conditions uses an indexed field, the query cannot use an index.

6. Try to use union all instead of union.
The main difference between UNION and UNION ALL is that UNION merges the result sets and then removes duplicate rows, which involves sorting and adds a lot of CPU work. Of course, UNION ALL is only a valid replacement when the two result sets contain no duplicate data.
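
The difference can be sketched with two small tables. The table names and sample rows below are hypothetical and exist only for this illustration.

```sql
-- Hypothetical tables and rows, used only for this illustration.
CREATE TABLE customers_2023 (id INT PRIMARY KEY, name VARCHAR(50) NOT NULL);
CREATE TABLE customers_2024 (id INT PRIMARY KEY, name VARCHAR(50) NOT NULL);

INSERT INTO customers_2023 (id, name) VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO customers_2024 (id, name) VALUES (3, 'Bob'), (4, 'Cid');

-- UNION removes duplicates, so 'Bob' appears once (3 rows in total).
SELECT name FROM customers_2023
UNION
SELECT name FROM customers_2024;

-- UNION ALL skips the de-duplication step and keeps both 'Bob' rows (4 rows).
SELECT name FROM customers_2023
UNION ALL
SELECT name FROM customers_2024;
```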

7. Choose between IN and EXISTS according to the situation

With EXISTS, the outer table is the driving table and is accessed first. With IN, the subquery is executed first. So IN suits cases where the outer table is large and the inner table is small, and EXISTS suits cases where the outer table is small and the inner table is large. Recent MySQL versions may rewrite both forms into the same plan, so confirm with EXPLAIN.

select * from table_a where id in (select id from table_b)

-- Rewritten with EXISTS
select * from table_a where exists(select * from table_b where table_b.id=table_a.id)

8. Use reasonable paging methods to improve paging efficiency

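-- As the table grows, paging directly with LIMIT offset, count gets slower and slower
select id,name from table_name order by id limit 866612, 20

-- Optimized: take the largest id of the previous page and use it as the starting point of the next page.
-- In this example the largest id of the previous page is 866612 (assuming ids are contiguous from 1).
select id,name from table_name where id > 866612 order by id limit 20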

9. Segmented query
On pages where users choose a time range, some users select a range that is too large, which makes the query slow. The main reason is that too many rows are scanned. In this case, the program can split the range into segments, query them in a loop, and combine the results for display.

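10. Avoid NULL checks on fields in the WHERE clause

A NULL check can cause the engine to give up the index and perform a full table scan. The behavior depends on the engine and version (MySQL can often use an index for IS NULL on an indexed column), so confirm with EXPLAIN.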

11. Fuzzy queries with a leading % are not recommended
Queries such as LIKE '%name' or LIKE '%name%' cannot use the index to narrow the search and lead to a full table scan. If you need better efficiency, consider full-text search.
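
A minimal sketch of the three variants, assuming MySQL with InnoDB (which supports FULLTEXT indexes from version 5.6). The table `articles_demo` and its index names are hypothetical.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE articles_demo (
    id    INT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200) NOT NULL,
    KEY idx_title (title),
    FULLTEXT KEY ft_title (title)
) ENGINE = InnoDB;

-- Leading wildcard: the B-tree index cannot be used to narrow the search.
SELECT id, title FROM articles_demo WHERE title LIKE '%index%';

-- Prefix match: idx_title can be used as a range lookup.
SELECT id, title FROM articles_demo WHERE title LIKE 'index%';

-- Full-text search through the FULLTEXT index.
SELECT id, title
FROM articles_demo
WHERE MATCH(title) AGAINST('index' IN NATURAL LANGUAGE MODE);
```

Full-text search matches whole words rather than arbitrary substrings, so it is not a drop-in replacement for every LIKE '%...%' query.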

12. Avoid expression operations on fields in where clauses

-- This makes the engine give up using the index
select user_id,user_project from table_name where age*2=36;

-- Rewrite the query like this to improve performance
select user_id,user_project from table_name where age=36/2;

13. Avoid implicit type conversions

A type conversion occurs when the type of the column in the WHERE clause does not match the type of the parameter passed in, and the converted comparison may be unable to use the index. Determine the column type first and pass a parameter of the same type.
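
As a sketch of the problem in MySQL: when a string column is compared with a number, both sides are compared as numbers and the index on the column cannot be used for the lookup. The table `users_demo` and the phone value are hypothetical.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE users_demo (
    id    INT AUTO_INCREMENT PRIMARY KEY,
    phone VARCHAR(20) NOT NULL,
    KEY idx_phone (phone)
);

-- Numeric literal against a VARCHAR column: implicit conversion,
-- idx_phone cannot be used for the lookup.
EXPLAIN SELECT id FROM users_demo WHERE phone = 13800000000;

-- Literal of the same type as the column: idx_phone can be used.
EXPLAIN SELECT id FROM users_demo WHERE phone = '13800000000';
```

Comparing the `type` and `key` columns of the two EXPLAIN results shows whether the index was chosen.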

14. For joint indexes, the leftmost prefix rule must be followed

A query can use a joint index only if its conditions include the leftmost field or fields of that index. When creating a joint index, pay attention to the order of the index fields and put the most commonly queried fields first.

15. Pay attention to range conditions on joint indexes

If a field of the joint index is used in a range condition such as BETWEEN, > or <, the index fields after it can no longer be used to narrow the lookup.
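
The leftmost prefix rule and the effect of a range condition can be checked with EXPLAIN. The sketch below assumes a hypothetical table `orders_idx_demo` with a three-field joint index; the comments describe the typical outcome, and the actual plan depends on the MySQL version and the data.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE orders_idx_demo (
    id         INT AUTO_INCREMENT PRIMARY KEY,
    user_id    INT NOT NULL,
    status     TINYINT NOT NULL,
    created_at DATETIME NOT NULL,
    KEY idx_user_status_created (user_id, status, created_at)
);

-- Starts at the leftmost field: both user_id and status narrow the lookup.
EXPLAIN SELECT id FROM orders_idx_demo WHERE user_id = 42 AND status = 1;

-- Skips the leftmost field: the index cannot be used for a direct lookup.
EXPLAIN SELECT id FROM orders_idx_demo WHERE status = 1;

-- Range on user_id: status, which comes after it, no longer narrows the lookup.
EXPLAIN SELECT id FROM orders_idx_demo WHERE user_id > 42 AND status = 1;
```

The `key` and `key_len` columns of the output indicate which index was chosen and how many of its fields were used.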

16. Try to use inner join and avoid left join

A join involves at least two tables, and they generally differ in size. With an inner join, MySQL automatically selects the smaller table as the driving table if there are no other filtering conditions. A left join, however, follows the rule that the left side drives the right side, that is, the table on the left of the left join is the driving table.

17. Try to avoid full table scans
To optimize queries, try to avoid full table scans. First, consider creating indexes on the columns involved in WHERE and ORDER BY.

18. Try to use numeric fields
Fields that contain only numeric information should not be designed as character types; doing so reduces query and join performance and increases storage overhead.

19. Use varchar instead of char as much as possible
First, variable-length fields take less storage space. Second, searching within a smaller field is more efficient.

20. Try to avoid returning large amounts of data to the client.
If the amount of data is too large, you should consider whether the corresponding requirements are reasonable.

21. Use table aliases.
When joining multiple tables in a SQL statement, use table aliases and prefix each column with its alias. This reduces parsing time and avoids syntax errors caused by ambiguous column names. Queries that alias table and column names with a single letter are reportedly 1.5 times faster than queries that spell out the joined table names.

22. Use "temporary table" to temporarily store intermediate results.
Store the intermediate results in a temporary table, and run the subsequent queries against it (in tempdb on SQL Server). This avoids scanning the main table several times in the program, greatly reduces blocking, and improves concurrency.
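
A minimal sketch of the idea in MySQL syntax (the text above mentions tempdb, which is the SQL Server equivalent). The tables `sales_demo` and `tmp_region_totals` are hypothetical names.

```sql
-- Hypothetical main table, used only for this illustration.
CREATE TABLE sales_demo (
    id     INT AUTO_INCREMENT PRIMARY KEY,
    region VARCHAR(20) NOT NULL,
    amount DECIMAL(10, 2) NOT NULL
);

-- Scan the main table once and keep the intermediate result.
CREATE TEMPORARY TABLE tmp_region_totals AS
SELECT region, SUM(amount) AS total
FROM sales_demo
GROUP BY region;

-- Later queries read the small temporary table instead of the main table.
SELECT region, total FROM tmp_region_totals WHERE total > 1000;
SELECT MAX(total) AS best_total FROM tmp_region_totals;

-- The table disappears when the session ends; dropping it earlier frees resources.
DROP TEMPORARY TABLE tmp_region_totals;
```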

23. Calculate the results beforehand
Pre-calculate the results that need to be queried and put them in the table, and then select them when querying.

24. Do not join more than 5 tables, and use fewer subqueries.

25. IN optimization
In the list of values following IN, put the value that appears most frequently at the front and the value that appears least frequently at the end to reduce the number of comparisons. This helps only on engines that check the list sequentially.

26. Try to put the data processing work on the server to reduce network overhead.

27. Allocate the number of threads reasonably
When the server has enough memory, set the number of threads = maximum number of connections + 5 to maximize efficiency. Otherwise, set the number of threads < the maximum number of connections and enable the SQL Server thread pool. If the number of threads is still set to the maximum number of connections + 5 in that case, server performance will suffer seriously.

28. Pay attention to index maintenance, periodically rebuild indexes, and recompile stored procedures.

29. Batch insert or batch update
When many rows need to be inserted or updated, use a batch insert or batch update; never process the records one by one.
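
A minimal sketch of both cases in MySQL syntax. The table `log_demo` and its values are hypothetical, and the CASE-based update is one possible way to batch several row updates into a single statement.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE log_demo (
    id  INT AUTO_INCREMENT PRIMARY KEY,
    msg VARCHAR(100) NOT NULL
);

-- Row by row: one statement (and one round trip) per record.
INSERT INTO log_demo (msg) VALUES ('a');
INSERT INTO log_demo (msg) VALUES ('b');

-- Batch insert: several rows in a single statement.
INSERT INTO log_demo (msg) VALUES ('c'), ('d'), ('e');

-- Batch update: different new values for several rows in one statement.
UPDATE log_demo
SET msg = CASE id
              WHEN 1 THEN 'x'
              WHEN 2 THEN 'y'
          END
WHERE id IN (1, 2);
```

The WHERE clause in the update matters: without it, rows not listed in the CASE expression would be set to NULL, which the NOT NULL column rejects.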

30. Use loops as little as possible
In stored procedures, anything that can be implemented with a SQL statement should never be implemented with a loop.

31. Choose the most efficient order of table names.
Oracle's parser processes the table names in the FROM clause from right to left, so the table written last in the FROM clause (the base table) is processed first. When the FROM clause contains multiple tables, select the table with the fewest records as the base table.

32. Improve the efficiency of GROUP BY statement
Filter out unnecessary records with WHERE before the GROUP BY, instead of filtering with HAVING afterwards.

-- Inefficient
SELECT JOB , AVG(SAL)
FROM EMP
GROUP BY JOB
HAVING JOB = 'PRESIDENT'
OR JOB = 'MANAGER'

-- Efficient
SELECT JOB , AVG(SAL)
FROM EMP
WHERE JOB = 'PRESIDENT'
OR JOB = 'MANAGER'
GROUP BY JOB

33. Use uppercase letters for SQL statements
Oracle always parses the SQL statement first and converts lowercase letters to uppercase before executing it, so writing statements in uppercase saves that step.

34. Avoid deadlocks.
The amount of data involved in a transaction should be reduced as much as possible, and never wait for user input in a transaction.

35. It is best not to use triggers
Firing a trigger and executing the trigger event is itself a resource-consuming process. If something can be implemented with constraints, try not to use a trigger.

36. The use of spaces should be minimized.
When writing SQL statements, keep the use of spaces to a minimum, especially at the beginning and end of the statement, because the query cache does not automatically trim leading and trailing spaces.

37. Set an ID as the primary key for each table in the database
The ID should preferably be of type INT, with the AUTO_INCREMENT flag set so that it increases automatically.

38. MySQL can enable the query cache.
This is one of the effective ways to improve database performance. When the same query is executed multiple times, retrieving the data from the cache is much faster than reading it from the database again. Note that the query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so this tip applies only to older versions.

39. Use EXPLAIN on SELECT queries to see how they are executed.
The EXPLAIN keyword shows how MySQL processes your SQL statement. This helps you analyze the performance bottlenecks of your queries or table structures.
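
A minimal sketch of using EXPLAIN, assuming a hypothetical table `emp_demo` with an index on `dept_id`.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE emp_demo (
    id      INT AUTO_INCREMENT PRIMARY KEY,
    dept_id INT NOT NULL,
    name    VARCHAR(50) NOT NULL,
    KEY idx_dept (dept_id)
);

-- Prefix the query with EXPLAIN to see the execution plan instead of the rows.
EXPLAIN SELECT id, name FROM emp_demo WHERE dept_id = 10;

-- Columns worth checking in the output:
--   type  : access method (for example ALL means a full table scan)
--   key   : the index actually chosen, or NULL if none
--   rows  : estimated number of rows examined
--   Extra : additional notes such as "Using filesort" or "Using temporary"
```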

40. Use LIMIT 1 when there is only one row of data.
Sometimes when you query a table, you already know that there will be only one result. In this case, adding LIMIT 1 can improve performance: the MySQL engine stops searching after it finds one row, instead of continuing to look for the next matching row.

41. Optimize the data types of the table
The principle is to keep data types simple and practical. To obtain better performance when creating a table, make the fields as narrow as possible. Give every field a default value and try to avoid NULL.

42. Move operations to the right of the equal sign whenever possible
Any operation on the column, including database functions and calculation expressions, results in a table scan. When querying, move the operation to the right side of the equal sign whenever possible.
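
The same rule applies to functions. As a sketch, the hypothetical table `events_demo` below has an index on `created_at`; wrapping the column in YEAR() hides it from the index, while an equivalent range on the bare column does not.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE events_demo (
    id         INT AUTO_INCREMENT PRIMARY KEY,
    created_at DATETIME NOT NULL,
    KEY idx_created (created_at)
);

-- Function applied to the column: idx_created cannot be used for a range lookup.
SELECT id FROM events_demo WHERE YEAR(created_at) = 2023;

-- Same rows, with the column left bare so the index can be used.
SELECT id
FROM events_demo
WHERE created_at >= '2023-01-01'
  AND created_at <  '2024-01-01';
```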

2. Index creation rules

Each index created on a table increases storage overhead and adds processing overhead to insert, delete, and update operations.
In addition, too many compound indexes are generally of no value when single-field indexes already exist. On the contrary, they reduce performance when data is added and deleted, and the negative impact is greater on frequently updated tables.

(1) The primary key and foreign keys of a table must have indexes;
(2) Tables with more than 300 rows should have indexes;
(3) Tables that are frequently joined to other tables should have indexes on the join fields;
(4) Fields that frequently appear in the WHERE clause, especially fields in large tables, should be indexed;
(5) Indexes should be built on highly selective fields;
(6) Indexes should be built on small fields; do not build indexes on large text fields or extremely long fields;
(7) Creating a composite index requires careful analysis; consider replacing it with a single-field index where possible;
(8) Choose the leading field of a composite index correctly, generally the field with the best selectivity;
(9) If the fields of a composite index often appear alone in the WHERE clause, split it into multiple single-field indexes;
(10) If a composite index contains more than 3 fields, carefully consider whether it is necessary and try to reduce the number of fields;
(11) If both single-field indexes and a compound index exist on the same fields, you can generally delete the compound index;
(12) Do not create too many indexes on tables that are modified frequently;
(13) Delete useless indexes to avoid negative impacts on the execution plan;
(14) Try not to index a field that contains a large number of duplicate values.

3. Query Optimization Summary

(1) Use the slow query log to find slow queries, use the execution plan to determine whether a query is running normally, and always test your queries to see whether they are running optimally;
(2) Avoid COUNT(*) on an entire table; it may lock the whole table;
(3) Keep queries consistent so that subsequent similar queries can use the query cache;
(4) Use indexed columns in the WHERE, GROUP BY and ORDER BY clauses, keep indexes simple, and do not include the same column in multiple indexes;
(5) For indexes with fewer than 5 fields, use LIMIT rather than OR when using UNION;
(6) To avoid a SELECT before an update, use INSERT ... ON DUPLICATE KEY UPDATE or INSERT IGNORE;
(7) Use UNION in the WHERE clause instead of subqueries, and consider persistent connections instead of multiple connections to reduce overhead;
(8) When the load on the server increases, use SHOW PROCESSLIST to find slow and problematic queries, and test all suspicious queries on mirrored data.
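
Item (6) can be sketched as follows in MySQL syntax. The table `page_hits_demo` and its values are hypothetical.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE page_hits_demo (
    page VARCHAR(100) PRIMARY KEY,
    hits INT NOT NULL DEFAULT 0
);

-- Insert the row if it is new, otherwise increment the counter,
-- without a separate SELECT to check for existence first.
INSERT INTO page_hits_demo (page, hits)
VALUES ('/home', 1)
ON DUPLICATE KEY UPDATE hits = hits + 1;

-- Insert the row only if it is new; an existing row is left unchanged.
INSERT IGNORE INTO page_hits_demo (page, hits)
VALUES ('/home', 1);
```

INSERT IGNORE also downgrades some other errors to warnings, so it is worth checking the warnings when data quality matters.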

Origin blog.csdn.net/m0_52861684/article/details/132894661