Summary of 52 SQL statement performance optimization strategies

1. To optimize a query, avoid full table scans whenever possible; first consider creating indexes on the columns used in the WHERE and ORDER BY clauses.

2. Try to avoid testing fields for NULL in the WHERE clause. NULL is the default for columns when a table is created, but most of the time you should declare columns NOT NULL, or use a sentinel value such as 0 or -1 as the default.
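As a sketch (the table t and column num are hypothetical), the difference looks like this:

```sql
-- A NULL test may prevent the engine from using the index on num:
SELECT id FROM t WHERE num IS NULL;

-- With num declared NOT NULL DEFAULT 0, the same intent stays indexable:
SELECT id FROM t WHERE num = 0;
```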

3. Avoid using the != or <> operators in the WHERE clause. MySQL uses indexes only for the following operators: <, <=, =, >, >=, BETWEEN, IN, and, in some cases, LIKE (when the pattern does not begin with a wildcard).

4. Try to avoid using OR to join conditions in the WHERE clause, otherwise the engine may abandon the index and perform a full table scan. You can use a UNION query instead: select id from t where num=10 union all select id from t where num=20.

5. Use IN and NOT IN with caution, as they can also cause a full table scan. For continuous values, use BETWEEN rather than IN: select id from t where num between 1 and 3.

6. The following queries will also cause full table scans: select id from t where name like '%abc%' and select id from t where name like '%abc'. To improve efficiency, consider full-text search. By contrast, select id from t where name like 'abc%' can use the index.
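Assuming a MySQL table t with a text column name, a full-text index is one way to avoid the leading-wildcard scan:

```sql
-- A leading wildcard cannot use a B-tree index (full table scan):
SELECT id FROM t WHERE name LIKE '%abc%';

-- A full-text index supports word search without that scan:
ALTER TABLE t ADD FULLTEXT INDEX ft_name (name);
SELECT id FROM t WHERE MATCH(name) AGAINST ('abc');

-- A trailing wildcard is a range scan and can use an ordinary index:
SELECT id FROM t WHERE name LIKE 'abc%';
```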

7. Using parameters (local variables) in the WHERE clause can also cause a full table scan, because the optimizer does not know the parameter's value at compile time and so cannot choose an index access plan for it.

8. Try to avoid performing expression operations on fields in the WHERE clause, and likewise avoid applying functions to fields in the WHERE clause; either prevents the engine from using the index on that column.
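The usual fix is to move the computation off the column so the bare column can match an index (t, num, and name are illustrative):

```sql
-- The expression on the column defeats the index:
SELECT id FROM t WHERE num / 2 = 100;
-- Rewritten so the column stands alone on the left:
SELECT id FROM t WHERE num = 100 * 2;

-- The same applies to functions:
SELECT id FROM t WHERE substring(name, 1, 3) = 'abc';
-- Rewritten as a prefix match that can use the index:
SELECT id FROM t WHERE name LIKE 'abc%';
```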

9. In many cases, using EXISTS instead of IN is a good choice. Replace select num from a where num in (select num from b) with: select num from a where exists(select 1 from b where b.num=a.num).

10. Although an index can improve the efficiency of the corresponding SELECT, it also reduces the efficiency of INSERT and UPDATE, because the index may need to be maintained on each insert or update. How to build indexes therefore needs careful, case-by-case consideration. A table should generally have no more than 6 indexes; if there are more, consider whether indexes on rarely used columns are really necessary.

11. Avoid updating clustered index columns as much as possible, because the order of the clustered index columns is the physical storage order of the table's records. Once a column value changes, the order of the entire table's records must be adjusted, which consumes considerable resources. If the application needs to update clustered index columns frequently, reconsider whether that index should be clustered.

12. Use numeric fields whenever possible. If a field contains only numeric information, try not to design it as a character type; that reduces query and join performance and increases storage overhead.

13. Use varchar/nvarchar instead of char/nchar where possible: variable-length fields take less storage, and searching within a smaller field is clearly more efficient.

14. Do not use select * from t to return everything; use a specific field list instead of "*", and do not return fields that will never be used.

15. Try to avoid returning large amounts of data to the client. If the volume is too large, consider whether the requirement itself is reasonable.

16. Use table aliases: when joining multiple tables in a SQL statement, use table aliases and prefix each column with its alias. This reduces parsing time and avoids syntax errors caused by column-name ambiguity.

17. Use "temporary tables" to hold intermediate results. Staging intermediate results in temporary tables is an important way to simplify SQL statements, but the benefits go further: because subsequent queries run against tempdb, this avoids scanning the main table multiple times and greatly reduces shared locks blocking update locks during execution, reducing blocking and improving concurrency.

18. NOLOCK can be added to some SQL query statements. Reads and writes block each other; to improve concurrency, NOLOCK can be added to certain queries so that writes are allowed while reading. The drawback is that uncommitted (dirty) data may be read.

There are 3 principles for using NOLOCK:

1. NOLOCK must not be added if the query result is used for an insert, delete, or update;

2. If the queried table is subject to frequent page splits, use NOLOCK with caution;

3. Using a temporary table can preserve a "before image" of the data, serving a function similar to Oracle's undo tablespace. If a temporary table can be used to improve concurrency, do not use NOLOCK.

19. Common simplification rules are as follows:

Do not join more than 5 tables; consider using temporary tables or table variables to stage intermediate results. Use subqueries sparingly, and do not nest views too deeply; generally, views should not be nested more than two levels.

20. Pre-calculate results that will be queried and store them in a table, then SELECT them at query time. This was the most important technique before SQL Server 7.0, for example in calculating hospitalization fees in hospital systems.

21. An OR condition can be decomposed into multiple queries connected by UNION. Their speed depends only on whether indexes are used. If the query needs a composite index, UNION ALL executes more efficiently. When multiple OR clauses prevent index use, rewrite them as a UNION so each branch can match an index. The key question is whether indexes are used.
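A minimal sketch of the rewrite, assuming separate indexes on num and name (all names illustrative):

```sql
-- The OR across different columns may defeat both indexes:
SELECT id FROM t WHERE num = 10 OR name = 'abc';

-- Each branch can use its own index:
SELECT id FROM t WHERE num = 10
UNION ALL
SELECT id FROM t WHERE name = 'abc';
```

UNION ALL skips the deduplication step and is faster; use plain UNION instead if a row could satisfy both conditions and duplicates matter.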

22. In the list of values after IN, put the most frequently occurring values first and the least frequent last, reducing the number of comparisons.

23. Try to do data processing on the server to reduce network overhead, for example by using stored procedures.

Stored procedures are SQL statements that have been compiled, optimized, and organized into an execution plan stored in the database; they are a collection of control-flow statements and are naturally fast. For dynamic SQL that is executed repeatedly, you can use a temporary stored procedure, which is placed in tempdb.

24. When the server has enough memory, configure the number of threads = maximum number of connections + 5 to maximize efficiency; otherwise, configure the number of threads < maximum number of connections and let the SQL Server thread pool handle the load. If the number is still set to maximum connections + 5, server performance is seriously harmed.

25. The order in which join conditions are written:

select a.personMemberID, * from chineseresume a, personmember b where personMemberID = b.referenceid and a.personMemberID = 'JCNPRH39681' (A = B, B = 'number')

select a.personMemberID, * from chineseresume a, personmember b where a.personMemberID = b.referenceid and a.personMemberID = 'JCNPRH39681' and b.referenceid = 'JCNPRH39681' (A = B, B = 'number', A = 'number')

select a.personMemberID, * from chineseresume a, personmember b where b.referenceid = 'JCNPRH39681' and a.personMemberID = 'JCNPRH39681' (B = 'number', A = 'number')

26. Use EXISTS rather than select count(1) to determine whether a record exists. The COUNT function should only be used when you actually need to count the rows in a table, and count(1) is more efficient than count(*).
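A sketch of the EXISTS form (t and num are illustrative):

```sql
-- Counting just to test existence does unnecessary work:
SELECT count(1) FROM t WHERE num = 10;

-- EXISTS can stop at the first matching row:
SELECT CASE WHEN EXISTS (SELECT 1 FROM t WHERE num = 10)
            THEN 1 ELSE 0 END AS has_row;
```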

27. Try to use ">=" instead of ">".

28. The use of index:

1. Index creation should be considered together with the application; it is recommended that large OLTP tables have no more than 6 indexes;

2. Use indexed fields as query conditions whenever possible, especially with the clustered index. If necessary, you can force a specific index with an index hint (index=index_name);

3. Avoid table scan when querying large tables, and consider creating new indexes when necessary;

4. When using an indexed field as a condition, if the index is a composite index, the first field of the index must appear in the condition to guarantee that the system uses the index; otherwise the index will not be used;

5. Pay attention to index maintenance: rebuild indexes periodically and recompile stored procedures.
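A hedged sketch of points 2 and 4 in MySQL syntax (idx_num and idx_ab are hypothetical index names):

```sql
-- Force a specific index when the optimizer picks the wrong one:
SELECT id FROM t FORCE INDEX (idx_num) WHERE num = 10;

-- A composite index (a, b) is only used when the leading column appears:
CREATE INDEX idx_ab ON t (a, b);
SELECT id FROM t WHERE a = 1 AND b = 2;  -- can use idx_ab
SELECT id FROM t WHERE b = 2;            -- cannot use idx_ab
```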

29. The columns in the following SQL conditions all have appropriate indexes, but the queries execute very slowly:

SELECT * FROM record WHERE substring(card_no,1,4) = '5378' (13 seconds)

SELECT * FROM record WHERE amount/30 < 1000 (11 seconds)

SELECT * FROM record WHERE convert(char(10),date,112) = '19991201' (10 seconds)

Analysis:

Any operation on a column in the WHERE clause must be computed row by row at run time, forcing a table scan and bypassing the index on that column.

If these results can be obtained at query compile time, the SQL optimizer can use the indexes and avoid the table scans, so the SQL is rewritten as follows:

SELECT * FROM record WHERE card_no LIKE '5378%' (< 1 second)

SELECT * FROM record WHERE amount < 1000*30 (< 1 second)

SELECT * FROM record WHERE date = '1999/12/01' (< 1 second)

30. When there is a batch of inserts or updates to perform, use batch inserts or batch updates; never update records one at a time.
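For example (MySQL syntax; the staging table and all names are illustrative):

```sql
-- One statement instead of N single-row inserts:
INSERT INTO t (id, num) VALUES (1, 10), (2, 20), (3, 30);

-- Batch update via a join against a staging table:
UPDATE t
JOIN t_staging s ON s.id = t.id
SET t.num = s.num;
```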

31. In all stored procedures, use set-based SQL statements; never use loops.

For example, to list every day of the previous month, use a connect by recursive query (Oracle) rather than looping from the first day to the last day of the month.
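A sketch of that recursive query in Oracle syntax:

```sql
-- Generate one row per day of the previous month
SELECT TRUNC(ADD_MONTHS(SYSDATE, -1), 'MM') + LEVEL - 1 AS day
FROM dual
CONNECT BY LEVEL <=
  TO_NUMBER(TO_CHAR(LAST_DAY(ADD_MONTHS(SYSDATE, -1)), 'DD'));
```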

32. Choose the most efficient table name sequence (only valid in rule-based optimizers):

Oracle's rule-based parser processes table names in the FROM clause from right to left, so the table written last in the FROM clause (the base table, or driving table) is processed first. When the FROM clause contains multiple tables, choose the table with the fewest records as the base table.

If more than 3 tables are joined, choose the intersection table as the base table. The intersection table is the table referenced by the other tables.

33. To improve the efficiency of GROUP BY, filter out unwanted records before the GROUP BY. The following two queries return the same results, but the second is obviously much faster.

Inefficient:

SELECT JOB, AVG(SAL)
FROM EMP
GROUP BY JOB
HAVING JOB = 'PRESIDENT'
OR JOB = 'MANAGER'

Efficient:

SELECT JOB, AVG(SAL)
FROM EMP
WHERE JOB = 'PRESIDENT'
OR JOB = 'MANAGER'
GROUP BY JOB

34. Write SQL statements in uppercase, because Oracle always parses the statement first, converting lowercase letters to uppercase, before executing it.

35. Use aliases. Aliasing is a technique for large databases: alias table and column names with a short letter in the query, and query speed can be 1.5 times faster than using full table-name prefixes.

36. Avoid deadlocks. Always access the same tables in the same order in your stored procedures and triggers; keep transactions as short as possible and minimize the amount of data they touch; never wait for user input inside a transaction.

37. Avoid temporary tables unless you really need them; use table variables instead. Most of the time (99%), table variables reside in memory, so they are faster. Temporary tables reside in the TempDb database, so operations on them require cross-database communication and are naturally slower.

38. It is best not to use triggers:

1. Firing a trigger and executing the trigger event is itself a resource-consuming process;

2. If you can use constraints to achieve, try not to use triggers;

3. Don't use the same trigger for different trigger events (Insert, Update and Delete);

4. Do not use transactional code in triggers.

39. Index creation rules:

1. The primary key and foreign key of the table must have indexes;

2. Tables with more than 300 rows should have indexes;

3. For tables that are frequently joined with other tables, an index should be created on the join field;

4. The fields that often appear in the Where clause, especially the fields of large tables, should be indexed;

5. Indexes should be built on highly selective fields;

6. Indexes should be built on small fields. Do not build indexes for large text fields or even long fields;

7. The establishment of composite index requires careful analysis, and try to consider using single-field index instead;

8. Correctly select the main column field in the composite index, generally the field with better selectivity;

9. Do several fields of a compound index often appear in the Where clause in AND mode at the same time? Are there few or no single-field queries? If so, you can build a composite index; otherwise, consider a single-field index;

10. If fields contained in a composite index often appear alone in the Where clause, break it down into multiple single-field indexes;

11. If the composite index contains more than 3 fields, then carefully consider its necessity and consider reducing the composite field;

12. If you have both a single-field index and a composite index on these fields, you can generally delete the composite index;

13. Do not build too many indexes for tables with frequent data operations;

14. Delete useless indexes to avoid negative impact on the execution plan;

15. Every index created on a table increases storage overhead, and indexes also add processing overhead to insert, delete, and update operations. Moreover, when single-field indexes already exist, extra composite indexes generally add no value; on the contrary, they reduce the performance of inserts and deletes, and for frequently updated tables the negative impact is even greater.

16. Try not to index a field in the database that contains a large number of repeated values.

40. MySQL query optimization summary:

Use the slow query log to find slow queries; use the execution plan to determine whether a query is running normally; always test your queries to see whether they run optimally.

Performance always changes over time. Avoid count(*) on an entire table; it may lock the whole table. Keep queries consistent so that subsequent similar queries can use the query cache. Use GROUP BY instead of DISTINCT where appropriate. Use indexed columns in WHERE, GROUP BY, and ORDER BY clauses. Keep indexes simple, and do not include the same column in multiple indexes.

Sometimes MySQL will use the wrong index; in that case, use USE INDEX. Check for problems caused by SQL_MODE=STRICT. For indexed fields with fewer than 5 records, use LIMIT with UNION rather than OR.

To avoid a SELECT before an UPDATE, use INSERT ... ON DUPLICATE KEY UPDATE or INSERT IGNORE rather than implementing it with UPDATE. Do not use MAX; use an indexed field with an ORDER BY clause instead. LIMIT M, N can actually slow down queries in some cases; use it sparingly. Use a UNION instead of subqueries in the WHERE clause. After restarting MySQL, remember to warm up the database so the data is in memory and queries are fast. Consider persistent connections instead of many short connections to reduce overhead.
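A sketch of the upsert forms (MySQL; the counters table and its columns are illustrative):

```sql
-- Upsert without a prior SELECT:
INSERT INTO counters (id, hits) VALUES (1, 1)
ON DUPLICATE KEY UPDATE hits = hits + 1;

-- Insert only if the key does not already exist:
INSERT IGNORE INTO counters (id, hits) VALUES (1, 1);
```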

Benchmark queries under realistic server load; sometimes a simple query can affect other queries. When load rises on the server, use SHOW PROCESSLIST to find slow and problematic queries. Test all suspicious queries in a development environment against data mirrored from production.

41. MySQL backup process:

1. Make a backup from the secondary replication server;

2. Stop replication during the backup to avoid inconsistencies in data dependencies and foreign key constraints;

3. Completely stop MySQL and back up from the database file;

4. If you back up with mysqldump, also back up the binary log files to make sure replication is not interrupted;

5. Don't trust LVM snapshots, this may cause data inconsistency, which will cause you trouble in the future;

6. To make single-table recovery easier, export data table by table (if the tables are independent of each other);

7. Use --opt when running mysqldump;

8. Check and optimize the table before backup;

9. In order to import faster, temporarily disable foreign key constraints during import.

10. In order to import faster, temporarily disable uniqueness detection during import;

11. Calculate the size of the database, table and index after each backup, so as to better monitor the growth of data size;

12. Monitor errors and delays of replication instances through automatic scheduling scripts;

13. Perform regular backups.

42. The query cache does not automatically handle spaces, so when writing SQL statements, minimize the use of spaces, especially at the beginning and end of the statement, because the query cache does not strip leading and trailing spaces.

43. Is member id (mid) really a suitable key for splitting the member table? Most business queries are based on username, so normally a hash of username modulo the table count should be used to split the table.

As for table splitting, MySQL's partitioning feature already does this transparently to the code; implementing it at the code level seems unreasonable.

44. Give every table in the database an ID column as its primary key, preferably an INT type (UNSIGNED is recommended) with the AUTO_INCREMENT flag set.
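A minimal example of such a table definition (names are illustrative):

```sql
CREATE TABLE member (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  username VARCHAR(64) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;
```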

45. Put SET NOCOUNT ON at the beginning of all stored procedures and triggers and SET NOCOUNT OFF at the end. There is no need to send a DONE_IN_PROC message to the client after each statement in a stored procedure or trigger.

46. MySQL queries can use the high-speed query cache, one of the effective MySQL optimization methods for improving database performance. When the same query is executed multiple times, returning data from the cache is much faster than fetching it from the database again.

47. EXPLAIN SELECT query is used to track and view the effect:

The EXPLAIN keyword shows how MySQL processes your SQL statement. It can help you analyze performance bottlenecks in your query or table structure; the EXPLAIN output also tells you how your indexes and primary key are used, and how the table is searched and sorted.

48. Use LIMIT 1 when only one row of data is required:

Sometimes when you query a table you already know there will be only one result, but without a limit the engine may still fetch via a cursor or check the number of records returned.

In this case, adding LIMIT 1 can increase performance. In this way, the MySQL database engine will stop searching after finding a piece of data, instead of continuing to find the next piece of data that matches the record.
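For example (member and username are illustrative):

```sql
-- Stop scanning after the first match:
SELECT id FROM member WHERE username = 'alice' LIMIT 1;
```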

49. Choose a suitable storage engine for the table:

MyISAM: the application is read- and insert-heavy, with only a few updates and deletes, and requirements for transaction integrity and concurrency are not high.

InnoDB: transaction processing and data consistency under concurrency are required, and the workload includes many updates and deletes besides inserts and queries. (InnoDB effectively reduces the locking caused by deletes and updates.)

For InnoDB tables, which support transactions, the main speed problem is that AUTOCOMMIT is on by default and programs do not explicitly call BEGIN to start a transaction, so every insert is committed automatically, which seriously hurts speed. Call BEGIN before executing the SQL so that multiple statements form one transaction (even with autocommit on); this greatly improves performance.
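A sketch of grouping inserts into one explicit transaction (MySQL syntax; t is illustrative):

```sql
-- One commit for the whole batch instead of one commit per row:
START TRANSACTION;
INSERT INTO t (num) VALUES (1);
INSERT INTO t (num) VALUES (2);
INSERT INTO t (num) VALUES (3);
COMMIT;
```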

50. Optimize the data type of the table and select the appropriate data type:

Principle: smaller is usually better; simple is better; every field should have a default value; avoid NULL wherever possible.

For example, when designing database tables, prefer smaller integer types to save disk space (MEDIUMINT is often more suitable than INT).

For time fields, compare DATETIME and TIMESTAMP: DATETIME occupies 8 bytes while TIMESTAMP occupies only 4, half as much, and TIMESTAMP's range of 1970-2037 is suitable for update times.

MySQL can well support the access of large amounts of data, but generally speaking, the smaller the table in the database, the faster the query executed on it.

Therefore, when creating the table, in order to obtain better performance, we can set the width of the fields in the table as small as possible.

For example, when defining a postal code field, CHAR(255) obviously adds unnecessary space to the database; even VARCHAR is redundant, since CHAR(6) can do the job well.

Similarly, where possible we should use MEDIUMINT rather than BIGINT to define integer fields, and we should set fields NOT NULL so the database does not need to compare NULL values when executing queries.

For some text fields, such as "province" or "gender", we can define them as ENUM type. Because in MySQL, ENUM type is treated as numeric data, and numeric data is processed much faster than text type. In this way, we can improve the performance of the database.
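A sketch combining these type choices (all names and ENUM values are illustrative):

```sql
CREATE TABLE person (
  id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
  postal_code CHAR(6) NOT NULL DEFAULT '',
  gender ENUM('male', 'female') NOT NULL DEFAULT 'male',
  province ENUM('Beijing', 'Shanghai', 'Guangdong') NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;
```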

51. String data types: understand the differences among CHAR, VARCHAR, and TEXT and choose accordingly.

52. Any operation on a column results in a table scan, including database functions and calculation expressions. When querying, move such operations to the right side of the equals sign wherever possible.

Original link: https://www.cnblogs.com/hlkawa/p/14176545.html


Origin: blog.csdn.net/qq_43307934/article/details/111934833