MySQL optimization tips

First of all, this is a repost; I found the original write-up very detailed. Original link: http://simpleframework.net/blog/v/7881.html


1. To optimize queries, avoid full table scans as much as possible; first consider building indexes on the columns involved in where and order by.

2. Try to avoid checking a field for NULL in the where clause, otherwise the engine will give up using the index and perform a full table scan, such as:

select id from t where num is null

You can set a default value of 0 on num, make sure there are no null values in the num column, and then query like this:

select id from t where num=0
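
A minimal sketch of that change, assuming num is an int column (MySQL syntax):

update t set num = 0 where num is null; -- backfill the existing NULLs first
alter table t modify num int not null default 0; -- then forbid NULLs and set the default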

3. Try to avoid using the != or <> operator in the where clause, otherwise the engine will give up using the index and perform a full table scan.

4. Try to avoid using or to join conditions in the where clause, otherwise the engine will give up using the index and perform a full table scan, such as:

select id from t where num=10 or num=20

You can query like this:

select id from t where num=10

union all

select id from t where num=20

5. in and not in should also be used with caution, otherwise they too will result in a full table scan, such as:

select id from t where num in(1,2,3)

For consecutive values, use between instead of in:

select id from t where num between 1 and 3

6. The following query will also result in a full table scan:

select id from t where name like '%abc%'

To improve efficiency, consider full-text search.
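
A sketch of the full-text alternative in MySQL, assuming name is a char/varchar/text column on an engine that supports FULLTEXT (MyISAM, or InnoDB from 5.6):

alter table t add fulltext index ft_name (name); -- build the full-text index once
select id from t where match(name) against ('abc'); -- word-based search instead of '%abc%'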

7. If a parameter is used in the where clause, it will also cause a full table scan. Because SQL resolves local variables only at runtime, the optimizer cannot defer the choice of an access plan to runtime; it must choose it at compile time. However, if the access plan is built at compile time, the value of the variable is unknown and cannot be used as an input for index selection. The following statement will perform a full table scan:

select id from t where num=@num

You can instead force the query to use the index:

select id from t with(index(index_name)) where num=@num
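
Note that with(index(...)) is SQL Server hint syntax; a MySQL counterpart, assuming an index named idx_num on the num column, would be:

select id from t force index (idx_num) where num = @num; -- force the optimizer to use idx_num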

8. Try to avoid expression operations on a field in the where clause, which will cause the engine to give up using the index and perform a full table scan. Such as:

select id from t where num/2=100

Should be changed to:

select id from t where num=100*2

9. Try to avoid function operations on a field in the where clause, which will cause the engine to give up using the index and perform a full table scan. Such as:

select id from t where substring(name,1,3)='abc' -- ids whose name starts with abc

select id from t where datediff(day,createdate,'2005-11-30')=0 -- ids created on '2005-11-30'

Should be changed to:

select id from t where name like 'abc%'

select id from t where createdate>='2005-11-30' and createdate<'2005-12-1'

10. Do not perform functions, arithmetic operations or other expression operations on the left side of the "=" in the where clause, otherwise the system may not be able to use the index correctly.

11. When using an indexed field as a condition, if the index is a composite index, the first field of the index must appear in the condition, otherwise the index will not be used. Where possible, keep the field order in the condition consistent with the index order.
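
A sketch of this leftmost-prefix rule, using a hypothetical composite index on (num, name):

create index idx_num_name on t (num, name);
select id from t where num = 10 and name = 'abc'; -- leading column present: index usable
select id from t where name = 'abc'; -- leading column missing: index not used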

12. Don't write meaningless queries, such as generating an empty table structure:

select col1,col2 into #t from t where 1=0

This kind of code returns no result set but still consumes system resources; it should be changed to this:

create table #t(…)

13. In many cases, using exists instead of in is a good choice:

select num from a where num in(select num from b)

Replace with the following statement:

select num from a where exists(select 1 from b where num=a.num)

14. Not all indexes are effective for a query. SQL optimizes the query based on the data in the table; when an indexed column contains a large proportion of duplicate values, the query may not use the index at all. For example, if a table has a sex field whose values are roughly half male and half female, an index on sex does nothing for query efficiency.

15. More indexes are not always better. While indexes improve the efficiency of the corresponding select, they reduce the efficiency of insert and update, because the indexes may need to be rebuilt on insert or update, so how to build indexes must be weighed case by case. The number of indexes on a table should preferably not exceed 6; beyond that, consider whether indexes on rarely used columns are really necessary.

16. Avoid updating clustered index columns as much as possible, because the order of the clustered index column is the physical storage order of the table records; once the column value changes, the order of the entire table's records is adjusted, which consumes considerable resources. If the application needs to update clustered index columns frequently, consider whether the index should be clustered at all.

17. Use numeric fields where possible. If a field contains only numeric information, try not to design it as a character field; that reduces query and join performance and increases storage overhead. The engine compares each character of a string one by one when processing queries and joins, whereas a number requires only a single comparison.

18. Use varchar/nvarchar instead of char/nchar where possible: variable-length fields take less storage space, and for queries, searching within a smaller field is clearly more efficient.

19. Do not use select * from t anywhere, replace "*" with a list of specific fields, and do not return any fields that are not used.

20. Try to use table variables instead of temporary tables. If the table variable contains a lot of data, be aware that the indexes are very limited (only the primary key index).
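
A minimal sketch in SQL Server syntax (this part of the list is T-SQL advice), with hypothetical columns:

declare @t table (
    id int primary key, -- the primary key is the only index a table variable gets
    name varchar(50)
);
insert into @t (id, name) select id, name from t;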

21. Avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources.

22. Temporary tables are not forbidden; using them appropriately can make certain routines more efficient, for example when a large table or a dataset from a frequently used table must be referenced repeatedly. For one-off operations, however, an export table is better.

23. When creating a new temporary table, if a large amount of data is inserted at once, use select into instead of create table to avoid generating a large volume of log and improve speed; if the amount of data is small, then to ease the load on system tables, create the table first and insert afterwards.

24. If temporary tables are used, explicitly delete all of them at the end of the stored procedure: first truncate table, then drop table. This avoids long locks on system tables.

25. Try to avoid using cursors, because cursors are inefficient; if a cursor operates on more than 10,000 rows, consider rewriting.

26. Before reaching for a cursor-based or temporary-table method, look for a set-based solution first; set-based methods are usually more efficient.

27. Like temporary tables, cursors are not forbidden either. Using a FAST_FORWARD cursor on a small dataset is often better than other row-by-row processing methods, especially when several tables must be referenced to obtain the required data. Routines that compute "totals" in the result set are usually faster than cursor-based ones. If development time permits, try both the cursor-based and the set-based approach and keep whichever works better.

28. Set SET NOCOUNT ON at the beginning of all stored procedures and triggers, and set SET NOCOUNT OFF at the end. There is no need to send a DONE_IN_PROC message to the client after each statement of stored procedures and triggers is executed.
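
A minimal T-SQL sketch, with a hypothetical procedure name:

create procedure usp_example
as
begin
    set nocount on; -- stop sending a DONE_IN_PROC message after each statement
    -- ... procedure body ...
    set nocount off;
end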

29. Try to avoid large transactions, to improve the system's concurrency.

30. Try to avoid returning a large amount of data to the client. If the amount of data is too large, you should consider whether the corresponding demand is reasonable.



Reprint 2: MySQL You May Not Know


Foreword:

The data table for the experiment is defined as follows:
mysql> desc tbl_name;
+-------+--------------+------+-----+---------+-------+
| Field | Type         | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| uid   | int(11)      | NO   |     | NULL    |       |
| sid   | mediumint(9) | NO   |     | NULL    |       |
| times | mediumint(9) | NO   |     | NULL    |       |
+-------+--------------+------+-----+---------+-------+
3 rows in set (0.00 sec)
The storage engine is MyISAM, and the table contains 10,000 rows.
1. The role of "\G"

mysql> select * from tbl_name limit 1;
+--------+--------+-------+
| uid    | sid    | times |
+--------+--------+-------+
| 104460 | 291250 |    29 |
+--------+--------+-------+
1 row in set (0.00 sec)

mysql> select * from tbl_name limit 1\G;
*************************** 1. row ***************************
  uid: 104460
  sid: 291250
times: 29
1 row in set (0.00 sec)
Sometimes a query returns so many columns that a row cannot fit on one screen line and the display wraps and breaks apart. Append "\G" to print each column on its own line ("\G" saved me: I used to read explain output displayed horizontally, and mentally re-mapping the wrapped rows back to their columns was a lot of work).

2. The "Invisible Killer" of "Group by"

mysql> explain select uid,sum(times) from tbl_name group by uid\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbl_name
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 10000
        Extra: Using temporary; Using filesort
1 row in set (0.00 sec)

mysql> explain select uid,sum(times) from tbl_name group by uid order by null\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbl_name
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 10000
        Extra: Using temporary
1 row in set (0.00 sec)
By default, group by col sorts the result by col, which is why the first statement shows Using filesort in Extra. If you don't need the result sorted by col, add order by null; it is much faster, because filesort is very slow.

3. Inserting bulk data

The most efficient way to insert data in bulk:

load data infile '/path/to/file' into table tbl_name;
If there is no way to generate a text file first, or you don't want to, you can insert multiple rows at once:

insert into tbl_name values (1,2,3),(4,5,6),(7,8,9)...
Note that the maximum length of a SQL statement is limited. If you don't want to build such a statement, try MySQL's prepared statements, which should be much faster than plain row-by-row inserts.
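
A sketch of the prepared-statement route (MySQL syntax, using the columns from the table above):

prepare ins from 'insert into tbl_name (uid, sid, times) values (?, ?, ?)';
set @u = 104460, @s = 291250, @t = 29; -- bind one row's values
execute ins using @u, @s, @t; -- repeat set/execute for each row
deallocate prepare ins;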

If the data table has an index, it is recommended to temporarily disable the index first:

alter table tbl_name disable keys;
After inserting, activate the index again:

alter table tbl_name enable keys;
This is especially useful for MyISAM tables: it avoids updating the indexes on every single insert.

4. The fastest way to copy a table structure

mysql> create table clone_tbl select * from tbl_name limit 0;
Query OK, 0 rows affected (0.08 sec)
Only the table structure is copied; the indexes are not. If you also want to copy the data, just remove the limit 0.
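
If the indexes are needed as well, MySQL's create table ... like copies the full definition, indexes included:

create table clone_tbl like tbl_name; -- copies columns and indexes
insert into clone_tbl select * from tbl_name; -- optionally copy the data too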

5. The difference between quoted and unquoted values

Add an index to the data table tbl_name:

mysql> create index uid on tbl_name(uid);
Test the following query:

mysql> explain select * from tbl_name where uid = '1081283900'\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbl_name
         type: ref
possible_keys: uid
          key: uid
      key_len: 4
          ref: const
         rows: 143
        Extra:
1 row in set (0.00 sec)
When the value for an indexed integer field is quoted, the index can still be used; the claim, widespread online, that quoting an integer value prevents index use is wrong. Now change the uid field type to varchar(12):

mysql> alter table tbl_name change uid uid varchar(12) not null;
Test the following query:

mysql> explain select * from tbl_name where uid = 1081283900\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbl_name
         type: ALL
possible_keys: uid
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 10000
        Extra: Using where
1 row in set (0.00 sec)
This time the query value is not quoted, and as a result the index cannot be used: comparing a string column against an unquoted number forces MySQL to convert the column values, so be careful here.

6. Prefix index

Sometimes our tables contain fields such as varchar(255) that also need an index. It is usually unnecessary to index the whole field; an index on the first 8 to 12 characters should be enough, since very few fields have 8 to 12 identical leading characters.

Why? Shorter indexes mean smaller indexes, less CPU time, less memory, less IO and much better performance.
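
A sketch of a prefix index in MySQL, assuming a hypothetical varchar name column:

create index idx_name_prefix on tbl_name (name(10)); -- index only the first 10 characters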

7. MySQL index usage

MySQL generally uses only one index per query (the index_merge optimization, introduced in version 5.0, lets some specific queries use multiple indexes; see the MySQL documentation for details), so composite indexes should be built according to the query conditions, and a composite index can only be used when its first field appears in the query conditions.

If MySQL thinks a full scan is faster than using an index, it will not use the index.

mysql> create index times on tbl_name(times);
Query OK, 10000 rows affected (0.10 sec)
Records: 10000  Duplicates: 0  Warnings: 0

mysql> explain select * from tbl_name where times > 20\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbl_name
         type: ALL
possible_keys: times
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 10000
        Extra: Using where
1 row in set (0.00 sec)

mysql> explain select * from tbl_name where times > 200\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbl_name
         type: range
possible_keys: times
          key: times
      key_len: 3
          ref: NULL
         rows: 1599
        Extra: Using where
1 row in set (0.00 sec)
Most values in the times field are greater than 20, so the first query does not use the index while the second one does.

 
