How do you design reasonable indexes, and how do you optimize them?

Analysis & Answer

Reasonable indexing

  1. The leftmost-prefix matching rule is very important. MySQL matches index columns from left to right until it meets a range condition (>, <, BETWEEN, LIKE) and then stops matching. For example, for a = 1 and b = 2 and c > 3 and d = 4, an index created in the order (a,b,c,d) cannot use d, whereas an index on (a,b,d,c) can use all four columns; the order of a, b and d within the index can be adjusted freely.
  2. Prefer columns with high selectivity for indexes. Selectivity is computed as count(distinct col)/count(*), the proportion of distinct values in the column; the larger the proportion, the fewer records are scanned per lookup. A unique key has a selectivity of 1, while status or gender fields can have a selectivity close to 0 on large tables. Is there a rule-of-thumb value for this ratio? It depends on the usage scenario and is hard to pin down, but columns used in joins are generally expected to be above 0.1, that is, an average of 10 records scanned per lookup.
  3. Index columns must not take part in calculations; keep the column "clean". For example, from_unixtime(create_time) = '2014-05-29' cannot use the index. The reason is simple: the B+ tree stores the raw field values from the table, so the function would have to be applied to every element before comparing, which is obviously too expensive. The statement should instead be written as create_time = unix_timestamp('2014-05-29').
  4. Extend existing indexes where possible instead of creating new ones. For example, if the table already has an index on a and you now want an index on (a, b), you only need to modify the original index.
  5. Smaller data types are generally better: Smaller data types generally require less space on disk, in memory, and in CPU cache, and are faster to process.
  6. Simple data types are better: Integers are less expensive to process than characters, because string comparisons are more complex. In MySQL, you should use the built-in date and time data types instead of strings to store time; and use integer data types to store IP addresses.
  7. Try to avoid NULL: declare columns NOT NULL unless you really need to store NULL. Columns that contain NULL are harder for MySQL to optimize, because they complicate indexes, index statistics, and comparison operations. Replace NULL with 0, a special value, or an empty string.
  8. MySQL can use an index for a join only when the joined columns (for example, a primary key and the foreign key that references it) have the same data type; otherwise the index may not be used even though it exists.
  9. When indexing a string column, specify a prefix length if possible. For example, if you have a CHAR(255) column and most values are unique within the first 10 or 20 characters, don't index the entire column. Short indexes not only improve query speed but also save disk space and I/O operations.
  10. Be careful with NULL in indexed columns. MySQL's B-tree indexes do store NULL values and IS NULL conditions can use them, but some other databases (Oracle, for example) leave rows whose indexed columns are all NULL out of the index, and NULL complicates comparisons and statistics in any case. So avoid making NULL the default value of a field when designing the database.
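
The leftmost-prefix rule (item 1) and the selectivity formula (item 2) can be sketched in a few lines of Python. This is a simplified model of the rules as stated in the list, not MySQL's actual optimizer; the function names `usable_index_columns` and `selectivity` are made up for illustration.

```python
from typing import Dict, List, Sequence

# Operators treated as range conditions by the rule in item 1.
RANGE_OPS = {">", "<", ">=", "<=", "between", "like"}


def usable_index_columns(index_cols: Sequence[str],
                         predicates: Dict[str, str]) -> List[str]:
    """Return the index columns a left-to-right match can use.

    predicates maps a column name to its operator ("=", "in", ">", ...).
    Hypothetical helper: it models the rule only, not the real optimizer.
    """
    used: List[str] = []
    for col in index_cols:
        op = predicates.get(col)
        if op is None:
            break  # a gap in the prefix stops the matching
        used.append(col)
        if op.lower() in RANGE_OPS:
            break  # a range condition is the last column that can match
    return used


def selectivity(values: Sequence[object]) -> float:
    """count(distinct col) / count(*) for a column given as a sequence."""
    return len(set(values)) / len(values) if values else 0.0


# a = 1 and b = 2 and c > 3 and d = 4
preds = {"a": "=", "b": "=", "c": ">", "d": "="}
print(usable_index_columns(["a", "b", "c", "d"], preds))  # ['a', 'b', 'c']
print(usable_index_columns(["a", "b", "d", "c"], preds))  # ['a', 'b', 'd', 'c']
print(selectivity(["M", "F", "M", "F"]))  # 0.5
print(selectivity([1, 2, 3, 4]))          # 1.0
```

With the index ordered (a,b,c,d) the model stops at the range condition on c, so d is left out; moving c to the end lets all four columns take part.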

Notes on SQL statements

  1. Use LIMIT 1 when the result set has only one row of data
  2. Avoid SELECT *; always specify the columns you need. The more data you read from the table, the slower the query becomes: it increases the time the disk has to work, and when the database server and the web server are separate machines you also pay network delay simply because data is travelling between servers unnecessarily.
  3. Use joins (JOIN) instead of subqueries. A join is often more efficient because MySQL does not need to create a temporary table in memory to complete the logically two-step query.
  4. Use ENUM or CHAR instead of VARCHAR where the values allow it, and give fields a reasonable length
  5. Use NOT NULL whenever possible
  6. Fixed-length tables will be faster
  7. Split up large DELETE or INSERT statements
  8. The smaller the column, the faster the query
  9. Implicit type conversion makes the index unusable, for example when a string column is compared with a numeric literal. Pay attention to this; it is a common mistake in development.
  10. Operating on an index column makes the index unusable. The operations meant here include +, -, *, /, ! and so on.
  11. Applying MySQL built-in functions to an index column makes the index unusable. In this case, create a function-based index (MySQL 8.0.13 and later support functional key parts; on earlier versions a generated column can be indexed instead).
  12. = and IN conditions can appear in any order. For a = 1 and b = 2 and c = 3, an index on a, b and c can be created with the columns in any order; MySQL's query optimizer rearranges the conditions into a form the index can use.
  13. The WHERE conditions are also an important factor in a query. Keep them as few and as reasonable as possible. When there are multiple conditions, try to put the condition that extracts the least data first, so that the later conditions have fewer rows to check.
  14. Some WHERE conditions make the index unusable:
    • If the WHERE clause contains a != condition, MySQL will generally not be able to use the index for it.
    • If the WHERE clause applies a MySQL function to the column, the index cannot be used, for example: select * from tb where left(name, 4) = 'xxx'
    • With LIKE, the index is usable for a prefix match: select * from tbl1 where name like 'xxx%', but not for like '%xxx%'
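
Why like 'xxx%' can use an index while like '%xxx%' cannot is easier to see with a sorted list standing in for the index. The sketch below is an analogy in plain Python, not MySQL code; `like_prefix` and `like_contains` are hypothetical names.

```python
import bisect
from typing import List


def like_prefix(sorted_names: List[str], prefix: str) -> List[str]:
    """LIKE 'prefix%': binary-search the sorted list for the matching range."""
    lo = bisect.bisect_left(sorted_names, prefix)
    hi = lo
    # All matches are contiguous in sort order, so stop at the first miss.
    while hi < len(sorted_names) and sorted_names[hi].startswith(prefix):
        hi += 1
    return sorted_names[lo:hi]


def like_contains(sorted_names: List[str], fragment: str) -> List[str]:
    """LIKE '%fragment%': sort order does not help, every entry is checked."""
    return [name for name in sorted_names if fragment in name]


names = sorted(["alice", "albert", "bob", "carol", "malcolm"])
print(like_prefix(names, "al"))    # ['albert', 'alice']
print(like_contains(names, "al"))  # ['albert', 'alice', 'malcolm']
```

The prefix search jumps straight to the first candidate and reads only the matching run, which is what a B+ tree range scan does; the infix search has to look at every entry, which corresponds to a full scan.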

Points that deserve close attention

  1. Use >= instead of >. Efficient: SELECT * FROM EMP WHERE DEPTNO >= 4; inefficient: SELECT * FROM EMP WHERE DEPTNO > 3
  2. Replace OR with UNION (for indexed columns)
  3. Replace UNION with UNION-ALL (if possible)
  4. Replace ORDER BY with WHERE
  5. Try not to use temporary tables unless you have to.
  6. Try not to use TEXT data type, VARCHAR can handle your data better.
  7. Using referential integrity, defining primary keys, unique constraints, and foreign keys can save a lot of time.
  8. The maximum size of a MySQL communication packet (max_allowed_packet) is 1M by default in older versions (recent versions ship with larger defaults), so when inserting or updating in batches, check whether the concatenated SQL statement exceeds that size. To change the limit, set max_allowed_packet in the MySQL configuration file (my.ini) to the size you want. If you cannot modify the configuration of the online database, you can add parameters to the JDBC connection URL: blobSendChunkSize=50000&useServerPrepStmts=true&emulateUnsupportedPstmts=false&maxAllowedPacket=10000000
  9. MySQL's TIMESTAMP type covers the range '1970-01-01 00:00:01' to '2038-01-19 03:14:07'; a value outside this range is recorded as '0000-00-00 00:00:00'. When a TIMESTAMP field in the database holds '0000-00-00 00:00:00', reading it through JDBC throws an exception: Cannot convert value '0000-00-00 00:00:00' from column 1 to TIMESTAMP. This is because JDBC cannot convert '0000-00-00 00:00:00' into a java.sql.Timestamp. To solve this problem, add the zeroDateTimeBehavior parameter to the JDBC URL.
  10. To find the values that occur more than once in a field of a table: select id, count(id) from @table group by id having count(id) > 1
  11. When MySQL stores BOOLEAN values, 1 represents TRUE and 0 represents FALSE; BOOLEAN in MySQL is a synonym for tinyint(1)
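
For the packet-size concern in item 8, one possible approach is to split a large batch into several multi-row INSERT statements that each stay under the limit. The sketch below is only an illustration: `batch_inserts` is a hypothetical helper, the rows are assumed to be already rendered and escaped as SQL tuples, and real code should rely on the driver's parameter binding instead of string concatenation.

```python
from typing import Iterable, Iterator, List


def batch_inserts(table: str, columns: List[str],
                  rendered_rows: Iterable[str],
                  max_bytes: int = 1_000_000) -> Iterator[str]:
    """Yield multi-row INSERT statements, each at most max_bytes long.

    rendered_rows are pre-escaped tuples such as "(1,'x')" (an assumption
    made for this sketch). A single row that is larger than max_bytes is
    still emitted on its own and would need separate handling.
    """
    head = f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
    head_size = len(head.encode("utf-8"))
    batch: List[str] = []
    size = head_size
    for row in rendered_rows:
        row_size = len(row.encode("utf-8")) + 1  # +1 for the comma separator
        if batch and size + row_size > max_bytes:
            yield head + ",".join(batch)
            batch, size = [], head_size
        batch.append(row)
        size += row_size
    if batch:
        yield head + ",".join(batch)


rows = [f"({i},'x')" for i in range(10)]
statements = list(batch_inserts("t", ["id", "v"], rows, max_bytes=60))
print(len(statements))  # 4
print(statements[0])    # INSERT INTO t (id, v) VALUES (0,'x'),(1,'x'),(2,'x')
```

The tiny max_bytes value is only there to make the splitting visible; in practice the limit would be chosen somewhat below the server's max_allowed_packet.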

Reflect & Expand

How to judge the execution efficiency of SQL

Use the EXPLAIN keyword to analyze the execution plan of inefficient SQL.

EXPLAIN  SELECT ……
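Variants:
1. EXPLAIN EXTENDED SELECT ……
"Decompiles" the execution plan back into a SELECT statement; run SHOW WARNINGS afterwards to get the query as rewritten by the MySQL optimizer.
2. EXPLAIN PARTITIONS SELECT ……
EXPLAIN for partitioned tables.
(Both keywords are deprecated in MySQL 5.7, where plain EXPLAIN already produces this output, and were removed in MySQL 8.0.)

For example: explain select sum(moneys) from sales a, company b where a.company_id = b.company_id and a.year = 2022;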

Execution Plan Limitations

  1. EXPLAIN does not tell you how triggers, stored procedures, or user-defined functions affect a query
  2. EXPLAIN does not take the various caches into account
  3. EXPLAIN cannot show the optimization work MySQL does while executing the query
  4. Some statistics are estimates, not exact values
  5. Before MySQL 5.6, EXPLAIN can only explain SELECT statements; other statements have to be rewritten as a SELECT to view the execution plan

Index analysis method

View index usage

If the index is working, the value of Handler_read_key will be high, which represents the number of times a row was read by the index value.

A high value for Handler_read_rnd_next means that the query is running inefficiently and should be remedied by building an index.

mysql> show status like 'Handler_read%'; 
+-----------------------+--------+
| Variable_name         | Value  |
+-----------------------+--------+
| Handler_read_first    | 9      |
| Handler_read_key      | 16     |
| Handler_read_last     | 0      |
| Handler_read_next     | 680908 |
| Handler_read_prev     | 0      |
| Handler_read_rnd      | 0      |
| Handler_read_rnd_next | 935519 |
+-----------------------+--------+
7 rows in set (0.00 sec)
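
If you want to compare these counters in a script, the console output can be parsed into a dictionary. The snippet below is a small sketch with a made-up helper name (`parse_status`); it reuses two of the values shown above and does not connect to a server.

```python
from typing import Dict


def parse_status(table_text: str) -> Dict[str, int]:
    """Parse 'SHOW STATUS' console output into {variable_name: value}."""
    stats: Dict[str, int] = {}
    for line in table_text.splitlines():
        parts = [p.strip() for p in line.strip().strip("|").split("|")]
        # Border lines and the header row are skipped by this check.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


sample = """
+-----------------------+--------+
| Variable_name         | Value  |
+-----------------------+--------+
| Handler_read_key      | 16     |
| Handler_read_rnd_next | 935519 |
+-----------------------+--------+
"""
stats = parse_status(sample)
print(stats["Handler_read_key"])       # 16
print(stats["Handler_read_rnd_next"])  # 935519
# Sequential reads far outnumber index reads in this sample.
print(stats["Handler_read_rnd_next"] > stats["Handler_read_key"])  # True
```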

Two simple and practical optimization methods:

  • Checking tables. CHECK TABLE checks one or more tables for errors; its syntax is as follows:
CHECK TABLE tbl_name [, tbl_name] … [option] …
option = { QUICK | FAST | MEDIUM | EXTENDED | CHANGED }
mysql> check table sales;
+--------------+-------+----------+----------+
| Table        | Op    | Msg_type | Msg_text |
+--------------+-------+----------+----------+
| sakila.sales | check | status   | OK       |
+--------------+-------+----------+----------+
1 row in set (0.01 sec)
  • Optimizing tables. The syntax is as follows:

OPTIMIZE [LOCAL | NO_WRITE_TO_BINLOG] TABLE tbl_name [,tbl_name]

Optimize periodically if a large part of the table has been deleted, or if many changes have been made to a table with variable-length rows. This command merges the fragmented space in the table, but it only works on MyISAM, BDB and InnoDB tables.

mysql> optimize table sales;
+--------------+----------+----------+----------+
| Table        | Op       | Msg_type | Msg_text |
+--------------+----------+----------+----------+
| sakila.sales | optimize | status   | OK       |
+--------------+----------+----------+----------+
1 row in set (0.05 sec)

Origin blog.csdn.net/jjclove/article/details/127391593