MySQL index optimization at a glance (real case)

background

(Database: MySQL 5.7, InnoDB engine)
After SkyWalking was added to the service, most of the slow interfaces were exposed, which led to the optimization of this slow interface. The rough optimization process follows.

  • Before optimization: (trace screenshot unavailable)

  • After optimization: (trace screenshot unavailable)

Optimization steps

1. Investigation

  1. Through SkyWalking you can see clearly at which step the slow interface spends its time, and the trace view shows the call chain between services. (Trace screenshot unavailable.)

If your service is not connected to such a monitoring system, you can use Alibaba's open-source Arthas to trace the call chain (use `trace` to view the time spent in each step); see the Arthas official documentation.

2. Record the current situation

  1. After the investigation we can pinpoint the slow SQL precisely. Now analyze it carefully: first take the business SQL (obtained from the console log) and view its execution plan with EXPLAIN. This step is very important: check which index is used, the estimated number of scanned rows, the execution strategy, and so on, relying on the information each field of the EXPLAIN output provides. (This was run against the production library.) Record the main findings:
     index used | estimated rows scanned | back-to-table lookup? | sorting? | execution time | conclusion
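To make the recording step concrete, here is a minimal sketch; the table and column names (`t_order`, `user_id`, `create_time`) are hypothetical, not the original production SQL:

```sql
-- Hypothetical example of the kind of statement to record.
EXPLAIN
SELECT id, order_no, create_time
FROM   t_order
WHERE  user_id = 42
ORDER  BY create_time DESC;

-- Key columns to note from the EXPLAIN output:
--   key    -> the index actually chosen
--   rows   -> estimated rows to scan
--   Extra  -> "Using filesort" (sorting), "Using index" (covering, no
--             back-to-table lookup), "Using temporary", etc.
```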

3. Analyze the causes and determine a specific optimization plan

1. The wrong index is chosen

  • In this case we can check the index cardinality with `show index from table_name`, look roughly at the `Cardinality` column, and make an assessment. Then run `analyze table tb_name` to recompute it.
  • Index cardinality is one of the important inputs when MySQL chooses an index.
  • The cardinality is estimated by sampling, so it is not necessarily accurate; re-sampling with `analyze table` is often enough to fix the choice.
  • If fixing the cardinality does not help, the statement may involve sorting or a temporary table, which can push the optimizer toward the index it believes scans fewer rows. In that case, consider optimizing the index itself: create a composite index that covers the query and needs no back-to-table lookup.
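A minimal sketch of the cardinality check described above; the table `t_order` and index `idx_user` are hypothetical:

```sql
-- Inspect the Cardinality column for each index of the table.
SHOW INDEX FROM t_order;

-- Re-sample the statistics if the cardinality looks far off.
ANALYZE TABLE t_order;

-- As a stopgap while the real fix is prepared, the index can be pinned:
SELECT id FROM t_order FORCE INDEX (idx_user) WHERE user_id = 42;
```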

2. Low index selectivity

  • Note here: avoid building indexes on status-like columns with only a handful of distinct values. Why? After the B+ tree is built, most entries share the same value, so lookup time degrades to roughly that of scanning an ordinary linked list. Such an index can be dropped decisively.
  • The other case is values whose first part is largely identical while the tail differs. Here we can store the value reversed and build the index on that, keeping only the distinctive part. Alternatively, store one extra field holding a hash of the value and use the hash as the index; when searching, compute the hash and match on it directly (then re-check the original value, since hashes can collide).
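The extra hash-field idea can be sketched like this; the column names and the choice of CRC32 are assumptions, not the original implementation:

```sql
-- Hypothetical long, low-selectivity column card_no on table t_order.
ALTER TABLE t_order ADD COLUMN card_no_crc INT UNSIGNED NOT NULL DEFAULT 0;
UPDATE t_order SET card_no_crc = CRC32(card_no);
ALTER TABLE t_order ADD INDEX idx_card_crc (card_no_crc);

-- Match on the compact hash first, then re-check the original value
-- to rule out CRC32 collisions:
SELECT id FROM t_order
WHERE  card_no_crc = CRC32('6225880137')
  AND  card_no = '6225880137';
```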

3. The index requires a back-to-table lookup, or there is sorting

  • "Returning to the table" means the index we hit does not contain all the fields the query needs, so for each match InnoDB must go back through the primary-key index with the collected ids to fetch the remaining fields. The solution is a covering index: create an index that includes all the required fields. Two notes: evaluate it against the overall workload rather than creating a joint index for a single query, and follow the leftmost-prefix matching principle, putting highly selective columns first.
  • "There is sorting" means EXPLAIN shows "Using filesort"; this operation sorts the rows itself and may perform disk I/O. So we create a joint index that includes the sort field. For example, to query a user's order ids sorted by create_time, a joint index (user_id, create_time) returns rows already in index order. One more situation: if a single user has several million orders, query performance still drops sharply; then we can optimize with a business requirement, i.e. limit the WHERE clause to a time interval.
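A sketch of the joint index described above, assuming a hypothetical `t_order` table with primary key `id`:

```sql
ALTER TABLE t_order ADD INDEX idx_user_ctime (user_id, create_time);

-- Rows come back already ordered by the index, so no filesort; and
-- because id is the primary key, InnoDB appends it to every secondary
-- index entry, so this query is also covering (no back-to-table lookup).
SELECT id
FROM   t_order
WHERE  user_id = 42
  AND  create_time >= NOW() - INTERVAL 1 MONTH
ORDER  BY create_time DESC;
```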

4. My final plan for this SQL: change two and add one

  1. Rebuild the index that had low selectivity and did not follow the leftmost-prefix matching principle
  2. Create a joint index to eliminate the filesort
  3. Optimize the business logic: simplify the complex SQL and remove nested logic
  4. Combine with the business rule of a time interval (one month before the current time).

4. Strict and effective verification

This is the most important step

My verification plan:

  1. Based on my understanding of the business, I estimated which index the modified statement would use and tested it, comparing the result against the production library.
  2. The original index usage was already recorded; then the original SQL was run in the SIT environment database (our SIT library syncs data from PROD, though some data is still missing).
  3. When running the SQL, choose filter conditions that use the index but do not filter the data out early, to avoid the fluke of the matching rows sitting near the front and being hit immediately.
  4. The testers then ran a load test against the optimized interface.
  5. Then a full regression test: click through every query page.
  6. Fortunately, performance improved roughly 1000-fold, from tens of seconds to under 1 ms, with no effect on other business interfaces.

5. Bringing the index online

  1. Changing an index is just dropping the old one and creating the new one. Pay attention to the table locking the change may cause; see the Online DDL syntax (https://dev.mysql.com/doc/refman/5.7/en/innodb-online-ddl-operations.html).
  2. The business could have tolerated it, but I still stopped the service and ran the change in the middle of the night; it took an hour to execute (9 million rows).
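A sketch of the kind of online change involved; the index and table names are assumptions. On MySQL 5.7, stating ALGORITHM and LOCK explicitly makes the statement fail fast if the server cannot perform the change without blocking writes, instead of silently locking the table:

```sql
ALTER TABLE t_order
  DROP INDEX idx_user,
  ADD  INDEX idx_user_ctime (user_id, create_time),
  ALGORITHM = INPLACE, LOCK = NONE;
```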

A common misconception

In real work, many colleagues think that adding an index will hurt update efficiency. That cost is real, but it is tiny compared with the impact of the slow queries it removes.

Summary

  • The process of an optimization I did myself
  • How to use EXPLAIN (https://segmentfault.com/a/1190000008131735)
  • Several typical index problem cases
  • Strict testing is very important (I personally rate my own testing as only average)

References

  • Meituan on MySQL index optimization: https://tech.meituan.com/2014/06/30/mysql-index.html
  • Using EXPLAIN: https://dev.mysql.com/doc/refman/5.7/en/using-explain.html
  • Online DDL syntax: https://dev.mysql.com/doc/refman/5.7/en/innodb-online-ddl-operations.html
  • Arthas official documentation: https://alibaba.github.io/arthas/trace.html

Origin: blog.csdn.net/weixin_40413961/article/details/108230254