An analysis of an online SQL index optimization and the principle behind MySQL's index selection error

Two days ago, a colleague ran into a strange problem in the order module's query: when filter conditions were added, the query timed out, yet querying all orders worked fine. The SQL is as follows (the data has been anonymized; the database is MySQL):

SELECT
	a.consumer_code AS orderCode,
	a.rent_equipment_snid AS eqSn,
	a.powerbank_snid AS pbSn,
	a.rent_merchant_name AS rentMerchant,
	a.rent_merchant_address AS merchantAddress,
	a.rent_date AS rentTime,
	a.close_date AS returnTime,
	a.payment_money AS orderAmount,
	a.order_status AS orderStatus,
	a.consume_schema AS consumeSchema,
	a.transaction_status AS transStatus,
	a.rent_equipment_model AS eqModel 
FROM
	cp_consumer_order_2020_10 a 
WHERE
	a.agent_code = xxxx
	# the two conditions below are only added when filtering
	AND a.order_status = xxx 
	AND a.close_date IS NULL 
ORDER BY
	a.consumer_code desc

cp_consumer_order_2020_10 is a monthly order table with almost 10 million rows; consumer_code is the primary key, and agent_code has a normal (secondary) index.
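For reference, a minimal sketch of the table as described (column types, nullability, and the index name are assumptions; only the primary key and the agent_code index come from the article):

CREATE TABLE cp_consumer_order_2020_10 (
	consumer_code  VARCHAR(32) NOT NULL COMMENT 'order code, primary key',
	agent_code     VARCHAR(32) NOT NULL COMMENT 'agent code, has a normal index',
	order_status   TINYINT     NOT NULL,
	close_date     DATETIME    NULL,
	-- ... remaining order columns omitted ...
	PRIMARY KEY (consumer_code),
	KEY idx_agent_code (agent_code)
) ENGINE=InnoDB;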
I ran the SQL above in the database myself and found that it did use the agent_code index, and the query was fast. Then I removed the filter conditions and executed it again; the result was the same.

[Screenshot: execution plans with and without the filter conditions]

The above are the execution plans with and without the filter conditions; you can see there is no difference. At this point I wondered whether there was some time-consuming operation in the code:

[Screenshot: the relevant application code]

The code does not appear to contain any particularly expensive operation: getAgentOrderList simply executes the SQL above, getAgentStaffOrderList also ran quickly when I tried it, and because of paging, the for loop that follows will not be particularly slow either.
Could another "supernatural incident" have occurred? Then it suddenly occurred to me that paging might be the cause. We all know that LIMIT causes slow queries when the offset is very large, but we had not turned the page yet (this was the first page), so that was not the problem either.
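For context, the deep-paging slowdown mentioned above looks like the following sketch (the offset is hypothetical; it was not our case, since we were still on the first page):

SELECT a.consumer_code
FROM cp_consumer_order_2020_10 a
ORDER BY a.consumer_code DESC
LIMIT 1000000, 30;  -- MySQL still reads and discards the first 1,000,000 rows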
Then I remembered having seen index selection go wrong before when LIMIT and ORDER BY are combined. So I appended LIMIT 0,30 and ran the SQL from before in the database again. As expected, the slow SQL reappeared. The execution plan now looked like this:
[Screenshot: execution plan with LIMIT 0,30, now using the PRIMARY key]
You can see that MySQL now uses the primary key index, that is, the index on the very field we sort by. So I suggested that my colleague use FORCE INDEX to pin the query to the normal index, and the query returned to normal. A sketch of the fix follows.
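This is roughly what the fixed query looks like, assuming the agent_code index is named idx_agent_code (the article does not give the index name) and keeping the placeholder values from above:

SELECT
	a.consumer_code AS orderCode
	-- ... same column list as the original query ...
FROM
	cp_consumer_order_2020_10 a FORCE INDEX (idx_agent_code)
WHERE
	a.agent_code = xxxx
	AND a.order_status = xxx
	AND a.close_date IS NULL
ORDER BY
	a.consumer_code DESC
LIMIT 0, 30

Note that FORCE INDEX hard-codes an index name into the SQL, so it must be kept in sync if the monthly tables are ever created with a different index name.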
At this point the SQL optimization itself was done. But why does adding LIMIT cause MySQL to choose the wrong index? And why is the primary key index so slow here even though its estimated number of scanned rows is clearly smaller? Following the principle of "knowing what happened, but also knowing why", I consulted a lot of material, yet could not completely resolve my doubts; in the end, repeated experiments finally made everything clear.

  • First, why is the normal index fast here while the primary key index is slow?
    Because this SQL orders by the primary key in descending order, scanning the primary key index returns rows that are already sorted, so no extra sort is needed (that is why the Extra field of the execution plan shows no Using filesort), and in that respect it beats the normal index. But this SQL also filters on WHERE conditions: after walking the ordered primary key, MySQL must check each row against agent_code and the other conditions one by one. Without LIMIT, that amounts to a full table scan. With LIMIT 0,30, one of two things happens: if the first 30 rows in sort order happen to satisfy the WHERE conditions, MySQL only needs to scan about 30 rows; but if fewer than 30 rows match, or the matching rows sit at the tail of the sort order, it ends up scanning nearly the whole table. The normal index on agent_code does not have this problem, because the filter condition is on agent_code itself, so the matching rows are located immediately. The EXPLAIN sketch after this list shows how to compare the plans.
  • Why is the primary key index chosen only once LIMIT is added?
    Without LIMIT, using the primary key index means checking the WHERE conditions row by row as described above, with no cap on the number of rows returned, i.e. a full table scan (you can verify with FORCE INDEX(PRIMARY) plus EXPLAIN that rows equals the total row count of the table, about 10 million), so MySQL judges the normal index to be faster: its estimated scan count is under 18,000 rows. Once LIMIT is added, however, the optimizer's estimated scan count for the primary key index can drop below the estimate for the normal index, and that is what leads to the wrong index being chosen.
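A minimal sketch of the comparison described in both points above; the idx_agent_code name is an assumption, and the exact rows estimates will vary per instance:

-- 1) Default plan with LIMIT: the optimizer picks PRIMARY because its
--    estimated row count with an early-exit LIMIT looks small (the wrong choice here).
EXPLAIN
SELECT a.consumer_code
FROM cp_consumer_order_2020_10 a
WHERE a.agent_code = xxxx AND a.order_status = xxx AND a.close_date IS NULL
ORDER BY a.consumer_code DESC
LIMIT 0, 30;

-- 2) Primary key forced, no LIMIT: the rows column shows roughly the whole
--    table (about 10 million), i.e. an ordered full scan filtered row by row.
EXPLAIN
SELECT a.consumer_code
FROM cp_consumer_order_2020_10 a FORCE INDEX (PRIMARY)
WHERE a.agent_code = xxxx AND a.order_status = xxx AND a.close_date IS NULL
ORDER BY a.consumer_code DESC;

-- 3) Normal index forced: rows is under 18,000, with Using filesort in Extra
--    because the matched rows must then be re-sorted by consumer_code.
EXPLAIN
SELECT a.consumer_code
FROM cp_consumer_order_2020_10 a FORCE INDEX (idx_agent_code)
WHERE a.agent_code = xxxx AND a.order_status = xxx AND a.close_date IS NULL
ORDER BY a.consumer_code DESC
LIMIT 0, 30;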

Source: blog.csdn.net/l6108003/article/details/109382533