The pitfall of MySQL composite indexes!!!

 

The problem occurs for the first time!

Last Friday, the day of the company's annual meeting, the operations team called in the morning about UIOC: the CPU usage of the IMS application kept soaring, and UIOC was started immediately. Sure enough, I rushed to the company. After various investigations, we finally found that the number of concurrent DB connections was extremely high, the DB load was extremely high, and the Kafka backlog was serious. This lasted for an hour or two.

 

 

The first wave of solutions!

Checking the IMS application's thread stack (thread dump), we found that 167 of the 200 running threads were doing the same operation: when a user logs in, all of that user's tasks for the day are loaded and cached in Redis. The MySQL statement is basically as follows:

select * from A left outer join B on A.taskid=B.taskid where A.resourceid = '123456'

We already had a composite index on (A.modifytime, A.resourceid). A composite index is matched from left to right, so a condition on A.resourceid alone, which skips the leading column A.modifytime, cannot use the composite index at all, and the result is a full table scan. Clearly our SQL statement was missing a time condition, so that night it was changed to the following:

select * from A left outer join B on A.taskid=B.taskid where A.resourceid = '123456' and A.modifytime > '2017-01-09'
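To see the left-to-right matching for yourself, a minimal sketch like the one below can be run on a test instance. Only the table names A and B, the columns taskid, resourceid and modifytime, and the index column order come from the post; the column types, the extra title column, the primary keys and the index name idx_modifytime_resourceid are assumptions made for illustration.

```sql
-- Hypothetical schema: types, keys, the title column and the index name
-- are assumed; only the table and column names come from the post.
CREATE TABLE A (
    taskid     BIGINT       NOT NULL PRIMARY KEY,
    resourceid VARCHAR(32)  NOT NULL,
    modifytime DATETIME     NOT NULL,
    title      VARCHAR(255) NULL,
    KEY idx_modifytime_resourceid (modifytime, resourceid)
);

CREATE TABLE B (
    taskid BIGINT NOT NULL PRIMARY KEY
);

-- Original statement: the filter skips the leading index column (modifytime).
EXPLAIN
SELECT *
FROM A LEFT OUTER JOIN B ON A.taskid = B.taskid
WHERE A.resourceid = '123456';

-- Statement after the first fix. The date is quoted so that it is compared
-- as a date instead of being evaluated as the arithmetic 2017 - 01 - 09.
EXPLAIN
SELECT *
FROM A LEFT OUTER JOIN B ON A.taskid = B.taskid
WHERE A.resourceid = '123456'
  AND A.modifytime > '2017-01-09';
```

In the EXPLAIN output, the type, key and rows columns for table A show whether a full scan or an index was chosen. On an empty or very small test table the optimizer's choice may differ from what it does on production-sized data.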

 

 

The problem occurs a second time (+ principle analysis)!

I thought that after this change the index would take effect and everything would run normally. Instead, the DB raised alarms again over the next two days, and the culprit was still that same statement. EXPLAIN showed that the optimizer sometimes picked a different index, sometimes used no index at all, and sometimes used only half of the composite index (A.modifytime). The composite index has two columns and the where clause has conditions on both, so why was only half of it used? Later I learned that MySQL matches a composite index from left to right, and once it meets a range condition (here on A.modifytime), the columns after it in the composite index are no longer used to narrow the index lookup; the remaining condition is only applied as a filter when going back to the table for the rows.
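One way to check how much of a composite index a statement really uses is the key_len column of EXPLAIN, which reports the length in bytes of the index key that MySQL decided to use. The sketch below assumes the (modifytime, resourceid) index from the post is in place and drops the join to B for brevity; it is a diagnostic idea, not something the post describes doing.

```sql
-- Baseline: a range on the leading column only.
EXPLAIN
SELECT * FROM A
WHERE A.modifytime > '2017-01-09';

-- Range on the leading column plus equality on the second column.
-- If the key column names the same index and key_len is the same as in
-- the baseline, only the modifytime part is being used to locate rows.
EXPLAIN
SELECT * FROM A
WHERE A.resourceid = '123456'
  AND A.modifytime > '2017-01-09';
```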

There is another question: why would the index be skipped, or a different index be used?

I learned from the DBA that MySQL decides which index to use when the statement is executed (different from Oracle). If the optimizer estimates that using the index and then going back to the table for the rows costs more than simply scanning the full table, it may not use the index at all, or it may use another index that it considers more suitable.

Supplement: using half of the index sounds reasonable, so why were there still alarms? The main reason is that in some relatively large cities, the total number of tasks across all users each day is very large. With only half of the index hit, the number of rows that still have to be scanned after going back to the table keeps the DB pressure very high, especially during the morning peak.
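To see how the optimizer's own choice compares with the composite index, the plan can be requested twice, once as-is and once with an index hint. This is only a diagnostic sketch and not something the post describes doing; idx_modifytime_resourceid is a hypothetical name for the (modifytime, resourceid) index, which the post does not name.

```sql
-- Plan the optimizer picks on its own.
EXPLAIN
SELECT *
FROM A LEFT OUTER JOIN B ON A.taskid = B.taskid
WHERE A.resourceid = '123456'
  AND A.modifytime > '2017-01-09';

-- Plan when the composite index is forced, for comparison only.
-- idx_modifytime_resourceid is an assumed index name.
EXPLAIN
SELECT *
FROM A FORCE INDEX (idx_modifytime_resourceid)
LEFT OUTER JOIN B ON A.taskid = B.taskid
WHERE A.resourceid = '123456'
  AND A.modifytime > '2017-01-09';
```

Comparing the rows estimate of the two plans gives a rough idea of why the optimizer preferred one access path over the other.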

 

 

The ultimate wave of solutions!

That is the problem; how do we solve it?

From the business side, we know that the main table stores 15 days of task data for all users in the city. Filtering directly on A.resourceid is far more selective, and combined with the time condition (A.modifytime > '2017-01-09') it narrows the data down to a very small range. So we reversed the column order of the composite index to (A.resourceid, A.modifytime), so that the SQL statement hits this index every time. On the second day after the release, peak-hour monitoring showed that the DB's CPU usage had dropped from 80% to 12%.
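The index change could look roughly like the sketch below. Both index names are hypothetical, since the post only gives the column orders, and dropping the old index in the same statement is also an assumption; the post does not say whether it was kept.

```sql
-- Both index names are assumed; the post only gives the column orders.
ALTER TABLE A
    DROP INDEX idx_modifytime_resourceid,
    ADD INDEX idx_resourceid_modifytime (resourceid, modifytime);

-- Equality on the leading column, range on the second column:
-- both conditions can now be used to narrow the index lookup.
EXPLAIN
SELECT *
FROM A LEFT OUTER JOIN B ON A.taskid = B.taskid
WHERE A.resourceid = '123456'
  AND A.modifytime > '2017-01-09';
```

With this order, the equality condition pins down one resourceid and the range condition then limits the rows within it, which is the reason the whole composite index becomes usable.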

 

 

 

 

 

 

 

 
