The trap of mixing ORDER BY and LIMIT in MySQL

In MySQL, we often use ORDER BY for sorting and LIMIT for paging. When we need to sort first and then paginate, we typically write something like select * from table_name order by sort_column limit M,N. This pattern hides a subtle trap: when the sort column contains duplicate values, the paginated results can easily differ from what we expect.

For example, there is now a user table, the table structure and data are as follows:

Table Structure

table data

Now I want to query the user table in ascending order of creation time, with paging at 2 rows per page. It is easy to write the SQL as: select * from user order by create_time limit offset,2; where offset = (pageNo - 1) * 2, since the first argument of LIMIT is a row offset rather than a page number.
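To make the paging arithmetic concrete, here is a minimal sketch of the four page queries for an 8-row table at 2 rows per page. It assumes only what the text states: a table named user with a create_time column.

```sql
-- LIMIT offset, row_count: offset = (page - 1) * page_size
SELECT * FROM user ORDER BY create_time LIMIT 0, 2;  -- page 1
SELECT * FROM user ORDER BY create_time LIMIT 2, 2;  -- page 2
SELECT * FROM user ORDER BY create_time LIMIT 4, 2;  -- page 3
SELECT * FROM user ORDER BY create_time LIMIT 6, 2;  -- page 4
```

MySQL does not accept an arithmetic expression inside LIMIT in a plain statement, so the offset is normally computed by the application before the query is built.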

During the execution of the query, you will find: 
1. When querying the first page of data:

The first page of query results

2. When querying the data on the fourth page:

The fourth page of query results

The user table has 8 rows, which makes 4 pages, yet the same row appears on both the first and the fourth page when the queries are actually run.

What's happening here? Doesn't the paging SQL above first select the matching rows, then sort all of them, and only then take the rows for the requested page?

The results above show that reality does not match that assumption: the SQL is not executed that way. MySQL actually optimizes queries that use LIMIT; see the official documentation for the details: https://dev.mysql.com/doc/refman/5.7/en/limit-optimization.html
That page describes version 5.7. The points directly related to this problem are excerpted and explained below.

Paste_Image.png

The official documentation says that if you combine LIMIT row_count with ORDER BY, MySQL stops sorting as soon as it has found the first row_count rows of the sorted result, rather than sorting the entire result. If the ordering is done with an index, this is very fast. If a filesort is needed, all rows that match the query without the LIMIT clause are selected, and most or all of them are sorted, until the first row_count rows are found. Once those rows have been found, MySQL does not sort the remainder of the result set.

Here we look at the execution plan of the corresponding SQL:

Paste_Image.png

The plan confirms that a filesort is used and that the table has no additional indexes. So we can be sure that when this SQL is executed, MySQL finds the rows required by the LIMIT and returns the query result immediately.
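For readers who want to reproduce the plan, a minimal sketch follows. It assumes the same user table with no index on create_time; the exact output depends on your MySQL version and data.

```sql
-- Show the execution plan for the first-page query.
-- With no usable index on create_time, the Extra column
-- is expected to include "Using filesort".
EXPLAIN SELECT * FROM user ORDER BY create_time LIMIT 0, 2;
```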

But even if it returns immediately, why is the pagination inaccurate?

The official documentation states the following: 
Paste_Image.png

If multiple rows have the same value in the ORDER BY columns, MySQL is free to return those rows in any order, and that order may differ depending on the execution plan. In other words, the order of those rows is nondeterministic with respect to the columns that are not part of the ORDER BY.

Based on this, we can see why the paging is inaccurate: the column we sort by is create_time, and several rows happen to share the same value. The relative order of those rows is not fixed from one execution to the next. In the case above, the row whose name is 8 may be placed near the front when the first page is queried and near the back when the fourth page is queried, so it shows up again on the fourth page.
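One way to check whether a table is exposed to this problem is to look for duplicate values in the sort column. The query below is a sketch under the same assumptions (a user table with a create_time column); the alias tied_rows is only illustrative.

```sql
-- List every create_time value shared by more than one row.
-- Any row returned here marks a group whose internal order
-- is not determined by ORDER BY create_time alone.
SELECT create_time, COUNT(*) AS tied_rows
FROM user
GROUP BY create_time
HAVING COUNT(*) > 1;
```

If this returns no rows, ORDER BY create_time already yields a single possible order for the current data, although later inserts may still introduce ties.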

How should this situation be resolved?

The official documentation gives a solution:
Paste_Image.png

If you need the row order to be the same with and without LIMIT, add an additional sort column that makes the order deterministic. For example, if the id column is unique, appending id to the ORDER BY clause makes rows with the same create_time come back in a stable order.

Therefore, in the case above, you can add another sort column to the SQL, such as the id column of the user table, and the paging problem is solved. The modified SQL looks like this:
SELECT * FROM user ORDER BY create_time,id LIMIT 6,2;
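Applied to the two pages that collided earlier, the tie-breaker might look like the sketch below. It assumes that id is unique in the user table (for example, a primary key), which the text suggests but the lost table-structure image cannot confirm.

```sql
-- create_time decides the order first; id breaks ties between rows
-- with the same create_time, so every row has exactly one position.
SELECT * FROM user ORDER BY create_time, id LIMIT 0, 2;  -- page 1
SELECT * FROM user ORDER BY create_time, id LIMIT 6, 2;  -- page 4
```

Because the combined sort key is unique, the pages cannot overlap as long as the table data does not change between the queries.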

Test again, and the problem is solved!

