How order by works internally

When developing an application, you will often need to return query results sorted by a specified field.

For example: select city,name,age from t where city='Hangzhou' order by name limit 1000; where the city field has an ordinary (secondary) index on it.

Full-field sorting

Look at the result of explain:

"Using filesort" in the Extra field indicates that sorting is required. MySQL allocates each thread a block of memory for sorting, called sort_buffer.

Under normal circumstances, the execution flow of this statement is as follows:

  1. Initialize sort_buffer, determining that it will hold the three fields name, city, and age;
  2. Find the first primary key id that satisfies the condition city='Hangzhou' from the index city;
  3. Use that id to fetch the whole row from the primary key index, take the values of the three fields name, city, and age, and store them in sort_buffer;
  4. Fetch the primary key id of the next record from the index city;
  5. Repeat steps 3 and 4 until the value of city no longer meets the query condition;
  6. Quicksort the data in sort_buffer by the field name;
  7. Following the sorted order, return the first 1000 rows to the client.
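The steps above can be sketched in Python. This is a toy model, not MySQL internals: the table is a list of dicts, the index lookup is simulated with a linear scan, and sort_buffer is just a Python list.

```python
# Toy sketch of full-field sorting: all needed columns go into sort_buffer,
# the buffer is sorted by name, and the first `limit` rows are returned.
def full_field_sort(table, city, limit):
    sort_buffer = []
    # Steps 2-5: collect every matching row, storing all three fields.
    for row in table:
        if row["city"] == city:
            sort_buffer.append((row["name"], row["city"], row["age"]))
    # Step 6: sort by name.
    sort_buffer.sort(key=lambda r: r[0])
    # Step 7: return the first `limit` rows.
    return sort_buffer[:limit]

table = [
    {"city": "Hangzhou", "name": "Wang", "age": 30},
    {"city": "Suzhou",   "name": "Li",   "age": 25},
    {"city": "Hangzhou", "name": "Chen", "age": 40},
]
print(full_field_sort(table, "Hangzhou", 1000))
# [('Chen', 'Hangzhou', 40), ('Wang', 'Hangzhou', 30)]
```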

Let's call this sorting process full-field sorting for now. The schematic diagram of the execution process is as follows:

(figure: execution flow of full-field sorting)

The "sort by name" action in the figure may be done entirely in memory, or it may need an external sort; this depends on how much memory the sort requires and on the parameter sort_buffer_size, the size of the memory MySQL allocates for sorting. If the amount of data to be sorted is too large to fit in that memory, temporary disk files must be used to assist the sort.

The following method can be used to determine whether a sorting statement uses a temporary file:

/* Enable optimizer_trace; effective only for the current thread */
SET optimizer_trace='enabled=on'; 
/* Save the initial value of Innodb_rows_read into @a */
select VARIABLE_VALUE into @a from  performance_schema.session_status where variable_name = 'Innodb_rows_read';
/* Execute the statement */
select city, name,age from t where city='Hangzhou' order by name limit 1000; 
/* View the OPTIMIZER_TRACE output */
SELECT * FROM `information_schema`.`OPTIMIZER_TRACE`\G
/* Save the current value of Innodb_rows_read into @b */
select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';
/* Compute the change in Innodb_rows_read */
select @b-@a;

With this method, you confirm the behavior by viewing the OPTIMIZER_TRACE result: the number_of_tmp_files field shows whether temporary files were used.

(figure: OPTIMIZER_TRACE output)

number_of_tmp_files is the number of temporary files used in the sorting process. When memory is insufficient, an external sort is needed, and external sorts generally use the merge sort algorithm. You can understand it simply like this: MySQL splits the data to be sorted into 12 parts, sorts each part separately and stores it in one of these temporary files, then merges the 12 ordered files into one large ordered file.

If sort_buffer_size is larger than the amount of data to be sorted, number_of_tmp_files is 0, meaning the sort can be done entirely in memory. The smaller sort_buffer_size is, the more parts the data must be split into, and the larger number_of_tmp_files becomes, because each sorted chunk written to a temporary file is limited by sort_buffer_size.
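The split-sort-merge behavior can be sketched as follows. This is a simplified model where sort_buffer_size is counted in rows rather than bytes, and "temporary files" are just in-memory chunks:

```python
import heapq

def external_sort(rows, sort_buffer_size):
    """Sort rows using chunks of at most sort_buffer_size rows each."""
    # Split into chunks that each fit in the sort buffer; sort each chunk
    # (each sorted chunk stands in for one temporary file).
    chunks = [sorted(rows[i:i + sort_buffer_size])
              for i in range(0, len(rows), sort_buffer_size)]
    # A single chunk means the whole sort fit in memory: no temporary files.
    number_of_tmp_files = len(chunks) if len(chunks) > 1 else 0
    # Merge the ordered chunks into one large ordered result.
    return list(heapq.merge(*chunks)), number_of_tmp_files

data = [5, 3, 8, 1, 9, 2, 7, 4]
print(external_sort(data, 3))   # small buffer -> 3 "temporary files"
print(external_sort(data, 8))   # buffer holds everything -> 0 files
```

Note how shrinking the buffer from 8 to 3 raises number_of_tmp_files from 0 to 3, mirroring the relationship described above.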

In addition, examined_rows = 4000 means that 4000 rows took part in the sort. packed_additional_fields in sort_mode means the strings were "packed" during sorting: even though name is defined as varchar(16), space is allocated according to the actual length of each value.

The last query, select @b-@a, returns 4000, meaning only 4000 rows were scanned during the whole execution. Note that to avoid interfering with this conclusion, I set internal_tmp_disk_storage_engine to MyISAM; otherwise select @b-@a would return 4001.

This is because querying the OPTIMIZER_TRACE table requires a temporary table, and the default value of internal_tmp_disk_storage_engine is InnoDB. With the InnoDB engine, Innodb_rows_read is incremented by 1 when the data is read out of that temporary table.

rowid sort

One problem with the full-field sorting algorithm: if the query returns many fields, each row placed in sort_buffer is large, so fewer rows fit in memory at once, the data must be split into more temporary files, and sorting performance suffers. In short, if a single row is large, this method is not efficient enough.

If MySQL decides that a single row to be sorted is too long, it switches to another algorithm: rowid sorting. Let's modify a parameter to demonstrate:

SET max_length_for_sort_data = 16;

max_length_for_sort_data is a MySQL parameter that controls the maximum length of row data used for sorting: if a single row exceeds this value, MySQL considers the row too large and switches to the other algorithm. The total defined length of the three fields city, name, and age exceeds 16, so this setting lets us rerun the test and observe the new behavior.

With the new algorithm, the only fields put into sort_buffer are the column to be sorted (name) and the primary key id. But now the sorted result cannot be returned directly, because the values of the city and age fields are missing, so the whole execution flow becomes:

  1. Initialize sort_buffer, determining that it will hold two fields: name and id;
  2. Find the first primary key id that satisfies the condition city='Hangzhou' from the index city;
  3. Use that id to fetch the whole row from the primary key index, take the two fields name and id, and store them in sort_buffer;
  4. Fetch the primary key id of the next record from the index city;
  5. Repeat steps 3 and 4 until the condition city='Hangzhou' is no longer satisfied;
  6. Sort the data in sort_buffer by the field name;
  7. Traverse the sorted result, take the first 1000 rows, and for each one go back to the original table by id to fetch the three fields city, name, and age, returning them to the client.
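Rowid sorting can be sketched the same way. Again a toy model: the primary key index is a dict from id to row, and the city index is a list of (city, id) pairs.

```python
# Toy sketch of rowid sorting: only (name, id) pairs enter sort_buffer;
# full rows are fetched from the primary key index after the sort.
def rowid_sort(primary_index, city_index, city, limit):
    # Steps 1-5: put only (name, id) pairs into sort_buffer.
    sort_buffer = [(primary_index[rid]["name"], rid)
                   for c, rid in city_index if c == city]
    # Step 6: sort by name alone.
    sort_buffer.sort()
    # Step 7: for the first `limit` entries, go back to the table by id.
    result = []
    for _, rid in sort_buffer[:limit]:
        row = primary_index[rid]
        result.append((row["city"], row["name"], row["age"]))
    return result

primary_index = {
    1: {"city": "Hangzhou", "name": "Wang", "age": 30},
    2: {"city": "Suzhou",   "name": "Li",   "age": 25},
    3: {"city": "Hangzhou", "name": "Chen", "age": 40},
}
city_index = [("Hangzhou", 1), ("Hangzhou", 3), ("Suzhou", 2)]
print(rowid_sort(primary_index, city_index, "Hangzhou", 1000))
# [('Hangzhou', 'Chen', 40), ('Hangzhou', 'Wang', 30)]
```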

The schematic diagram of the execution flow is as follows; I call it rowid sorting.

(figure: execution flow of rowid sorting)

Compared with full-field sorting, rowid sorting accesses the primary key index of table t one more time, namely step 7.

It should be noted that the final "result set" here is a logical concept. In fact, the MySQL server takes the ids out of the sorted sort_buffer one by one, looks up the three fields city, name, and age in the original table, and returns them to the client directly, without consuming extra server memory to store the results.

The value of examined_rows in the figure is still 4000, indicating that 4000 rows were used for sorting. But the value of select @b-@a becomes 5000, because now, in addition to the sorting process, the original table must be visited again by id after the sort completes. Since the statement is limit 1000, 1000 extra rows are read.

From the OPTIMIZER_TRACE result, you can see that two other pieces of information have also changed.

  • sort_mode becomes <sort_key, rowid>, which means only the two fields name and id take part in the sort.
  • number_of_tmp_files drops to 10: the number of rows taking part in the sort is still 4000, but each row is smaller, so the total amount of data to sort is smaller and fewer temporary files are needed.

What conclusions can be drawn from the execution flow of the two algorithms?

  • If MySQL judges that the sorting memory is too small and would hurt sorting efficiency, it uses the rowid sorting algorithm, which lets more rows be sorted at once, but requires going back to the original table to fetch data.
  • If MySQL judges that the memory is large enough, it prefers full-field sorting and puts all required fields into sort_buffer, so the result can be returned straight from memory after sorting, without going back to the original table.

This also reflects a design philosophy of MySQL: if there is enough memory, more memory must be used to minimize disk access.

Optimizing order by with indexes, starting from the business logic

In fact, not all order by statements require a sorting operation. From the execution process analyzed above, the reason MySQL needs a sort_buffer and a sorting operation is that the original data is unordered. So how can we ensure that the rows retrieved via the city index are already sorted by name in ascending order?

  • Optimization 1: create a joint index on city and name

With this index, we can still use tree search to locate the first record that satisfies city='Hangzhou', and we can additionally guarantee that, as we traverse to fetch the "next record" in order, as long as the value of city is 'Hangzhou', the values of name are guaranteed to be in order. The flow of the whole query then becomes:

  1. Find the first primary key id that meets the condition city='Hangzhou' from the index (city, name);
  2. Use that id to fetch the whole row from the primary key index, take the values of the three fields name, city, and age, and return them directly as part of the result set;
  3. Fetch the primary key id of the next record from the index (city, name);
  4. Repeat steps 2 and 3 until the 1000th record is found, or the loop ends when the condition city='Hangzhou' is no longer met.
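The flow above can be sketched as follows, with the (city, name) index modeled as a sorted list of (city, name, id) tuples, reusing the toy primary-index dict from earlier examples:

```python
# Toy sketch of the (city, name) joint-index scan: entries are already
# ordered by (city, name), so no sort is needed and the scan can stop early.
def joint_index_scan(index_entries, primary_index, city, limit):
    result = []
    for c, _, rid in index_entries:     # entries sorted by (city, name)
        if c < city:
            continue                    # before the first matching entry
        if c > city:
            break                       # past the last matching entry: loop ends
        row = primary_index[rid]        # back to the table for all three fields
        result.append((row["name"], row["city"], row["age"]))
        if len(result) == limit:
            break                       # early exit: only `limit` rows scanned
    return result

primary_index = {
    1: {"city": "Hangzhou", "name": "Wang", "age": 30},
    2: {"city": "Suzhou",   "name": "Li",   "age": 25},
    3: {"city": "Hangzhou", "name": "Chen", "age": 40},
}
index_entries = [("Hangzhou", "Chen", 3), ("Hangzhou", "Wang", 1),
                 ("Suzhou", "Li", 2)]
print(joint_index_scan(index_entries, primary_index, "Hangzhou", 1000))
# [('Chen', 'Hangzhou', 40), ('Wang', 'Hangzhou', 30)]
```

A real B+-tree would seek straight to the first 'Hangzhou' entry instead of skipping linearly; the early exit after `limit` rows is the point being illustrated.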

This query process does not require temporary tables, nor does it require sorting. Next, we use the result of explain to confirm it.

In addition, because the (city,name) joint index is itself ordered, this query does not need to read all 4000 rows; it can stop as soon as it finds the first 1000 records that meet the condition. In other words, in our example it only needs to scan 1000 rows.

  • Optimization 2: use a covering index by creating a joint index on city, name, and age.

In this way, the execution flow of the entire query statement becomes:

  1. Find the first record that meets the condition city='Hangzhou' from the index (city, name, age), take the values of the three fields city, name, and age, and return them directly as part of the result set;
  2. Fetch the next record from the index (city, name, age), likewise take the values of these three fields, and return them directly as part of the result set;
  3. Repeat step 2 until the 1000th record is found, or the loop ends when the condition city='Hangzhou' is no longer met.
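The covering-index variant is even simpler to sketch: the index entries (city, name, age) already hold every column the query needs, so there is no back-to-table lookup at all.

```python
# Toy sketch of a covering-index scan over sorted (city, name, age) entries.
def covering_index_scan(index_entries, city, limit):
    result = []
    for c, n, a in index_entries:      # entries sorted by (city, name)
        if c < city:
            continue
        if c > city:
            break                      # past the matching range
        result.append((c, n, a))       # no primary-key lookup needed
        if len(result) == limit:
            break
    return result

index_entries = [("Hangzhou", "Chen", 40), ("Hangzhou", "Wang", 30),
                 ("Suzhou", "Li", 25)]
print(covering_index_scan(index_entries, "Hangzhou", 1000))
# [('Hangzhou', 'Chen', 40), ('Hangzhou', 'Wang', 30)]
```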

As you can see, "Using index" is added to the Extra field, which means a covering index is used; this is much faster.

Of course, this does not mean you should build a joint index on every field involved in a statement just so each query can use a covering index; after all, indexes have a maintenance cost. This is a decision that needs to be weighed.

Extended case

select * from t where city in ("Hangzhou","Suzhou") order by name limit 100; Does this SQL statement need to be sorted? Is there any solution to avoid sorting?

Even with the (city,name) joint index, name is only in increasing order within a single city. Because this statement queries the two cities "Hangzhou" and "Suzhou" at once, the names across all matching rows are not in order overall, so this statement does need to sort.

From the perspective of business development, we can implement a solution that avoids sorting on the database side. Furthermore, suppose paging is needed and page 101 must be displayed, meaning the statement would end with "limit 10000,100". What would your implementation be?

We can exploit the ordering of the (city, name) joint index and split this statement into two. The execution flow is as follows:

  1. Execute select * from t where city="Hangzhou" order by name limit 100; This statement does not require sorting. The client stores the result in a memory array A of length 100.
  2. Execute select * from t where city="Suzhou" order by name limit 100; Likewise, store the result in a memory array B.
  3. Now A and B are two ordered arrays; use the merge step of merge sort to get the 100 rows with the smallest name values, which is the result we need.
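Step 3 can be sketched with Python's heapq.merge, which is exactly the merge step of merge sort. Here the per-city results are modeled as lists of (name, id) tuples, and the optional offset parameter models the "limit 10000,100" paging variant discussed below:

```python
import heapq
from itertools import islice

# Merge two per-city result sets, each already ordered by name, and
# return `limit` rows starting at `offset` (offset=10000 models
# the "limit 10000,100" case).
def merge_city_results(a, b, limit, offset=0):
    merged = heapq.merge(a, b)          # lazy merge of the two ordered lists
    return list(islice(merged, offset, offset + limit))

hangzhou = [("Chen", 3), ("Wang", 1)]   # result of the first statement
suzhou   = [("Li", 2), ("Zhang", 4)]    # result of the second statement
print(merge_city_results(hangzhou, suzhou, 3))
# [('Chen', 3), ('Li', 2), ('Wang', 1)]
```

Because the merge is lazy, the paged variant never materializes more than offset + limit rows on the client.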

If you change the "limit 100" in this SQL statement to "limit 10000,100", the approach is similar: change the two statements above to:
select * from t where city="Hangzhou" order by name limit 10100; and select * from t where city="Suzhou" order by name limit 10100;

The amount of data is larger now, but the approach is the same: merge the two ordered result sets with merge sort and take the records whose name values rank 10001 to 10100; that is the desired result.

If the rows are relatively large, you can consider changing * to id, name: use merge sort to obtain the (name, id) pairs ranked 10001 to 10100 by name, then take those 100 ids back to the database to fetch the full records.

 

Content source: Lin Xiaobin "45 Lectures on MySQL Actual Combat"


Origin blog.csdn.net/qq_24436765/article/details/112615250