MySQL in practice: how to optimize the order by statement?


How does order by work?

CREATE TABLE `person` (
  `id` int(11) NOT NULL,
  `city` varchar(16) NOT NULL,
  `name` varchar(16) NOT NULL,
  `age` int(11) NOT NULL,
  `addr` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `city` (`city`)
) ENGINE=InnoDB;

Given the table above, run the following SQL:

explain select city, name, age from person where city = '杭州' order by name limit 1000;

The Extra column of the explain result contains Using filesort, which indicates that a sort is performed. The sort may be done in memory or on disk.

So how are the records sorted by the name field?

There are two sorting methods: full-field sorting and rowid sorting.

Full-field sorting

Full-field sorting proceeds as follows:

  1. Initialize the sort buffer, then find in the city index the primary key id of the first record that satisfies city = '杭州'
  2. Use the primary key id to go back to the table, fetch the whole row, take out the values of the name, city, and age fields, and store them in the sort buffer
  3. Read the primary key id of the next record from the city index
  4. Repeat steps 2 and 3 until the city value no longer satisfies the condition
  5. Sort the data in the sort buffer by the name field and return the first 1000 rows of the sorted result to the client
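The steps above can be sketched in Python. This is only a toy model, not MySQL's implementation: TABLE, CITY_INDEX, the sample rows, and full_field_sort are hypothetical names and data made up for illustration.

```python
from bisect import bisect_left

# Toy "clustered index": primary key id -> full row (hypothetical data).
TABLE = {
    1: {"id": 1, "city": "杭州", "name": "Wu", "age": 31, "addr": "x"},
    2: {"id": 2, "city": "苏州", "name": "An", "age": 25, "addr": "y"},
    3: {"id": 3, "city": "杭州", "name": "Chen", "age": 40, "addr": "z"},
    4: {"id": 4, "city": "杭州", "name": "Bai", "age": 22, "addr": "w"},
}
# Toy secondary index on city: (city, id) pairs kept in sorted order.
CITY_INDEX = sorted((row["city"], pk) for pk, row in TABLE.items())


def full_field_sort(city, limit):
    sort_buffer = []
    # Step 1: locate the first index entry for this city.
    pos = bisect_left(CITY_INDEX, (city,))
    # Steps 2-4: walk the index while the city still matches.
    while pos < len(CITY_INDEX) and CITY_INDEX[pos][0] == city:
        pk = CITY_INDEX[pos][1]
        row = TABLE[pk]  # go back to the table by primary key
        # All selected fields are stored in the sort buffer.
        sort_buffer.append((row["city"], row["name"], row["age"]))
        pos += 1
    # Step 5: sort by name and return the first `limit` rows.
    sort_buffer.sort(key=lambda r: r[1])
    return sort_buffer[:limit]


print(full_field_sort("杭州", 2))
```

Note that every selected column travels through the sort buffer, so nothing has to be looked up again after the sort.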

We can check the size of the sort buffer with the following statement:

show variables like '%sort_buffer%'


This sorting process is called full-field sorting.

The sort by name may be done entirely in memory, or it may require an external sort. Which one happens depends on the memory the sort needs and on sort_buffer_size (the size of the memory MySQL allocates for sorting, that is, the sort buffer).

If the data to be sorted is larger than sort_buffer_size, MySQL has to use temporary files on disk to help with the sort.
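The difference between the two cases can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the buffer limit is counted in rows rather than bytes, in-memory lists stand in for temporary files, and sort_with_buffer is a hypothetical helper, not a MySQL function.

```python
import heapq


def sort_with_buffer(rows, key, buffer_rows):
    """Return (sorted_rows, number_of_runs).

    number_of_runs is 0 when everything fits in the buffer; otherwise it is
    the number of sorted chunks that had to be merged (the chunks play the
    role of temporary files here).
    """
    if len(rows) <= buffer_rows:
        # Everything fits: one in-memory sort.
        return sorted(rows, key=key), 0
    # Too large: sort one buffer-sized chunk at a time...
    runs = [
        sorted(rows[i:i + buffer_rows], key=key)
        for i in range(0, len(rows), buffer_rows)
    ]
    # ...then merge the sorted chunks into one ordered result.
    merged = list(heapq.merge(*runs, key=key))
    return merged, len(runs)


names = ["Wu", "An", "Chen", "Bai", "Zhao"]
print(sort_with_buffer(names, key=str, buffer_rows=10))
print(sort_with_buffer(names, key=str, buffer_rows=2))
```

The smaller the buffer relative to the data, the more chunks have to be written and merged, which is why a sort that spills to disk is slower.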

Rowid sorting

If the query returns many fields, each row placed in the sort buffer is large, so fewer rows fit in memory at once. The sort is then split across many temporary files, and sorting performance becomes very poor.

So when a single row is large, full-field sorting is not efficient enough. Can we solve this by reducing the amount of data each row puts into the sort buffer?

When the size of a single row exceeds a fixed value, only the necessary fields (the primary key id and the sort field) are put into the sort buffer. After sorting by name, MySQL goes back to the table by primary key id to fetch the queried fields and returns them to the client. This process is called rowid sorting.
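A minimal Python sketch of rowid sorting, for comparison with full-field sorting. TABLE, the sample rows, and rowid_sort are hypothetical, and for brevity the matching rows are found by scanning the toy table instead of walking the city index.

```python
from operator import itemgetter

# Toy "clustered index": primary key id -> full row (hypothetical data).
TABLE = {
    1: {"id": 1, "city": "杭州", "name": "Wu", "age": 31, "addr": "x"},
    2: {"id": 2, "city": "苏州", "name": "An", "age": 25, "addr": "y"},
    3: {"id": 3, "city": "杭州", "name": "Chen", "age": 40, "addr": "z"},
    4: {"id": 4, "city": "杭州", "name": "Bai", "age": 22, "addr": "w"},
}


def rowid_sort(city, limit):
    # Only the sort field and the primary key id go into the sort buffer.
    sort_buffer = [
        (row["name"], pk) for pk, row in TABLE.items() if row["city"] == city
    ]
    sort_buffer.sort(key=itemgetter(0))
    result = []
    # After sorting, go back to the table once per returned row
    # to fetch the fields the query actually asked for.
    for _name, pk in sort_buffer[:limit]:
        row = TABLE[pk]
        result.append((row["city"], row["name"], row["age"]))
    return result


print(rowid_sort("杭州", 2))
```

The sort buffer entries are smaller, so more rows fit per pass, but each returned row costs one extra lookup in the table.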

We can set this threshold with the following statement. When the size of a single row exceeds it, MySQL switches to the rowid algorithm. (In MySQL 8.0.20 and later, max_length_for_sort_data is deprecated and no longer has any effect.)

SET max_length_for_sort_data = 16;
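The threshold rule can be sketched as follows. This is a deliberately simplified model: it adds up the declared lengths of the selected columns (16 for each varchar(16), 4 for int), ignoring character-set and length-byte overhead, and choose_sort_mode is a hypothetical helper rather than anything MySQL exposes.

```python
def choose_sort_mode(column_lengths, max_length_for_sort_data):
    """Pick the sort algorithm from the total length of the selected columns."""
    row_length = sum(column_lengths.values())
    if row_length > max_length_for_sort_data:
        return "rowid"
    return "full-field"


# Columns selected by the example query: city, name, age.
selected = {"city": 16, "name": 16, "age": 4}
print(choose_sort_mode(selected, 16))    # row length 36 exceeds 16
print(choose_sort_mode(selected, 1024))  # row length 36 fits
```

With the threshold set to 16, the example query's rows count as too long under this model, so the rowid algorithm would be chosen.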

Full-field sorting or rowid sorting: how does MySQL choose?

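When MySQL considers memory sufficient, it prefers full-field sorting: all required fields are already in the sort buffer, so it does not have to go back to the table after sorting, which reduces disk access. Only when memory is too small to sort efficiently does it use rowid sorting, which fits more rows into each sort pass at the cost of an extra table lookup for every returned row.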

Of course, not every order by statement requires a sort operation. MySQL needs to generate a temporary table and sort it only because the original data is not in order.

Is it possible to read the data so that name is already in order?

Yes: we just need to create a composite index on city and name.

alter table person add index city_user(city, name);

With this index in place, the Extra column of the execution plan no longer contains Using filesort, which means no sort is needed: for a single city value, the rows read from the index are already ordered by name.

Assuming the person table has a composite index on city and name, does the following statement require a sort?

explain select * from person where city in ('杭州') order by name limit 100


The answer is no: within a single city, the index entries are already ordered by name, so no sort is needed.

What about the following statement?

explain select * from person where city in ('杭州', '苏州') order by name limit 100

The answer is yes: names are ordered only within each city, not across several cities, so a sort is needed.
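The difference between the two statements can be seen in a small Python model of the composite index. The sample (city, name) pairs and the helper names_in_index_order are hypothetical. The sorted list only mimics the order in which the index city_user(city, name) stores its entries.

```python
from bisect import bisect_left

# Hypothetical (city, name) pairs; sorting them mimics the entry order
# of the composite index city_user(city, name).
PEOPLE = [
    ("杭州", "Wu"), ("苏州", "An"), ("杭州", "Bai"),
    ("苏州", "Chen"), ("杭州", "Li"),
]
CITY_NAME_INDEX = sorted(PEOPLE)


def names_in_index_order(cities):
    """Read names for the given cities in the order the index stores them."""
    names = []
    for city in sorted(cities):
        pos = bisect_left(CITY_NAME_INDEX, (city,))
        while pos < len(CITY_NAME_INDEX) and CITY_NAME_INDEX[pos][0] == city:
            names.append(CITY_NAME_INDEX[pos][1])
            pos += 1
    return names


one_city = names_in_index_order(["杭州"])
two_cities = names_in_index_order(["杭州", "苏州"])
print(one_city, one_city == sorted(one_city))
print(two_cities, two_cities == sorted(two_cities))
```

For one city the names come out already sorted. For two cities each city's block is sorted internally, but the combined sequence is not, which is why the second statement still shows Using filesort.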

When an order by statement runs slowly, we can optimize it in the following ways:

  1. Add an index that covers the sort field
  2. Increase the size of the sort buffer
  3. Do not use * as the select list; return only the columns you need


Origin blog.csdn.net/zzti_erlie/article/details/123710847