Order By and Group By optimization
1. Order By optimization
In day-to-day work, Order By shows up everywhere: displaying the latest data first is a sort by time, a leaderboard is a sort by points, and so on. It is one of the most common operations there is.
When sorting, you might imagine that MySQL first selects the rows that match the where condition and then either sorts them in memory or writes them to a temporary file, sorts them there, and returns the result. That is essentially what happens when no index can supply the order, and the professional term for it is **"filesort"**. Despite the name, a filesort runs in the in-memory sort buffer when the rows fit. When they do not fit, for example when you sort a whole table without limit, MySQL falls back to temporary files on disk. Sorting on disk is particularly slow when the amount of data is large.
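As a rough way to check whether a sort stayed in memory or spilled to disk, the statements below can be run in one session. This is only a sketch: `sort_buffer_size` and `Sort_merge_passes` are standard MySQL names, while the query in the middle is a placeholder for whatever sort you are investigating.

```sql
-- Size of the per-session buffer that a filesort uses before it spills to disk.
SHOW VARIABLES LIKE 'sort_buffer_size';

-- Read the counter, run the sort, then read the counter again.
SHOW SESSION STATUS LIKE 'Sort_merge_passes';
-- (run the ORDER BY query under investigation here)
SHOW SESSION STATUS LIKE 'Sort_merge_passes';
```

If the counter grows between the two readings, the sort needed merge passes over temporary files, which is the slow case described above.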
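The purpose of optimizing Order By is to avoid filesort as far as possible. In other words, use indexes whenever possible when sorting. The reason is simple: the entries in the leaf nodes of an index are already stored in order, so rows read through the index come back sorted.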
Example
Suppose we have the following SQL:
SELECT * FROM t1 ORDER BY key_part1, key_part2, key_part3;
To make this query efficient, you can create a joint index on (key_part1, key_part2, key_part3). Order By must also follow the leftmost prefix rule. As long as it does, the sort avoids filesort whether the columns are all ASC or all DESC, because the index can be read in either direction.
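A minimal sketch of this setup follows. The column types, the `id` column and the index name `idx_parts` are assumptions made for illustration; only the table name and the three key columns come from the example above.

```sql
-- Hypothetical table definition; types and the id column are assumed.
CREATE TABLE t1 (
  id        INT NOT NULL PRIMARY KEY,
  key_part1 INT NOT NULL,
  key_part2 INT NOT NULL,
  key_part3 INT NOT NULL,
  KEY idx_parts (key_part1, key_part2, key_part3)
);

-- Ascending on every column: the index can be read front to back.
EXPLAIN SELECT key_part1, key_part2, key_part3
FROM t1
ORDER BY key_part1, key_part2, key_part3;

-- Descending on every column: the same index can be read back to front.
EXPLAIN SELECT key_part1, key_part2, key_part3
FROM t1
ORDER BY key_part1 DESC, key_part2 DESC, key_part3 DESC;
```

In both plans the Extra column should not contain `Using filesort`. Note that with `SELECT *` on a wider table the optimizer may still prefer a full scan plus filesort if it estimates that to be cheaper than reading every row through the index.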
Precautions
1. Follow the leftmost prefix rule. The following example breaks it, because it skips key_part2:
SELECT * FROM t1 ORDER BY key_part1, key_part3;
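2. All columns of the joint index must be sorted in the same direction, either all ASC or all DESC. Mixing directions prevents the index from being used for the sort (unless, from MySQL 8.0, the index itself is declared with the same mix of directions). Here is an example of the error: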
SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC, key_part3 DESC;
3. Order By should reference plain column names. Wrapping a column in an expression prevents the index from being used for the sort:
SELECT * FROM t1 ORDER BY ABS(key_part1);
SELECT * FROM t1 ORDER BY -key_part1;
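The three precautions above can be checked with Explain. The sketch below assumes the t1 table and the joint index on (key_part1, key_part2, key_part3) from the earlier example. The index name `idx_mixed` is hypothetical, and the last statement relies on descending indexes, which are only honoured from MySQL 8.0.

```sql
-- 1. Skips key_part2, so the index order cannot be used.
EXPLAIN SELECT * FROM t1 ORDER BY key_part1, key_part3;

-- 2. Mixed directions against an all-ascending index.
EXPLAIN SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC, key_part3 DESC;

-- 3. An expression on the column instead of the column itself.
EXPLAIN SELECT * FROM t1 ORDER BY ABS(key_part1);

-- MySQL 8.0 only: an index declared with the same mixed directions
-- lets the second query read rows in index order again.
CREATE INDEX idx_mixed ON t1 (key_part1 DESC, key_part2 ASC, key_part3 DESC);
```

Each of the three Explain plans is expected to show `Using filesort` in the Extra column before the extra index is created.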
Summary
If a sort runs frequently, consider creating a joint index for it, and write the SQL so that it can actually use that index, to reduce filesort.
2. Group By optimization
In general, the most common way to execute Group By is to scan the whole table and build a temporary table in which all rows of each group are contiguous. Any aggregate functions are then applied to that temporary table.
If a suitable index exists, MySQL can read the groups through the index and avoid creating the temporary table.
The rules for using an index in Group By are similar to those for Order By, including the leftmost prefix rule. The difference worth mentioning here is implicit versus explicit sorting in Group By.
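As a sketch of the difference, again assuming the t1 table with the joint index on (key_part1, key_part2, key_part3), the two plans below can be compared:

```sql
-- Groups on a leftmost prefix of the index: rows arrive already grouped.
EXPLAIN SELECT key_part1, key_part2, COUNT(*)
FROM t1
GROUP BY key_part1, key_part2;

-- Groups on key_part2 alone, which is not a leftmost prefix.
EXPLAIN SELECT key_part2, COUNT(*)
FROM t1
GROUP BY key_part2;
```

The first plan should show no `Using temporary` in the Extra column, while the second typically does.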
Implicit Sort vs Explicit Sort
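Implicit sorting means that the Group By columns carry no ASC or DESC designator. Explicit sorting means that you add one.
Either way the grouped result is sorted. That is to say, by default (in MySQL 5.7 and earlier) MySQL sorts the result of Group By on the grouping columns, as if you had written Order By on the same columns. In layman's terms, the groups come out ordered from the smallest grouping value to the largest. MySQL 8.0 dropped the implicit sort, and 8.0.13 removed the ASC and DESC designators for Group By. This may be a bit abstract, so let's go straight to the example: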
Example
Data:
Result:
The figure above shows the result of implicit sorting. After grouping by age, the groups are returned in ascending order of age by default. The row displayed for each group is the first (smallest) one in that group, although MySQL does not guarantee which row supplies the non-aggregated columns.
Proof:
Running Explain on the query shows that a temporary table is used and a filesort is performed (Using temporary; Using filesort in the Extra column), so the result is clearly being sorted. In most cases we do not need this sort during Group By. We can append ORDER BY NULL to suppress it, or add our own Order By to sort the way we actually want.
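A sketch of the comparison follows. It assumes a user table with an age column that has no index on it, and MySQL 5.7 or earlier, where the implicit sort still exists.

```sql
-- Implicit sort: expect "Using temporary; Using filesort" in Extra.
EXPLAIN SELECT age, COUNT(*) FROM user GROUP BY age;

-- Sort suppressed: "Using filesort" should disappear from Extra.
EXPLAIN SELECT age, COUNT(*) FROM user GROUP BY age ORDER BY NULL;
```

On MySQL 8.0 the first query is no longer sorted implicitly, so ORDER BY NULL makes no difference there.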
Purpose
Knowing this tells us what to do when the data returned by Group By is not what we expected. In the example above, suppose I want each group to display the largest name, that is, the name column in descending order within each group (this is only an example; the real requirement depends on the business). Should we add order by after group by? No: that sorts the result after grouping, not the rows within each group.
So how do we sort within each group? Sort all the data first, and then group it, like this:
SQL:
SELECT
temp.*
FROM
( # sort first
SELECT * FROM USER ORDER BY `name` DESC
) temp
# then group
GROUP BY
temp.age
Result:
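One caveat about the sort-then-group query: with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5) it is rejected, and MySQL 5.7 and later may ignore an Order By inside a derived table when the outer query is grouped. A deterministic alternative is sketched below. It assumes the same user table with name and age columns, and the aliases `u`, `m` and `max_name` are made up for illustration.

```sql
-- Largest name per age, computed explicitly with an aggregate.
SELECT age, MAX(name) AS max_name
FROM user
GROUP BY age;

-- Full rows for the largest name in each age group.
SELECT u.*
FROM user AS u
JOIN (
  SELECT age, MAX(name) AS max_name
  FROM user
  GROUP BY age
) AS m
  ON u.age = m.age
 AND u.name = m.max_name;
```

This form states the intent (the maximum per group) directly instead of relying on the order in which rows happen to reach Group By. If two rows in a group share the largest name, the second query returns both.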
3. Summary of this article
That is the whole content of this article. It introduced how Order By and Group By can make use of indexes as much as possible, and how Group By can avoid unnecessary implicit or explicit sorting. I hope you will try the experiments yourself; doing so should deepen your understanding.
If anything in this article is wrong or needs adding, please leave a comment so we can improve together!