MySQL Advanced Road (19) - Order By and Group By Optimization

Order By and Group By optimization

1. Order By optimization

In daily work you will use ORDER BY in many places. Always showing the latest data first is a sort by time, and a leaderboard is a sort by points. It is very common.

For sorting, you might imagine that MySQL selects the rows that match the WHERE condition, loads them into memory or into a temporary file, sorts them, and then returns them. That is roughly what happens when no index can supply the order, and it is usually slow. It also has a problem: when the amount of data is large and you do not use LIMIT, for example when sorting a whole table, the rows do not all fit in memory. MySQL then sorts using temporary files on disk. MySQL's term for a sort that cannot be served from an index is **"filesort"**, and when it spills to disk on a large data set it is particularly slow.

The purpose of optimizing ORDER BY is to avoid filesort as much as possible. In other words, use indexes whenever possible when sorting. The reason is simple: the entries in the leaf nodes of an index are already stored in sorted order.

Example

Suppose we have the following SQL:

SELECT * FROM t1 ORDER BY key_part1, key_part2, key_part3;

To improve query efficiency, you can create a composite index on (key_part1, key_part2, key_part3). ORDER BY must also follow the leftmost prefix rule. When it does, sorting all columns ASC or all columns DESC can be served from the index without a filesort.
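As a minimal sketch of the point above, the statements below create the composite index and check the plan. The index name idx_key_parts is hypothetical; the table and columns are the ones from the example query. If the index is used for ordering, the Extra column of the EXPLAIN output should not contain "Using filesort". Because the first query selects every column, the optimizer may still decide that a table scan plus sort is cheaper than walking the whole index. The second query reads only indexed columns, so the index alone can answer it.

```sql
-- Hypothetical index name; the columns match the example query.
CREATE INDEX idx_key_parts ON t1 (key_part1, key_part2, key_part3);

-- Check the Extra column: "Using filesort" means the index was not used for ordering.
EXPLAIN SELECT * FROM t1 ORDER BY key_part1, key_part2, key_part3;

-- Reads only indexed columns, so the index can supply both the rows and their order.
EXPLAIN SELECT key_part1, key_part2, key_part3
FROM t1
ORDER BY key_part1, key_part2, key_part3;
```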

Precautions

1. Follow the leftmost prefix rule. Here is an example that violates it by skipping key_part2:

SELECT * FROM t1 ORDER BY key_part1, key_part3;

2. With a default (all-ascending) composite index, the ORDER BY columns must be either all ASC or all DESC; the directions cannot be mixed. Here is an example of the error:

SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC, key_part3 DESC;

3. ORDER BY must use the indexed column names directly, not expressions built on them. Here are examples of the error:

SELECT * FROM t1 ORDER BY ABS(key);
SELECT * FROM t1 ORDER BY -key;
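
The second precaution has one exception worth sketching, assuming MySQL 8.0 or later, where descending indexes are supported. If the index itself is declared with the same mix of directions, the mixed ORDER BY can be read from it. The index name idx_mixed is hypothetical, and the query selects only the indexed columns so that the index can answer it alone. On earlier versions, DESC in an index definition is parsed but ignored, so this does not help there.

```sql
-- Assumes MySQL 8.0+ (descending indexes); idx_mixed is a hypothetical name.
CREATE INDEX idx_mixed ON t1 (key_part1 DESC, key_part2 ASC, key_part3 DESC);

-- The ORDER BY directions match the index definition column by column.
EXPLAIN SELECT key_part1, key_part2, key_part3
FROM t1
ORDER BY key_part1 DESC, key_part2 ASC, key_part3 DESC;
```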

Summary

If a query sorts frequently, consider creating a composite index for it, and write the SQL so that it can use the index as much as possible and avoid filesort.

2. Group By optimization

In general, the most common way to execute GROUP BY is to scan the entire table and create a temporary table in which all rows of each group are contiguous. Any aggregate functions are then applied using this temporary table.

If a suitable index exists, MySQL can use it to read the groups directly and avoid creating the temporary table.

To use an index with GROUP BY, the rules are similar to those for ORDER BY. The difference worth mentioning here is implicit versus explicit sorting in GROUP BY.

Implicit Sort vs Explicit Sort

Implicit sorting means that the GROUP BY columns carry no ASC or DESC designator. Explicit sorting means you add the designator yourself.

GROUP BY performs implicit or explicit sorting. That is, by default MySQL sorts the grouped results by the grouping columns. This applies to versions before MySQL 8.0, which no longer sorts GROUP BY results implicitly. In layman's terms, each group appears to display the smallest or largest value in the group. This may be a bit abstract, so let's go straight to the example:

Example

Data:
[Image: sample data]

Result:
[Image: query result]
The figure above shows the result of implicit sorting. After grouping by age, the row displayed for each group is by default the smallest one in that group. That is, the rows of each group are arranged from small to large and the first one is shown.

Proof:

[Image: EXPLAIN output]

Analyzing the query with EXPLAIN shows that a temporary table is used and a filesort is performed, so the result is clearly being sorted. In general, though, we do not want GROUP BY to do implicit or explicit sorting. In that case we can add ORDER BY NULL to prevent the sort, or supply our own ORDER BY.

[Image: EXPLAIN output with ORDER BY NULL]
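
To check this on your own data, a sketch like the following can be used. It assumes the `user` table and its age column from this article's example; COUNT(*) and the index name idx_user_age are added for illustration. On versions before MySQL 8.0, the first plan would be expected to show "Using temporary; Using filesort" and the second only "Using temporary". Once the grouping column is indexed, the groups can be read in index order and neither should be needed.

```sql
-- Implicit sort (before MySQL 8.0): look for "Using filesort" in Extra.
EXPLAIN SELECT age, COUNT(*) FROM `user` GROUP BY age;

-- ORDER BY NULL asks MySQL not to sort the grouped result.
EXPLAIN SELECT age, COUNT(*) FROM `user` GROUP BY age ORDER BY NULL;

-- Hypothetical index on the grouping column, then re-check the plan.
CREATE INDEX idx_user_age ON `user` (age);
EXPLAIN SELECT age, COUNT(*) FROM `user` GROUP BY age;
```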

Effect

When the data returned by GROUP BY is not what we expected, we now know what to do. For example, suppose that in the example above I want each group to display the largest name, that is, the name column in descending order (this is only an example; the real requirement depends on the business). What should we do? Add ORDER BY after GROUP BY? That sorts the results after grouping; it does not sort within each group.

So how do we sort within each group? Sort all the data first, and then group it, like this:

SQL:

SELECT
	temp.* 
FROM
	(	# sort first
		SELECT * FROM USER ORDER BY `name` DESC
	) temp
# then group
GROUP BY
	temp.age

Result:
[Image: query result]
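
The query above depends on behavior that is not guaranteed. MySQL may ignore an ORDER BY inside a derived table when the outer query groups, and with the ONLY_FULL_GROUP_BY SQL mode (enabled by default since MySQL 5.7.5) selecting temp.* while grouping only by age is rejected. As an alternative sketch, not the method used in this article, the largest name per age can be computed explicitly and joined back. It assumes the `user` table with name and age columns from the example; the aliases u and m and the column alias max_name are illustrative.

```sql
-- For each age, find the largest name, then fetch the matching full rows.
SELECT u.*
FROM `user` AS u
JOIN (
    SELECT age, MAX(`name`) AS max_name
    FROM `user`
    GROUP BY age
) AS m
  ON m.age = u.age
 AND m.max_name = u.`name`;
```

If two rows in the same age group share the largest name, both are returned.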

3. Summary of this article

That is all for this article. It covered how ORDER BY and GROUP BY can use indexes as much as possible, and how to avoid implicit and explicit sorting in GROUP BY. I hope you will try the experiments yourself; you will come away with a deeper understanding.

If there is any problem with this content, or something that should be added, please leave a comment so we can make progress together!

Origin blog.csdn.net/weixin_44829930/article/details/121067405