MySQL index optimization (part two)


Small table drives big table


ORDER BY optimization

Simulation data

create table tblA(
#id int primary key not null auto_increment,
age int,
birth timestamp not null
);

insert into tblA(age, birth) values(22, now());
insert into tblA(age, birth) values(23, now());
insert into tblA(age, birth) values(24, now());

create index idx_A_ageBirth on tblA(age, birth);

View execution plan

Case A

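EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY age;         # follows index order; no filesort expected
EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY age,birth;   # follows index order; no filesort expected
EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY birth;       # skips the leading column; Using filesort expected
EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY birth,age;   # reversed column order; Using filesort expected
EXPLAIN SELECT * FROM tblA WHERE birth > '2021-02-19 22:45:00' ORDER BY birth;   # birth is not a leftmost prefix; Using filesort expected
EXPLAIN SELECT * FROM tblA WHERE birth > '2021-02-19 22:45:00' ORDER BY age;     # sorts on the leading column; no filesort expected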


Case B

EXPLAIN SELECT * FROM tblA ORDER BY age ASC,birth DESC;   # mixed directions on an all-ascending index; Using filesort expected


Conclusion

  • MySQL can produce sorted output in two ways: by reading an index in order, or by filesort. Reading the index is efficient because the scanned index is already sorted; filesort requires an extra sorting step and is slower.
  • For an ORDER BY clause, try to sort through an index and avoid filesort.
  • An index is used for sorting when either of two conditions holds. One is that the ORDER BY columns follow the leftmost prefix rule of the index; the other is that the WHERE condition columns and the ORDER BY columns together follow the leftmost prefix rule.
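
The two conditions can be tried against the tblA table and the idx_A_ageBirth index from the simulation data. This is only a sketch: the literal 22 is one of the sample ages, and the outcomes noted in the comments assume the optimizer picks that index, which may differ on other versions or data.

```sql
# Condition 1: the ORDER BY columns alone are a leftmost prefix of (age, birth).
EXPLAIN SELECT * FROM tblA ORDER BY age;
EXPLAIN SELECT * FROM tblA ORDER BY age, birth;

# Condition 2: the WHERE column (age, fixed by equality) plus the ORDER BY
# column (birth) together form the leftmost prefix (age, birth).
EXPLAIN SELECT * FROM tblA WHERE age = 22 ORDER BY birth;

# Counter-example: birth alone is not a leftmost prefix,
# so Using filesort is expected in the Extra column.
EXPLAIN SELECT * FROM tblA ORDER BY birth;
```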

MySQL sorting algorithm

When Using filesort appears in the execution plan, MySQL sorts the query results itself, using one of the two algorithms below.

Two-way sort

  • Before MySQL 4.1, the two-way sort was used. As the name suggests, it reads the disk twice to get the final data: it first reads the row pointers and the ORDER BY columns and sorts them, then scans the sorted list and reads the corresponding rows from disk again, by row pointer, to produce the output.
  • In simple terms, the sort columns are fetched from disk and sorted in the buffer, and the remaining columns are then fetched from disk in a second read. Disk I/O is very time-consuming, so after MySQL 4.1 an improved algorithm appeared: the single-way sort.

Single-way sort

  • Read all the columns required by the query from disk, sort them in the buffer by the ORDER BY columns, and then scan the sorted list for output. This is faster because it avoids reading the data a second time, and it turns random I/O into sequential I/O. The cost is more space, because every row is held in memory in full.
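
One way to check which of the two strategies a given sort used is the optimizer trace, available in MySQL 5.6 and later. The following is a sketch that assumes the tblA table from above. In the trace output, the filesort_summary section has a sort_mode entry: a mode containing rowid corresponds to the two-way sort, and a mode containing additional_fields corresponds to the single-way sort. The exact labels vary between versions.

```sql
# Enable the optimizer trace for the current session.
SET optimizer_trace = 'enabled=on';

# Run a query that needs a filesort (birth is not a leftmost prefix of the index).
SELECT * FROM tblA ORDER BY birth;

# Read the trace of the previous statement; look for "filesort_summary"
# and its "sort_mode" entry.
SELECT TRACE FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;

# Turn tracing off again.
SET optimizer_trace = 'enabled=off';
```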

The problem

  • In sort_buffer, the single-way sort takes up much more space than the two-way sort, because it loads every column the query needs. The total size of the fetched data can therefore exceed the capacity of sort_buffer. When that happens, MySQL can only load one sort_buffer's worth of data at a time, sort it, write it to a temporary file, then load and sort the next chunk, and finally merge the sorted chunks (multi-way merge). The result is many I/O operations: the algorithm set out to save one disk read but ends up causing far more, which defeats the purpose.
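
Whether a sort spilled to temporary files can be observed through the session status counters. A small sketch, assuming the tblA table from above: with only three sample rows the counter is not expected to move, so the query would have to be replaced by one over a large table to see the effect.

```sql
# Note the counter value before running the query.
SHOW SESSION STATUS LIKE 'Sort_merge_passes';

# A query that needs a filesort.
SELECT * FROM tblA ORDER BY birth;

# Sort_merge_passes counts the merge passes the sort had to make over
# temporary files; if it grows, the data did not fit into sort_buffer.
SHOW SESSION STATUS LIKE 'Sort_merge_passes';

# Related counters: Sort_rows, Sort_range, Sort_scan.
SHOW SESSION STATUS LIKE 'Sort%';
```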

How to optimize

  • Increase the sort_buffer_size parameter

    Whichever algorithm is used, a larger sort buffer helps. It must be raised in line with what the system can afford, because the buffer is allocated for each session that performs a sort, not once for the whole server. (Adjust between 1M and 8M.)

  • Increase the max_length_for_sort_data parameter

    MySQL uses the single-way sort only when the total size of the columns the query fetches does not exceed max_length_for_sort_data. Increasing this parameter makes it more likely that the improved algorithm is chosen. (As of MySQL 8.0.20 this variable is deprecated and no longer has an effect.)

    If it is set too high, however, the total data size is more likely to exceed sort_buffer_size, which leads to frequent disk I/O and low processor utilization. (Adjust between 1024 and 8192.)

  • Select fewer columns (avoid SELECT *)

    With fewer columns per row, the buffer can hold more rows, which is equivalent to indirectly increasing sort_buffer_size.
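
The three adjustments above might look like this in a session. The values are illustrative picks from the suggested ranges, not recommendations, and the settings apply only to the current session.

```sql
# Show the current values for this session.
SHOW VARIABLES LIKE 'sort_buffer_size';
SHOW VARIABLES LIKE 'max_length_for_sort_data';

# Raise the sort buffer to 4 MB (4 * 1024 * 1024 bytes) for this session only.
SET SESSION sort_buffer_size = 4194304;

# Raise the row-size threshold for the single-way sort.
# Only meaningful before MySQL 8.0.20, where the variable still has an effect.
SET SESSION max_length_for_sort_data = 4096;

# List only the columns that are needed instead of SELECT *,
# so each row takes less space in the sort buffer.
SELECT birth FROM tblA ORDER BY birth;
```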

Summary A


Summary B


GROUP BY optimization

GROUP BY optimization largely follows the same rules as ORDER BY optimization.

  • To sort through an index and avoid Using filesort, you can use a covering index.
  • The columns after ORDER BY / GROUP BY should appear in exactly the same order as in the composite index.
  • The leading index columns must appear, in order, after ORDER BY / GROUP BY; the trailing index columns may be omitted.
  • All sort columns must use the same direction: either all ascending or all descending, not a mix of the two.
  • If a leading column of the composite index is compared with a constant in the filter condition, the sort column can be the column immediately following it.
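
A few of these rules can be checked with EXPLAIN against the tblA table and the idx_A_ageBirth index from the simulation data. This is a sketch; the outcomes in the comments are what is generally expected and may differ by version, and the Extra column of each plan is the place to look.

```sql
# Same direction on every column: the index can be read backward,
# so no filesort is expected.
EXPLAIN SELECT age, birth FROM tblA ORDER BY age DESC, birth DESC;

# GROUP BY on the leading index column follows the index order.
EXPLAIN SELECT age, COUNT(*) FROM tblA GROUP BY age;

# GROUP BY on all index columns, in index order.
EXPLAIN SELECT age, birth, COUNT(*) FROM tblA GROUP BY age, birth;

# GROUP BY that skips the leading column: Using temporary
# (and on older versions also Using filesort) is expected.
EXPLAIN SELECT birth, COUNT(*) FROM tblA GROUP BY birth;
```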


Origin blog.csdn.net/single_0910/article/details/113873323