MySQL in Practice: SQL Tuning Case Studies


Basic ideas of SQL tuning

SQL Tuning for an Operations System with Tens of Millions of Users

This article is a summary of "Take you to become a master of MySQL combat optimization from scratch".

At an Internet company, the operations system needs to filter out a large set of users and then push messages to them. The SQL executed is as follows:
*(screenshot of the original SQL; image not preserved)*
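Since the screenshot is lost, here is a minimal sketch of what the query presumably looked like, based on the surrounding description. The column names `user_id` and `latest_login_time` and the cutoff value are assumptions inferred from the text, not the original statement:

```sql
-- Hypothetical reconstruction: select the users to push messages to,
-- filtered by last login time in the extended-info table.
SELECT id, name
FROM users
WHERE id IN (
    SELECT user_id
    FROM users_extent_info
    WHERE latest_login_time < '2022-01-01'  -- assumed cutoff
);
```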
The users table stores core user data such as id, name, and nickname.

The users_extent_info table stores extended user information such as home address, hobbies, and last login time.

First, we need to count how many users match, so the following SQL is executed:
*(screenshot of the count SQL; image not preserved)*
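A sketch of the count query, reconstructed from context (the original screenshot is lost; the column names `user_id` and `latest_login_time` and the cutoff are assumptions):

```sql
-- Hypothetical reconstruction of the count query
SELECT COUNT(id)
FROM users
WHERE id IN (
    SELECT user_id
    FROM users_extent_info
    WHERE latest_login_time < '2022-01-01'  -- assumed cutoff
);
```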

Against a large table with tens of millions of rows, this SQL takes tens of seconds to run. The corresponding execution plan is as follows:
*(screenshot of the execution plan; image not preserved)*
First, look at the third row of the execution plan: its select_type is MATERIALIZED, which means the subquery has been materialized into a temporary table.

Next, note that the first and second rows of the execution plan share the same id value, which means the materialized table is joined against the users table.

Since it is a join, MySQL has, while generating the execution plan, automatically rewritten the ordinary IN subquery into a semi-join (semi_join) operation.

In this join, both the driving table and the driven table are scanned in full, which is the cause of the poor performance.

Let's verify this idea. First execute set optimizer_switch='semijoin=off' to disable semi-join optimization, then run the explain command again: the execution plan returns to a normal state.
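The verification step might look like this (the query text is a reconstruction under the same assumed column names and cutoff as above; `SET` without a scope keyword applies only to the current session, so other connections are unaffected):

```sql
-- Session-scoped: disables the semi-join rewrite for this connection only
SET optimizer_switch = 'semijoin=off';

-- Re-check the plan; the subquery should now show up as SUBQUERY
-- instead of MATERIALIZED plus a join
EXPLAIN
SELECT COUNT(id)
FROM users
WHERE id IN (
    SELECT user_id
    FROM users_extent_info
    WHERE latest_login_time < '2022-01-01'  -- assumed cutoff
);
```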

That is, the plan now shows a SUBQUERY scanned by range, plus a PRIMARY main query that looks rows up directly through the primary-key clustered index. Running the SQL again, performance improves by dozens of times, down to a little over 100 milliseconds.

Of course, optimizer settings cannot be changed casually in a production environment, so instead we rewrite the SQL itself so that the semi-join optimization is not triggered.

*(screenshot of the rewritten SQL; image not preserved)*
An OR condition is appended to the original statement, one that can never be true, because no row has a latest_login_time less than -1. With the OR in place, the query no longer satisfies the preconditions for semi-join optimization, so MySQL skips the semi-join rewrite and executes it as a normal subquery.
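A sketch of what the rewrite presumably looked like (reconstructed from the description above; column names and the cutoff are assumptions):

```sql
-- Hypothetical reconstruction of the rewrite. The second IN branch can
-- never match (no latest_login_time is below -1), but its mere presence
-- stops MySQL from applying the semi-join rewrite, so the subquery
-- executes as a plain SUBQUERY.
SELECT COUNT(id)
FROM users
WHERE id IN (
        SELECT user_id FROM users_extent_info
        WHERE latest_login_time < '2022-01-01'  -- assumed cutoff
      )
   OR id IN (
        SELECT user_id FROM users_extent_info
        WHERE latest_login_time < -1  -- never matches
      );
```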

SQL Tuning Practice for a Billion-Scale Product System

A slow query was found in the database monitoring system.
*(screenshot of the slow query; image not preserved)*
It is a very simple statement: filter by the product's category and subcategory, sort by id in descending order, then paginate. Yet this statement takes tens of seconds to execute.
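Reconstructed from the description (the screenshot is lost; the table name `products`, column names `category` / `sub_category`, and the page offset are all assumptions):

```sql
-- Hypothetical reconstruction of the slow paginated query
SELECT *
FROM products
WHERE category = 'xx' AND sub_category = 'xx'
ORDER BY id DESC
LIMIT 60, 20;  -- assumed page offset
```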

Database connections were saturated with these slow queries; each connection was tied up for tens of seconds per statement before it could run the next one, leaving the database effectively unusable.

*(screenshot of the table's indexes; image not preserved)*
In theory, with the category index in use the query should be very fast, so run explain on it and take a look.

Our index_category appears in possible_keys, yet the plan does not actually use it; it uses PRIMARY instead. This typically happens because, with order by id desc plus limit, the optimizer bets that scanning the clustered index backwards will hit 20 matching rows quickly; when matching rows are actually rare, that scan degenerates into reading a huge portion of the table.

Use the force index syntax to make the SQL use the index you specify:

*(screenshot of the force index statement; image not preserved)*

Executing the statement again, it now takes only around 100 ms, and performance recovers instantly.
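A sketch of the hint, using the index name shown in possible_keys above (the table and column names are assumptions carried over from the reconstructed query):

```sql
-- Hypothetical sketch: force the optimizer onto the category index
SELECT *
FROM products FORCE INDEX (index_category)
WHERE category = 'xx' AND sub_category = 'xx'
ORDER BY id DESC
LIMIT 60, 20;  -- assumed page offset
```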

SQL Tuning Practice for a Billion-Scale Order Comment System

The paginated query against the comments table:
*(screenshot of the pagination SQL; image not preserved)*
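Reconstructed from the description (the screenshot is lost; the table name `comments` and the deep offset value are assumptions, while the WHERE conditions come from the text):

```sql
-- Hypothetical reconstruction of the deep-pagination comment query
SELECT *
FROM comments
WHERE product_id = 'xx'
  AND is_good_comment = '1'
ORDER BY id DESC
LIMIT 100000, 20;  -- assumed deep page offset
```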

There is no composite index covering both conditions where product_id = 'xx' and is_good_comment = '1', so a large number of back-to-table (row lookup) operations inevitably occur, which is extremely time-consuming.

Rewrite the above SQL:

*(screenshot of the rewritten SQL; image not preserved)*

The rewrite completely changes the execution plan. The subquery in parentheses executes first: it scans the PRIMARY clustered index in descending id order and filters out the rows matching where product_id = 'xx' and is_good_comment = '1'.
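A sketch of the derived-table rewrite described above (table name and offset are assumptions carried over from the reconstructed original):

```sql
-- Hypothetical sketch: paginate on primary-key ids inside the subquery
-- first, then join back to the table to fetch the 20 full rows by id.
SELECT c.*
FROM (
    SELECT id
    FROM comments
    WHERE product_id = 'xx'
      AND is_good_comment = '1'
    ORDER BY id DESC
    LIMIT 100000, 20  -- assumed deep page offset
) AS t
JOIN comments c ON c.id = t.id
ORDER BY c.id DESC;
```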

In the execution plan you can see that the subquery's result set becomes a temporary (derived) table; a full scan of that table yields the 20 rows, and then each of those 20 ids is looked up in the clustered index to fetch the complete row. That's it.

Summary

The IN statement was optimized into a semi_join, and the resulting two-table join executed poorly; rewriting the SQL turned it back into a SUBQUERY.

The SQL statement did not use the right index; force index was used to force the correct one.

Deep pagination caused a large number of back-to-table operations, so the query was rewritten as a derived-table query.

Reference blog

[1]https://mp.weixin.qq.com/s/2ATCvniADrxyb0MhV5k3EQ


Origin blog.csdn.net/zzti_erlie/article/details/123645848