With one round of tuning, I made a SQL query ten million times faster!

Scenario

The database I use is MySQL 5.6. Here is a brief description of the scenario.

Course table:

create table Course(
    c_id int PRIMARY KEY,
    name varchar(10)
)

Rows: 100

Student table:

create table Student(
    s_id int PRIMARY KEY,
    name varchar(10)
)

Rows: 70,000

SC (student score) table:

CREATE table SC(
    sc_id int PRIMARY KEY,
    s_id int,
    c_id int,
    score int
)

Rows: 700,000

Query goal: find the students who scored 100 on the language exam.

Query:

select s.* from Student s where s.s_id in (select s_id from SC sc where sc.c_id = 0 and sc.score = 100 )

Execution Time: 30248.271s

Wow, why is it so slow? Let's look at the query plan:

EXPLAIN select s.* from Student s where s.s_id in (select s_id from SC sc where sc.c_id = 0 and sc.score = 100 )

The plan shows that no index is used and type is ALL throughout, so the first thing that comes to mind is building indexes. The obvious columns to index are the ones in the where condition.

First, build indexes on c_id and score of the SC table:

CREATE index sc_c_id_index on SC(c_id);
CREATE index sc_score_index on SC(score);

Running the query again takes 1.054s.

That is roughly 30,000 times faster. The query time drops dramatically; indexes clearly can improve query efficiency a great deal, and building them is essential.

It is easy to forget to build indexes, and with a small amount of data you would not even notice. This optimization feels great.

But 1s is still too long. Can it be optimized further? Take a close look at the execution plan:

SELECT
    `YSB`.`s`.`s_id` AS `s_id`,
    `YSB`.`s`.`name` AS `name`
FROM
    `YSB`.`Student` `s`
WHERE
    <in_optimizer> (
        `YSB`.`s`.`s_id` ,<exists> (
            SELECT
                1
            FROM
                `YSB`.`SC` `sc`
            WHERE
                (
                    (`YSB`.`sc`.`c_id` = 0)
                    AND (`YSB`.`sc`.`score` = 100)
                    AND (
                        <cache> (`YSB`.`s`.`s_id`) = `YSB`.`sc`.`s_id`
                    )
                )
        )
    )

Note: a reader asked how to view the statement after the optimizer has rewritten it. It is done by running commands in a command-line window.

The plan there still shows type = ALL.
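One way to see the rewritten statement in MySQL 5.6 is sketched below, assuming the tables defined earlier. The idea is to run EXPLAIN EXTENDED on the query and then SHOW WARNINGS in the same session; the Note returned by SHOW WARNINGS carries the statement as the optimizer rewrote it.

```sql
-- Step 1: explain the query with the EXTENDED keyword (MySQL 5.6 syntax).
EXPLAIN EXTENDED
SELECT s.*
FROM Student s
WHERE s.s_id IN (SELECT s_id FROM SC sc WHERE sc.c_id = 0 AND sc.score = 100);

-- Step 2: in the same session, print the statement after the optimizer's rewrite.
SHOW WARNINGS;
```

On MySQL 5.7 the EXTENDED keyword is deprecated, and in 8.0 it is removed; a plain EXPLAIN followed by SHOW WARNINGS serves the same purpose there.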

By my earlier reasoning, the SQL should execute the subquery first:

select s_id from SC sc where sc.c_id = 0 and sc.score = 100

Time: 0.001s

It returns a few s_id values, which then feed the outer query:

select s.* from Student s where s.s_id in(7,29,5000)

Time: 0.001s

That would be quite fast. But MySQL does not run the inner query first. The optimizer rewrites the in clause into an exists clause, and a DEPENDENT SUBQUERY appears in the plan. MySQL executes the outer query first and then runs the inner query for each outer row, so it loops 70007 * 8 times.
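The rewrite can be spelled out by hand. The following is a sketch of the correlated form, reconstructed from the optimizer output shown earlier; it is not a query taken from the original measurements.

```sql
-- Roughly what the optimizer turned the IN subquery into.
-- The inner SELECT references s.s_id from the outer query, so it is
-- re-evaluated for each Student row (a DEPENDENT SUBQUERY in EXPLAIN).
SELECT s.*
FROM Student s
WHERE EXISTS (
    SELECT 1
    FROM SC sc
    WHERE sc.c_id = 0
      AND sc.score = 100
      AND sc.s_id = s.s_id
);
```

Written this way, it is easier to see why the cost scales with the number of rows in Student rather than with the handful of rows the subquery returns on its own.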

So what about using a join query instead?

SELECT s.* from
Student s
INNER JOIN SC sc
on sc.s_id = s.s_id
where sc.c_id=0 and sc.score=100

To analyze the join query from a clean start, temporarily drop the indexes sc_c_id_index and sc_score_index.
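Dropping the two indexes could look like the sketch below, which uses the index names created earlier and then lists what remains on the table.

```sql
-- Remove the two single-column indexes created earlier.
DROP INDEX sc_c_id_index ON SC;
DROP INDEX sc_score_index ON SC;

-- Confirm which indexes are left on SC.
SHOW INDEX FROM SC;
```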

Execution time: 0.057s

Efficiency has improved. Take a look at the execution plan:

The plan shows the tables being joined, so my guess is that the SC table needs an index on s_id:

CREATE index sc_s_id_index on SC(s_id);

show index from SC

Run the join query again.

Time: 1.076s. It actually got slower. What is the reason? View the execution plan:

The optimized query is:

SELECT
    `YSB`.`s`.`s_id` AS `s_id`,
    `YSB`.`s`.`name` AS `name`
FROM
    `YSB`.`Student` `s`
JOIN `YSB`.`SC` `sc`
WHERE
    (
        (
            `YSB`.`sc`.`s_id` = `YSB`.`s`.`s_id`
        )
        AND (`YSB`.`sc`.`score` = 100)
        AND (`YSB`.`sc`.`c_id` = 0)
    )

It appears to do the join first and then apply the where filter.

Going back to the earlier execution plan:

There, the where filter was applied first and the join came afterwards, so the execution plan is not fixed. Let's look at the standard SQL execution order:

Normally the join is done first and the where filter afterwards. But in our case, joining first would send 700,000 rows into the join, so filtering with where first is the wiser plan.
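The commonly taught logical processing order of a SELECT can be annotated directly on a query, as in this sketch. The numbers give the logical order only; the optimizer is free to pick a different physical order, which is exactly what the plans above show. The GROUP BY, HAVING, ORDER BY, and LIMIT clauses are added purely for illustration and are not part of the queries in this post.

```sql
SELECT   s.s_id, COUNT(*) AS hits      -- (5) compute the output columns
FROM     Student s                     -- (1) read the source tables
JOIN     SC sc ON sc.s_id = s.s_id     -- (1) join them on the ON condition
WHERE    sc.score = 100                -- (2) filter rows
GROUP BY s.s_id                        -- (3) form groups
HAVING   COUNT(*) >= 1                 -- (4) filter groups
ORDER BY hits DESC                     -- (6) sort the result
LIMIT    10;                           -- (7) keep the first rows
```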

Now, to rule out MySQL's own query optimization, I wrote the optimized SQL by hand:

SELECT
    s.*
FROM
    (
        SELECT
            *
        FROM
            SC sc
        WHERE
            sc.c_id = 0
        AND sc.score = 100
    ) t
INNER JOIN Student s ON t.s_id = s.s_id

That is, filter the SC table first and then join. Execution time: 0.054s

This is about the same as before the s_id index was built. View the execution plan:

SC is extracted first and then joined, so the efficiency is much higher. The remaining problem is that extracting from SC does a full table scan, so it is now clear which indexes need to be built:

CREATE index sc_c_id_index on SC(c_id);
CREATE index sc_score_index on SC(score);

And then execute the query:

SELECT
    s.*
FROM
    (
        SELECT
            *
        FROM
            SC sc
        WHERE
            sc.c_id = 0
        AND sc.score = 100
    ) t
INNER JOIN Student s ON t.s_id = s.s_id

Execution time: 0.001s. That is quite impressive, 50 times faster.

Execution plan:

We can see that SC is extracted first and then joined, and indexes are used throughout.

Now go back to this SQL:

SELECT s.* from
Student s
INNER JOIN SC sc
on sc.s_id = s.s_id
where sc.c_id=0 and sc.score=100

Execution time 0.001s

Execution plan:

Here MySQL has optimized the query: the where filter is applied first, then the join, and both steps use indexes.

===========================================================

(I am a gorgeous dividing line)

I recently imported some production data again, and testing showed that the SQL I had finished optimizing a few days earlier had become slow again.

I adjusted the contents of the SC table, growing it to 3,000,000 rows with scores that are more spread out.

First, review the indexes:

show index from SC

Run the SQL:

SELECT s.* from
Student s
INNER JOIN SC sc
on sc.s_id = s.s_id
where sc.c_id=81 and sc.score=84

Execution time: 0.061s, a bit slow this time.

Execution plan:

An intersect operation is used here: the two indexes are searched at the same time and the results are intersected. Now look at how selective the c_id and score columns are.

Taken individually, neither column is very selective. In the SC table, c_id = 81 matches 70,001 rows and score = 84 matches 39,425 rows.

But c_id = 81 and score = 84 together match only 897 rows. The two columns combined are quite selective, so a composite index should make the query more efficient.
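Counts like these can be gathered with queries along the following lines. This is a sketch against the SC table defined earlier; the column aliases are made up for illustration.

```sql
-- Rows matched by each condition alone and by both together.
-- In MySQL a comparison evaluates to 1 or 0, so SUM counts the matches.
SELECT
    SUM(c_id = 81)                AS rows_c_id,
    SUM(score = 84)               AS rows_score,
    SUM(c_id = 81 AND score = 84) AS rows_both
FROM SC;

-- Rough selectivity: distinct values divided by total rows.
-- The closer the ratio is to 1, the more selective the column or column pair.
SELECT
    COUNT(DISTINCT c_id) / COUNT(*)        AS sel_c_id,
    COUNT(DISTINCT score) / COUNT(*)       AS sel_score,
    COUNT(DISTINCT c_id, score) / COUNT(*) AS sel_both
FROM SC;
```

Both statements scan the whole table, so on a large production table they are better run against a copy or during a quiet period.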

From another angle, the table holds 3,000,000 rows and will keep growing, and the index storage is not small either. As the data grows, the indexes can no longer be loaded entirely into memory and must be read from disk, so the more indexes there are, the greater the disk read overhead.

It is therefore necessary to build composite multi-column indexes according to the specific business case. Let's try it.

alter table SC drop index sc_c_id_index;
alter table SC drop index sc_score_index;
create index sc_c_id_score_index on SC(c_id,score);

Running the query takes 0.007s, which is an acceptable speed.

Execution plan:

That wraps up the optimization of this statement for now.

Summary:

  1. Nested subqueries in MySQL are in fact rather inefficient

  2. They can be rewritten as join queries

  3. When joining tables, you can filter a table with where conditions first and then join (although MySQL optimizes join statements itself)

  4. Build appropriate indexes, and composite multi-column indexes where necessary

  5. Learn to analyze the SQL execution plan. MySQL optimizes the SQL it runs, so analyzing the execution plan is important

Index Tuning

The subquery optimization above also covered building indexes, but there each of the columns was given its own single-column index.

I later found that a composite index is in fact more efficient, especially with a large amount of data when no single column is very selective.

Single-column indexes

The query is as follows:

select * from user_test_copy where sex = 2 and type = 2 and age = 10

Indexes:

CREATE index user_test_index_sex on user_test_copy(sex);
CREATE index user_test_index_type on user_test_copy(type);
CREATE index user_test_index_age on user_test_copy(age);

Indexes were built on sex, type, and age separately. With 3,000,000 rows, query time: 0.415s

Execution plan:

The plan shows type = index_merge

This is MySQL's optimization for multiple single-column indexes: it intersects the result sets from each index.
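To check for this behaviour yourself, EXPLAIN the same query, as in this small sketch. The comments describe what MySQL typically reports for an index merge; the exact output depends on the data and the server version.

```sql
EXPLAIN
SELECT * FROM user_test_copy WHERE sex = 2 AND type = 2 AND age = 10;
-- With an index merge, the `type` column reads index_merge, `key` lists the
-- indexes that were combined, and `Extra` typically contains
-- "Using intersect(...)" naming those indexes.
```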

Multi-column index

We can build one multi-column index on these three columns. Copy the table to run a test:

create index user_test_index_sex_type_age on user_test(sex,type,age);

Query:

select * from user_test where sex = 2 and type = 2 and age = 10

Execution time: 0.032s, more than 10 times faster. The more selective the multi-column index is, the bigger the speedup.

Execution plan:

Leftmost prefix

Multi-column indexes also have the leftmost-prefix property. Run the following statements:

select * from user_test where sex = 2;
select * from user_test where sex = 2 and type = 2;
select * from user_test where sex = 2 and age = 10;

All of them use the index, because sex, the first column of the index, appears in the where condition.
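The leftmost-prefix rule is easier to see by contrasting cases under EXPLAIN. This is a sketch against the user_test table with the (sex, type, age) index; the third query is added here as a counterexample and does not appear in the measurements above.

```sql
-- Uses the index on two key parts: sex and type form a leftmost prefix.
EXPLAIN SELECT * FROM user_test WHERE sex = 2 AND type = 2;

-- Uses the index, but only the sex part narrows the range;
-- age cannot be used for the seek because type is skipped.
EXPLAIN SELECT * FROM user_test WHERE sex = 2 AND age = 10;

-- Cannot seek with this index at all: the leftmost column, sex, is missing.
EXPLAIN SELECT * FROM user_test WHERE type = 2 AND age = 10;
```

Comparing the key and key_len columns of the three plans shows how much of the index each query actually uses.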

Covering index

When every column the query asks for is in the index, there is no need to go back to disk for the other columns once the result set is found; the data can be returned straight from the index. For example:

select sex,type,age from user_test where sex = 2 and type = 2 and age = 10

Execution Time: 0.003s, much faster than taking all fields
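Whether a query is covered can be verified with EXPLAIN, as in this sketch against the same table and index.

```sql
EXPLAIN
SELECT sex, type, age FROM user_test WHERE sex = 2 AND type = 2 AND age = 10;
-- When the index covers the query, the Extra column shows "Using index",
-- meaning the rows were answered from the index without reading the table.
```

Replacing the column list with `*` and running EXPLAIN again makes the "Using index" marker disappear, since the remaining columns then have to be fetched from the table.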

Sorting

select * from user_test where sex = 2 and type = 2 ORDER BY user_name

Time: 0.139s

Building an index on the sort column improves sorting efficiency:

create index user_name_index on user_test(user_name)
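A possible alternative, not measured in this post, is a composite index that ends with the sort column. Because sex and type are compared with equality, rows inside the matching index range are already ordered by user_name. The index name below is hypothetical.

```sql
-- Hypothetical index: the equality columns first, then the sort column.
CREATE INDEX user_test_index_sex_type_name ON user_test (sex, type, user_name);

EXPLAIN
SELECT * FROM user_test WHERE sex = 2 AND type = 2 ORDER BY user_name;
-- If the optimizer uses this index for ordering, the Extra column
-- no longer shows "Using filesort".
```

Whether this beats the single-column index depends on the data, so it is worth comparing both plans and timings before keeping either one.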

Finally, here is a summary of some SQL tuning tips, to be studied in more depth later:

  1. Define column types as numeric where possible and keep them as short as possible, for example primary keys, foreign keys, and type fields

  2. Build single-column indexes

  3. Build composite multi-column indexes as needed

    • When a single column filters poorly and there is a lot of data, the index is relatively inefficient, i.e. the column has low selectivity

    • If the index is built on multiple columns and those columns together are highly selective, efficiency improves significantly

  4. Build covering indexes for the business scenario: if a query only needs certain fields and those fields are covered by an index, query efficiency improves greatly

  5. Index the columns used in multi-table joins, which can greatly improve join efficiency

  6. Index the columns used in where conditions

  7. Index the sort columns

  8. Index the group by columns

  9. Do not apply arithmetic or functions to columns in where conditions, so that the index is not defeated
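Item 9 can be illustrated with the user_test_copy table and its single-column index on age. This is a sketch; the two queries return the same rows but are treated differently by the optimizer.

```sql
-- Arithmetic on the indexed column: MySQL has to evaluate age + 1 for each row,
-- so the index on age cannot be used to seek.
EXPLAIN SELECT * FROM user_test_copy WHERE age + 1 = 11;

-- Same condition with the arithmetic moved to the constant side:
-- the column is compared directly, so the index on age can be used.
EXPLAIN SELECT * FROM user_test_copy WHERE age = 10;
```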

END


Author: Fengguowuhen - Tang

Source: http://www.cnblogs.com/tangyanbo/p/4462734.html

Copyright of this article belongs to the author.



Origin juejin.im/post/5cec95f3f265da1bb31c18b2