1. Insert data
1.1 insert
If we need to insert multiple records into a database table at one time, we can optimize in the following three ways. The unoptimized approach issues one insert statement per row:
insert into tb_test values(1,'tom');
insert into tb_test values(2,'cat');
insert into tb_test values(3,'jerry');
1). Optimization plan one: insert data in batches
insert into tb_test values(1,'Tom'),(2,'Cat'),(3,'Jerry');
2). Optimization plan two: manually control transactions
start transaction;
insert into tb_test values(1,'Tom'),(2,'Cat'),(3,'Jerry');
insert into tb_test values(4,'Tom'),(5,'Cat'),(6,'Jerry');
insert into tb_test values(7,'Tom'),(8,'Cat'),(9,'Jerry');
commit;
3). Optimization plan three: insert with sequential primary keys
The performance of sequential primary key insertion is higher than that of out-of-order insertion.
Out-of-order primary key insertion: 8 1 9 21 88 2 4 15 89 5 7 3
Sequential primary key insertion: 1 2 3 4 5 7 8 9 15 21 88 89
1.2 Inserting data in large batches
If you need to insert a large amount of data at one time (for example, millions of records), the insertion performance of the insert statement is low. In this case, you can use the load command provided by the MySQL database. You can execute the following instructions to load the data in a data script file into the table structure:
-- when connecting the client to the server, add the parameter --local-infile
mysql --local-infile -u root -p
-- set the global parameter local_infile to 1 to enable loading data from local files
set global local_infile = 1;
-- execute the load command to load the prepared data into the table structure
load data local infile '/root/sql1.log' into table tb_user fields terminated by ',' lines terminated by '\n';
The performance of sequential primary key insertion is higher than that of out-of-order insertion.
Example demo:
A. Create the table structure
CREATE TABLE `tb_user` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`username` VARCHAR(50) NOT NULL,
`password` VARCHAR(50) NOT NULL,
`name` VARCHAR(20) NOT NULL,
`birthday` DATE DEFAULT NULL,
`sex` CHAR(1) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_user_username` (`username`)
) ENGINE=INNODB DEFAULT CHARSET=utf8 ;
B. Set parameters
-- when connecting the client to the server, add the parameter --local-infile
mysql --local-infile -u root -p
-- set the global parameter local_infile to 1 to enable loading data from local files
set global local_infile = 1;
C. Load the data
load data local infile '/root/load_user_100w_sort.sql' into table tb_user
fields terminated by ',' lines terminated by '\n' ;
We saw that inserting 1 million records completed in 17 seconds, which is very good performance.
When loading, the performance of sequential primary key insertion is higher than that of out-of-order insertion.
2. Primary key optimization
In the previous section, we mentioned that sequential primary key insertion performs better than out-of-order insertion. In this section, we will explain the reason for this, and then analyze how primary keys should be designed.
1). Data organization method
In the InnoDB storage engine, table data is organized and stored in order according to the primary key. Tables stored in this way are called index-organized tables (index organized table, IOT). Row data is stored on the leaf nodes of the clustered index. We have also shown the logical structure diagram of InnoDB before:
In the InnoDB engine, data rows are recorded in logical-structure pages, and the size of each page is fixed, 16K by default. That means the number of rows stored in a page is also limited. If an inserted row does not fit in the current page, it is stored in the next page, and the pages are linked through pointers.
2). Page split
A page can be empty, half full, or 100% full. Each page contains 2-N rows of data (if a row is too large, row overflow occurs), arranged according to the primary key.
A. Sequential primary key insertion
① Apply for a page from disk and insert in primary key order.
② While the first page is not full, continue inserting into the first page.
③ When the first page is full, write to the second page; the pages are linked through pointers.
④ When the second page is full, write to the third page.
B. Out-of-order primary key insertion
① Assume pages 1# and 2# are already full, storing the data as shown in the picture.
② Now insert the record with id 50. What happens? Will a new page simply be opened and the record written there?
No. The leaf nodes of the index structure must stay in order, so 50 should be stored after 47. However, page 1#, where 47 is located, is already full and cannot hold the record for 50, so a new page 3# is opened at this point.
However, 50 is not stored directly in page 3#. Instead, the last half of the data in page 1# is moved to page 3#, and then 50 is inserted into page 3#.
After moving the data and inserting the record with id 50, the order of these three pages is wrong: the next page after 1# should be 3#, and the next page after 3# should be 2#. Therefore, the linked-list pointers need to be reset.
The phenomenon described above is called "page splitting" and is a relatively performance-consuming operation.
3). Page merge
The index structure (leaf nodes) of the existing data in the table is currently as follows:
When we delete existing data, the effect is as follows: when a row is deleted, the record is not physically removed; it is only flagged for deletion, and its space becomes available for reuse by other records.
When we continue to delete records from page 2#: when the number of deleted records in a page reaches MERGE_THRESHOLD (by default 50% of the page), InnoDB starts looking at the closest page (before or after) to see whether the two pages can be merged to optimize space usage.
After the data is deleted and the pages are merged, when the new record 21 is inserted again, it is inserted directly into page 3#.
This phenomenon is called "page merging".
Knowledge tip: MERGE_THRESHOLD, the threshold for merging pages, can be set by yourself and specified when creating a table or index.
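As a sketch of how that threshold can be specified (the table and index names here are illustrative), MySQL accepts MERGE_THRESHOLD in a COMMENT clause on the table or on an individual index:

```sql
-- set the merge threshold to 45% for all indexes of this table
CREATE TABLE t1 (
  id INT PRIMARY KEY,
  name VARCHAR(50)
) COMMENT='MERGE_THRESHOLD=45';

-- or set it for a single index only
CREATE INDEX idx_t1_name ON t1 (name) COMMENT 'MERGE_THRESHOLD=40';
```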
4). Primary key design principles
When meeting business needs, keep the primary key as short as possible.
When inserting data, insert sequentially where possible and use AUTO_INCREMENT for the primary key.
Try not to use a UUID or another natural key, such as an ID card number, as the primary key.
During business operations, avoid modifying primary keys.
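As a minimal sketch of these principles side by side (table names are illustrative):

```sql
-- recommended: short, sequential, auto-increment primary key
CREATE TABLE good_pk (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  username VARCHAR(50)
);

-- discouraged: a long, random UUID primary key causes out-of-order
-- inserts and therefore frequent page splits
CREATE TABLE bad_pk (
  id CHAR(36) PRIMARY KEY,   -- e.g. populated with UUID()
  username VARCHAR(50)
);
```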
3. order by optimization
MySQL sorts in two ways:
Using filesort: read the rows that satisfy the conditions through an index or a full table scan, then complete the sort in the sort buffer. Any sort that does not return sorted results directly through an index is called a FileSort sort.
Using index: ordered data is returned directly by a sequential scan of an ordered index. This case is Using index; no additional sorting is required, so it is highly efficient.
Of the two, Using index performs well and Using filesort performs poorly. When optimizing sort operations, try to optimize toward Using index.
Next, let's run a test:
A. Data preparation
First drop some of the indexes created on the tb_user table during earlier tests.
drop index idx_user_phone on tb_user;
drop index idx_user_phone_name on tb_user;
drop index idx_user_name on tb_user;
B. Execute sorting SQL
explain select id,age,phone from tb_user order by age ;
explain select id,age,phone from tb_user order by age, phone ;
Since age and phone have no index, Using filesort appears when sorting, and sorting performance is low.
C. Create an index
-- create the index
create index idx_user_age_phone_aa on tb_user(age,phone);
D. After creating the index, sort in ascending order by age and phone.
explain select id,age,phone from tb_user order by age;
explain select id,age,phone from tb_user order by age, phone;
After the index is created, running the sort queries again turns the original Using filesort into Using index, so performance is relatively high.
E. After creating the index, sort in descending order by age and phone.
explain select id,age,phone from tb_user order by age desc , phone desc ;
Using index also appears, but this time Backward index scan appears in Extra, which indicates a reverse index scan. In the indexes we create, MySQL sorts the leaf nodes from small to large by default; this query sorts from large to small, so the scan runs in reverse and Backward index scan appears. MySQL 8 supports descending indexes, so we can also create a descending index.
F. Sort in ascending order by phone and age, with phone first and age last.
explain select id,age,phone from tb_user order by phone , age;
Sorting must also satisfy the leftmost prefix rule; otherwise filesort occurs. Because the index was created with age as the first field and phone as the second, sorting should follow that order; otherwise Using filesort appears.
F. Sort by age and phone, one ascending and one descending.
explain select id,age,phone from tb_user order by age asc , phone desc ;
When an index is created without specifying an order, both columns default to ascending. If the query then sorts one column ascending and the other descending, Using filesort appears. To solve this, we can create a joint index in which age is sorted ascending and phone descending.
G. Create a joint index (age ascending, phone descending)
create index idx_user_age_phone_ad on tb_user(age asc ,phone desc);
H. Then execute the following SQL again
explain select id,age,phone from tb_user order by age asc , phone desc ;
Ascending / descending joint index structure diagram :
From the above tests, we derive the following order by optimization principles:
A. Establish appropriate indexes on the sort fields; multi-field sorts also follow the leftmost prefix rule.
B. Try to use covering indexes.
C. For multi-field sorts with one field ascending and one descending, pay attention to the (ASC/DESC) rules when creating the joint index.
D. If filesort is unavoidable, you can appropriately increase the sort buffer sort_buffer_size (default 256k) when sorting large amounts of data.
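For point D, the buffer can be inspected and enlarged per session; a sketch:

```sql
-- check the current sort buffer size (256k by default)
show variables like 'sort_buffer_size';

-- enlarge it for the current session only (value in bytes; 1MB here)
set session sort_buffer_size = 1048576;
```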
4. group by optimization
For grouping operations, we mainly look at the impact of indexes on grouping operations.
First, we delete all the indexes of the tb_user table.
drop index idx_user_pro_age_sta on tb_user;
drop index idx_email_5 on tb_user;
drop index idx_user_age_phone_aa on tb_user;
drop index idx_user_age_phone_ad on tb_user;
Next, without an index, execute the following SQL and view the execution plan:
explain select profession , count(*) from tb_user group by profession ;
Then, we create a joint index on profession, age, and status.
create index idx_user_pro_age_sta on tb_user(profession , age , status);
Immediately afterwards, execute the same SQL as before and view the execution plan.
explain select profession , count(*) from tb_user group by profession ;
Then execute the following group-by SQL statements and view their execution plans:
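The queries themselves are missing above; based on the discussion that follows, they were presumably of this form (reconstructed, not copied from the original):

```sql
-- grouping by age alone skips the first index column: Using temporary
explain select age, count(*) from tb_user group by age;

-- grouping by profession, age follows the leftmost prefix: the index is used
explain select profession, age, count(*) from tb_user group by profession, age;
```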
We found that grouping only by age produces Using temporary, while grouping by the two fields profession and age together does not. The reason is that grouping operations must also conform to the leftmost prefix rule of the joint index.
Therefore, in grouping operations, we need to optimize through the following two points to improve performance:
A. During grouping operations, indexes can be used to improve efficiency.
B. When using indexes for grouping, the leftmost prefix rule must also be satisfied.
5. limit optimization
When the amount of data is relatively large, limit paging queries become less efficient the further back the requested page is. Let's look at a timing comparison of limit paging queries:
Testing shows that the further back the page, the lower the efficiency of the paging query. This is the core problem of paging queries: when executing limit 2000000,10, MySQL needs to sort the first 2000010 records but returns only records 2000000-2000010; the other records are discarded, so the cost of the query and sort is very high.
Optimization idea: for general paging queries, performance can be improved by creating a covering index and optimizing the query into the form of a subquery on that covering index.
explain select * from tb_sku t , (select id from tb_sku order by id
limit 2000000,10) a where t.id = a.id;
6. count optimization
6.1 Overview
select count(*) from tb_user ;
In previous tests, we found that when the amount of data is large, the count operation is very time-consuming.
The MyISAM engine stores the total number of rows of a table on disk, so an unconditional count(*) returns that number directly and is very efficient; but a count with a where condition is still slow in MyISAM.
The InnoDB engine has a harder time: when it executes count(*), it must read the rows from the engine one by one and accumulate the count.
If you want to greatly improve the count efficiency of an InnoDB table, the main optimization idea is to maintain the count yourself (for example with redis), although a count with conditions is then more troublesome.
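One way to "count yourself" without redis is a counter table maintained in the same transaction as the data change; a minimal sketch (the tb_counter table and its columns are assumptions, not part of the original schema):

```sql
-- hypothetical counter table
create table tb_counter (
  table_name varchar(64) primary key,
  row_cnt bigint not null
);

-- maintain the counter atomically with the insert
start transaction;
insert into tb_user (username, password, name) values ('u1', 'p1', 'n1');
update tb_counter set row_cnt = row_cnt + 1 where table_name = 'tb_user';
commit;
```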
6.2 count usage
count() is an aggregate function that judges the result set row by row. If the argument of the count function is not NULL, the accumulated value is increased by 1; otherwise it is not, and finally the accumulated value is returned.
Usage: count(*), count(primary key), count(field), count(number)
Sorted by efficiency, count(field) < count(primary key id) < count(1) ≈ count(*), so try to use count(*) to count rows.
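The four usages side by side, against the tb_user table from earlier:

```sql
select count(*) from tb_user;     -- count all rows (optimized by MySQL, preferred)
select count(id) from tb_user;    -- count by primary key, which is never NULL
select count(name) from tb_user;  -- count only rows where name is not NULL
select count(1) from tb_user;     -- place the constant 1 in each row and count it
```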
7. update optimization
We mainly need to pay attention to some precautions when executing update statements.
update course set name = 'javaEE' where id = 1 ;
When we execute the update SQL statement above, the row with id 1 is locked, and the row lock is released after the transaction is committed. But consider what happens when we execute the following SQL:
update course set name = 'SpringBoot' where name = 'PHP' ;
When we open multiple transactions and execute the above SQL, we find that the row lock is upgraded to a table lock, which greatly reduces the performance of the update statement.
InnoDB's row locks are locks on the index, not on the record, and the index must remain usable; otherwise the row lock is upgraded to a table lock.
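To keep the second update on a row lock, the column in the where clause needs an index; a sketch (the index name idx_course_name is an assumption):

```sql
-- with an index on name, "where name = 'PHP'" can lock only matching rows
-- instead of escalating to a table lock
create index idx_course_name on course(name);
update course set name = 'SpringBoot' where name = 'PHP';
```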