1. Insert data
1.1 insert
If we need to insert multiple records into a database table at one time, we can optimize in the following three ways. The unoptimized approach issues one insert statement per row:
insert into tb_test values(1,'tom');
insert into tb_test values(2,'cat');
insert into tb_test values(3,'jerry');
1). Optimization plan one: insert data in batches
insert into tb_test values(1,'Tom'),(2,'Cat'),(3,'Jerry');
2). Optimization plan two: manually control transactions
start transaction;
insert into tb_test values(1,'Tom'),(2,'Cat'),(3,'Jerry');
insert into tb_test values(4,'Tom'),(5,'Cat'),(6,'Jerry');
insert into tb_test values(7,'Tom'),(8,'Cat'),(9,'Jerry');
commit;
3). Optimization plan three: insert with sequential primary keys
The performance of sequential primary key insertion is higher than that of out-of-order insertion.
Out-of-order primary key insertion: 8 1 9 21 88 2 4 15 89 5 7 3
Sequential primary key insertion: 1 2 3 4 5 7 8 9 15 21 88 89
1.2 Inserting data in large batches
If you need to insert a large amount of data at one time (for example, millions of records), the insertion performance of the insert statement is low. In this case, you can use the load command provided by the MySQL database. You can execute the following instructions to load the data in a data script file into the table structure:
-- when connecting the client to the server, add the parameter --local-infile
mysql --local-infile -u root -p
-- set the global parameter local_infile to 1 to enable loading data from local files
set global local_infile = 1;
-- execute the load command to load the prepared data into the table structure
load data local infile '/root/sql1.log' into table tb_user fields terminated by ',' lines terminated by '\n';
The performance of sequential primary key insertion is higher than that of out-of-order insertion.
Example demo:
A. Create the table structure
CREATE TABLE `tb_user` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`username` VARCHAR(50) NOT NULL,
`password` VARCHAR(50) NOT NULL,
`name` VARCHAR(20) NOT NULL,
`birthday` DATE DEFAULT NULL,
`sex` CHAR(1) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_user_username` (`username`)
) ENGINE=INNODB DEFAULT CHARSET=utf8 ;
B. Set parameters
-- when connecting the client to the server, add the parameter --local-infile
mysql --local-infile -u root -p
-- set the global parameter local_infile to 1 to enable loading data from local files
set global local_infile = 1;
C. Load the data
load data local infile '/root/load_user_100w_sort.sql' into table tb_user
fields terminated by ',' lines terminated by '\n' ;
We saw that inserting 1 million records completed in 17 seconds, which is very good performance.
When loading, the performance of sequential primary key insertion is higher than that of out-of-order insertion.
2. Primary key optimization
In the previous section, we mentioned that sequential primary key insertion performs better than out-of-order insertion. In this section, we will explain the reason for this, and then analyze how primary keys should be designed.
1). Data organization method
In the InnoDB storage engine, table data is organized and stored in order according to the primary key. Tables stored in this way are called index-organized tables (index organized table, IOT). Row data is stored on the leaf nodes of the clustered index. We have also shown the logical structure diagram of InnoDB before:
In the InnoDB engine, data rows are recorded in logical-structure pages, and the size of each page is fixed, 16K by default. That means the number of rows stored in a page is also limited. If an inserted row does not fit in the current page, it is stored in the next page, and the pages are linked through pointers.
2). Page split
A page can be empty, half full, or 100% full. Each page contains 2-N rows of data (if a row is too large, row overflow occurs), arranged according to the primary key.
A. Sequential primary key insertion
① Apply for a page from disk and insert in primary key order.
② While the first page is not full, continue inserting into the first page.
③ When the first page is full, write to the second page; the pages are linked through pointers.
④ When the second page is full, write to the third page.
B. Out-of-order primary key insertion
① Assume pages 1# and 2# are already full, storing the data as shown in the picture.
② Now insert the record with id 50. What happens? Will a new page simply be opened and the record written there?
No. The leaf nodes of the index structure must stay in order, so 50 should be stored after 47. However, page 1#, where 47 is located, is already full and cannot hold the record for 50, so a new page 3# is opened at this point.
However, 50 is not stored directly in page 3#. Instead, the last half of the data in page 1# is moved to page 3#, and then 50 is inserted into page 3#.
After moving the data and inserting the record with id 50, the order of these three pages is wrong: the next page after 1# should be 3#, and the next page after 3# should be 2#. Therefore, the linked-list pointers need to be reset.
The phenomenon described above is called "page splitting" and is a relatively performance-consuming operation.
3). Page merge
The index structure (leaf nodes) of the existing data in the table is currently as follows:
When we delete existing data, the effect is as follows: when a row is deleted, the record is not physically removed; it is only flagged for deletion, and its space becomes available for reuse by other records.
When we continue to delete records from page 2#: when the number of deleted records in a page reaches MERGE_THRESHOLD (by default 50% of the page), InnoDB starts looking at the closest page (before or after) to see whether the two pages can be merged to optimize space usage.
After the data is deleted and the pages are merged, when the new record 21 is inserted again, it is inserted directly into page 3#.
This phenomenon is called "page merging".
Knowledge tip: MERGE_THRESHOLD, the threshold for merging pages, can be set by yourself and specified when creating a table or index.
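As a sketch of how that threshold can be specified (the table and index names here are illustrative), MySQL accepts MERGE_THRESHOLD in a COMMENT clause on the table or on an individual index:

```sql
-- set the merge threshold to 45% for all indexes of this table
CREATE TABLE t1 (
  id INT PRIMARY KEY,
  name VARCHAR(50)
) COMMENT='MERGE_THRESHOLD=45';

-- or set it for a single index only
CREATE INDEX idx_t1_name ON t1 (name) COMMENT 'MERGE_THRESHOLD=40';
```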
4). Primary key design principles
When meeting business needs, keep the primary key as short as possible.
When inserting data, insert sequentially where possible and use AUTO_INCREMENT for the primary key.
Try not to use a UUID or another natural key, such as an ID card number, as the primary key.
During business operations, avoid modifying primary keys.
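As a minimal sketch of these principles side by side (table names are illustrative):

```sql
-- recommended: short, sequential, auto-increment primary key
CREATE TABLE good_pk (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  username VARCHAR(50)
);

-- discouraged: a long, random UUID primary key causes out-of-order
-- inserts and therefore frequent page splits
CREATE TABLE bad_pk (
  id CHAR(36) PRIMARY KEY,   -- e.g. populated with UUID()
  username VARCHAR(50)
);
```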
3. order by optimization
MySQL sorts in two ways:
Using filesort: read the rows that satisfy the conditions through an index or a full table scan, then complete the sort in the sort buffer. Any sort that does not return sorted results directly through an index is called a FileSort sort.
Using index: ordered data is returned directly by a sequential scan of an ordered index. This case is Using index; no additional sorting is required, so it is highly efficient.
Of the two, Using index performs well and Using filesort performs poorly. When optimizing sort operations, try to optimize toward Using index.
Next, let's run a test:
A. Data preparation
First drop some of the indexes created on the tb_user table during earlier tests.
drop index idx_user_phone on tb_user;
drop index idx_user_phone_name on tb_user;
drop index idx_user_name on tb_user;
B. Execute sorting SQL
explain select id,age,phone from tb_user order by age ;
explain select id,age,phone from tb_user order by age, phone ;
Since age and phone have no index, Using filesort appears when sorting, and sorting performance is low.
C. Create an index
-- create the index
create index idx_user_age_phone_aa on tb_user(age,phone);
D. After creating the index, sort in ascending order by age and phone.
explain select id,age,phone from tb_user order by age;
explain select id,age,phone from tb_user order by age, phone;
After the index is created, running the sort queries again turns the original Using filesort into Using index, so performance is relatively high.
E. After creating the index, sort in descending order by age and phone.
explain select id,age,phone from tb_user order by age desc , phone desc ;
Using index also appears, but this time Backward index scan appears in Extra, which indicates a reverse index scan. In the indexes we create, MySQL sorts the leaf nodes from small to large by default; this query sorts from large to small, so the scan runs in reverse and Backward index scan appears. MySQL 8 supports descending indexes, so we can also create a descending index.
F. Sort in ascending order by phone and age, with phone first and age last.
explain select id,age,phone from tb_user order by phone , age;
Sorting must also satisfy the leftmost prefix rule; otherwise filesort occurs. Because the index was created with age as the first field and phone as the second, sorting should follow that order; otherwise Using filesort appears.
F. Sort by age and phone, one ascending and one descending.
explain select id,age,phone from tb_user order by age asc , phone desc ;
When an index is created without specifying an order, both columns default to ascending. If the query then sorts one column ascending and the other descending, Using filesort appears. To solve this, we can create a joint index in which age is sorted ascending and phone descending.
G. Create a joint index (age ascending, phone descending)
create index idx_user_age_phone_ad on tb_user(age asc ,phone desc);
H. Then execute the following SQL again
explain select id,age,phone from tb_user order by age asc , phone desc ;
Ascending / descending joint index structure diagram :
From the above tests, we derive the following order by optimization principles:
A. Establish appropriate indexes on the sort fields; multi-field sorts also follow the leftmost prefix rule.
B. Try to use covering indexes.
C. For multi-field sorts with one field ascending and one descending, pay attention to the (ASC/DESC) rules when creating the joint index.
D. If filesort is unavoidable, you can appropriately increase the sort buffer sort_buffer_size (default 256k) when sorting large amounts of data.
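For point D, the buffer can be inspected and enlarged per session; a sketch:

```sql
-- check the current sort buffer size (256k by default)
show variables like 'sort_buffer_size';

-- enlarge it for the current session only (value in bytes; 1MB here)
set session sort_buffer_size = 1048576;
```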
4. group by optimization
For grouping operations, we mainly look at the impact of indexes on grouping operations.
First, we delete all the indexes of the tb_user table.
drop index idx_user_pro_age_sta on tb_user;
drop index idx_email_5 on tb_user;
drop index idx_user_age_phone_aa on tb_user;
drop index idx_user_age_phone_ad on tb_user;
Next, without an index, execute the following SQL and view the execution plan:
explain select profession , count(*) from tb_user group by profession ;
Then, we create a joint index on profession, age, and status.
create index idx_user_pro_age_sta on tb_user(profession , age , status);
Immediately afterwards, execute the same SQL as before and view the execution plan.
explain select profession , count(*) from tb_user group by profession ;
Then execute the following group-by SQL statements and view their execution plans:
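The queries themselves are missing above; based on the discussion that follows, they were presumably of this form (reconstructed, not copied from the original):

```sql
-- grouping by age alone skips the first index column: Using temporary
explain select age, count(*) from tb_user group by age;

-- grouping by profession, age follows the leftmost prefix: the index is used
explain select profession, age, count(*) from tb_user group by profession, age;
```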
We found that grouping only by age produces Using temporary, while grouping by the two fields profession and age together does not. The reason is that grouping operations must also conform to the leftmost prefix rule of the joint index.
Therefore, in grouping operations, we need to optimize through the following two points to improve performance:
A. During grouping operations, indexes can be used to improve efficiency.
B. When using indexes for grouping, the leftmost prefix rule must also be satisfied.
5. limit optimization
When the amount of data is relatively large, limit paging queries become less efficient the further back the requested page is. Let's look at a timing comparison of limit paging queries:
Testing shows that the further back the page, the lower the efficiency of the paging query. This is the core problem of paging queries: when executing limit 2000000,10, MySQL needs to sort the first 2000010 records but returns only records 2000000-2000010; the other records are discarded, so the cost of the query and sort is very high.
Optimization idea: for general paging queries, performance can be improved by creating a covering index and optimizing the query into the form of a subquery on that covering index.
explain select * from tb_sku t , (select id from tb_sku order by id
limit 2000000,10) a where t.id = a.id;
6. count optimization
6.1 Overview
select count(*) from tb_user ;
In previous tests, we found that when the amount of data is large, the count operation is very time-consuming.
The MyISAM engine stores the total number of rows of a table on disk, so an unconditional count(*) returns that number directly and is very efficient; but a count with a where condition is still slow in MyISAM.
The InnoDB engine has a harder time: when it executes count(*), it must read the rows from the engine one by one and accumulate the count.
If you want to greatly improve the count efficiency of an InnoDB table, the main optimization idea is to maintain the count yourself (for example with redis), although a count with conditions is then more troublesome.
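One way to "count yourself" without redis is a counter table maintained in the same transaction as the data change; a minimal sketch (the tb_counter table and its columns are assumptions, not part of the original schema):

```sql
-- hypothetical counter table
create table tb_counter (
  table_name varchar(64) primary key,
  row_cnt bigint not null
);

-- maintain the counter atomically with the insert
start transaction;
insert into tb_user (username, password, name) values ('u1', 'p1', 'n1');
update tb_counter set row_cnt = row_cnt + 1 where table_name = 'tb_user';
commit;
```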
6.2 count usage
count() is an aggregate function that judges the result set row by row. If the argument of the count function is not NULL, the accumulated value is increased by 1; otherwise it is not, and finally the accumulated value is returned.
Usage: count(*), count(primary key), count(field), count(number)
Sorted by efficiency, count(field) < count(primary key id) < count(1) ≈ count(*), so try to use count(*) to count rows.
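The four usages side by side, against the tb_user table from earlier:

```sql
select count(*) from tb_user;     -- count all rows (optimized by MySQL, preferred)
select count(id) from tb_user;    -- count by primary key, which is never NULL
select count(name) from tb_user;  -- count only rows where name is not NULL
select count(1) from tb_user;     -- place the constant 1 in each row and count it
```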
7. update optimization
We mainly need to pay attention to some precautions when executing update statements.
update course set name = 'javaEE' where id = 1 ;
When we execute the update SQL statement above, the row with id 1 is locked, and the row lock is released after the transaction is committed. But consider what happens when we execute the following SQL:
update course set name = 'SpringBoot' where name = 'PHP' ;
When we open multiple transactions and execute the above SQL, we find that the row lock is upgraded to a table lock, which greatly reduces the performance of the update statement.
InnoDB's row locks are locks on the index, not on the record, and the index must remain usable; otherwise the row lock is upgraded to a table lock.
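To keep the second update on a row lock, the column in the where clause needs an index; a sketch (the index name idx_course_name is an assumption):

```sql
-- with an index on name, "where name = 'PHP'" can lock only matching rows
-- instead of escalating to a table lock
create index idx_course_name on course(name);
update course set name = 'SpringBoot' where name = 'PHP';
```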