Summary of common database optimization techniques

One, index

This article uses MySQL as the example.
Indexing is the most common optimization technique. For tables with fewer than tens of millions of rows, indexes can greatly improve query efficiency.
Here are some points to watch when using indexes.
First, the common index types:
PRIMARY KEY: primary key index (every primary key is indexed; values are unique and not NULL)
UNIQUE: unique index (values in the indexed column must be unique)
INDEX: ordinary index (the column is simply indexed, with no extra constraint)
INDEX (id, username): composite index (several columns form one index; it is used when the query filters on a leading prefix of those columns)
FULLTEXT: full-text index

1. Indexes follow the "leftmost matching principle"

In a fuzzy query (LIKE), a pattern with a leading wildcard ('%aa') or with wildcards on both sides ('%aa%') cannot use the index to narrow the search.
That is, whenever the pattern starts with '%', the index is not hit.

-- Hits the index
SELECT id FROM user WHERE username LIKE 'aa%';
-- Does not hit the index
SELECT id FROM user WHERE username LIKE '%aa';
-- Does not hit the index
SELECT id FROM user WHERE username LIKE '%aa%';
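Why a leading '%' defeats the index can be illustrated outside the database. The following is a minimal sketch, assuming the index behaves like a sorted list of usernames and that comparisons are case-sensitive (real MySQL collations are often case-insensitive); the function names prefix_search and suffix_search are made up for this illustration.

```python
from bisect import bisect_left

def prefix_search(sorted_names, prefix):
    """Like LIKE 'aa%': jump to the first candidate, then read a contiguous run."""
    start = bisect_left(sorted_names, prefix)  # first name >= prefix
    matches = []
    for name in sorted_names[start:]:
        if not name.startswith(prefix):
            break  # names are sorted, so no later name can match
        matches.append(name)
    return matches

def suffix_search(sorted_names, suffix):
    """Like LIKE '%aa': the sort order gives no starting point, so check every name."""
    return [name for name in sorted_names if name.endswith(suffix)]

# A sorted list stands in for the index on username (an assumption of this sketch).
names = sorted(["aaron", "aab", "abby", "baa", "caa", "zed"])
print(prefix_search(names, "aa"))  # ['aab', 'aaron']
print(suffix_search(names, "aa"))  # ['baa', 'caa']
```

The prefix search stops as soon as it leaves the matching run, while the suffix search has to visit all names, which corresponds to a full scan.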

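In a composite index, matching starts from the leftmost column.
Assume the only index on the table is the composite index INDEX (id, username, iphone).
If the query conditions include the leftmost column (optionally followed by the next columns in order), the index can be hit.
If the leftmost column is missing and only columns further to the right appear, the index is not hit.
If a middle column is skipped, only the columns before the gap are used for the index lookup.

-- Hits the index
SELECT * FROM user WHERE id='001' AND username='tom' AND iphone='110';
-- Hits the index
SELECT * FROM user WHERE id='001' AND username='tom';
-- Hits the index
SELECT * FROM user WHERE id='001';
-- Does not hit the index
SELECT * FROM user WHERE username='tom' AND iphone='110';
-- Hits the index through the id prefix only (username is skipped, so iphone cannot narrow the lookup)
SELECT * FROM user WHERE id='001' AND iphone='110';
-- Does not hit the index
SELECT * FROM user WHERE iphone='110';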
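The leftmost-prefix rule for a composite index can be written down as a few lines of code. This is a simplified model, not how MySQL itself decides: it considers equality conditions only, and the helper name usable_prefix is hypothetical.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that the equality conditions can use."""
    used = []
    for column in index_columns:
        if column not in equality_columns:
            break  # the first missing column ends the usable prefix
        used.append(column)
    return used

index = ("id", "username", "iphone")

print(usable_prefix(index, {"id", "username", "iphone"}))  # ['id', 'username', 'iphone']
print(usable_prefix(index, {"id", "username"}))            # ['id', 'username']
print(usable_prefix(index, {"id", "iphone"}))              # ['id']
print(usable_prefix(index, {"username", "iphone"}))        # []
```

An empty result means the lookup cannot start from the index at all; a partial result means only those leading columns narrow the search.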

2. Other operations that cannot hit the index

Applying an expression or a function to a column in the WHERE clause prevents the index on that column from being hit.

-- Does not hit the index
SELECT id FROM user WHERE number/2 = 100;
-- Does not hit the index
SELECT id FROM user WHERE substring(name,1,3) = 'abc';
-- With the arithmetic moved to the right of '=', the index is hit
SELECT id FROM user WHERE number = 100*2;

When conditions are joined with OR, the index is not hit if any one of the columns involved has no index.

-- If either id or phone has no index, the index is not hit.
SELECT * FROM user WHERE id='001' OR phone='110';

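A condition that tests IS NOT NULL usually does not hit the index, because it tends to match most rows and the optimizer then prefers a full scan.
A condition that tests IS NULL can hit the index.

-- Usually does not hit the index
SELECT id FROM user WHERE phone IS NOT NULL;
-- Hits the index
SELECT id FROM user WHERE phone IS NULL;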

NOT IN usually does not hit the index, for the same reason: it tends to match most rows.

-- Usually does not hit the index
SELECT * FROM user WHERE id NOT IN(1,2,3);
-- Hits the index
SELECT * FROM user WHERE id IN(1,2,3);

More indexes are not always better. An index speeds up the corresponding SELECT, but it slows down INSERT and UPDATE, because every index on the table has to be maintained on each write.
For further reading, here is a good blog post on database optimization:
Database SQL Optimization Summary 1: Million-level database optimization plan

Two, choosing between EXISTS and IN

It is often said that IN is less efficient than EXISTS and that IN clauses should be replaced with EXISTS. That is not always true; the choice depends on the actual situation.
Consider a student table, student, and a dormitory table, dorm.
Suppose we want the ids of all students who live in dormitories in section A, and the section column is not indexed.

SELECT id FROM student WHERE dorm_id IN (SELECT id FROM dorm WHERE section='A');
SELECT id FROM student WHERE EXISTS (SELECT NULL FROM dorm WHERE dorm.id=student.dorm_id AND dorm.section='A');

The two SQL statements above do the same thing; one uses IN and the other uses EXISTS.
In the first statement, the outer table can use the index on dorm_id, but the inner table cannot use an index.
In the second statement, the outer table does not use an index, but the inner table can use the index on id.
Therefore, when the inner table is very large, the first statement is inefficient because the inner table is read without an index, while the second statement benefits from the index.
When the inner table is small but the outer table is large, the first statement has the advantage.
In short, EXISTS uses the index on the inner table, and IN uses the index on the outer table.
Newer MySQL versions may rewrite both forms into the same semi-join plan, so it is worth confirming the actual plan with EXPLAIN.
Summary: IN suits a large outer table with a small inner table; EXISTS suits a small outer table with a large inner table.
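Whichever form is chosen, the two queries must return the same rows. The following is a small sketch that checks this on sample data. It uses Python's built-in sqlite3 module rather than MySQL so that it runs without a server, which is an assumption of this example; the sample rows and the index name idx_student_dorm are invented. It shows only that the results are equal and says nothing about MySQL's execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dorm (id INTEGER PRIMARY KEY, section TEXT);
    CREATE TABLE student (id INTEGER PRIMARY KEY, dorm_id INTEGER);
    CREATE INDEX idx_student_dorm ON student (dorm_id);
    INSERT INTO dorm VALUES (1, 'A'), (2, 'B'), (3, 'A');
    INSERT INTO student VALUES (10, 1), (11, 2), (12, 3), (13, 2);
""")

in_query = """
    SELECT id FROM student
    WHERE dorm_id IN (SELECT id FROM dorm WHERE section = 'A')
"""
exists_query = """
    SELECT id FROM student
    WHERE EXISTS (SELECT NULL FROM dorm
                  WHERE dorm.id = student.dorm_id AND dorm.section = 'A')
"""

# Sort the ids so the comparison does not depend on row order.
in_ids = sorted(row[0] for row in conn.execute(in_query))
exists_ids = sorted(row[0] for row in conn.execute(exists_query))

print(in_ids)      # [10, 12]
print(exists_ids)  # [10, 12]
```

Since the results are identical, the decision between the two forms can be made purely on which plan performs better for the table sizes at hand.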
For details, see the blog post:
Exists and in are more efficient

Three, splitting databases and tables

When the data reaches tens of millions of rows, indexes help less and less. To keep queries efficient, developers came up with the idea of splitting the data.

1. Horizontal table splitting

When a table holds too much data, queries naturally slow down. To improve query efficiency, we can split table A into tables A1, A2, A3... with identical columns, and distribute the rows evenly across them according to a fixed rule.
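The "fixed rule" can be as simple as taking the primary key modulo the number of tables. The following is a minimal sketch under that assumption; the table names user_0 to user_3, the table count, and the helper table_for are illustrative and not taken from any particular framework.

```python
from collections import defaultdict

TABLE_COUNT = 4  # assumed number of split tables: user_0 .. user_3

def table_for(user_id, table_count=TABLE_COUNT):
    """Pick the physical table for a row by taking the id modulo the table count."""
    return f"user_{user_id % table_count}"

# Place eight sample ids; consecutive ids spread evenly across the tables.
placement = defaultdict(list)
for user_id in range(1, 9):
    placement[table_for(user_id)].append(user_id)

for table in sorted(placement):
    print(table, placement[table])
```

Every read and write for a given id goes through the same function, so a row is always found in the table it was written to. Changing TABLE_COUNT later moves most rows, which is one reason the count is usually fixed up front.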

2. Vertical table splitting

Vertical table splitting applies to tables with many columns. We move the infrequently used and relatively long columns into a new table, that is, we split one big table into smaller ones, which also makes development and maintenance easier.
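As a rough illustration of moving rarely used, long columns into their own table, the sketch below keeps the short columns in one table and the long column in a second table that shares the same key. It runs on Python's built-in sqlite3 module instead of MySQL, and the table names user_base and user_detail and the biography column are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Frequently read, short columns stay in the main table.
    CREATE TABLE user_base (id INTEGER PRIMARY KEY, username TEXT);
    -- Rarely read, long columns move to a side table keyed by the same id.
    CREATE TABLE user_detail (user_id INTEGER PRIMARY KEY, biography TEXT);
    INSERT INTO user_base VALUES (1, 'tom'), (2, 'amy');
    INSERT INTO user_detail VALUES (1, 'A long biography ...'),
                                   (2, 'Another long biography ...');
""")

# The common query touches only the narrow table.
names = [row[0] for row in conn.execute(
    "SELECT username FROM user_base ORDER BY id")]

# The long column is joined in only when it is actually needed.
full_row = conn.execute(
    "SELECT b.username, d.biography "
    "FROM user_base AS b JOIN user_detail AS d ON d.user_id = b.id "
    "WHERE b.id = ?", (1,)).fetchone()

print(names)     # ['tom', 'amy']
print(full_row)  # ('tom', 'A long biography ...')
```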

Table splitting still takes place inside a single database, so it does not solve the shortage of database connections. The database remains the bottleneck, which is why we also split the database itself.

3. Vertical database splitting

Vertical database splitting means that the tables are not all placed in one database; they are divided by business module. For example, the tables of the friend module go in one database, and the tables of the transaction module go in another. This achieves high cohesion and low coupling. Today's microservice architectures are an application of vertical database splitting.

4. Horizontal database splitting

Horizontal database splitting is essentially clustering. Database A is split into A1, A2, A3..., all with the same structure, and only the data is distributed evenly across the databases to improve database efficiency.
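When database A is split into A1, A2, A3 with identical structure, the application needs a rule that maps each row to one database (and, if tables are split as well, to one table inside it). One possible rule is sketched below, assuming non-negative integer ids; the counts, the user_ table prefix, and the helper locate are hypothetical.

```python
def locate(user_id, db_count=3, tables_per_db=4):
    """Map an id to (database name, table name) with two modulo steps."""
    db_index = user_id % db_count
    # Divide first so the table choice does not just repeat the database choice.
    table_index = (user_id // db_count) % tables_per_db
    return f"A{db_index + 1}", f"user_{table_index}"

print(locate(0))   # ('A1', 'user_0')
print(locate(7))   # ('A2', 'user_2')
print(locate(14))  # ('A3', 'user_0')
```

Dividing by db_count before taking the second modulo keeps the ids that land in one database spread over all of its tables.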

Splitting databases and tables also brings new problems. For example, transactions become more complicated, and primary keys must be globally unique to avoid duplicates.
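One common way to keep primary keys unique across databases is to generate them in the application instead of relying on per-table auto-increment. The following is a sketch of a Snowflake-style generator, one option among several: the bit widths (10 for the worker, 12 for the sequence) and the class name IdGenerator are assumptions, and each database node or application instance is assumed to be given a distinct worker_id.

```python
import threading
import time

class IdGenerator:
    """Snowflake-style id: millisecond timestamp | worker id | sequence."""
    WORKER_BITS = 10    # assumed width: up to 1024 workers
    SEQUENCE_BITS = 12  # assumed width: up to 4096 ids per millisecond per worker

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                mask = (1 << self.SEQUENCE_BITS) - 1
                self._sequence = (self._sequence + 1) & mask
                if self._sequence == 0:
                    # Sequence used up in this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            shift = self.WORKER_BITS + self.SEQUENCE_BITS
            return (now << shift) | (self.worker_id << self.SEQUENCE_BITS) | self._sequence

# Two generators stand in for two shards; their ids never collide.
gen_a, gen_b = IdGenerator(worker_id=1), IdGenerator(worker_id=2)
ids_a = [gen_a.next_id() for _ in range(1000)]
ids_b = [gen_b.next_id() for _ in range(1000)]
print(len(set(ids_a + ids_b)))  # 2000
```

Because the worker id is embedded in every value, two generators can never produce the same id, and within one generator the ids only increase.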

For details, see the blog post:
Database sub-database sub-table ideas
(to be continued)

Origin blog.csdn.net/qq_42068856/article/details/86705814