30 Recommendations for Writing High-Quality SQL

Foreword

This article uses example demos to set out 30 recommendations for optimizing SQL, most of them drawn from actual development work. I hope they help.

1. Try not to use select * in queries; select specific fields instead.

Counterexample:

select * from employee;

Positive examples:

select id,name from employee;

Reason:

  • Fetching only the required fields saves resources and reduces network overhead.
  • With select *, the query most likely cannot use a covering index, so it has to go back to the table.

2. If you know the result is a single record, or you only need the maximum / minimum record, use limit 1.

Suppose there is an employee table and we want to find a person named jay.

CREATE TABLE `employee` (
`id` int(11) NOT NULL,
`name` varchar(255) DEFAULT NULL,
`age` int(11) DEFAULT NULL,
`date` datetime DEFAULT NULL,
`sex` int(1) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Counterexample:

select id,name from employee where name='jay';

Positive example:

select id,name from employee where name='jay' limit 1;

Reason:

  • With limit 1, the scan stops as soon as one matching record is found, which can improve efficiency considerably.
  • Of course, if name has a unique index, limit 1 is unnecessary. limit mainly helps by cutting a full table scan short; if the statement does not need a full table scan in the first place, adding limit makes little difference to performance.

3. Avoid using or to join conditions in the where clause.

Create a user table with an ordinary index on userId; the table structure is as follows:

CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`userId` int(11) NOT NULL,
`age` int(11) NOT NULL,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_userId` (`userId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Suppose we now need to query users whose userId is 1 or whose age is 18. It is easy to come up with the following SQL:

Counterexample:

select * from user where userid=1 or age =18

Positive examples:

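-- Use union all (a row matching both conditions is returned twice; use union if duplicates must be removed)
select * from user where userid=1
union all
select * from user where age = 18;

-- Or write two separate SQL statements:
select * from user where userid=1;
select * from user where age = 18;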

Reason:

  • Using or may make the index fail, resulting in a full table scan.

Take the case of or combined with age, which has no index. Suppose the query uses the userId index; when it reaches the age condition it still has to scan the whole table, so three steps are needed: full table scan + index scan + merge. If it starts with a full table scan instead, a single pass is enough. MySQL has an optimizer, and for reasons of efficiency and cost it may well skip the index when it meets an or condition, which looks reasonable.

4. Optimize limit pagination.

Pagination is usually implemented with limit, but when the offset is very large, query efficiency drops.

Counterexample:

select id,name,age from employee limit 10000,10

Positive examples:

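-- Option 1: start from the largest id returned by the previous query (used as the offset)
select id,name from employee where id>10000 order by id limit 10;

-- Option 2: order by + index
select id,name from employee order by id limit 10000,10;

-- Option 3: limit the number of pages, if the business allows it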

Reason:

  • When the offset is very large, query efficiency drops because MySQL does not skip the offset and fetch the following rows directly. It reads offset + n rows, discards the first offset rows, and returns the rest.
  • With option 1, the query starts from the largest record returned last time, so the offset is skipped and efficiency improves a lot.
  • Option 2, order by + index, can also improve query efficiency.
  • Option 3 should be discussed with the business side: is it really necessary to page that far? The vast majority of users do not page very far back.
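Option 1 is often called keyset pagination. The following is a minimal sketch in Python, assuming an in-memory SQLite database as a stand-in for MySQL; the fetch_page helper and the sample data are hypothetical and only illustrate the idea of carrying the last id forward instead of an offset.

```python
import sqlite3

def fetch_page(conn, last_id, page_size):
    """Return the next page of (id, name) rows after last_id (hypothetical helper)."""
    # The primary-key condition lets the engine seek to the start of the page
    # instead of reading and discarding `offset` rows.
    return conn.execute(
        "select id, name from employee where id > ? order by id limit ?",
        (last_id, page_size),
    ).fetchall()

# Illustrative data: 25 employees with ids 1..25.
conn = sqlite3.connect(":memory:")
conn.execute("create table employee (id integer primary key, name text)")
conn.executemany(
    "insert into employee (id, name) values (?, ?)",
    [(i, f"emp{i}") for i in range(1, 26)],
)

pages = []
last_id = 0
while True:
    page = fetch_page(conn, last_id, 10)
    if not page:
        break
    pages.append(page)
    last_id = page[-1][0]  # remember the largest id seen so far
```

The caller only has to remember the last id of the previous page, which is why this approach suits "next page" navigation better than jumping to an arbitrary page number.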

5. Optimize your like statements.

In daily development, fuzzy keyword searches naturally bring like to mind, but like can easily make your index fail.

Counterexample:

select userId,name from user where userId like '%123';

Positive examples:

select userId,name from user where userId like '123%';

Reason:

  • With % at the front, the index is not used.
  • With % after the keyword, the index can still be used.

6. Use where conditions to restrict the data queried, avoiding returning redundant rows.

Suppose the business scenario is to check whether a user is a member. I have seen old code implement it like this...

Counterexample:

List<Long> userIds = sqlMap.queryList("select userId from user where isVip=1");
boolean isVip = userIds.contains(userId);

Positive examples:

// #userId# stands for the bound parameter; the exact syntax depends on the data-access layer
Long found = sqlMap.queryObject("select userId from user where userId=#userId# and isVip=1");
boolean isVip = found != null;

Reason:

  • Query only the data you need; not returning unnecessary data saves overhead.
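The same idea as a runnable sketch, assuming Python's sqlite3 module in place of the sqlMap layer used above; the is_vip helper and the two sample rows are hypothetical. The filter runs inside the database and the user id is passed as a bound parameter, so at most one row is returned to the application.

```python
import sqlite3

def is_vip(conn, user_id):
    """Check membership for one user; at most one row leaves the database."""
    row = conn.execute(
        "select userId from user where userId = ? and isVip = 1 limit 1",
        (user_id,),
    ).fetchone()
    return row is not None

# Illustrative data: user 1 is a member, user 2 is not.
conn = sqlite3.connect(":memory:")
conn.execute("create table user (userId integer primary key, isVip integer not null)")
conn.executemany("insert into user (userId, isVip) values (?, ?)", [(1, 1), (2, 0)])
```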

7. Avoid applying MySQL built-in functions to indexed columns.

Business requirement: query the users who logged in within the last seven days (assuming loginTime is indexed).

Counterexample:

select userId,loginTime from loginuser where Date_ADD(loginTime,Interval 7 DAY) >=now();

Positive examples:

explain  select userId,loginTime from loginuser where  loginTime >= Date_ADD(NOW(),INTERVAL - 7 DAY);

Reason:

  • Applying a MySQL built-in function to an indexed column makes the index fail.
  • If no built-in function is applied to the indexed column, the index is still used.

8. Avoid performing expression operations on fields in the where clause; this causes the system to abandon the index and do a full table scan.

Counterexample:

select * from user where age-1 =10;

Positive examples:

select * from user where age =11;

Reason:

  • Even though age is indexed, performing a calculation on it means the index is lost.

9. Among inner join, left join and right join, prefer inner join; if you use left join, keep the left table's result as small as possible.

  • inner join: when two tables are joined in a query, only the rows that match in both tables are kept in the result set.
  • left join: when two tables are joined in a query, all rows from the left table are returned, even if there is no matching record in the right table.
  • right join: when two tables are joined in a query, all rows from the right table are returned, even if there is no matching record in the left table.

Provided the SQL requirements are met, prefer inner join. If you have to use left join, keep the left table's data as small as possible, and apply conditions on the left side wherever you can.

Counterexample:

select * from tab1 t1 left join tab2 t2  on t1.size = t2.size where t1.id>2;

Positive examples:

select * from (select * from tab1 where id >2) t1 left join tab2 t2 on t1.size = t2.size;

Reason:

  • With an inner join on an equality condition, the number of rows returned is probably relatively small, so performance is relatively better.
  • Similarly, with left join, keeping the left table's result as small as possible and applying conditions on the left side where possible means fewer rows may be returned.

10. Avoid the != or <> operator in the where clause; otherwise the engine may give up the index and do a full table scan.

Counterexample:

select age,name  from user where age <>18;

Positive examples:

-- Consider writing two separate SQL statements
select age,name from user where age <18;
select age,name from user where age >18;

Reason:

  • Using != or <> is likely to make the index fail.

11. When using a composite index, pay attention to the order of the indexed columns; in general, follow the leftmost-match principle.

Table structure (a composite index idx_userid_age, with userId first and age second):

CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`userId` int(11) NOT NULL,
`age` int(11) DEFAULT NULL,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_userid_age` (`userId`,`age`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;

Counterexample:

select * from user where age = 10;

Positive examples:

-- Satisfies the leftmost-match principle
select * from user where userid=10 and age =10;
-- Satisfies the leftmost-match principle
select * from user where userid =10;

Reason:

  • Creating a composite index such as (k1, k2, k3) is equivalent to creating three indexes: (k1), (k1, k2) and (k1, k2, k3). This is the leftmost-match principle.
  • If a query on a composite index does not satisfy the leftmost principle, the index generally fails, although this also depends on the MySQL optimizer.
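The leftmost-match behaviour can be observed with a query plan. The sketch below is an assumption-laden stand-in: it uses SQLite's EXPLAIN QUERY PLAN from Python rather than MySQL's explain, so the output format and the optimizer's exact choices differ from MySQL, and the plan helper is hypothetical. The table mirrors the one above, with an empty table and no statistics collected.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN for sql."""
    return [row[3] for row in conn.execute("explain query plan " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table user (id integer primary key, userId integer not null, "
    "age integer, name text not null)"
)
conn.execute("create index idx_userid_age on user (userId, age)")

queries = {
    "age only": "select * from user where age = 10",
    "userId only": "select * from user where userId = 10",
    "userId and age": "select * from user where userId = 10 and age = 10",
}
# First plan line for each query: a SCAN means the whole table is read,
# a SEARCH means the engine seeks into an index.
plans = {label: plan(conn, sql)[0] for label, sql in queries.items()}
```

In this setup, only the two queries that start from userId report a search on idx_userid_age; the query on age alone is a scan.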

12. When optimizing queries, consider building indexes on the columns involved in where and order by, and try to avoid full table scans.

Counterexample:

select * from user where address ='深圳' order by age ;

Positive examples:

-- Add an index
alter table user add index idx_address_age (address,age);

13. If you are inserting a lot of data, consider batch inserts.

Counterexample:

for(User u :list){
INSERT into user(name,age) values(#name#,#age#)
}

Positive examples:

// Insert in batches, 500 rows at a time
insert into user(name,age) values
<foreach collection="list" item="item" index="index" separator=",">
(#{item.name},#{item.age})
</foreach>

Reason:

  • Batch inserts perform better and save time.

An analogy: suppose you need to move a million bricks to the roof and you have an elevator that can carry a suitable number of bricks at a time (up to 500). You can send one brick per trip or 500 per trip. Which do you think takes longer?
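A minimal sketch of the batching loop, assuming Python's sqlite3 module as a stand-in for the MyBatis mapper above. Here executemany plays the role of the multi-row values list, and the insert_in_batches helper, the batch size default and the sample rows are hypothetical.

```python
import sqlite3

def insert_in_batches(conn, rows, batch_size=500):
    """Insert (name, age) rows in chunks; returns the number of batches sent."""
    batches = 0
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # One executemany call and one commit per chunk, instead of one
        # statement and one commit per row.
        conn.executemany("insert into user (name, age) values (?, ?)", chunk)
        conn.commit()
        batches += 1
    return batches

conn = sqlite3.connect(":memory:")
conn.execute("create table user (id integer primary key, name text, age integer)")
rows = [(f"user{i}", 20 + i % 30) for i in range(1200)]
batches = insert_in_batches(conn, rows)
total = conn.execute("select count(*) from user").fetchone()[0]
```

Capping the batch size keeps each statement and transaction bounded, which is the reason for splitting a very large list rather than sending it in a single insert.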

14. Use a covering index where appropriate.

A covering index means the SQL statement does not need to go back to the table: all the required data can be obtained from the index alone, which greatly improves query efficiency.

Counterexample:

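-- Fuzzy like query; the index is not used
select * from user where userid like '%123%'

Positive examples:

-- id is the primary key and userId has an ordinary (secondary) index, so every selected column is in the index: a covering index.
select id,userId from user where userid like '%123%';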
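Whether a query is covered depends on every selected column being present in the index. The sketch below illustrates this with SQLite's EXPLAIN QUERY PLAN from Python, as an assumed stand-in for MySQL's explain (where a covered query typically shows "Using index" in the Extra column). The plan helper is hypothetical, and an equality condition is used to keep the plan simple.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN for sql."""
    return [row[3] for row in conn.execute("explain query plan " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table user (id integer primary key, userId integer not null, "
    "age integer, name text not null)"
)
conn.execute("create index idx_userId on user (userId)")

# id and userId both live in the secondary index, so no table lookup is needed.
covered = plan(conn, "select id, userId from user where userId = 123")[0]
# name is not in the index, so each match requires going back to the table.
not_covered = plan(conn, "select id, name from user where userId = 123")[0]
```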


15. Use the distinct keyword with caution.

The distinct keyword is generally used to filter out duplicate records so that only unique records are returned. When it is applied to one field or a few fields, it helps optimize the query. When it is applied to many fields, it greatly reduces query efficiency.

Counterexample:

SELECT DISTINCT * from  user;

Positive examples:

select DISTINCT name from user;

Reason:

  • A statement with distinct takes more CPU time and elapsed time than the same statement without it. When many fields are queried with distinct, the database engine has to compare the data and filter out duplicates, and this comparison and filtering consumes system resources and CPU time.

16. Delete redundant and duplicate indexes.

Counterexample:

KEY `idx_userId` (`userId`),
KEY `idx_userId_age` (`userId`,`age`)

Positive examples:

-- Drop the userId index, because a composite index (A,B) is equivalent to creating indexes (A) and (A,B)
KEY `idx_userId_age` (`userId`,`age`)

Reason:

  • Duplicate indexes have to be maintained, and the optimizer has to consider each of them individually when optimizing a query, which affects performance.

17. If the amount of data is large, optimize your update / delete statements.

Avoid modifying or deleting too much data at once, because it drives CPU utilization up and affects other users' access to the database.

Counterexample:

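-- Delete 100,000 or 1,000,000+ rows in one go?
delete from user where id <100000;
// Or delete one row per loop iteration: inefficient and slow
for(User user:list){
delete from user where id = #id#;
}

Positive examples:

-- Delete in batches, e.g. 500 rows at a time
delete from user where id<500;
delete from user where id>=500 and id<1000;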

Reason:

  • Deleting too much data at once may cause a "lock wait timeout exceeded" error, so batching is recommended.
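A sketch of the batched delete as a loop, assuming Python's sqlite3 module as a stand-in for MySQL; the delete_in_batches helper, the id ranges and the sample data are hypothetical. Each batch is committed on its own, so no single transaction holds locks on the whole range.

```python
import sqlite3

def delete_in_batches(conn, max_id, batch_size=500):
    """Delete rows with id < max_id in id ranges of batch_size; returns rows deleted."""
    deleted = 0
    for low in range(0, max_id, batch_size):
        high = min(low + batch_size, max_id)
        cur = conn.execute("delete from user where id >= ? and id < ?", (low, high))
        conn.commit()  # a short transaction per batch releases locks early
        deleted += cur.rowcount
    return deleted

# Illustrative data: 2000 users with ids 1..2000.
conn = sqlite3.connect(":memory:")
conn.execute("create table user (id integer primary key, name text)")
conn.executemany(
    "insert into user (id, name) values (?, ?)",
    [(i, f"user{i}") for i in range(1, 2001)],
)
conn.commit()

deleted = delete_in_batches(conn, 1200)
remaining = conn.execute("select count(*), min(id) from user").fetchone()
```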

18. In the where clause, consider using a default value instead of null.

Counterexample:

select * from user where age is not null;

Positive examples:

-- Set 0 as the default value
select * from user where age>0;

Reason:

  • This is not to say that is null or is not null never uses the index; it depends on the MySQL version and the query cost.

If the MySQL optimizer finds that using the index costs more than not using it, it gives up the index. Conditions such as !=, >, is null and is not null are often thought to make the index fail; in fact, the optimizer usually abandons the index in these cases because the query cost is high.

  • Replacing null with a default value often makes it possible to use the index, and the meaning of the expression is also clearer.

19. Do not join more than five tables.

  • The more tables are joined, the greater the compile time and overhead.
  • Splitting the join into several smaller queries is more readable.
  • If you must join many tables to get the data, the design is probably bad.

20. Use exists and in appropriately.

Suppose table A is a company's employee table and table B is the department table. To query all employees of all departments, it is easy to write the following SQL:

select * from A where deptId in (select deptId from B);

This is equivalent to the following.

First, query the department table B:

select deptId from B

Then, for each department's deptId, query the employees in A:

select * from A where A.deptId = B.deptId

This can be abstracted into a loop:

List<> resultSet ;
for(int i=0;i<B.length;i++) {
    for(int j=0;j<A.length;j++) {
        // B is indexed by the outer counter i, A by the inner counter j;
        // no break, since one department can have several employees
        if(A[j].deptId==B[i].deptId) {
            resultSet.add(A[j]);
        }
    }
}

Besides in, we can also use exists to achieve the same result:

select * from A where exists (select 1 from B where A.deptId = B.deptId);

exists works like this: the main query is executed first; each row obtained is then fed into the subquery's condition for verification, and the verification result (true or false) determines whether that row of the main query is kept.

So this is equivalent to:

select * from A, looping over table A first,

select * from B where A.deptId = B.deptId, then looping over table B.

Similarly, this can be abstracted into a loop:

List<> resultSet ;
for(int i=0;i<A.length;i++) {
    for(int j=0;j<B.length;j++) {
        if(A[i].deptId==B[j].deptId) {
            resultSet.add(A[i]);
            break;
        }
    }
}

The most expensive thing for a database is establishing and releasing connections with the program. Suppose the program connects twice, each time querying a data set of millions of rows and then leaving: that is only two connections. If instead it establishes millions of connections, requesting and releasing them over and over, the system cannot cope. Hence the MySQL optimization principle: let the small table drive the large table, and the small data set drive the large data set, for better performance.

Therefore, we should put the smaller loop on the outside: if B has less data than A, in is suitable; if B has more data than A, choose exists.
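The two forms return the same rows; only the driving table differs. A small sketch, assuming Python's sqlite3 module as a stand-in for MySQL and hypothetical sample data (two departments, four employees, one of them in a department that does not exist):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table A (id integer primary key, name text, deptId integer)")
conn.execute("create table B (deptId integer primary key, deptName text)")
conn.executemany("insert into B values (?, ?)", [(1, "dev"), (2, "ops")])
conn.executemany(
    "insert into A values (?, ?, ?)",
    [(1, "ann", 1), (2, "bob", 2), (3, "cat", 1), (4, "dan", 9)],  # dept 9 is not in B
)

# in: conceptually, B is evaluated first and A is filtered against its result.
with_in = conn.execute(
    "select id from A where deptId in (select deptId from B) order by id"
).fetchall()

# exists: conceptually, each row of A is checked against B.
with_exists = conn.execute(
    "select id from A where exists (select 1 from B where A.deptId = B.deptId) "
    "order by id"
).fetchall()
```

With tables this small the choice makes no measurable difference; the sketch only shows that the results are equivalent, so the decision can be made on table sizes alone.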

21. Use union all instead of union where possible.

If the search results contain no duplicate records, union all is recommended in place of union.

Counterexample:

select * from user where userid=1 
union
select * from user where age = 10

Positive examples:

select * from user where userid=1 
union all
select * from user where age = 10

Reason:

  • With union, whether or not the results contain duplicates, the database tries to merge them and then sorts before outputting the final result. If you know the results contain no duplicates, use union all instead of union to improve efficiency.
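The difference in results is easy to see when one row satisfies both branches. A sketch assuming Python's sqlite3 module as a stand-in for MySQL, with hypothetical sample data in which the row with id 1 matches both conditions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table user (id integer primary key, userId integer, age integer)")
conn.executemany(
    "insert into user values (?, ?, ?)",
    [(1, 1, 10), (2, 2, 10), (3, 3, 30)],  # row 1 matches both branches
)

branches = "select id from user where userId = 1 {op} select id from user where age = 10"
# union removes the duplicate row; union all keeps it and skips the dedup work.
with_union = sorted(conn.execute(branches.format(op="union")).fetchall())
with_union_all = sorted(conn.execute(branches.format(op="union all")).fetchall())
```

So union all is a safe replacement only when the branches cannot overlap, or when duplicates are acceptable.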

22. Do not create too many indexes; generally keep it to five or fewer.

  • More indexes are not always better: indexes improve query efficiency, but they also reduce the efficiency of insert and update.
  • An insert or update may have to rebuild the index, so indexes need careful consideration, depending on the situation.
  • A table should preferably have no more than five indexes; if there are more, consider whether some of them are unnecessary.

23. Use numeric fields where possible: if a field contains only numeric information, do not design it as a character type.

Counterexample:

`king_id` varchar(20) NOT NULL COMMENT 'Guardian Id'

Positive examples:

`king_id` int(11) NOT NULL COMMENT 'Guardian Id'

Reason:

  • Compared with numeric fields, character fields reduce the performance of queries and joins and increase storage overhead.

24. Indexes are not suitable for fields with a large amount of duplicate data, such as gender.

The SQL optimizer optimizes queries according to the amount of data in the table. If the indexed column contains a lot of duplicate data and the MySQL optimizer estimates that not using the index costs less, it is likely to give up the index.

25. Avoid returning too much data to the client.

Suppose the business requirement is that a user asks to view the live streams they have watched in the past year.

Counterexample:

-- Query all the data in one go
select * from LivingInfo where watchId =userId and watchTime >= Date_sub(now(),Interval 1 YEAR)

Positive examples:

-- Paginated query
select * from LivingInfo where watchId =userId and watchTime>= Date_sub(now(),Interval 1 YEAR) limit offset,pageSize

-- With front-end pagination, query the first 200 records first, since users generally do not page very far
select * from LivingInfo where watchId =userId and watchTime>= Date_sub(now(),Interval 1 YEAR) limit 200 ;

26. When joining multiple tables in a SQL statement, give the tables aliases and prefix each column with its alias, so that the semantics are clearer.

Counterexample:

select  * from A inner
join B on A.deptId = B.deptId;

Positive examples:

select  member.name,deptment.deptName from A member inner
join B deptment on member.deptId = deptment.deptId;

27. Use varchar / nvarchar instead of char / nchar wherever possible.

Counterexample:

  `deptName` char(100) DEFAULT NULL COMMENT 'Department name'

Positive examples:

  `deptName` varchar(100) DEFAULT NULL COMMENT 'Department name'

Reason:

  • First, variable-length fields take less space, which saves storage.
  • Second, for queries, searching within a smaller field is more efficient.

28. To make group by more efficient, filter out unwanted records before the group by is executed.

Counterexample:

select job,avg(salary) from employee  group by job having job ='president' 
or job = 'management'

Positive examples:

select job,avg(salary) from employee where job ='president' 
or job = 'management' group by job;

29. If a field is a string type, always wrap the value in quotation marks in the where clause; otherwise the index fails.

Counterexample:

select * from user where userid =123;

Positive examples:

select * from user where userid ='123';

Reason:

  • Why does the first statement, without single quotation marks, not use the index? Without the quotation marks, a string is being compared with a number. The types do not match, so MySQL performs an implicit type conversion and compares the two values as floating-point numbers.

30. Use explain to analyze your SQL execution plan.

When writing SQL in daily development, try to make this a habit: use explain to analyze the SQL you write, paying particular attention to whether it uses the index.

explain select * from user where userid =10086 or age =18;

 


Origin www.cnblogs.com/frankyou/p/12597853.html