How to write elegant native SQL statements?

Preface:

The previous article, on MySQL's basic architecture, explained in detail how a SQL query is executed inside the MySQL architecture. Knowing that process is useful, but to write better and faster SQL I think it is just as important to know the order in which the clauses of a statement are executed. Readers of the previous article will remember that every clause is ultimately executed by the executor, which reads and writes data through the interface provided by the storage engine. Let's look at that order now.

The complete execution order of the clauses in a statement (executed in numbered order)

  1. from (Note: this includes subqueries in the from clause)
  2. join
  3. on
  4. where
  5. group by (from this step on, MySQL lets you use aliases defined in select, and they stay usable in the later clauses)
  6. aggregate functions such as avg, sum, ...
  7. having
  8. select
  9. distinct
  10. order by
  11. limit

Analysis of each clause in execution order

Every query starts executing at from. Each step produces a virtual table, and that virtual table is the input to the next step. This is the logical order; the optimizer may choose a different physical plan as long as the result is the same.

1. from

from is where a query starts.

  • If from is followed by a table, the query operates on that table directly;
  • If from is followed by a subquery, the subquery is executed first and its result is the first virtual table, T1. (Note: the subquery itself is executed in the order described in this article.)
  • If tables need to be joined, see steps 2 and 3

2. join

If from is followed by more than one table, the join first computes the Cartesian product of the first two tables, which produces the first virtual table T1. (Note: the relatively smaller table is chosen as the base table.)

3. on

The ON condition is applied to the virtual table T1, and only the rows that match are recorded in the virtual table T2. (Note: if a third table is joined, the Cartesian product of T2 and the third table produces T3, and step 3 (on) is repeated to produce T4. To keep things simple, the explanation below ignores T3 and T4 and continues from T2, as if only two tables were joined.)
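Steps 2 and 3 can be traced on a tiny data set. The sketch below is only an illustration: it uses SQLite through Python's sqlite3 module as a stand-in for MySQL, borrows the article and user table names from the alias examples later in this article, and the rows are made up.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (aid INTEGER, uid INTEGER, title TEXT);
    CREATE TABLE user (uid INTEGER, username TEXT);
    INSERT INTO article VALUES (1, 10, 'first'), (2, 20, 'second'), (3, 10, 'third');
    INSERT INTO user VALUES (10, 'ann'), (20, 'bob'), (30, 'cid');
""")

# Step 2: the Cartesian product pairs every article with every user (T1).
t1 = conn.execute("SELECT * FROM article CROSS JOIN user").fetchall()

# Step 3: ON keeps only the pairs whose uid values match (T2).
t2 = conn.execute(
    "SELECT a.title, u.username FROM article AS a "
    "INNER JOIN user AS u ON a.uid = u.uid ORDER BY a.aid"
).fetchall()

print(len(t1))  # 9: 3 articles x 3 users
print(t2)
```

User 'cid' has no article, so no row of T1 that contains it survives the ON filter.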

4. where

The WHERE condition is applied to the virtual table T2. Only the rows that match are inserted into the virtual table T3.

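5. group by

The group by clause combines the rows that share the same value in the grouping column into one group, which produces the virtual table T4. Once group by has been applied, every later step can only work with the columns of T4 or with the aggregate functions of step 6 (count, sum, avg and so on). (Note: the reason is that after grouping, the final result set contains only one row per group. Keep this in mind, otherwise many problems show up here; the pitfall in the code further down covers this specifically.)

6. Aggregate functions such as avg, sum, ...

Aggregate functions only process the grouped result to obtain the aggregate values you want, such as a sum or a count. They do not generate a virtual table.

7. having

The having filter is applied, producing T5. The HAVING clause is mainly used together with the GROUP BY clause; the having filter is the first and only filter that is applied to grouped data.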
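Steps 4 to 7 can be followed in one small query. This is a minimal sketch, assuming SQLite (via Python's sqlite3) behaves like MySQL for these clauses; the pkrecord table name and its user_id, score and status columns come from the query later in this article, and the rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
    CREATE TABLE pkrecord (id INTEGER PRIMARY KEY, user_id TEXT, score INTEGER, status INTEGER);
    INSERT INTO pkrecord (user_id, score, status) VALUES
        ('u1', 50, 1), ('u1', 90, 1), ('u1', 99, 0),
        ('u2', 40, 1), ('u2', 45, 1),
        ('u3', 80, 1);
""")

rows = conn.execute("""
    SELECT user_id, COUNT(*) AS games, MAX(score) AS best
    FROM pkrecord
    WHERE status = 1          -- step 4: removes the (u1, 99) row before grouping
    GROUP BY user_id          -- step 5: one row per user from here on
    HAVING MAX(score) >= 80   -- step 7: removes the whole u2 group
    ORDER BY user_id
""").fetchall()

print(rows)
```

The score of 99 never reaches MAX because where has already filtered that row out, which is why u1's best score is 90.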

8. select

The select operation is executed: the specified columns are selected and inserted into the virtual table T6.

9. distinct

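Duplicates are removed from T6: identical rows are dropped, producing the virtual table T7. (Note: if a group by clause has been applied and the select list contains the grouping columns, distinct is redundant. The reason is the same as before: grouping puts the rows with the same column value into one group and returns only one row per group, so all rows are already different.)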
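The note about distinct being redundant after group by can be checked quickly. The following is a small sketch under the same assumptions as before: SQLite through Python's sqlite3 stands in for MySQL, and the pkrecord rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
    CREATE TABLE pkrecord (user_id TEXT, score INTEGER);
    INSERT INTO pkrecord VALUES ('u1', 50), ('u1', 90), ('u2', 40), ('u2', 40);
""")

def q(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

plain = q("SELECT user_id FROM pkrecord ORDER BY user_id")
deduped = q("SELECT DISTINCT user_id FROM pkrecord ORDER BY user_id")
grouped = q("SELECT user_id FROM pkrecord GROUP BY user_id ORDER BY user_id")

print(plain)    # four rows, duplicates included
print(deduped)  # one row per user_id
print(grouped)  # same rows as deduped, so DISTINCT would add nothing here
```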

10. order by

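The order by clause is applied. T7 is sorted by order_by_condition, and what is returned at this point is a cursor, not a virtual table. SQL is based on set theory: a set does not order its rows in advance, it is only a logical collection of members, and the order of the members does not matter. A query that sorts a table returns an object that holds the rows in a specific physical order. That object is called a cursor.

A few notes on order by

  • Because order by returns a cursor, a query with an order by clause cannot be relied on as a table expression. MySQL accepts order by inside a derived table, but it does not guarantee that the outer query sees the rows in that order.
  • Sorting is expensive. Unless you really need a sorted result, it is better not to specify order by.
  • order by takes two keywords: asc (ascending order) and desc (descending order)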

11. limit

The specified rows are taken out, producing the virtual table T9, and the result is returned.

limit takes either one argument or two: limit m returns the first m rows, and limit m, n skips the first m rows and then returns up to n rows.

(Note: many developers like to use this clause to solve paging. For small data sets LIMIT causes no problems, but when the amount of data is very large, LIMIT m, n is very inefficient, because LIMIT scans from the beginning every time. If you need three rows starting at row 60 million, the server first has to scan past 60 million rows and only then reads, and that scan is very slow. For large data sets it is therefore necessary to build some caching mechanism at the application layer.)
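Besides caching, one widely used way around a large offset is keyset pagination: remember the last key of the previous page and continue from there. The article does not describe this technique, so treat the following as a side sketch. It assumes SQLite (via Python's sqlite3) as a stand-in for MySQL, a pkrecord table with an auto-increment id, and made-up rows; page_by_offset and page_after are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.execute("CREATE TABLE pkrecord (id INTEGER PRIMARY KEY, score INTEGER)")
conn.executemany(
    "INSERT INTO pkrecord (id, score) VALUES (?, ?)",
    [(i, i % 100) for i in range(1, 1001)],
)

def page_by_offset(offset, size):
    # LIMIT offset, size: the server walks past `offset` rows before returning any.
    return conn.execute(
        "SELECT id FROM pkrecord ORDER BY id LIMIT ?, ?", (offset, size)
    ).fetchall()

def page_after(last_id, size):
    # Keyset pagination: seek through the primary key to the rows after last_id.
    return conn.execute(
        "SELECT id FROM pkrecord WHERE id > ? ORDER BY id LIMIT ?", (last_id, size)
    ).fetchall()

print(page_by_offset(600, 3))
print(page_after(600, 3))
```

Both calls return the same three rows here. The difference is in the work done: the second query can jump to id 600 through the primary key instead of counting 600 rows first. It only works when pages are read in key order and the caller keeps the last id seen.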

A SQL statement written for a real development requirement

SELECT `userspk`.`avatar` AS `user_avatar`, 
`a`.`user_id`, 
`a`.`answer_record`, 
 MAX(`score`) AS `score`
FROM (select * from pkrecord  order by score desc) as a 
INNER JOIN `userspk` AS `userspk` 
ON `a`.`user_id` = `userspk`.`user_id`
WHERE `a`.`status` = 1 
AND `a`.`user_id` != 'm_6da5d9e0-4629-11e9-b5f7-694ced396953' 
GROUP BY `user_id`
ORDER BY `a`.`score` DESC 
LIMIT 9;

Query result: (screenshot not preserved)

  • First, briefly, what I want to query:

From the pk record table, I want the nine users with the highest scores, together with their records and avatars.

  • Use this SQL to walk through the execution order of each clause once more

By the design of the pk record table, each user may have several records per day, so the records have to be grouped (the query groups by user_id), and from each group we only want the single record with the highest score.

Some notes on this SQL:

  1. Some readers may think the subquery is unnecessary and that the pk record table can be queried directly. That does not give the expected result, because the rows inside each group are not sorted after grouping: max does return the highest score of the group, but the other columns may not come from the record that holds that score. So the subquery is needed: it sorts the raw data first, so that the first row of each group is the record with the highest score. (Note that this relies on MySQL taking the non-aggregated columns from the first row of each group, which the server does not guarantee. With ONLY_FULL_GROUP_BY enabled, the default since MySQL 5.7, the statement is rejected.)

Compare the code and the result of the version without the subquery, shown next, and what I mean becomes clear:

-- Without the subquery
SELECT `userspk`.`avatar` AS `user_avatar`, 
`pkrecord`.`user_id`, 
`pkrecord`.`answer_record`, 
`pkrecord`.`id`, 
 MAX(`score`) AS `score`
FROM pkrecord
INNER JOIN `userspk` AS `userspk` 
ON `pkrecord`.`user_id` = `userspk`.`user_id`
WHERE `pkrecord`.`status` = 1 
AND `pkrecord`.`user_id` != 'm_6da5d9e0-4629-11e9-b5f7-694ced396953' 
GROUP BY `user_id`
ORDER BY `pkrecord`.`score` DESC 
LIMIT 9;

Query result: (screenshot not preserved)

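2. The data has already been sorted in the subquery. If the outer query sorts by the same column in the same direction (score, descending), the outer sort looks redundant and can be removed instead of being written twice. Bear in mind, though, that SQL only guarantees the order requested by the outermost order by, so keep the outer sort whenever the final order, and with it the nine rows that limit 9 returns, matters.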
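Because the sorted-subquery approach depends on which row the server picks for each group, here is a deterministic alternative for the same requirement. It is not the query used in this article: it first computes the highest score per user and then joins back to fetch the matching record. The sketch assumes SQLite (via Python's sqlite3) as a stand-in for MySQL; the table and column names follow the query above, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
    CREATE TABLE userspk (user_id TEXT PRIMARY KEY, avatar TEXT);
    CREATE TABLE pkrecord (
        id INTEGER PRIMARY KEY, user_id TEXT, answer_record TEXT,
        score INTEGER, status INTEGER
    );
    INSERT INTO userspk VALUES ('u1', 'a1.png'), ('u2', 'a2.png');
    INSERT INTO pkrecord (user_id, answer_record, score, status) VALUES
        ('u1', 'low', 10, 1), ('u1', 'high', 90, 1),
        ('u2', 'best', 70, 1), ('u2', 'worse', 30, 1);
""")

top = conn.execute("""
    SELECT u.avatar AS user_avatar, r.user_id, r.answer_record, r.score
    FROM pkrecord AS r
    INNER JOIN (
        -- highest score per user among the valid records
        SELECT user_id, MAX(score) AS score
        FROM pkrecord
        WHERE status = 1
        GROUP BY user_id
    ) AS best ON best.user_id = r.user_id AND best.score = r.score
    INNER JOIN userspk AS u ON u.user_id = r.user_id
    WHERE r.status = 1
    ORDER BY r.score DESC
    LIMIT 9
""").fetchall()

print(top)
```

Every selected column now comes from the record that actually holds the highest score, so the statement does not depend on ONLY_FULL_GROUP_BY being off. One design consequence: if a user has two records tied at the top score, both are returned, and an extra tie-breaker (for example the smallest id) would be needed to keep one row per user.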

Aliases in SQL statements

When to use aliases

In a SQL statement you can give an alias to a table name and to a field (column) name.

  • Giving a table an alias

Querying data from two tables at the same time. Before setting aliases:

SELECT article.title,article.content,user.username FROM article, user
WHERE article.aid=1 AND article.uid=user.uid

After setting an alias:

SELECT a.title,a.content,u.username FROM article AS a, user AS u where a.aid=1 and a.uid=u.uid

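Benefit: querying with table aliases makes the SQL more concise and easier to write and read, especially when the SQL is complex.

  • Giving query fields an alias

Querying one table and setting an alias directly on a query field: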

SELECT username AS name,email FROM user

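Querying two tables:

Benefit: an obvious effect of a field alias is that you can choose the field names in the returned data. When two tables have a field with the same name and both must be returned, aliases tell them apart cleanly and avoid conflicts.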

SELECT a.title AS atitle,u.username,u.title AS utitle FROM article AS a, user AS u where a.uid=u.uid
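
  • In a self-join, where a table is joined to itself (as with some category tables), aliases are required.

  • Aliases can also be used in group by and having

  • Aliases can be used in order by

    See the SQL above

  • In MySQL, delete and update can use aliases too; aliases are especially useful in multi-table (cascading) deletes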

delete t1,t2 from t_a t1 , t_b t2 where t1.id = t2.id

  • A subquery result needs an alias

    See the SQL above
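The self-join case from the list above is the one where aliases cannot be skipped, because the same table appears twice. A minimal sketch, assuming SQLite (via Python's sqlite3) as a stand-in for MySQL and a hypothetical category table with a parent_id column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO category VALUES (1, NULL, 'books'), (2, 1, 'novels'), (3, 1, 'comics');
""")

# The table aliases child and parent tell the two copies of category apart;
# the column aliases give the two name columns distinct names in the result.
cur = conn.execute("""
    SELECT child.name AS child_name, parent.name AS parent_name
    FROM category AS child
    INNER JOIN category AS parent ON child.parent_id = parent.id
    ORDER BY child.id
""")
names = [col[0] for col in cur.description]
pairs = cur.fetchall()

print(names)
print(pairs)
```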

Notes on using aliases

  • Although the AS keyword can be omitted when defining a field alias, it is recommended not to omit it

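Things to watch when writing SQL statements

Notes on writing conventions

  • String values must be enclosed in single quotes
  • The fields after select are separated by commas, but the last field, the one right before from, must not be followed by a comma
  • When a subquery is used to create a temporary table, it must be given an alias, otherwise an error is raised.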

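Notes for better performance

  • Do not use "select * from ..." to return all columns; retrieve only the columns you need. This avoids unnecessary program changes when the table structure changes later, and it also reduces wasted resources
  • Do not retrieve columns whose value is already known. In the statement below, user_id is already fixed by the where condition, so there is no need to select it again:
select user_id, name from User where user_id = '10000050'
  • Use sargable search conditions (conditions that can use an index), such as =, >, >=, <, <=, between, in, is null and like with a prefix pattern such as 'F%'. Try to avoid non-sargable negative conditions, which usually prevent the index from being used, such as <>, !=, !>, !<, not in, not like, not exists, not between, is not null and like with a leading wildcard such as '%F'
  • To check whether any record matches a condition, use exists rather than count(*): the former returns at the first matching record, the latter has to go through all matching records
  • Keep the order of the columns in the where clause consistent with the order of the index you want to use. Not every database optimizer can reorder them, so keep this as a good habit (index-related)
  • Do not apply arithmetic or functions to a field in the where clause (index-related)
  1. For example, with where amount / 2 > 100 the index cannot be used even if amount is indexed; rewrite it as where amount > 100 * 2 and the index on amount can be used
  2. For example, where substring( Lastname, 1, 1) = 'F' cannot use the index on Lastname, while where Lastname like 'F%' or where Lastname >= 'F' and Lastname < 'G' can
  • Create indexes on the columns used in min, max, distinct, order by and group by to avoid extra sorting cost (index-related)

  • Be careful with or. In an and condition, any sub-clause that can use an index improves query performance, but in an or condition, a single sub-clause that cannot use an index degrades it. For example, with where member_no = 1 or provider_no = 1, a missing index on either member_no or provider_no leads to a table scan or a clustered index scan (index-related)

  • between is usually much more efficient than in/or. If you can choose between a between condition and an in/or condition, always choose between, and write it as a combination of >= and <= instead of the between clause, because not every database optimizer can rewrite between into >= and <=, and without that rewrite the index cannot be used (index-related)

  • Adjust the order of join operations for best performance. Joins are processed top-down, so put the join of the two tables with the smaller result set first. (join-related) Note: I will cover indexes and joins in detail in two separate articles; they are only mentioned briefly in these notes.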
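Two of the tips above, keeping the indexed column bare in the where clause and using exists for an existence check, can be tried out directly. This is only a sketch: it uses SQLite through Python's sqlite3 as a stand-in, and SQLite's EXPLAIN QUERY PLAN output differs from MySQL's EXPLAIN, although the idea of checking whether the plan mentions the index is the same. The orders table, its index name and the plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER, note TEXT)")
conn.execute("CREATE INDEX idx_orders_amount ON orders (amount)")
conn.executemany(
    "INSERT INTO orders (amount, note) VALUES (?, ?)",
    [(i, "n") for i in range(1, 501)],
)

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Arithmetic on the column hides it from the index; comparing the bare column does not.
wrapped = plan("SELECT id, note FROM orders WHERE amount / 2 > 100")
bare = plan("SELECT id, note FROM orders WHERE amount > 100 * 2")
print(wrapped)
print(bare)

# Existence check: EXISTS can stop at the first match, COUNT(*) counts every match.
has_big = conn.execute(
    "SELECT EXISTS (SELECT 1 FROM orders WHERE amount > 400)"
).fetchone()[0]
matching = conn.execute("SELECT COUNT(*) FROM orders WHERE amount > 400").fetchone()[0]
print(has_big, matching)
```

In MySQL the equivalent check is to run EXPLAIN on both statements and compare the key column of the output.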

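Recommended article:

Things you didn't know about MySQL's basic architecture!

Studying the two articles together gives you a thorough understanding of how SQL statements are executed inside the architecture and of how good SQL should be written.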


You are welcome to follow my WeChat official account, 程序员成长指北. Search for "程序员成长指北" in WeChat.


Origin blog.csdn.net/weixin_33670786/article/details/91367141