MySQL skills and optimization, row to column, ranking query

Common skills and optimization

Find duplicate records

Duplicate records are found by grouping and then filtering the groups.

Use a SQL statement to find the duplicate product records in the table. The code is as follows:

-- Implemented with grouping plus filtering
select pname,count(pname)
from product
group by pname -- group by product name
having count(pname)>1; -- a count greater than 1 means the name is duplicated

Use a SQL statement to find the records whose product name and quantity are both duplicated. The code is as follows:

-- Rows that share the same name and the same quantity
select pname,num,count(*)
from product 
group by pname,num
having count(*)>1;
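
The two duplicate checks above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (pid INTEGER PRIMARY KEY, pname TEXT, num INTEGER)")
conn.executemany(
    "INSERT INTO product (pname, num) VALUES (?, ?)",
    [("pen", 10), ("pen", 10), ("pen", 5), ("ink", 3)],
)

# Names that occur more than once.
dup_names = conn.execute(
    "SELECT pname, COUNT(*) FROM product GROUP BY pname HAVING COUNT(*) > 1"
).fetchall()

# (name, quantity) pairs that occur more than once.
dup_name_and_num = conn.execute(
    "SELECT pname, num, COUNT(*) FROM product "
    "GROUP BY pname, num HAVING COUNT(*) > 1"
).fetchall()

print(dup_names)         # [('pen', 3)]
print(dup_name_and_num)  # [('pen', 10, 2)]
```

Grouping by more columns narrows what counts as a duplicate: three rows share the name, but only two of them also share the quantity.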

Delete duplicate records

Use a SQL statement to delete the records with duplicate product names, keeping only the record with the largest id. The code is as follows:

-- Delete duplicate product records
-- p1 inner join p2 (both are the same table)
-- match rows with the same name where p1 has the smaller id
delete p1 from product p1
inner join product p2
where p1.pid<p2.pid 
and p1.pname=p2.pname;
-- Check the product table
select * from product;
-- Approach: join the table to itself
-- and add the where condition (same name, smaller id)
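
As a cross-check, the same "keep the largest id per name" result can be reached with a different technique: deleting every row whose id is not the maximum of its group. The sketch below runs on SQLite through Python's sqlite3 module, because SQLite has no multi-table DELETE; the sample data is hypothetical. MySQL rejects a DELETE whose subquery reads directly from the table being deleted (error 1093), so there the subquery would need to be wrapped in a derived table, which is one reason the self-join form is convenient.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (pid INTEGER PRIMARY KEY, pname TEXT, num INTEGER)")
conn.executemany(
    "INSERT INTO product (pid, pname, num) VALUES (?, ?, ?)",
    [(1, "pen", 10), (2, "pen", 10), (3, "ink", 3), (4, "pen", 5)],
)

# Keep only the row with the largest pid for each pname (SQLite syntax).
conn.execute(
    """
    DELETE FROM product
    WHERE pid NOT IN (SELECT MAX(pid) FROM product GROUP BY pname)
    """
)

remaining = conn.execute("SELECT pid, pname FROM product ORDER BY pid").fetchall()
print(remaining)  # [(3, 'ink'), (4, 'pen')]
```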

Select random records

select * from <table_name>
order by rand()
limit <number_of_rows_to_select>;

Use a SQL statement to randomly select one student record from the studentinfo table. The code is as follows:

-- select * from <table_name>
select * from studentinfo
order by rand()
-- limit <number_of_rows_to_select>;
limit 1;
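
A minimal sketch of the same idea, run on SQLite through Python's sqlite3 module. SQLite spells the function RANDOM() where MySQL uses RAND(); the columns of studentinfo and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout for studentinfo; the article does not list its columns.
conn.execute("CREATE TABLE studentinfo (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO studentinfo (name) VALUES (?)",
    [("Ann",), ("Bob",), ("Cid",)],
)

all_rows = conn.execute("SELECT id, name FROM studentinfo").fetchall()

# Shuffle the rows with a random sort key, then keep the first one.
picked = conn.execute(
    "SELECT id, name FROM studentinfo ORDER BY RANDOM() LIMIT 1"
).fetchall()
print(picked)
```

Because every row gets a random sort key, the whole table is sorted before the limit applies, so this pattern suits small tables better than very large ones.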

Select the nth highest record

Query the information of the product with the second-largest stock. The code is as follows:

-- First take the top two rows; they will be treated as a table
select * from product
order by num desc
limit 2;
-- Then query that derived table
select * from 
(select * from product
order by num desc
limit 2) as t1 -- give the derived table an alias
order by num asc limit 1; -- sort ascending and take the first row
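
An alternative way to reach the nth row is LIMIT with OFFSET, which avoids the derived table. The sketch below runs on SQLite through Python's sqlite3 module (MySQL accepts the same LIMIT ... OFFSET form); the helper name nth_most_stocked and the sample rows are hypothetical.

```python
import sqlite3

def nth_most_stocked(conn, n):
    """Return the row at position n (1-based) when sorted by num descending.

    pid is used as a tie-breaker so the result is deterministic.
    """
    return conn.execute(
        "SELECT pid, pname, num FROM product "
        "ORDER BY num DESC, pid ASC LIMIT 1 OFFSET ?",
        (n - 1,),
    ).fetchone()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (pid INTEGER PRIMARY KEY, pname TEXT, num INTEGER)")
conn.executemany(
    "INSERT INTO product (pid, pname, num) VALUES (?, ?, ?)",
    [(1, "pen", 10), (2, "ink", 30), (3, "cap", 20)],
)

second = nth_most_stocked(conn, 2)
print(second)  # (3, 'cap', 20)
```

Both this form and the derived-table form pick a row position, not a distinct value: if two products tie for the largest stock, the "second" row has the same num as the first.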

Compare the data of two tables

There are two tables, t_a and t_b. Find the rows whose a1 value exists in t_a but not in t_b.

[Figure: sample data for t_a and t_b]

-- select * from <table_a> where <field> not in
select * from t_a where a1 not in
-- (select <field> from <table_b>)
(select a1 from t_b);

The code above first collects every value of the a1 field in t_b through a subquery, then uses not in to exclude those values in the outer query. The approach is not wrong in itself, but with a large amount of data (over a million rows), especially when the a1 field has no index, the query can be very slow. Besides not in, a left join can also be used. The code is as follows:

-- select <table_a>.* from <table_a> left join <table_b>
select t_a.* from t_a left join t_b
-- on <table_a>.<field> = <table_b>.<field>
on t_a.a1 = t_b.a1
-- where <table_b>.<field> is null (test the join column itself)
where t_b.a1 is null;
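
Speed is not the only difference between the two forms. When the subquery column contains a NULL, not in returns no rows at all, while the left join form still returns the unmatched rows. The sketch below shows this on SQLite through Python's sqlite3 module (the three-valued logic involved is standard SQL, so MySQL behaves the same way); the sample data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t_a (a1 INTEGER);
    CREATE TABLE t_b (a1 INTEGER);
    INSERT INTO t_a VALUES (1), (2), (3);
    INSERT INTO t_b VALUES (1), (NULL);   -- note the NULL
    """
)

# "2 NOT IN (1, NULL)" evaluates to NULL, not TRUE, so nothing qualifies.
not_in_rows = conn.execute(
    "SELECT a1 FROM t_a WHERE a1 NOT IN (SELECT a1 FROM t_b)"
).fetchall()

# The anti-join keeps t_a rows that found no partner in t_b.
left_join_rows = conn.execute(
    """
    SELECT t_a.a1 FROM t_a
    LEFT JOIN t_b ON t_a.a1 = t_b.a1
    WHERE t_b.a1 IS NULL
    ORDER BY t_a.a1
    """
).fetchall()

print(not_in_rows)     # []
print(left_join_rows)  # [(2,), (3,)]
```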

Row to column

To view each student's scores conveniently, use a SQL statement that turns the course rows into columns, with one row per student. The implementation code is as follows:

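-- First group by name to get the list of students
select user_name from test_tb_grade
group by user_name;
-- The final code is as follows:
select user_name,
	-- For each name, check the course and take the maximum score for that course, e.g. Zhang San's highest math score
	MAX(CASE course WHEN '数学' THEN score ELSE 0 END ) 数学, -- math; ELSE 0 defaults to 0 when there is no score
	MAX(CASE course WHEN '语文' THEN score ELSE 0 END ) 语文, -- Chinese
	MAX(CASE course WHEN '英语' THEN score ELSE 0 END ) 英语, -- English
	sum(score) 总分 -- total score
from test_tb_grade
group by user_name; -- with group by, the select list may contain only grouping columns and aggregate functions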
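
The pivot can be checked on a few rows. This sketch runs the same MAX(CASE ...) pattern on SQLite through Python's sqlite3 module; the English course names and the sample scores are hypothetical stand-ins for the data in the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_tb_grade (user_name TEXT, course TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO test_tb_grade VALUES (?, ?, ?)",
    [
        ("Zhang San", "math", 90),
        ("Zhang San", "chinese", 80),
        ("Li Si", "math", 70),
        ("Li Si", "english", 60),
    ],
)

# One output row per student, one column per course, plus a total.
pivot = conn.execute(
    """
    SELECT user_name,
           MAX(CASE course WHEN 'math'    THEN score ELSE 0 END) AS math,
           MAX(CASE course WHEN 'chinese' THEN score ELSE 0 END) AS chinese,
           MAX(CASE course WHEN 'english' THEN score ELSE 0 END) AS english,
           SUM(score) AS total
    FROM test_tb_grade
    GROUP BY user_name
    ORDER BY user_name
    """
).fetchall()

print(pivot)
# [('Li Si', 70, 0, 60, 130), ('Zhang San', 90, 80, 0, 170)]
```

A missing course shows up as 0 because of the ELSE 0 branch; dropping the ELSE would yield NULL instead, which distinguishes "no score" from a real score of 0.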

exists query

The argument after the EXISTS keyword is an arbitrary subquery. The system evaluates the subquery to determine whether it returns any rows. If it returns at least one row, EXISTS is TRUE and the outer query is executed; if it returns no rows, EXISTS is FALSE and the outer query returns nothing.

Check whether any student failed the exam for subject number 2 in the transcript table. If so, display the student number and score of every student who took the subject number 2 exam. The code is as follows:

-- The subquery must also filter on SubjectID=2; otherwise a failing score in any subject would make EXISTS true
SELECT StudentID,Exam FROM EXAM WHERE SubjectID=2 AND EXISTS (SELECT StudentID from EXAM WHERE
SubjectID=2 AND Exam<60);
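
The all-or-nothing behavior of a non-correlated EXISTS can be seen in a small sketch. It runs on SQLite through Python's sqlite3 module with hypothetical rows, and it restricts the subquery to subject 2 so that a failing score in another subject does not switch the outer query on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE exam (StudentID INTEGER, SubjectID INTEGER, Exam INTEGER);
    -- Student 1 failed subject 1, but nobody has failed subject 2 yet.
    INSERT INTO exam VALUES (1, 1, 50), (1, 2, 75), (2, 2, 88);
    """
)

QUERY = """
    SELECT StudentID, Exam FROM exam
    WHERE SubjectID = 2
      AND EXISTS (SELECT 1 FROM exam WHERE SubjectID = 2 AND Exam < 60)
    ORDER BY StudentID
"""

before = conn.execute(QUERY).fetchall()

# Add a failing score for subject 2; EXISTS now becomes true for every row.
conn.execute("INSERT INTO exam VALUES (3, 2, 40)")
after = conn.execute(QUERY).fetchall()

print(before)  # []
print(after)   # [(1, 75), (2, 88), (3, 40)]
```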

all, any, some queries

ALL is placed before a subquery. The comparison operator compares the value of an expression or column with every value in the column returned by the subquery. If any single comparison is FALSE, the ALL test returns FALSE.

Query the exam records whose score is higher than every score in the course with subject number 1. The code is as follows:

-- Scores for subject number 1
select exam from exam where subjectid=1;
-- Scores greater than every subject-1 score, i.e. greater than the subject-1 maximum
-- all means every value
-- If any single comparison is FALSE, the ALL test returns FALSE.
select * from exam where exam>all(select exam from exam where subjectid=1);

Here, >ALL means greater than every value. In other words, it means greater than the maximum value. For example, >ALL (1, 2, 3) means greater than 3.

When ANY is used with a subquery, the expression or column is compared, using the comparison operator, with each row returned by the subquery. As soon as one comparison is satisfied, the result of ANY is TRUE.

Query the exam records whose score is greater than at least one score in the course with subject number 1. The code is as follows:

-- Scores for subject number 1
select exam from exam where subjectid=1;
-- Scores greater than at least one subject-1 score, i.e. greater than the subject-1 minimum
-- As soon as one comparison is satisfied, the result of ANY is TRUE.
select * from exam where exam>any(select exam from exam where subjectid=1);

SOME is a synonym for ANY.

Note: The "=ANY" operator is equivalent to "IN". The "<>ANY" operator is different from "NOT IN". "<>ANY(A,B,C)" means not equal to A, or not equal to B, or not equal to C. "NOT IN(A,B,C)" means not equal to A, and not equal to B, and not equal to C. "<>ALL" has the same meaning as "NOT IN".
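
The quantifier rules above map directly onto Python's built-in all() and any(), which makes them easy to check on small lists. This is only a model of the semantics for non-empty lists without NULLs, not of how MySQL evaluates the operators; the score values are hypothetical.

```python
subject1_scores = [60, 75, 90]   # hypothetical result of the subquery
candidates = [55, 70, 95]        # hypothetical scores being tested

# x > ALL(subquery): every comparison must hold.
gt_all = [x for x in candidates if all(x > s for s in subject1_scores)]
# x > ANY(subquery): one successful comparison is enough.
gt_any = [x for x in candidates if any(x > s for s in subject1_scores)]

# Same results as comparing with the maximum and the minimum.
gt_max = [x for x in candidates if x > max(subject1_scores)]
gt_min = [x for x in candidates if x > min(subject1_scores)]

values = ["A", "B", "C"]
probe = ["A", "D"]
ne_any = [x for x in probe if any(x != v for v in values)]  # <>ANY
ne_all = [x for x in probe if all(x != v for v in values)]  # <>ALL
not_in = [x for x in probe if x not in values]              # NOT IN

print(gt_all, gt_any)          # [95] [70, 95]
print(ne_any, ne_all, not_in)  # ['A', 'D'] ['D'] ['D']
```

"A" passes the <>ANY test because it differs from B and C, which is why <>ANY is rarely what a query actually wants.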

union merge query

The MySQL union operator combines the results of two or more select statements into a single result set.

Union removes duplicate rows when merging, which is equivalent to applying distinct to the combined result. If you do not want duplicates removed, use union all.

-- select <fields> from <table_a>
-- union/union all
-- select <fields> from <table_b> (the fields of a and b must match)
select id,name,age from student
-- union removes duplicate rows when merging; union all keeps every row
union all
select id,name,age from teacher;

Note: Keep the following points in mind when using union:

  • The two selects of the union must have the same number of columns;
  • The columns must have similar data types;
  • The order of the columns must be the same;
  • Union is usually slower than union all because it has to de-duplicate the combined result.
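
The difference between the two operators shows up as soon as both tables contain an identical row. A minimal sketch on SQLite through Python's sqlite3 module, with hypothetical sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (id INTEGER, name TEXT, age INTEGER);
    CREATE TABLE teacher (id INTEGER, name TEXT, age INTEGER);
    INSERT INTO student VALUES (1, 'Ann', 20), (2, 'Bob', 21);
    INSERT INTO teacher VALUES (1, 'Ann', 20), (3, 'Cid', 40);
    """
)

# UNION removes the row that appears in both tables.
union_rows = conn.execute(
    "SELECT id, name, age FROM student "
    "UNION SELECT id, name, age FROM teacher ORDER BY id"
).fetchall()

# UNION ALL keeps it twice.
union_all_rows = conn.execute(
    "SELECT id, name, age FROM student "
    "UNION ALL SELECT id, name, age FROM teacher ORDER BY id"
).fetchall()

print(union_rows)           # [(1, 'Ann', 20), (2, 'Bob', 21), (3, 'Cid', 40)]
print(len(union_all_rows))  # 4
```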

Ranking query

Find the scores and the ranking for the subject with subject number 2. The code is as follows:

-- Query the scores of subject 2, sorted from high to low
select * from exam a where subjectid=2 order by a.exam desc;
select 
examid,studentid,exam,
-- "(select @rownum:=0) b" initializes the variable @rownum to 0
-- the alias is ranking because rank is a reserved word in MySQL 8.0
@rownum:=@rownum+1 as ranking
from exam a,(select @rownum:=0) b
where subjectid=2 order by a.exam desc;

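When scores are tied, the tied students share the same rank and the students after them continue with the next rank, without skipping (dense ranking). The implementation code is as follows:

select examid,studentid,exam,
CASE
-- Same score as the previous row: @rownum stays the same
when @prev = a.exam then @rownum
-- Different score: remember it in @prev and add one to @rownum
-- (the is not null test keeps a score of 0 from failing the condition)
when (@prev := a.exam) is not null then @rownum := @rownum+1
end as ranking -- rank is a reserved word in MySQL 8.0
from exam a,(select @rownum:=0,@prev:=null) b
where SubjectId=2
order by a.exam desc;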

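When scores are tied, the tied students share the same rank and the students after them skip the ranks used up by the tie (jump ranking), for example 1, 2, 2, 4. The implementation code is as follows:

-- the alias is ranking because rank is a reserved word in MySQL 8.0
select examid,studentid,exam,ranking from
(select examid,studentid,exam,
-- Same score as the previous row: keep @rownum; otherwise take the row counter @inc
@rownum:=if(@prev=a.exam,@rownum,@inc) as ranking,
-- @inc increases by one on every row, tie or not
@inc:=@inc+1,
@prev:=a.exam
from exam a,(select @rownum:=0,@prev:=null,@inc:=1) b
where SubjectId=2
order by a.exam desc) tb;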
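
The three ranking styles in this section correspond to the standard window functions ROW_NUMBER, DENSE_RANK and RANK, which MySQL supports from version 8.0 and which do not depend on the evaluation order of user variables. The sketch below is an alternative technique, not the variable-based method used above; it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or newer), and the sample rows and column aliases are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE exam (examid INTEGER, studentid INTEGER,
                       subjectid INTEGER, exam INTEGER);
    INSERT INTO exam VALUES
        (1, 1, 2, 90), (2, 2, 2, 80), (3, 3, 2, 80),
        (4, 4, 2, 70), (5, 5, 1, 99);
    """
)

rows = conn.execute(
    """
    SELECT studentid, exam,
           ROW_NUMBER() OVER (ORDER BY exam DESC, studentid) AS row_no,
           DENSE_RANK() OVER (ORDER BY exam DESC)            AS dense_ranking,
           RANK()       OVER (ORDER BY exam DESC)            AS jump_ranking
    FROM exam
    WHERE subjectid = 2
    ORDER BY exam DESC, studentid
    """
).fetchall()

for row in rows:
    print(row)
# (1, 90, 1, 1, 1)
# (2, 80, 2, 2, 2)
# (3, 80, 3, 2, 2)
# (4, 70, 4, 3, 4)
```

The last two columns differ only after the tie: dense ranking continues with 3, jump ranking skips to 4.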

Group statistics

To query the number of clicks for each city in each month, the code is as follows:

select city_name,state_month,sum(sx_sum) 
from sx_target
-- group by city and month
group by city_name,state_month;
-- with group by, every item in the select list must be either a grouping column or an aggregate function

Now suppose that, in addition to the statistics above, the subtotal for each city and the grand total for all cities must also be displayed. The following code produces these totals:

select city_name,state_month,sum(sx_sum)
from sx_target
group by city_name,state_month with rollup;

with rollup adds summary rows on top of the grouping. For example, group by a, b with rollup first aggregates by (a, b), then by (a), and finally over the whole result set; in the summary rows the rolled-up columns are shown as null.
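
To make the three aggregation levels concrete, the sketch below builds the same set of rows by hand with union all. It is an emulation on SQLite through Python's sqlite3 module, since SQLite has no with rollup; it is not how MySQL implements the feature, and the sample data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sx_target (city_name TEXT, state_month TEXT, sx_sum INTEGER);
    INSERT INTO sx_target VALUES
        ('Beijing',  '2021-01', 10), ('Beijing', '2021-01', 5),
        ('Beijing',  '2021-02', 20), ('Shanghai', '2021-01', 7);
    """
)

rows = conn.execute(
    """
    -- level (a, b): one row per city and month
    SELECT city_name, state_month, SUM(sx_sum) FROM sx_target
    GROUP BY city_name, state_month
    UNION ALL
    -- level (a): one subtotal row per city, month shown as NULL
    SELECT city_name, NULL, SUM(sx_sum) FROM sx_target
    GROUP BY city_name
    UNION ALL
    -- level (): one grand-total row, both columns shown as NULL
    SELECT NULL, NULL, SUM(sx_sum) FROM sx_target
    """
).fetchall()

for row in rows:
    print(row)
```

With this data the result has six rows: three detail rows, two city subtotals (35 for Beijing, 7 for Shanghai) and one grand total of 42. MySQL interleaves each subtotal after its group, whereas this emulation does not guarantee any particular row order.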

Origin blog.csdn.net/chaotiantian/article/details/114914984