SQL进阶-去重

一、去重的2种基本方法

1、DISTINCT

##
建表:
CREATE TABLE teacher(
teacher_id VARCHAR(50),
teacher_name VARCHAR(50),
id_no VARCHAR(50)
);

CREATE INDEX idx_teacher_id ON teacher(teacher_id);

插入重复数据:
DELETE FROM teacher;
INSERT INTO teacher VALUES('20180204060001','李斌','530102192005080114');
INSERT INTO teacher VALUES('20180204060002','张成','530102192005080115');
INSERT INTO teacher VALUES('20180204060001','李斌','530102192005080114');
INSERT INTO teacher VALUES('20180204060002','张成','530102192005080115');
INSERT INTO teacher VALUES('20180204060001','李斌','530102192005080114');
INSERT INTO teacher VALUES('20180204060002','张成','530102192005080115');


DISTINCT去重查询:
SELECT DISTINCT * FROM teacher;


group by去重查询:
SELECT 
teacher_id,
MIN(teacher_name),
MIN(id_no)
FROM teacher
GROUP BY teacher_id;

在重复数据都一样的时候,可以用group by,但是当重复数据不完全一样时,可能就会出问题;


二、其他6中去重方法

##
在重复数据都一样的时候,可以用group by,但是当重复数据不完全一样时,可能就会出问题;

插入数据
DELETE FROM teacher;
INSERT INTO teacher VALUES('20180204060001','李斌','530102192005080114');
INSERT INTO teacher VALUES('20180204060002','张成','530102192005080115');
INSERT INTO teacher VALUES('20180204060001','白斌','630102192005080114');
INSERT INTO teacher VALUES('20180204060002','赵成','630102192005080115');


###
(1)可以用子查询+group byselect * from teacher a where (a.teacher_id,a.id_no) in
(select b.teacher_id,max(b.id_no) id_no from teacher b group by b.teacher_id);


###
(2)
SELECT * FROM teacher a WHERE NOT EXISTS 
(SELECT 1 FROM teacher b WHERE a.teacher_id = b.teacher_id AND a.id_no > b.id_no);


###
(3)
SELECT a.* FROM teacher a WHERE a.id_no <= ALL(
  SELECT b.id_No
  FROM teacher b
  WHERE a.teacher_id = b.teacher_id
);


###
(4)
SELECT a.*
FROM teacher a
INNER JOIN teacher b
ON a.teacher_id = b.teacher_id
AND a.id_no >= b.id_no
GROUP BY a.teacher_id,a.teacher_name,a.id_no
HAVING COUNT(*)=1;


###
(5)
SELECT 
teacher_id,
SUBSTRING_INDEX(
GROUP_CONCAT(teacher_name ORDER BY id_no ASC),
',',1),
SUBSTRING_INDEX(
GROUP_CONCAT(id_no ORDER BY id_no ASC),
',',1)
FROM teacher
GROUP BY teacher_id;


###
(6)
SELECT teacher_id,teacher_name,id_no
FROM
(
SELECT 
teacher_id,teacher_name,id_no,
ROW_NUMBER() OVER(PARTITIONING BY teacher_id
ORDER BY id_no ASC) AS rn_no
FROM teacher
)
WHERE rn_no = 1;

MySQL不支持ROW_NUMBER()的写法!

猜你喜欢

转载自www.cnblogs.com/weiyiming007/p/11463167.html