This article describes three ways to deduplicate query results: distinct, group by, and row_number() over.
Note: deduplication here means that duplicates are not shown at query time; the duplicate rows are not deleted from the table. For SQL that deletes duplicate rows from a table, please refer to this link:
https://www.cnblogs.com/171207xiaohutu/p/11520763.html
1. distinct
The userinfo table contains the following data:
id | name | age | height |
10 | xiaogang | 23 | 181 |
11 | xiaoli | 31 | 176 |
12 | xiaohei | 22 | 152 |
13 | xiaogang | 26 | 172 |
14 | xiaoming | 31 | 176 |
Now we want the user names in the table, without repeats:
select distinct name from userinfo
Result (1):
name
xiaogang
xiaohei
xiaoli
xiaoming
But now I also want the id value, so the query changes as follows:
select distinct name,id from userinfo
Result (2):
name | id |
xiaogang | 10 |
xiaoli | 11 |
xiaohei | 12 |
xiaogang | 13 |
xiaoming | 14 |
Here distinct applies to both fields together: a row is excluded only when both its name and its id match another row. Because every id is different, the two xiaogang rows are both kept.
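The two distinct queries can be checked with a small script. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in database (the article itself targets SQL Server); the `ROWS` list simply copies the userinfo table shown earlier.

```python
import sqlite3

# Sample data copied from the userinfo table in this article.
ROWS = [
    (10, "xiaogang", 23, 181),
    (11, "xiaoli", 31, 176),
    (12, "xiaohei", 22, 152),
    (13, "xiaogang", 26, 172),
    (14, "xiaoming", 31, 176),
]

# In-memory SQLite database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo (id integer, name text, age integer, height integer)")
conn.executemany("insert into userinfo values (?, ?, ?, ?)", ROWS)

# distinct on one column: duplicate names collapse into one row.
names = [r[0] for r in conn.execute("select distinct name from userinfo order by name")]

# distinct on two columns: a row is dropped only if name AND id both repeat.
pairs = conn.execute("select distinct name, id from userinfo order by id").fetchall()

print(names)       # four unique names
print(len(pairs))  # all five rows survive, since every id is unique
```

The `order by` clauses are added only so the output order is predictable; they are not needed for the deduplication itself.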
2. group by
select name
from userinfo
group by name
Running the three lines of SQL above gives the same result as the distinct query, result (1).
select name,id
from userinfo
group by name, id
Running the three lines of SQL above gives the same result as the distinct query, result (2).
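The claim that group by and distinct return the same rows here can be verified directly. The sketch below again assumes SQLite through Python's sqlite3 module rather than SQL Server, and compares the results as sets because neither query guarantees a row order.

```python
import sqlite3

# Sample data copied from the userinfo table in this article.
ROWS = [
    (10, "xiaogang", 23, 181),
    (11, "xiaoli", 31, 176),
    (12, "xiaohei", 22, 152),
    (13, "xiaogang", 26, 172),
    (14, "xiaoming", 31, 176),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo (id integer, name text, age integer, height integer)")
conn.executemany("insert into userinfo values (?, ?, ?, ?)", ROWS)

def rows(sql):
    """Run a query and return its rows as a set, ignoring row order."""
    return set(conn.execute(sql).fetchall())

# One column: group by name matches distinct name.
same_one_column = rows("select name from userinfo group by name") == rows(
    "select distinct name from userinfo"
)

# Two columns: group by name, id matches distinct name, id.
same_two_columns = rows("select name, id from userinfo group by name, id") == rows(
    "select distinct name, id from userinfo"
)

print(same_one_column, same_two_columns)
```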
3. row_number() over
SQL Server numbers the records of a table with the ROW_NUMBER function. It must always be followed by an OVER clause, whose main purpose is to group and sort the records in the table.
The syntax is as follows:
ROW_NUMBER() OVER(PARTITION BY COLUMN1 ORDER BY COLUMN2)
1: PARTITION BY is used for grouping
2: ORDER BY is used for sorting
Next, we use row_number() over for deduplication: first group by name, then sort by id within each group.
The specific SQL statement is as follows:
SELECT * FROM (
select *,ROW_NUMBER() over(partition by name order by id desc) AS rn from userinfo ) AS u WHERE u.rn=1
The results are as follows:
id | name | age | height | rn |
13 | xiaogang | 26 | 172 | 1 |
12 | xiaohei | 22 | 152 | 1 |
11 | xiaoli | 31 | 176 | 1 |
14 | xiaoming | 31 | 176 | 1 |
As the result shows, row_number() over lets the query return all columns while deduplicating at the same time.
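To see what the rn column holds before filtering, the sketch below numbers every row and then keeps only rn = 1. It assumes SQLite 3.25 or later (the first version with window functions) through Python's sqlite3 module, as a stand-in for SQL Server; the outer `order by` is added only to make the output order predictable.

```python
import sqlite3

# Sample data copied from the userinfo table in this article.
ROWS = [
    (10, "xiaogang", 23, 181),
    (11, "xiaoli", 31, 176),
    (12, "xiaohei", 22, 152),
    (13, "xiaogang", 26, 172),
    (14, "xiaoming", 31, 176),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo (id integer, name text, age integer, height integer)")
conn.executemany("insert into userinfo values (?, ?, ?, ?)", ROWS)

# Step 1: number the rows inside each name group, highest id first.
numbered = conn.execute(
    """
    select id, name,
           row_number() over (partition by name order by id desc) as rn
    from userinfo
    order by name, rn
    """
).fetchall()

# Step 2: keep only the first row of each group.
deduped = conn.execute(
    """
    select id, name, age, height from (
        select *, row_number() over (partition by name order by id desc) as rn
        from userinfo
    ) as u
    where u.rn = 1
    order by name
    """
).fetchall()

for row in numbered:
    print(row)  # the two xiaogang rows get rn 1 (id 13) and rn 2 (id 10)
print(deduped)
```

Changing `order by id desc` to `order by id` would keep the row with the smallest id in each group instead, so the sort order inside over decides which duplicate survives.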
4. Thinking
Differences between distinct and group by:
(1) distinct is typically used to count the number of unique records, as in count(distinct name); group by is typically used to return all of the non-repeated values.
(2) After grouping with group by, the select list may contain the grouping fields as well as aggregate functions over the non-grouping fields, such as max(), min(), sum(), count(), etc.
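Both points can be illustrated on the userinfo data. This is a minimal sketch, assuming SQLite through Python's sqlite3 module instead of SQL Server; the column aliases `cnt`, `max_age` and `min_height` are names chosen here for illustration.

```python
import sqlite3

# Sample data copied from the userinfo table in this article.
ROWS = [
    (10, "xiaogang", 23, 181),
    (11, "xiaoli", 31, 176),
    (12, "xiaohei", 22, 152),
    (13, "xiaogang", 26, 172),
    (14, "xiaoming", 31, 176),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo (id integer, name text, age integer, height integer)")
conn.executemany("insert into userinfo values (?, ?, ?, ?)", ROWS)

# Point (1): distinct inside count() gives the number of unique names.
unique_names = conn.execute("select count(distinct name) from userinfo").fetchone()[0]

# Point (2): group by name, with aggregates over the non-grouping fields.
summary = conn.execute(
    """
    select name, count(*) as cnt, max(age) as max_age, min(height) as min_height
    from userinfo
    group by name
    order by name
    """
).fetchall()

print(unique_names)
for row in summary:
    print(row)
```

The xiaogang group combines two rows into one, which distinct alone cannot summarize in this way.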
Differences between distinct and row_number() over:
(1) Both distinct and row_number() over can deduplicate. When distinct is applied to a single column, it removes all duplicate values of that field; when it is applied to multiple columns, it removes only the rows in which all of the listed fields are the same.
(2) row_number() over first groups the rows with the partition by clause, then sorts each group, and then takes the first record of each group to "deduplicate".