SQL statements: three deduplication methods

This article describes three ways to deduplicate query results: distinct, group by, and row_number() over.

Note: deduplication here means that duplicates are not shown at query time; the duplicate rows are not deleted from the table. For SQL that deletes duplicate rows from a table, see:

https://www.cnblogs.com/171207xiaohutu/p/11520763.html

1. distinct

The userinfo table contains the following data:

id  name      age  height
10  xiaogang  23   181
11  xiaoli    31   176
12  xiaohei   22   152
13  xiaogang  26   172
14  xiaoming  31   176

Now we want the user names in the userinfo table without repeats:
select distinct name from userinfo

Result (1):
name
xiaogang
xiaohei
xiaoli
xiaoming

But now I also want the id value, so the query changes as follows:

select distinct name,id from userinfo

Result (2):

name      id
xiaogang  10
xiaoli    11
xiaohei   12
xiaogang  13
xiaoming  14

Here distinct applies to both fields together: a row is excluded only when both its name and its id match another row. Since every id is unique, no rows are removed.
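To make the single-field versus two-field behaviour concrete, here is a minimal sketch that loads the userinfo rows into an in-memory SQLite database and runs both queries. SQLite and the Python wrapper are assumptions chosen only so the example is self-contained (the article does not name an engine here), and the order by clauses are added only to make the output order predictable.

```python
import sqlite3

# In-memory copy of the article's userinfo table (SQLite is an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo (id integer, name text, age integer, height integer)")
conn.executemany(
    "insert into userinfo values (?, ?, ?, ?)",
    [
        (10, "xiaogang", 23, 181),
        (11, "xiaoli", 31, 176),
        (12, "xiaohei", 22, 152),
        (13, "xiaogang", 26, 172),
        (14, "xiaoming", 31, 176),
    ],
)

# distinct on one field: the two xiaogang rows collapse into one.
names = conn.execute("select distinct name from userinfo order by name").fetchall()

# distinct on two fields: a row is dropped only if both name and id repeat,
# which never happens here because every id is unique.
pairs = conn.execute("select distinct name, id from userinfo order by id").fetchall()

print(len(names), len(pairs))  # prints: 4 5
```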

 

2. group by 

select name
from userinfo
group by name

Running the three lines of SQL above gives the same result as distinct result (1).

 

select name, id
from userinfo
group by name, id

Running the three lines of SQL above gives the same result as distinct result (2).
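The equivalence between the group by and distinct forms can be checked with a short sketch. It assumes an in-memory SQLite database populated with the userinfo rows; the order by clauses are added only so the two result lists can be compared directly, since neither distinct nor group by guarantees an output order.

```python
import sqlite3

# In-memory copy of the article's userinfo table (SQLite is an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo (id integer, name text, age integer, height integer)")
conn.executemany(
    "insert into userinfo values (?, ?, ?, ?)",
    [
        (10, "xiaogang", 23, 181),
        (11, "xiaoli", 31, 176),
        (12, "xiaohei", 22, 152),
        (13, "xiaogang", 26, 172),
        (14, "xiaoming", 31, 176),
    ],
)

# One field: group by name versus distinct name.
group_one = conn.execute("select name from userinfo group by name order by name").fetchall()
distinct_one = conn.execute("select distinct name from userinfo order by name").fetchall()

# Two fields: group by name, id versus distinct name, id.
group_two = conn.execute(
    "select name, id from userinfo group by name, id order by id"
).fetchall()
distinct_two = conn.execute("select distinct name, id from userinfo order by id").fetchall()

print(group_one == distinct_one, group_two == distinct_two)  # prints: True True
```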

3. row_number() over 

In SQL Server, the Row_Number function assigns a sequence number to the records in a table. It is always followed by an over clause, which is mainly used to group (partition) and sort the records.

The syntax is as follows:

ROW_NUMBER() OVER(PARTITION BY COLUMN1 ORDER BY COLUMN2)

1: PARTITION BY is used to group

2: ORDER BY is used to sort

Next, use row_number() over to deduplicate: first group by name, then sort by id.

The specific SQL statement is as follows:
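Before any filtering, it can help to see the numbers that the syntax above assigns. The sketch below is an illustration under assumptions: it uses SQLite instead of SQL Server (window functions need SQLite 3.25 or later) and partitions the userinfo rows by name, ordering each group by id descending.

```python
import sqlite3

# In-memory copy of the article's userinfo table (SQLite is an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo (id integer, name text, age integer, height integer)")
conn.executemany(
    "insert into userinfo values (?, ?, ?, ?)",
    [
        (10, "xiaogang", 23, 181),
        (11, "xiaoli", 31, 176),
        (12, "xiaohei", 22, 152),
        (13, "xiaogang", 26, 172),
        (14, "xiaoming", 31, 176),
    ],
)

# Numbering restarts at 1 inside each name group; within a group,
# the row with the largest id gets rn = 1.
numbered = conn.execute(
    """
    select id, name,
           row_number() over (partition by name order by id desc) as rn
    from userinfo
    order by name, rn
    """
).fetchall()

for row in numbered:
    print(row)
```

Only the xiaogang group has more than one row, so it is the only group in which rn reaches 2.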

SELECT * FROM (
select *,ROW_NUMBER() over(partition by name order by id desc) AS rn from userinfo ) AS u WHERE u.rn=1

 

The results are as follows

id  name      age  height  rn
13  xiaogang  26   172     1
12  xiaohei   22   152     1
11  xiaoli    31   176     1
14  xiaoming  31   176     1

As shown, the row_number() over clause lets you display all columns while deduplicating at the same time.
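The deduplication query can be tried end to end with the following sketch. It assumes SQLite 3.25 or later in place of SQL Server, and it adds an outer order by (not in the article's statement) so the rows come back in a predictable order.

```python
import sqlite3

# In-memory copy of the article's userinfo table (SQLite is an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo (id integer, name text, age integer, height integer)")
conn.executemany(
    "insert into userinfo values (?, ?, ?, ?)",
    [
        (10, "xiaogang", 23, 181),
        (11, "xiaoli", 31, 176),
        (12, "xiaohei", 22, 152),
        (13, "xiaogang", 26, 172),
        (14, "xiaoming", 31, 176),
    ],
)

# Keep only the first row of each name group, i.e. the row with the largest id.
deduped = conn.execute(
    """
    select * from (
        select *, row_number() over (partition by name order by id desc) as rn
        from userinfo
    ) as u
    where u.rn = 1
    order by u.name
    """
).fetchall()

for row in deduped:
    print(row)  # columns: id, name, age, height, rn
```

Changing order by id desc to order by id asc would keep the row with the smallest id in each group instead (id 10 for xiaogang).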

 

4. Reflections

 

Differences between distinct and group by:

(1) distinct is commonly used to count the number of unique records, e.g. count(distinct name), while group by is commonly used to return all of the non-duplicate values.

(2) After grouping with group by, the select list may contain the grouping fields plus aggregate function values of the non-grouping fields, such as max(), min(), sum(), count(), etc.
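Both points can be illustrated with a short sketch against the userinfo data. It assumes an in-memory SQLite database, and the choice of aggregates (row count, max age, min height) is only an example.

```python
import sqlite3

# In-memory copy of the article's userinfo table (SQLite is an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo (id integer, name text, age integer, height integer)")
conn.executemany(
    "insert into userinfo values (?, ?, ?, ?)",
    [
        (10, "xiaogang", 23, 181),
        (11, "xiaoli", 31, 176),
        (12, "xiaohei", 22, 152),
        (13, "xiaogang", 26, 172),
        (14, "xiaoming", 31, 176),
    ],
)

# Point (1): count the number of unique names.
unique_names = conn.execute("select count(distinct name) from userinfo").fetchone()[0]

# Point (2): the grouping field plus aggregates over the non-grouping fields.
per_name = conn.execute(
    """
    select name, count(*) as n, max(age) as max_age, min(height) as min_height
    from userinfo
    group by name
    order by name
    """
).fetchall()

print(unique_names)  # prints: 4
for row in per_name:
    print(row)
```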

 

Differences between distinct and row_number() over:

(1) Both distinct and row_number() over can deduplicate. When distinct acts on a single field, it removes all duplicate values of that field; when it acts on multiple fields, it removes only rows in which all of those fields are identical.

(2) row_number() over first groups the rows, then sorts them, and then takes the first record of each group to deduplicate.


Origin www.cnblogs.com/171207xiaohutu/p/11520759.html