Using percent_rank, first_value, and nth_value in SQL window functions

Window functions

When a query needs both row-level detail and a per-group result, plain aggregate functions with subqueries become cumbersome. A window function partitions the rows and then applies a function within each partition. Unlike aggregation with GROUP BY, it does not collapse rows, so the same result row can show the column values of the underlying row alongside the aggregated result.

Common application scenario: ranking the grades of the students in a class
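
As a minimal sketch of the point above, the following runs a window query through Python's built-in sqlite3 module. SQLite (3.25 or later) is used here only as a convenient stand-in for MySQL 8, and the sc table contents are invented for illustration.

```python
import sqlite3

# In-memory SQLite database; the sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table sc (sid integer, cid integer, score integer)")
conn.executemany(
    "insert into sc values (?, ?, ?)",
    [(1, 1, 90), (2, 1, 80), (3, 2, 70), (4, 2, 50)],
)

# Each row keeps its own score and also carries the class average and
# its rank within the class; no rows are collapsed as with GROUP BY.
rows = conn.execute(
    """
    select sid, cid, score,
           avg(score) over (partition by cid) as class_avg,
           rank() over (partition by cid order by score desc) as class_rank
    from sc
    order by cid, class_rank
    """
).fetchall()

for row in rows:
    print(row)
```

Every student row is still present, and the per-class average is repeated on each row of that class.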

Common window functions
[Figure: table of common window functions]

The basic form of the window function

func_name(<parameter>) 
OVER(
[PARTITION BY <part_by_condition>] 
[ORDER BY <order_by_list> [ASC|DESC]]
[ROWS BETWEEN <frame_start> AND <frame_end>]
)

For the explanation of specific fields, see my previous article: Basic usage of SQL window function and aggregation function
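
To make the frame clause concrete, here is a small sketch run through Python's sqlite3 module (SQLite 3.25 or later, used as a stand-in for MySQL 8). The Sales rows and the alias names running_total and neighbor_sum are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Sales (id integer, sales integer)")
conn.executemany(
    "insert into Sales values (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40)],
)

# Two different frames over the same ordering:
#  - from the first row up to the current row (a running total)
#  - the previous row, the current row, and the next row
rows = conn.execute(
    """
    select id, sales,
           sum(sales) over (order by id
                            rows between unbounded preceding and current row)
               as running_total,
           sum(sales) over (order by id
                            rows between 1 preceding and 1 following)
               as neighbor_sum
    from Sales
    order by id
    """
).fetchall()

for row in rows:
    print(row)
```

The frame decides which rows of the partition the function sees for each current row, which is why the two sums differ even though the ordering is identical.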


Distribution functions

percent_rank(): computed as (rank - 1) / (rows - 1), where rank is the row's rank() value and rows is the total number of rows in the partition
Application scenario: not commonly used

cume_dist(): the number of rows in the group whose value is less than or equal to the current row's value, divided by the total number of rows in the group
Application scenario: find the proportion of rows with a salary less than or equal to the current salary

Example, using a Sales table with a sales column:

select *, 
rank() over(order by sales) as ranking,
percent_rank() over(order by sales) as percent_ranking,
cume_dist() over(order by sales) as cume 
from Sales

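
The two formulas can be checked by hand. The sketch below runs the query in SQLite (3.25 or later) through Python's sqlite3 module as a stand-in for MySQL 8, then recomputes both values in plain Python. The five sales figures are invented; note how the two tied rows share a rank.

```python
import sqlite3

sales = [100, 200, 200, 300, 500]  # invented sample data with one tie

conn = sqlite3.connect(":memory:")
conn.execute("create table Sales (id integer primary key, sales integer)")
conn.executemany("insert into Sales (sales) values (?)", [(s,) for s in sales])

rows = conn.execute(
    """
    select sales,
           rank() over (order by sales) as ranking,
           percent_rank() over (order by sales) as percent_ranking,
           cume_dist() over (order by sales) as cume
    from Sales
    order by sales, id
    """
).fetchall()

n = len(sales)
for value, ranking, percent_ranking, cume in rows:
    # percent_rank: (rank - 1) / (rows - 1)
    by_hand_pr = (ranking - 1) / (n - 1)
    # cume_dist: rows with value <= current value, divided by total rows
    by_hand_cd = sum(1 for s in sales if s <= value) / n
    print(value, ranking, percent_ranking, by_hand_pr, cume, by_hand_cd)
```

For the tied value 200, rank is 2 for both rows, so percent_rank is (2 - 1) / (5 - 1) = 0.25, while cume_dist counts three rows at or below 200 and gives 3 / 5 = 0.6.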


Head and tail functions

The head and tail functions first_value() and last_value() return the first or last value of an expression within the window, which in some cases is equivalent to min or max
Application scenario: query the date of the earliest sales record and the date of the latest sales record in a department

Example, using the score table sc:

select *,
first_value(score) over(partition by cid),
first_value(score) over(partition by cid order by score),
last_value(score) over(partition by cid), 
last_value(score) over(partition by cid order by score) 
from sc;

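
One detail worth seeing in action: when the window has an ORDER BY but no explicit frame, the default frame ends at the current row (and its peers), so last_value() returns the current row's value rather than the last value of the whole partition. The sketch below shows this using SQLite (3.25 or later) through Python's sqlite3 module as a stand-in for MySQL 8; the rows and alias names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table sc (sid integer, cid integer, score integer)")
conn.executemany(
    "insert into sc values (?, ?, ?)",
    [(1, 1, 80), (2, 1, 60), (3, 1, 90), (4, 2, 85), (5, 2, 70)],
)

rows = conn.execute(
    """
    select sid, cid, score,
           -- first row of the ordered partition: same as min(score) here
           first_value(score) over (partition by cid order by score)
               as lowest,
           -- default frame stops at the current row
           last_value(score) over (partition by cid order by score)
               as last_default_frame,
           -- frame widened to the whole partition: same as max(score) here
           last_value(score) over (partition by cid order by score
               rows between unbounded preceding and unbounded following)
               as highest
    from sc
    order by cid, score
    """
).fetchall()

for row in rows:
    print(row)
```

Without an ORDER BY in the window the frame is the whole partition, but the row order inside it is not guaranteed, so which value counts as "first" or "last" is not well defined.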

Ntile function

The NTILE() function divides the ordered rows in a partition into n groups (levels) and returns the group number of each row

NTILE(n)
OVER (
    [PARTITION BY <expression> [, <expression> ...]]
    ORDER BY <expression> [ASC|DESC] [, <expression> [ASC|DESC] ...]
)

NTILE(n) means the rows are divided into n groups

Example, using the score table sc:

  • Divide the students' scores into 2 groups and into 3 groups
select *,
ntile(2) over(order by score desc) as 2_tile,
ntile(3) over(order by score desc) as 3_tile
from sc;


  • Within each cid, divide the scores into 2 groups and into 3 groups
select *,
ntile(2) over(partition by cid order by score desc) as 2_tile_group,
ntile(3) over(partition by cid order by score desc) as 3_tile_group
from sc;

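
When the row count does not divide evenly, the earlier groups receive the extra rows. A small sketch with five invented scores, run in SQLite (3.25 or later) through Python's sqlite3 module as a stand-in for MySQL 8. The aliases are written tile_2 and tile_3 because SQLite does not accept unquoted identifiers that start with a digit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table sc (sid integer, cid integer, score integer)")
conn.executemany(
    "insert into sc values (?, ?, ?)",
    [(1, 1, 95), (2, 1, 90), (3, 1, 85), (4, 1, 80), (5, 1, 75)],
)

# Five rows into 2 groups gives sizes 3 and 2;
# five rows into 3 groups gives sizes 2, 2 and 1.
rows = conn.execute(
    """
    select sid, score,
           ntile(2) over (order by score desc) as tile_2,
           ntile(3) over (order by score desc) as tile_3
    from sc
    order by score desc
    """
).fetchall()

for row in rows:
    print(row)
```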

Application: from the employees table, select the employees whose salary is in the top 50%

Ideas:

  1. Sort the employees by salary and divide them into 2 groups
  2. Filter on the group number to keep the employees in the top 50% by salary
-- First, divide the employees into 2 groups by salary
select *, 
ntile(2) over(order by salary) as ranks 
from employees


-- Then keep the group that holds the top 50% of salaries
select * from (
	select *, ntile(2) over(order by salary) as ranks from employees
) t
where ranks = 2

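
The two steps above can be sketched end to end as follows, using SQLite (3.25 or later) through Python's sqlite3 module as a stand-in for MySQL 8. The employees columns emp_no and salary and the five rows are assumptions for illustration. With an odd number of rows, the direction of the sort decides which half gets the middle employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employees (emp_no integer, salary integer)")
conn.executemany(
    "insert into employees values (?, ?)",
    [(1, 3000), (2, 4000), (3, 5000), (4, 6000), (5, 7000)],
)

# Ascending sort: group 2 holds the higher salaries (2 of 5 rows).
top_asc = conn.execute(
    """
    select emp_no, salary from (
        select *, ntile(2) over (order by salary) as ranks from employees
    ) t
    where ranks = 2
    order by salary
    """
).fetchall()

# Descending sort: group 1 holds the higher salaries (3 of 5 rows,
# because the first group receives the extra row).
top_desc = conn.execute(
    """
    select emp_no, salary from (
        select *, ntile(2) over (order by salary desc) as ranks from employees
    ) t
    where ranks = 1
    order by salary
    """
).fetchall()

print(top_asc)
print(top_desc)
```

Which variant is appropriate depends on whether the middle employee should count as part of the top 50%.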


nth_value function

The nth_value() function returns the value of an expression from the Nth row of the window frame in the group. If row N does not exist, the function returns NULL.
N must be a positive integer such as 1, 2 and 3.

Application scenario: query the information of the Nth-ranked classmate.
Basic form:

NTH_VALUE(expression, N)
OVER (
	[PARTITION BY <part_by_condition>] 
	[ORDER BY <order_by_list> [ASC|DESC]]
	[ROWS BETWEEN <frame_start> AND <frame_end>]
)

NTH_VALUE(expression, N) returns the value of expression at row N

Example, using the score table sc:

select *,
nth_value(score, 1) over(partition by cid order by score desc) as 1th,
nth_value(score, 2) over(partition by cid order by score desc) as 2th,
nth_value(score, 3) over(partition by cid order by score desc) as 3th
from sc;


select *,
nth_value(score, 1) over(partition by cid) as 1th,
nth_value(score, 2) over(partition by cid) as 2th,
nth_value(score, 3) over(partition by cid) as 3th
from sc;

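
Because nth_value() counts rows inside the frame, the result depends on where the frame ends. The sketch below compares the default frame (which stops at the current row when ORDER BY is present) with a frame covering the whole partition. It uses SQLite (3.25 or later) through Python's sqlite3 module as a stand-in for MySQL 8, with invented rows and alias names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table sc (sid integer, cid integer, score integer)")
conn.executemany(
    "insert into sc values (?, ?, ?)",
    [(1, 1, 90), (2, 1, 80), (3, 1, 70), (4, 2, 60)],
)

rows = conn.execute(
    """
    select cid, score,
           -- default frame: the top row sees only itself, so row 2 is missing
           nth_value(score, 2) over (partition by cid order by score desc)
               as second_default,
           -- whole-partition frame: every row sees the second-highest score
           nth_value(score, 2) over (partition by cid order by score desc
               rows between unbounded preceding and unbounded following)
               as second_full
    from sc
    order by cid, score desc
    """
).fetchall()

for row in rows:
    print(row)
```

Class 2 has a single row, so there is no second row in either frame and both columns are NULL (None in Python).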

Application: get the id of the top-scoring classmate in each class

Ideas:

  1. Sort the scores within each class and get the top score
  2. Use that score to look up the top-scoring classmate's information
-- First, get the top score in each class
select *, 
nth_value(score,1) over(partition by cid order by score desc) as 1th_score 
from sc;


-- Then use that score to get the classmate's information
select sid, cid, score from (
    select *, 
    nth_value(score,1) over(partition by cid order by score desc) as 1th_score 
    from sc) t
where score=1th_score

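
A runnable sketch of the same two-step approach, using SQLite (3.25 or later) through Python's sqlite3 module as a stand-in for MySQL 8. The rows are invented, and the alias is written first_score because SQLite does not accept unquoted identifiers that start with a digit. The sample data includes a tie to show that matching on the score returns every classmate who shares the top score.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table sc (sid integer, cid integer, score integer)")
conn.executemany(
    "insert into sc values (?, ?, ?)",
    [(1, 1, 90), (2, 1, 90), (3, 1, 70), (4, 2, 60), (5, 2, 80)],
)

# Inner query: attach each class's top score to every row.
# Outer query: keep the rows whose own score equals that top score.
top_students = conn.execute(
    """
    select sid, cid, score from (
        select *,
               nth_value(score, 1) over (partition by cid order by score desc)
                   as first_score
        from sc
    ) t
    where score = first_score
    order by cid, sid
    """
).fetchall()

print(top_students)
```

In class 1 both sid 1 and sid 2 scored 90, so both are returned. If exactly one row per class is required, a tie-breaking rule has to be added.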



Origin blog.csdn.net/weixin_46599926/article/details/128277127