window function
When a query needs both row-level detail and aggregated values, plain aggregate functions become awkward, because GROUP BY collapses each group into a single row. A window function instead computes its result over a group of related rows (the window) while keeping every input row, so the same result row can carry both the original column values and the aggregated result.
Common application scenario: ranking the students in a class by their grades
Common window functions
The basic form of the window function
func_name(<parameter>)
OVER (
    [PARTITION BY <part_by_condition>]
    [ORDER BY <order_by_list> [ASC|DESC]]
    [ROWS BETWEEN <frame_start> AND <frame_end>]
)
For the explanation of specific fields, see my previous article: Basic usage of SQL window function and aggregation function
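As a rough illustration of the basic form, the sketch below shows an aggregate used as a window function, so that each row keeps its own score next to the aggregated value. It runs the SQL through Python's built-in sqlite3 module as a stand-in for MySQL 8 (this assumes the bundled SQLite is version 3.25 or newer, which added window functions). The rows in sc and the column names sid, cid and score are hypothetical sample data.

```python
import sqlite3

# Hypothetical sample data: sid = student id, cid = class id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sc (sid INTEGER, cid INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO sc VALUES (?, ?, ?)",
    [(1, 1, 90), (2, 1, 80), (3, 1, 70), (4, 2, 85), (5, 2, 60)],
)

rows = conn.execute(
    """
    SELECT sid, cid, score,
           -- PARTITION BY only: the window is the whole class
           sum(score) OVER (PARTITION BY cid) AS class_total,
           -- explicit frame: from the first row of the class up to the current row
           sum(score) OVER (
               PARTITION BY cid
               ORDER BY score DESC
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total
    FROM sc
    ORDER BY cid, score DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Every input row is still present in the output, which is the difference from a GROUP BY query that would return one row per cid.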
Distribution function
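percent_rank(): (rank - 1) / (rows - 1), where rank is the row's rank() value and rows is the total number of rows in the partition
The result always lies between 0 and 1, and the first row in the ordering gets 0
Application scenario: not commonly used
cume_dist(): (number of rows in the partition whose value is less than or equal to the current row's value) / (total number of rows in the partition)
Application scenario: find the proportion of rows whose salary is less than or equal to the current salary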
eg Sales table
select *,
rank() over(order by sales) as ranking,
percent_rank() over(order by sales) as percent_ranking,
cume_dist() over(order by sales) as cume
from Sales
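To make the two formulas concrete, here is a small sketch that runs the same query on made-up data. It uses Python's sqlite3 module in place of MySQL 8 (assuming SQLite 3.25 or newer); the table contents and the name column are hypothetical, since the article does not show the real Sales table.

```python
import sqlite3

# Hypothetical sample data with one tie (two rows with sales = 200).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (name TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [("a", 100), ("b", 200), ("c", 200), ("d", 300), ("e", 400)],
)

rows = conn.execute(
    """
    SELECT name, sales,
           rank()         OVER (ORDER BY sales) AS ranking,
           percent_rank() OVER (ORDER BY sales) AS percent_ranking,
           cume_dist()    OVER (ORDER BY sales) AS cume
    FROM Sales
    ORDER BY sales, name
    """
).fetchall()

for name, sales, ranking, percent_ranking, cume in rows:
    # percent_ranking = (ranking - 1) / (5 - 1)
    # cume            = rows with sales <= this row's sales, divided by 5
    print(name, sales, ranking, percent_ranking, cume)
```

With these five rows, the two tied rows share rank 2, so both get percent_rank (2 - 1) / 4 and cume_dist 3 / 5.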
head and tail function
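The head and tail functions first_value() and last_value() return the first or last value of the window frame within each partition, which in some cases is equivalent to min or max.
Note that when ORDER BY is specified without an explicit frame, the default frame ends at the current row (and its peers), so last_value() returns the value from the current row rather than from the last row of the partition. To get the last value of the whole partition, specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
Application scenario: query the dates of the earliest and the latest sales record in a department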
eg score sheet
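select *,
    -- no ORDER BY: the frame is the whole partition, but the row order is not guaranteed
    first_value(score) over(partition by cid) as first_unordered,
    -- ORDER BY score: the lowest score in each cid
    first_value(score) over(partition by cid order by score) as first_ordered,
    last_value(score) over(partition by cid) as last_unordered,
    -- default frame ends at the current row, so this returns the current row's score
    last_value(score) over(partition by cid order by score) as last_ordered
from sc;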
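The effect of the window frame on last_value() is easy to miss, so the following sketch compares the default frame with an explicit full-partition frame. It uses Python's sqlite3 module as a stand-in for MySQL 8 (assuming SQLite 3.25 or newer), and the rows in sc are hypothetical sample data with distinct scores.

```python
import sqlite3

# Hypothetical sample data: sid = student id, cid = class id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sc (sid INTEGER, cid INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO sc VALUES (?, ?, ?)",
    [(1, 1, 90), (2, 1, 80), (3, 1, 70), (4, 2, 85), (5, 2, 60)],
)

rows = conn.execute(
    """
    SELECT cid, score,
           first_value(score) OVER (PARTITION BY cid ORDER BY score) AS lowest,
           -- default frame: stops at the current row
           last_value(score)  OVER (PARTITION BY cid ORDER BY score) AS last_default,
           -- explicit frame: covers the whole partition
           last_value(score)  OVER (
               PARTITION BY cid
               ORDER BY score
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS highest
    FROM sc
    ORDER BY cid, score
    """
).fetchall()

for row in rows:
    print(row)
```

In the output, last_default simply repeats each row's own score, while highest is the same for every row of a class and matches max(score) for that class.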
Ntile function
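NTILE(n) divides the ordered rows in a partition into n groups that are as equal in size as possible, and returns the group number (from 1 to n) of each row. If the rows cannot be divided evenly, the earlier groups receive one extra row.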
NTILE(n)
OVER (
    [PARTITION BY <expression>[, <expression>...]]
    ORDER BY <expression> [ASC|DESC][, <expression> [ASC|DESC]...]
)
NTILE(n) indicates that the rows are divided into n groups
eg score sheet
- Divide students' grades into 2 and 3 groups
select *,
    ntile(2) over(order by score desc) as 2_tile,
    ntile(3) over(order by score desc) as 3_tile
from sc;
- Within each cid, divide the scores into 2 groups and 3 groups
select *,
ntile(2) over(partition by cid order by score desc) as 2_tile_group,
ntile(3) over(partition by cid order by score desc) as 3_tile_group
from sc;
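The sketch below shows how ntile() hands out group numbers when the row count does not divide evenly. It runs through Python's sqlite3 module instead of MySQL 8 (assuming SQLite 3.25 or newer). The five scores are hypothetical, and the aliases are written as tile_2 and tile_3 because SQLite does not accept unquoted identifiers that start with a digit.

```python
import sqlite3

# Hypothetical sample data: five students with distinct scores.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sc (sid INTEGER, cid INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO sc VALUES (?, ?, ?)",
    [(1, 1, 95), (2, 1, 88), (3, 1, 76), (4, 1, 64), (5, 1, 52)],
)

rows = conn.execute(
    """
    SELECT sid, score,
           ntile(2) OVER (ORDER BY score DESC) AS tile_2,
           ntile(3) OVER (ORDER BY score DESC) AS tile_3
    FROM sc
    ORDER BY score DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Five rows split into 2 groups gives sizes 3 and 2, and split into 3 groups gives sizes 2, 2 and 1, so the extra rows always land in the lower-numbered groups.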
Application: select the employees whose salary is in the top 50% of the employees table
Ideas:
- Sort the employees by salary and divide them into 2 groups
- Keep only the employees that fall into the higher-salary group
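-- First divide the employees into 2 groups by salary
select *,
    ntile(2) over(order by salary) as ranks
from employees;
-- Then keep the employees whose salary is in the top 50%
-- (salary is sorted ascending, so group 2 holds the higher salaries)
select * from (
    select *, ntile(2) over(order by salary) as ranks from employees
) t
where ranks = 2;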
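The subquery is needed because a window function cannot be referenced in the WHERE clause of the same query level. A minimal runnable sketch of the two steps follows, using Python's sqlite3 module in place of MySQL 8 (assuming SQLite 3.25 or newer). The employees rows and the emp_no column are hypothetical sample data.

```python
import sqlite3

# Hypothetical sample data: six employees with distinct salaries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_no INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, 1000), (2, 2000), (3, 3000), (4, 4000), (5, 5000), (6, 6000)],
)

rows = conn.execute(
    """
    SELECT emp_no, salary
    FROM (
        -- step 1: label every employee with group 1 (lower half) or 2 (upper half)
        SELECT *, ntile(2) OVER (ORDER BY salary) AS ranks
        FROM employees
    ) t
    -- step 2: filter on the label in the outer query
    WHERE ranks = 2
    ORDER BY salary
    """
).fetchall()

print(rows)
```

With an odd number of employees, group 1 would receive the extra row, so this query would return slightly fewer than half of the rows.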
nth_value function
NTH_VALUE(expression, N) returns the value of expression from the Nth row of the window frame. If row N does not exist, the function returns NULL.
N must be a positive integer such as 1, 2 or 3.
Application scenario: query the information of the Nth-ranked classmate
Basic form:
NTH_VALUE(expression, N)
OVER (
    [PARTITION BY <part_by_condition>]
    [ORDER BY <order_by_list> [ASC|DESC]]
    [ROWS BETWEEN <frame_start> AND <frame_end>]
)
NTH_VALUE(expression, N) returns the value of expression at row N
eg score sheet
-- With ORDER BY and the default frame, rows that come before the Nth row get NULL
select *,
    nth_value(score, 1) over(partition by cid order by score desc) as 1th,
    nth_value(score, 2) over(partition by cid order by score desc) as 2th,
    nth_value(score, 3) over(partition by cid order by score desc) as 3th
from sc;
-- Without ORDER BY, the frame is the whole partition, but the row order is not guaranteed
select *,
    nth_value(score, 1) over(partition by cid) as 1th,
    nth_value(score, 2) over(partition by cid) as 2th,
    nth_value(score, 3) over(partition by cid) as 3th
from sc;
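Because nth_value() only looks inside the window frame, its result depends on whether a frame is given. The sketch below compares the default frame with a full-partition frame, using Python's sqlite3 module as a stand-in for MySQL 8 (assuming SQLite 3.25 or newer). The sc rows are hypothetical, and the aliases second_default and second_full are illustrative names.

```python
import sqlite3

# Hypothetical sample data: class 1 has three students, class 2 has two.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sc (sid INTEGER, cid INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO sc VALUES (?, ?, ?)",
    [(1, 1, 90), (2, 1, 80), (3, 1, 70), (4, 2, 85), (5, 2, 60)],
)

rows = conn.execute(
    """
    SELECT cid, score,
           -- default frame: only rows up to the current row are visible
           nth_value(score, 2) OVER (
               PARTITION BY cid ORDER BY score DESC
           ) AS second_default,
           -- full frame: every row sees the whole partition
           nth_value(score, 2) OVER (
               PARTITION BY cid ORDER BY score DESC
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS second_full
    FROM sc
    ORDER BY cid, score DESC
    """
).fetchall()

for row in rows:
    print(row)
```

The top row of each class shows NULL (None in Python) for second_default, because its frame contains only one row, while second_full reports the second-highest score on every row.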
Application: get the id of the top-scoring classmate in each class
Ideas:
- First sort the scores within each class and get the top score
- Then use that score to get the information of the top-scoring classmate
-- First get the top score of each class
select *,
    nth_value(score,1) over(partition by cid order by score desc) as 1th_score
from sc;
-- Then get the classmate's information based on that score
select sid, cid, score from (
    select *,
        nth_value(score,1) over(partition by cid order by score desc) as 1th_score
    from sc) t
where score=1th_score;
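The two steps can be tried end to end with the sketch below, which again uses Python's sqlite3 module in place of MySQL 8 (assuming SQLite 3.25 or newer). The sc rows are hypothetical and deliberately include a tie in class 2, and the alias is written as top_score because SQLite does not accept unquoted identifiers that start with a digit.

```python
import sqlite3

# Hypothetical sample data: students 3 and 5 tie for the top score in class 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sc (sid INTEGER, cid INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO sc VALUES (?, ?, ?)",
    [(1, 1, 90), (2, 1, 80), (3, 2, 85), (4, 2, 60), (5, 2, 85)],
)

rows = conn.execute(
    """
    SELECT sid, cid, score
    FROM (
        -- step 1: attach the top score of the class to every row
        SELECT *,
               nth_value(score, 1) OVER (
                   PARTITION BY cid ORDER BY score DESC
               ) AS top_score
        FROM sc
    ) t
    -- step 2: keep the rows whose own score equals the top score
    WHERE score = top_score
    ORDER BY cid, sid
    """
).fetchall()

print(rows)
```

Because the filter compares scores, every classmate who shares the top score is returned, so a class with a tie yields more than one row.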