Using MySQL window functions

Window functions are a new feature introduced in MySQL 8.0.

They can solve many ranking-related problems.

One, create a table

create table student (
  id int,
  st_name varchar(20),
  score int,
  date_time varchar(20),
  primary key (id, date_time)
);

Two, insert data

insert  into student value(101,'张飞',62,'2017');
insert  into student value(101,'张飞',30,'2007');
insert  into student value(101,'张飞',82,'2020');
insert  into student value(101,'张飞',62,'2015');
insert  into student value(101,'张飞',62,'2019');
insert  into student value(102,'刘备',92,'2017');
insert  into student value(102,'刘备',72,'2010');
insert  into student value(102,'刘备',62,'2011');
insert  into student value(102,'刘备',98,'2020');
insert  into student value(103,'孙悟空',98,'2010');
insert  into student value(103,'孙悟空',98,'2011');
insert  into student value(103,'孙悟空',98,'2012');
insert  into student value(103,'孙悟空',97,'2013');
insert  into student value(103,'孙悟空',98,'2014');
insert  into student value(103,'孙悟空',99,'2015');
insert  into student value(103,'孙悟空',100,'2018');
insert  into student value(104,'哪吒',10,'2010');
insert  into student value(104,'哪吒',20,'2011');
insert  into student value(104,'哪吒',30,'2012');
insert  into student value(104,'哪吒',40,'2013');
insert  into student value(104,'哪吒',50,'2014');
insert  into student value(104,'哪吒',40,'2015');
insert  into student value(104,'哪吒',50,'2016');

Three, window functions

Usage: window_function() over (partition by grouping_column order by sorting_column)

The functions covered here fall mainly into 4 categories:

(1) Ranking functions: rank(), dense_rank(), row_number()

  • select *, rank() over (partition by id order by score) as ranking from student;
id, st_name, score, date_time, ranking
101, 张飞, 30, 2007, 1
101, 张飞, 62, 2015, 2
101, 张飞, 62, 2017, 2
101, 张飞, 62, 2019, 2
101, 张飞, 82, 2020, 5
102, 刘备, 62, 2011, 1
102, 刘备, 72, 2010, 2
102, 刘备, 92, 2017, 3
102, 刘备, 98, 2020, 4
103, 孙悟空, 97, 2013, 1
103, 孙悟空, 98, 2010, 2
103, 孙悟空, 98, 2011, 2
103, 孙悟空, 98, 2012, 2
103, 孙悟空, 98, 2014, 2
103, 孙悟空, 99, 2015, 6
103, 孙悟空, 100, 2018, 7
104, 哪吒, 10, 2010, 1
104, 哪吒, 20, 2011, 2
104, 哪吒, 30, 2012, 3
104, 哪吒, 40, 2013, 4
104, 哪吒, 40, 2015, 4
104, 哪吒, 50, 2014, 6
104, 哪吒, 50, 2016, 6

  • select *, dense_rank() over (partition by id order by score) as ranking from student;
id, st_name, score, date_time, ranking
101, 张飞, 30, 2007, 1
101, 张飞, 62, 2015, 2
101, 张飞, 62, 2017, 2
101, 张飞, 62, 2019, 2
101, 张飞, 82, 2020, 3
102, 刘备, 62, 2011, 1
102, 刘备, 72, 2010, 2
102, 刘备, 92, 2017, 3
102, 刘备, 98, 2020, 4
103, 孙悟空, 97, 2013, 1
103, 孙悟空, 98, 2010, 2
103, 孙悟空, 98, 2011, 2
103, 孙悟空, 98, 2012, 2
103, 孙悟空, 98, 2014, 2
103, 孙悟空, 99, 2015, 3
103, 孙悟空, 100, 2018, 4
104, 哪吒, 10, 2010, 1
104, 哪吒, 20, 2011, 2
104, 哪吒, 30, 2012, 3
104, 哪吒, 40, 2013, 4
104, 哪吒, 40, 2015, 4
104, 哪吒, 50, 2014, 5
104, 哪吒, 50, 2016, 5

  • select *, row_number() over (partition by id order by score) as ranking from student;
id, st_name, score, date_time, ranking
101, 张飞, 30, 2007, 1
101, 张飞, 62, 2015, 2
101, 张飞, 62, 2017, 3
101, 张飞, 62, 2019, 4
101, 张飞, 82, 2020, 5
102, 刘备, 62, 2011, 1
102, 刘备, 72, 2010, 2
102, 刘备, 92, 2017, 3
102, 刘备, 98, 2020, 4
103, 孙悟空, 97, 2013, 1
103, 孙悟空, 98, 2010, 2
103, 孙悟空, 98, 2011, 3
103, 孙悟空, 98, 2012, 4
103, 孙悟空, 98, 2014, 5
103, 孙悟空, 99, 2015, 6
103, 孙悟空, 100, 2018, 7
104, 哪吒, 10, 2010, 1
104, 哪吒, 20, 2011, 2
104, 哪吒, 30, 2012, 3
104, 哪吒, 40, 2013, 4
104, 哪吒, 40, 2015, 5
104, 哪吒, 50, 2014, 6
104, 哪吒, 50, 2016, 7

Comparing the three result sets shows that, when ranking by score, the functions differ only in how they treat rows with the same score.

rank: tied rows share the same rank, and each tied row still uses up a rank position, so the next distinct value skips ahead (1, 2, 2, 2, 5).

dense_rank: tied rows share the same rank, but the tie does not use up extra positions, so the next distinct value is always the previous rank plus 1 (1, 2, 2, 2, 3).

row_number: there are no ties; every row gets its own consecutive number (1, 2, 3, 4, 5). The order among rows with equal scores is not guaranteed unless the order by clause includes a tiebreaker column.
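The difference between the three numbering schemes can be sketched outside the database. The Python below is only an illustration of the rules just described, not how MySQL implements them; the helper name ranking_columns is hypothetical, and the input is Zhang Fei's five scores from the table above.

```python
def ranking_columns(scores):
    """Return (score, rank, dense_rank, row_number) for one partition,
    ordered by score ascending. Illustrative helper, not a MySQL API."""
    ordered = sorted(scores)
    rows = []
    rank = dense = 0
    previous = None
    for position, score in enumerate(ordered, start=1):
        if score != previous:
            rank = position   # rank() jumps to the row position after a tie
            dense += 1        # dense_rank() moves up by exactly one
        # position itself plays the role of row_number()
        rows.append((score, rank, dense, position))
        previous = score
    return rows


zhang_fei = [62, 30, 82, 62, 62]
for row in ranking_columns(zhang_fei):
    print(row)
# (30, 1, 1, 1)
# (62, 2, 2, 2)
# (62, 2, 2, 3)
# (62, 2, 2, 4)
# (82, 5, 3, 5)
```

The three 62s share rank 2 in the second and third columns, while the last column keeps counting.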

(2) Distribution functions: percent_rank(), cume_dist()

  • select *, percent_rank() over (partition by id order by score) as ranking from student;
id, st_name, score, date_time, ranking
101, 张飞, 30, 2007, 0
101, 张飞, 62, 2015, 0.25
101, 张飞, 62, 2017, 0.25
101, 张飞, 62, 2019, 0.25
101, 张飞, 82, 2020, 1
102, 刘备, 62, 2011, 0
102, 刘备, 72, 2010, 0.3333333333333333
102, 刘备, 92, 2017, 0.6666666666666666
102, 刘备, 98, 2020, 1
103, 孙悟空, 97, 2013, 0
103, 孙悟空, 98, 2010, 0.16666666666666666
103, 孙悟空, 98, 2011, 0.16666666666666666
103, 孙悟空, 98, 2012, 0.16666666666666666
103, 孙悟空, 98, 2014, 0.16666666666666666
103, 孙悟空, 99, 2015, 0.8333333333333334
103, 孙悟空, 100, 2018, 1
104, 哪吒, 10, 2010, 0
104, 哪吒, 20, 2011, 0.16666666666666666
104, 哪吒, 30, 2012, 0.3333333333333333
104, 哪吒, 40, 2013, 0.5
104, 哪吒, 40, 2015, 0.5
104, 哪吒, 50, 2014, 0.8333333333333334
104, 哪吒, 50, 2016, 0.8333333333333334

How is this value calculated? The formula is: (rank - 1) / (rows - 1)

Here rank is the value that the rank() function returns for the row, and rows is the number of rows in the row's partition.

Take Zhang Fei (张飞, id 101) as an example. His partition has 5 rows, so rows - 1 = 4:

Row 1: rank = 1, so the value is (1 - 1) / 4 = 0

Row 2: rank = 2, so the value is (2 - 1) / 4 = 0.25

Row 3: rank = 2, so the value is (2 - 1) / 4 = 0.25

Row 4: rank = 2, so the value is (2 - 1) / 4 = 0.25

Row 5: rank = 5, so the value is (5 - 1) / 4 = 1
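The same arithmetic can be checked with a few lines of Python. This is a sketch of the formula only; the function name percent_rank_rows is hypothetical, and giving a one-row partition the value 0 is an assumption made here to avoid dividing by zero.

```python
def percent_rank_rows(scores):
    """Return (score, percent_rank) pairs for one partition ordered by score.
    Sketch of (rank - 1) / (rows - 1); not a MySQL API."""
    ordered = sorted(scores)
    total = len(ordered)
    result = []
    for score in ordered:
        # rank() is 1 plus the number of rows that sort strictly before this score
        rank = ordered.index(score) + 1
        value = (rank - 1) / (total - 1) if total > 1 else 0.0
        result.append((score, value))
    return result


print(percent_rank_rows([62, 30, 82, 62, 62]))
# [(30, 0.0), (62, 0.25), (62, 0.25), (62, 0.25), (82, 1.0)]
```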

  • select *, cume_dist() over (partition by id order by score) as ranking from student;
id, st_name, score, date_time, ranking
101, 张飞, 30, 2007, 0.2
101, 张飞, 62, 2015, 0.8
101, 张飞, 62, 2017, 0.8
101, 张飞, 62, 2019, 0.8
101, 张飞, 82, 2020, 1
102, 刘备, 62, 2011, 0.25
102, 刘备, 72, 2010, 0.5
102, 刘备, 92, 2017, 0.75
102, 刘备, 98, 2020, 1
103, 孙悟空, 97, 2013, 0.14285714285714285
103, 孙悟空, 98, 2010, 0.7142857142857143
103, 孙悟空, 98, 2011, 0.7142857142857143
103, 孙悟空, 98, 2012, 0.7142857142857143
103, 孙悟空, 98, 2014, 0.7142857142857143
103, 孙悟空, 99, 2015, 0.8571428571428571
103, 孙悟空, 100, 2018, 1
104, 哪吒, 10, 2010, 0.14285714285714285
104, 哪吒, 20, 2011, 0.2857142857142857
104, 哪吒, 30, 2012, 0.42857142857142855
104, 哪吒, 40, 2013, 0.7142857142857143
104, 哪吒, 40, 2015, 0.7142857142857143
104, 哪吒, 50, 2014, 1
104, 哪吒, 50, 2016, 1

How is this value calculated? The formula is: rank_rows / rows

rank_rows: the number of rows in the partition whose rank is less than or equal to the current row's rank (equivalently, whose score is less than or equal to the current row's score)

rows: the total number of rows in the partition

Take Zhang Fei (张飞, id 101) as an example again:

Row 1: rank_rows = 1 and rows = 5, so the result is 1 / 5 = 0.2

Row 2: rank_rows = 4 and rows = 5, so the result is 4 / 5 = 0.8

Row 3: rank_rows = 4 and rows = 5, so the result is 4 / 5 = 0.8

Row 4: rank_rows = 4 and rows = 5, so the result is 4 / 5 = 0.8

Row 5: rank_rows = 5 and rows = 5, so the result is 5 / 5 = 1
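A short Python sketch of the same calculation, using Zhang Fei's scores. The function name cume_dist_rows is hypothetical; bisect_right from the standard library is used here simply as a convenient way to count the rows whose score is less than or equal to the current one.

```python
from bisect import bisect_right


def cume_dist_rows(scores):
    """Return (score, cume_dist) pairs for one partition ordered by score.
    Sketch of rank_rows / rows; not a MySQL API."""
    ordered = sorted(scores)
    total = len(ordered)
    # bisect_right gives the count of rows with a score <= the current score
    return [(score, bisect_right(ordered, score) / total) for score in ordered]


print(cume_dist_rows([62, 30, 82, 62, 62]))
# [(30, 0.2), (62, 0.8), (62, 0.8), (62, 0.8), (82, 1.0)]
```

Unlike percent_rank(), the result here is never 0, because the current row always counts toward rank_rows.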

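(3) Preceding and following row functions: lag(), lead()

lag(score, 1) returns the score of the row one position before the current row in the partition, and lead(score, 1) returns the score of the row one position after it. When no such row exists the result is NULL, shown as an empty value below.

  • select *, lag(score, 1) over (partition by id order by score) as prev_score from student;
id, st_name, score, date_time, prev_score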
101, 张飞, 30, 2007, 
101, 张飞, 62, 2015, 30
101, 张飞, 62, 2017, 62
101, 张飞, 62, 2019, 62
101, 张飞, 82, 2020, 62
102, 刘备, 62, 2011, 
102, 刘备, 72, 2010, 62
102, 刘备, 92, 2017, 72
102, 刘备, 98, 2020, 92
103, 孙悟空, 97, 2013, 
103, 孙悟空, 98, 2010, 97
103, 孙悟空, 98, 2011, 98
103, 孙悟空, 98, 2012, 98
103, 孙悟空, 98, 2014, 98
103, 孙悟空, 99, 2015, 98
103, 孙悟空, 100, 2018, 99
104, 哪吒, 10, 2010, 
104, 哪吒, 20, 2011, 10
104, 哪吒, 30, 2012, 20
104, 哪吒, 40, 2013, 30
104, 哪吒, 40, 2015, 40
104, 哪吒, 50, 2014, 40
104, 哪吒, 50, 2016, 50

  • select *, lead(score, 1) over (partition by id order by score) as next_score from student;
id, st_name, score, date_time, next_score
101, 张飞, 30, 2007, 62
101, 张飞, 62, 2015, 62
101, 张飞, 62, 2017, 62
101, 张飞, 62, 2019, 82
101, 张飞, 82, 2020, 
102, 刘备, 62, 2011, 72
102, 刘备, 72, 2010, 92
102, 刘备, 92, 2017, 98
102, 刘备, 98, 2020, 
103, 孙悟空, 97, 2013, 98
103, 孙悟空, 98, 2010, 98
103, 孙悟空, 98, 2011, 98
103, 孙悟空, 98, 2012, 98
103, 孙悟空, 98, 2014, 99
103, 孙悟空, 99, 2015, 100
103, 孙悟空, 100, 2018, 
104, 哪吒, 10, 2010, 20
104, 哪吒, 20, 2011, 30
104, 哪吒, 30, 2012, 40
104, 哪吒, 40, 2013, 40
104, 哪吒, 40, 2015, 50
104, 哪吒, 50, 2014, 50
104, 哪吒, 50, 2016, 
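The shifting that lag() and lead() perform on an ordered partition can be pictured as follows. This is a Python sketch under the assumption that the list is already sorted the way the order by clause would sort it; the function names mirror the SQL functions for readability only, and None stands in for NULL.

```python
def lag(values, offset=1, default=None):
    """Value from `offset` rows earlier in the ordered partition, else default."""
    return [values[i - offset] if i - offset >= 0 else default
            for i in range(len(values))]


def lead(values, offset=1, default=None):
    """Value from `offset` rows later in the ordered partition, else default."""
    return [values[i + offset] if i + offset < len(values) else default
            for i in range(len(values))]


ordered_scores = [30, 62, 62, 62, 82]   # Zhang Fei, ordered by score
print(lag(ordered_scores))    # [None, 30, 62, 62, 62]
print(lead(ordered_scores))   # [62, 62, 62, 82, None]
```

A typical use is comparing each row with its neighbour, for example subtracting the lagged score from the current score to get the change between consecutive rows.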

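(4) Head and tail functions: first_value(), last_value()

first_value(score) returns the score of the first row in the current window frame, and last_value(score) returns the score of the last row in it. With an order by clause and no explicit frame, the frame runs from the start of the partition up to the current row and any rows tied with it, which is why last_value(score) below returns the current row's own score rather than the highest score in the partition.

  • select *, first_value(score) over (partition by id order by score) as first_score from student;
id, st_name, score, date_time, first_score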
101, 张飞, 30, 2007, 30
101, 张飞, 62, 2015, 30
101, 张飞, 62, 2017, 30
101, 张飞, 62, 2019, 30
101, 张飞, 82, 2020, 30
102, 刘备, 62, 2011, 62
102, 刘备, 72, 2010, 62
102, 刘备, 92, 2017, 62
102, 刘备, 98, 2020, 62
103, 孙悟空, 97, 2013, 97
103, 孙悟空, 98, 2010, 97
103, 孙悟空, 98, 2011, 97
103, 孙悟空, 98, 2012, 97
103, 孙悟空, 98, 2014, 97
103, 孙悟空, 99, 2015, 97
103, 孙悟空, 100, 2018, 97
104, 哪吒, 10, 2010, 10
104, 哪吒, 20, 2011, 10
104, 哪吒, 30, 2012, 10
104, 哪吒, 40, 2013, 10
104, 哪吒, 40, 2015, 10
104, 哪吒, 50, 2014, 10
104, 哪吒, 50, 2016, 10

  • select *, last_value(score) over (partition by id order by score) as last_score from student;
id, st_name, score, date_time, last_score
101, 张飞, 30, 2007, 30
101, 张飞, 62, 2015, 62
101, 张飞, 62, 2017, 62
101, 张飞, 62, 2019, 62
101, 张飞, 82, 2020, 82
102, 刘备, 62, 2011, 62
102, 刘备, 72, 2010, 72
102, 刘备, 92, 2017, 92
102, 刘备, 98, 2020, 98
103, 孙悟空, 97, 2013, 97
103, 孙悟空, 98, 2010, 98
103, 孙悟空, 98, 2011, 98
103, 孙悟空, 98, 2012, 98
103, 孙悟空, 98, 2014, 98
103, 孙悟空, 99, 2015, 99
103, 孙悟空, 100, 2018, 100
104, 哪吒, 10, 2010, 10
104, 哪吒, 20, 2011, 20
104, 哪吒, 30, 2012, 30
104, 哪吒, 40, 2013, 40
104, 哪吒, 40, 2015, 40
104, 哪吒, 50, 2014, 50
104, 哪吒, 50, 2016, 50
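The last_value() column above simply repeats each row's own score. That follows from the default window frame: when order by is present and no frame is given, the frame covers the rows from the start of the partition through the current row and its ties. The Python sketch below imitates that frame and contrasts it with a frame spanning the whole partition, which in MySQL can be requested by adding rows between unbounded preceding and unbounded following to the over clause. Both function names are hypothetical.

```python
from bisect import bisect_right


def first_last_default_frame(scores):
    """(score, first_value, last_value) with the default frame:
    partition start through the current row and its ties."""
    ordered = sorted(scores)
    rows = []
    for score in ordered:
        frame = ordered[:bisect_right(ordered, score)]
        rows.append((score, frame[0], frame[-1]))
    return rows


def first_last_whole_partition(scores):
    """(score, first_value, last_value) when the frame is the whole partition."""
    ordered = sorted(scores)
    return [(score, ordered[0], ordered[-1]) for score in ordered]


zhang_fei = [62, 30, 82, 62, 62]
print(first_last_default_frame(zhang_fei))
# [(30, 30, 30), (62, 30, 62), (62, 30, 62), (62, 30, 62), (82, 30, 82)]
print(first_last_whole_partition(zhang_fei))
# [(30, 30, 82), (62, 30, 82), (62, 30, 82), (62, 30, 82), (82, 30, 82)]
```

With the whole-partition frame, the last value is 82 on every row, which is usually what is wanted when looking for the highest score per student.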

Reference: https://blog.csdn.net/weixin_39010770/article/details/87862407

Origin blog.csdn.net/zhou_438/article/details/108510339