Hive排名函数

v2-4b22527985384cedba02ae95e6d244cc_250x0

排名函数

第一种函数:row_number从1开始,按照顺序,生成分组内记录的序列,row_number()的值不会存在重复,当排序的值相同时,按照表中记录的顺序进行排列

效果如下:
98		1
97		2
97		3
96		4
95		5
95		6

没有并列名次情况,顺序递增

第二种函数:RANK() 生成数据项在分组中的排名,排名相等会在名次中留下空位

效果如下:
98		1
97		2
97		2
96		4
95		5
95		5
94		7

有并列名次情况,顺序跳跃递增

第三种函数:DENSE(但是)_RANK() 生成数据项在分组中的排名,排名相等会在名次中不会留下空位

效果如下:
98		1
97		2
97		2
96		3
95		4
95		4
94		5

有并列名次情况,顺序递增

准备数据

userid	classno	score
1 qf1808 80   
2 qf1808 92
3 qf1808 84
4 qf1808 86
5 qf1808 88
6 qf1808 70
7 qf1808 98
8 qf1808 84
9 qf1808 86
10 qf1807 90
11 qf1807 92
12 qf1807 84
13 qf1807 86
14 qf1807 88
15 qf1807 80
16 qf1807 92
17 qf1807 84
18 qf1807 86
19 qf1805 80
20 qf1805 92
21 qf1805 94
22 qf1805 86
23 qf1805 88
24 qf1805 80
25 qf1805 92
26 qf1805 94
27 qf1805 86
create table if not exists stu_score(
userid int,
classno string,
score int
)
row format delimited 
fields terminated by ' ';

load data local inpath './data/stu_score.txt' overwrite into table stu_score;

需求1:对每次考试按照考试成绩倒序

select *,
row_number() over(partition by classno order by score desc) rn1
from stu_score;

select *,
rank() over(partition by classno order by score desc) rn2
from stu_score;

select *,
dense_rank() over(distribute by classno sort by score desc) rn3
from stu_score;

select *,
dense_rank() over(order by score desc) `全年级排名`
from stu_score;

需求2:获取每次考试的排名情况

select *,
-- 没有并列,相同名次依顺序排
row_number() over(distribute by classno sort by score desc) rn1,
-- rank():有并列,相同名次空位
rank() over(distribute by classno sort by score desc) rn2,
-- dense_rank():有并列,相同名次不空位
dense_rank() over(distribute by classno sort by score desc) rn3
from stu_score;

需求3:求每个班级的前三名

select *
from
(
select *,
row_number() over(partition by classno order by score desc) rn1
from stu_score
) A 
where rn1 < 4;

更多大数据精彩内容欢迎B站搜索“千锋教育”或者扫码领取全套资料   

【千锋教育】大数据开发全套教程,史上最全面的大数据学习视频

猜你喜欢

转载自blog.csdn.net/longz_org_cn/article/details/131822273