Ant Forest (Hive) Query Practice

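Ant Forest background:

The following table records each user's daily Ant Forest low-carbon collection log.
table_name: user_low_carbon
user_id data_dt low_carbon
user    date    carbon emissions reduced (g)

The Ant Forest plant exchange table records the carbon-emission reduction required to claim each eco-friendly plant.
table_name: plant_carbon
plant_id plant_name low_carbon
plant ID plant name carbon required to claim the plant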

1. Create the tables

create table user_low_carbon(
user_id string,data_dt string,low_carbon int)
row format delimited fields terminated by '\t';

create table plant_carbon(
plant_id string,plant_name string,low_carbon int)
row format delimited fields terminated by '\t';

2. Load the data
tree.txt

p001	梭梭树	17
p002	沙柳	19
p003	樟子树	146
p004	胡杨	215

zfb.txt

u_001	2017/1/1	10
u_001	2017/1/2	150
u_001	2017/1/2	110
u_001	2017/1/2	10
u_001	2017/1/4	50
u_001	2017/1/4	10
u_001	2017/1/6	45
u_001	2017/1/6	90
u_002	2017/1/1	10
u_002	2017/1/2	150
u_002	2017/1/2	70
u_002	2017/1/3	30
u_002	2017/1/3	80
u_002	2017/1/4	150
u_002	2017/1/5	101
u_002	2017/1/6	68
u_003	2017/1/1	20
u_003	2017/1/2	10
u_003	2017/1/2	150
u_003	2017/1/3	160
u_003	2017/1/4	20
u_003	2017/1/5	120
u_003	2017/1/6	20
u_003	2017/1/7	10
u_003	2017/1/7	110
u_004	2017/1/1	110
u_004	2017/1/2	20
u_004	2017/1/2	50
u_004	2017/1/3	120
u_004	2017/1/4	30
u_004	2017/1/5	60
u_004	2017/1/6	120
u_004	2017/1/7	10
u_004	2017/1/7	120
u_005	2017/1/1	80
u_005	2017/1/2	50
u_005	2017/1/2	80
u_005	2017/1/3	180
u_005	2017/1/4	180
u_005	2017/1/4	10
u_005	2017/1/5	80
u_005	2017/1/6	280
u_005	2017/1/7	80
u_005	2017/1/7	80
u_006	2017/1/1	40
u_006	2017/1/2	40
u_006	2017/1/2	140
u_006	2017/1/3	210
u_006	2017/1/3	10
u_006	2017/1/4	40
u_006	2017/1/5	40
u_006	2017/1/6	20
u_006	2017/1/7	50
u_006	2017/1/7	240
u_007	2017/1/1	130
u_007	2017/1/2	30
u_007	2017/1/2	330
u_007	2017/1/3	30
u_007	2017/1/4	530
u_007	2017/1/5	30
u_007	2017/1/6	230
u_007	2017/1/7	130
u_007	2017/1/7	30
u_008	2017/1/1	160
u_008	2017/1/2	60
u_008	2017/1/2	60
u_008	2017/1/3	60
u_008	2017/1/4	260
u_008	2017/1/5	360
u_008	2017/1/6	160
u_008	2017/1/7	60
u_008	2017/1/7	60
u_009	2017/1/1	70
u_009	2017/1/2	70
u_009	2017/1/2	70
u_009	2017/1/3	170
u_009	2017/1/4	270
u_009	2017/1/5	70
u_009	2017/1/6	70
u_009	2017/1/7	70
u_009	2017/1/7	70
u_010	2017/1/1	90
u_010	2017/1/2	90
u_010	2017/1/2	90
u_010	2017/1/3	90
u_010	2017/1/4	90
u_010	2017/1/4	80
u_010	2017/1/5	90
u_010	2017/1/5	90
u_010	2017/1/6	190
u_010	2017/1/7	90
u_010	2017/1/7	90
u_011	2017/1/1	110
u_011	2017/1/2	100
u_011	2017/1/2	100
u_011	2017/1/3	120
u_011	2017/1/4	100
u_011	2017/1/5	100
u_011	2017/1/6	100
u_011	2017/1/7	130
u_011	2017/1/7	100
u_012	2017/1/1	10
u_012	2017/1/2	120
u_012	2017/1/2	10
u_012	2017/1/3	10
u_012	2017/1/4	50
u_012	2017/1/5	10
u_012	2017/1/6	20
u_012	2017/1/7	10
u_012	2017/1/7	10
u_013	2017/1/1	50
u_013	2017/1/2	150
u_013	2017/1/2	50
u_013	2017/1/3	150
u_013	2017/1/4	550
u_013	2017/1/5	350
u_013	2017/1/6	50
u_013	2017/1/7	20
u_013	2017/1/7	60
u_014	2017/1/1	220
u_014	2017/1/2	120
u_014	2017/1/2	20
u_014	2017/1/3	20
u_014	2017/1/4	20
u_014	2017/1/5	250
u_014	2017/1/6	120
u_014	2017/1/7	270
u_014	2017/1/7	20
u_015	2017/1/1	10
u_015	2017/1/2	20
u_015	2017/1/2	10
u_015	2017/1/3	10
u_015	2017/1/4	20
u_015	2017/1/5	70
u_015	2017/1/6	10
u_015	2017/1/7	80
u_015	2017/1/7	60
load data local inpath "/home/hdfs/zfb.txt" into table user_low_carbon;
load data local inpath "/home/hdfs/tree.txt" into table plant_carbon;
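
A quick sanity check after loading can catch a wrong path or delimiter early. The sketch below only counts rows; the expected counts come from the two file listings above and assume both tables were empty before the load.

```sql
-- Row counts should match the listings above if the load succeeded
select count(*) from user_low_carbon;  -- zfb.txt as listed has 137 rows
select count(*) from plant_carbon;     -- tree.txt as listed has 4 rows
```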

3. Enable local mode
set hive.exec.mode.local.auto=true;

Exercises

(1) Compute each user's total carbon-emission reduction before October 1, 2017 (top 11)

select
user_id,
sum(low_carbon) lc
from user_low_carbon
where date_format(regexp_replace(data_dt,"/","-"),"yyyy-MM")<"2017-10"
group by user_id
order by lc desc
limit 11;

Result

user_id lc
u_007   1470
u_013   1430
u_008   1240
u_005   1100
u_010   1080
u_014   1060
u_011   960
u_009   930
u_006   830
u_002   659
u_004   640
Time taken: 10.984 seconds, Fetched: 11 row(s)
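
The filter in (1) relies on date_format accepting the non-padded string produced by regexp_replace (for example 2017-1-2). A possible alternative, shown here only as a sketch, parses the yyyy/M/d strings explicitly with unix_timestamp and compares zero-padded dates. It assumes every data_dt value follows the yyyy/M/d layout seen in zfb.txt.

```sql
-- Sketch: parse data_dt with an explicit pattern, then compare yyyy-MM-dd strings
select
  user_id,
  sum(low_carbon) lc
from user_low_carbon
where from_unixtime(unix_timestamp(data_dt, 'yyyy/M/d'), 'yyyy-MM-dd') < '2017-10-01'
group by user_id
order by lc desc
limit 11;
```

Every row in the sample data falls in January 2017, so this filter keeps the same rows as the original one.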

(2) Get the requirement for claiming a Euphrates poplar (胡杨)
plant_carbon
plant_id plant_name low_carbon
plant ID plant name carbon required to claim the plant

select
low_carbon
from plant_carbon
where plant_id="p004";
(3) Get the requirement for claiming a sand willow (沙柳)

select
low_carbon
from plant_carbon
where plant_id="p002";

(4) Compute how many sand willows (沙柳) each user can claim

select
user_id,
floor(t1.lc/(select
low_carbon lc
from plant_carbon
where plant_id="p002")) num
from
(
select
user_id,
sum(low_carbon) lc
from user_low_carbon
group by user_id
) t1
;

Result

user_id num
u_001   25
u_002   34
u_003   32
u_004   33
u_005   57
u_006   43
u_007   77
u_008   65
u_009   48
u_010   56
u_011   50
u_012   13
u_013   75
u_014   55
u_015   15
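
The query in (4) puts a scalar subquery inside the select list, which not every Hive version accepts. A join-based variant is sketched below as one possible alternative. It assumes plant_id "p002" matches exactly one row, and that cross joins are allowed in your session (strict mode can reject Cartesian products).

```sql
-- Sketch: divide each user's total by the single p002 threshold via a cross join
select
  t1.user_id,
  floor(t1.lc / t2.low_carbon) num
from
(
  -- total carbon reduction per user
  select user_id, sum(low_carbon) lc
  from user_low_carbon
  group by user_id
) t1
cross join
(
  -- one-row table holding the sand willow threshold (19 in tree.txt)
  select low_carbon
  from plant_carbon
  where plant_id = 'p002'
) t2;
```

As a check against the result above: u_001 has a total of 475, and 475 / 19 = 25 exactly, which matches the first row.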

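(5) Compute how many more trees each user's predecessor in the ranking has
The query ranks users by num in descending order and reports num-bef, where bef is the previous user's count, so the gaps appear as negative numbers.

select
user_id,
num,
num-bef
from
(
select
user_id,
num,
lag(num,1,0) over(order by num desc ) bef
from
(
select
user_id,
floor(t1.lc/(select
low_carbon lc
from plant_carbon
where plant_id="p002")) num
from
(
select
user_id,
sum(low_carbon) lc
from user_low_carbon
group by user_id
) t1
) t2
) t3;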

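Result (the last column is num-bef; the first row has no predecessor, so lag returns its default 0 and the value equals num)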

user_id num     _c2
u_007   77      77
u_013   75      -2
u_008   65      -10
u_005   57      -8
u_010   56      -1
u_014   55      -1
u_011   50      -5
u_009   48      -2
u_006   43      -5
u_002   34      -9
u_004   33      -1
u_003   32      -1
u_001   25      -7
u_015   15      -10
u_012   13      -2
Time taken: 138.272 seconds, Fetched: 15 row(s)
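
If positive gaps are preferred, one option is to compare each user with the next-ranked user by using lead instead of lag. The sketch below is an assumption-laden variant, not the original author's query: tree_num and diff are illustrative names, and it uses a cross join in place of the scalar subquery.

```sql
-- Sketch: positive gap to the next-ranked user, using lead() instead of lag()
-- tree_num and diff are illustrative names, not from the original post
with tree_num as (
  select
    t1.user_id,
    floor(t1.lc / t2.low_carbon) num
  from
  (
    select user_id, sum(low_carbon) lc
    from user_low_carbon
    group by user_id
  ) t1
  cross join
  (
    select low_carbon
    from plant_carbon
    where plant_id = 'p002'
  ) t2
)
select
  user_id,
  num,
  -- lead(num, 1) has no default, so diff is NULL on the last-ranked row
  num - lead(num, 1) over (order by num desc) diff
from tree_num;
```

With the counts shown above, the top row would report 77 - 75 = 2 for u_007.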

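Notes:
Prerequisite functions:
datediff: returns the number of days between two dates
regexp_replace: replaces the parts of a string that match a regular expression
to_date: converts a string to a date
date_sub: subtracts a number of days from a date

round: rounds to the nearest value
floor: rounds down
ceil: rounds up

substring: extracts part of a string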
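
The functions listed above can be tried on literal values. The sketch below assumes a Hive version that allows select without a from clause; the literals are made up for illustration.

```sql
-- Sketch: each listed function applied to a literal
select
  datediff('2017-01-07', '2017-01-01'),   -- 6 days between the two dates
  regexp_replace('2017/1/7', '/', '-'),   -- '2017-1-7'
  to_date('2017-01-07 10:30:00'),         -- date part only: 2017-01-07
  date_sub('2017-01-07', 1),              -- one day earlier: 2017-01-06
  round(2.4),                             -- rounds to 2
  floor(2.5),                             -- 2
  ceil(2.5),                              -- 3
  substring('2017-01-07', 1, 7);          -- '2017-01' (positions start at 1)
```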


Reposted from blog.csdn.net/qq_42706464/article/details/108276238