Hive (7): A Hard Interview Question

Find the monthly visit count and the running total visit count

  • 1. Data description
    Fields:
    ** user name, month, visit count
    The data is as follows:
A,2015-01,5
A,2015-01,15
B,2015-01,5
A,2015-01,8
B,2015-01,25
A,2015-01,5
A,2015-02,4
A,2015-02,6
B,2015-02,10
B,2015-02,5
A,2015-03,16
A,2015-03,22
B,2015-03,23
B,2015-03,10
B,2015-03,1
  • 2. Data preparation
    Create the table:
create external table if not exists t_access(
uname string comment 'user name',
umonth string comment 'month',
ucount int comment 'visit count'
) comment 'user access table'
row format delimited fields terminated by ","
location "/hive/t_access";

Load the data:

load data local inpath "/home/hadoop/month.txt" into table t_access;

Query the data to verify that all 15 rows were loaded.
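A minimal check, assuming the table location contains only the 15 sample rows from month.txt, could look like this:

```sql
-- Show the raw rows that were loaded (the sample file has 15 lines).
select uname, umonth, ucount from t_access;

-- Row count; expected to be 15 if only the sample file was loaded.
select count(*) from t_access;
```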
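  • 3. Required result
    For each user and each month, compute the largest single-month visit total up to and including that month, and the cumulative visit total through that month. For the sample data above, the result looks like this (columns: user name, month, monthly total, largest monthly total so far, cumulative total):
A,2015-01,33,33,33
A,2015-02,10,33,43
A,2015-03,38,38,81
B,2015-01,30,30,30
B,2015-02,15,30,45
B,2015-03,34,34,79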
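Because the raw table holds several rows per user and month, a natural first step is to collapse it to one total per user per month. The sketch below shows that aggregation on its own; the alias month_total is only illustrative and does not appear in the original post.

```sql
-- Per-user monthly totals: one row per (uname, umonth).
-- month_total is a hypothetical alias chosen for this sketch.
select uname, umonth, sum(ucount) as month_total
from t_access
group by uname, umonth
order by uname, umonth;

-- For the sample data this yields:
-- A  2015-01  33
-- A  2015-02  10
-- A  2015-03  38
-- B  2015-01  30
-- B  2015-02  15
-- B  2015-03  34
```

The running maximum and running sum are then computed over these six rows rather than over the 15 raw rows.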

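  • 4. The final HQL self-joins the per-user monthly totals and, for each month, aggregates over all months of the same user up to and including it: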
select aname, amon, acount,
       max(bcount) as max_access,  -- largest monthly total up to amon
       sum(bcount) as sum_access   -- running total through amon
from (
  select a.name as aname, a.month as amon, a.count as acount,
         b.name as bname, b.month as bmon, b.count as bcount
  from (select uname name, umonth month, sum(ucount) as count from t_access group by uname, umonth) as a
  inner join (select uname name, umonth month, sum(ucount) as count from t_access group by uname, umonth) as b
  on a.name = b.name
) as tt
where amon >= bmon  -- keep only months up to amon; 'yyyy-MM' strings sort chronologically
group by aname, amon, acount;
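The same result can probably be obtained without a self-join by using window functions, which Hive has supported since version 0.11. This is an alternative sketch, not the original author's solution; the alias month_total and the subquery alias m are illustrative names.

```sql
-- Alternative sketch using window functions instead of a self-join.
-- month_total and m are hypothetical aliases introduced for this example.
select uname,
       umonth,
       month_total,
       -- largest monthly total for this user up to the current month
       max(month_total) over (partition by uname order by umonth
                              rows between unbounded preceding and current row) as max_access,
       -- cumulative total for this user through the current month
       sum(month_total) over (partition by uname order by umonth
                              rows between unbounded preceding and current row) as sum_access
from (
  select uname, umonth, sum(ucount) as month_total
  from t_access
  group by uname, umonth
) m;
```

The window version scans the monthly totals once per user partition, whereas the self-join produces a number of intermediate rows that grows quadratically with the number of months per user, so the difference may matter on larger tables.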



Reposted from blog.csdn.net/guo20082200/article/details/82534356