Hive exercise: finding each student's course selections

1. Data description

id course 
1,a 
1,b 
1,c 
1,e 
2,a 
2,c 
2,d 
2,f 
3,a 
3,b 
3,c 
3,e

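(2) Field meanings

Each row records that a student (id 1, 2, or 3) has enrolled in one of the courses a, b, c, d, e, f. Each student takes only some of the six courses.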

Create the table:
create table t_course(id int, course string)
row format delimited fields terminated by ",";
Load the data:
load data local inpath "/home/hadoop/course/course.txt" into table t_course;
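
As a quick sanity check after loading (a minimal sketch, not part of the original post; the alias course_count is illustrative), counting rows per student should give 4 for each of the three ids, assuming the file contains only the twelve data rows shown above:

```sql
-- Count the enrolment rows loaded for each student.
select id, count(*) as course_count
from t_course
group by id;
```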

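3. Requirement

Write a HiveQL statement that produces the result below, where 1 means the student has enrolled in the course and 0 means the student has not: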

id    a    b    c    d    e    f
1     1    1    1    0    1    0
2     1    0    1    1    0    1
3     1    1    1    0    1    0

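First, reshape the data so that every row holds the student's own course array alongside the array of all courses:
create table id_courses as select t1.id as id, t1.course as id_courses, t2.course as courses
from
( select id as id, collect_set(course) as course from t_course group by id ) t1
join
( select sort_array(collect_set(course)) as course from t_course ) t2;
collect_set(column) gathers the values of the column into an array with duplicates removed. It does not guarantee element order, so the full course list is wrapped in sort_array to make sure courses[0] through courses[5] are a through f. The join has no ON clause, so the single row of t2 is attached to every row of t1 (a cross join).
For example, to get the set of courses belonging to each id:
    select id as id, collect_set(course) as course from t_course group by id;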
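
To see the de-duplication at work, here is a small sketch comparing collect_set with collect_list, which keeps duplicates (collect_list is available in Hive 0.13 and later; the column aliases are illustrative). Assuming the twelve sample rows load cleanly with no stray whitespace, the first size should be 12 and the second 6:

```sql
-- collect_list keeps every value; collect_set keeps one copy of each.
select size(collect_list(course)) as all_values,
       size(collect_set(course))  as distinct_values
from t_course;
```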

Back to the exercise. Take a look at the table id_courses we just created.

Next, compare id_courses with courses to work out the 0/1 flag for each course:

select id,
case when array_contains(id_courses, courses[0]) then 1 else 0 end as a,
case when array_contains(id_courses, courses[1]) then 1 else 0 end as b,
case when array_contains(id_courses, courses[2]) then 1 else 0 end as c,
case when array_contains(id_courses, courses[3]) then 1 else 0 end as d,
case when array_contains(id_courses, courses[4]) then 1 else 0 end as e,
case when array_contains(id_courses, courses[5]) then 1 else 0 end as f 
from id_courses;
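
The same pivot can also be written without the intermediate table. The following is an alternative sketch using conditional aggregation, not the approach taken in this post. It assumes the course names are exactly the literals a through f:

```sql
-- For each student, max() over the 0/1 flags is 1 if any of the
-- student's rows matches the course, and 0 otherwise.
select id,
       max(case when course = 'a' then 1 else 0 end) as a,
       max(case when course = 'b' then 1 else 0 end) as b,
       max(case when course = 'c' then 1 else 0 end) as c,
       max(case when course = 'd' then 1 else 0 end) as d,
       max(case when course = 'e' then 1 else 0 end) as e,
       max(case when course = 'f' then 1 else 0 end) as f
from t_course
group by id;
```

Because the course names are hard-coded, the result does not depend on the order of elements inside any array. The trade-off is that the query must be edited whenever a course is added.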

Some additional notes

array_contains(array, value) checks whether the array contains value. It returns true if it does and false otherwise.
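
A minimal illustration with literal arrays (this assumes a Hive version that allows select without a from clause, 0.13 or later; the aliases has_a and has_d are illustrative):

```sql
-- has_a is true because 'a' is in the array; has_d is false because 'd' is not.
select array_contains(array('a', 'b', 'c'), 'a') as has_a,
       array_contains(array('a', 'b', 'c'), 'd') as has_d;
```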

case when (condition)
then value returned when the condition holds
else value returned when it does not
end
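
As a small sketch of case when applied directly to t_course (the alias takes_a is illustrative), the query below flags each row by whether it is for course a:

```sql
-- takes_a is 1 on rows for course a and 0 on every other row.
select id,
       course,
       case when course = 'a' then 1 else 0 end as takes_a
from t_course;
```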






Reposted from blog.csdn.net/yumingzhu1/article/details/80660248