The Hive collect_set function

1) Create the source table

hive (gmall)>

drop table if exists stud;

create table stud (name string, area string, course string, score int);

2) Insert data into the source table

hive (gmall)>

insert into table stud values('zhang3','bj','math',88);

insert into table stud values('li4','bj','math',99);

insert into table stud values('wang5','sh','chinese',92);

insert into table stud values('zhao6','sh','chinese',54);

insert into table stud values('tian7','bj','chinese',91);
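
As an alternative to the five single-row statements above (run one form or the other, not both), the same rows can probably be loaded in one statement. This is a sketch that assumes Hive 0.14 or later, where INSERT ... VALUES accepts several row tuples. Each INSERT normally launches its own job, so one statement tends to be faster.

```sql
-- Alternative to the single-row inserts: load all five rows at once.
-- Assumption: Hive 0.14+ (multi-row INSERT ... VALUES).
insert into table stud values
  ('zhang3', 'bj', 'math',    88),
  ('li4',    'bj', 'math',    99),
  ('wang5',  'sh', 'chinese', 92),
  ('zhao6',  'sh', 'chinese', 54),
  ('tian7',  'bj', 'chinese', 91);
```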

3) Query the data in the table

hive (gmall)> select * from stud;

stud.name       stud.area       stud.course     stud.score

zhang3  bj      math    88

li4     bj      math    99

wang5   sh      chinese 92

zhao6   sh      chinese 54

tian7   bj      chinese 91

4) Aggregate the values from different rows of the same group into one deduplicated set

hive (gmall)> select course, collect_set(area), avg(score) from stud group by course;

chinese ["sh","bj"]     79.0

math    ["bj"]  93.5
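
To see what the deduplication does, collect_set can be compared with collect_list, which keeps every value. The query below is a minimal sketch against the same stud table; the column aliases (distinct_areas, all_areas, area_cnt) are made up for illustration.

```sql
-- collect_set drops duplicate values; collect_list keeps all of them.
-- size() returns the number of elements in the resulting array.
select course,
       collect_set(area)       as distinct_areas,
       collect_list(area)      as all_areas,
       size(collect_set(area)) as area_cnt
from stud
group by course;
```

For chinese, all_areas should hold three entries (sh twice, bj once) while distinct_areas holds two. For math, all_areas holds bj twice and distinct_areas holds it once. The order of elements inside either array is not something to rely on.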

5) Use a subscript to take a single element (indices start at 0)

hive (gmall)> select course, collect_set(area)[0], avg(score) from stud group by course;

chinese sh      79.0

math    bj      93.5
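
Which element ends up at index 0 depends on the order in which collect_set happened to see the rows, so it should not be treated as stable. If a predictable result matters, one option is to sort the array first. The sketch below uses the built-in sort_array and concat_ws functions; the aliases first_area and areas are hypothetical.

```sql
-- sort_array orders the set ascending, so [0] is always the smallest value.
-- concat_ws joins the array elements into one comma-separated string.
select course,
       sort_array(collect_set(area))[0]              as first_area,
       concat_ws(',', sort_array(collect_set(area))) as areas
from stud
group by course;
```

With the sample data, first_area should be bj for both courses, and areas should be bj,sh for chinese and bj for math.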


Reposted from blog.csdn.net/qq_39674417/article/details/113752677