Using Hive explode() and LATERAL VIEW, with a test table containing an array-type field
Lateral View
LATERAL VIEW is used together with user-defined table-generating functions (UDTFs) such as explode(). As described in the Hive documentation on built-in table-generating functions, a UDTF generates zero or more output rows for each input row. A lateral view first applies the UDTF to each row of the base table, then joins the resulting output rows to that input row, forming a virtual table with the supplied table alias.
Lateral View usage
lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)*
fromClause: FROM baseTable (lateralView)*
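Because fromClause accepts any number of lateralView clauses, lateral views can be chained. The following is only a minimal sketch of that case, reusing the (select 0) t dummy table that the examples in this article rely on; the aliases a, b, x, and y are made up for illustration.

```sql
-- Two chained lateral views: every element of the first array
-- is paired with every element of the second array.
select a.x, b.y
from (select 0) t
lateral view explode(array(1, 2)) a as x
lateral view explode(array('p', 'q')) b as y;
```

This should return four rows, one for each (x, y) combination.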
explode (array)
select explode(array('A','B','C'));
select explode(array('A','B','C')) as col;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf as col;
col |
---|
A |
B |
C |
explode (map)
select explode(map('A',10,'B',20,'C',30));
select explode(map('A',10,'B',20,'C',30)) as (key,value);
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf;
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf as key,value;
key | value |
---|---|
A | 10 |
B | 20 |
C | 30 |
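Since a UDTF may return zero rows, a plain lateral view drops any input row whose array or map is empty. Hive 0.12.0 and later also accept LATERAL VIEW OUTER, which keeps such a row and fills the UDTF columns with NULL. The sketch below uses the same dummy table; the column alias id is introduced here purely for illustration.

```sql
-- Plain lateral view: explode() of an empty array yields no rows,
-- so the input row disappears from the result.
select t.id, tf.col
from (select 0 as id) t
lateral view explode(array()) tf as col;

-- OUTER keeps the input row and sets col to NULL.
select t.id, tf.col
from (select 0 as id) t
lateral view outer explode(array()) tf as col;
```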
Create test data to test Lateral View and Explode
Create a table with an array-type field, stored as a text file.
Fields are separated by a space, and array elements are separated by a comma.
create table xtable(name string,age string,subject array<string>) row format delimited fields terminated by ' ' collection items terminated by ',' stored as textfile;
Check the location of the table
0: jdbc:hive2://hadoop91:10000> desc formatted xtable;
OK
| Location: | hdfs://hadoop90:9000/user/hive/warehouse/xtable | NULL |
Create the data file
vi xtable.txt
# Save the following data in the file
xhx 15 math,english,history
bjx 20 physical,biological
Load the data into the table by copying the file into the table's HDFS directory
[root@hadoop91 ~]# hdfs dfs -put /root/xtable.txt hdfs://hadoop90:9000/user/hive/warehouse/xtable/
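Copying the file with hdfs dfs -put works because a text-format table simply reads whatever files sit in its directory. The same load can also be expressed in HiveQL. The sketch below assumes /root/xtable.txt is readable on the host where HiveServer2 runs, since a LOCAL path is resolved on the server side when connecting through beeline.

```sql
-- Copy the local file into the table's warehouse directory.
-- Without OVERWRITE, files already in the table are kept.
load data local inpath '/root/xtable.txt' into table xtable;
```

Use one method or the other; running both would load the same rows twice.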
View table
0: jdbc:hive2://hadoop91:10000> select * from xtable;
+--------------+-------------+-------------------------------+--+
| xtable.name | xtable.age | xtable.subject |
+--------------+-------------+-------------------------------+--+
| xhx | 15 | ["math","english","history"] |
| bjx | 20 | ["physical","biological"] |
+--------------+-------------+-------------------------------+--+
Apply explode() to the array column
0: jdbc:hive2://hadoop91:10000> select explode(subject) from xtable;
+-------------+--+
| col |
+-------------+--+
| math |
| english |
| history |
| physical |
| biological |
+-------------+--+
5 rows selected (0.402 seconds)
If you also select name and age alongside explode(), the query fails with an error:
0: jdbc:hive2://hadoop91:10000> select name,age,explode(subject) from xtable;
Error: Error while compiling statement: FAILED: SemanticException [Error 10081]: UDTF's are not supported outside the SELECT clause, nor nested in expressions (state=42000,code=10081)
In this case you need to use LATERAL VIEW:
0: jdbc:hive2://hadoop91:10000> select name,age,subcol from xtable lateral view explode(subject) subtable as subcol;
+-------+------+-------------+--+
| name | age | subcol |
+-------+------+-------------+--+
| xhx | 15 | math |
| xhx | 15 | english |
| xhx | 15 | history |
| bjx | 20 | physical |
| bjx | 20 | biological |
+-------+------+-------------+--+
5 rows selected (0.302 seconds)
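Once the array has been flattened, the generated column can be used like any other column, for example in WHERE or GROUP BY. A brief sketch against the xtable data above; the alias cnt is made up for illustration.

```sql
-- People who take english.
select name, age
from xtable
lateral view explode(subject) subtable as subcol
where subcol = 'english';

-- Number of people per subject.
select subcol, count(*) as cnt
from xtable
lateral view explode(subject) subtable as subcol
group by subcol;
```

With the two sample rows, the first query should return only xhx, and the second should report a count of 1 for each of the five subjects.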