Using Hive explode and Lateral View on a table with array-type fields

Lateral View

Lateral View is used together with user-defined table-generating functions (UDTFs) such as explode(). As described in the Hive documentation on built-in table-generating functions, a UDTF produces zero or more output rows for each input row. A lateral view first applies the UDTF to each row of the base table and then joins the resulting output rows to that input row, forming a virtual table with the table alias you supply.

Lateral View syntax

lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)*
fromClause: FROM baseTable (lateralView)*
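
The fromClause rule allows several lateral views in a row, and each one is applied to the rows produced by the one before it. A minimal sketch, assuming a Hive version that accepts a subquery without a FROM table (as the other examples in this article do); the aliases t, a, b, x, and y are made up for illustration:

```sql
-- Two chained lateral views: the second runs once for every row
-- produced by the first, so the result is every (x, y) combination.
select t.id, a.x, b.y
from (select 1 as id) t
lateral view explode(array(1, 2)) a as x
lateral view explode(array('p', 'q')) b as y;
```

Because the second lateral view is evaluated for each row coming out of the first, this query returns 2 x 2 = 4 rows.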

explode (array)

select explode(array('A','B','C'));
select explode(array('A','B','C')) as col;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf as col;
Each of the four queries above returns:
col
A
B
C
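
If the position of each element matters, Hive also provides posexplode() (available since Hive 0.13), which behaves like explode() on an array but adds a zero-based index column. A small sketch in the same style as the queries above; the column aliases pos and val are chosen here for illustration:

```sql
-- posexplode returns two columns: the element's index and the element itself
select tf.* from (select 0) t lateral view posexplode(array('A','B','C')) tf as pos, val;
```

For this input the rows should be (0, A), (1, B), (2, C).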

explode (map)

select explode(map('A',10,'B',20,'C',30));
select explode(map('A',10,'B',20,'C',30)) as (key,value);
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf;
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf as key,value;
Each of the four queries above returns:
key value
A 10
B 20
C 30

Create test data to test Lateral View and Explode

Create a table that contains an array-type field and is stored as a text file.

Fields are separated by a space, and array elements are separated by a comma.

create table xtable(name string,age string,subject array<string>) row format delimited fields terminated by ' ' collection items terminated by ',' stored as textfile;

Check the location of the table

0: jdbc:hive2://hadoop91:10000> desc formatted xtable;
OK
| Location:                     | hdfs://hadoop90:9000/user/hive/warehouse/xtable             | NULL                  |

Create data

vi xtable.txt

# Enter the following data
xhx 15 math,english,history
bjx 20 physical,biological

Load data into the table

[root@hadoop91 ~]# hdfs dfs -put /root/xtable.txt hdfs://hadoop90:9000/user/hive/warehouse/xtable/
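
Copying the file straight into the warehouse directory works because the table is a plain text table. As an alternative, the same file can be loaded from within Hive. This is only a sketch: it assumes the file still sits at /root/xtable.txt, and that when you run it through beeline the path is readable on the machine where HiveServer2 runs, since LOCAL is resolved there rather than on the client.

```sql
-- Copy the local file into the table's storage location.
-- Without OVERWRITE, existing files in the table are kept.
load data local inpath '/root/xtable.txt' into table xtable;
```

Use either this statement or the hdfs dfs -put command, not both, otherwise the rows are loaded twice.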

Query the table

0: jdbc:hive2://hadoop91:10000> select * from xtable;
+--------------+-------------+-------------------------------+--+
| xtable.name  | xtable.age  |        xtable.subject         |
+--------------+-------------+-------------------------------+--+
| xhx          | 15          | ["math","english","history"]  |
| bjx          | 20          | ["physical","biological"]     |
+--------------+-------------+-------------------------------+--+
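
Before exploding the array, it can help to confirm that the subject column really was parsed as an array. A short sketch using Hive's built-in array indexing, size(), and array_contains(); the output aliases first_subject, subject_count, and takes_math are made up for illustration:

```sql
-- subject[0] is the first element (indexes start at 0),
-- size() counts the elements, array_contains() tests membership.
select name,
       subject[0]                      as first_subject,
       size(subject)                   as subject_count,
       array_contains(subject, 'math') as takes_math
from xtable;
```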

Apply explode to the array column

0: jdbc:hive2://hadoop91:10000> select explode(subject) from xtable;
+-------------+--+
|     col     |
+-------------+--+
| math        |
| english     |
| history     |
| physical    |
| biological  |
+-------------+--+
5 rows selected (0.402 seconds)

If you also try to select name and age next to explode(), the query fails with an error:

0: jdbc:hive2://hadoop91:10000> select name,age,explode(subject) from xtable;
Error: Error while compiling statement: FAILED: SemanticException [Error 10081]: UDTF's are not supported outside the SELECT clause, nor nested in expressions (state=42000,code=10081)

In this case you need to use Lateral View:

0: jdbc:hive2://hadoop91:10000> select name,age,subcol from xtable lateral view explode(subject) subtable as subcol;
+-------+------+-------------+--+
| name  | age  |   subcol    |
+-------+------+-------------+--+
| xhx   | 15   | math        |
| xhx   | 15   | english     |
| xhx   | 15   | history     |
| bjx   | 20   | physical    |
| bjx   | 20   | biological  |
+-------+------+-------------+--+
5 rows selected (0.302 seconds)
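
One thing to watch for: a plain lateral view drops any base row for which explode() produces no rows, which happens when the array is empty or NULL. Hive's OUTER keyword keeps such rows and fills the UDTF columns with NULL. The sketch below reuses the table and aliases from the query above; the test data in this article has no such row, so a row with an empty or NULL subject is an assumption made here for illustration.

```sql
-- With OUTER, a row whose subject is empty or NULL still appears once,
-- with subcol set to NULL, instead of disappearing from the result.
select name, age, subcol
from xtable
lateral view outer explode(subject) subtable as subcol;
```

On the two rows loaded earlier, this should return the same five rows as the query without OUTER.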

Origin blog.csdn.net/qq_43853055/article/details/115328666