Table of Contents
1. Requirements
2. Demo structure and data
3. Create the Hive table and load the data
4. Create the Hive UDF and view the results
1. Requirements
The company's data extraction work requires the extracted data to be desensitized (masked) with a Hive UDF.
Demo download: https://download.csdn.net/download/silentwolfyh/10939631
2. Demo structure and data
The data and the operating steps are in doc\data. The Maven configuration already declares the dependencies and the JAR name, so running Maven install is enough to build the package.
# Data
1|61234522222000654321|18613718137|[email protected]|010381199909183217
2|51234522222000654322|18613718126|[email protected]|020381199909183216
3|41234522222000654323|18613718125|[email protected]|030381199909183215
4|31234522222000654324|18613718124|[email protected]|040381199909183214
5|21234522222000654325|18613718123|[email protected]|050381199909183213
# Create the Hive table
create table IF NOT EXISTS user
(
id string,
bankNum string,
phoneNum string,
email string,
id_num string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
stored as textfile;
# Load the data
load data local inpath '/home/yuhui1/user.txt' into table user;
Loading data to table stage.user
# Create the function
CREATE FUNCTION stage.ID_Number AS 'com.hive.udf.ID_Number' using jar 'hdfs://nameservice1/user/hive/udf/ID_Number-jar-with-dependencies.jar';
# Query
select ID_Number(phoneNum) from user;
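The ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' clause in the table definition means each line of user.txt is split on '|' into the five columns. A minimal Java sketch of roughly that mapping follows. The class name UserRecordSketch is hypothetical and not part of the demo, and the email value is a placeholder because the addresses in the sample data are obscured. Hive's own SerDe handles missing and extra fields differently, so this only illustrates the column order.

```java
import java.util.regex.Pattern;

public class UserRecordSketch {
    // Column order taken from the CREATE TABLE statement.
    static final String[] COLUMNS = {"id", "bankNum", "phoneNum", "email", "id_num"};

    // Split one line of user.txt on the literal '|' delimiter.
    // The -1 limit keeps trailing empty fields instead of dropping them.
    static String[] parseLine(String line) {
        return line.split(Pattern.quote("|"), -1);
    }

    public static void main(String[] args) {
        // First sample row; the email is a placeholder (assumption).
        String line = "1|61234522222000654321|18613718137|user@example.com|010381199909183217";
        String[] fields = parseLine(line);
        for (int i = 0; i < COLUMNS.length && i < fields.length; i++) {
            System.out.println(COLUMNS[i] + " = " + fields[i]);
        }
    }
}
```

Quoting the delimiter matters here because '|' is a regex metacharacter in String.split.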
3. Create the Hive table and load the data
Put the data in the local file '/home/yuhui1/user.txt'.
# Load the data
load data local inpath '/home/yuhui1/user.txt' into table user;
Loading data to table stage.user
Upload ID_Number-jar-with-dependencies.jar to HDFS. In the path below, nameservice1 is the Nameservice of the HDFS NameNode.
# Create the function
CREATE FUNCTION stage.ID_Number AS 'com.hive.udf.ID_Number' using jar 'hdfs://nameservice1/user/hive/udf/ID_Number-jar-with-dependencies.jar';
4. Create the Hive UDF and view the results
Note: stage is the name of a Hive database.
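The source does not show the body of com.hive.udf.ID_Number, so the following is only a minimal sketch of what the masking logic might look like. It assumes a rule that keeps the first three and last four characters and replaces the rest with '*'. The class name MaskSketch, the method maskMiddle, and the rule itself are hypothetical. In a real Hive UDF this logic would typically be called from the evaluate method of a class extending org.apache.hadoop.hive.ql.exec.UDF. That wrapper is left out so the sketch compiles without the Hive dependency.

```java
public class MaskSketch {
    // Replace every character between the kept head and tail with '*'.
    // Values too short to keep both ends are masked completely.
    static String maskMiddle(String value, int keepHead, int keepTail) {
        if (value == null) {
            return null; // pass NULL column values through unchanged
        }
        if (keepHead < 0 || keepTail < 0) {
            throw new IllegalArgumentException("keepHead and keepTail must be non-negative");
        }
        int n = value.length();
        boolean tooShort = n <= keepHead + keepTail;
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            boolean keep = !tooShort && (i < keepHead || i >= n - keepTail);
            sb.append(keep ? value.charAt(i) : '*');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // phoneNum from the first sample row; keep 3 and 4 is an assumed rule.
        System.out.println(maskMiddle("18613718137", 3, 4)); // prints 186****8137
    }
}
```

Masking short values completely is a deliberate choice: keeping both ends of a value no longer than keepHead + keepTail would leave it fully readable.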