Hive user-defined functions

1.1 About Custom Functions

1) Hive ships with built-in functions such as max and min, but their number is limited; custom UDFs make it easy to extend them.

2) When the built-in functions provided by Hive cannot meet your business needs, consider writing a user-defined function (UDF).

3) User-defined functions fall into three categories:

(1) UDF (User-Defined Function)

One row in, one row out.

(2) UDAF (User-Defined Aggregation Function)

Aggregate function: many rows in, one row out.

Similar to: count / max / min

(3) UDTF (User-Defined Table-Generating Function)

One row in, many rows out.

For example: lateral view explode()
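The three shapes can be illustrated with a plain-Java analogy. This is only a sketch of the input/output pattern of each category; it does not use any Hive classes, and the class and method names (FunctionShapes, upper, maxOf, explode) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Plain-Java analogy for the three function shapes; no Hive classes are involved.
class FunctionShapes {
    // UDF shape: one input value produces one output value.
    static String upper(String s) {
        return s == null ? null : s.toUpperCase(Locale.ROOT);
    }

    // UDAF shape: many input values are folded into a single result (like max).
    static int maxOf(List<Integer> values) {
        int best = Integer.MIN_VALUE;
        for (int v : values) {
            best = Math.max(best, v);
        }
        return best;
    }

    // UDTF shape: one input value expands into several output rows (like explode).
    static List<String> explode(String csv) {
        return Arrays.asList(csv.split(","));
    }

    public static void main(String[] args) {
        System.out.println(upper("hive"));                 // HIVE
        System.out.println(maxOf(Arrays.asList(3, 9, 4))); // 9
        System.out.println(explode("a,b,c"));              // [a, b, c]
    }
}
```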

4) Official documentation:

https://cwiki.apache.org/confluence/display/Hive/HivePlugins

5) Programming steps:

(1) Extend org.apache.hadoop.hive.ql.exec.UDF

(2) Implement the evaluate method; evaluate can be overloaded

(3) Create the function in the Hive command-line window

a) Add the jar

add jar linux_jar_path

b) Create the function

create [temporary] function [dbname.]function_name AS class_name;

(4) Delete the function in the Hive command-line window

drop [temporary] function [if exists] [dbname.]function_name;

6) Notes

(1) A UDF must have a return type. It may return null, but the return type cannot be void.
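The programming steps and the note on return types can be sketched as follows. The class name TrimUpperUDF and its behavior are hypothetical. In a real Hive project the class would be declared with "extends org.apache.hadoop.hive.ql.exec.UDF"; that clause is left out here only so the sketch compiles without the Hive jars on the classpath.

```java
import java.util.Locale;

// Hypothetical UDF body. In a Hive project, declare it as:
//   class TrimUpperUDF extends org.apache.hadoop.hive.ql.exec.UDF
// The extends clause is omitted so that this sketch compiles without Hive.
class TrimUpperUDF {
    // evaluate must declare a non-void return type; returning null is allowed.
    public String evaluate(String input) {
        if (input == null) {
            return null;
        }
        return input.trim().toUpperCase(Locale.ROOT);
    }

    // Overload: the same function name accepts a second argument,
    // used as a fallback when the first is null or blank.
    public String evaluate(String input, String fallback) {
        String result = evaluate(input);
        return (result == null || result.isEmpty()) ? fallback : result;
    }
}
```

Once packaged into a jar, such a class would be registered with the add jar and create function statements shown in step (3).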

1.2 Hands-on example

1) Define four UDF classes. For the code, see: GitHub

Usage: check out the project in IDEA, build it into a jar package with Maven, and place the jar in the "hive/lib" directory under the Hive installation directory.


2) Run add jar in Hive to load the jar package:

hive (default)> add jar /opt/module/hive/lib/log-hive.jar;

3) Register the permanent functions

hive (default)>create function getdaybegin AS 'com.bigdata.hive.DayBeginUDF';

hive (default)>create function getweekbegin AS 'com.bigdata.hive.WeekBeginUDF';

hive (default)>create function getmonthbegin AS 'com.bigdata.hive.MonthBeginUDF';

hive (default)>create function formattime AS 'com.bigdata.hive.FormatTimeUDF';
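The source of the four com.bigdata.hive classes is not shown here, so the following is only a guess at the kind of date logic that functions named getdaybegin, getweekbegin and getmonthbegin might wrap. Everything in it is an assumption: the class name DateBoundaries, the epoch-millisecond input, the UTC time zone, and Monday as the first day of the week.

```java
import java.time.DayOfWeek;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper; NOT the actual code behind the registered functions.
// Assumes epoch-millisecond timestamps, UTC, and weeks starting on Monday.
class DateBoundaries {
    private static LocalDate toDate(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC).toLocalDate();
    }

    private static long toMillis(LocalDate date) {
        return date.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    // Midnight at the start of the day containing the timestamp.
    static long dayBegin(long epochMillis) {
        return toMillis(toDate(epochMillis));
    }

    // Midnight at the start of the Monday on or before the timestamp.
    static long weekBegin(long epochMillis) {
        LocalDate monday = toDate(epochMillis)
                .with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        return toMillis(monday);
    }

    // Midnight at the start of the first day of the month.
    static long monthBegin(long epochMillis) {
        return toMillis(toDate(epochMillis).withDayOfMonth(1));
    }
}
```

In an actual UDF, each of these methods would be called from the evaluate method of its own class.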

4) Verify the functions

Because I store Hive's metadata in MySQL (by default the Hive Metastore is stored in the bundled Derby database; using MySQL for the Metastore is recommended), I log in to MySQL to check:

[bigdata@hadoop101 ~]$ mysql -uroot -p000000

mysql> show databases;

mysql> use metastore;

mysql> show tables;

mysql> select * from FUNCS;

Viewing the FUNCS table in DBeaver shows that the four custom functions have been added.


5) Delete the functions

hive (applogsdb)> drop function getdaybegin;

hive (applogsdb)> drop function getweekbegin;

hive (applogsdb)> drop function getmonthbegin;

hive (applogsdb)> drop function formattime;

6) Note: a permanent function is registered in the database in which it was created, and it must be dropped from that same database.

For example, a function created in the applogsdb database must be dropped from within applogsdb.


Origin www.cnblogs.com/cosmos-wong/p/11992874.html