Hive custom functions

Hive supports three types of custom functions: UDF, UDAF, and UDTF.

UDF (User-Defined Function): one row in, one value out.

UDAF (User-Defined Aggregation Function): an aggregate function; multiple rows in, one value out, like count/max/min.

UDTF (User-Defined Table-Generating Function): one row in, multiple rows out, such as lateral view explode().
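
The three shapes can be illustrated without Hive at all. The following is a minimal plain-Java sketch, not Hive API code: the class FunctionShapes and its methods length, max, and splitToRows are hypothetical names chosen for illustration, and rows are modelled as ordinary Java values. Note that Hive's own explode() operates on arrays and maps; splitToRows only imitates the one-in, many-out shape.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of the three function shapes; no Hive classes are used.
class FunctionShapes {
    // UDF shape: one value in, one value out.
    static int length(String s) {
        return s.length();
    }

    // UDAF shape: many values in, one value out.
    static int max(List<Integer> values) {
        int best = Integer.MIN_VALUE;
        for (int v : values) {
            best = Math.max(best, v);
        }
        return best;
    }

    // UDTF shape: one value in, many rows out
    // (here: split a comma-separated string into separate rows).
    static List<String> splitToRows(String csv) {
        return Arrays.asList(csv.split(","));
    }

    public static void main(String[] args) {
        System.out.println(length("hive"));               // prints 4
        System.out.println(max(Arrays.asList(3, 7, 5)));  // prints 7
        System.out.println(splitToRows("a,b,c"));         // prints [a, b, c]
    }
}
```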

Usage: add the jar file containing the custom function to the Hive session, create the function, and then call it.

UDF

1. A UDF can be used directly in a select statement to format the query results before they are output.

2. When writing a UDF, pay attention to the following points:

a) A custom UDF must extend org.apache.hadoop.hive.ql.exec.UDF.

b) The class must implement an evaluate method; evaluate supports overloading.
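
To make the overloading point concrete, here is a minimal sketch of a class with two evaluate overloads. It is an assumption-laden illustration rather than a deployable UDF: the class name GetLengthOverloaded and the trim flag are hypothetical, and the "extends UDF" clause is left out so the code compiles without the Hive jar.

```java
// Sketch only: a real UDF would declare "extends UDF"
// (org.apache.hadoop.hive.ql.exec.UDF); omitted so this compiles without Hive.
class GetLengthOverloaded {
    // Overload 1: length of the string, or -1 for null input.
    public int evaluate(String str) {
        return str == null ? -1 : str.length();
    }

    // Overload 2: same, but optionally trims surrounding whitespace first.
    // The trim flag is a hypothetical parameter added for illustration.
    public int evaluate(String str, boolean trim) {
        if (str == null) {
            return -1;
        }
        return trim ? str.trim().length() : str.length();
    }

    public static void main(String[] args) {
        GetLengthOverloaded f = new GetLengthOverloaded();
        System.out.println(f.evaluate("hive"));            // prints 4
        System.out.println(f.evaluate("  hive ", true));   // prints 4
        System.out.println(f.evaluate("  hive ", false));  // prints 7
    }
}
```

Both methods share the name evaluate and differ only in their parameter lists, which is the property the rule above relies on.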

Example: write a demo that returns the length of a string:

package com.raphael.len;

import org.apache.hadoop.hive.ql.exec.UDF;

public class GetLength extends UDF {
    // Returns the length of str, or -1 if it cannot be computed (for example, str is null).
    public int evaluate(String str) {
        try {
            return str.length();
        } catch (Exception e) {
            return -1;
        }
    }
}

3. Steps:

a) Package the program and put it on the target machine;

b) Enter the hive client and add the jar package:

hive> add jar /root/hive_udf.jar;

c) Create a temporary function:

hive> create temporary function getLen as 'com.raphael.len.GetLength';

d) Run an HQL query:

hive> select getLen(info) from apachelog;
OK
60
29
87
102
69
60
67
79
66
Time taken: 0.072 seconds, Fetched: 9 row(s)

e) Drop the temporary function:

hive> DROP TEMPORARY FUNCTION getLen;

UDAF

Multiple rows in, one row out, such as sum() and min(); used with group by.

1. The following must be extended or implemented:

org.apache.hadoop.hive.ql.exec.UDAF (the function class extends this class)

org.apache.hadoop.hive.ql.exec.UDAFEvaluator (the inner Evaluator class implements this interface)

2. The Evaluator must implement the init, iterate, terminatePartial, merge, and terminate functions:

init(): similar to a constructor; initializes the UDAF state.

iterate(): receives the input arguments, updates the internal aggregation state, and returns a boolean.

terminatePartial(): takes no arguments; once iterate has processed its rows, it returns the partial aggregation state, similar to a Hadoop Combiner.

merge(): receives the result of terminatePartial, merges it into the current state, and returns a boolean.

terminate(): returns the final aggregation result.

Develop a function equivalent to:

Oracle's wm_concat() function

MySQL's group_concat() function
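
One way such a concatenating aggregate might be structured is sketched below. This is a plain-Java imitation of the init / iterate / terminatePartial / merge / terminate lifecycle, not code against the Hive API: the class name ConcatEvaluator, the comma separator, and the use of a List as the partial state are all assumptions made for illustration. In a real UDAF this class would be an inner class implementing UDAFEvaluator inside a class that extends UDAF, and Hive would call the methods itself.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java imitation of the evaluator lifecycle; no Hive classes are used.
// Hypothetical sketch: in Hive this would implement
// org.apache.hadoop.hive.ql.exec.UDAFEvaluator as an inner class of a UDAF.
class ConcatEvaluator {
    private List<String> parts;

    ConcatEvaluator() {
        init();
    }

    // init(): reset the aggregation state.
    public void init() {
        parts = new ArrayList<>();
    }

    // iterate(): fold one input row into the state; null rows are skipped.
    public boolean iterate(String value) {
        if (value != null) {
            parts.add(value);
        }
        return true;
    }

    // terminatePartial(): return a copy of the partial state.
    public List<String> terminatePartial() {
        return new ArrayList<>(parts);
    }

    // merge(): combine another evaluator's partial state into this one.
    public boolean merge(List<String> other) {
        if (other != null) {
            parts.addAll(other);
        }
        return true;
    }

    // terminate(): produce the final comma-separated result.
    public String terminate() {
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        // Two "map-side" evaluators each see part of the same group.
        ConcatEvaluator m1 = new ConcatEvaluator();
        m1.iterate("a");
        m1.iterate("b");
        ConcatEvaluator m2 = new ConcatEvaluator();
        m2.iterate("c");

        // One "reduce-side" evaluator merges the partial results.
        ConcatEvaluator r = new ConcatEvaluator();
        r.merge(m1.terminatePartial());
        r.merge(m2.terminatePartial());
        System.out.println(r.terminate()); // prints a,b,c
    }
}
```

The main method plays the role Hive would normally play, calling iterate on each partial evaluator and then merging the partial results. The sketch returns an empty string for an empty group, which may differ from how wm_concat() or group_concat() treat that case.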

UDAF detailed documentation: http://www.cnblogs.com/ggjucheng/archive/2013/02/01/2888051.html

UDTF

UDTF detailed documentation: http://www.cnblogs.com/ggjucheng/archive/2013/02/01/2888819.html

Origin blog.csdn.net/weixin_44999079/article/details/97101204