The differences between UDF, UDAF, and UDTF

1. UDF: User-Defined (ordinary) Function, which operates on a single row: it takes the values of one row as input and returns one value.

Extend the UDF class and add an evaluate() method:

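    import org.apache.hadoop.hive.ql.exec.UDF;

    /**
     * Custom UDF that returns the smaller of two values.
     * A null argument is treated as 0.0.
     */
    public class Min extends UDF {

        public Double evaluate(Double a, Double b) {
            if (a == null)
                a = 0.0;
            if (b == null)
                b = 0.0;
            if (a >= b) {
                return b;
            } else {
                return a;
            }
        }
    }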
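To make the "one row in, one value out" behavior concrete, here is a minimal plain-Java sketch with no Hive dependency. The class RowWiseUdfSketch and the method applyToRows() are hypothetical names invented for this illustration; they only imitate how the engine would call evaluate() once per row, and are not part of Hive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not Hive code: shows the one-row-in, one-value-out contract.
class RowWiseUdfSketch {

    // Same logic as Min.evaluate(), without the Hive UDF base class.
    static Double evaluate(Double a, Double b) {
        if (a == null) a = 0.0;
        if (b == null) b = 0.0;
        return (a >= b) ? b : a;
    }

    // Imitates the engine: call evaluate() once per row, emit one result per row.
    static List<Double> applyToRows(List<Double[]> rows) {
        List<Double> out = new ArrayList<>();
        for (Double[] row : rows) {
            out.add(evaluate(row[0], row[1]));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Double[]> rows = Arrays.asList(
                new Double[] {3.0, 5.0},
                new Double[] {null, 2.0},   // null is treated as 0.0
                new Double[] {7.5, 7.5});
        System.out.println(applyToRows(rows)); // prints [3.0, 0.0, 7.5]
    }
}
```

Note that the number of output values always equals the number of input rows, which is what distinguishes a UDF from the aggregate and table-generating functions described next.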

2. UDAF: User-Defined Aggregation Function, which operates on multiple rows of data and returns a single result. It is an aggregate function in the same sense as the commonly used SUM() and AVG() in SQL.

Example of using an aggregate function:

		SELECT store_name, SUM(sales) 
		FROM Store_Information 
		GROUP BY store_name 
		HAVING SUM(sales) > 1500
		ORDER BY SUM(sales);
		 
		-- The HAVING keyword must always come after GROUP BY and before ORDER BY

There are two ways to implement a UDAF, simple and generic:

  • a. The simple UDAF loses performance because it relies on Java reflection, does not support some features, and has been deprecated;
  • b. The generic UDAF involves two classes, AbstractGenericUDAFResolver and GenericUDAFEvaluator:
    • Extend the AbstractGenericUDAFResolver class and override the getEvaluator() method;
    • Extend the GenericUDAFEvaluator class and return an instance of it from getEvaluator();
    • In the GenericUDAFEvaluator subclass, override the init(), iterate(), terminatePartial(), merge(), and terminate() methods.
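The order in which those evaluator methods are called is easier to see in a small model. The following is a plain-Java sketch with no Hive dependency: UdafLifecycleSketch, SumEvaluator, and aggregate() are hypothetical names, and the method signatures are simplified. In the real GenericUDAFEvaluator these methods work with aggregation buffers and object inspectors, so treat this only as an illustration of the data flow, assuming a SUM-style aggregate.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical plain-Java model of the evaluator lifecycle; not the Hive API.
class UdafLifecycleSketch {

    // Plays the role of an evaluator that computes a SUM over doubles.
    static class SumEvaluator {
        private double sum;             // aggregation state for one group

        void init() { sum = 0.0; }      // prepare the state before use

        void iterate(Double value) {    // map side: consume one input row
            if (value != null) sum += value;
        }

        double terminatePartial() {     // map side: hand off a partial result
            return sum;
        }

        void merge(double partial) {    // reduce side: fold in a partial result
            sum += partial;
        }

        double terminate() {            // reduce side: produce the final value
            return sum;
        }
    }

    // Runs one evaluator per "mapper" partition, then merges the partial results.
    static double aggregate(List<List<Double>> partitions) {
        SumEvaluator reducer = new SumEvaluator();
        reducer.init();
        for (List<Double> partition : partitions) {
            SumEvaluator mapper = new SumEvaluator();
            mapper.init();
            for (Double v : partition) {
                mapper.iterate(v);
            }
            reducer.merge(mapper.terminatePartial());
        }
        return reducer.terminate();
    }

    public static void main(String[] args) {
        List<List<Double>> partitions = Arrays.asList(
                Arrays.asList(100.0, 250.0),
                Arrays.asList(400.0, null, 800.0));
        System.out.println(aggregate(partitions)); // prints 1550.0
    }
}
```

The split between terminatePartial() and merge() is what lets the aggregation run in parallel: each partition is reduced to a small partial result before anything is combined.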

Refer to: "Introduction to Hive UDAF development and a detailed explanation of the execution process" and "Hive UDAF development explained in detail".

3. UDTF: User-Defined Table-Generating Function, used when one input row must produce multiple output rows.

Extend the GenericUDTF class and override three methods: initialize() (which returns the output row schema, that is, the number of columns and their types), process(), and close().
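The three-method contract can be sketched in plain Java without a Hive dependency. Everything named here (UdtfSketch, SplitRows, rows()) is hypothetical and the signatures are simplified: the real initialize() returns an object inspector describing the output struct, and the real process() receives an array of arguments. The sketch assumes a function that splits a comma-separated string into one row per token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical plain-Java model of a table-generating function; not the Hive API.
class UdtfSketch {

    static class SplitRows {
        private final List<String[]> collected = new ArrayList<>();

        // Describes the output rows: here a single string column named "token".
        List<String> initialize() {
            return Arrays.asList("token");
        }

        // Called once per input row; may forward zero or more output rows.
        void process(String line) {
            if (line == null) {
                return; // no output rows for a null input
            }
            for (String token : line.split(",")) {
                forward(new String[] {token.trim()});
            }
        }

        // Stands in for the collector that receives each emitted row.
        void forward(String[] row) {
            collected.add(row);
        }

        // Called after the last input row; nothing to flush in this sketch.
        void close() {
        }

        List<String[]> rows() {
            return collected;
        }
    }

    public static void main(String[] args) {
        SplitRows udtf = new SplitRows();
        System.out.println("columns: " + udtf.initialize());
        udtf.process("a, b, c");   // one input row
        udtf.close();
        for (String[] row : udtf.rows()) {
            System.out.println(row[0]);   // three output rows
        }
    }
}
```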

Please refer to: "Writing and using UDTFs in Hive" (repost) and "Example of UDTF usage in Hive 0.13".

4. Other

Drop a temporary function:

		drop temporary function toUpper;


Origin blog.csdn.net/qq_42578036/article/details/109726923