UDAF usage in Spark

What is a UDAF?

A UDAF (User Defined Aggregate Function) is an aggregate function you define yourself. How does an aggregate function differ from an ordinary function? An ordinary function accepts one row of input and produces one output, while an aggregate function accepts a set of inputs (generally many rows) and produces a single output; in other words, it condenses a group of values into one, much like the built-in sum. For ordinary (one-row-in, one-value-out) UDFs, see the earlier post on Spark UDF usage.
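To make the distinction concrete, here is a minimal plain-Scala sketch (not Spark code; the names `square` and `average` are just illustrative): an ordinary function maps each input to one output, while an aggregate function folds a whole collection into a single value.

```scala
object FunctionKinds {
  // Ordinary function: one input value -> one output value
  def square(x: Long): Long = x * x

  // Aggregate function: a whole collection -> one output value
  def average(xs: Seq[Long]): Double = xs.sum.toDouble / xs.size

  def main(args: Array[String]): Unit = {
    val rows = Seq(1L, 2L, 3L, 4L)
    println(rows.map(square)) // one output per input: List(1, 4, 9, 16)
    println(average(rows))    // one output for the whole set: 2.5
  }
}
```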

Let's go straight to the demo below, which computes the average of the numbers 1 through 9 (the upper bound of spark.range is exclusive). The code is fairly simple.

package spark

import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types.{DataType, DoubleType, LongType, StructField, StructType}
import org.apache.spark.sql.{Dataset, Row, SparkSession}
import java.lang

/**
  * UDAF usage in Spark
  */
object UDAF {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("UDAFDemo")
      .master("local[1]")
      .getOrCreate()
    val ds: Dataset[lang.Long] = spark.range(1, 10) // ids 1 through 9 (upper bound is exclusive)
    ds.createTempView("test")
    spark.udf.register("average", new AverageUDAF) // make the UDAF callable from SQL by name
    spark.sql("select average(id) as avg_id from test").show()
    spark.stop()
  }
}

// Averages a long column; the buffer holds a running (sum, count)
class AverageUDAF extends UserDefinedAggregateFunction {
  override def inputSchema: StructType = StructType(StructField("value", LongType) :: Nil)
  override def bufferSchema: StructType =
    StructType(StructField("sum", LongType) :: StructField("count", LongType) :: Nil)
  override def dataType: DataType = DoubleType
  override def deterministic: Boolean = true
  // initialize: resets the buffer before aggregation starts
  override def initialize(buffer: MutableAggregationBuffer): Unit = { buffer(0) = 0L; buffer(1) = 0L }
  // update: called once per input row within a partition
  override def update(buffer: MutableAggregationBuffer, input: Row): Unit = {
    buffer(0) = buffer.getLong(0) + input.getLong(0)
    buffer(1) = buffer.getLong(1) + 1L
  }
  // merge: combines partial buffers from different partitions
  override def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit = {
    buffer1(0) = buffer1.getLong(0) + buffer2.getLong(0)
    buffer1(1) = buffer1.getLong(1) + buffer2.getLong(1)
  }
  // evaluate: produces the final result from the fully merged buffer
  override def evaluate(buffer: Row): Double = buffer.getLong(0).toDouble / buffer.getLong(1)
}
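The buffer lifecycle that Spark drives for a UDAF (initialize, then update per row, then merge across partitions, then evaluate) can be simulated in plain Scala without a cluster. The two-way partition split below is hypothetical, purely to show that merging partial (sum, count) buffers yields the same average as a single pass over all the rows.

```scala
object UdafLifecycle {
  type Buffer = (Long, Long) // (sum, count), mirroring the UDAF's bufferSchema

  def initialize: Buffer = (0L, 0L)
  def update(b: Buffer, row: Long): Buffer = (b._1 + row, b._2 + 1L)
  def merge(a: Buffer, b: Buffer): Buffer = (a._1 + b._1, a._2 + b._2)
  def evaluate(b: Buffer): Double = b._1.toDouble / b._2

  def main(args: Array[String]): Unit = {
    val rows = (1L to 9L).toSeq
    // Pretend the data was split into two partitions, each folded independently
    val (p1, p2) = rows.splitAt(4)
    val buf1 = p1.foldLeft(initialize)(update)
    val buf2 = p2.foldLeft(initialize)(update)
    println(evaluate(merge(buf1, buf2))) // prints 5.0, the average of 1..9
  }
}
```

This is why a UDAF must expose `merge`: each executor only ever sees its own partition's partial buffer, and the driver-side result is assembled from those partials.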


Origin blog.csdn.net/xianpanjia4616/article/details/88945128