Reading HDFS files in a Spark UDF

In some scenarios, a UDF that implements business logic may need to read a configuration file.

Most of the time we upload this file to an HDFS path and then read it through the HDFS API, but note:

  The part of the UDF that reads the file should go in a static initializer block, so the file is read only once per JVM, when the class is loaded. This matters especially when the amount of data being processed is large: otherwise the file is read over and over, adding unnecessary overhead and possibly even causing the job to fail. Sample code follows:


package cn.com.dtmobile.udf;

import java.util.HashMap;

import org.apache.spark.sql.api.java.UDF2;

import cn.com.dtmobile.util.HdfsUtil;

public class CalculateRsrp implements UDF2<Double, String, Double> {

    private static final long serialVersionUID = 1L;

    // Loaded once per JVM, when the class is loaded, rather than once per row.
    private static HashMap<String, Double> parameters = null;
    static {
        parameters = HdfsUtil.readHdfsFile("your file location");
    }
    
    @Override
    public Double call(Double t1, String t2) throws Exception {

        // processing logic goes here, using the preloaded parameters map

        return null;
    }

}
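
The HdfsUtil.readHdfsFile helper used above is not shown in the original post. A minimal sketch of what such a helper might look like, assuming the configuration file is a tab-separated key/value text file (the file format and parsing are assumptions):

package cn.com.dtmobile.util;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsUtil {

    // Reads a tab-separated key/value text file from HDFS into a map.
    // The two-column format is an assumption; adjust the parsing to your file.
    public static HashMap<String, Double> readHdfsFile(String location) {
        HashMap<String, Double> result = new HashMap<>();
        try {
            FileSystem fs = FileSystem.get(new Configuration());
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(fs.open(new Path(location)), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split("\t");
                    if (parts.length == 2) {
                        result.put(parts[0], Double.parseDouble(parts[1]));
                    }
                }
            }
        } catch (Exception e) {
            // Fail fast: running the UDF without its parameters would be worse.
            throw new RuntimeException("Failed to read HDFS file: " + location, e);
        }
        return result;
    }
}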

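To use the UDF, register it with the SparkSession and call it from SQL. A short usage sketch; the registered name, table, and column names here are hypothetical:

package cn.com.dtmobile.udf;

import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;

public class CalculateRsrpExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("CalculateRsrpExample").getOrCreate();

        // Register the UDF under a SQL-callable name, declaring its return type.
        spark.udf().register("calculate_rsrp", new CalculateRsrp(), DataTypes.DoubleType);

        // Hypothetical table and column names, for illustration only.
        spark.sql("SELECT calculate_rsrp(rsrp, cell_id) AS rsrp_adj FROM measurements").show();

        spark.stop();
    }
}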


Origin: www.cnblogs.com/dtmobile-ksw/p/11468557.html