Developing a Custom Hive UDF

Scenario: determine whether a string contains Chinese characters.


pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>houbank</groupId>
  <artifactId>hive_udf</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>hive_udf</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.10</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.5.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-exec</artifactId>
      <version>0.13.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-jdbc</artifactId>
      <version>0.13.1</version>
    </dependency>
  </dependencies>

</project>


Code:

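package houbank.hive_udf;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;
// Returns true when the input contains at least one Chinese (Han) character.
public class ishanzi extends UDF {
    public boolean evaluate(Text input) {
        if (input == null) {
            return false; // SQL NULL: nothing to inspect
        }
        String s = input.toString();
        // Text.getLength() and getBytes() both count bytes, so check code points instead (Java 7+).
        for (int i = 0; i < s.length(); i += Character.charCount(s.codePointAt(i))) {
            if (Character.UnicodeScript.of(s.codePointAt(i)) == Character.UnicodeScript.HAN) {
                return true;
            }
        }
        return false;
    }
}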
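
The detection logic can be tried without a Hive cluster. The following is a minimal standalone sketch, not part of the original post: the class name HanziCheck and its methods are hypothetical, and it assumes Java 8 or later for String.codePoints(). It prints the character count, the UTF-8 byte count (the quantity that Text.getLength() reports) and the result of a code-point-based Han check, which shows why comparing byte lengths alone cannot tell Chinese apart from other non-ASCII text such as accented Latin letters.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; it has no Hive or Hadoop dependency.
final class HanziCheck {
    // True when s contains at least one code point in the Unicode Han script.
    static boolean containsHan(String s) {
        if (s == null) {
            return false;
        }
        return s.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    // Number of bytes in the UTF-8 encoding of s.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "hello",               // ASCII only
            "\u5f20\u4e09",        // two Han characters
            "abc\u4e2d\u6587",     // ASCII followed by two Han characters
            "caf\u00e9"            // non-ASCII but not Chinese
        };
        for (String s : samples) {
            System.out.println(s + " chars=" + s.length()
                    + " utf8Bytes=" + utf8Length(s)
                    + " containsHan=" + containsHan(s));
        }
    }
}
```

The last sample is the interesting case: its UTF-8 byte count differs from its character count even though it contains no Chinese, so a check based on lengths would misclassify it.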


Export the project as a jar, here named ishanzi.jar.


Then run the following in Hive:

hive> add jar /data/ishanzi.jar;  -- the jar was uploaded to the local /data directory
hive> create temporary function ishanzi as 'houbank.hive_udf.ishanzi';
hive> select ishanzi(name) from people;  -- example query


Reposted from blog.csdn.net/qq_33004309/article/details/79793203