HIVE UDF and JDBC programming

1. UDF

    UDFs (user-defined functions) extend the HIVE function library, allowing custom functionality to be implemented in java code.

1. Steps

    1. Create a new java project.

    2. Import the HIVE-related jar packages. They are in the lib directory of the HIVE installation; just copy them into the project.

    3. Create a class that inherits the UDF class.

    4. Write a method named evaluate. The return type and parameters are arbitrary, but the method name must be evaluate.

    To be processed by mapreduce, String values should be handled as Text.

    5. Package the written class into a jar. Only the class you wrote needs to be included; the imported jar resources can be excluded. Then upload the jar to linux.

    6. On the hive command line, register the UDF with hive: add jar /xxxx/xxxx.jar

    7. Give the UDF a name: create temporary function fname as 'fully qualified class name';

    Then the custom function can be used in hql; see the example after the code below.

2. Example

    Write a simple UDF that converts lowercase to uppercase.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class MyUDFDemo extends UDF {
	// The method must be named evaluate; Text is used so mapreduce can process the value.
	public Text evaluate(Text str) {
		return str == null ? null : new Text(str.toString().toUpperCase());
	}
}
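
    With the class above packaged into a jar and uploaded, registering and using it on the hive command line might look like the following sketch (the jar file name and function name are illustrative assumptions; the stu table matches the jdbc example below):

-- register the jar and name the function (illustrative path and name)
add jar /xxxx/myudf.jar;
create temporary function myupper as 'MyUDFDemo';
-- call the custom function in hql
select myupper(name) from stu;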

2. JDBC programming

1. Introduction

    Hive implements the jdbc interface, so it is very convenient to operate it from java code via jdbc.

2. Steps

1. Start the external service

    By default, HIVE does not serve external connections; the HiveServer2 service needs to be started on the server side. The command is as follows:

./hive --service hiveserver2

    The connection can only succeed while this service is running; otherwise it fails.

    You can use the following command to make the service run in the background:

[root@hadoop bin]# ./hive --service hiveserver2 &
[1] 6669
[root@hadoop bin]# bg 1
-bash: bg: job 1 already in background

    In this way, the process runs in the background without blocking other operations.
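
    To verify that the service is reachable before writing any code, one option is the beeline client in the same bin directory (a quick sketch; the host, port, and credentials are assumptions matching the jdbc example below):

[root@hadoop bin]# ./beeline -u jdbc:hive2://192.168.75.150:10000/park -n root -p root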

2. Java project

1> Create a project

    Create a local java project.

2> Import the jar package

    Import hive-jdbc-1.2.0-standalone.jar from the hive/lib directory.

    Import hadoop-common-2.7.1.jar from hadoop-2.7.1/share/hadoop/common.

3> Write jdbc code

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcDemo {
	public static void main(String[] args) {
		Connection conn = null;
		Statement st = null;
		ResultSet rs = null;
		try {
			// 1. Register the database driver
			Class.forName("org.apache.hive.jdbc.HiveDriver");
			// 2. Get the database connection
			conn = DriverManager.getConnection("jdbc:hive2://192.168.75.150:10000/park", "root", "root");
			// 3. Get the statement object
			st = conn.createStatement();
			// 4. Execute the sql and get the result set
			rs = st.executeQuery("select * from stu");
			// 5. Process the result set
			while (rs.next()) {
				String str = rs.getString("name");
				System.out.println(str);
			}
		} catch (Exception e) {
			e.printStackTrace();
		} finally {
			// 6. Close the resources
			if (rs != null) {
				try {
					rs.close();
				} catch (Exception e) {
					e.printStackTrace();
				} finally {
					rs = null;
				}
			}
			if (st != null) {
				try {
					st.close();
				} catch (Exception e) {
					e.printStackTrace();
				} finally {
					st = null;
				}
			}
			if (conn != null) {
				try {
					conn.close();
				} catch (Exception e) {
					e.printStackTrace();
				} finally {
					conn = null;
				}
			}
		}
	}
}

    Note the jdbc driver class (org.apache.hive.jdbc.HiveDriver) and the protocol in the connection address (jdbc:hive2://) used above.

 
