Hive custom functions: UDF and TRANSFORM

First, put the JAR on Hive's classpath, for example in Hive's lib directory.

Write a class that extends UDF.

The class can define multiple evaluate methods. They are overloads, and Hive chooses between them according to the types of the arguments passed in.

Exporting the JAR from IntelliJ IDEA was cumbersome, so I switched to Eclipse.

In the Hive shell:

hive> add JAR xxx.jar;

hive> create temporary function functionName as 'fully qualified name of the UDF class inside the JAR';

hive> select num, functionName(num) from p;

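Processing JSON-format files

The evaluate method can parse each JSON line into a bean and fall back to the raw line if parsing fails:

    ObjectMapper om = new ObjectMapper();
    try {
        // Map the JSON line onto the bean and return its string form.
        MovierateBean bean = om.readValue(jsonline, MovierateBean.class);
        return bean.toString();
    } catch (Exception e) {
        // Parsing failed: return the input line unchanged.
        return jsonline;
    }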


Hive's UDFs and UDAFs have to be written in Java. Hive also provides another way to achieve the same goal that is simpler to use: TRANSFORM. TRANSFORM gives UDF-like functionality through scripts written in other languages.

Hive also provides two keywords, MAP and REDUCE. They are best understood as aliases of TRANSFORM: using MAP or REDUCE does not mean the script is called in the map phase or the reduce phase. See the description on the official website for details.
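
As a rough sketch of the mechanism: by default Hive feeds each row to the script on standard input as tab-separated column strings, with NULL written as \N, and reads tab-separated columns back from the script's standard output. The names transform_lines and add_length below are hypothetical, chosen only for illustration. They are not part of Hive.

```python
import sys

NULL = '\\N'  # Hive's default text marker for NULL in TRANSFORM streams


def transform_lines(lines, func):
    """Apply func to each input row and yield one tab-separated output row per input row."""
    for line in lines:
        columns = line.rstrip('\n').split('\t')
        # Decode the NULL marker so that func sees None instead of the text \N.
        values = [None if c == NULL else c for c in columns]
        result = func(values)
        # Encode None back to the NULL marker on the way out.
        yield '\t'.join(NULL if v is None else str(v) for v in result)


def add_length(values):
    """Hypothetical per-row logic: emit the first column and its length."""
    first = values[0]
    return [first, None if first is None else len(first)]


# In a real TRANSFORM script the rows come from sys.stdin:
# for row in transform_lines(sys.stdin, add_length):
#     print(row)
```

Keeping the row logic in a plain function like this makes it possible to try the script on a few sample lines before adding it to Hive.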

 

We can use the following Python script in place of the UDF above.

The content of the script /opt/movie_trans.py on the server is as follows:

import sys
import datetime
import json

# Hive sends one input row per line on stdin.
for line in sys.stdin:
    # Sample input:
    # line = '{"movie":"2797","rate":"4","timeStamp":"978302039","uid":"1"}'
    line = line.strip()
    hjson = json.loads(line)
    movie = hjson['movie']
    rate = hjson['rate']
    timeStamp = hjson['timeStamp']
    uid = hjson['uid']
    # Convert the epoch seconds to a local date and time.
    timeStamp = datetime.datetime.fromtimestamp(float(timeStamp))
    # Emit the output columns separated by tabs.
    print('\t'.join([movie, rate, str(timeStamp), uid]))
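
The script above stops with an exception on the first line that is not valid JSON or lacks one of the four keys. A more defensive variant, offered only as a sketch, moves the per-line work into a function that returns None for such lines. The function name convert_line and the choice to skip bad lines rather than fail are assumptions, not part of the original script.

```python
import datetime
import json


def convert_line(line):
    """Turn one JSON rating record into a tab-separated row, or None if it is malformed."""
    try:
        record = json.loads(line.strip())
        # Epoch seconds are converted to a local date and time, as in the script above.
        ts = datetime.datetime.fromtimestamp(float(record['timeStamp']))
        return '\t'.join([record['movie'], record['rate'], str(ts), record['uid']])
    except (ValueError, KeyError, TypeError, OverflowError, OSError):
        # Invalid JSON, a missing key, a non-string field, or an out-of-range timestamp.
        return None


# In the TRANSFORM script the main loop would then be:
# for line in sys.stdin:
#     row = convert_line(line)
#     if row is not None:
#         print(row)
```

Whether skipping bad records is acceptable depends on the data. Emitting a row of \N values instead would keep the row count unchanged.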

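Execute the following in Hive. The column passed to TRANSFORM must hold the raw JSON line, because the script parses every line it reads with json.loads: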

ADD FILE /opt/movie_trans.py;
 
SELECT
   TRANSFORM (rate)
   USING 'python movie_trans.py'
   AS (movie, rate, timeStamp, uid)
FROM rating;

