Our business logic needs MD5 hashes, but Impala has no built-in MD5 function, so we implemented one as a UDF. The implementation process is described below.
UDF implementation points:
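- Import the version of hive-exec.jar that matches the Hive version running in the cluster
- The custom UDF class must extend the UDF class (org.apache.hadoop.hive.ql.exec.UDF, which is a class, not an interface)
- Implement the evaluate() method, which Hive and Impala call for each input value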
Maven dependency:
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>1.1.0</version>
</dependency>
Source code:
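package com.business.bi.udf; // must match the symbol used when registering the function

import org.apache.hadoop.hive.ql.exec.UDF;

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class MD5 extends UDF {

    // Returns the lowercase hex MD5 digest of the input, or null for a null input.
    // Declared as an instance method, which is the usual convention for Hive UDFs.
    public String evaluate(String value) {
        if (value == null) {
            return null; // avoid a NullPointerException on SQL NULL
        }
        StringBuilder sb = new StringBuilder();
        try {
            MessageDigest messageDigest = MessageDigest.getInstance("MD5");
            // Fix the charset so the result does not depend on the JVM default
            byte[] bytes = messageDigest.digest(value.getBytes(StandardCharsets.UTF_8));
            for (byte b : bytes) {
                int tempInt = b & 0xff; // treat the byte as unsigned
                if (tempInt < 16) {
                    sb.append('0'); // pad single-digit hex values to two characters
                }
                sb.append(Integer.toHexString(tempInt));
            }
        } catch (NoSuchAlgorithmException e) {
            // Fail loudly instead of silently returning a partial result
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }

    // Quick local check, independent of Hive and Impala
    public static void main(String[] args) {
        String hello = "123456789";
        System.out.println("MD5 hash: " + new MD5().evaluate(hello));
    }
}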
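The hex-encoding loop in evaluate() relies on two details that are easy to get wrong: masking each byte with 0xff, and padding values below 16 with a leading zero. The following is a minimal standalone sketch of just that step, using only the JDK. The class name HexEncodingDemo and the helper toHex are hypothetical names chosen for illustration and are not part of the UDF.

```java
class HexEncodingDemo {

    // Hypothetical helper: encodes bytes as lowercase hex, two characters per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            int unsigned = b & 0xff;      // negative bytes become 128..255
            if (unsigned < 16) {
                sb.append('0');           // Integer.toHexString omits the leading zero
            }
            sb.append(Integer.toHexString(unsigned));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {0x0a, (byte) 0xff, 0x00};
        // Without the mask, the byte is sign-extended to an int first
        System.out.println(Integer.toHexString(sample[1])); // ffffffff
        // With mask and padding, every byte yields exactly two characters
        System.out.println(toHex(sample));                  // 0aff00
    }
}
```

Without the padding, a digest containing a byte such as 0x0a would come out shorter than 32 characters and would no longer match the output of other MD5 tools.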
Build the JAR: mvn package
Upload it to HDFS: hdfs dfs -copyFromLocal ./MyHiveUDF.jar /user/impala/user_function/
Register the function in Impala: run the following statement in Hue's Impala query editor (or in impala-shell)
create function md5(string) returns string location 'hdfs://nameservice/user/impala/user_function/MyHiveUDF.jar' symbol='com.business.bi.udf.MD5';
Test:
select md5('123456789');
The output is: 25f9e794323b453885f5181f1b624d0b
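The expected value can also be checked off-cluster before deploying the JAR. The sketch below assumes only the JDK, with no Hive or Impala dependency, and computes the same digest in a different way (BigInteger formatting instead of the manual loop). The class name Md5Check and the method md5Hex are hypothetical names used for illustration.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5Check {

    // Hypothetical helper: lowercase hex MD5 digest of a UTF-8 string.
    static String md5Hex(String value) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
        // Interpret the 16 bytes as a non-negative number and
        // left-pad with zeros to the full 32 hex characters
        return String.format("%032x", new BigInteger(1, digest));
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println(md5Hex("123456789")); // 25f9e794323b453885f5181f1b624d0b
        System.out.println(md5Hex(""));          // d41d8cd98f00b204e9800998ecf8427e
    }
}
```

If the value returned by the Impala query differs from this local result, a likely cause is a character-set mismatch between the JVM running the UDF and the local machine, which matters only for non-ASCII input.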
Because the class is an ordinary Hive UDF, the same JAR can also be used in Hive once the function has been registered there.
Reference: http://th7.cn/Program/java/201709/1257880.shtml