Hive/Impala/HBase/Spark Kerberos

There are a few things to pay attention to when using Hadoop Kerberos, to avoid recurring problems in a real environment:

1. I used to build clusters with IP addresses (although DNS resolution was in place as well), whereas sometimes a cluster is built directly on hostnames plus DNS resolution. In my testing the two behave differently: with IPs there is no problem at all, but with DNS resolution, enabling Kerberos always causes the odd issue. So if you rely on DNS resolution, I suggest writing the cluster's IPs and hostnames into /etc/hosts on every client host that connects to Kerberos; why this helps is hard to explain.

2. If your Kerberos uses strong encryption (AES-256 counts as strong), remember to replace the JCE policy JARs under your JDK, including on every client you use; otherwise you will keep failing to obtain credentials, because the supported encryption strengths simply do not match. A quick way to check this is sketched after this list.

3. If you use Spark standalone and the program needs to access a Kerberized HDFS, your Spark build should support your Hadoop/YARN version. I have not tested this in detail, but mismatched versions cause some problems.

4. If you are not using LDAP together with Kerberos, also create an OS user with the same name as the keytab user you created. That said, go by the actual program error: if it complains that the user cannot be found, create it; otherwise it should not be necessary.
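
For point 2, a quick way to verify whether the unlimited-strength JCE policy files are actually in effect on a given JVM is to ask for the maximum allowed AES key length. A minimal sketch (the class name is only for illustration):

import javax.crypto.Cipher;

public class JceCheck {
    public static void main(String[] args) throws Exception {
        // With the unlimited-strength policy installed this prints 2147483647 (Integer.MAX_VALUE);
        // with the default restricted policy it prints 128, and AES-256 Kerberos tickets will fail.
        System.out.println("Max allowed AES key length: " + Cipher.getMaxAllowedKeyLength("AES"));
    }
}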

After enabling Hadoop Kerberos, connecting to a component takes two extra steps compared with a non-Kerberos cluster: telling the program which host the KDC is, and telling the program who the caller is. In short, a few extra lines of code:

        System.setProperty("java.security.krb5.conf", "c:\\krb5.conf");
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "Kerberos");
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("[email protected]", "c:\\hive.keytab");

The code above is from my test on Windows, which is why krb5.conf is specified on the C: drive. On a Linux application server there is no need to specify it: just put it under /etc, and /etc/krb5.conf is picked up by default. The keytab, however, must always be specified; you can simply put it on the program's classpath (classpath:keytab).
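
If the keytab is shipped as a classpath resource instead of an absolute path, one way to hand it to loginUserFromKeytab (which expects a filesystem path) is sketched below; this assumes the resource is named hive.keytab and is an actual file on disk, not packed inside a jar:

        // Resolve the classpath resource to a real file path (resource name is illustrative)
        String keytabPath = Thread.currentThread().getContextClassLoader()
                .getResource("hive.keytab").getPath();
        UserGroupInformation.loginUserFromKeytab("[email protected]", keytabPath);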

Below is the corresponding Java code for connecting HIVE, HBASE, SPARK and IMPALA with Kerberos:

HIVE:

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

/**
 * Hive JDBC query against a Kerberos-enabled HiveServer2.
 */
public class App 
{
    public static void main( String[] args ) throws ClassNotFoundException, SQLException, IOException
    {
        // Plain Apache Hive JDBC driver (the commented Cloudera drivers work the same way)
        //String driver = "com.cloudera.impala.jdbc41.Driver";
        //String driver = "com.cloudera.hive.jdbc41.HS2Driver";
        String driver = "org.apache.hive.jdbc.HiveDriver";
        //String url = "jdbc:impala://10.40.2.103:21050/default;UseSasl=0;AuthMech=3;UID=impala;PWD=";
        // The principal in the URL is the HiveServer2 service principal, not the client principal
        String url = "jdbc:hive2://xxxx.xxxx.com:10000/default;principal=hive/[email protected]";
        Connection connection = null;
        Class.forName(driver);
        // Tell the JVM where the KDC configuration lives, then log in from the keytab
        System.setProperty("java.security.krb5.conf", "c:\\krb5.conf");
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "Kerberos");
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("[email protected]", "c:\\hive.keytab");
        // No username/password: the Kerberos ticket obtained above authenticates the connection
        connection = DriverManager.getConnection(url);
        String sql = "select count(*) from test.test";
        Statement statement = connection.createStatement();
        ResultSet resultSet = statement.executeQuery(sql);
        while(resultSet.next()) {
        	System.out.println(resultSet.getInt(1));
        }
        resultSet.close();
        statement.close();
        connection.close();
    }
}

IMPALA:

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

/**
 * Impala JDBC query against a Kerberos-enabled cluster.
 */
public class App 
{
    public static void main( String[] args ) throws ClassNotFoundException, SQLException, IOException
    {
        // Cloudera Impala JDBC driver
        String driver = "com.cloudera.impala.jdbc41.Driver";
        //String driver = "com.cloudera.hive.jdbc41.HS2Driver";
        //String driver = "org.apache.hive.jdbc.HiveDriver";
        //String url = "jdbc:impala://10.40.2.103:21050/default;UseSasl=0;AuthMech=3;UID=impala;PWD=";
        // See the note after this example about AuthMech and the Kerberos properties in the URL
        String url = "jdbc:impala://10.40.2.103:21050/test;UseSasl=0;AuthMech=3;UID=impala;PWD=;principal=hive/[email protected]";
        Connection connection = null;
        // Tell the JVM where the KDC configuration lives, then log in from the keytab
        System.setProperty("java.security.krb5.conf", "c:\\krb5.conf");
        Class.forName(driver);
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "Kerberos");
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("[email protected]", "c:\\hive.keytab");
        connection = DriverManager.getConnection(url);
        String sql = "select count(*) from hbase_test";
        Statement statement = connection.createStatement();
        ResultSet resultSet = statement.executeQuery(sql);
        while(resultSet.next()) {
        	System.out.println(resultSet.getInt(1));
        }
        resultSet.close();
        statement.close();
        connection.close();
    }
}
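
A note on the Impala URL above: in the Cloudera Impala JDBC driver, AuthMech=3 selects username/password authentication, while Kerberos is normally requested with AuthMech=1 plus the KrbRealm, KrbHostFQDN and KrbServiceName properties. If the mixed URL above does not authenticate in your environment, a Kerberos-style URL would look roughly like the following (realm and host are placeholders; verify the exact property names against your driver version):

        String url = "jdbc:impala://10.40.2.103:21050/test;AuthMech=1;KrbRealm=YOUR.REALM;KrbHostFQDN=impalad-host.example.com;KrbServiceName=impala";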

HBASE:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.security.UserGroupInformation;

public class App {
	
	public static void main(String[] args) throws IOException {
		String table = "mes:test";
		Configuration conf = HBaseConfiguration.create();
		// Only relevant for simple authentication; ignored once the Kerberos login below succeeds
		System.setProperty("HADOOP_USER_NAME", "hbase");
		conf.set("hbase.zookeeper.quorum", "tsczbddndev1.trinasolar.com");
		conf.set("hbase.zookeeper.property.clientPort", "2181");
		// Same switch as in the Hive/Impala examples (remaining HBase Kerberos settings are
		// assumed to come from hbase-site.xml on the classpath)
		conf.set("hadoop.security.authentication", "Kerberos");
		//conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
		//conf.setInt("hbase.client.operation.timeout", 10000);
		//conf.setInt("hbase.rpc.timeout", 6000);
		conf.setInt("hbase.client.retries.number", 3);
		// KDC configuration and keytab login, same as the other examples
		System.setProperty("java.security.krb5.conf", "resource/krb5.conf");
		UserGroupInformation.setConfiguration(conf);
		UserGroupInformation.loginUserFromKeytab("[email protected]", "resource/hive.keytab");
		HTable myTable = new HTable(conf, TableName.valueOf(table));
		Put put = new Put(Bytes.toBytes("CDH5.10.21"));
		put.add(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("this is a test"));
		myTable.put(put);
		myTable.flushCommits();
		System.out.println("put successful");
		myTable.close();
	}

}
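
The HTable / flushCommits() calls above are the older HBase client API that ships with CDH 5.x; on newer HBase 1.x/2.x clients the same write usually goes through ConnectionFactory and Table instead. A minimal sketch under that assumption, reusing the table, column and login details from the example above:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.security.UserGroupInformation;

public class NewApiPut {
	public static void main(String[] args) throws IOException {
		Configuration conf = HBaseConfiguration.create();
		conf.set("hbase.zookeeper.quorum", "tsczbddndev1.trinasolar.com");
		conf.set("hbase.zookeeper.property.clientPort", "2181");
		conf.set("hadoop.security.authentication", "Kerberos");
		System.setProperty("java.security.krb5.conf", "resource/krb5.conf");
		UserGroupInformation.setConfiguration(conf);
		UserGroupInformation.loginUserFromKeytab("[email protected]", "resource/hive.keytab");
		// Connection is heavyweight and thread-safe; Table handles are cheap and short-lived
		try (Connection connection = ConnectionFactory.createConnection(conf);
				Table table = connection.getTable(TableName.valueOf("mes:test"))) {
			Put put = new Put(Bytes.toBytes("CDH5.10.21"));
			put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("this is a test"));
			table.put(put); // flushed automatically; no flushCommits() in the new API
			System.out.println("put successful");
		}
	}
}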

Spark on YARN:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;


/**
 * Spark on YARN job reading from a Kerberos-enabled HDFS.
 */
public class App {
	public static void main(String[] args) throws IOException {
		// Only relevant for simple authentication; ignored once the Kerberos login below succeeds
		System.setProperty("HADOOP_USER_NAME", "hdfs");
		SparkConf sparkConf = new SparkConf().setAppName("JavaWordCount");
		sparkConf.setMaster("yarn-client");
		// sparkConf.set("spark.submit.deployMode", "client");
		//sparkConf.set("spark.yarn.jar", "hdfs:///tmp/spark-assembly_2.10-1.6.0-cdh5.10.2.jar");
		// Add the CDH MapReduce jars to the ApplicationMaster classpath
		sparkConf.set("spark.yarn.appMasterEnv.CLASSPATH",
				"$CLASSPATH:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/*");
		// System.setProperty("sun.security.krb5.debug", "true");
		// KDC configuration and keytab login, same as the other examples
		System.setProperty("java.security.krb5.conf", "c:\\krb5.conf");
		Configuration conf = new Configuration();
		// conf.set("hadoop.security.authentication", "Kerberos");
		// sparkConf.set("spark.security.credentials.hdfs.enabled", "true");
		UserGroupInformation.setConfiguration(conf);
		UserGroupInformation.loginUserFromKeytab("[email protected]", "c:\\hive.keytab");
		// Simple sanity check: read a file from the Kerberized HDFS and count its lines
		JavaSparkContext ctx = new JavaSparkContext(sparkConf);
		JavaRDD<String> lines = ctx.textFile("/tmp/a.sql");
		System.out.println(lines.count());
		ctx.close();
		
	}
}
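
For longer-running jobs, or when the job is launched with spark-submit instead of embedding the driver like this, Spark on YARN can also be handed the keytab directly (spark.yarn.principal / spark.yarn.keytab, or the --principal / --keytab options of spark-submit), so that YARN distributes and renews the tickets rather than relying on the UGI login in the driver. A minimal sketch, assuming Spark 1.5+ on YARN and reusing the paths from the example above:

		SparkConf sparkConf = new SparkConf().setAppName("JavaWordCount");
		sparkConf.setMaster("yarn-client");
		// Hand the keytab to Spark/YARN so the AM and executors obtain and renew their own tickets
		sparkConf.set("spark.yarn.principal", "[email protected]");
		sparkConf.set("spark.yarn.keytab", "c:\\hive.keytab");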
