Prerequisites for Kerberos Authentication When Connecting to Hive from Java

Simply put

Java code that connects to Hive needs to authenticate because Hive is a distributed data processing system that is typically deployed in a multi-user environment, where access to data is controlled by policies and permissions.

Therefore, to ensure the security and integrity of the data, it is necessary to authenticate users before granting them access to the data.

The code snippet below uses Kerberos as the authentication mechanism. Kerberos is a widely used network authentication protocol that provides strong authentication and supports secure communication over insecure networks.

The code sets the appropriate configuration properties for Kerberos authentication and uses the UserGroupInformation class to log in the user from the keytab file, which contains the user’s credentials.

Kerberos authentication lets the cluster verify who the client is. Combined with the access policies and permissions mentioned above, this helps to prevent unauthorized access and data breaches.

Java code for connecting to Hive

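        // Requires: java.io.IOException, org.apache.hadoop.fs.CommonConfigurationKeys,
        // org.apache.hadoop.security.UserGroupInformation
        // KRB5_CONF is a constant defined elsewhere; it is assumed to hold the
        // standard JVM property name "java.security.krb5.conf".
        System.setProperty(KRB5_CONF, dataSource.getKrbConfigFile());

        // Switch the Hadoop client to Kerberos authentication.
        org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
        conf.set(CommonConfigurationKeys.HADOOP_SECURITY_AUTHORIZATION, "true");
        conf.set(CommonConfigurationKeys.HADOOP_SECURITY_AUTHENTICATION, "kerberos");
        UserGroupInformation.setConfiguration(conf);

        try {
            // Log in the principal with the keys stored in its keytab file.
            UserGroupInformation.loginUserFromKeytab(dataSource.getKrbUser(), dataSource.getKeytabFile());
        } catch (IOException e) {
            // Surface a failed login as an unchecked exception.
            throw new RuntimeException(e);
        }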
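A wrong path to the Kerberos configuration file or the keytab only shows up once the login call runs, so it can help to validate the inputs first. The following is a minimal sketch, assuming a hypothetical helper class KerberosPreflight that is not part of the original code. It only inspects local files and never contacts the KDC.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (an assumption for illustration, not from the original snippet). */
final class KerberosPreflight {

    private KerberosPreflight() {}

    /**
     * Returns a list of human-readable problems found in the login inputs.
     * An empty list means the inputs look usable; it does not prove the login will succeed.
     */
    static List<String> check(String krbConfigFile, String principal, String keytabFile) {
        List<String> problems = new ArrayList<>();
        checkReadableFile("krb5 config", krbConfigFile, problems);
        checkReadableFile("keytab", keytabFile, problems);
        if (principal == null || principal.trim().isEmpty()) {
            problems.add("principal is empty");
        }
        return problems;
    }

    // Records a problem if the path is missing, is not a regular file, or cannot be read.
    private static void checkReadableFile(String label, String file, List<String> problems) {
        if (file == null || file.trim().isEmpty()) {
            problems.add(label + " path is empty");
            return;
        }
        Path path = Paths.get(file);
        if (!Files.isRegularFile(path)) {
            problems.add(label + " file not found: " + path);
        } else if (!Files.isReadable(path)) {
            problems.add(label + " file is not readable: " + path);
        }
    }
}
```

A caller could run KerberosPreflight.check(dataSource.getKrbConfigFile(), dataSource.getKrbUser(), dataSource.getKeytabFile()) before the login code and fail fast when the returned list is not empty.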

Explanation of the code

  1. System.setProperty sets the system property named by the KRB5_CONF constant (presumably java.security.krb5.conf, the standard JVM property for this purpose) to the path of the Kerberos configuration file returned by dataSource.getKrbConfigFile(). This file contains the settings the Kerberos client needs, such as the default realm and the address of each realm's KDC (Key Distribution Center), the service that issues ticket-granting tickets (TGTs) and service tickets.
  2. Next, a new org.apache.hadoop.conf.Configuration object is created to hold Hadoop configuration properties.
  3. conf.set sets CommonConfigurationKeys.HADOOP_SECURITY_AUTHORIZATION to true and CommonConfigurationKeys.HADOOP_SECURITY_AUTHENTICATION to kerberos. The second setting is the one that makes the Hadoop client authenticate with Kerberos instead of the default simple mode. The first corresponds to hadoop.security.authorization, which turns on service-level authorization.
  4. The UserGroupInformation.setConfiguration method is used to set the global configuration for all UserGroupInformation instances to the newly created configuration object.
  5. Finally, UserGroupInformation.loginUserFromKeytab logs in the principal returned by dataSource.getKrbUser() using the keytab file returned by dataSource.getKeytabFile(). A keytab is a file that stores the long-term secret keys of one or more principals, so the application can authenticate without an interactive password prompt. The method requests a Kerberos ticket-granting ticket (TGT) for the principal from the KDC and keeps it in memory with the login user for later use by Hadoop. If the login fails, an IOException is thrown, which the snippet rethrows as a RuntimeException.

Overall, this code sets the configuration properties that Hadoop needs for Kerberos authentication and logs in the principal whose keys are in the keytab file, allowing the application to securely access Hadoop resources. It is typically used with secure Hadoop clusters where Kerberos is the authentication mechanism.
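The snippet stops at the login, so as an illustration of the step that usually follows, here is a hedged sketch of building a HiveServer2 JDBC URL that carries the server's Kerberos principal. The class HiveKerberosUrl, the host name, and the principal are hypothetical. Actually opening the connection assumes the Hive JDBC driver is on the classpath and that the keytab login has already succeeded.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper (an assumption for illustration, not from the original snippet). */
final class HiveKerberosUrl {

    private HiveKerberosUrl() {}

    /**
     * Builds a URL of the form jdbc:hive2://host:port/database;principal=servicePrincipal.
     * servicePrincipal is the HiveServer2 service principal, not the client principal from the keytab.
     */
    static String build(String host, int port, String database, String servicePrincipal) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host is required");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        if (database == null || database.isEmpty()) {
            throw new IllegalArgumentException("database is required");
        }
        if (servicePrincipal == null || servicePrincipal.isEmpty()) {
            throw new IllegalArgumentException("servicePrincipal is required");
        }
        return "jdbc:hive2://" + host + ":" + port + "/" + database
                + ";principal=" + servicePrincipal;
    }

    /** Opens the connection; call this only after the Kerberos login has succeeded. */
    static Connection open(String url) throws SQLException {
        return DriverManager.getConnection(url);
    }
}
```

For example, HiveKerberosUrl.build("hs2.example.com", 10000, "default", "hive/hs2.example.com@EXAMPLE.COM") produces a URL for a HiveServer2 instance on its default port 10000. The example values are placeholders to be replaced with those of the actual cluster.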

Reposted from blog.csdn.net/weixin_38233104/article/details/130955179