Configure and enable Hive remote connection

Hive remote connection

To configure Hive remote connections, first make sure that HiveServer2 is started and listening on its configured port (10000 by default):

hive/bin/hiveserver2

Check that HiveServer2 is running. In the lsof output, port 10000 is shown under its registered service name, ndmp:

# lsof -i:10000
COMMAND PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    660 root  565u  IPv6  89917      0t0  TCP *:ndmp (LISTEN)
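
The same check can be made from a client machine, which also confirms that the port is reachable over the network and not only open locally. The following is a minimal sketch using only the Java standard library; the class name PortCheck, the default host localhost, and the 2-second timeout are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical defaults; pass the HiveServer2 host and port as arguments
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 10000;
        boolean up = isListening(host, port, 2000);
        System.out.println(host + ":" + port + (up ? " is accepting connections" : " is not reachable"));
    }
}
```

A successful TCP connection only shows that something is listening on the port; it does not prove that the listener is HiveServer2 or that a session can be opened.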

Remote connection to Hive with the default configuration

If Hive runs in an environment integrated with Hadoop, HiveServer2 can hook into Hadoop's user authentication mechanism and use the Hadoop user's credentials for authentication and authorization.

In the Database tool window of IDEA, add a Hive connection, then fill in the Hive address and the user name used in Hadoop.

Note: On first use, IDEA reports that the JDBC driver is missing. Follow the prompt to download it.


Click Test Connection. The connection to Hive fails, and the HiveServer2 log shows:

 WARN  [HiveServer2-Handler-Pool: Thread-47] thrift.ThriftCLIService (ThriftCLIService.java:OpenSession(340)) - Error opening session:
org.apache.hive.service.cli.HiveSQLException: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: root is not allowed to impersonate root
        at org.apache.hive.service.cli.session.SessionManager.createSession(SessionManager.java:434)
        at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:373)
        at org.apache.hive.service.cli.CLIService.openSessionWithImpersonation(CLIService.java:195)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:472)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:322)
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1497)
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1482)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)

Solution:

Add the following configuration to the Hadoop/etc/hadoop/core-site.xml file and distribute it to every node.

Note: root in the property names is the user that the Hadoop components run as; change it to match your own setup. The value * allows impersonation from any host and for any group, which is convenient for testing but broader than most production setups need.

    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
    </property>

Restart Hadoop and HiveServer2, then run the connection test again.


Remotely connect to Hive with a custom authentication class

By default, Hive does not enable user authentication (hive.server2.authentication defaults to NONE), so no password is checked when a client connects. For security, you can require a user name and password by writing a custom authentication class.

Create a Java project and make sure it includes the required dependencies, such as the Hive JDBC driver:

        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>3.1.3</version>
        </dependency>

Note: The version should match the Hive version running on the server.

Create a class that implements the PasswdAuthenticationProvider interface.

package cn.ybzy.demo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hive.service.auth.PasswdAuthenticationProvider;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.security.sasl.AuthenticationException;

/**
 * Checks the supplied password against the hive.jdbc_passwd.auth.<user> property.
 */
public class MyHiveCustomPasswdAuthenticator implements PasswdAuthenticationProvider {

    private static final Logger LOG = LoggerFactory.getLogger(MyHiveCustomPasswdAuthenticator.class);

    // Property name pattern; %s is replaced with the user name
    private static final String HIVE_JDBC_PASSWD_AUTH_PREFIX = "hive.jdbc_passwd.auth.%s";

    private Configuration conf = null;

    @Override
    public void Authenticate(String userName, String passwd)
            throws AuthenticationException {
        LOG.info("Hive user " + userName + " is attempting to log in");
        String passwdConf = getConf().get(String.format(HIVE_JDBC_PASSWD_AUTH_PREFIX, userName));
        if (passwdConf == null) {
            String message = "No password configured for user: " + userName;
            LOG.info(message);
            throw new AuthenticationException(message);
        }
        // Call equals on passwdConf so that a null password is rejected instead of causing an NPE
        if (!passwdConf.equals(passwd)) {
            String message = "User name and password do not match, user: " + userName;
            throw new AuthenticationException(message);
        }
    }

    public Configuration getConf() {
        if (conf == null) {
            // Lazily load the Hive configuration (including hive-site.xml) from the classpath
            this.conf = new Configuration(new HiveConf());
        }
        return conf;
    }

    public void setConf(Configuration conf) {
        this.conf = conf;
    }
}
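
To make the lookup rule easier to try out, here is a minimal standalone sketch of the same check that runs without Hive on the classpath. It is an illustration only, not the class deployed to HiveServer2: java.util.Properties stands in for the Hadoop Configuration, the class name PasswdLookupSketch is hypothetical, and MessageDigest.isEqual is swapped in for String.equals as an optional hardening so that the comparison time does not depend on where the two passwords differ. The sample user hive and password hive123 are the values this article configures in hive-site.xml.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Properties;

public class PasswdLookupSketch {

    // Same property name pattern as the authenticator class above
    private static final String KEY_FORMAT = "hive.jdbc_passwd.auth.%s";

    /** Returns true only if a password is configured for userName and equals passwd. */
    public static boolean isAccepted(Properties conf, String userName, String passwd) {
        String expected = conf.getProperty(String.format(KEY_FORMAT, userName));
        if (expected == null || passwd == null) {
            return false; // unknown user or missing password
        }
        // Comparison whose duration does not depend on the position of the first mismatch
        return MessageDigest.isEqual(
                passwd.getBytes(StandardCharsets.UTF_8),
                expected.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("hive.jdbc_passwd.auth.hive", "hive123");

        System.out.println(isAccepted(conf, "hive", "hive123")); // true
        System.out.println(isAccepted(conf, "hive", "wrong"));   // false
        System.out.println(isAccepted(conf, "root", "hive123")); // false: no entry for root
    }
}
```

The last case shows the consequence of the naming scheme: a user without a matching hive.jdbc_passwd.auth.<user> entry is always rejected, whatever password is supplied.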

Package the Java project as a JAR and place it in Hive's lib directory:

mv hive/MyHiveCustomPasswdAuthenticator.jar hive/lib/

Modify hive-site.xml as follows, then restart HiveServer2 so that the changes take effect:

<!-- Use a custom user name and password for remote connections -->
<property>
	<name>hive.server2.authentication</name>
	<value>CUSTOM</value><!-- The default is NONE; change it to CUSTOM -->
</property>
<!-- The class that performs the authentication -->
<property>
	<name>hive.server2.custom.authentication.class</name>
	<value>cn.ybzy.demo.MyHiveCustomPasswdAuthenticator</value>
</property>
<!-- User name and password: the suffix of the name (here hive) is the user name, and the value is the password -->
<property>
	<name>hive.jdbc_passwd.auth.hive</name>
	<value>hive123</value>
</property>
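
The new credentials can also be verified from code instead of from IDEA. The following is a minimal sketch using the standard java.sql API, assuming the hive-jdbc dependency shown earlier is on the classpath and registers its driver automatically. The class name HiveJdbcCheck and the host node01 are placeholders chosen for illustration; the user hive and password hive123 are the values configured above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJdbcCheck {

    /** Builds a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database. */
    public static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Placeholder host; replace with the address of your HiveServer2 instance
        String url = jdbcUrl("node01", 10000, "default");

        try (Connection conn = DriverManager.getConnection(url, "hive", "hive123");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("show databases")) {
            // Print one database name per line
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        } catch (SQLException e) {
            // A wrong password surfaces here as a connection failure
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```

Running it once with the correct password and once with a wrong one is a quick way to confirm that the custom authentication class is actually being used.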


Permissions issue

When operating Hive over a remote connection from IDEA, the following exception may occur:

ERROR --- [           HiveServer2-Background-Pool: Thread-440]  org.apache.hadoop.hive.metastore.utils.MetaStoreUtils                           (line:  166)  :  Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=hive, access=WRITE, inode="/hive/warehouse":root:supergroup:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1855)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1839)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1798)
        at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:59)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3175)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1145)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:714)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:527)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1000)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2916)

Reason:

Hive operations read and write HDFS, and the user logged in to Hive (hive) has no write permission on the warehouse directory, which is owned by root:supergroup with mode drwxr-xr-x.
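
The denial follows from the owner/group/other rule that HDFS applies to permission checks. The sketch below is a simplified model of that rule for illustration only, not Hadoop code: it ignores the superuser bypass and ACLs, the class name HdfsPermissionSketch is hypothetical, and it assumes that the hive user belongs only to a group named hive and not to supergroup.

```java
import java.util.Collections;
import java.util.Set;

public class HdfsPermissionSketch {

    /**
     * Simplified write check. mode is the 9-character permission string
     * without the leading type flag, for example "rwxr-xr-x".
     */
    public static boolean canWrite(String user, Set<String> userGroups,
                                   String owner, String group, String mode) {
        int offset;
        if (user.equals(owner)) {
            offset = 0;              // owner bits apply
        } else if (userGroups.contains(group)) {
            offset = 3;              // group bits apply
        } else {
            offset = 6;              // other bits apply
        }
        return mode.charAt(offset + 1) == 'w';
    }

    public static void main(String[] args) {
        Set<String> hiveGroups = Collections.singleton("hive"); // assumed group membership

        // Directory from the log: root:supergroup drwxr-xr-x, so hive falls under "other" (r-x)
        System.out.println(canWrite("hive", hiveGroups, "root", "supergroup", "rwxr-xr-x")); // false

        // Same mode once the directory is owned by hive:hive, so the owner bits (rwx) apply
        System.out.println(canWrite("hive", hiveGroups, "hive", "hive", "rwxr-xr-x")); // true
    }
}
```

The second call shows why changing the owner of the directory is enough to resolve the error without loosening the mode.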

Solution:

Make sure that the Hive user (here hive) is allowed to operate on HDFS.

Hive stores its data under the configured warehouse directory (hive.metastore.warehouse.dir), for example /hive/warehouse, so give the Hive user ownership of that directory:

hadoop fs -chown hive:hive /hive/warehouse

Create a database in IDEA:

create database demo;

Then check HDFS: the new database should appear as a directory under /hive/warehouse.

Additional instructions

In addition to the methods above, Hive also supports more advanced authentication mechanisms such as Kerberos and LDAP. They are more involved to set up and are not covered here.

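In addition, in earlier versions of Hive (2.x and earlier), the user name and password for remote connections can reportedly be configured as follows. PASSWORD is not among the values listed for hive.server2.authentication in the official Hive configuration documentation (NONE, NOSASL, KERBEROS, LDAP, PAM, CUSTOM), so verify this against your version before relying on it: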

<property>
  <name>hive.server2.authentication</name>
  <value>PASSWORD</value>
</property>
<property>
  <name>hive.server2.authentication.user.name</name>
  <value>hive</value>
</property>
<property>
  <name>hive.server2.authentication.user.password</name>
  <value>hive123</value>
</property>
<property>
  <name>hive.cli.print.current.db</name>
  <value>true</value>
</property>
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
</property>
<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>node01</value>
</property>

Origin blog.csdn.net/qq_38628046/article/details/132177600