Configure and enable Hive remote connection
To configure Hive remote connections, first make sure HiveServer2 is started and listening on its configured port (10000 by default):
hive/bin/hiveserver2
Check if HiveServer2 is running
# lsof -i:10000
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 660 root 565u IPv6 89917 0t0 TCP *:ndmp (LISTEN)
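If lsof is not available, the same check can be done from Java by attempting a TCP connection to the port. This is a minimal sketch; the host and port are assumptions matching the default HiveServer2 setup:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    // Returns true if something is listening on host:port.
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing is listening.
            return false;
        }
    }

    public static void main(String[] args) {
        // 10000 is HiveServer2's default Thrift port (hive.server2.thrift.port).
        System.out.println(isListening("localhost", 10000, 2000)
                ? "HiveServer2 port is open"
                : "nothing listening on port 10000");
    }
}
```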
Remote connection to Hive with default authentication
When Hive runs in an environment integrated with Hadoop, HiveServer2 can delegate to Hadoop's user authentication mechanism and use the authenticated Hadoop user's identity for authentication and authorization.
In IDEA's Database tool window, add a Hive data source, fill in the HiveServer2 address, and enter the user name used in Hadoop.
Note: On first use, the configuration dialog will report that the JDBC driver is missing; follow the prompt to download it.
Click Test Connection.
The connection to Hive fails, and the HiveServer2 log reports:
WARN [HiveServer2-Handler-Pool: Thread-47] thrift.ThriftCLIService (ThriftCLIService.java:OpenSession(340)) - Error opening session:
org.apache.hive.service.cli.HiveSQLException: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: root is not allowed to impersonate root
at org.apache.hive.service.cli.session.SessionManager.createSession(SessionManager.java:434)
at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:373)
at org.apache.hive.service.cli.CLIService.openSessionWithImpersonation(CLIService.java:195)
at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:472)
at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:322)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1497)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1482)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Solution:
Add the following configuration to Hadoop/etc/hadoop/core-site.xml and distribute the file to every node.
Note: root refers to the user name the Hadoop components run as; change it to match your own configuration.
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
After restarting Hadoop and HiveServer2, run the connection test again; it should now succeed.
Remotely connect to Hive with a custom authentication class
By default, Hive does not enable user authentication, that is, the default user name and password are empty. For better security, you can require a user name and password by providing a custom authentication class.
Create a Java project and make sure the required dependencies, such as the Hive JDBC driver, are included:
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>3.1.3</version>
</dependency>
Note: the version should match the Hive JDBC version used by the server.
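Before going further, you can verify that the driver is actually on the classpath. This is a small, self-contained check; org.apache.hive.jdbc.HiveDriver is the JDBC driver class shipped in hive-jdbc:

```java
public class DriverCheck {
    // Returns true if the given class can be loaded from the classpath.
    static boolean classPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String driver = "org.apache.hive.jdbc.HiveDriver";
        System.out.println(classPresent(driver)
                ? "Hive JDBC driver found"
                : "hive-jdbc is missing from the classpath");
    }
}
```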
Create a class that implements the PasswdAuthenticationProvider interface.
package cn.ybzy.demo;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hive.service.auth.PasswdAuthenticationProvider;
import org.slf4j.Logger;
import javax.security.sasl.AuthenticationException;
public class MyHiveCustomPasswdAuthenticator implements PasswdAuthenticationProvider {
private Logger LOG = org.slf4j.LoggerFactory.getLogger(MyHiveCustomPasswdAuthenticator.class);
private static final String HIVE_JDBC_PASSWD_AUTH_PREFIX = "hive.jdbc_passwd.auth.%s";
private Configuration conf = null;
@Override
public void Authenticate(String userName, String passwd)
throws AuthenticationException {
LOG.info("Hive user: " + userName + " is attempting to log in");
String passwdConf = getConf().get(String.format(HIVE_JDBC_PASSWD_AUTH_PREFIX, userName));
if (passwdConf == null) {
String message = "No password configured for user: " + userName;
LOG.info(message);
throw new AuthenticationException(message);
}
if (!passwd.equals(passwdConf)) {
String message = "User name and password do not match for user: " + userName;
throw new AuthenticationException(message);
}
}
public Configuration getConf() {
if (conf == null) {
this.conf = new Configuration(new HiveConf());
}
return conf;
}
public void setConf(Configuration conf) {
this.conf = conf;
}
}
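The authentication logic above is a plain lookup: the expected password for user u is read from the configuration key hive.jdbc_passwd.auth.u. Stripped of the Hadoop and Hive dependencies, the check can be sketched and tested in isolation (the user/password pair here is just an illustration):

```java
import java.util.Properties;

public class PasswdLookupSketch {
    static final String KEY_PREFIX = "hive.jdbc_passwd.auth.%s";

    // Mirrors MyHiveCustomPasswdAuthenticator.Authenticate(), but reads from
    // a plain Properties object instead of a Hadoop Configuration.
    static boolean authenticate(Properties conf, String user, String passwd) {
        String expected = conf.getProperty(String.format(KEY_PREFIX, user));
        // Unknown user, or password mismatch: authentication fails.
        return expected != null && expected.equals(passwd);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("hive.jdbc_passwd.auth.hive", "hive123");
        System.out.println(authenticate(conf, "hive", "hive123")); // true
        System.out.println(authenticate(conf, "hive", "wrong"));   // false
    }
}
```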
Package the Java project into a jar and upload it to Hive's lib directory:
mv hive/MyHiveCustomPasswdAuthenticator.jar hive/lib/
Add the following configuration to hive-site.xml:
<!-- Use a custom user name and password for remote connections -->
<property>
<name>hive.server2.authentication</name>
<value>CUSTOM</value><!-- default is NONE; change it to CUSTOM -->
</property>
<!-- Specify the authentication class -->
<property>
<name>hive.server2.custom.authentication.class</name>
<value>cn.ybzy.demo.MyHiveCustomPasswdAuthenticator</value>
</property>
<!-- Set the user name and password: the suffix of the name property (here hive) is the user name, and the value is the password -->
<property>
<name>hive.jdbc_passwd.auth.hive</name>
<value>hive123</value>
</property>
Permissions issue
When connecting to Hive remotely from IDEA and operating on it, the following exception may occur:
ERROR --- [ HiveServer2-Background-Pool: Thread-440] org.apache.hadoop.hive.metastore.utils.MetaStoreUtils (line: 166) : Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=hive, access=WRITE, inode="/hive/warehouse":root:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1855)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1839)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1798)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:59)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3175)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1145)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:714)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:527)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1000)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2916)
Reason:
Operating on Hive also operates on HDFS, and the user logged in to Hive does not have permission on the underlying HDFS directories.
Solution:
Make sure the Hive user (e.g. hive) can operate on HDFS. Since Hive stores its warehouse data in a configured location such as /hive/warehouse, grant the corresponding permissions on that directory:
hadoop fs -chown hive:hive /hive/warehouse
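The denial can be read straight off the inode line in the log: /hive/warehouse is owned by root:supergroup with mode drwxr-xr-x, so user hive, being neither the owner nor in the group, gets only the "other" bits r-x and has no WRITE access. A pure-Java sketch of that check, with the names taken from the exception message:

```java
public class HdfsPermSketch {
    // Decides whether `user` may write, given a symbolic mode like "drwxr-xr-x"
    // and the directory's owner/group, mimicking HDFS's permission check.
    static boolean canWrite(String mode, String owner, String group,
                            String user, String userGroup) {
        int offset;                               // start of the relevant rwx triple
        if (user.equals(owner))           offset = 1;  // owner bits
        else if (userGroup.equals(group)) offset = 4;  // group bits
        else                              offset = 7;  // other bits
        return mode.charAt(offset + 1) == 'w';    // 'w' is the middle bit of a triple
    }

    public static void main(String[] args) {
        // Values from the AccessControlException: inode "/hive/warehouse"
        // owned by root:supergroup, mode drwxr-xr-x, accessed by user=hive.
        System.out.println(canWrite("drwxr-xr-x", "root", "supergroup", "hive", "hive")); // false
        System.out.println(canWrite("drwxr-xr-x", "root", "supergroup", "root", "root")); // true
    }
}
```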
Create a database in IDEA:
create database demo;
View HDFS: a demo.db directory now appears under /hive/warehouse.
Additional instructions
In addition to the methods above, Hive also supports Kerberos and LDAP authentication; these are more involved and are not covered here.
Also, in earlier Hive versions (2.x and earlier), the user name and password for remote connections could be configured as follows:
<property>
<name>hive.server2.authentication</name>
<value>PASSWORD</value>
</property>
<property>
<name>hive.server2.authentication.user.name</name>
<value>hive</value>
</property>
<property>
<name>hive.server2.authentication.user.password</name>
<value>hive123</value>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>node01</value>
</property>