一. Requirements
- Users are strongly advised to start moving to Java 1.8 (see HIVE-8607).
- Hadoop 2.x (preferred); this article uses Hadoop 3.0.3.
- The instructions in this document are applicable to Linux and Mac.
二. Installing Hive from a Stable Release
Download page:
Unzip:
tar -xzvf hive-x.y.z.tar.gz
Set the HIVE_HOME environment variable (to make it permanent, add the export lines to /etc/profile):
$ cd hive-x.y.z
$ export HIVE_HOME=$(pwd)
$ export PATH=$HIVE_HOME/bin:$PATH
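The export lines above only affect the current shell session. A minimal sketch of persisting them follows; the helper name persist_hive_env and the choice of profile file are assumptions for illustration, not part of Hive. Writing to /etc/profile requires root, so a per-user file such as ~/.bashrc may be easier.

```shell
# Hypothetical helper: append the HIVE_HOME/PATH exports to a profile file once.
# $1 = Hive install directory, $2 = profile file (for example ~/.bashrc)
persist_hive_env() {
    hive_dir=$1
    profile=$2
    # Do nothing if an earlier run already added a HIVE_HOME line.
    if [ -f "$profile" ] && grep -q '^export HIVE_HOME=' "$profile"; then
        return 0
    fi
    {
        printf 'export HIVE_HOME=%s\n' "$hive_dir"
        # Single quotes keep $HIVE_HOME and $PATH unexpanded in the file.
        printf '%s\n' 'export PATH=$HIVE_HOME/bin:$PATH'
    } >> "$profile"
}
```

For example, running persist_hive_env "$(pwd)" ~/.bashrc from inside the unpacked Hive directory records the current path; open a new shell or source the file for it to take effect.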
三. Running Hive
Hive uses Hadoop, so:
- Set the Hadoop environment variables
- Create the HDFS directories:
Before you can create a table in Hive, use the HDFS commands below to create /tmp and /user/hive/warehouse (aka hive.metastore.warehouse.dir) and make both group-writable (chmod g+w).
$ $HADOOP_HOME/bin/hadoop fs -mkdir /tmp
$ $HADOOP_HOME/bin/hadoop fs -mkdir -p /user/hive/warehouse
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse
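The four commands above can also be wrapped in a small function so the step is repeatable. This is only a sketch: the function name prepare_hive_dirs and the HADOOP_CMD override are assumptions added for illustration. It uses -mkdir -p for both directories so that a rerun does not fail when a directory already exists.

```shell
# Hypothetical helper: create the HDFS directories Hive needs and make them
# group-writable. HADOOP_CMD (an assumed override) defaults to the hadoop
# binary under $HADOOP_HOME.
prepare_hive_dirs() {
    hadoop_cmd=${HADOOP_CMD:-$HADOOP_HOME/bin/hadoop}
    for dir in /tmp /user/hive/warehouse; do
        # -p creates missing parents and succeeds if the directory exists.
        "$hadoop_cmd" fs -mkdir -p "$dir" || return 1
        "$hadoop_cmd" fs -chmod g+w "$dir" || return 1
    done
}
```

Setting HADOOP_CMD=echo before calling the function prints the commands instead of running them, which is a cheap way to check what would be executed.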
四. Running Hive CLI
hive
Logging initialized using configuration in jar:file:/home/taiyi/apache-hive-3.1.3-bin/lib/hive-common-3.1.3.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
Because the metastore schema has not been initialized yet, the CLI cannot actually be used at this point.
五. Running HiveServer2 and Beeline
Starting from Hive 2.1, we need to run the schematool command below as an initialization step. For example, we can use “derby” as db type.
Note: In the author's own run, initializing the metastore with derby did not produce a usable setup, so MySQL is used as Hive's metastore database instead.
1. Download and install MySQL
Installing MySQL, creating a new user, and granting privileges to that user are omitted here.
2. Download the MySQL driver
Download the MySQL driver mysql-connector-java-8.0.11.jar and copy it into the $HIVE_HOME/lib folder.
3. Configure hive-site.xml
hive-site.xml is located in the $HIVE_HOME/conf folder.
By default this file does not exist; the default values of all parameters are listed in hive-default.xml.template.
You therefore need to create hive-site.xml by hand the first time. Then configure the metastore database:
<configuration>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
<description>
Setting this property to true will have HiveServer2 execute
Hive operations as the user making the calls to it.
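When set to true, the session runs operations with the privileges of the connecting user, which produced the impersonation error shown after this configuration.
When set to false, the HiveServer2 session runs queries as the user that runs the HiveServer2 service.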
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hostname:3306/hivedb?allowPublicKeyRetrieval=true&amp;createDatabaseIfNotExist=true&amp;useUnicode=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
<description>JDBC connect string for a JDBC metastore
allowPublicKeyRetrieval=true: allows the client to automatically retrieve the server's public key when connecting to the MySQL database.
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore
Note: the MySQL driver class must match the driver jar placed in $HIVE_HOME/lib.
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>username</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>password</value>
<description>password to use against metastore database</description>
</property>
</configuration>
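Because hive-site.xml is XML, a literal & inside a property value (such as the separators in the JDBC URL) has to be written as &amp;, otherwise the file fails to parse. A minimal sketch of escaping a value before pasting it into the file; the function name xml_escape is an assumption for illustration:

```shell
# Hypothetical helper: escape the characters that are special in XML text.
xml_escape() {
    # '&' is replaced first so the entities added afterwards are not
    # escaped a second time.
    printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}
```

For example, xml_escape 'jdbc:mysql://hostname:3306/hivedb?useUnicode=true&useSSL=false' prints the same URL with &amp; between the two parameters.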
Error related to hive.server2.enable.doAs:
Connecting to jdbc:hive2://localhost:10000
23/07/30 15:08:04 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000:
Failed to open new session: java.lang.RuntimeException:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: taiyi is not allowed to impersonate anonymous
(state=08S01,code=0)
Error that allowPublicKeyRetrieval=true prevents:
Public Key Retrieval is not allowed
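Be careful, though: allowPublicKeyRetrieval=true may let a malicious proxy obtain the plaintext password through a man-in-the-middle (MITM) attack, so it is disabled by default and must be enabled explicitly.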
4. Initialize the metastore database
schematool -dbType mysql -initSchema
5. Connect via beeline
HiveServer2 (introduced in Hive 0.11) has its own CLI called Beeline. HiveCLI is now deprecated in favor of Beeline, as it lacks the multi-user, security, and other capabilities of HiveServer2.
To run HiveServer2 and Beeline from shell:
Start the server:
nohup hive --service hiveserver2 >> /tmp/hiveserver2.log 2>&1 &
Start the metastore:
nohup hive --service metastore >> /tmp/hivemeta.log 2>&1 &
Connect with Beeline:
beeline -u jdbc:hive2://localhost:10000
If the connection to localhost fails, use the machine's actual hostname instead.
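HiveServer2 is started in the background and may need some time before it accepts connections on port 10000, so a Beeline connection attempted immediately can fail. A minimal sketch of a retry loop that can wrap a connectivity check; the function name retry and its argument order are assumptions for illustration:

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# $1 = maximum attempts, $2 = seconds to wait between attempts,
# remaining arguments = the command to run
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        i=$((i + 1))
        # Wait only if another attempt will follow.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

Assuming nc is installed, retry 30 2 nc -z localhost 10000 waits up to about a minute for the port to open, after which the beeline command can be run.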