This post covers only how to install Hive, nothing else.
Hive dependencies
Before installing Hive, the following must be in place:
1. A reachable relational database, such as MySQL or PostgreSQL, for storing metadata
2. Hadoop, with HDFS started
3. HBase (optional); if it is not installed Hive prints a warning, but this does not affect use
4. Java 1.8 or later
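The prerequisites above can be checked quickly from the shell. This is only a minimal sketch: the helper name java_major is made up for illustration, and it assumes version strings in the usual form such as 1.8.0_232 or 11.0.2.

```shell
# Hypothetical helper: turn a Java version string into its major version.
# Legacy strings start with "1." (1.8.0_232 is Java 8); newer ones do not.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

# Report which prerequisite commands are on PATH (HBase is optional).
for cmd in java hadoop hdfs hbase; do
    if command -v "$cmd" >/dev/null 2>&1; then
        echo "found:   $cmd"
    else
        echo "missing: $cmd"
    fi
done
```

A result of 8 or higher from java_major satisfies the Java requirement.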
Preparation
1. Download the installation package
https://mirrors.tuna.tsinghua.edu.cn/apache/hive/ Tsinghua mirror, fast download
http://apache.org/dist/hive/ official site, slow download
Choose the tar package whose name contains bin; this post installs hive-2.3.6
2. Upload it to the server
It is best to upload it to the Hadoop master, which is what I did; there is no need to upload it to every node
3. Extract it; you may also rename the extracted directory to make it easier to work with
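The extract-and-rename step might look like the sketch below. The function name unpack_hive and the paths in the example are placeholders, and it assumes a gzip-compressed tarball with a single top-level directory.

```shell
# Hypothetical helper: unpack a Hive tarball and give the folder a short name.
# $1 = tarball, $2 = parent directory, $3 = new folder name
unpack_hive() {
    tarball=$1; dest=$2; name=$3
    mkdir -p "$dest" || return 1
    # The first entry of the archive listing gives the top-level folder.
    top=$(tar -tzf "$tarball" | sed -n '1s|/.*||p')
    [ -n "$top" ] || return 1
    tar -xzf "$tarball" -C "$dest" || return 1
    # Rename only when the unpacked folder has a different name.
    if [ "$top" != "$name" ]; then
        mv "$dest/$top" "$dest/$name"
    fi
}

# Example (file name and target directory are assumptions):
# unpack_hive apache-hive-2.3.6-bin.tar.gz /opt hive-2.3.6
```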
Environment Variables
export HIVE_HOME=/opt/SoftWare/Hive/hive-2.3.2
export PATH=$PATH:$HIVE_HOME/bin
Adjust the path to match your own installation
At this point Hive is installed; run hive --version to check the version
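To make the two variables permanent they are usually appended to a profile file. The sketch below does that only once per file; the helper name add_hive_env and the choice of profile file are assumptions, not something this post prescribes.

```shell
# Hypothetical helper: append the Hive variables to a profile file once.
# $1 = profile file (for example ~/.bashrc), $2 = Hive install directory
add_hive_env() {
    profile=$1; home=$2
    # Do nothing if HIVE_HOME is already exported in this file.
    if grep -q '^export HIVE_HOME=' "$profile" 2>/dev/null; then
        return 0
    fi
    {
        echo "export HIVE_HOME=$home"
        echo 'export PATH=$PATH:$HIVE_HOME/bin'
    } >> "$profile"
}

# Example (paths are placeholders):
# add_hive_env ~/.bashrc /opt/SoftWare/Hive/hive-2.3.2 && . ~/.bashrc
```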
Configuration
1. First modify hive-env.sh. The file does not exist by default, so create it from the template
cp hive-env.sh.template hive-env.sh
Add the following
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
export HADOOP_HOME=/usr/lib/hadoop-2.6.5
export HIVE_HOME=/usr/lib/hive2.3.6
export SPARK_HOME=/usr/lib/spark
2. Modify hive-site.xml. This file does not exist by default either
cp hive-default.xml.template hive-site.xml
Note that hive-default.xml.template should be copied twice, once as hive-default.xml and once as hive-site.xml. hive-site.xml holds the user-defined configuration, while hive-default.xml holds the global defaults.
When Hive starts, the custom settings in hive-site.xml override the same items in the global hive-default.xml.
Copying all of hive-default.xml into hive-site.xml and editing it there is strongly discouraged, because it becomes impossible to remember which items were changed. Since hive-site.xml only overrides the defaults, it is enough to put just the items you need to change into hive-site.xml. That trimming step is not shown in this post.
[The changes below are the basic configuration needed to get Hive running for the first time; special scenarios may need additional settings]
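One way to end up with a hive-site.xml that holds only the overridden items is to generate it. This is a rough sketch, not the procedure used in this post: the helper name write_min_site is made up, and it assumes the values contain no XML special characters such as & or <.

```shell
# Hypothetical helper: write a hive-site.xml that holds only the overrides.
# $1 = output file, $2 = JDBC URL, $3 = driver class, $4 = user, $5 = password
write_min_site() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>$2</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>$3</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>$4</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>$5</value>
  </property>
</configuration>
EOF
}

# Example, using the connection values from this post:
# write_min_site "$HIVE_HOME/conf/hive-site.xml" \
#     jdbc:postgresql://172.16.89.80:5432/ball org.postgresql.Driver u_ccx_hive 123456
```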
The main setting is the database connection. By default Hive uses an embedded Derby database as its metastore; here it is changed to PostgreSQL
<!-- Database configuration -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <!-- Database connection address; keep only one value -->
  <!-- Using MySQL to store the metadata:
       <value>jdbc:mysql://192.168.100.103:3306/ccx_hive?createDatabaseIfNotExist=true</value> -->
  <!-- Using PostgreSQL to store the metadata:
       <value>jdbc:postgresql://172.16.89.80:5432/db?ssl=true;databaseName=metastore_db;create=true</value> -->
  <value>jdbc:postgresql://172.16.89.80:5432/ball</value>
  <!-- The ball database is best created in advance; other options such as create=true may also work, check for yourself -->
  <description>
    JDBC connect string for a JDBC metastore.
    To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
    For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
  </description>
</property>
<property>
  <!-- Database driver -->
  <name>javax.jdo.option.ConnectionDriverName</name>
  <!-- MySQL: <value>com.mysql.jdbc.Driver</value> -->
  <value>org.postgresql.Driver</value> <!-- PostgreSQL -->
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <!-- Database username -->
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>u_ccx_hive</value>
</property>
<property>
  <!-- Database password -->
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
</property>
<!-- Hive execution engine -->
<property>
  <name>hive.execution.engine</name>
  <value>mr</value> <!-- MapReduce as the engine -->
  <!-- Spark as the engine: <value>spark</value> -->
  <description>
    Expects one of [mr, tez, spark].
    Chooses execution engine. Options are: mr (Map reduce, default), tez, spark. While MR
    remains the default engine for historical reasons, it is itself a historical engine
    and is deprecated in Hive 2 line. It may be removed without further warning.
  </description>
</property>
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
  <description>
    Enforce metastore schema version consistency.
    True: Verify that version information stored in metastore is compatible with one from Hive jars. Also disable automatic
          schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
          proper metastore schema migration. (Default)
    False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
  </description>
</property>
The content of each value element must not contain spaces
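Stray spaces inside a value are easy to miss after copying and pasting. The sketch below lists suspicious lines; the helper name check_value_spaces is made up, and it only catches values written on a single line.

```shell
# Hypothetical helper: print <value> elements that contain whitespace.
# Returns 0 when none are found and 1 otherwise.
check_value_spaces() {
    if grep -nE '<value>[^<]*[[:space:]][^<]*</value>' "$1"; then
        return 1
    fi
    return 0
}

# Example (path is a placeholder):
# check_value_spaces "$HIVE_HOME/conf/hive-site.xml" && echo "no spaces found"
```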
Upload the database driver
Upload the driver postgresql-9.2-1003.jdbc4.jar to Hive's lib directory
Initialize the metadata in the database
The database must be created in advance; this step initializes its schema
bin/schematool -dbType mysql -initSchema
bin/schematool -dbType postgres -initSchema
Afterwards, look at the database: a set of tables has been created
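The -dbType value has to match the database named in the JDBC URL. A small sketch of that mapping follows; the helper name db_type_from_url is made up, and only the two databases used in this post are handled.

```shell
# Hypothetical helper: map a JDBC URL to the value schematool expects
# for -dbType. Only MySQL and PostgreSQL are covered here.
db_type_from_url() {
    case "$1" in
        jdbc:mysql:*)      echo mysql ;;
        jdbc:postgresql:*) echo postgres ;;
        *) echo "unsupported JDBC URL: $1" >&2; return 1 ;;
    esac
}

# Example, run from the Hive install directory:
# bin/schematool -dbType "$(db_type_from_url jdbc:postgresql://172.16.89.80:5432/ball)" -initSchema
```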
Start the Hive service
Started this way, Hive can be accessed directly from a client; try a tool such as DbVisualizer
bin/hive --service hiveserver2
HiveServer2 solves the security and concurrency problems of HiveServer; HiveServer is no longer used
The startup script for this service is ${HIVE_HOME}/bin/hiveserver2, so it can also be started as follows
hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10001 # specify the port
You can also specify the port number in the configuration file
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
  <description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
</property>
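Whichever port is chosen, clients have to use the same one in their JDBC URL. The sketch below builds that URL; the helper name hs2_url is made up, and the host and user in the example are assumptions.

```shell
# Hypothetical helper: build the JDBC URL a client uses for HiveServer2.
# $1 = host, $2 = port (10000 if omitted), $3 = database (default if omitted)
hs2_url() {
    echo "jdbc:hive2://$1:${2:-10000}/${3:-default}"
}

# Example: connect with Beeline, the command-line client shipped with Hive.
# beeline -u "$(hs2_url localhost 10001)" -n root
```

The same URL can be pasted into a graphical client such as DbVisualizer.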
Start the Hive shell client
The shell client is convenient to work with
[root@master bin]# hive
# Enter show tables; output like the following shows that Hive has started
hive> show tables;
OK
Time taken: 1.594 seconds
Verifying Hive
Hive is installed, but how does Hive relate to Hadoop? The hands-on steps below explain it
Create a database in Hive
hive> create database hive1; # create the database
OK
Time taken: 0.478 seconds
hive> show databases; # list the databases
OK
default
hive1 # created successfully
Time taken: 0.132 seconds, Fetched: 2 row(s)
The database was created successfully, but where is it stored? How does it relate to Hadoop, and how does it relate to the metadata?
1. First look at the metadata. The metastore database has a table named DBS, which evidently stores the database names
There we see the new database hive1
2. Then look at HDFS
There we also see the new database hive1
This path is determined by the corresponding setting in hive-site.xml
Create a table in the Hive database
hive> use hive1; # switch to the hive1 database
OK
Time taken: 0.042 seconds
hive> create table hive_01 (id int, name string); # create a table
OK
Time taken: 0.984 seconds
hive> show tables; # list the tables
OK
hive_01 # created successfully
Time taken: 0.067 seconds, Fetched: 1 row(s)
1. Likewise, look at the metadata first. There is a TBLS table, which evidently stores the table names
2. Then look at HDFS
There we find the directory that corresponds to the table
Conclusion: the data of databases and tables created in Hive is stored in HDFS, while their metadata is stored in the metastore database.
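The HDFS location of databases and tables is defined by the hive.metastore.warehouse.dir parameter in hive-site.xml, which defaults to /user/hive/warehouse. hive.user.install.directory, whose default is /user/, is a separate setting.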
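The checks in this section can be repeated from the command line. The sketch below assumes the default warehouse directory /user/hive/warehouse and the usual layout in which a database named X lives in X.db; the helper name hive_hdfs_path is made up, and the psql connection details are taken from the configuration earlier in this post.

```shell
# Hypothetical helper: HDFS directory for a database, or for a table in it.
# $1 = database, $2 = table (optional), $3 = warehouse dir (optional)
hive_hdfs_path() {
    wh=${3:-/user/hive/warehouse}
    # The default database uses the warehouse directory itself.
    if [ "$1" = "default" ]; then
        path=$wh
    else
        path=$wh/$1.db
    fi
    if [ -n "$2" ]; then
        path=$path/$2
    fi
    echo "$path"
}

# Examples (need a running HDFS and access to the metastore database):
# hdfs dfs -ls "$(hive_hdfs_path hive1)"
# hdfs dfs -ls "$(hive_hdfs_path hive1 hive_01)"
# psql -h 172.16.89.80 -d ball -c 'SELECT "NAME", "DB_LOCATION_URI" FROM "DBS";'
```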
Error log
1. schematool reports the following error
The message says that no dbType was supplied. The database itself had been set up correctly, so check the schematool initialization command; the commands shown above are what worked for me
2. schematool reports the following error
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: org.postgresql.util.PSQLException : The server does not support SSL.
SQL Error code: 0
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
This is an SSL problem. The javax.jdo.option.ConnectionURL parameter in hive-site.xml was set to jdbc:postgresql://myhost/db?ssl=true; change true to false, or delete the flag
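Deleting the flag by hand is easy to get wrong when other options follow it. The sketch below shows one way to strip it; the helper name strip_ssl is made up, and it assumes options are separated by ?, & or ; as in the URLs in this post.

```shell
# Hypothetical helper: remove an ssl=true flag from a JDBC URL while
# keeping any other options that follow it.
strip_ssl() {
    echo "$1" | sed -E 's/([?&;])ssl=true[&;]?/\1/; s/[?&;]$//'
}

# Example:
# strip_ssl 'jdbc:postgresql://myhost/db?ssl=true'
```

The cleaned URL then goes back into the javax.jdo.option.ConnectionURL value.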
3. Starting the Hive shell reports the following error
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /tmp/hive/root/a71af7ab-060a-465e-91ba-124ba4b07e36. Name node is in safe mode.
The reported blocks 200 has reached the threshold 0.9990 of total blocks 200. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 15 seconds.
Cause: the NameNode is in safe mode; you can turn safe mode off
# Turn off safe mode
hadoop dfsadmin -safemode leave
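Before forcing the NameNode out of safe mode it can help to check whether it is still in it, since the message above says safe mode ends on its own after a short wait. This is a minimal sketch; the helper name safemode_is_on is made up, and it expects the one-line status text printed by hdfs dfsadmin -safemode get.

```shell
# Hypothetical helper: succeed when the status text reports safe mode ON.
safemode_is_on() {
    case "$1" in
        *"Safe mode is ON"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (needs a running HDFS):
# status=$(hdfs dfsadmin -safemode get)
# if safemode_is_on "$status"; then
#     hdfs dfsadmin -safemode leave
# fi
```

Running hdfs dfsadmin -safemode wait instead blocks until safe mode ends without forcing it.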
4. Starting the Hive shell reports the following error
Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
The solution is as follows:
1. Check hive-site.xml for configuration items whose values contain "system:java.io.tmpdir"
2. Create the folder /home/grid/hive-0.14.0-bin/iotmp, paying attention to permissions
3. Change the values of the configuration items that contain "system:java.io.tmpdir" to the path above
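The three steps above can be scripted roughly as follows. The helper name fix_tmpdir is made up; the sketch assumes the directory path contains no | or & characters, and it keeps a backup copy of the original file.

```shell
# Hypothetical helper: create the temp directory and point every
# ${system:java.io.tmpdir} in hive-site.xml at it.
# $1 = hive-site.xml, $2 = directory to use
fix_tmpdir() {
    mkdir -p "$2" || return 1
    cp "$1" "$1.bak" || return 1    # keep the original as a backup
    sed -e "s|\${system:java.io.tmpdir}|$2|g" "$1.bak" > "$1"
}

# Example, using the folder from step 2:
# fix_tmpdir "$HIVE_HOME/conf/hive-site.xml" /home/grid/hive-0.14.0-bin/iotmp
```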
5. Starting the Hive shell prints the following warning
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
This is only a warning: MapReduce is deprecated as an execution engine in Hive 2.x. The fix is to switch the engine to spark