Hive Quick Start: Installation

Chapter 2 Hive installation

2.1 Hive installation addresses

  1. Hive official website
    http://hive.apache.org/
  2. Documentation
    https://cwiki.apache.org/confluence/display/Hive/GettingStarted
  3. Downloads
    http://archive.apache.org/dist/hive/

2.2 Hive installation and deployment

  1. Hive installation and configuration
    (1) Upload apache-hive-1.2.1-bin.tar.gz to the /opt/software directory on the Linux host
    (2) Extract apache-hive-1.2.1-bin.tar.gz to the /opt/module/ directory
[atguigu@hadoop102 software]$ tar -zxvf apache-hive-1.2.1-bin.tar.gz -C /opt/module/

(3) Rename the extracted directory apache-hive-1.2.1-bin to hive

[atguigu@hadoop102 module]$ mv apache-hive-1.2.1-bin/ hive

(4) In the /opt/module/hive/conf directory, rename hive-env.sh.template to hive-env.sh

[atguigu@hadoop102 conf]$ mv hive-env.sh.template hive-env.sh

(5) Configure the hive-env.sh file
(a) Set the HADOOP_HOME path

export HADOOP_HOME=/opt/module/hadoop-2.7.2

(b) Set the HIVE_CONF_DIR path

export HIVE_CONF_DIR=/opt/module/hive/conf
  2. Hadoop cluster configuration
    (1) HDFS and YARN must be started
[atguigu@hadoop102 hadoop-2.7.2]$ sbin/start-dfs.sh
[atguigu@hadoop103 hadoop-2.7.2]$ sbin/start-yarn.sh

(2) On HDFS, create the /tmp and /user/hive/warehouse directories and give their group write permission
(this step is optional; the system creates the directories automatically)

[atguigu@hadoop102 hadoop-2.7.2]$ bin/hadoop fs -mkdir /tmp
[atguigu@hadoop102 hadoop-2.7.2]$ bin/hadoop fs -mkdir -p /user/hive/warehouse
[atguigu@hadoop102 hadoop-2.7.2]$ bin/hadoop fs -chmod g+w /tmp
[atguigu@hadoop102 hadoop-2.7.2]$ bin/hadoop fs -chmod g+w /user/hive/warehouse
  3. Hive basic operations
    (1) Start hive
[atguigu@hadoop102 hive]$ bin/hive

(2) Show the databases

hive> show databases;

(3) Switch to the default database

hive> use default;

(4) Show the tables in the default database

hive> show tables;

(5) Create a table

hive> create table student(id int, name string);

(6) Show the tables in the database again

hive> show tables;

(7) View the table structure

hive> desc student;

(8) Insert data into the table

hive> insert into student values(1000,"ss");

(9) Query the data in the table

hive> select * from student;

(10) Exit hive

hive> quit;

2.3 Case: importing a local file into Hive

Requirement
Import the data in the local file /opt/module/data/student.txt into the Hive table student(id int, name string).
  1. Data preparation
    Prepare the data in the /opt/module/data directory
    (1) Create the data directory under /opt/module/

[atguigu@hadoop102 module]$ mkdir data

(2) Create the student.txt file in the /opt/module/data/ directory and add the data

[atguigu@hadoop102 data]$ touch student.txt
[atguigu@hadoop102 data]$ vi student.txt
1001	zhangshan
1002	lishi
1003	zhaoliu

Note: the two fields on each line are separated by a tab.
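
Typing tabs in an editor is easy to get wrong, since many editors insert spaces instead. As a minimal sketch, assuming a POSIX shell, the same file can be written with printf and checked with awk. The DATA_DIR variable is hypothetical and only used for illustration; it defaults to a temporary directory so the snippet can be tried anywhere, and would be set to /opt/module/data to follow this section.

```shell
# DATA_DIR is a hypothetical variable; set it to /opt/module/data to follow the text.
DATA_DIR="${DATA_DIR:-$(mktemp -d)}"
mkdir -p "$DATA_DIR"

# Write three rows; \t is a real tab character between id and name.
printf '1001\tzhangshan\n1002\tlishi\n1003\tzhaoliu\n' > "$DATA_DIR/student.txt"

# Count lines that do NOT have exactly two tab-separated fields (0 means the file is well formed).
bad_lines=$(awk -F '\t' 'NF != 2 { n++ } END { print n + 0 }' "$DATA_DIR/student.txt")
echo "malformed lines: $bad_lines"
```

A non-zero count usually means a space was typed where a tab was intended, in which case Hive would load NULL values for that row.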

  2. Hive operations
    (1) Start hive
[atguigu@hadoop102 hive]$ bin/hive

(2) Show the databases

hive> show databases;

(3) Use the default database

hive> use default;

(4) Show the tables in the default database

hive> show tables;

(5) Delete the student table created earlier

hive> drop table student;

(6) Create the student table, declaring '\t' as the field delimiter

hive> create table student(id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

(7) Load the /opt/module/data/student.txt file into the student table

hive> load data local inpath '/opt/module/data/student.txt' into table student;

(8) Query the result in Hive

hive> select * from student;
OK
1001 zhangshan
1002 lishi
1003 zhaoliu
Time taken: 0.266 seconds, Fetched: 3 row(s)
  3. Problem encountered
    Opening another client window and starting hive there produces a java.sql.SQLException.
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
 at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
 at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
 at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
 at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
 at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
 at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
 at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
 at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
 ... 8 more

The cause is that the Metastore is stored by default in the bundled Derby database, which in its embedded mode accepts only one client connection at a time. Storing the Metastore in MySQL is recommended instead.

2.4 MySQL installation

2.4.1 Preparing the installation package

  1. Check whether mysql is installed; if it is, uninstall it
    (1) Check
[root@hadoop102 Desktop]# rpm -qa|grep mysql
mysql-libs-5.1.73-7.el6.x86_64

(2) Uninstall

[root@hadoop102 Desktop]# rpm -e --nodeps mysql-libs-5.1.73-7.el6.x86_64

  2. Unzip the mysql-libs.zip file into the current directory
[root@hadoop102 software]# unzip mysql-libs.zip
[root@hadoop102 software]# ls
mysql-libs.zip
mysql-libs

  3. Enter the mysql-libs folder

[root@hadoop102 mysql-libs]# ll
total 76048
-rw-r--r--. 1 root root 18509960 Mar 26 2015 MySQL-client-5.6.24-1.el6.x86_64.rpm
-rw-r--r--. 1 root root 3575135 Dec 1 2013 mysql-connector-java-5.1.27.tar.gz
-rw-r--r--. 1 root root 55782196 Mar 26 2015 MySQL-server-5.6.24-1.el6.x86_64.rpm

2.4.2 MySQL server installation

  1. Install the mysql server
[root@hadoop102 mysql-libs]# rpm -ivh MySQL-server-5.6.24-1.el6.x86_64.rpm
  2. View the generated random password
[root@hadoop102 mysql-libs]# cat /root/.mysql_secret
OEXaQuS8IWkG19Xs
  3. Check the mysql status
[root@hadoop102 mysql-libs]# service mysql status
  4. Start mysql
[root@hadoop102 mysql-libs]# service mysql start

2.4.3 MySQL client installation

  1. Install the mysql client
[root@hadoop102 mysql-libs]# rpm -ivh MySQL-client-5.6.24-1.el6.x86_64.rpm
  2. Connect to mysql
[root@hadoop102 mysql-libs]# mysql -uroot -pOEXaQuS8IWkG19Xs
  3. Change the password
mysql>SET PASSWORD=PASSWORD('000000');
  4. Exit mysql
mysql>exit

2.4.4 Host configuration in the MySQL user table

Configure the user table so that the root user, with its password, can log in to the MySQL database from any host.

  1. Log in to mysql
[root@hadoop102 mysql-libs]# mysql -uroot -p000000
  2. Show the databases
mysql>show databases;
  3. Use the mysql database
mysql>use mysql;
  4. Show all the tables in the mysql database
mysql>show tables;
  5. Show the structure of the user table
mysql>desc user;
  6. Query the user table
mysql>select User, Host, Password from user;
  7. Modify the user table, changing the Host value to %
mysql>update user set host='%' where host='localhost';
  8. Delete the root user's entries for the other hosts
mysql>delete from user where Host='hadoop102';
mysql>delete from user where Host='127.0.0.1';
mysql>delete from user where Host='::1';
  9. Flush the privileges
mysql>flush privileges;
  10. Exit
mysql>quit;

2.5 Configuring the Hive metastore in MySQL

2.5.1 Copying the driver

  1. In the /opt/software/mysql-libs directory, extract the mysql-connector-java-5.1.27.tar.gz driver package
[root@hadoop102 mysql-libs]# tar -zxvf mysql-connector-java-5.1.27.tar.gz
  2. Copy mysql-connector-java-5.1.27-bin.jar to /opt/module/hive/lib/
[root@hadoop102 mysql-connector-java-5.1.27]# cp /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /opt/module/hive/lib/

2.5.2 Configuring the Metastore to use MySQL

  1. Create a hive-site.xml file in the /opt/module/hive/conf directory
[atguigu@hadoop102 conf]$ touch hive-site.xml
[atguigu@hadoop102 conf]$ vi hive-site.xml
  2. Following the configuration parameters in the official documentation, copy the data below into the hive-site.xml file
    https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
 <name>javax.jdo.option.ConnectionURL</name>
 <value>jdbc:mysql://hadoop102:3306/metastore?createDatabaseIfNotExist=true</value>
 <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
 <name>javax.jdo.option.ConnectionDriverName</name>
 <value>com.mysql.jdbc.Driver</value>
 <description>Driver class name for a JDBC metastore</description>
</property>
<property>
 <name>javax.jdo.option.ConnectionUserName</name>
 <value>root</value>
 <description>username to use against metastore database</description>
</property>
<property>
 <name>javax.jdo.option.ConnectionPassword</name>
 <value>000000</value>
 <description>password to use against metastore database</description>
</property>
</configuration>
  3. If hive throws exceptions on startup after this configuration, try restarting the virtual machine. (After the restart, do not forget to start the Hadoop cluster.)

2.5.3 Testing Hive startup in multiple windows

  1. First log in to MySQL
[atguigu@hadoop102 mysql-libs]$ mysql -uroot -p000000

Check which databases exist

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
  2. Open several more windows and start hive in each of them
[atguigu@hadoop102 hive]$ bin/hive
  3. After hive has started, go back to the MySQL window and list the databases; a metastore database has been added
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| metastore          |
| mysql              |
| performance_schema |
| test               |
+--------------------+

2.6 Hive JDBC access

2.6.1 Starting the hiveserver2 service

[atguigu@hadoop102 hive]$ bin/hiveserver2

2.6.2 Starting beeline

[atguigu@hadoop102 hive]$ bin/beeline
Beeline version 1.2.1 by Apache Hive
beeline>

2.6.3 Connecting to hiveserver2

beeline> !connect jdbc:hive2://hadoop102:10000 (press Enter)
Connecting to jdbc:hive2://hadoop102:10000
Enter username for jdbc:hive2://hadoop102:10000: atguigu (press Enter)
Enter password for jdbc:hive2://hadoop102:10000: (just press Enter)
Connected to: Apache Hive (version 1.2.1)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hadoop102:10000> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
| hive_db2       |
+----------------+--+

2.7 Common Hive interactive commands

[atguigu@hadoop102 hive]$ bin/hive -help
usage: hive
 -d,--define <key=value>          Variable subsitution to apply to hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable subsitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)
  1. "-e": execute an SQL statement without entering the hive interactive window
[atguigu@hadoop102 hive]$ bin/hive -e "select id from student;"
  2. "-f": execute the SQL statements in a script file
    (1) Create the hivef.sql file in the /opt/module/data directory
[atguigu@hadoop102 data]$ touch hivef.sql

Write a valid SQL statement into the file

select * from student;

(2) Execute the SQL statement in the file

[atguigu@hadoop102 hive]$ bin/hive -f /opt/module/data/hivef.sql

(3) Execute the SQL statement in the file and write the result to a file

[atguigu@hadoop102 hive]$ bin/hive -f /opt/module/data/hivef.sql > /opt/module/data/hive_result.txt
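
The "-f" steps above can also be scripted so that no editor is needed. This is only a sketch under stated assumptions: WORK_DIR and HIVE_HOME are hypothetical variables introduced for illustration. WORK_DIR defaults to a temporary directory and would be set to the data directory used in this section; HIVE_HOME defaults to the install directory used earlier in this chapter. Hive is invoked only if its launcher is actually present.

```shell
# WORK_DIR is a hypothetical variable; set it to the data directory used in this section.
WORK_DIR="${WORK_DIR:-$(mktemp -d)}"
# HIVE_HOME is assumed to be the install directory from section 2.2.
HIVE_HOME="${HIVE_HOME:-/opt/module/hive}"

# Write the query script without opening an editor.
cat > "$WORK_DIR/hivef.sql" <<'EOF'
select * from student;
EOF

# Run the script and capture the result only if Hive is installed here.
if [ -x "$HIVE_HOME/bin/hive" ]; then
    "$HIVE_HOME/bin/hive" -f "$WORK_DIR/hivef.sql" > "$WORK_DIR/hive_result.txt"
else
    echo "hive not found under $HIVE_HOME; script written to $WORK_DIR/hivef.sql"
fi
```

Only standard output is redirected, so the query rows land in hive_result.txt while Hive's log messages still appear on the terminal.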

2.8 Other Hive command operations

  1. View the HDFS file system from the hive cli command window
    hive> dfs -ls /;
  2. View the local file system from the hive cli command window
    hive> ! ls /opt/module/data;
  3. View the history of all commands entered in hive
    (1) Go to the current user's home directory, /root or /home/atguigu
    (2) View the .hivehistory file
    [atguigu@hadoop102 ~]$ cat .hivehistory

2.9 Common Hive property configuration

2.9.1 Hive data warehouse location configuration

1) The default location of the data warehouse is the /user/hive/warehouse path on HDFS.
2) No folder is created in the warehouse directory for the default database, default. A table that belongs to the default database gets its folder directly under the warehouse directory.
3) To change the default location of the data warehouse, copy the following configuration from hive-default.xml.template into hive-site.xml.

<property>
	<name>hive.metastore.warehouse.dir</name>
	<value>/user/hive/warehouse</value>
	<description>location of default database for the warehouse</description>
</property>

Give users in the same group write permission

bin/hdfs dfs -chmod g+w /user/hive/warehouse

2.9.2 Configuring the information displayed with query results

1) Add the following configuration to the hive-site.xml file to display the current database name and the column headers of query results.

<property>
	<name>hive.cli.print.header</name>
	<value>true</value>
</property>
<property>
	<name>hive.cli.print.current.db</name>
	<value>true</value>
</property>

2) Restart hive and compare the display before and after the configuration.
(1) Before the configuration, as shown in Figure 6-2
(2) After the configuration, as shown in Figure 6-3

2.9.3 Hive run log configuration

  1. By default, Hive's log is stored at /tmp/atguigu/hive.log (under the current user's name)
  2. Change the storage location of Hive's log to /opt/module/hive/logs
    (1) Rename /opt/module/hive/conf/hive-log4j.properties.template to hive-log4j.properties
[atguigu@hadoop102 conf]$ pwd
/opt/module/hive/conf
[atguigu@hadoop102 conf]$ mv hive-log4j.properties.template hive-log4j.properties

(2) In the hive-log4j.properties file, set the log storage location: hive.log.dir=/opt/module/hive/logs
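
The edit in step (2) can be made non-interactively. The following is a minimal sketch, not a required procedure: PROPS is a hypothetical variable that would point at /opt/module/hive/conf/hive-log4j.properties. When it is unset, the snippet works on a throwaway sample file whose contents, including the default hive.log.dir value, are assumed for illustration.

```shell
# PROPS is a hypothetical variable; point it at the real hive-log4j.properties file.
# When unset, a throwaway sample file is used so the snippet can be tried safely.
if [ -z "${PROPS:-}" ]; then
    PROPS="$(mktemp)"
    # Sample content standing in for the real file (assumed default value).
    printf 'hive.log.threshold=ALL\nhive.log.dir=${java.io.tmpdir}/${user.name}\nhive.log.file=hive.log\n' > "$PROPS"
fi

# Replace the whole hive.log.dir line; a backup copy is kept as $PROPS.bak.
sed -i.bak 's|^hive\.log\.dir=.*|hive.log.dir=/opt/module/hive/logs|' "$PROPS"

# Show the resulting setting.
grep '^hive\.log\.dir=' "$PROPS"
```

Keeping the .bak copy makes it easy to restore the original setting if the new location turns out not to be writable.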

2.9.4 Ways to configure parameters

  1. View all current configuration information
hive>set;
  2. Three ways to configure parameters
    (1) Configuration files
    Default configuration file: hive-default.xml
    User-defined configuration file: hive-site.xml
    Note: the user-defined configuration overrides the default configuration. Hive also reads the Hadoop configuration, because Hive is started as a Hadoop client, and Hive's configuration overrides Hadoop's. Settings made in configuration files apply to every Hive process started on this machine.

(2) Command-line parameters
When starting Hive, add -hiveconf param=value on the command line to set a parameter.
For example:

[atguigu@hadoop103 hive]$ bin/hive -hiveconf mapred.reduce.tasks=10;

Note: the setting is valid only for this Hive session. View the parameter setting:

hive (default)> set mapred.reduce.tasks;

(3) Parameter declaration
Parameters can be set in HQL with the SET keyword.
For example:

hive (default)> set mapred.reduce.tasks=100;

Note: the setting is valid only for this Hive session.
View the parameter setting:

hive (default)> set mapred.reduce.tasks;

The three ways have increasing priority: configuration file < command-line parameter < parameter declaration. Note that some system-level parameters, such as the log4j settings, must be set with one of the first two methods, because they are read before the session is established.
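
The precedence rule can be pictured with a toy model. The sketch below is not Hive code and makes no claim about how Hive resolves settings internally; effective_value is a hypothetical helper that simply returns the value from the highest-priority source that supplied one.

```shell
# Toy model of the precedence described above; this is not Hive code.
# Usage: effective_value FILE_VALUE CLI_VALUE SESSION_VALUE
# Prints the value from the highest-priority source that is non-empty.
effective_value() {
    file_value=$1
    cli_value=$2
    session_value=$3
    if [ -n "$session_value" ]; then
        echo "$session_value"      # SET in the session wins
    elif [ -n "$cli_value" ]; then
        echo "$cli_value"          # then -hiveconf on the command line
    else
        echo "$file_value"         # otherwise the configuration file
    fi
}

effective_value 1 "" ""      # only the configuration file sets it: prints 1
effective_value 1 10 ""      # -hiveconf overrides the file: prints 10
effective_value 1 10 100     # SET overrides both: prints 100
```

The values 1, 10 and 100 are arbitrary; 10 and 100 mirror the mapred.reduce.tasks examples in this section.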

Origin blog.csdn.net/weixin_42528266/article/details/104364353