1, Hive installation address
1. Hive official website address
https://hive.apache.org/
2. Documentation address
https://cwiki.apache.org/confluence/display/Hive/GettingStarted
3. GitHub address
https://github.com/apache/hive
2, Hive installation and deployment
1. Hive installation and configuration
(1) Upload apache-hive-1.2.1-bin.tar.gz to the /opt/software directory on Linux
(2) Extract apache-hive-1.2.1-bin.tar.gz to the /opt/module/ directory
[root@master software]$ tar -zxvf apache-hive-1.2.1-bin.tar.gz -C /opt/module/
(3) Rename the extracted apache-hive-1.2.1-bin directory to hive
[root@master module]$ mv apache-hive-1.2.1-bin/ hive
(4) Rename hive-env.sh.template in the /opt/module/hive/conf directory to hive-env.sh
[root@master conf]$ mv hive-env.sh.template hive-env.sh
(5) Configure the hive-env.sh file
(a) Configure the HADOOP_HOME path
export HADOOP_HOME=/opt/module/hadoop-2.7.2
(b) Configure the HIVE_CONF_DIR path
export HIVE_CONF_DIR=/opt/module/hive/conf
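The extract-and-rename steps above can be rehearsed without touching /opt. The following is a minimal sketch, assuming a throwaway directory created with mktemp and a dummy archive that only mimics the layout of apache-hive-1.2.1-bin.tar.gz; the point it illustrates is that tar needs the -C option to extract into a directory other than the current one.

```shell
# Hypothetical scratch area standing in for /opt/software and /opt/module.
work=$(mktemp -d)
mkdir -p "$work/software" "$work/module" "$work/build/apache-hive-1.2.1-bin/conf"

# Build a dummy archive with the same top-level directory name as the real one.
touch "$work/build/apache-hive-1.2.1-bin/conf/hive-env.sh.template"
tar -zcf "$work/software/apache-hive-1.2.1-bin.tar.gz" -C "$work/build" apache-hive-1.2.1-bin

# Step (2): extract into the module directory; -C selects the target directory.
tar -zxf "$work/software/apache-hive-1.2.1-bin.tar.gz" -C "$work/module/"

# Step (3): rename the extracted directory to hive.
mv "$work/module/apache-hive-1.2.1-bin" "$work/module/hive"

# Step (4): rename the environment template.
mv "$work/module/hive/conf/hive-env.sh.template" "$work/module/hive/conf/hive-env.sh"

ls "$work/module/hive/conf"
```

On a real installation the same commands apply with /opt/software and /opt/module in place of the scratch directories.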
2. Hadoop cluster configuration
(1) HDFS and YARN must be started
[root@master hadoop-2.7.2]$ sbin/start-dfs.sh
[root@master hadoop-2.7.2]$ sbin/start-yarn.sh
(2) Create the /tmp and /user/hive/warehouse directories on HDFS and make both of them group-writable
[root@master hadoop-2.7.2]$ bin/hadoop fs -mkdir /tmp
[root@master hadoop-2.7.2]$ bin/hadoop fs -mkdir -p /user/hive/warehouse
[root@master hadoop-2.7.2]$ bin/hadoop fs -chmod g+w /tmp
[root@master hadoop-2.7.2]$ bin/hadoop fs -chmod g+w /user/hive/warehouse
Alternatively, turn off permission checking in Hadoop's hdfs-site.xml configuration file
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
3. Hive basic operations
(1) Start hive
[root@master hive]$ bin/hive
(2) View the databases
hive> show databases;
(3) Open the default database
hive> use default;
(4) Show the tables in the default database
hive> show tables;
(5) Create a table
hive> create table student(id int, name string);
(6) Check which tables the database now contains
hive> show tables;
(7) View the table structure
hive> desc student;
(8) Insert data into the table
hive> insert into student values(1000,"ss");
(9) Query the data in the table
hive> select * from student;
(10) Exit hive
hive> quit;
3, Case: importing a local file into Hive
Requirement
Import the data in the local file /opt/module/datas/student.txt into the Hive table student(id int, name string).
1. Data preparation
Prepare the data in the /opt/module/datas directory
(1) Create the datas directory under /opt/module/
[root@master module]$ mkdir datas
(2) Create the student.txt file in the /opt/module/datas/ directory and add the data
[root@master datas]$ touch student.txt
[root@master datas]$ vi student.txt
1001 zhangshan
1002 lishi
1003 zhaoliu
Note: the two fields on each line must be separated by a tab character.
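Because editors sometimes replace tabs with spaces, it can be safer to generate the file with printf. The following is a minimal sketch, assuming a scratch directory from mktemp in place of /opt/module/datas; it writes the three rows with a real tab between the fields and then counts the lines that contain exactly two tab-separated fields.

```shell
# Hypothetical scratch directory; the tutorial itself uses /opt/module/datas.
data_dir=$(mktemp -d)
data_file="$data_dir/student.txt"

# printf reuses the format for each pair of arguments, so this writes three
# lines of the form "<id><TAB><name>".
printf '%s\t%s\n' 1001 zhangshan 1002 lishi 1003 zhaoliu > "$data_file"

# Count the lines that split into exactly two fields on a tab.
awk -F '\t' 'NF == 2' "$data_file" | wc -l
```

If the count is lower than the number of rows, at least one line is not tab-delimited and would load into Hive with NULL columns.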
2. Hive operations
(1) Start hive
[root@master hive]$ bin/hive
(2) Show the databases
hive> show databases;
(3) Use the default database
hive> use default;
(4) Show the tables in the default database
hive> show tables;
(5) Delete the student table created earlier
hive> drop table student;
(6) Create the student table, declaring '\t' as the field delimiter
hive> create table student(id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
(7) Load the /opt/module/datas/student.txt file into the student table. The path may also be enclosed in double quotes.
hive> load data local inpath '/opt/module/datas/student.txt' into table student;
(8) Query the results in Hive
hive> select * from student;
OK
1001 zhangshan
1002 lishi
1003 zhaoliu
Time taken: 0.266 seconds, Fetched: 3 row(s)
3. Problems encountered
Opening a second client window and starting hive there produces a java.sql.SQLException.
The reason is that the Metastore is stored by default in the bundled Derby database, which in embedded mode accepts only one connection at a time. Storing the Metastore in MySQL is recommended instead.
4, MySQL installation
4.1 Preparing the installation packages
1. Check whether MySQL is already installed; if it is, uninstall it
(1) Check
[root@master Desktop]# rpm -qa | grep mysql
mysql-libs-5.1.73-7.el6.x86_64
(2) Uninstall
[root@master Desktop]# rpm -e --nodeps mysql-libs-5.1.73-7.el6.x86_64
2. Unzip mysql-libs.zip into the current directory
[root@master software]# unzip mysql-libs.zip
[root@master software]# ls
mysql-libs.zip
mysql-libs
3. Enter the mysql-libs folder
[root@master mysql-libs]# ll
total 76048
-rw-r--r--. 1 root root 18509960 Mar 26 2015 MySQL-client-5.6.24-1.el6.x86_64.rpm
-rw-r--r--. 1 root root 3575135 Dec 1 2013 mysql-connector-java-5.1.27.tar.gz
-rw-r--r--. 1 root root 55782196 Mar 26 2015 MySQL-server-5.6.24-1.el6.x86_64.rpm
4.2 MySQL server installation
1. Install the MySQL server
[root@master mysql-libs]# rpm -ivh MySQL-server-5.6.24-1.el6.x86_64.rpm
2. View the generated random password
[root@master mysql-libs]# cat /root/.mysql_secret
OEXaQuS8IWkG19Xs
3. Check the MySQL status
[root@master mysql-libs]# service mysql status
4. Start MySQL
[root@master mysql-libs]# service mysql start
4.3 MySQL client installation
1. Install the MySQL client
[root@master mysql-libs]# rpm -ivh MySQL-client-5.6.24-1.el6.x86_64.rpm
2. Connect to MySQL
[root@master mysql-libs]# mysql -uroot -pOEXaQuS8IWkG19Xs
3. Change the password
mysql>SET PASSWORD=PASSWORD('000000');
4. Exit MySQL
mysql>exit
4.4 Host configuration in the MySQL user table
Configure MySQL so that the root user, given the password, can log in to the MySQL database from any host.
1. Log in to MySQL
[root@master mysql-libs]# mysql -uroot -p000000
2. Show the databases
mysql>show databases;
3. Use the mysql database
mysql>use mysql;
4. Show all the tables in the mysql database
mysql>show tables;
5. Show the structure of the user table
mysql>desc user;
6. Query the user table
mysql>select User, Host, Password from user;
7. Modify the user table, changing the Host value from localhost to %
mysql>update user set host='%' where host='localhost';
8. Delete the root user's entries for the other hosts
mysql>delete from user where Host='master';
mysql>delete from user where Host='127.0.0.1';
mysql>delete from user where Host='::1';
9. Flush the privileges
mysql>flush privileges;
10. Quit
mysql>quit;
5, Storing Hive metadata in MySQL
5.1 Copying the driver
1. Extract the mysql-connector-java-5.1.27.tar.gz driver package in the /opt/software/mysql-libs directory
[root@master mysql-libs]# tar -zxvf mysql-connector-java-5.1.27.tar.gz
2. Copy mysql-connector-java-5.1.27-bin.jar from the /opt/software/mysql-libs/mysql-connector-java-5.1.27 directory to /opt/module/hive/lib/
[root@master mysql-connector-java-5.1.27]# cp mysql-connector-java-5.1.27-bin.jar /opt/module/hive/lib/
5.2 Configuring the Metastore to use MySQL
1. Create hive-site.xml in the /opt/module/hive/conf directory
[root@master conf]$ touch hive-site.xml
[root@master conf]$ vi hive-site.xml
2. Following the configuration parameters in the official documentation, copy the data below into the hive-site.xml file
https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/metastore?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>000000</value>
<description>password to use against metastore database</description>
</property>
</configuration>
3. If hive throws exceptions on startup after this configuration, try restarting the virtual machine. (After restarting, do not forget to start the Hadoop cluster.)
5.3 Testing with multiple Hive windows
1. First log in to MySQL
[root@master mysql-libs]$ mysql -uroot -p000000
Check which databases exist
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
2. Open several more windows and start hive in each of them
[root@master hive]$ bin/hive
3. After starting hive, go back to the MySQL window and list the databases again; a metastore database has been added
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| metastore          |
| mysql              |
| performance_schema |
| test               |
+--------------------+
6, Commonly used Hive interactive commands
[root@master hive]$ bin/hive -help
usage: hive
 -d,--define <key=value>          Variable subsitution to apply to hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable subsitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)
1. "-e": execute a SQL statement without entering the hive interactive window
[root@master hive]$ bin/hive -e "select id from student;"
2. "-f": execute the SQL statements in a script file
(1) Create the hivef.sql file in the /opt/module/datas directory
[root@master datas]$ touch hivef.sql
Write a valid SQL statement in the file
select * from student;
(2) Execute the SQL statement in the file
[root@master hive]$ bin/hive -f /opt/module/datas/hivef.sql
(3) Execute the SQL statement in the file and write the result to a file
[root@master hive]$ bin/hive -f /opt/module/datas/hivef.sql > /opt/module/datas/hive_result.txt
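The -f steps above can be combined into one small script. The following is a minimal sketch, assuming a scratch directory from mktemp rather than /opt/module/datas and assuming hive is on the PATH (on the tutorial's layout it would be /opt/module/hive/bin/hive); when hive is not found, the query step is skipped. The -S option from the help output keeps log chatter out of the console.

```shell
# Hypothetical scratch directory standing in for /opt/module/datas.
work_dir=$(mktemp -d)
sql_file="$work_dir/hivef.sql"
result_file="$work_dir/hive_result.txt"

# Write the statement to the script file instead of editing it by hand.
printf 'select * from student;\n' > "$sql_file"

# Run the script in silent mode and capture the rows, but only if a hive
# executable is actually available on this machine.
if command -v hive >/dev/null 2>&1; then
    hive -S -f "$sql_file" > "$result_file"
    echo "result written to $result_file"
else
    echo "hive not found on PATH; skipping the query" >&2
fi
```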
7, Other Hive commands
1. Exit the hive window:
hive(default)>exit;
hive(default)>quit;
In newer versions of Hive there is no difference between the two; in earlier versions there was:
exit: implicitly commits the data first, then exits;
quit: exits without committing the data;
2. How to view the HDFS file system from the hive CLI window
hive(default)>dfs -ls /;
3. How to view the local file system from the hive CLI window
hive(default)>! ls /opt/module/datas;
4. View all the commands previously entered in hive
(1) Go to the current user's home directory, /root or /home/atguigu
(2) View the .hivehistory file
[root@master ~]$ cat .hivehistory
8, Common Hive property configuration
8.1 Hive data warehouse location configuration
(1) The original default location of the data warehouse is on HDFS, under the /user/hive/warehouse path.
(2) Under the warehouse directory, no folder is created for the default database, default. If a table belongs to the default database, its folder is created directly under the warehouse directory.
(3) To change the default location of the data warehouse, copy the following configuration from hive-default.xml.template into hive-site.xml and edit the value.
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
Give users in the same group write permission on the directory
bin/hdfs dfs -chmod g+w /user/hive/warehouse
8.2 Query information display configuration
1) Add the following configuration to hive-site.xml to display the current database and the column headers of query results.
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
2) Restart hive and compare the display before and after the configuration.
(1) Before the configuration, as shown in Figure 6-2
Figure 6-2 Before the configuration
(2) After the configuration, as shown in Figure 6-3
Figure 6-3 After the configuration
8.3 Hive run log configuration
1. By default, Hive's log is stored at /tmp/atguigu/hive.log (that is, under the current user's name)
2. Change Hive's log location to /opt/module/hive/logs
(1) Rename /opt/module/hive/conf/hive-log4j.properties.template to hive-log4j.properties
[root@master conf]$ pwd
/opt/module/hive/conf
[root@master conf]$ mv hive-log4j.properties.template hive-log4j.properties
(2) Change the log location in the hive-log4j.properties file
hive.log.dir=/opt/module/hive/logs
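The edit in step (2) can also be scripted instead of made by hand in an editor. The following is a minimal sketch, assuming a scratch copy of the properties file whose three lines are stand-ins for the real template content; it rewrites only the hive.log.dir line and leaves the rest untouched.

```shell
# Hypothetical scratch copy; the real file is /opt/module/hive/conf/hive-log4j.properties.
conf_dir=$(mktemp -d)
props="$conf_dir/hive-log4j.properties"

# Stand-in content (an assumption, not copied from the shipped template).
cat > "$props" <<'EOF'
hive.root.logger=INFO,DRFA
hive.log.dir=${java.io.tmpdir}/${user.name}
hive.log.file=hive.log
EOF

# Replace the value of hive.log.dir; write to a temporary file and move it
# into place so the command behaves the same with GNU and BSD sed.
sed 's|^hive\.log\.dir=.*|hive.log.dir=/opt/module/hive/logs|' "$props" > "$props.new"
mv "$props.new" "$props"

grep '^hive\.log\.dir=' "$props"
```

Hive has to be restarted before the new log location takes effect.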
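8.4 Parameter configuration methods
1. View all current configuration settings
hive>set;
2. Three ways to configure parameters
(1) Configuration file method
Default configuration file: hive-default.xml
User-defined configuration file: hive-site.xml
Note: the user-defined configuration overrides the default configuration. In addition, Hive also reads Hadoop's configuration, because Hive starts as a Hadoop client, and Hive's configuration overrides Hadoop's. Settings in the configuration files apply to all Hive processes started on this machine.
(2) Command-line parameter method
When starting Hive, you can add -hiveconf param=value on the command line to set a parameter.
For example: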
[root@master hive]$ bin/hive -hiveconf mapred.reduce.tasks=10;
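Note: this is effective only for the current Hive session.
View the parameter setting:
hive (default)> set mapred.reduce.tasks;
(3) Parameter declaration method
Parameters can be set in HQL with the SET keyword.
For example:
hive (default)> set mapred.reduce.tasks=100;
Note: this is effective only for the current Hive session.
View the parameter setting:
hive (default)> set mapred.reduce.tasks;
The priority of the three methods above increases in that order: configuration file < command-line parameter < parameter declaration. Note that certain system-level parameters, such as log4j-related settings, must be set with one of the first two methods, because those parameters are read before the session is established.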
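The precedence between the command-line parameter and the SET declaration can be observed in a single non-interactive run. The following is a minimal sketch, assuming hive is on the PATH; the variable name query is only illustrative. It starts Hive with mapred.reduce.tasks=10 on the command line, prints the value, overrides it with SET, and prints it again, so the second printed value should be the one from the SET statement.

```shell
# Statements executed in order inside one Hive session.
query='set mapred.reduce.tasks; set mapred.reduce.tasks=100; set mapred.reduce.tasks;'

if command -v hive >/dev/null 2>&1; then
    # --hiveconf supplies the command-line value; the SET inside the query
    # is applied later in the same session and takes priority.
    hive -S --hiveconf mapred.reduce.tasks=10 -e "$query"
else
    echo "hive not found on PATH; skipping the demonstration" >&2
fi
```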