Hive 2.0.1 Installation and Deployment

Environment: VMware 11, Ubuntu 15.10, Hadoop 2.7.1

Hive is built on top of Hadoop's static batch-processing model. Hadoop typically has high latency and incurs significant overhead for job submission and scheduling, so Hive is not suitable for low-latency, fast queries on large data sets, such as online transaction processing (OLTP). Hive's query execution strictly follows the Hadoop MapReduce job model: an interpreter converts the user's HiveQL statements into MapReduce jobs submitted to the Hadoop cluster, Hadoop monitors the job execution, and the job results are then returned to the user.

Hive is best used for batch jobs over large data sets, for example, web log analysis.

By default Hive uses the embedded Derby database, but Derby allows only one session to access its data files at a time, so another relational database is normally used instead. Both MySQL and PostgreSQL installations are demonstrated here.

1.1 Installing MySQL

  1. Download: `sudo apt-get install mysql-server mysql-client libmysqlclient-dev`. During installation you will be prompted to enter a password; remember what you enter.


  2. Check that the installation succeeded: `sudo netstat -tap | grep mysql` shows a listening mysqld process if MySQL installed correctly.


Or run `sudo service mysql status` to check:


  3. Log in to MySQL with `mysql -u root -p`

    GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' WITH GRANT OPTION;

    GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'rootpasswd' WITH GRANT OPTION;
    Here % means any host.

  4. Create the ==hive== user for the ==hive== database

    use mysql;
    INSERT INTO user (Host, User, Password) VALUES ('%', 'hive', password('hive')); -- Host = '%' means the user can connect remotely


  5. Edit the mysqld.cnf configuration file with `sudo vim /etc/mysql/mysql.conf.d/mysqld.cnf`, and either comment out `bind-address = 127.0.0.1` or change it to `bind-address = 0.0.0.0`. MySQL then needs a restart, `sudo service mysql restart`, after which it can be accessed remotely.

    As can be seen, the Host of the MySQL connection has changed.

  6. Create a system user for logging in to the database from a shell

    sudo adduser hive


  7. Completely remove MySQL

    sudo apt-get autoremove --purge mysql-server
    sudo apt-get remove mysql-common  # this is very important
    dpkg -l | grep ^rc | awk '{print $2}' | sudo xargs dpkg -P  # clean up residual data
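The last cleanup command works by filtering `dpkg -l` for packages in the `rc` state (removed, but with residual configuration files) and extracting their names. A minimal sketch of that filtering step on made-up sample output:

```shell
# Made-up sample of `dpkg -l` output; the leading 'rc' marks packages
# that were removed but still hold residual config files.
sample='ii  bash          4.3-14ubuntu1    amd64  GNU Bourne Again SHell
rc  mysql-server  5.6.30-0ubuntu0  all    MySQL database server
rc  mysql-common  5.6.30-0ubuntu0  all    MySQL common files'
# Same filter as in the cleanup command: keep 'rc' rows, print the package name.
printf '%s\n' "$sample" | grep '^rc' | awk '{print $2}'
```

In the real command this list of names is piped to `sudo xargs dpkg -P` to purge the leftovers.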

1.2 Installing PostgreSQL

  1. Download: `sudo apt-get install postgresql-9.4`
  2. Log in as the default user with `sudo -u postgres psql postgres`, and create Adminpack: `CREATE EXTENSION adminpack;`


  3. Use `\password postgres` to set the password of the default user ==postgres==; use `\q` to exit the psql shell:


  4. Create a database user ==hive== (password: hive) and the ==hive== database

    sudo -u postgres createuser -A -d -P hive
    sudo -u postgres createdb -O hive hive  # the first hive is the owning user, the second the database name


  5. Create a system user for logging in to the database from a shell

    sudo adduser hive


  6. Configure pg_hba.conf and postgresql.conf

    sudo vim /etc/postgresql/9.4/main/pg_hba.conf
    sudo vim /etc/postgresql/9.4/main/postgresql.conf

    `192.168.8.0` is the local network; the other entries should also be changed to md5 authentication.

    PostgreSQL accepts only local connections by default, so change `listen_addresses` to `*`.

    Restart the service for the configuration to take full effect

    sudo systemctl restart postgresql
    or
    sudo service postgresql restart

Reference:
http://www.unixmen.com/install-postgresql-9-4-and-phppgadmin-on-ubuntu-15-10/
https://help.ubuntu.com/lts/serverguide/mysql.html

2 Installing Hive

Environment variable settings:
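The original screenshot of these settings is not reproduced here; a typical addition to `~/.bashrc` might look like the sketch below. The `/usr/local/hive` install path is an assumption (it matches the paths used in hive-env.sh later in this guide); adjust it to wherever you unpacked Hive.

```shell
# Assumed install location -- adjust to wherever you unpacked Hive.
export HIVE_HOME=/usr/local/hive
export PATH="$PATH:$HIVE_HOME/bin"
echo "$HIVE_HOME"   # prints /usr/local/hive
```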

2.1 Download the MySQL / PostgreSQL JDBC driver

http://dev.mysql.com/downloads/connector/j/
https://jdbc.postgresql.org/download.html
Put the driver jar into Hive's ==lib== directory; MySQL is used in this demonstration.

2.2 Hive Configuration

Go into Hive's ==conf== directory and copy the following template files:

cp hive-default.xml.template hive-site.xml
cp hive-env.sh.template hive-env.sh

Configure hive-env.sh

      
      
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/usr/local/hadoop
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/usr/local/hive/conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/usr/local/hive/lib

Configure hive-site.xml
Hive loads two files, hive-default.xml and hive-site.xml. If the two files set the same parameter to different values, the user-configured value in hive-site.xml takes precedence. We can therefore keep only the following parameters in hive-site.xml:

      
      
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://node1:3306/hive?characterEncoding=UTF-8</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
</property>
</configuration>

2.3 Create the necessary directories

hdfs dfs -mkdir /user/
hdfs dfs -mkdir /user/hive/
hdfs dfs -mkdir /user/hive/warehouse
hdfs dfs -mkdir /tmp/
hdfs dfs -mkdir /tmp/hive
hdfs dfs -chmod 777 /user/hive/warehouse
hdfs dfs -chmod 777 /tmp/hive

Remember to change the directory permissions, otherwise JDBC access may fail with an error such as:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=anonymous, access=WRITE, inode="/user/hive/warehouse/loginfo":phoenix:supergroup:drwxr-xr-x

==loginfo== is a Hive table created via JDBC
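The effect of the `chmod 777` step can be sketched locally with ordinary filesystem permissions (`/tmp/hive-demo` is just an illustrative path; `hdfs dfs -chmod` behaves analogously on HDFS):

```shell
# Create a warehouse-like directory and open it up, as the HDFS commands above do.
mkdir -p /tmp/hive-demo/warehouse
chmod 777 /tmp/hive-demo/warehouse
# A default mkdir mode such as 755 (drwxr-xr-x) denies WRITE to other users,
# which is exactly what the AccessControlException above complains about.
stat -c '%a' /tmp/hive-demo/warehouse   # prints 777
```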

2.4 Startup

hive

The following error will be thrown:
Exception in thread "main" java.lang.RuntimeException: Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql)

Because Hive defaults to the Derby database, the metastore schema must be initialized first:

      
      
schematool -dbType mysql -initSchema
# If you previously initialized against another database type, delete the /tmp/{usr_name}/metastore_db folder first, otherwise initialization will fail
rm -R metastore_db


Some metadata tables are generated in the MySQL ==hive== database:


2.5 JDBC interface

Configure Hadoop's core-site.xml so that Hadoop allows anonymous access:

      
      
      
<!-- Replace phoenix with the user name you need -->
<property>
<name>hadoop.proxyuser.phoenix.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.phoenix.groups</name>
<value>*</value>
</property>

Otherwise you will run into the following error:
org.apache.hadoop.ipc.RemoteException: User: phoenix is not allowed to impersonate anonymous

After this change, restart Hadoop, then run hiveserver2 under ==hive/bin==.

The jars required in the Eclipse project are shown in the figure below; they can be found in the ==hive/lib== directory, and the two ==slf4j== jars can be found in Hadoop:


Test class:

      
      
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// The class name was missing from the original listing; any name will do.
public class HiveJdbcClient {
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";

    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
        Connection con = DriverManager.getConnection("jdbc:hive2://master:10000/default", "", "");
        Statement stmt = con.createStatement();
        String tableName = "loginfo";
        stmt.execute("drop table if exists " + tableName);
        stmt.execute("create table " + tableName + " (key int, value string)");
        System.out.println("Create table success!");
        String sql = "show tables '" + tableName + "'";
        System.out.println("Running: " + sql);
        ResultSet res = stmt.executeQuery(sql);
        if (res.next()) {
            System.out.println(res.getString(1));
        }
    }
}

Reference:
https://www.shiyanlou.com/courses/document/766
http://blog.csdn.net/nengyu/article/details/51620760
http://www.cnblogs.com/linjiqin/archive/2013/03/04/2943025.html

Origin: www.cnblogs.com/petewell/p/11606839.html