Hive (1) ---- Hive installation and configuration

Download the Hive installation package

This article uses hive-0.13.1-cdh5.3.6 as an example. The package name is hive-0.13.1-cdh5.3.6.tar.gz.

Extract Hive into the installation directory

tar -zxvf hive-0.13.1-cdh5.3.6.tar.gz -C /usr/local/src/

Rename the configuration files (in the conf directory under the Hive root)

mv hive-default.xml.template hive-site.xml

mv hive-env.sh.template hive-env.sh

mv hive-log4j.properties.template hive-log4j.properties
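The renaming step can also be wrapped in a small function. This is a minimal sketch, not part of the original procedure: the function name `activate_templates` is hypothetical, and it copies the templates instead of moving them so the originals stay available for reference.

```shell
# Hypothetical helper: create the active config files from their templates.
# Usage: activate_templates /usr/local/src/hive-0.13.1-cdh5.3.6/conf
activate_templates() {
    conf_dir="$1"
    for pair in \
        "hive-default.xml.template:hive-site.xml" \
        "hive-env.sh.template:hive-env.sh" \
        "hive-log4j.properties.template:hive-log4j.properties"
    do
        src="$conf_dir/${pair%%:*}"   # text before the colon
        dst="$conf_dir/${pair##*:}"   # text after the colon
        if [ ! -f "$src" ]; then
            echo "missing template: $src" >&2
            continue
        fi
        # Skip files that already exist so earlier edits are not overwritten.
        [ -e "$dst" ] || cp "$src" "$dst"
    done
}
```

Calling it a second time is harmless, because existing target files are left untouched.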

hive-env.sh file

export JAVA_HOME=/usr/local/src/jdk1.8.0_121

export HADOOP_HOME=/usr/local/src/hadoop-2.5.0-cdh5.3.6

export HIVE_CONF_DIR=/usr/local/src/hive-0.13.1-cdh5.3.6/conf

hive-site.xml file

<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://master:3306/metastore?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>

<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
</property>

<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
    <description>password to use against metastore database</description>
</property>

<!-- Show the database name and the column names -->
<!-- Whether to print the column names of query results in the current client -->
<property>
  <name>hive.cli.print.header</name>
  <value>true</value>
  <description>Whether to print the names of the columns in query output.</description>
</property>

<!-- Whether to show the name of the current database in the current client -->
<property>
  <name>hive.cli.print.current.db</name>
  <value>true</value>
  <description>Whether to include the current database in the Hive prompt.</description>
</property>

<!-- Let simple HiveQL queries bypass MapReduce -->
<property>
    <name>hive.fetch.task.conversion</name>
    <value>more</value>
    <description>
    Some select queries can be converted to single FETCH task minimizing latency.
    Currently the query should be single sourced not having any subquery and should not have
    any aggregations or distincts (which incurs RS), lateral views and joins.
    1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only
    2. more    : SELECT, FILTER, LIMIT only (TABLESAMPLE, virtual columns)
    </description>
</property>

Note: in this version, the hive-site.xml file is missing a <property> tag near line 2787.
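One rough way to confirm this kind of problem is to compare how many lines contain an opening `<property>` tag with how many contain a closing one. The sketch below does that with `grep`; the function name `check_property_tags` is made up for illustration, it assumes each tag sits on its own line, and a real XML parser would give a more reliable answer.

```shell
# Hypothetical helper: compare the number of lines holding <property>
# with the number of lines holding </property> in a config file.
check_property_tags() {
    file="$1"
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    opens=$(grep -c '<property>' "$file" || true)
    closes=$(grep -c '</property>' "$file" || true)
    if [ "$opens" -eq "$closes" ]; then
        echo "balanced: $opens property blocks"
    else
        echo "unbalanced: $opens opening vs $closes closing tags"
        return 1
    fi
}
```

For example, `check_property_tags /usr/local/src/hive-0.13.1-cdh5.3.6/conf/hive-site.xml` reports a mismatch until the missing tag has been added.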

hive-log4j.properties

hive.log.dir=/usr/local/src/hive-0.13.1-cdh5.3.6/logs

Install MySQL

su - root

yum -y install mysql mysql-server mysql-devel

wget http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm

rpm -ivh mysql-community-release-el7-5.noarch.rpm

yum -y install mysql-community-server

Configure MySQL

  • Start the MySQL service

    systemctl start mysqld.service

  • Set the root user password

    mysqladmin -uroot password '123456'

  • Authorize the user and the other machine nodes

    mysql> grant all on *.* to root@'master' identified by '123456';

    grant: grants privileges

    all: all privileges

    *.*: database name.table name (here, every table in every database)

    root: the MySQL user being authorized

    @'master': the host name the user connects from

    identified by '123456': the password

    Refresh the privileges when done: flush privileges;

  • Copy the database driver package to the lib folder in the Hive root directory

    cp -a mysql-connector-java-5.1.27-bin.jar /usr/local/src/hive-0.13.1-cdh5.3.6/lib/
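The authorization statements above can be generated for each node instead of being typed by hand. This is only a sketch: the function name `make_grant_sql` is hypothetical, and it keeps the `grant ... identified by` form used in this article, which works on MySQL 5.x but was removed in MySQL 8.0.

```shell
# Hypothetical helper: print the grant statements for one user/host pair.
make_grant_sql() {
    user="$1"; host="$2"; password="$3"
    printf "grant all on *.* to '%s'@'%s' identified by '%s';\n" \
        "$user" "$host" "$password"
    printf 'flush privileges;\n'
}

# Values taken from this article; adjust them for your own cluster.
make_grant_sql root master 123456

# To apply the statements (requires a running MySQL server):
#   make_grant_sql root master 123456 | mysql -uroot -p
```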

Start Hive (from the Hive root directory)

bin/hive

Set permissions on the HDFS directories that Hive uses

/usr/local/src/hadoop-2.5.0-cdh5.3.6/bin/hadoop fs -chmod 777 /tmp/

/usr/local/src/hadoop-2.5.0-cdh5.3.6/bin/hadoop fs -chmod 777 /user/hive/warehouse
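The `-chmod` commands fail if the directories do not exist yet, so it can help to create them first. The sketch below is an illustration under assumptions: the Hadoop path defaults to the HADOOP_HOME value from hive-env.sh, the `run` and `prepare_hive_dirs` names are hypothetical, and by default it only prints the commands (set `DRY_RUN=0` to execute them).

```shell
# Assumed install path; falls back to the HADOOP_HOME used in hive-env.sh.
HADOOP_BIN="${HADOOP_HOME:-/usr/local/src/hadoop-2.5.0-cdh5.3.6}/bin/hadoop"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# Hypothetical wrapper: print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create each directory if needed, then open up its permissions.
prepare_hive_dirs() {
    for dir in /tmp /user/hive/warehouse; do
        run "$HADOOP_BIN" fs -mkdir -p "$dir"
        run "$HADOOP_BIN" fs -chmod 777 "$dir"
    done
}

prepare_hive_dirs
```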

Create a database

create database school;

Create a table

create table t1(eid int, name string, sex string) row format delimited fields terminated by '\t';

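Import data into a Hive table

  • Import from the local file system:

    load data local inpath 'file path' into table database_name.table_name;

    (this statement copies the local file into the table's directory on HDFS)

  • Import from HDFS:

    load data inpath 'HDFS file path' into table database_name.table_name;

    (without the local keyword, the file is moved from its HDFS location into the table's directory)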
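To try the local import end to end, a small tab-separated file matching the t1 table can be generated first. This is a sketch with made-up sample rows and hypothetical file names; it only writes the data file and a HiveQL script, which can then be run with `hive -f`.

```shell
# Sample rows for t1(eid int, name string, sex string), separated by tabs.
# The names and values are invented for illustration.
data_file="${TMPDIR:-/tmp}/t1_sample.txt"
printf '%s\t%s\t%s\n' \
    1 alice F \
    2 bob M \
    3 carol F > "$data_file"

# Put the HiveQL in a script file (hypothetical name load_t1.hql).
hql_file="${TMPDIR:-/tmp}/load_t1.hql"
cat > "$hql_file" <<EOF
load data local inpath '$data_file' into table t1;
select * from t1;
EOF

cat "$hql_file"

# To execute it (requires a working Hive installation):
#   bin/hive -f "$hql_file"
```

The table name is left unqualified here, so the script loads into t1 in whichever database is current when it runs.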

Hive command history

cat ~/.hivehistory

The Hive CLI records the commands you have run in this file. It is mainly useful for troubleshooting logic errors or looking up frequently used commands.

Temporary Hive settings

Syntax: set property_name=property_value;

For example: set hive.cli.print.header=false;

A value set this way applies only to the current session.
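The same kind of one-off override can be passed when starting the CLI, using the `--hiveconf` option. The sketch below only builds and prints such a command line; the function name `hive_cmd` is hypothetical.

```shell
# Hypothetical helper: build a hive command line with per-run overrides.
hive_cmd() {
    cmd="hive"
    for kv in "$@"; do
        cmd="$cmd --hiveconf $kv"
    done
    echo "$cmd"
}

hive_cmd hive.cli.print.header=false hive.cli.print.current.db=true
```

Like `set`, these overrides do not change hive-site.xml and last only for that invocation.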
