Hive SQL Operations

Copyright notice: Creative Commons license, Attribution-ShareAlike. Others may create derivative works based on this article, and must distribute them under the same license as the original.

Hive deployment

1. Upload the tar package
2. Extract it
tar -zxvf hive-1.2.1.tar.gz
3. Install the MySQL database
Installing online with yum is recommended (run the installation script)
4. Configure Hive
(a) Set the HIVE_HOME environment variable
vi conf/hive-env.sh
Set $HADOOP_HOME in that file
(b) Configure the metastore database

vi hive-site.xml
		Add the following content:
		<configuration>
		<property>
		<name>javax.jdo.option.ConnectionURL</name>
		<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
		<description>JDBC connect string for a JDBC metastore</description>
		</property>

		<property>
		<name>javax.jdo.option.ConnectionDriverName</name>
		<value>com.mysql.jdbc.Driver</value>
		<description>Driver class name for a JDBC metastore</description>
		</property>

		<property>
		<name>javax.jdo.option.ConnectionUserName</name>
		<value>root</value>
		<description>username to use against metastore database</description>
		</property>

		<property>
		<name>javax.jdo.option.ConnectionPassword</name>
		<value>root</value>
		<description>password to use against metastore database</description>
		</property>
		</configuration>

5. After Hive and MySQL are installed, copy the MySQL connector JAR into the $HIVE_HOME/lib directory
If a permissions problem appears, grant privileges in MySQL (run this on the machine where MySQL is installed)
mysql -uroot -p

# Set the password
set password=password('root');

# (Run the statement below. *.*: all tables in all databases; %: any IP address or host may connect)
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'root' WITH GRANT OPTION;

FLUSH PRIVILEGES;

Then log in with mysql -uroot -proot
6. The Jline package versions are inconsistent: copy jline-2.12.jar from Hive's lib directory to replace the one in Hadoop
    6.1 cp hive/lib/jline-2.12.jar /opt/software/hadoop-2.6.4/share/hadoop/yarn/lib/
    6.2 After Hive and MySQL are installed, copy the MySQL connector JAR into the $HIVE_HOME/lib directory

Why Hive was created:

Hive lets non-Java programmers run MapReduce operations on data in HDFS.
Hive is just a tool for building a data warehouse. Underneath, it still relies on HDFS and MapReduce to do the work.

About Hive

Hive is a data warehouse.
Hive consists of an interpreter, a compiler, an optimizer, and so on.
When Hive runs, its metadata is stored in a relational database.

Hive process flow

[Figure: Hive process flow]

Hive architecture

[Figure: Hive architecture]
(1) There are three main user interfaces: CLI, Client, and WUI. The CLI is the most commonly used; starting the CLI also starts a copy of Hive. Client is the Hive client, through which the user connects to Hive Server. When starting in Client mode, you must specify the node where Hive Server runs and start Hive Server on that node. WUI accesses Hive through a browser.
(2) Hive stores its metadata in a database such as MySQL or Derby. The metadata includes table names, each table's columns and partitions and their properties, table properties (such as whether the table is external), and the directory where each table's data is located.
(3) The interpreter, compiler, and optimizer take an HQL query through lexical analysis, parsing, compilation, optimization, and query plan generation. The generated query plan is stored in HDFS and subsequently executed by MapReduce.
(4) Hive data is stored in HDFS, and most queries and computations are carried out by MapReduce (a simple query such as select * from tbl does not generate a MapReduce job).
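One way to check the point in (4) yourself is Hive's EXPLAIN statement, which prints the plan for a query without running it. The sketch below reuses the tbl table name from the text. The exact plan output depends on the Hive version and settings, so none is shown here.

```sql
-- A plain scan is typically served by a fetch task, with no MapReduce stage
EXPLAIN SELECT * FROM tbl;

-- An aggregation has to be computed, so on the MapReduce engine
-- the plan normally includes a map-reduce stage
EXPLAIN SELECT COUNT(*) FROM tbl;
```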

The compiler converts Hive SQL into operators.
An operator is the smallest processing unit in Hive.
Each operator represents either an HDFS operation or a MapReduce job.

2. Connecting to the database
(1) Enter the database directly
(2) Connect externally, for example with Navicat
2.1 External connection methods
CLI
Web
JDBC
3. Where Hive stores metadata
3.1 Derby: Hive's built-in database; it cannot be accessed by multiple users at the same time
3.2 Local
3.3 Stored in an external database (more secure)

HiveSQL

1. Creating tables in a database
Tables are created inside a database.
(1) Internal table

CREATE  TABLE [IF NOT EXISTS] table_name
	When the table is dropped, both its metadata and its data are deleted

(2) External table

CREATE EXTERNAL TABLE [IF NOT EXISTS] table_name LOCATION hdfs_path
	Dropping an external table deletes only the metadata in the metastore; the table data in HDFS is not deleted

(3) Partitioned table
(4) Bucketed table
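
The partitioned and bucketed table types are only named above, so here is a minimal sketch of each. The table and column names (access_log, dt, user_bucketed) are hypothetical and chosen purely for illustration.

```sql
-- Partitioned table: each distinct dt value gets its own subdirectory
-- (for example .../access_log/dt=2019-06-01/) under the table directory
CREATE TABLE IF NOT EXISTS access_log (
  user_id INT,
  url STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

-- Bucketed table: rows are distributed into 4 buckets by a hash of id
CREATE TABLE IF NOT EXISTS user_bucketed (
  id INT,
  name STRING
)
CLUSTERED BY (id) INTO 4 BUCKETS;
```

Note that the partition column dt is declared only in PARTITIONED BY, not in the column list. Queries can still filter on it like an ordinary column, for example WHERE dt = '2019-06-01'.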
2. Creating a Hive table
CREATE TABLE person(
id INT,
name STRING,
age INT,
likes ARRAY<STRING>,
address MAP<STRING,STRING>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '-'
MAP KEYS TERMINATED BY ':'
LINES TERMINATED BY '\n';
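
To make the delimiters concrete, here is a minimal sketch that loads one row and reads it back. The file path /tmp/person.txt and the sample row are assumptions for illustration, not part of the original setup, and the sketch assumes likes holds strings.

```sql
-- Hypothetical contents of /tmp/person.txt (one row per line):
-- 1,Tom,20,reading-music,home:Beijing-work:Shanghai
--   fields are split on ',', array items and map entries on '-',
--   and each map key is split from its value on ':'

-- Load the local file into the table (the path is an assumption)
LOAD DATA LOCAL INPATH '/tmp/person.txt' INTO TABLE person;

-- Read the first array element and one map value
SELECT name, likes[0], address['home'] FROM person;
```

For the sample row, likes holds the two items reading and music, and address maps home to Beijing and work to Shanghai.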
3. Viewing a Hive table description
DESCRIBE [EXTENDED|FORMATTED] table_name
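
As a concrete use, the statements below inspect the person table from the previous step. This is a small sketch; the output layout varies across Hive versions.

```sql
-- Column names and types only
DESCRIBE person;

-- Full details, including the HDFS location and the table type
-- (MANAGED_TABLE for an internal table, EXTERNAL_TABLE for an external one)
DESCRIBE FORMATTED person;
```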

Origin blog.csdn.net/power_k/article/details/92848221