[Hive: DDL and DML]

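Everyone has their own way of living and their own road to walk, and no single road suits everyone. It is like choosing underwear: how flashy it looks does not matter, what matters is that it fits you. Life has many crossroads. Once you have chosen, keep going without hesitation and do not look back, because looking back will not give you the answer you want.
Preface: for everyone working hard toward their own goals!
Let's begin today's lesson.
DDL: Data Definition Language
Statements that begin with the keywords create, drop, or alter.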

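Database
A database is a directory on HDFS.
Hive ships with a built-in database named default.
Default database location: /user/hive/warehouse
The location is controlled by the parameter hive.metastore.warehouse.dir
All Hive configuration parameters are listed at: https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties
To view the value of a configuration parameter in Hive:
set key;
To set a parameter: set key=value;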
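
As a small illustration of the two forms of set, the sketch below reads the warehouse directory parameter and then changes a CLI option. hive.cli.print.current.db is chosen here only as an example of a settable parameter; a value set this way applies to the current session only.

```sql
-- Print the current value of the warehouse directory parameter.
set hive.metastore.warehouse.dir;

-- Show the current database name in the CLI prompt, e.g. "hive (default)>".
set hive.cli.print.current.db=true;

-- Read the value back to confirm the change for this session.
set hive.cli.print.current.db;
```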

CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
[COMMENT database_comment]
[LOCATION hdfs_path]
[WITH DBPROPERTIES (property_name=property_value, ...)];

CREATE DATABASE IF NOT EXISTS d5_hive;
Storage path of a non-default database: ${hive.metastore.warehouse.dir}/dbname.db

CREATE DATABASE IF NOT EXISTS d5_hive_2
COMMENT 'this is ruozedata d5'
WITH DBPROPERTIES ('creator'='ruoze', 'date'='20181020');
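
The optional LOCATION clause from the syntax above is not exercised in these notes, so here is a minimal sketch of it together with the commands for inspecting a database. The database name demo_db and the path /demo/demo_db are hypothetical and exist only for illustration.

```sql
-- Hypothetical database with an explicit HDFS location
-- (name and path are illustrative, not from the original notes).
CREATE DATABASE IF NOT EXISTS demo_db
COMMENT 'database stored outside the default warehouse directory'
LOCATION '/demo/demo_db';

-- Basic information: name, comment, location, owner.
DESC DATABASE demo_db;

-- EXTENDED additionally prints the DBPROPERTIES key/value pairs.
DESC DATABASE EXTENDED d5_hive_2;
```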

ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...); -- (Note: SCHEMA added in Hive 0.14.0)

ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role; -- (Note: Hive 0.13.0 and later; SCHEMA added in Hive 0.14.0)

ALTER (DATABASE|SCHEMA) database_name SET LOCATION hdfs_path; -- (Note: Hive 2.2.1, 2.4.0 and later)

DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];

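CASCADE: the same idea as cascading in Hibernate/JPA.
In a one-to-many relationship, it decides whether deleting the "one" side also deletes the "many" side. Here the database is the "one" side and its tables are the "many" side; RESTRICT, the default, refuses to drop a database that still contains tables.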
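
The difference between RESTRICT and CASCADE can be sketched as follows. The database demo_db and table t are hypothetical names used only for this example.

```sql
-- Hypothetical database containing one table.
CREATE DATABASE IF NOT EXISTS demo_db;
CREATE TABLE IF NOT EXISTS demo_db.t (id int);

-- RESTRICT is the default: this statement fails while demo_db still contains tables.
DROP DATABASE IF EXISTS demo_db RESTRICT;

-- CASCADE drops the contained tables first, then the database itself.
DROP DATABASE IF EXISTS demo_db CASCADE;
```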

create table xx(id int);

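Numeric types: int, bigint, float, double
String type: string, which is also commonly used for dates and times,
e.g. 20181020

In a table, date and time columns are therefore often declared as string.

Boolean conditions look like: where flag=true (a 0/1 value is sometimes used instead)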
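
To make the type notes concrete, here is a rough sketch of a table that stores a day as a string and a flag as a boolean. The table demo_events and its columns are hypothetical. Because the day is a fixed-width yyyyMMdd string, plain string comparison also works for range filters.

```sql
-- Hypothetical table: the day is kept as a yyyyMMdd string.
CREATE TABLE IF NOT EXISTS demo_events (
  id int,
  event_day string,
  flag boolean
);

-- Exact-day filter combined with a boolean condition.
SELECT id FROM demo_events WHERE event_day = '20181020' AND flag = true;

-- Range filter: fixed-width yyyyMMdd strings sort in date order.
SELECT id FROM demo_events WHERE event_day >= '20181001' AND event_day <= '20181031';
```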

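Hive is built on top of Hadoop.
You create a table in Hive, and the data itself is stored on HDFS.
File:  zhangsan,20,m,beijing
Table: name age gender location

So a table definition must specify the field delimiter that matches the file (the default delimiter is \001, i.e. ^A).
Other common delimiters are the space and the tab (\t).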
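
For the comma-separated sample line above, a matching table definition might look like the sketch below. The table name demo_user is hypothetical; the column names follow the table row in the notes.

```sql
-- Hypothetical table for a file whose lines look like: zhangsan,20,m,beijing
CREATE TABLE IF NOT EXISTS demo_user (
  name string,
  age int,
  gender string,
  location string
)
-- The delimiter must match the file; otherwise the whole line lands in the
-- first column and the remaining columns are NULL.
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
```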

CREATE TABLE ruoze_emp (
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
;

load data local inpath '/home/hadoop/data/emp.txt' overwrite into table ruoze_emp;


row_format
DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]
[MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
[NULL DEFINED AS char] -- (Note: Available in Hive 0.13 and later)
| SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]
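
The COLLECTION ITEMS and MAP KEYS clauses in the grammar above only matter for complex column types. A minimal sketch, assuming a hypothetical table demo_person whose data lines look like `tom<TAB>reading,running<TAB>math:90,english:85`:

```sql
-- Hypothetical table with an array column and a map column.
CREATE TABLE IF NOT EXISTS demo_person (
  name string,
  hobbies array<string>,
  scores map<string,int>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'            -- separates the three columns
  COLLECTION ITEMS TERMINATED BY ','   -- separates array elements and map entries
  MAP KEYS TERMINATED BY ':';          -- separates a map key from its value

-- Access one array element and one map value.
SELECT name, hobbies[0], scores['math'] FROM demo_person;
```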

desc xxx
desc formatted ruoze_emp;

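Managed (internal) tables and external tables
MANAGED_TABLE is the Table Type that desc formatted shows for a managed table.

create table ruoze_emp_managed as select * from ruoze_emp;
This creates a managed table: the data is stored on HDFS and the metadata is stored in MySQL.
Dropping a managed table deletes both the data on HDFS and the metadata in MySQL; dropping an external table deletes only the metadata and leaves the files on HDFS.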

CREATE EXTERNAL TABLE ruoze_emp_external (
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/ruoze_emp_external'
;
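
One way to observe what distinguishes an external table is to drop it and then look at its location. This is a sketch for the Hive CLI, which accepts dfs commands directly; it assumes files have already been placed under /ruoze_emp_external.

```sql
-- The "Table Type" field is expected to read EXTERNAL_TABLE here.
DESC FORMATTED ruoze_emp_external;

-- Dropping an external table removes only its metadata.
DROP TABLE IF EXISTS ruoze_emp_external;

-- The files under the table's LOCATION should still be listed afterwards.
dfs -ls /ruoze_emp_external;
```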

DML: Data Manipulation Language

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]

create table ruoze_dept(
deptno int,
dname string,
loc string
)ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA LOCAL INPATH '/home/hadoop/data/dept.txt' INTO TABLE ruoze_dept;

LOAD DATA INPATH '/data/dept.txt' INTO TABLE ruoze_dept;
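
The two LOAD statements above differ only in LOCAL. With LOCAL the file is copied from the local filesystem; without it the path is read from HDFS and the file is moved into the table directory. The sketch below also contrasts appending with OVERWRITE, reusing the dept.txt path from these notes.

```sql
-- LOCAL: the file is copied from the local filesystem into the table directory.
LOAD DATA LOCAL INPATH '/home/hadoop/data/dept.txt' INTO TABLE ruoze_dept;

-- Without OVERWRITE, loading the same file again appends, so every row appears twice.
LOAD DATA LOCAL INPATH '/home/hadoop/data/dept.txt' INTO TABLE ruoze_dept;

-- OVERWRITE first removes the existing contents of the table, then loads the file.
LOAD DATA LOCAL INPATH '/home/hadoop/data/dept.txt' OVERWRITE INTO TABLE ruoze_dept;

-- The count is expected to equal the number of lines in dept.txt again.
SELECT count(*) FROM ruoze_dept;
```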

Standard syntax:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1 FROM from_statement;
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;

INSERT OVERWRITE TABLE ruoze_emp_test select empno,ename from ruoze_emp;

INSERT OVERWRITE TABLE ruoze_emp_test
select empno,job,ename,mgr,hiredate,sal,comm,deptno from ruoze_emp;
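
The definition of ruoze_emp_test is not shown in these notes. Assuming it has the same eight columns as ruoze_emp, the two statements above illustrate that INSERT ... SELECT matches columns by position, not by name: a two-column select does not fit an eight-column target, and swapping job and ename in the select list silently writes job values into the ename column. A sketch of a consistent version under that assumption:

```sql
-- Assumption: ruoze_emp_test shares the schema of ruoze_emp.
CREATE TABLE IF NOT EXISTS ruoze_emp_test LIKE ruoze_emp;

-- Select list in the same order as the target table's columns.
INSERT OVERWRITE TABLE ruoze_emp_test
SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno FROM ruoze_emp;

-- Spot-check that names and jobs ended up in the intended columns.
SELECT empno, ename, job FROM ruoze_emp_test LIMIT 3;
```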

INSERT OVERWRITE LOCAL DIRECTORY '/tmp/ruoze'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT empno,ename FROM ruoze_emp;
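
Dropping the LOCAL keyword writes the query result to a directory on HDFS instead of the local filesystem. A minimal sketch, with /tmp/ruoze_hdfs as a hypothetical output path; note that OVERWRITE replaces whatever is already in that directory.

```sql
-- Export to an HDFS directory (the path is illustrative).
INSERT OVERWRITE DIRECTORY '/tmp/ruoze_hdfs'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT empno, ename FROM ruoze_emp;

-- From the Hive CLI, list the exported files.
dfs -ls /tmp/ruoze_hdfs;
```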
Hands-on session
[root@hadoop001 ~]# su - hadoop
Last login: Mon Nov 19 23:26:37 CST 2018 on pts/1
-bash: /home/hadoop/.bash_profile: line 14: syntax error near unexpected token `newline'
-bash: /home/hadoop/.bash_profile: line 14:
[hadoop@hadoop001 ~]$ cd app
[hadoop@hadoop001 app]$ ll
total 417656
drwxr-xr-x 16 hadoop hadoop 4096 Nov 19 09:02 hadoop-2.6.0-cdh5.7.0
-rw-rw-r-- 1 hadoop hadoop 311585484 Apr 1 2016 hadoop-2.6.0-cdh5.7.0.tar.gz
drwxr-xr-x 10 hadoop hadoop 4096 Mar 24 2016 hive-1.1.0-cdh5.7.0
-rw-rw-r-- 1 hadoop hadoop 116082695 Apr 1 2016 hive-1.1.0-cdh5.7.0.tar.gz
[hadoop@hadoop001 app]$ cd hive-1.1.0-cdh5.7.0
[hadoop@hadoop001 hive-1.1.0-cdh5.7.0]$ ll
total 440
drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 bin
drwxr-xr-x 2 hadoop hadoop 4096 Nov 19 23:22 conf
drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 data
drwxr-xr-x 6 hadoop hadoop 4096 Mar 24 2016 docs
drwxr-xr-x 4 hadoop hadoop 4096 Mar 24 2016 examples
drwxr-xr-x 7 hadoop hadoop 4096 Mar 24 2016 hcatalog
drwxr-xr-x 4 hadoop hadoop 12288 Nov 19 20:45 lib
-rw-r--r-- 1 hadoop hadoop 23169 Mar 24 2016 LICENSE
-rw-r--r-- 1 hadoop hadoop 397 Mar 24 2016 NOTICE
-rw-r--r-- 1 hadoop hadoop 4048 Mar 24 2016 README.txt
-rw-r--r-- 1 hadoop hadoop 376416 Mar 24 2016 RELEASE_NOTES.txt
drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 scripts
[hadoop@hadoop001 hive-1.1.0-cdh5.7.0]$ cd conf
[hadoop@hadoop001 conf]$ hive
which: no hbase in (/home/hadoop/app/hive-1.1.0-cdh5.7.0/bin:/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/bin:/usr/java/jdk1.7.0_80/bin:/usr/java/jdk1.7.0_80/bin:/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin)

Logging initialized using configuration in jar:file:/home/hadoop/app/hive-1.1.0-cdh5.7.0/lib/hive-common-1.1.0-cdh5.7.0.jar!/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive (default)> show databases;
OK
default
Time taken: 1.219 seconds, Fetched: 1 row(s)
hive (default)>
> CREATE DATABASE d5_hive;
OK
Time taken: 0.588 seconds
hive (default)> show databases;
OK
d5_hive
default
Time taken: 0.095 seconds, Fetched: 2 row(s)
hive (default)> CREATE DATABASE IF NOT EXISTS d5_hive_2
> COMMENT 'this is ruozedata d5'
> WITH DBPROPERTIES ('creator'='ruoze', 'date'='20181020');
OK
Time taken: 0.192 seconds
hive (default)> show databases;
OK
d5_hive
d5_hive_2
default
Time taken: 0.067 seconds, Fetched: 3 row(s)
hive (default)> desc database d5_hive_2
> desc database d5_hive_2;
FAILED: ParseException line 2:0 missing EOF at 'desc' near 'd5_hive_2'
hive (default)> desc database d5_hive_2;
OK
d5_hive_2 this is ruozedata d5 hdfs://hadoop001:9000/user/hive/warehouse/d5_hive_2.db hadoop USER
Time taken: 0.072 seconds, Fetched: 1 row(s)
hive (default)> use d5_hive;
OK
Time taken: 0.029 seconds
hive (d5_hive)> show databases;
OK
d5_hive
d5_hive_2
default
Time taken: 0.047 seconds, Fetched: 3 row(s)
hive (d5_hive)> create table x(id int)
> ;
OK
Time taken: 0.446 seconds
hive (d5_hive)> show databases;
OK
d5_hive
d5_hive_2
default
Time taken: 0.055 seconds, Fetched: 3 row(s)
hive (d5_hive)> drop database d5_hive_2;
OK
Time taken: 1.301 seconds
hive (d5_hive)> show databases;
OK
d5_hive
default
Time taken: 0.071 seconds, Fetched: 2 row(s)
hive (d5_hive)> drop database d5_hive
> ;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidOperationException(message:Database d5_hive is not empty. One or more tables exist.)
hive (d5_hive)> drop database d5_hive cascade;
OK
Time taken: 7.571 seconds
hive (d5_hive)> show databases;
OK
default
Time taken: 0.091 seconds, Fetched: 1 row(s)
hive (d5_hive)> use default;
OK
Time taken: 0.047 seconds
hive (default)> CREATE TABLE ruoze_emp (
> empno int,
> ename string,
> job string,
> mgr int,
> hiredate string,
> sal double,
> comm double,
> deptno int
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
> ;
OK
Time taken: 0.349 seconds
hive (default)> show databases;
OK
default
Time taken: 0.127 seconds, Fetched: 1 row(s)
hive (default)> show tables;
OK
ruoze_emp
Time taken: 0.08 seconds, Fetched: 1 row(s)
hive (default)> select * from ruoze_emp;
OK
Time taken: 0.645 seconds
hive (default)> load data local inpath '/home/hadoop/data/emp.txt' overwrite into table ruoze_emp;
Loading data to table default.ruoze_emp
Table default.ruoze_emp stats: [numFiles=1, numRows=0, totalSize=700, rawDataSize=0]
OK
Time taken: 1.598 seconds
hive (default)> select * from ruoze_emp;
OK
7369 SMITH CLERK 7902 1980-12-17 800.0 NULL 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.0 300.0 30
7521 WARD SALESMAN 7698 1981-2-22 1250.0 500.0 30
7566 JONES MANAGER 7839 1981-4-2 2975.0 NULL 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.0 1400.0 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.0 NULL 30
7782 CLARK MANAGER 7839 1981-6-9 2450.0 NULL 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.0 NULL 20
7839 KING PRESIDENT NULL 1981-11-17 5000.0 NULL 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.0 0.0 30
7876 ADAMS CLERK 7788 1987-5-23 1100.0 NULL 20
7900 JAMES CLERK 7698 1981-12-3 950.0 NULL 30
7902 FORD ANALYST 7566 1981-12-3 3000.0 NULL 20
7934 MILLER CLERK 7782 1982-1-23 1300.0 NULL 10
8888 HIVE PROGRAM 7839 1988-1-23 10300.0 NULL NULL
Time taken: 0.196 seconds, Fetched: 15 row(s)
hive (default)>
hive (default)> show tables;
OK
ruoze_emp
ruoze_emp2
ruoze_emp3
ruoze_emp_managed
Time taken: 0.02 seconds, Fetched: 4 row(s)
hive (default)> CREATE EXTERNAL TABLE ruoze_emp_external (
> empno int,
> ename string,
> job string,
> mgr int,
> hiredate string,
> sal double,
> comm double,
> deptno int
> )ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
> LOCATION '/ruoze_emp_external'
> ;
OK
hive (default)>
>
> create table ruoze_dept(
> deptno int,
> dname string,
> loc string
> )ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
OK
Time taken: 0.342 seconds
hive (default)>
> select * from ruoze_dept;
OK
Time taken: 0.32 seconds
hive (default)> LOAD DATA LOCAL INPATH '/home/hadoop/data/dept.txt' INTO TABLE ruoze_dept;
Loading data to table default.ruoze_dept
Table default.ruoze_dept stats: [numFiles=1, totalSize=79]
OK
Time taken: 0.735 seconds
hive (default)> select * from ruoze_dept;
OK
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
Time taken: 0.208 seconds, Fetched: 4 row(s)
hive (default)>
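Closing words: every bud waits a long time before it blooms, and every bloom should shine with everything it has. Do not fear how long the road ahead is; fear only having no light in your heart. Believe in yourself and set out!
For you, running along the road!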


Reposted from blog.csdn.net/qq_43688472/article/details/84307884