Hive DDL Notes

To make it convenient to inspect table structures and data, access Hive in client/server mode.


Start the Hive server (it listens on port 10000):

[root@master apps]# hive-1.2.1/bin/hiveserver2

After the server starts, the cursor just keeps blinking; that means it is up and running.
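
If you prefer not to keep a terminal tied up, a common alternative (a sketch, assuming the same install path) is to start the server in the background and send its output to a log file:

[root@master apps]# nohup hive-1.2.1/bin/hiveserver2 > hiveserver2.log 2>&1 &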


Connect a client to the Hive server and operate through it

Use Beeline (shipped with Hive) as the client, run from another machine.
Note: Enter username for jdbc:hive2://master.hadoop:10000: (enter the user that started the server)

        Enter password for jdbc:hive2://master.hadoop:10000: (no password is configured here, so just press Enter to skip)


The steps in detail:

[root@slave1 apps]# cd hive-1.2.1/
[root@slave1 hive-1.2.1]# bin/beeline 
Beeline version 1.2.1 by Apache Hive
beeline> !connect jdbc:hive2://master.hadoop:10000
Connecting to jdbc:hive2://master.hadoop:10000
Enter username for jdbc:hive2://master.hadoop:10000: root
Enter password for jdbc:hive2://master.hadoop:10000: 
Connected to: Apache Hive (version 1.2.1)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://master.hadoop:10000> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
| mydb           |
+----------------+--+
2 rows selected (3.776 seconds)
0: jdbc:hive2://master.hadoop:10000> use mydb;
No rows affected (0.352 seconds)
0: jdbc:hive2://master.hadoop:10000> show tables;
+-----------+--+
| tab_name  |
+-----------+--+
| t_1       |
| t_1_like  |
| t_1_son   |
| t_2       |
+-----------+--+
4 rows selected (0.236 seconds)
0: jdbc:hive2://master.hadoop:10000> 
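
The interactive prompts can also be skipped by passing the JDBC URL and username directly on the command line, using Beeline's standard -u and -n options; a minimal sketch:

[root@slave1 hive-1.2.1]# bin/beeline -u jdbc:hive2://master.hadoop:10000 -n root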

Hive DDL operations

1. Create an internal (managed) table; its directory is created automatically by Hive

create table t_1(id int, name string,password string)
row format delimited
fields terminated by ',';
The steps in detail:
0: jdbc:hive2://master.hadoop:10000> create table t_1(id int, name string,password string)
0: jdbc:hive2://master.hadoop:10000> row format delimited
0: jdbc:hive2://master.hadoop:10000> fields terminated by ',';
No rows affected (1.671 seconds)
0: jdbc:hive2://master.hadoop:10000> show tables;
+-----------+--+
| tab_name  |
+-----------+--+
| t_1       |
+-----------+--+
1 row selected (0.104 seconds)

2. Create an external table; you can specify its data directory

create external table t_2(id int, name string,password string)
row format delimited
fields terminated by ','
location '/aa/bb';
The steps in detail:
0: jdbc:hive2://master.hadoop:10000> create external table t_2(id int, name string,password string)
0: jdbc:hive2://master.hadoop:10000> row format delimited
0: jdbc:hive2://master.hadoop:10000> fields terminated by ','
0: jdbc:hive2://master.hadoop:10000> location '/aa/bb';
No rows affected (4.267 seconds)
0: jdbc:hive2://master.hadoop:10000> show tables;
+-----------+--+
| tab_name  |
+-----------+--+
| t_1       |
| t_2       |
+-----------+--+
2 rows selected (0.363 seconds)
0: jdbc:hive2://master.hadoop:10000> 
Differences between internal tables and external tables:
1)
An internal table's directory is created by Hive under the default warehouse path: /user/hive/warehouse/......
An external table's data directory is specified by the user at table-creation time: location '/path'
2)
Dropping an internal table deletes both the table's metadata and its data directory.
Dropping an external table deletes only the metadata; the data directory is left intact.

Why external tables exist: in a data warehouse system the data is usually produced by other systems. External tables let Hive map that data where it already lies, and even if the Hive table is dropped, the directory remains, so other systems that keep using it are unaffected.
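
To check which kind a given table is, describe formatted reports its Table Type (MANAGED_TABLE for internal, EXTERNAL_TABLE for external); for example:

0: jdbc:hive2://master.hadoop:10000> describe formatted t_2;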

3. Loading data

1) Load a local file into Hive (the file must be on the machine running the Hive server, here master.hadoop):

load data local inpath '/root/a.txt' into table t_1;

The steps in detail:

0: jdbc:hive2://master.hadoop:10000> load data local inpath '/root/a.txt' into table t_1;
INFO  : Loading data to table mydb.t_1 from file:/root/a.txt
INFO  : Table mydb.t_1 stats: [numFiles=1, totalSize=45]
No rows affected (0.911 seconds)
0: jdbc:hive2://master.hadoop:10000> select * from t_1;
+---------+-----------+---------------+--+
| t_1.id  | t_1.name  | t_1.password  |
+---------+-----------+---------------+--+
| 1       | user1     | 123123        |
| 2       | user2     | 123123        |
| 3       | user3     | 123123        |
+---------+-----------+---------------+--+
3 rows selected (0.289 seconds)
0: jdbc:hive2://master.hadoop:10000> 
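Note that load data ... into table appends to whatever is already in the table directory; to replace the existing contents instead, add overwrite. A sketch using the same file:

load data local inpath '/root/a.txt' overwrite into table t_1;
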
2) Load an HDFS file into Hive:

load data inpath '/a.txt' into table t_1; (this moves the file out of its HDFS location and into the table directory)

The steps in detail:

List the files under the HDFS root:

[root@master ~]# hadoop fs -ls /
Found 4 items
-rw-r--r--   2 root supergroup         45 2018-07-08 02:47 /a.txt
drwxr-xr-x   - root supergroup          0 2018-07-07 16:40 /root
drwx-wx-wx   - root supergroup          0 2018-07-07 16:05 /tmp
drwxr-xr-x   - root supergroup          0 2018-07-07 16:36 /user
[root@master ~]# 

Load a.txt into Hive:

0: jdbc:hive2://master.hadoop:10000> load data inpath '/a.txt' into table t_1;
INFO  : Loading data to table mydb.t_1 from hdfs://ns/a.txt
INFO  : Table mydb.t_1 stats: [numFiles=2, totalSize=90]
No rows affected (0.986 seconds)
0: jdbc:hive2://master.hadoop:10000> select * from t_1;
+---------+-----------+---------------+--+
| t_1.id  | t_1.name  | t_1.password  |
+---------+-----------+---------------+--+
| 1       | user1     | 123123        |
| 2       | user2     | 123123        |
| 3       | user3     | 123123        |
| 1       | user1     | 123123        |
| 2       | user2     | 123123        |
| 3       | user3     | 123123        |
+---------+-----------+---------------+--+
6 rows selected (0.435 seconds)
0: jdbc:hive2://master.hadoop:10000> 

List the HDFS root again; a.txt has been moved away:

[root@master ~]# hadoop fs -ls /
Found 3 items
drwxr-xr-x   - root supergroup          0 2018-07-07 16:40 /root
drwx-wx-wx   - root supergroup          0 2018-07-07 16:05 /tmp
drwxr-xr-x   - root supergroup          0 2018-07-07 16:36 /user
[root@master ~]# 
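The file was moved rather than copied; with the default warehouse location it should now sit inside the table's directory (the exact path depends on hive.metastore.warehouse.dir, so the path below is an assumption):

[root@master ~]# hadoop fs -ls /user/hive/warehouse/mydb.db/t_1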
3) Query data from another table and insert it into a new table (the new table is created automatically):
create table t_1_son
as
select * from t_1;
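The query can also project or filter; a hypothetical sketch (t_1_part is an illustrative name) that copies only two columns of the matching rows:

create table t_1_part
as
select id, name from t_1 where id > 1;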
4) Create a new table based on the schema of an existing table:
create table t_1_like like t_1;

Then insert data into it:

insert into table t_1_like
select
id,name,password
from t_1;
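
insert into appends to the table's existing data; to replace the current contents instead, use insert overwrite. A sketch against the same tables:

insert overwrite table t_1_like
select
id,name,password
from t_1;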

4. Exporting data:

Export to HDFS:

insert overwrite directory '/root/aa/bb'
select * from t_1;

Export to the local filesystem:

insert overwrite local directory '/root/aa/bb'
select * from t_1;
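
By default the exported files use Hive's internal field delimiter (Ctrl-A, \001). From Hive 0.11 on, the delimiter can be set explicitly in the export statement; a sketch:

insert overwrite local directory '/root/aa/bb'
row format delimited
fields terminated by ','
select * from t_1;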


Reposted from blog.csdn.net/nuc2015/article/details/80953633