Basic HBase Database Operations

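Reference: 数据酷客 <Hadoop基础.Hbase的Shell命令> (Hadoop Basics: HBase Shell Commands)
Last month I wrote a post on basic Hive data warehouse operations, and I still have not found time to review it. Today I learned a whole batch of HBase operations, so to avoid mixing the two up and to make later review and lookup faster, here is another post on basic HBase Shell operations. My memory is poor, so I write things down.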

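Command     Purpose
create      Create a table
desc        Show table information
put         Insert data
get         Query data by row key
scan        Query data by scanning the table
alter       Modify a table
truncate    Clear all data from a table
drop        Delete a table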

Once HBase and its dependencies are running, enter hbase shell to start the HBase client.

[root@namenode opt]# hbase shell

Enter help to list the HBase shell commands:

hbase(main):001:0> help
HBase Shell, version 1.2.6, rUnknown, Mon May 29 02:25:32 CDT 2017
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
......

There are many commands; this post covers only a few basic ones.

1. create: create a table

Enter help 'create' to see the syntax for creating a table:

hbase(main):002:0> help 'create'
......

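Basic syntax: create 'table name', 'column family name'
A side note: you may run into a "znode data == null" error. This happens when the user running HBase cannot write files to ZooKeeper, which leaves the znode empty.
Fix: specify the ZooKeeper data directory in hbase-site.xml: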

<property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/opt/zookeeper/data</value>
</property>

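When creating an HBase table, you specify only the table name and the column family names, not the column names or types.
Example: create a table named student1 with a column family named cf1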

hbase(main):003:0> create 'student1' , 'cf1'
0 row(s) in 3.4160 seconds

=> Hbase::Table - student1

Create a table named student2 with two column families, cf1 and cf2

hbase(main):011:0> create 'student2' , 'cf1' , 'cf2'
0 row(s) in 2.2410 seconds

=> Hbase::Table - student2

Enter the list command to see the tables in HBase

hbase(main):016:0> list
TABLE
student1
student2
2 row(s) in 0.0080 seconds

=> ["student1", "student2"]

2. desc: show table information

Use desc 'table name' to view the details of table student1

hbase(main):018:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0590 seconds

The content inside {} describes a column family. Some of the attributes are explained below:

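Attribute            Meaning
NAME                 Column family name
VERSIONS             Maximum number of versions kept per cell
IN_MEMORY            Whether the family's blocks get in-memory priority in the block cache
TTL                  Time to live: how long a cell is kept before it expires (FOREVER = never expires)
BLOCKSIZE            HFile block size in bytes (65536 = 64 KB)
REPLICATION_SCOPE    Replication scope (0 = not replicated to other clusters)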

View the information for table student2. It has two column families, so there are two {} blocks.

hbase(main):019:0> desc 'student2'
Table student2 is ENABLED
student2
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'cf2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.2050 seconds

3. put: insert data

Syntax: put 'table name', 'row key', 'column family:column name', 'value'
Insert the record (name: July, age: 18, grade: 98, sex: M) into table student1

hbase(main):020:0> put 'student1' , '001' , 'cf1:name' , 'July'
0 row(s) in 0.2070 seconds

hbase(main):021:0> put 'student1' , '001' , 'cf1:age' , '18'
0 row(s) in 0.0240 seconds

hbase(main):022:0> put 'student1' , '001' , 'cf1:grade' , '98'
0 row(s) in 0.0210 seconds

hbase(main):023:0> put 'student1' , '001' , 'cf1:sex' , 'M'
0 row(s) in 0.0080 seconds

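In a browser, open namenode:50070 to see where the inserted data is stored in HDFS.
Data inserted into HBase is stored in HDFS.
The storage path is: /hbase/data/default/table name/region ID/column family/HDFS file name
default is the default namespace.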
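
The path layout above can be sketched as a small helper. This is only an illustration of how the path components fit together; the function name hfile_path and the placeholder region and file names are assumptions, not real identifiers from the cluster.

```python
from posixpath import join


def hfile_path(table, region, family, hfile, namespace="default", root="/hbase"):
    """Build the HDFS path of one store file from the layout described above."""
    return join(root, "data", namespace, table, region, family, hfile)


# "<region-id>" and "<hfile-name>" are placeholders, not real names.
print(hfile_path("student1", "<region-id>", "cf1", "<hfile-name>"))
# prints /hbase/data/default/student1/<region-id>/cf1/<hfile-name>
```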

4. get: query data

Basic syntax: get 'table name', 'row key'
Get the data with row key 001 from table student1

hbase(main):025:0> get 'student1' , '001'
COLUMN                                              CELL
 cf1:age                                            timestamp=1589278473070, value=18
 cf1:grade                                          timestamp=1589278487224, value=98
 cf1:name                                           timestamp=1589278459460, value=July
 cf1:sex                                            timestamp=1589278496165, value=M
4 row(s) in 0.0760 seconds

Here, timestamp is the time at which the value was written, and value is the stored value.
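
To make the timestamp readable, it can be converted to a date. The sketch below assumes the timestamp was assigned automatically by HBase, in which case it is milliseconds since the Unix epoch; the helper name hbase_ts_to_utc is made up for this example.

```python
from datetime import datetime, timezone


def hbase_ts_to_utc(ts_millis):
    """Convert a millisecond epoch timestamp to a UTC datetime."""
    seconds, millis = divmod(ts_millis, 1000)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return dt.replace(microsecond=millis * 1000)


# Timestamp of cf1:age from the get output above.
print(hbase_ts_to_utc(1589278473070).isoformat())
# prints 2020-05-12T10:14:33.070000+00:00
```
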
Query the name column in column family cf1 of row 001 in table student1

hbase(main):026:0> get 'student1' , '001' , 'cf1:name'
COLUMN                                              CELL
 cf1:name                                           timestamp=1589278459460, value=July
1 row(s) in 0.0210 seconds

Change the value of the name column in table student1 from July to Mary, then query the result

hbase(main):027:0> put 'student1' , '001' , 'cf1:name' , 'Mary'
0 row(s) in 0.0220 seconds

hbase(main):028:0> get 'student1' , '001' , 'cf1:name'
COLUMN                                              CELL
 cf1:name                                           timestamp=1589279637308, value=Mary
1 row(s) in 0.0150 seconds

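In HBase, VERSIONS for a column family defaults to 1, so each column keeps only one value and a later put overwrites the earlier one.
To set VERSIONS when creating a table: create 'table name', {NAME => 'column family name', VERSIONS => 'number of versions'}
Create table student3 with VERSIONS set to 3 for column family cf1, then check the result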

hbase(main):031:0> create 'student3' , {NAME => 'cf1' , VERSIONS => '3'}
0 row(s) in 8.8800 seconds

=> Hbase::Table - student3

hbase(main):032:0> desc 'student3'
Table student3 is ENABLED
student3
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0410 seconds

Insert data into the same column of table student3 three times, then query the result

hbase(main):033:0> put 'student3' , '001' , 'cf1:name' , 'July'
0 row(s) in 0.0470 seconds

hbase(main):034:0> put 'student3' , '001' , 'cf1:name' , 'Tom'
0 row(s) in 0.0120 seconds

hbase(main):035:0> put 'student3' , '001' , 'cf1:name' , 'Mary'
0 row(s) in 0.0090 seconds

hbase(main):039:0> get 'student3' , '001' , {COLUMN => 'cf1:name' ,VERSIONS => 3}
COLUMN                                              CELL
 cf1:name                                           timestamp=1589280435047, value=Mary
 cf1:name                                           timestamp=1589280430294, value=Tom
 cf1:name                                           timestamp=1589280423248, value=July
3 row(s) in 0.2580 seconds
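
The effect of VERSIONS can be mimicked with a toy model in Python. This is a simplified sketch, not how HBase stores cells internally: the class VersionedCell is invented for this example, and it only shows why VERSIONS => 1 looks like an overwrite while VERSIONS => 3 keeps the three newest values.

```python
class VersionedCell:
    """Toy model of one cell that keeps at most `max_versions` values."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self._versions = []  # (timestamp, value) pairs, newest first

    def put(self, value, timestamp):
        self._versions.append((timestamp, value))
        # Keep the newest versions first and drop anything beyond the limit.
        self._versions.sort(key=lambda tv: tv[0], reverse=True)
        del self._versions[self.max_versions:]

    def get(self, versions=1):
        return [value for _, value in self._versions[:versions]]


# Same values and timestamps as the student3 example above.
puts = [("July", 1589280423248), ("Tom", 1589280430294), ("Mary", 1589280435047)]

three = VersionedCell(max_versions=3)
one = VersionedCell(max_versions=1)
for value, ts in puts:
    three.put(value, ts)
    one.put(value, ts)

print(three.get(versions=3))  # ['Mary', 'Tom', 'July']
print(one.get(versions=3))    # ['Mary']
```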

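5. scan: query data

A get query requires a row key; it cannot query a column across rows directly.
Use scan to query a specific column of a table.
Syntax: scan 'table name', {COLUMN => 'column family:column name', VERSIONS => number of versions}
Query the name column in table student3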

hbase(main):005:0> scan 'student3' , {COLUMN => 'cf1:name' , VERSIONS => 3}
ROW                                                 COLUMN+CELL
 001                                                column=cf1:name, timestamp=1589280435047, value=Mary
 001                                                column=cf1:name, timestamp=1589280430294, value=Tom
 001                                                column=cf1:name, timestamp=1589280423248, value=July
1 row(s) in 0.1310 seconds
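
The difference between get and scan can be illustrated with a small dictionary-based model. This is only a sketch of the access pattern, not the HBase client API: the functions get and scan here are plain Python, and row 002 is a hypothetical row added so that the scan has more than one row to visit.

```python
# Row key -> {"family:qualifier": value}. Row 002 is hypothetical.
table = {
    "001": {"cf1:name": "Mary", "cf1:age": "18", "cf1:grade": "98", "cf1:sex": "M"},
    "002": {"cf1:name": "Tom"},
}


def get(table, row_key, column=None):
    """Look up one row by its key, optionally narrowed to one column."""
    row = table.get(row_key, {})
    if column is None:
        return dict(row)
    return {column: row[column]} if column in row else {}


def scan(table, column=None):
    """Walk all rows in row-key order, optionally keeping only one column."""
    for row_key in sorted(table):
        cells = get(table, row_key, column)
        if cells:
            yield row_key, cells


print(get(table, "001", "cf1:name"))   # {'cf1:name': 'Mary'}
print(list(scan(table, "cf1:name")))   # [('001', {'cf1:name': 'Mary'}), ('002', {'cf1:name': 'Tom'})]
print(list(scan(table, "cf1:age")))    # [('001', {'cf1:age': '18'})]
```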

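6. alter: modify a table

alter can add a column family to a table.
Syntax: alter 'table name', NAME => 'column family name', VERSIONS => number of versions
Add column family cf2 to table student1 with VERSIONS set to 3, then check the result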

hbase(main):007:0> alter 'student1' , NAME => 'cf2' , VERSIONS => 3
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 5.0950 seconds

hbase(main):008:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'cf2', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0500 seconds

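alter can also delete data from a table, but only a whole column family at a time.
Syntax:
alter 'table name', NAME => 'column family', METHOD => 'delete'
or
alter 'table name', 'delete' => 'column family'

Delete column family cf2 from table student1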

hbase(main):009:0> alter 'student1' , 'delete' => 'cf2'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.7580 seconds

hbase(main):010:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0280 seconds

7. truncate: clear table data

Syntax: truncate 'table name'
Clear the data in table student3

hbase(main):011:0> truncate 'student3'
Truncating 'student3' table (it may take a while):
 - Disabling table...
 - Truncating table...
0 row(s) in 4.7220 seconds

hbase(main):014:0> get 'student3' , '001'
COLUMN                                              CELL
0 row(s) in 0.2540 seconds

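When clearing a table, the system automatically disables the table first and then clears the data.
Once the data has been cleared, the system automatically re-enables the table.

Use is_enabled 'table name' to check whether the table is enabled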

hbase(main):015:0> is_enabled 'student3'
true
0 row(s) in 0.0190 seconds

8. drop: delete a table

Syntax: drop 'table name'
An HBase table cannot be dropped directly:

hbase(main):016:0> drop 'student2'

ERROR: Table student2 is enabled. Disable it first.

Here is some help for this command:
Drop the named table. Table must first be disabled:
  hbase> drop 't1'
  hbase> drop 'ns1:t1'

A table must be disabled before it can be dropped.
Syntax: disable 'table name'

hbase(main):017:0> disable 'student2'
0 row(s) in 2.3580 seconds

hbase(main):018:0> drop 'student2'
0 row(s) in 1.3430 seconds

hbase(main):019:0> list
TABLE
student1
student3
2 row(s) in 0.0090 seconds

=> ["student1", "student3"]
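
The enabled/disabled rules for truncate and drop described above can be summarized as a tiny state model. This is a sketch that follows the behavior described in this post, not HBase's actual implementation; the Table class and its attributes are invented for the example.

```python
class Table:
    """Toy model of a table's enabled/disabled lifecycle."""

    def __init__(self, name):
        self.name = name
        self.enabled = True
        self.dropped = False
        self.rows = {}

    def disable(self):
        self.enabled = False

    def enable(self):
        self.enabled = True

    def truncate(self):
        # As described above: disable, clear the data, then re-enable.
        self.disable()
        self.rows.clear()
        self.enable()

    def drop(self):
        if self.enabled:
            raise RuntimeError(f"Table {self.name} is enabled. Disable it first.")
        self.dropped = True


student3 = Table("student3")
student3.rows["001"] = {"cf1:name": "Mary"}
student3.truncate()
print(student3.rows, student3.enabled)  # {} True

student2 = Table("student2")
try:
    student2.drop()
except RuntimeError as err:
    print("ERROR:", err)  # ERROR: Table student2 is enabled. Disable it first.
student2.disable()
student2.drop()
print(student2.dropped)  # True
```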





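I will keep updating this post as I come across other commands.

If you spot any mistakes (including in earlier posts), feel free to message me.

There is no end to learning technology! Thanks for your support!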


Reposted from blog.csdn.net/pineapple_C/article/details/106080693