Must-Know HBase Shell Commands

Contents

HBase Shell

I. Basic Commands

II. Data Model

III. Table Management

IV. Table Data: Insert, Update, Query, and Delete

V. Migrating Data into HBase with importtsv


Starting the HBase Shell

[root@master conf]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hadoop/hbase-1.6.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hadoop/hadoop-2.10.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 1.6.0, r5ec5a5b115ee36fb28903667c008218abd21b3f5, Fri Feb 14 12:00:03 PST 2020
​
hbase(main):001:0> 

<!--Note-->

By default the HBase shell cannot use Backspace to delete characters, so adjust the terminal settings in Xshell first.

I. Basic Commands

1. Connect to HBase

hbase shell

2. Display the HBase shell help text

help

Type help and press Enter to display basic usage information for the HBase shell, along with some example commands. Note that table names, rows, and columns must all be quoted.

3. Create a table

Use the create command to create a new table. You must specify the table name and the column family names.

create 'table_name','family1','family2','familyN'

4. List information about a table

Use the list command to confirm that the table exists.

list 'table_name'

Now use the describe command to see details, including the configuration defaults.

describe 'table_name'

5. Put data into your table

To put data into a table, use the put command.

put 'table_name','rowkey','family:column','value'
#e.g.:
put 'test', 'row1', 'cf:a', 'value1'

This inserts value1 at row row1, column cf:a.

Columns in HBase are prefixed with their column family name.

6. View all records

One of the ways to get data out of HBase is to scan. Use the scan command to read data from the table. You can limit the scan, but for now all data is fetched.

#Show everything
scan 'table_name'
#Show 10 records
scan 'table_name',{LIMIT=>10}

7. Get a single row of data

To fetch one row of data at a time, use the get command.

get 'table_name','rowkey'

8. Disable a table

If you want to delete a table or change its settings, and in certain other situations, you need to disable the table first, using the disable command. You can re-enable it with the enable command.

disable 'table_name'
#
enable 'table_name'

9. Drop a table

To drop (delete) a table, use the drop command. Note: the table must be disabled first.

disable 'table_name'
drop 'table_name'

10. Count the records in a table

This command is not fast, but there is currently no faster way to count rows from the shell.

count 'table_name'

11. Delete records

The first form deletes a single column of one record.

The second form deletes an entire record.

delete 'table_name','rowkey','family_name:column'
####
delete 'table_name','rowkey'

12. Exit the HBase shell

To exit the HBase shell and disconnect from the cluster, use the quit command. HBase keeps running in the background.

quit

II. Data Model

In HBase, data is stored in tables that have rows and columns. This terminology overlaps with relational databases (RDBMS), but that is not a useful analogy. Instead, it helps to think of an HBase table as a multidimensional map.

HBase data model terminology

  • An HBase table consists of multiple rows.

    At a conceptual level a table can be viewed as a sparse set of rows, but physically the data is stored by column family. A new column qualifier (column family:column qualifier) can be added to an existing column family at any time.

  • Row key

    A row in HBase consists of a row key and one or more columns with values associated with them. Rows are sorted lexicographically by row key when stored, so row key design is very important. The goal is to store data in such a way that related rows are near each other. A common row key pattern is a website domain. If your row keys are domains, you should store them reversed (org.apache.www, org.apache.mail, org.apache.jira). That way, all Apache domains sit near each other in the table instead of being spread out by the first letter of each subdomain.

  • A column in HBase consists of a column family and a column qualifier, separated by a : (colon) character.

  • Column family

    A column family physically groups a set of columns and their values, usually for performance reasons. Each column family has a set of storage properties, such as whether its values should be cached in memory, how its data is compressed, or how its row keys are encoded. Every row in a table has the same column families, although a given row might not store anything in a given column family.

  • Column qualifier

    A column qualifier is added to a column family to provide an index for a given piece of data. For a column family content, one qualifier might be content:html and another content:pdf. While column families are fixed at table creation, column qualifiers are mutable and can vary greatly between rows.

  • Cell

    A cell is the combination of a row, column family, and column qualifier; it contains a value and a timestamp that identifies the value's version.

  • Timestamp

    A timestamp is written alongside each value and is the identifier for a given version of that value. By default the timestamp is the time on the RegionServer when the data was written, but you can specify a different timestamp when you put data into a cell.
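As a rough sketch, the multidimensional-map view above can be modeled as nested, sorted maps: row key → column family → column qualifier → timestamp → value. This is only an illustration of the model, not HBase's actual storage format, and all names and values below are made up:

```python
# Sketch of an HBase table as a sorted, multidimensional map:
# row key -> column family -> column qualifier -> timestamp -> value.
# All names and values here are illustrative, not from a real cluster.
table = {
    "org.apache.www": {
        "content": {"html": {1607221272704: "<html>...</html>"}},
        "anchor": {"example.com": {1607221272710: "Apache"}},
    },
}

# Look up one cell; the newest timestamp is the current version:
versions = table["org.apache.www"]["content"]["html"]
latest = versions[max(versions)]
print(latest)

# Row keys sort lexicographically, which is why reversed domain names
# keep related rows adjacent instead of scattering them by subdomain:
domains = ["www.apache.org", "mail.apache.org", "jira.apache.org", "www.example.com"]
row_keys = sorted(".".join(reversed(d.split("."))) for d in domains)
print(row_keys)  # the three org.apache.* keys sort next to each other
```

Note how the three Apache rows end up adjacent, while a first-label sort ("jira…", "mail…", "www…") would have scattered them.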

III. Table Management

1. list command

#Syntax: list <table>
hbase(main):001:0> list
TABLE                                                                                               
0 row(s) in 3.6950 seconds

=> []
hbase(main):002:0> 

2. create command

hbase(main):002:0> create 'scores2',{NAME=>'course',VERSIONS=>3},{NAME=>'grade',VERSIONS=>3}
0 row(s) in 3.0720 seconds

=> Hbase::Table - scores2
hbase(main):003:0> create 'scores','course','grade'

3. describe

hbase(main):005:0> describe 'scores'
Table scores is ENABLED                                                                             
scores                                                                                              
COLUMN FAMILIES DESCRIPTION                                                                         
{NAME => 'course', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS =
> 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '
0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}                           
{NAME => 'grade', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS =>
 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0
', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}                            
2 row(s) in 0.1460 seconds

hbase(main):006:0> 

4. disable and drop

hbase(main):006:0> disable 'scores2'
0 row(s) in 2.4350 seconds

hbase(main):007:0> drop 'scores2'
0 row(s) in 1.3560 seconds

hbase(main):008:0> list
TABLE                                                                                                              
scores                                                                                                             
1 row(s) in 0.0500 seconds

=> ["scores"]
hbase(main):009:0> 

5. exists

hbase(main):009:0> exists 'scores2'
Table scores2 does not exist                                                                                       
0 row(s) in 0.0200 seconds

hbase(main):010:0> 

6. is_enabled

Checks whether a table is enabled. Syntax: is_enabled <table>

hbase(main):013:0> is_enabled 'scores'
true                                                                                                               
0 row(s) in 0.0280 seconds

hbase(main):014:0> 

7. is_disabled

Checks whether a table is disabled. Syntax: is_disabled <table>

hbase(main):014:0> is_disabled 'scores'
false                                                                                                              
0 row(s) in 0.0450 seconds

hbase(main):015:0> 

8. alter

Modifies the table schema.

Syntax: alter <table>,{NAME=><family>},{NAME=><family>,METHOD=>'delete'}

#Add a column family address to the scores table, with the number of versions set to 3:
hbase(main):015:0> alter 'scores',NAME=>'address',VERSIONS=>3
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 3.6060 seconds

#Delete the grade column family from the scores table
hbase(main):016:0> alter 'scores',NAME=>'grade',METHOD=>'delete'
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 3.7150 seconds

<!--Note-->

Before changing a table's schema, disable the table first, then enable it again once the operation completes. For example:

disable 'scores'

alter operation

enable 'scores'

9. Delete a column family

(1) Disable the table

hbase(main):017:0> disable 'scores'
0 row(s) in 2.3090 seconds

(2) Delete the column family (note that NAME and METHOD must be uppercase)

hbase(main):018:0> alter 'scores',NAME=>'course',METHOD=>'delete'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.2050 seconds

(3) Enable the table again after deleting the column family

hbase(main):019:0> enable 'scores'
0 row(s) in 1.3910 seconds

(4) View the table info again; you can see that course has been deleted

hbase(main):020:0> describe 'scores'
Table scores is ENABLED                                                                                            
scores                                                                                                             
COLUMN FAMILIES DESCRIPTION                                                                                        
{NAME => 'address', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DAT
A_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLO
CKSIZE => '65536', REPLICATION_SCOPE => '0'}                                                                       
1 row(s) in 0.0410 seconds

10. whoami

Shows the user currently accessing HBase.

hbase(main):021:0> whoami
root (auth:SIMPLE)
    groups: root

hbase(main):022:0> 

11. version

Shows HBase version information.

hbase(main):022:0> version
1.6.0, r5ec5a5b115ee36fb28903667c008218abd21b3f5, Fri Feb 14 12:00:03 PST 2020

12. status

Shows the current status of HBase.

#####status
hbase(main):023:0> status
1 active master, 0 backup masters, 2 servers, 0 dead, 1.5000 average load

####status 'summary'
hbase(main):024:0> status 'summary'
1 active master, 0 backup masters, 2 servers, 0 dead, 1.5000 average load

####status 'detailed'
hbase(main):025:0> status 'detailed'
version 1.6.0
0 regionsInTransition
active master:  master:16000 1607158473117
0 backup masters
master coprocessors: null
2 live servers
    slave1:16020 1607158482456
        requestsPerSecond=0.0, numberOfOnlineRegions=1, usedHeapMB=20, maxHeapMB=235, numberOfStores=1, numberOfStorefiles=2, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=70, writeRequestsCount=9, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=21, currentCompactedKVs=21, compactionProgressPct=1.0, coprocessors=[MultiRowMutationEndpoint]
        "hbase:meta,,1"
            numberOfStores=1, numberOfStorefiles=2, storeRefCount=0, maxCompactedStoreFileRefCount=0, storefileUncompressedSizeMB=0, lastMajorCompactionTimestamp=1607159362926, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=70, writeRequestsCount=9, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=21, currentCompactedKVs=21, compactionProgressPct=1.0, completeSequenceId=45, dataLocality=1.0
    slave2:16020 1607158481937
        requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=13, maxHeapMB=235, numberOfStores=2, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=4, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, coprocessors=[]
        "hbase:namespace,,1607137216837.828e7f51fd1fef7501b8ccc4c3b373ca."
            numberOfStores=1, numberOfStorefiles=1, storeRefCount=0, maxCompactedStoreFileRefCount=0, storefileUncompressedSizeMB=0, lastMajorCompactionTimestamp=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=4, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, completeSequenceId=-1, dataLocality=1.0
        "scores,,1607147570746.91663a6c1657acef0ce3c114638e81af."
            numberOfStores=1, numberOfStorefiles=0, storeRefCount=0, maxCompactedStoreFileRefCount=0, storefileUncompressedSizeMB=0, lastMajorCompactionTimestamp=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, completeSequenceId=-1, dataLocality=0.0
0 dead servers

hbase(main):026:0> 

13. Permission management

(1) Granting permissions

The permissions are R (read), W (write), X (execute), C (create), and A (admin). Syntax:

grant <user>,<permissions>,<table>,<column family>,<column qualifier>

HBase permission management relies on coprocessors. Access control is implemented by the AccessController coprocessor framework, which supports per-user RWXCA permission control.

You need to configure

hbase.security.authorization=true

and make hbase.coprocessor.master.classes and hbase.coprocessor.region.classes include org.apache.hadoop.hbase.security.access.AccessController to provide security management. So stop HBase and set the following parameters in hbase-site.xml:

<property>
  <name>hbase.superuser</name>
  <value>hbase</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>    
  <value>org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.token.TokenProvider</value>  
</property>
<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.rpc.engine</name>
  <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>

Save the configuration, exit, and restart HBase.

(2) Granting permissions

hbase(main)> grant '<user>', '<permission>', '<table>'
hbase(main)> grant 'user1', 'RWXCA', 'table1'

(3) Viewing permissions

hbase(main)> user_permission '<table>'
hbase(main)> user_permission 'table1'

User                                   Namespace,Table,Family,Qualifier:Permission                                                                   
 user1                                 default,table1,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN]

Syntax: user_permission <table>

(4) Revoking permissions

hbase(main)> revoke '<user>', '<table>'
hbase(main)> revoke 'user1', 'table1'

Similar to granting permissions. Syntax: revoke <user>,<table>,<column family>,<column qualifier>

IV. Table Data: Insert, Update, Query, and Delete

1. put

Inserts data into a table. Syntax: put <table>,<rowkey>,<family:column>,<value>,<timestamp>

For example, insert data into table scores2: rk001 is the row key, course is the column family, soft is the column name, and the value is database.

hbase(main):012:0> list
TABLE                                                                                               
scores                                                                                              
scores2                                                                                             
2 row(s) in 0.0170 seconds

=> ["scores", "scores2"]
hbase(main):013:0> enable 'scores2'
0 row(s) in 0.0130 seconds

hbase(main):014:0> put'scores2','rk001','course:soft','database'
0 row(s) in 0.1900 seconds

hbase(main):015:0> 

(1) Updating a record with put

Update the value inserted above to english:

hbase(main):015:0> put 'scores2','rk001','course:soft','english'
0 row(s) in 0.0160 seconds

(2) Adding data in batch

Write a file one.txt with the following content:

put 'scores2','rk002','course:soft','database'
put 'scores2','rk002','course:jg','math'
put 'scores2','rk003','course:soft','c'
put 'scores2','rk004','course:soft','java'

On the Linux side, run the command hbase shell one.txt. Execution result:

[root@master usr]# mkdir egdata
[root@master usr]# cd egdata/
[root@master egdata]# vi one.txt
[root@master egdata]# hbase shell one.txt 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hadoop/hbase-1.6.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hadoop/hadoop-2.10.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
0 row(s) in 0.4800 seconds

0 row(s) in 0.0120 seconds

0 row(s) in 0.0100 seconds

0 row(s) in 0.0090 seconds

HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 1.6.0, r5ec5a5b115ee36fb28903667c008218abd21b3f5, Fri Feb 14 12:00:03 PST 2020

hbase(main):001:0> list
TABLE                                                                                               
scores                                                                                              
scores2                                                                                             
2 row(s) in 0.0890 seconds

=> ["scores", "scores2"]
hbase(main):002:0> decribe scores2
NameError: undefined local variable or method `scores2' for #<Object:0x31a136a6>

hbase(main):003:0> describe 'scores2'
Table scores2 is ENABLED                                                                            
scores2                                                                                             
COLUMN FAMILIES DESCRIPTION                                                                         
{NAME => 'course', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS =
> 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '
0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}                           
1 row(s) in 0.0330 seconds

hbase(main):004:0> scan 'scores2'
ROW                        COLUMN+CELL                                                              
 rk001                     column=course:soft, timestamp=1607221272704, value=english               
 rk002                     column=course:jg, timestamp=1607221711379, value=math                    
 rk002                     column=course:soft, timestamp=1607221711165, value=database              
 rk003                     column=course:soft, timestamp=1607221711394, value=c                     
 rk004                     column=course:soft, timestamp=1607221711402, value=java                  
4 row(s) in 0.0290 seconds

hbase(main):005:0> 

2. get

Queries data. Syntax: get <table>,<rowkey>,[<family:column>,...]

#Get the value of column course:soft in row rk001 of scores2
hbase(main):005:0> get 'scores2','rk001','course:soft'
COLUMN                     CELL                                                                     
 course:soft               timestamp=1607221272704, value=english                                   
1 row(s) in 0.1590 seconds

#Get the values of column family course in row rk001 of scores2
hbase(main):006:0> get 'scores2','rk001','course'
COLUMN                     CELL                                                                     
 course:soft               timestamp=1607221272704, value=english                                   
1 row(s) in 0.0130 seconds

#Get the values of row rk001 of scores2
hbase(main):007:0> get 'scores2','rk001'
COLUMN                     CELL                                                                     
 course:soft               timestamp=1607221272704, value=english                                   
1 row(s) in 0.0120 seconds

#Get the values of column family course in row rk002 of scores2, up to 3 versions
hbase(main):008:0> get 'scores2','rk002',{COLUMN=>'course',VERSIONS=>3}
COLUMN                     CELL                                                                     
 course:jg                 timestamp=1607221711379, value=math                                      
 course:soft               timestamp=1607221711165, value=database                                  
1 row(s) in 0.0710 seconds

#The following form retrieves previously saved historical versions.
#For example, get the values of course:soft in row rk002 of scores2, up to 3 versions, with timestamps between 1607221711170 and 1607221711300:
hbase(main):011:0> get 'scores2','rk002',{COLUMN=>'course:soft',TIMERANGE=>[1607221711170,1607221711300],VERSIONS=>3}

Some more advanced usage follows.

(1) ValueFilter

Filters on cell values.

#Find data in row rk001 of scores2 whose value is database:
hbase(main):012:0> get 'scores2','rk001',{FILTER=>"ValueFilter(=,'binary:database')"}
COLUMN                     CELL                                                                     
 course:soft               timestamp=1607221106518, value=database                                  
1 row(s) in 0.5400 seconds

hbase(main):013:0> 

#Find data in row rk002 of scores2 whose value contains an a:
hbase(main):013:0> get 'scores2','rk002',{FILTER=>"ValueFilter(=,'substring:a')"}
COLUMN                     CELL                                                                     
 course:jg                 timestamp=1607221711379, value=math                                      
 course:soft               timestamp=1607221711165, value=database                                  
1 row(s) in 0.0310 seconds

hbase(main):014:0> 

(2) QualifierFilter

Filters on column qualifiers.

#Find data in row rk001 of scores2 whose column name is db:
hbase(main):018:0> get 'scores2', 'rk001', {FILTER => "QualifierFilter(=, 'binary:db')"}
COLUMN                     CELL                                                                     
0 row(s) in 0.0390 seconds

#Find data in row rk001 of scores2 whose column name contains db:
hbase(main):019:0> get 'scores2', 'rk001', {FILTER => "QualifierFilter(=, 'substring:db')"}
COLUMN                     CELL                                                                     
0 row(s) in 0.0170 seconds

3. scan

Scans a table. Syntax: scan <table>,{COLUMNS=>[<family:column>,...], LIMIT=>num}

You can also add advanced options such as STARTROW, TIMERANGE, and FILTER.

#Table scores2
#Scan the whole table:
hbase(main):021:0> scan 'scores2'
ROW                        COLUMN+CELL                                                              
 rk001                     column=course:soft, timestamp=1607221272704, value=english               
 rk002                     column=course:jg, timestamp=1607221711379, value=math                    
 rk002                     column=course:soft, timestamp=1607221711165, value=database              
 rk003                     column=course:soft, timestamp=1607221711394, value=c                     
 rk004                     column=course:soft, timestamp=1607221711402, value=java                  
4 row(s) in 0.0160 seconds

hbase(main):022:0> 

#Scan the whole table for column family course:
hbase(main):022:0> scan 'scores2',{COLUMNS=>'course'}
ROW                        COLUMN+CELL                                                              
 rk001                     column=course:soft, timestamp=1607221272704, value=english               
 rk002                     column=course:jg, timestamp=1607221711379, value=math                    
 rk002                     column=course:soft, timestamp=1607221711165, value=database              
 rk003                     column=course:soft, timestamp=1607221711394, value=c                     
 rk004                     column=course:soft, timestamp=1607221711402, value=java                  
4 row(s) in 0.0310 seconds

hbase(main):023:0> 

#Scan column family course, setting the start and end row keys of the scan:
hbase(main):027:0> scan 'scores2',{COLUMNS=>'course',STARTROW=>'rk001', ENDROW=>'rk003'}
ROW                        COLUMN+CELL                                                              
 rk001                     column=course:soft, timestamp=1607221272704, value=english               
 rk002                     column=course:jg, timestamp=1607221711379, value=math                    
 rk002                     column=course:soft, timestamp=1607221711165, value=database              
2 row(s) in 0.0150 seconds
     
#Scan column family course, with versions set to 3:
hbase(main):026:0> scan 'scores2',{COLUMNS=>'course',VERSIONS=>3}
ROW                        COLUMN+CELL                                                              
 rk001                     column=course:soft, timestamp=1607221272704, value=english               
 rk002                     column=course:jg, timestamp=1607221711379, value=math                    
 rk002                     column=course:soft, timestamp=1607221711165, value=database              
 rk003                     column=course:soft, timestamp=1607221711394, value=c                     
 rk004                     column=course:soft, timestamp=1607221711402, value=java                  
4 row(s) in 0.0470 seconds

hbase(main):027:0> 

4. delete

Deletes data. Syntax: delete <table>,<rowkey>,<family:column>,<timestamp>

(1) Delete a single value from a row

Syntax: delete <table>,<rowkey>,<family:column>,<timestamp>. The column name must be specified.

#Delete the data in column course:soft of row rk001 in scores2
hbase(main):028:0> delete 'scores2','rk001','course:soft'
0 row(s) in 0.0570 seconds

<!--This deletes all versions of the data in row rk001, column course:soft-->

(2) Delete a row

You can omit the column name to delete an entire row of data.

#Delete the data in row rk002 of scores2:
hbase(main):002:0> delete'scores2','rk002'

5. deleteall

Deletes an entire row. Syntax: deleteall <table>,<rowkey>,<family:column>,<timestamp>

#Delete all data in row rk004 of scores2:
hbase(main):003:0> deleteall 'scores2','rk004'
0 row(s) in 0.0330 seconds

6. count

Counts the total number of rows in a table. Syntax: count <table>

#Count all the rows in scores2
hbase(main):005:0> count 'scores2'
1 row(s) in 0.0210 seconds

=> 1

7. truncate

Empties a table. Syntax: truncate <table>

#Truncate table scores2
hbase(main):006:0> truncate 'scores2'
Truncating 'scores2' table (it may take a while):
 - Disabling table...
 - Truncating table...
0 row(s) in 4.2990 seconds

hbase(main):007:0> list
TABLE                                                                                               
scores                                                                                              
scores2                                                                                             
2 row(s) in 0.1550 seconds

=> ["scores", "scores2"]
hbase(main):008:0> scan 'scores2'
ROW                        COLUMN+CELL                                                              
0 row(s) in 0.3380 seconds

hbase(main):009:0> 

V. Migrating Data into HBase with importtsv

HBase data typically originates in log files or an RDBMS. Common ways to migrate that data into HBase tables include the HBase Put API, the HBase bulk load tool, and custom MapReduce jobs.

[root@master egdata]# vi 1.tsv
[root@master egdata]# cat 1.tsv
1001  zhangsan  16
1002  lisi  18
1003  wangwu  19
1004  zhaoliu  20
1005  zhengqi  19
[root@master egdata]# hdfs dfs -mkdir -p /hbase/data1
[root@master egdata]# hdfs dfs -put 1.tsv /hbase/data1
[root@master egdata]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hadoop/hbase-1.6.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hadoop/hadoop-2.10.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 1.6.0, r5ec5a5b115ee36fb28903667c008218abd21b3f5, Fri Feb 14 12:00:03 PST 2020

hbase(main):001:0> create 'student2','info'
0 row(s) in 1.6560 seconds

=> Hbase::Table - student2
hbase(main):002:0> quit
[root@master egdata]# yarn jar /usr/hadoop/hbase-1.6.0/lib/hbase-server-1.6.0.jar importtsv -Dimporttsv.separator=\t-Dimporttsv.columns=HBASE_ROW_KEY,info:name student2 /hbase/data1/1.tsv
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/Filter
	at java.lang.Class.getDeclaredMethods0(Native Method)
	at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
	at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
	at java.lang.Class.getMethod0(Class.java:3018)
	at java.lang.Class.getMethod(Class.java:1784)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.<init>(ProgramDriver.java:59)
	at org.apache.hadoop.util.ProgramDriver.addClass(ProgramDriver.java:103)
	at org.apache.hadoop.hbase.mapreduce.Driver.main(Driver.java:42)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:244)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.filter.Filter
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	... 14 more
[root@master egdata]# 
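The NoClassDefFoundError above means the MapReduce job cannot see the HBase jars on its classpath. A sketch of a fix, assuming the installation paths shown earlier: export HADOOP_CLASSPATH from `hbase classpath` before invoking yarn. Note also that the failed command fused the separator and columns options together (there is no space before -Dimporttsv.columns) and mapped only info:name; mapping the age column as info:age below is an assumption about the intent for the three-column 1.tsv:

```shell
# Assumption: paths match the installation shown above; adjust to your cluster.
# Put the HBase jars on the Hadoop classpath so the MapReduce job can load them.
export HADOOP_CLASSPATH=$(hbase classpath)

# Note the space before -Dimporttsv.columns, the quoted tab separator, and a
# column mapping for every field in 1.tsv (row key, name, age).
yarn jar /usr/hadoop/hbase-1.6.0/lib/hbase-server-1.6.0.jar importtsv \
  -Dimporttsv.separator=$'\t' \
  -Dimporttsv.columns=HBASE_ROW_KEY,info:name,info:age \
  student2 /hbase/data1/1.tsv
```

After the job finishes, `scan 'student2'` in the HBase shell should show the imported rows.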


Reposted from blog.csdn.net/qq_46009608/article/details/110733425