NoSQL: HBase (Part 1)


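HBase is a distributed, column-oriented, open-source database. The technology derives from the Google paper "Bigtable: A Distributed Storage System for Structured Data" by Fay Chang et al. Just as Bigtable builds on the distributed storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase started as a subproject of Apache Hadoop and is now a top-level Apache project. Unlike a typical relational database, HBase is suited to storing unstructured and semi-structured data. Another difference is that HBase organizes data by column rather than by row.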

How are HBase and HDFS related?
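HDFS can store massive amounts of data, but it is not good at managing individual records efficiently (query, update, delete, insert), and it does not support random reads and writes over large datasets. HBase is a NoSQL database built on top of HDFS. It manages the data stored in HDFS efficiently, supports random reads and writes over massive datasets, and provides row-level data management.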

Characteristics of HBase

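  • Large: a single table can reach hundreds of millions of rows by millions of columns, and each column can keep many versions (up to thousands).
  • Sparse: HBase has no fixed table schema. A row can contain any number of columns, and columns that are absent from a row take up no space, which improves disk utilization.
  • Untyped: HBase has no data types. Every value is stored as a byte array.
  • Column-oriented: the biggest difference from conventional databases lies in how records are laid out in storage. Most databases store data row by row, so a query that needs only a few columns still reads entire rows and wastes I/O. HBase stores data by column (grouped into column families), which improves I/O utilization for such queries.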
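
The properties listed above (sparse rows, byte-array values, several versions per cell) can be pictured with a small in-memory model. This is only a Python sketch for illustration, not how HBase lays data out on disk. The class `SparseTable` and its methods are hypothetical names and are not part of any HBase API.

```python
from collections import defaultdict


class SparseTable:
    """Toy model of HBase's logical layout (hypothetical, for illustration only)."""

    def __init__(self):
        # row key -> column ("family:qualifier") -> {timestamp: value}
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row: bytes, column: bytes, value: bytes, ts: int) -> None:
        # HBase has no data types: keys and values are plain byte arrays.
        for item in (row, column, value):
            if not isinstance(item, bytes):
                raise TypeError("only bytes are stored; encode the value first")
        self.rows[row][column][ts] = value

    def get(self, row: bytes, column: bytes):
        """Return the newest version of a cell, or None if it was never written."""
        versions = self.rows.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def stored_cells(self) -> int:
        """Count the cell versions that actually occupy space."""
        return sum(len(v) for cols in self.rows.values() for v in cols.values())


table = SparseTable()
table.put(b"1", b"cf1:name", b"zhangsan", ts=1)
table.put(b"1", b"cf1:age", b"18", ts=1)
table.put(b"1", b"cf1:age", b"20", ts=2)   # a second version of the same cell
table.put(b"2", b"cf1:name", b"lisi", ts=3)  # row 2 simply has no cf1:age column
```

Row 2 never stores an age, so nothing is allocated for it; only the four written cell versions are counted by `stored_cells()`.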

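Row storage vs. column storage

Row storage: all the fields of one record are stored next to each other, so reading a single column still means reading whole rows.
Column storage: the values of one column are stored next to each other, so reading a single column touches only that column's data.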
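
To make the difference between the two layouts concrete, here is a minimal Python sketch. The records and the function names (`row_layout`, `column_layout`, `values_read_for_column`) are made up for illustration, and the "cost" is just a count of stored values, not a real I/O measurement.

```python
records = [
    {"id": 1, "name": "zhangsan", "age": 18},
    {"id": 2, "name": "lisi", "age": 20},
    {"id": 3, "name": "wangwu", "age": 25},
]


def row_layout(records):
    """Row store: the fields of each record sit next to each other."""
    return [value for rec in records for value in rec.values()]


def column_layout(records):
    """Column store: the values of each column sit next to each other."""
    columns = list(records[0].keys())
    return {col: [rec[col] for rec in records] for col in columns}


def values_read_for_column(records, column):
    """Count how many stored values each layout reads to return one column."""
    row_cost = len(row_layout(records))               # every field of every row
    col_cost = len(column_layout(records)[column])    # only that column's values
    return row_cost, col_cost


print(values_read_for_column(records, "age"))  # (9, 3)
```

With three records of three fields each, fetching only `age` walks nine values in the row layout but three in the column layout.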

Setting up the HBase environment

  • Make sure Hadoop (HDFS) is running normally. HADOOP_HOME must be set.
  • Install ZooKeeper (it coordinates the HBase services)
[root@CentOS ~]# tar -zxf zookeeper-3.4.6.tar.gz -C /usr/
[root@CentOS ~]# vi /usr/zookeeper-3.4.6/conf/zoo.cfg
tickTime=2000
dataDir=/root/zkdata
clientPort=2181
[root@CentOS ~]# mkdir /root/zkdata
[root@CentOS ~]# cd /usr/zookeeper-3.4.6
[root@CentOS zookeeper-3.4.6]# ./bin/zkServer.sh start zoo.cfg
JMX enabled by default
Using config: /usr/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@CentOS zookeeper-3.4.6]# ./bin/zkServer.sh status zoo.cfg
JMX enabled by default
Using config: /usr/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: standalone
[root@centos ~]# jps
1612 SecondaryNameNode
1348 NameNode
1742 QuorumPeerMain //zookeeper
1437 DataNode
  • Install HBase
[root@centos ~]# tar -zxf hbase-1.2.4-bin.tar.gz -C /usr/
[root@centos ~]# vi /usr/hbase-1.2.4/conf/hbase-site.xml

<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://CentOS:9000/hbase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>CentOS</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property>
</configuration>
[root@centos ~]# vi /usr/hbase-1.2.4/conf/regionservers 

centos

[root@centos ~]# vi .bashrc

HBASE_MANAGES_ZK=false
HBASE_HOME=/usr/hbase-1.2.4
HADOOP_HOME=/usr/hadoop-2.6.0
JAVA_HOME=/usr/java/latest
CLASSPATH=.
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin
export JAVA_HOME
export CLASSPATH
export PATH
export HADOOP_HOME
export HBASE_HOME
export HBASE_MANAGES_ZK
[root@centos ~]# source .bashrc

  • Start HBase
[root@centos ~]# start-hbase.sh 
[root@centos ~]# jps
1612 SecondaryNameNode
2102 HRegionServer  // handles the actual reads and writes of table data
1348 NameNode
2365 Jps
1978 HMaster        // like the NameNode: manages table metadata and the RegionServers
1742 QuorumPeerMain
1437 DataNode

The HBase Master web UI is then available at http://centos:16010
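
A typo in hbase-site.xml is a common reason for a failed start, so the four properties used in this setup can be checked mechanically first. The following Python sketch parses a configuration string with the standard library and reports missing properties. The helper names (`read_properties`, `missing_properties`) and the `REQUIRED` set are assumptions made for this article's setup, not an official validation tool.

```python
import xml.etree.ElementTree as ET

# The properties this article sets; other deployments may need a different list.
REQUIRED = {
    "hbase.rootdir",
    "hbase.cluster.distributed",
    "hbase.zookeeper.quorum",
    "hbase.zookeeper.property.clientPort",
}


def read_properties(xml_text: str) -> dict:
    """Parse Hadoop-style <property><name/><value/></property> entries."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def missing_properties(props: dict) -> list:
    """Return the required property names that are absent, sorted."""
    return sorted(REQUIRED - set(props))


SAMPLE = """<configuration>
  <property><name>hbase.rootdir</name><value>hdfs://CentOS:9000/hbase</value></property>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
  <property><name>hbase.zookeeper.quorum</name><value>CentOS</value></property>
  <property><name>hbase.zookeeper.property.clientPort</name><value>2181</value></property>
</configuration>"""

props = read_properties(SAMPLE)
print(missing_properties(props))  # []
```

To check a real file, read its contents and pass the text to `read_properties`; the path used in this article would be /usr/hbase-1.2.4/conf/hbase-site.xml.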

HBase shell commands

  • Connect to HBase
[root@centos ~]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hbase-1.2.4/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.4, rUnknown, Wed Feb 15 18:58:00 CST 2017

hbase(main):001:0> 

  • Check the cluster status
hbase(main):001:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load

  • Show the current version
hbase(main):006:0> version
1.2.4, rUnknown, Wed Feb 15 18:58:00 CST 2017

Namespace operations (a namespace is comparable to a database)

  • List the namespaces
hbase(main):003:0> list_namespace
NAMESPACE
default
hbase
  • Create a namespace
hbase(main):006:0> create_namespace 'baizhi',{'author'=>'zs'}
0 row(s) in 0.3260 seconds
  • List the tables in a namespace
hbase(main):004:0> list_namespace_tables 'hbase'
TABLE                      
meta  
namespace
  • Describe a namespace
hbase(main):008:0> describe_namespace 'baizhi'
DESCRIPTION               
{NAME => 'baizhi', author => 'zs'}
1 row(s) in 0.0550 seconds
  • Alter a namespace
hbase(main):010:0> alter_namespace 'baizhi',{METHOD => 'set','author'=> 'wangwu'}
0 row(s) in 0.2520 seconds
hbase(main):011:0> describe_namespace 'baizhi'
DESCRIPTION            
{NAME => 'baizhi', author => 'wangwu'}
1 row(s) in 0.0030 seconds
hbase(main):012:0> alter_namespace 'baizhi',{METHOD => 'unset',NAME => 'author'}
0 row(s) in 0.0550 seconds
hbase(main):013:0> describe_namespace 'baizhi'
DESCRIPTION              
{NAME => 'baizhi'}         
1 row(s) in 0.0080 seconds
  • Drop a namespace
hbase(main):020:0> drop_namespace 'baizhi'
0 row(s) in 0.0730 seconds

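HBase does not allow dropping a namespace that still contains tables. The table examples that follow use the baizhi namespace, so create it again first if you dropped it.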
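
The rule can be sketched as a tiny Python model: a namespace owns a set of tables, and it can be dropped only once that set is empty. The `Catalog` class is hypothetical and merely mirrors the shell commands above; it does not talk to HBase.

```python
class Catalog:
    """Toy model of namespaces and their tables (not an HBase API)."""

    def __init__(self):
        # The two namespaces that list_namespace shows in the transcript above.
        self.namespaces = {"default": set(), "hbase": {"meta", "namespace"}}

    def create_namespace(self, name):
        if name in self.namespaces:
            raise ValueError(f"namespace {name!r} already exists")
        self.namespaces[name] = set()

    def create_table(self, qualified_name):
        # 'baizhi:t_user' belongs to 'baizhi'; a bare 't_user' goes to 'default'.
        namespace, _, table = qualified_name.rpartition(":")
        namespace = namespace or "default"
        if namespace not in self.namespaces:
            raise KeyError(f"namespace {namespace!r} does not exist")
        self.namespaces[namespace].add(table)

    def drop_table(self, qualified_name):
        namespace, _, table = qualified_name.rpartition(":")
        self.namespaces[namespace or "default"].discard(table)

    def drop_namespace(self, name):
        if self.namespaces[name]:
            raise ValueError(f"namespace {name!r} still contains tables")
        del self.namespaces[name]


catalog = Catalog()
catalog.create_namespace("baizhi")
catalog.create_table("baizhi:t_user")
# catalog.drop_namespace("baizhi") would raise here: the namespace is not empty.
catalog.drop_table("baizhi:t_user")
catalog.drop_namespace("baizhi")  # succeeds once the table is gone
```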

Table operations (DDL)

  • Create a table
hbase(main):023:0> create 't_user','cf1','cf2'
0 row(s) in 1.2880 seconds

=> Hbase::Table - t_user
hbase(main):024:0> create 'baizhi:t_user',{NAME=>'cf1',VERSIONS=>3},{NAME=>'cf2',TTL=>3600}
0 row(s) in 1.2610 seconds

=> Hbase::Table - baizhi:t_user

  • Describe a table
hbase(main):026:0> describe 'baizhi:t_user'
Table baizhi:t_user is ENABLED                                                                                               
baizhi:t_user                                                                                                                
COLUMN FAMILIES DESCRIPTION                                                                                                  
{NAME => 'cf1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION =
> 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', B
LOCKCACHE => 'true'}                                                                                                         
{NAME => 'cf2', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION =
> 'NONE', MIN_VERSIONS => '0', TTL => '3600 SECONDS (1 HOUR)', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY
 => 'false', BLOCKCACHE => 'true'}                                                                                           
2 row(s) in 0.0240 seconds

  • Check whether a table exists
hbase(main):030:0> exists 't_user'
Table t_user does exist                                                                                                      
0 row(s) in 0.0250 seconds

  • enable / is_enabled / enable_all (disable, disable_all and is_disabled work the same way)
hbase(main):036:0> enable 't_user'
0 row(s) in 0.0220 seconds

hbase(main):037:0> is_enabled 't_user'
true                                                                                                                   
0 row(s) in 0.0090 seconds
hbase(main):035:0> enable_all 't_.*'
t_user                                                                                                                       
Enable the above 1 tables (y/n)?
y
1 tables successfully enabled

  • Drop a table (it must be disabled first)
hbase(main):038:0> disable 't_user'
0 row(s) in 2.2930 seconds

hbase(main):039:0> drop 't_user'
0 row(s) in 1.2670 seconds
  • List all user tables (tables in the hbase system namespace are not shown)
hbase(main):042:0> list 'baizhi:.*'
TABLE                                                                                                                        
baizhi:t_user                                                                                                                
1 row(s) in 0.0050 seconds

=> ["baizhi:t_user"]
hbase(main):043:0> list
TABLE                                                                                                                        
baizhi:t_user                                                                                                                
1 row(s) in 0.0050 seconds

  • Get a reference to a table
hbase(main):002:0> t=get_table 'baizhi:t_user'
0 row(s) in 0.0440 seconds


  • Alter table parameters
hbase(main):008:0> alter 'baizhi:t_user',{ NAME => 'cf2', TTL => 60 }
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.7000 seconds
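
The VERSIONS and TTL settings used in the create and alter commands above decide which cell versions a read can still see. The Python sketch below models only that visible outcome under two assumptions: a column family keeps at most VERSIONS newest versions, and a cell older than TTL seconds is no longer returned. The function `retained_cells` is a hypothetical helper, and the real system applies these rules through read-time filtering and compaction rather than a single function.

```python
def retained_cells(cells, max_versions, ttl_seconds=None, now_ms=0):
    """Return the (timestamp_ms, value) pairs a read would still see, newest first.

    cells: iterable of (timestamp_ms, value) for ONE column of ONE row.
    max_versions: the column family's VERSIONS setting.
    ttl_seconds: the column family's TTL, or None for FOREVER.
    now_ms: the current time in milliseconds (only used when a TTL is set).
    """
    alive = [
        (ts, value)
        for ts, value in cells
        if ttl_seconds is None or now_ms - ts < ttl_seconds * 1000
    ]
    alive.sort(key=lambda cell: cell[0], reverse=True)
    return alive[:max_versions]


# Timestamps and values of cf1:age as they appear in this article's transcripts.
AGE_CELLS = [(1536996375890, "18"), (1536996542980, "20"), (1536996547967, "21")]

print(retained_cells(AGE_CELLS, max_versions=3))  # all three versions
print(retained_cells(AGE_CELLS, max_versions=1))  # only the newest one
```

With VERSIONS => 3 (cf1) all three values of age stay readable, while a family left at the default of 1 would expose only the newest value.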


Table DML operations

  • put
hbase(main):010:0> put 'baizhi:t_user',1,'cf1:name','zhangsan'
0 row(s) in 0.2060 seconds

hbase(main):011:0> t = get_table 'baizhi:t_user'
0 row(s) in 0.0010 seconds

hbase(main):012:0> t.put 1,'cf1:age','18'
0 row(s) in 0.0500 seconds

  • get (the output below was captured after cf1:age had also been updated to 20 and then to 21)
hbase(main):017:0> get 'baizhi:t_user',1
COLUMN                           CELL     
 cf1:age                         timestamp=1536996547967, value=21   
 cf1:name                        timestamp=1536996337398, value=zhangsan  
2 row(s) in 0.0680 seconds
hbase(main):019:0> get 'baizhi:t_user',1,{COLUMN =>'cf1', VERSIONS=>10}
COLUMN                           CELL   
 cf1:age                         timestamp=1536996547967, value=21 
 cf1:age                         timestamp=1536996542980, value=20 
 cf1:age                         timestamp=1536996375890, value=18 
 cf1:name                        timestamp=1536996337398, value=zhangsan 
4 row(s) in 0.0440 seconds

hbase(main):020:0> get 'baizhi:t_user',1,{COLUMN =>'cf1:age', VERSIONS=>10}
COLUMN                           CELL                
 cf1:age                         timestamp=1536996547967, value=21 
 cf1:age                         timestamp=1536996542980, value=20 
 cf1:age                         timestamp=1536996375890, value=18   
3 row(s) in 0.0760 seconds

hbase(main):021:0> get 'baizhi:t_user',1,{COLUMN =>'cf1:age', TIMESTAMP => 1536996542980 }
COLUMN                           CELL 
 cf1:age                         timestamp=1536996542980, value=20  
1 row(s) in 0.0260 seconds

hbase(main):025:0> get 'baizhi:t_user',1,{TIMERANGE => [1536996375890,1536996547967]}
COLUMN                           CELL  
 cf1:age                         timestamp=1536996542980, value=20 
1 row(s) in 0.0480 seconds

hbase(main):026:0> get 'baizhi:t_user',1,{TIMERANGE => [1536996375890,1536996547967],VERSIONS=>10}
COLUMN                           CELL
 cf1:age                         timestamp=1536996542980, value=20 
 cf1:age                         timestamp=1536996375890, value=18 
2 row(s) in 0.0160 seconds
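
The get transcripts above show three selection rules: without VERSIONS only the newest cell is returned, TIMESTAMP matches one exact version, and TIMERANGE includes its lower bound but excludes its upper bound (the cell written at 1536996547967 is not returned). A minimal Python sketch of that selection logic follows. `get_cells` is a hypothetical helper written to reproduce the transcript, not the HBase client API.

```python
def get_cells(cells, versions=1, timestamp=None, time_range=None):
    """Select versions of one column the way the get transcripts behave.

    cells: iterable of (timestamp_ms, value) for one column of one row.
    versions: maximum number of versions to return (the shell default is 1).
    timestamp: return only the cell written at exactly this timestamp.
    time_range: (min_ts, max_ts); min_ts is inclusive, max_ts is exclusive.
    """
    selected = list(cells)
    if timestamp is not None:
        selected = [c for c in selected if c[0] == timestamp]
    if time_range is not None:
        lo, hi = time_range
        selected = [c for c in selected if lo <= c[0] < hi]
    selected.sort(key=lambda c: c[0], reverse=True)  # newest first
    return selected[:versions]


# cf1:age of row 1, copied from the transcript.
AGE = [(1536996375890, "18"), (1536996542980, "20"), (1536996547967, "21")]

print(get_cells(AGE, time_range=(1536996375890, 1536996547967)))
print(get_cells(AGE, time_range=(1536996375890, 1536996547967), versions=10))
```

The first call returns only the value 20 and the second returns 20 and 18, matching the two TIMERANGE queries in the transcript.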
  • scan (the output below was captured after more columns and a second row had been written)
hbase(main):004:0> scan 'baizhi:t_user'
ROW                              COLUMN+CELL   
 1                               column=cf1:age, timestamp=1536996547967, value=21
 1                               column=cf1:height, timestamp=1536997284682, value=170
 1                               column=cf1:name, timestamp=1536996337398, value=zhangsan
 1                               column=cf1:salary, timestamp=1536997158586, value=15000  
 1                               column=cf1:weight, timestamp=1536997311001, value=\x00\x00\x00\x00\x00\x00\x00\x05          
 2                               column=cf1:age, timestamp=1536997566506, value=18 
 2                               column=cf1:name, timestamp=1536997556491, value=lisi  
2 row(s) in 0.0470 seconds
hbase(main):009:0> scan 'baizhi:t_user', {STARTROW => '1',LIMIT=>1}
ROW                              COLUMN+CELL 
 1                               column=cf1:age, timestamp=1536996547967, value=21
 1                               column=cf1:height, timestamp=1536997284682, value=170 
 1                               column=cf1:name, timestamp=1536996337398, value=zhangsan
 1                               column=cf1:salary, timestamp=1536997158586, value=15000
 1                               column=cf1:weight, timestamp=1536997311001, value=\x00\x00\x00\x00\x00\x00\x00\x05          
1 row(s) in 0.0280 seconds

hbase(main):011:0> scan 'baizhi:t_user', {COLUMNS=>'cf1:age',TIMESTAMP=>1536996542980}
ROW                              COLUMN+CELL 
 1                               column=cf1:age, timestamp=1536996542980, value=20
1 row(s) in 0.0330 seconds
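
A scan walks rows in row-key order, and row keys are compared as bytes, so the key '10' sorts before '2'. STARTROW is the first key to include, and LIMIT counts rows rather than cells, which is why the LIMIT=>1 scan above still prints five cells. The Python sketch below illustrates this ordering. The `scan` function and the row with key b"10" are hypothetical and only serve the example.

```python
def scan(rows, start_row=None, limit=None):
    """Scan a toy table. rows: dict of row key (bytes) -> {column: value}."""
    result = []
    for key in sorted(rows):  # bytes compare lexicographically, byte by byte
        if limit is not None and len(result) >= limit:
            break
        if start_row is not None and key < start_row:
            continue
        result.append((key, rows[key]))
    return result


TABLE = {
    b"2": {b"cf1:name": b"lisi", b"cf1:age": b"18"},
    b"10": {b"cf1:name": b"wangwu"},  # hypothetical row, not in the article's table
    b"1": {b"cf1:name": b"zhangsan", b"cf1:age": b"21"},
}

print([key for key, _ in scan(TABLE)])            # [b'1', b'10', b'2']
print(scan(TABLE, start_row=b"1", limit=1))       # one row, with all of its cells
```

If numeric order matters, a common approach is to pad keys to a fixed width (for example '01', '02', '10') so that byte order and numeric order agree.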
  • delete/deleteall
hbase(main):013:0> scan 'baizhi:t_user', {COLUMNS=>'cf1:age',VERSIONS=>3}
ROW                              COLUMN+CELL  
 1                               column=cf1:age, timestamp=1536996547967, value=21
 1                               column=cf1:age, timestamp=1536996542980, value=20
 1                               column=cf1:age, timestamp=1536996375890, value=18 
 2                               column=cf1:age, timestamp=1536997566506, value=18                                           
2 row(s) in 0.0150 seconds

hbase(main):014:0> delete 'baizhi:t_user',1,'cf1:age',1536996542980
0 row(s) in 0.0920 seconds

hbase(main):015:0> scan 'baizhi:t_user', {COLUMNS=>'cf1:age',VERSIONS=>3}
ROW                              COLUMN+CELL 
 1                               column=cf1:age, timestamp=1536996547967, value=21
 2                               column=cf1:age, timestamp=1536997566506, value=18                                           
2 row(s) in 0.0140 seconds

hbase(main):016:0> delete 'baizhi:t_user',1,'cf1:age'
0 row(s) in 0.0170 seconds

hbase(main):017:0> scan 'baizhi:t_user', {COLUMNS=>'cf1:age',VERSIONS=>3}
ROW                              COLUMN+CELL     
 2                               column=cf1:age, timestamp=1536997566506, value=18                                           
1 row(s) in 0.0170 seconds

hbase(main):019:0> deleteall 'baizhi:t_user',1
0 row(s) in 0.0200 seconds

hbase(main):020:0>  get 'baizhi:t_user',1
COLUMN                           CELL 
0 row(s) in 0.0200 seconds
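
In the transcript above, delete with a timestamp removed both the version written at that timestamp and the older one, delete without a timestamp removed every version of the column, and deleteall removed the whole row. The Python sketch below reproduces exactly that observed behaviour on a toy table. `delete_column` and `delete_all` are hypothetical helpers, and other HBase versions may treat a timestamped delete differently.

```python
def delete_column(row, column, timestamp=None):
    """Delete versions of `column` from `row` (dict: column -> {timestamp_ms: value}).

    With a timestamp, versions written at or before it are removed;
    without one, all versions of the column are removed.
    """
    versions = row.get(column, {})
    for ts in list(versions):
        if timestamp is None or ts <= timestamp:
            del versions[ts]
    if column in row and not versions:
        del row[column]  # drop the column once no version is left


def delete_all(table, row_key):
    """Remove an entire row, like deleteall with only a row key."""
    table.pop(row_key, None)


table = {
    "1": {
        "cf1:age": {1536996375890: "18", 1536996542980: "20", 1536996547967: "21"},
        "cf1:name": {1536996337398: "zhangsan"},
    },
    "2": {"cf1:age": {1536997566506: "18"}},
}

delete_column(table["1"], "cf1:age", 1536996542980)  # leaves only the value 21
delete_column(table["1"], "cf1:age")                 # removes the column entirely
delete_all(table, "1")                               # removes row 1
```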
  • truncate
hbase(main):022:0> truncate 'baizhi:t_user'
Truncating 'baizhi:t_user' table (it may take a while):
 - Disabling table...
 - Truncating table...
0 row(s) in 4.0040 seconds


Reposted from blog.csdn.net/qq_42806727/article/details/89096210