1. Overview
HBase is a highly reliable, high-performance, column-oriented, scalable distributed storage system.
Scenarios with small data volumes are not a good fit for HBase.
HBase writes carry extra cost because its files are stored on HDFS.
(Reads and writes trade off against each other: to query quickly you must build indexes, and maintaining those indexes on every write slows writing down.)
The location of HBase's metadata table, hbase:meta, is stored in ZooKeeper. A client first reads that location from ZooKeeper, fetches the metadata, and then reads the actual data from HDFS based on it.
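That lookup chain (ZooKeeper → hbase:meta → region server) can be sketched as a toy Python model. The znode name matches the real /hbase/meta-region-server, but the dictionaries below are simplified stand-ins for the actual RPC protocol, and the hosts/row keys are made up for illustration:

```python
# Toy model of the HBase read path: the client asks ZooKeeper where
# hbase:meta lives, then finds which region holds the row key, and
# finally reads from the region server that serves that region's data.
zookeeper = {"/hbase/meta-region-server": "hadoop101"}  # meta-location znode

# hbase:meta maps (table, region start key) -> hosting region server
meta_table = {
    ("user", ""):  "hadoop102",  # region covering row keys [ "", "m" )
    ("user", "m"): "hadoop103",  # region covering row keys [ "m", ... )
}

region_data = {  # what each region server reads back from HDFS
    "hadoop102": {"alice": "row-A"},
    "hadoop103": {"mike": "row-M"},
}

def get(table, row_key):
    # step 1: ask ZooKeeper which server hosts hbase:meta
    meta_server = zookeeper["/hbase/meta-region-server"]
    # step 2: from meta, pick the region whose start key range covers row_key
    starts = sorted(s for t, s in meta_table if t == table)
    start = max(s for s in starts if s <= row_key)
    server = meta_table[(table, start)]
    # step 3: read the cell from that region server (backed by HDFS)
    return region_data[server][row_key]
```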
2. Deployment
Companion installation:
Distributed Hadoop 2.6, three nodes: hadoop101, hadoop102, hadoop103
ZooKeeper 3.5
2.1 Download
Download from the official website.
Upload the package to the hadoop101 virtual machine.
2.2 Distributed HBase deployment
1. Unpack and set environment variables
tar zxvf hbase-2.2.0-bin.tar.gz
vi ~/.bash_profile
Add the following to ~/.bash_profile:
HBASE_HOME=/usr/local/src/hbase-2.2.0
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin
export JAVA_HOME HADOOP_HOME CLASSPATH HIVE_HOME ZOOKEEPER_HOME HBASE_HOME PATH
Make it take effect:
source ~/.bash_profile
Sync the environment variables to the other two nodes:
scp ~/.bash_profile root@hadoop102:~/
scp ~/.bash_profile root@hadoop103:~/
On hadoop102 and hadoop103, apply them with: source ~/.bash_profile
2. Edit the configuration files
1) vim hbase-env.sh
Add the following:
export JAVA_HOME=/usr/local/src/jdk8u292-b10
export HBASE_MANAGES_ZK=false
Explanation:
HBASE_MANAGES_ZK=false means HBase uses an external ZooKeeper; true means it uses the bundled one. A distributed ZooKeeper is already installed here, so the bundled one is not used.
2) Configure hbase-site.xml: vi hbase-site.xml
Add the following:
<property>
<!-- HBase's storage path on HDFS; keep it strictly consistent with fs.defaultFS in Hadoop's core-site.xml, otherwise HMaster fails to start. -->
<name>hbase.rootdir</name>
<value>hdfs://hadoop101:9820/hbase</value>
</property>
<property>
<!-- Whether the cluster runs in distributed mode -->
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<!-- All ZooKeeper nodes of the cluster -->
<name>hbase.zookeeper.quorum</name>
<value>hadoop101,hadoop102,hadoop103</value>
</property>
<property>
<!-- Port for the HBase web UI -->
<name>hbase.master.info.port</name>
<value>60010</value>
</property>
<property>
<name>hbase.master.maxclockskew</name>
<value>180000</value>
<description>Maximum allowed clock skew, in milliseconds, between a region server and the master</description>
</property>
<property>
<!-- If a coprocessor fails to load or initialize, or throws a Throwable, false lets the system keep running; the default is true -->
<name>hbase.coprocessor.abortonerror</name>
<value>false</value>
</property>
<property>
<!-- In a distributed setup, be sure to set this to false -->
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>
3) Configure regionservers
The regionservers file lists the nodes that store data (the region servers):
vi regionservers
hadoop101
hadoop102
hadoop103
4) Configure high availability (optional)
Create a backup-masters file under conf to designate standby HMasters; if the active HMaster dies, a standby HMaster takes over.
[root@hadoop101 conf]# touch backup-masters
[root@hadoop101 conf]# vi backup-masters
To designate hadoop102 as the standby, make hadoop102 the content of backup-masters.
- Copy to the other nodes
scp -r hbase-2.2.0 root@hadoop102:/usr/local/src/
scp -r hbase-2.2.0 root@hadoop103:/usr/local/src/
- Start
Before starting HBase, confirm that Hadoop and ZooKeeper are already running.
Start HBase on the master node with: start-hbase.sh
Whichever node you start HBase on becomes the master node, i.e. the node where the HMaster process runs.
After startup you should see new HMaster and HRegionServer processes on this node, and HRegionServer processes on hadoop102 and hadoop103 as well; that indicates a successful start.
HBase web UI: http://hadoop101:60010/master-status
After a successful start, an hbase folder appears under the HDFS root; this path can be changed via the configuration file (hbase.rootdir).
Summary:
When HBase misbehaves:
- First stop HBase:
stop-hbase.sh
- Then inspect the .log files under logs and fix the specific error;
- To reinitialize HBase, delete the /hbase path on HDFS and the log/logs folders under the HBase install directory (and, if stale state lingers, the /hbase znode in ZooKeeper as well).
3. Usage
An installed HBase is only a tool; knowing how to use it is what counts. The following demonstrates some basic operations.
hbase shell
Create a table named myHbase with one column family named myCard that keeps 5 versions:
create 'myHbase',{NAME => 'myCard',VERSIONS => 5}
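VERSIONS => 5 means each cell in the myCard family keeps up to five timestamped values, with a plain read returning the newest. A minimal Python sketch of that behavior (a hypothetical VersionedCell class for illustration, not the HBase API):

```python
import time

class VersionedCell:
    """Keeps up to max_versions (timestamp, value) pairs, newest first,
    mimicking a cell in a column family created with VERSIONS => 5."""
    def __init__(self, max_versions=5):
        self.max_versions = max_versions
        self.versions = []  # list of (timestamp, value), newest first

    def put(self, value, ts=None):
        # assumes timestamps only increase, as with real wall-clock puts
        ts = ts if ts is not None else time.time()
        self.versions.insert(0, (ts, value))
        del self.versions[self.max_versions:]  # drop versions beyond the limit

    def get(self):
        return self.versions[0][1]  # a plain get returns the newest version

cell = VersionedCell(max_versions=3)
for ts, v in enumerate(["a", "b", "c", "d"]):
    cell.put(v, ts=ts)  # four puts, but only the newest three survive
```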
Check it with list.
Note: all table names and column names must be quoted.
1. Check status and version:
status
version
2. List all tables:
list
3. Exit:
quit
DDL
4. Create tables
create 'test','testcf'
create 'test1',{NAME => 'name',VERSIONS => 5}
create 'test2',{NAME => 'name',VERSIONS => 5},{NAME => 'age',VERSIONS => 3}
5. Describe a table
desc 'test'
6. Disable a table (required before dropping a table or altering its settings)
disable 'test'
Re-enable a table:
enable 'test'
7. Drop a table
drop 'test'
8. Check whether a table exists
exists 'test'
9. Flush in-memory data to HFiles
flush 'test'
10. Alter a table
create 'test','testcf'
put 'test','rk1','testcf:a','first'
disable 'test'
alter 'test',{NAME=>'testcf',TTL=>5}
enable 'test'
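TTL=>5 in the alter above means cells in testcf expire five seconds after their write timestamp: expired cells become invisible to reads and are physically removed later, during compaction. A rough Python model of the visibility check (the function name and the timestamps are illustrative, not HBase code):

```python
import time

TTL_SECONDS = 5  # mirrors alter 'test',{NAME=>'testcf',TTL=>5}

def is_expired(cell_timestamp, ttl=TTL_SECONDS, now=None):
    """A cell is hidden from reads once (now - its timestamp) exceeds the
    family's TTL; HBase deletes it physically at the next compaction."""
    now = now if now is not None else time.time()
    return now - cell_timestamp > ttl

write_ts = 1000.0  # hypothetical write time, in seconds
```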
DML
create 'test','cftest'
1. Insert data
put 'test','rk1','cftest:a','first'
put 'test','rk1','cftest:a','first'
put 'test','rk1','cftest:a','second'
put 'test','rk1','cftest:a','three'
put 'test','rk2','cftest:a','four'
put 'test','rk2','cftest:b','four'
put 'test','rk3','cftest:c','five'
2. Scan the whole table
scan 'test'
Limit the number of rows returned:
scan 'test',{LIMIT=>2}
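Rows in an HBase table are physically sorted by row key (compared as bytes), so a scan always returns rows in lexicographic row-key order, and LIMIT simply truncates that ordered stream. A small Python sketch of the idea (a toy in-memory table, not the shell's implementation):

```python
# Row keys rk1/rk2/rk3 deliberately inserted out of order: a scan still
# yields them sorted, because HBase stores rows ordered by row key.
table = {"rk3": "five", "rk1": "three", "rk2": "four"}

def scan(table, limit=None):
    rows = sorted(table.items())           # lexicographic row-key order
    return rows[:limit] if limit else rows  # LIMIT => n keeps the first n

first_two = scan(table, limit=2)  # like scan 'test',{LIMIT=>2}
```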
3. Get one row
get 'test','rk2'
4. Get one column of a row
get 'test','rk2','cftest:a'
5. Update data
put 'test','rk2','cftest:a','hadoop'
6. Delete data
Delete a specific column of a row:
delete 'test','rk2','cftest:a'
Delete all columns of a row:
deleteall 'test','rk2'
Truncate the table:
truncate 'test'
7. Count rows
count 'test'
Native HBase programming
The four NoSQL families:
Graph databases: model relationships between entities (e.g. Neo4j);
Document databases: e.g. MongoDB;
Key-value databases: e.g. Redis;
Column-family databases: e.g. HBase, Cassandra.
4. Pitfalls
1. HMaster exits shortly after startup
After starting HBase, the HMaster process died again within a few seconds, while the RegionServer and ZooKeeper processes on every node were normal.
Checking the HBase logs revealed the following:
2021-08-02 21:38:00,474 WARN [master/hadoop101:16000:becomeActiveMaster] ipc.Client: Failed to connect to server: hadoop101/192.168.1.101:9000: try once and fail.
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:685)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:788)
at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1550)
at org.apache.hadoop.ipc.Client.call(Client.java:1381)
at org.apache.hadoop.ipc.Client.call(Client.java:1345)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy18.setSafeMode(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setSafeMode(ClientNamenodeProtocolTranslatorPB.java:691)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
at com.sun.proxy.$Proxy19.setSafeMode(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:372)
at com.sun.proxy.$Proxy20.setSafeMode(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.setSafeMode(DFSClient.java:2143)
at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1359)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hbase.util.FSUtils.isInSafeMode(FSUtils.java:287)
at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:699)
at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:250)
at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:153)
at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:124)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:922)
at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2353)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:598)
at java.lang.Thread.run(Thread.java:748)
(The same warning and stack trace repeat immediately afterward as the client retries the connection.)
Solution:
Change the port in the hbase.rootdir value in hbase-site.xml from 9000 to 9820, the port HDFS is actually running on. The offending configuration was:
<property>
<name>hbase.rootdir</name>
<value>hdfs://Hadoop01:9000/hbase</value>
</property>
Checking core-site.xml confirmed that HDFS runs on port 9820:
<property>
<name>fs.defaultFS</name>
<value>hdfs://Hadoop01:9820</value>
</property>
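This kind of mismatch can be caught mechanically by comparing the authority (host:port) part of hbase.rootdir against fs.defaultFS. A self-contained Python sketch; the XML strings are inlined samples mirroring the configs above, and in practice you would read the real core-site.xml and hbase-site.xml from disk:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Sample contents; replace with the real files in practice.
core_site = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://Hadoop01:9820</value></property>
</configuration>"""
hbase_site = """<configuration>
  <property><name>hbase.rootdir</name><value>hdfs://Hadoop01:9000/hbase</value></property>
</configuration>"""

def get_prop(xml_text, key):
    """Return the <value> of the <property> whose <name> matches key."""
    for prop in ET.fromstring(xml_text).iter("property"):
        if prop.findtext("name") == key:
            return prop.findtext("value")
    return None

fs_default = urlparse(get_prop(core_site, "fs.defaultFS"))
rootdir = urlparse(get_prop(hbase_site, "hbase.rootdir"))
# True means the host:port parts differ -> HMaster would fail to start
mismatch = fs_default.netloc != rootdir.netloc
```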
(More to be written when time allows.)
SQL layer on top of HBase: Apache Phoenix
http://phoenix.apache.org/download.html