This article describes how to install, deploy, start, and stop an HBase cluster, how to perform basic operations from the HBase command line, and what each HBase configuration file does.
Complete all of the prerequisites below before you begin the installation.
I. Prerequisites
1. JDK
Like Hadoop, HBase runs on the JVM, so install a JDK and configure its environment variables first. Early releases required JDK 1.6 or later; the table below lists the JDK versions that apply to current releases.
JDK versions supported by each HBase version
HBase Version | JDK 7 | JDK 8 | JDK 9 (Non-LTS) | JDK 10 (Non-LTS) | JDK 11 |
---|---|---|---|---|---|
2.0+ | no | yes | see HBASE-20264 | see HBASE-20264 | see HBASE-21110 |
1.2+ | yes | yes | see HBASE-20264 | see HBASE-20264 | see HBASE-21110 |
2. ZooKeeper
ZooKeeper is the coordinator of an HBase cluster and addresses the HMaster single point of failure, so HBase needs a running ZooKeeper.
3. Hadoop
In cluster mode, HBase needs a Hadoop environment.
Hadoop versions supported by each HBase version
- T = supported
- F = not supported
- N = not tested
Hadoop Version | HBase-1.2.x, HBase-1.3.x | HBase-1.4.x | HBase-2.0.x | HBase-2.1.x |
---|---|---|---|---|
Hadoop-2.4.x | T | F | F | F |
Hadoop-2.5.x | T | F | F | F |
Hadoop-2.6.0 | F | F | F | F |
Hadoop-2.6.1+ | T | F | T | F |
Hadoop-2.7.0 | F | F | F | F |
Hadoop-2.7.1+ | T | T | T | T |
Hadoop-2.8.[0-1] | F | F | F | F |
Hadoop-2.8.2 | N | N | N | N |
Hadoop-2.8.3+ | N | N | T | T |
Hadoop-2.9.0 | F | F | F | F |
Hadoop-2.9.1+ | N | N | N | N |
Hadoop-3.0.[0-2] | F | F | F | F |
Hadoop-3.0.3+ | F | F | T | T |
Hadoop-3.1.0 | F | F | F | F |
Hadoop-3.1.1+ | F | F | T | T |
II. Installation and deployment
HBase has two modes of operation: standalone mode and distributed mode.
1. Standalone mode
Download
First, download a stable release of HBase from https://www.apache.org/dyn/closer.lua/hbase/
It is best to click the suggested link at the top of that page, which leads to HBase Releases. Open the stable folder, then download the binary file ending in tar.gz to your local machine. Do not download the file ending in src.tar.gz for now.
Extract
Change to the directory where you want to extract the archive, then run:
$ tar xzvf hbase-1.3.5-bin.tar.gz
$ cd hbase-1.3.5/
JAVA_HOME
Before you start HBase, you need to set the JAVA_HOME environment variable. You can set it through your operating system's usual mechanism, but HBase also provides a central place for it, conf/hbase-env.sh. Edit this file, uncomment the line that begins with JAVA_HOME, and set it to the path that suits your operating system.
JAVA_HOME=/usr
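If you are not sure which path to use, one option is to derive it from the java binary that is already on your PATH. The sketch below assumes GNU readlink (for the -f flag) and a JDK laid out as <home>/bin/java; java_home_from_binary is a hypothetical helper name, not something HBase ships.

```shell
#!/usr/bin/env bash
# Derive a JAVA_HOME candidate from the real location of a java binary.
# java_home_from_binary is a hypothetical helper, not part of HBase.
java_home_from_binary() {
  local java_bin="$1" resolved
  # Follow symlinks such as /usr/bin/java -> /etc/alternatives/java -> ...
  resolved="$(readlink -f "$java_bin")" || return 1
  # Strip the trailing /bin/java to get the installation directory.
  dirname "$(dirname "$resolved")"
}

# Usage: print a line that could be pasted into conf/hbase-env.sh.
if command -v java >/dev/null 2>&1; then
  echo "export JAVA_HOME=$(java_home_from_binary "$(command -v java)")"
fi
```

Check the printed path by hand before using it, since some distributions place the binary under a jre/ subdirectory.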
hbase-site.xml
The main HBase configuration file is conf/hbase-site.xml. Edit it to specify the directories on the local file system where HBase and ZooKeeper store their data.
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///home/testuser/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/testuser/zookeeper</value>
</property>
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
<description>
Controls whether HBase will check for stream capabilities (hflush/hsync).
Disable this if you intend to run on LocalFileSystem, denoted by a rootdir
with the 'file://' scheme, but be mindful of the NOTE below.
WARNING: Setting this to false blinds you to potential data loss and
inconsistent system state in the event of process and/or node failures. If
HBase is complaining of an inability to use hsync or hflush it's most
likely not a false positive.
</description>
</property>
</configuration>
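Start and stop
Run bin/start-hbase.sh to start HBase.
Run bin/stop-hbase.sh to stop HBase.
Use the jps command to confirm that the HBase process has started or has shut down. In standalone mode the HMaster, HRegionServer, and ZooKeeper daemons all run inside a single JVM, which jps lists as HMaster.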
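The jps check can be scripted by filtering its output for HBase daemon names. This is a minimal sketch: hbase_daemons_from_jps is a hypothetical helper name, and it assumes the usual "<pid> <main class>" output format of jps. HQuorumPeer is included because that is the name under which an HBase-managed ZooKeeper appears when it runs as its own process.

```shell
#!/usr/bin/env bash
# Report which HBase daemons appear in jps-style output read from stdin.
# hbase_daemons_from_jps is a hypothetical helper name.
hbase_daemons_from_jps() {
  # jps prints "<pid> <main class>"; keep only the HBase daemon names.
  awk '$2 == "HMaster" || $2 == "HRegionServer" || $2 == "HQuorumPeer" { print $2 }' | sort -u
}

# Usage against the live JVM list (requires the JDK's jps on PATH):
#   jps | hbase_daemons_from_jps
```

An empty result after bin/stop-hbase.sh suggests that HBase has shut down.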
2. Cluster mode
2.1 Pseudo-distributed mode
Pseudo-distributed mode means that HBase still runs entirely on a single host, but each HBase daemon (HMaster, HRegionServer, and ZooKeeper) runs as a separate process.
Before using cluster mode, make sure that HDFS is running normally.
hbase-site.xml
Edit the main HBase configuration file, conf/hbase-site.xml. Enable distributed mode and point hbase.rootdir at an HDFS URI:
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8020/hbase</value>
</property>
Run bin/start-hbase.sh to start HBase. If your system is configured correctly, the jps command should show the HMaster and HRegionServer processes running.
Starting and stopping backup HBase Master (HMaster) servers
You can start up to 9 backup HMaster servers, which makes 10 HMasters in total, counting the primary. Each argument to the script is a port offset that is added to the HMaster default ports, so the following command starts three backup masters:
$ ./bin/local-master-backup.sh start 2 3 5
Starting and stopping additional RegionServers
$ ./bin/local-regionservers.sh start 2 3 4 5
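To see which ports a given offset ends up using, the arithmetic can be sketched as follows. The sketch assumes the HMaster defaults of 16000 (RPC) and 16010 (web UI); offset_ports is a hypothetical helper name. The base ports for additional RegionServers vary between releases, so read them from bin/local-regionservers.sh in your installation rather than from this example.

```shell
#!/usr/bin/env bash
# Print the RPC and web UI ports for a daemon started with a port offset.
# offset_ports is a hypothetical helper; the base ports are passed in.
offset_ports() {
  local base_rpc="$1" base_ui="$2" offset="$3"
  echo "$((base_rpc + offset)) $((base_ui + offset))"
}

# Backup masters started with offsets 2, 3 and 5, assuming the
# HMaster default ports 16000 (RPC) and 16010 (web UI).
for o in 2 3 5; do
  echo "backup master offset $o -> ports $(offset_ports 16000 16010 "$o")"
done
```

With these assumptions, an offset of 2 maps to ports 16002 and 16012.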
2.2 Fully distributed mode
In practice, you need a fully distributed configuration to test HBase thoroughly and to use it in real-world scenarios. In a distributed configuration, the cluster consists of multiple nodes, each running one or more HBase daemons. These include the primary and backup Master instances, multiple ZooKeeper nodes, and multiple RegionServer nodes.
The architecture is as follows:
Node Name | Master | ZooKeeper | RegionServer |
---|---|---|---|
node-a.example.com | yes | yes | no |
node-b.example.com | backup | yes | yes |
node-c.example.com | no | yes | yes |
Make sure the nodes are allowed to communicate with each other: configure the firewall rules and passwordless SSH access between them, and make sure ZooKeeper is configured and able to start.
Download and extract HBase on each machine, and synchronize the configuration files to every node.
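For the three-node layout in the table, the two host-list files in conf/ could be generated as in the sketch below. write_cluster_conf is a hypothetical helper name, and the hostnames are the example ones from the table, so substitute your own.

```shell
#!/usr/bin/env bash
# Write conf/regionservers and conf/backup-masters for the example layout.
# write_cluster_conf is a hypothetical helper; hostnames come from the table.
write_cluster_conf() {
  local conf_dir="$1"
  mkdir -p "$conf_dir"
  # One RegionServer host per line (node-b and node-c in the table).
  printf '%s\n' node-b.example.com node-c.example.com > "$conf_dir/regionservers"
  # One backup master host per line (node-b in the table).
  printf '%s\n' node-b.example.com > "$conf_dir/backup-masters"
}

# Usage, run from the HBase directory on node-a:
#   write_cluster_conf conf
```

After generating the files, copy the whole conf/ directory to the other nodes with whatever tool you normally use, for example scp or rsync.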
Start the cluster
$ bin/start-hbase.sh
node-c.example.com: starting zookeeper, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-zookeeper-node-c.example.com.out
node-a.example.com: starting zookeeper, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-zookeeper-node-a.example.com.out
node-b.example.com: starting zookeeper, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-zookeeper-node-b.example.com.out
starting master, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-master-node-a.example.com.out
node-c.example.com: starting regionserver, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-regionserver-node-c.example.com.out
node-b.example.com: starting regionserver, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-regionserver-node-b.example.com.out
node-b.example.com: starting master, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-master-nodeb.example.com.out
3. Web UI
In HBase releases newer than 0.98.x, the Web UI ports changed from 60010 for the Master and 60030 for each RegionServer to 16010 for the Master and 16030 for each RegionServer.
After HBase has started, you can view these pages in a browser.
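The port mapping can be captured in a small helper that builds the URL for each role. This is only a sketch: hbase_ui_url is a hypothetical name, and it assumes the default ports have not been overridden in hbase-site.xml.

```shell
#!/usr/bin/env bash
# Build the Web UI URL for a daemon, using the post-0.98.x default ports.
# hbase_ui_url is a hypothetical helper name.
hbase_ui_url() {
  local role="$1" host="$2"
  case "$role" in
    master)       echo "http://$host:16010" ;;
    regionserver) echo "http://$host:16030" ;;
    *) echo "unknown role: $role" >&2; return 1 ;;
  esac
}

# Usage: probe the master UI and print the HTTP status code.
#   curl -s -o /dev/null -w '%{http_code}\n' "$(hbase_ui_url master localhost)"
```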
III. HBase Shell
From the HBase installation directory, use the hbase shell command in the bin/ directory to connect to a running HBase instance.
$ ./bin/hbase shell
hbase(main):001:0>
View the HBase Shell help text
Type help and press Enter to see basic usage information and some example HBase Shell commands.
Create a table
Use the create command to create a table. You must specify a table name and a column family name.
hbase(main):001:0> create 'test', 'cf'
0 row(s) in 0.4170 seconds
=> Hbase::Table - test
Table information
Use the list command to confirm that the table exists.
hbase(main):002:0> list 'test'
TABLE
test
1 row(s) in 0.0180 seconds
=> ["test"]
Use the describe command to display the table's details and configuration.
hbase(main):003:0> describe 'test'
Table test is ENABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE =>
'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'f
alse', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE
=> '65536'}
1 row(s)
Took 0.9998 seconds
Insert data
Use the put command to insert data.
hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0850 seconds
hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0110 seconds
hbase(main):005:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0100 seconds
Scan all data
One way to get data out of HBase is to scan. Use the scan command to scan a table for data. You can also limit the scan; here all rows are returned.
hbase(main):006:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1421762485768, value=value1
row2 column=cf:b, timestamp=1421762491785, value=value2
row3 column=cf:c, timestamp=1421762496210, value=value3
3 row(s) in 0.0230 seconds
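To limit a scan, the scan command accepts a LIMIT option. The sketch below builds such a statement and can pipe it into a non-interactive shell session (the -n flag); limited_scan_stmt is a hypothetical helper name, and the table name and limit are just the values from this example.

```shell
#!/usr/bin/env bash
# Print an HBase Shell statement that scans at most N rows of a table.
# limited_scan_stmt is a hypothetical helper name.
limited_scan_stmt() {
  local table="$1" limit="$2"
  printf "scan '%s', {LIMIT => %d}\n" "$table" "$limit"
}

# Usage: feed the statement to a non-interactive shell session.
#   limited_scan_stmt test 2 | ./bin/hbase shell -n
```

With the three rows inserted above, a limit of 2 should return only row1 and row2, because a scan walks the rows in row-key order.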
Get a single row of data
Use the get command to retrieve one row at a time.
hbase(main):007:0> get 'test', 'row1'
COLUMN CELL
cf:a timestamp=1421762485768, value=value1
1 row(s) in 0.0350 seconds
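Disable and enable a table
If you want to delete a table or change its settings, you must disable it first with the disable command. You can re-enable it with the enable command.
hbase(main):008:0> disable 'test'
0 row(s) in 1.1820 seconds
hbase(main):009:0> enable 'test'
0 row(s) in 0.1770 seconds
If you tested the enable command above, disable the table again before deleting it.
hbase(main):010:0> disable 'test'
0 row(s) in 1.1820 seconds
Delete a table
Use the drop command to delete a table.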
hbase(main):011:0> drop 'test'
0 row(s) in 0.1370 seconds
Exit the HBase Shell
Use the quit command to exit the HBase Shell and disconnect from the cluster.
IV. Configuration files in detail
Apache HBase uses the same configuration system as Apache Hadoop. All configuration files are located in the conf/ directory, which must be kept in sync across every node in the cluster.
backup-masters
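Not present by default. This is a plain-text file that lists the hosts on which the Master should start a backup Master process, one host per line.
hadoop-metrics2-hbase.properties
Used to connect HBase to Hadoop's Metrics2 framework. By default it contains only commented-out examples.
hbase-env.cmd and hbase-env.sh
Scripts for Windows and Linux/Unix environments that set up the working environment for HBase, including the location of Java, Java options, and other environment variables. The file contains many commented-out examples to provide guidance.
hbase-policy.xml
The default policy configuration file that RPC servers use to make authorization decisions on client requests. It is only used when HBase security is enabled.
hbase-site.xml
The main HBase configuration file. It specifies configuration options that override HBase's default configuration. You can view (but should not edit) the default configuration file at docs/hbase-default.xml. You can also view the entire effective configuration of your cluster (defaults and overrides) in the HBase Configuration tab of the HBase Web UI.
log4j.properties
The configuration file for HBase logging via log4j.
regionservers
A plain-text file containing the list of hosts that should run a RegionServer in your HBase cluster. By default this file contains the single entry localhost. It should contain a list of hostnames or IP addresses, one per line, and should contain only localhost if each node in your cluster runs a RegionServer on its localhost interface.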