Getting Started with HBase (Part 2): Installation and Configuration

This article describes how to install, deploy, start, and stop an HBase cluster, how to perform basic operations from the HBase command line, and what each HBase configuration file does.

All of the prerequisites below must be in place before you install HBase.

I. Prerequisites

1. JDK

Like Hadoop, HBase runs on the JVM, so install a JDK and configure its environment variables first. JDK 1.6 is only enough for very old releases; HBase 1.2+ needs at least JDK 7, and HBase 2.0+ needs JDK 8.

JDK versions supported by each HBase version

HBase Version	JDK 7	JDK 8	JDK 9 (Non-LTS)	JDK 10 (Non-LTS)	JDK 11
2.0+	not supported	supported	HBASE-20264	HBASE-20264	HBASE-21110
1.2+	supported	supported	HBASE-20264	HBASE-20264	HBASE-21110
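
To compare an installed JDK against the table, its major version can be read from the version string that java -version prints. The following is a minimal sketch: the function name java_major is made up for illustration, and it assumes version strings shaped like 1.8.0_212 (legacy scheme) or 11.0.2 (newer scheme).

```shell
# Hypothetical helper: extract the major JDK version from a version string.
# Legacy strings look like "1.8.0_212" (major = 8); newer ones like "11.0.2".
java_major() {
  v=$1
  case "$v" in
    1.*) v=${v#1.} ;;   # drop the legacy "1." prefix
  esac
  echo "${v%%.*}"       # keep everything before the first remaining dot
}

# Example: `java -version` writes to stderr, with the version in double quotes.
# ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
# java_major "$ver"
```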

2. ZooKeeper

ZooKeeper is the coordinator of an HBase cluster and is what removes the HMaster as a single point of failure, so a ZooKeeper installation is required.

3. Hadoop

Cluster mode requires a Hadoop environment.

Supported Hadoop versions

T = support
F = not supported
N = not tested

Hadoop version	HBase-1.2.x, HBase-1.3.x	HBase-1.4.x	HBase-2.0.x	HBase-2.1.x
Hadoop-2.4.x	T	F	F	F
Hadoop-2.5.x	T	F	F	F
Hadoop-2.6.0	F	F	F	F
Hadoop-2.6.1+	T	F	T	F
Hadoop-2.7.0	F	F	F	F
Hadoop-2.7.1+	T	T	T	T
Hadoop-2.8.[0-1]	F	F	F	F
Hadoop-2.8.2	N	N	N	N
Hadoop-2.8.3+	N	N	T	T
Hadoop-2.9.0	F	F	F	F
Hadoop-2.9.1+	N	N	N	N
Hadoop-3.0.[0-2]	F	F	F	F
Hadoop-3.0.3+	F	F	T	T
Hadoop-3.1.0	F	F	F	F
Hadoop-3.1.1+	F	F	T	T

II. Installation and Deployment

HBase has two modes of operation: standalone mode and distributed mode.

1. Standalone mode

Download

First, download a stable release of HBase from https://www.apache.org/dyn/closer.lua/hbase/

The easiest route is to follow the top link on that page to HBase Releases, open the stable folder, and download the binary archive whose name ends in tar.gz. Do not download the file ending in src.tar.gz for now.

Extract

Change to the directory where you want to extract the archive, then run:

$ tar xzvf hbase-1.3.5-bin.tar.gz
$ cd hbase-1.3.5/

JAVA_HOME

Before starting HBase, you need to set the JAVA_HOME environment variable. You can set it through your operating system's usual mechanism, but HBase also provides a central place for it: conf/hbase-env.sh. Edit this file, uncomment the line starting with JAVA_HOME, and set it to the appropriate path for your system.

JAVA_HOME=/usr
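
A wrong JAVA_HOME is a common reason for HBase failing to start, so it can help to check the value before going further. The following is a minimal sketch; the function name check_java_home is made up for illustration, and the only assumption is that a usable Java home contains an executable bin/java.

```shell
# Hypothetical helper: succeed only if the given directory looks like a
# Java home, i.e. it contains an executable bin/java.
check_java_home() {
  if [ -z "$1" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$1/bin/java" ]; then
    echo "no executable found at $1/bin/java" >&2
    return 1
  fi
  echo "JAVA_HOME looks usable: $1"
}

# Usage: check_java_home "$JAVA_HOME"
```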

hbase-site.xml

Edit conf/hbase-site.xml, the main HBase configuration file.

In standalone mode you need to specify the directories on the local file system where HBase and ZooKeeper store their data.

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/testuser/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/testuser/zookeeper</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
    <description>
      Controls whether HBase will check for stream capabilities (hflush/hsync).

      Disable this if you intend to run on LocalFileSystem, denoted by a rootdir
      with the 'file://' scheme, but be mindful of the NOTE below.

      WARNING: Setting this to false blinds you to potential data loss and
      inconsistent system state in the event of process and/or node failures. If
      HBase is complaining of an inability to use hsync or hflush it's most
      likely not a false positive.
    </description>
  </property>
</configuration>

Start and stop

Run bin/start-hbase.sh to start HBase.

Run bin/stop-hbase.sh to stop HBase.

After stopping, use jps to confirm that the HMaster and HRegionServer processes have shut down.
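
The jps check can also be scripted. The following is a minimal sketch, assuming jps prints one "PID ClassName" pair per line; the function name hbase_daemons_running is made up for illustration.

```shell
# Hypothetical helper: read `jps` output on stdin and succeed if an
# HBase daemon (HMaster or HRegionServer) is still listed.
hbase_daemons_running() {
  grep -Eq '^[0-9]+ (HMaster|HRegionServer)$'
}

# Usage:
# if jps | hbase_daemons_running; then
#   echo "HBase is still running"
# else
#   echo "HBase has stopped"
# fi
```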

2. Cluster mode

2.1 Pseudo-distributed mode

Pseudo-distributed mode means that HBase still runs entirely on a single host, but each HBase daemon (HMaster, HRegionServer, and ZooKeeper) runs as a separate process.

Before using cluster mode, make sure that HDFS is running normally.

hbase-site.xml

Edit conf/hbase-site.xml, the main HBase configuration file.

Enable distributed mode and point hbase.rootdir at an HDFS URI:

<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://localhost:8020/hbase</value>
</property>

Run bin/start-hbase.sh to start HBase. If the system is configured correctly, the jps command should show the HMaster and HRegionServer processes running.

Starting and stopping backup HBase Master (HMaster) servers

You can start up to 9 backup HMaster servers, which makes 10 HMasters in total including the primary. Each numeric argument is a port offset added to the Master's default ports, so every backup listens on its own ports.

$ ./bin/local-master-backup.sh start 2 3 5
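
To see which ports the offsets 2, 3, and 5 translate to, the arithmetic can be sketched as follows. This assumes the default Master RPC port 16000 and info (Web UI) port 16010; other releases or custom configurations may use different bases, so treat the numbers as an illustration. The function name backup_master_ports is made up.

```shell
# Assumed defaults: Master RPC port 16000, Master info (Web UI) port 16010.
# Hypothetical helper: print "rpc/info" ports for each offset given.
backup_master_ports() {
  for offset in "$@"; do
    echo "$((16000 + offset))/$((16010 + offset))"
  done
}

backup_master_ports 2 3 5
```

Under those assumptions the three backup masters would use 16002/16012, 16003/16013, and 16005/16015.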

Starting and stopping additional RegionServers

$ ./bin/local-regionservers.sh start 2 3 4 5

2.2 Fully distributed mode

In practice you need a fully distributed configuration to test HBase thoroughly and to use it in real-world scenarios. In a distributed configuration the cluster consists of multiple nodes, each running one or more HBase daemons: primary and backup Master instances, multiple ZooKeeper nodes, and multiple RegionServer nodes.

The example architecture is as follows:

Node Name	Master	ZooKeeper	RegionServer
node-a.example.com	yes	yes	no
node-b.example.com	backup	yes	yes
node-c.example.com	no	yes	yes

Make sure the nodes are allowed to communicate with each other: set up passwordless SSH between them, open the required ports in any firewall, and have ZooKeeper configured and running.

Download and extract HBase on every machine, and synchronize the configuration files to each of them.
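
One way to keep the configuration in sync is to copy conf/ from one node to the others with rsync over SSH. The following is a dry-run sketch that only prints the commands; the function name print_conf_sync and the path /opt/hbase are hypothetical, and it assumes HBase is unpacked at the same path on every node.

```shell
# Hypothetical helper: print the rsync command that would copy the local
# conf/ directory to the same path on each host. Nothing is executed here;
# pipe the output to `sh` once the commands look right.
# $1 = HBase installation directory (assumed identical on all nodes)
# remaining arguments = target hosts
print_conf_sync() {
  hbase_home=$1
  shift
  for host in "$@"; do
    echo "rsync -a $hbase_home/conf/ $host:$hbase_home/conf/"
  done
}

print_conf_sync /opt/hbase node-b.example.com node-c.example.com
```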

Start the cluster

$ bin/start-hbase.sh
node-c.example.com: starting zookeeper, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-zookeeper-node-c.example.com.out
node-a.example.com: starting zookeeper, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-zookeeper-node-a.example.com.out
node-b.example.com: starting zookeeper, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-zookeeper-node-b.example.com.out
starting master, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-master-node-a.example.com.out
node-c.example.com: starting regionserver, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-regionserver-node-c.example.com.out
node-b.example.com: starting regionserver, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-regionserver-node-b.example.com.out
node-b.example.com: starting master, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-master-nodeb.example.com.out

3. Web UI

In HBase releases newer than 0.98.x, the Web UI ports changed from 60010 (Master) and 60030 (RegionServer) to 16010 (Master) and 16030 (RegionServer).

Once HBase has started, you can open these pages in a browser.
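
The pages can also be probed from the command line. The following is a minimal sketch that builds the URL from the default ports mentioned above; the function name webui_url is made up for illustration, and the ports will differ if they were overridden in hbase-site.xml.

```shell
# Hypothetical helper: build the Web UI URL for a role on a host, using
# the default ports of releases newer than 0.98.x.
# $1 = role ("master" or "regionserver"), $2 = host name
webui_url() {
  case "$1" in
    master)       port=16010 ;;
    regionserver) port=16030 ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
  echo "http://$2:$port"
}

# Usage: curl -sf "$(webui_url master localhost)" >/dev/null && echo "Master UI is up"
```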

III. HBase Shell

Use the hbase shell command, located in the bin/ directory of the HBase installation, to connect to a running HBase instance.

$ ./bin/hbase shell
hbase(main):001:0>

Show the HBase Shell help text

Type help and press Enter to see basic usage information and some example HBase Shell commands.

Create a table

Use create to create a table. You must specify the table name and a column family name.

hbase(main):001:0> create 'test', 'cf'
0 row(s) in 0.4170 seconds

=> Hbase::Table - test

Table information

Use list to confirm that the table exists.

hbase(main):002:0> list 'test'
TABLE
test
1 row(s) in 0.0180 seconds

=> ["test"]

Use `describe` to display the table's details and configuration.

hbase(main):003:0> describe 'test'
Table test is ENABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE =>
'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'f
alse', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE
 => '65536'}
1 row(s)
Took 0.9998 seconds

Insert data

Use put to insert data.

hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0850 seconds

hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0110 seconds

hbase(main):005:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0100 seconds

Scan all data

One way to get data out of HBase is to scan it. Use the scan command to read the table's data. A scan can be limited, but here all of the data is fetched.

hbase(main):006:0> scan 'test'
ROW                                      COLUMN+CELL
 row1                                    column=cf:a, timestamp=1421762485768, value=value1
 row2                                    column=cf:b, timestamp=1421762491785, value=value2
 row3                                    column=cf:c, timestamp=1421762496210, value=value3
3 row(s) in 0.0230 seconds

Get a single row

Use the get command to fetch one row at a time.

hbase(main):007:0> get 'test', 'row1'
COLUMN                                   CELL
 cf:a                                    timestamp=1421762485768, value=value1
1 row(s) in 0.0350 seconds

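Disable and enable a table

A table must be disabled before it can be deleted or have its settings changed. Use the disable command to disable a table and the enable command to re-enable it.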

hbase(main):008:0> disable 'test'
0 row(s) in 1.1820 seconds

hbase(main):009:0> enable 'test'
0 row(s) in 0.1770 seconds

If you re-enabled the table with enable, disable it again so that it can be dropped:

hbase(main):010:0> disable 'test'
0 row(s) in 1.1820 seconds

Drop the table

hbase(main):011:0> drop 'test'
0 row(s) in 0.1370 seconds

Exit HBase Shell

Use the quit command to exit the command line and disconnect from the cluster.
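
The commands from this section do not have to be typed interactively: hbase shell accepts the path to a file of commands as its argument. The following is a minimal sketch that writes such a file; the file name sample_commands.txt is arbitrary, and the final exit makes the shell quit when the script finishes.

```shell
# Write the shell commands used in this section to a script file.
# The file name is arbitrary; the quoted 'EOF' keeps the text literal.
cat > sample_commands.txt <<'EOF'
create 'test', 'cf'
list 'test'
put 'test', 'row1', 'cf:a', 'value1'
put 'test', 'row2', 'cf:b', 'value2'
put 'test', 'row3', 'cf:c', 'value3'
scan 'test'
get 'test', 'row1'
disable 'test'
drop 'test'
exit
EOF

# Then, from the HBase installation directory:
# ./bin/hbase shell ./sample_commands.txt
```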

IV. Configuration Files in Detail

Apache HBase uses the same configuration system as Apache Hadoop. All configuration files are located in the conf/ directory, which must be kept in sync across every node in the cluster.

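backup-masters
Not present by default. A plain-text file that lists the hosts on which the Master should start a backup Master process, one host per line.
hadoop-metrics2-hbase.properties
Used to connect HBase to Hadoop's Metrics2 framework. By default it contains only commented-out examples.
hbase-env.cmd and hbase-env.sh
Scripts for Windows and Linux/Unix environments that set up the working environment for HBase, including the location of Java, Java options, and other environment variables. The files contain many commented-out examples for guidance.
hbase-policy.xml
The default policy configuration file used by RPC servers to make authorization decisions on client requests. Only used when HBase security is enabled.
hbase-site.xml
The main HBase configuration file. It specifies configuration options that override HBase's defaults. You can view (but should not edit) the default configuration file at docs/hbase-default.xml. You can also view the entire effective configuration of your cluster (defaults and overrides) in the HBase Configuration tab of the HBase Web UI.
log4j.properties
Configuration file for HBase logging via log4j.
regionservers
A plain-text file containing the list of hosts that should run a RegionServer in the HBase cluster. By default this file contains the single entry localhost. It should contain a list of hostnames or IP addresses, one per line, and should contain only localhost if every node in the cluster runs a RegionServer on its localhost interface.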
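
For the three-node example layout from the fully distributed section (node-a as Master, node-b as backup Master and RegionServer, node-c as RegionServer), the two host-list files could be generated as sketched below. The function name write_cluster_files is made up for illustration; the host names are the ones used in the example table.

```shell
# Hypothetical helper: write regionservers and backup-masters for the
# example layout into the given conf directory ($1).
write_cluster_files() {
  conf_dir=$1
  # RegionServers run on node-b and node-c, one host per line.
  printf '%s\n' node-b.example.com node-c.example.com > "$conf_dir/regionservers"
  # node-b also hosts the backup Master.
  printf '%s\n' node-b.example.com > "$conf_dir/backup-masters"
}

# Usage, from the HBase installation directory: write_cluster_files conf
```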
