Concise notes on deploying an HBase fully-distributed test environment

Because of HBase's unstructured nature, it is easier to accept it as a storage system than as a database.

HBase is a storage system: it does not require a strictly pre-defined table structure and lets you organize data loosely. As a storage layer it does need the support of an underlying file system, but that is not necessarily limited to HDFS (in the official tutorial, standalone HBase uses the local file system as storage, and vendors have supported a more stable HBase + GPFS combination). It is simply that HBase is the Apache implementation of Google's Bigtable, so the combination with HDFS comes naturally and the support is very rich.

This test uses the officially recommended stable version: hbase-1.4.10.

This test uses the fully-distributed deployment mode (as distinguished from the standalone and pseudo-distributed modes), with an external ZooKeeper, namely the ZooKeeper ensemble already shared by the Hadoop cluster.

(The HBase package actually bundles a ZooKeeper process, so HBase can start and manage its own ZooKeeper; that approach is not used here.)

(1) Create a system user and configure password-free access within the cluster

For ease of management, the start-hbase.sh script starts the processes on the other machines remotely, so password-free ssh needs to be configured on host007, host003, host001 and host002.

Create the hbase user: useradd -d /usr/local/hbase -m -s /bin/bash -G sudo,hadoop,java,zookeeper,yarn,hive hbase

"-G sudo, hadoop, java, zookeeper, yarn, hive": whether you need to configure the permissions of the user group, you need to look at the existing system is not necessary step, but the test environment, easy landing, modify other system files and check their progress, here is a way to grant permission.

Configure password-free ssh for the hbase user. (Not carried out here; there is plenty of documentation on it. In short, the private key simply stays in the ~hbase/.ssh directory, and the public key is appended to the authorized_keys file on every other machine that this machine needs to access. The .ssh directory and the key files must also have their permissions configured, e.g. 700.)
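For reference, a minimal sketch of the usual key-exchange steps, run as the hbase user on the machine that will launch the cluster (the target hostnames follow the regionservers and backup-master used later in these notes; adjust them to your own topology):

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa      # generate a key pair with an empty passphrase
ssh-copy-id hbase@host003                     # append the public key to authorized_keys on each target
ssh-copy-id hbase@host004
ssh-copy-id hbase@host005
ssh-copy-id hbase@host006
chmod 700 ~/.ssh                              # tighten permissions on the .ssh directory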

(2) Configure regionservers

All HBase configuration files are in the ~hbase/conf directory. Find the regionservers file and fill in the machines that will act as region servers in this cluster:

host004
host005
host006

One hostname per line; IP addresses can be used instead. If hostnames are used, do not forget to define the hostname-to-IP mappings in /etc/hosts.
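For example, a hypothetical /etc/hosts fragment for the region servers (the IP addresses here are placeholders, not from the actual setup):

192.168.0.104   host004
192.168.0.105   host005
192.168.0.106   host006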

 

(3) Configure the backup master

Create a file named backup-masters under $HBASE_HOME/conf. (The environment variable is not strictly necessary, but configuring $HBASE_HOME brings a lot of convenience: many other Hadoop-related defaults are defined in terms of $HBASE_HOME, and without it the default files cannot be found and the configuration fails. It is therefore strongly recommended to create a startup script under /etc/profile.d/ that sets the environment variables globally; see the sketch after the backup-masters entry below.)

Write:

host003
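A minimal sketch of such a global environment script, assuming the installation paths used elsewhere in these notes (the file name and exact contents are illustrative):

# /etc/profile.d/hbase.sh -- make the HBase environment variables global
export JAVA_HOME=/usr/local/java/jdk1.8.0_211
export HBASE_HOME=/usr/local/hbase/hbase-1.4.10
export PATH=$PATH:$HBASE_HOME/bin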

(4) Configure ZooKeeper

Using an external, separately managed ZooKeeper requires the following settings:

1. In hbase-env.sh, set HBASE_MANAGES_ZK=false.

2. In hbase-site.xml, add the following properties pointing at the external ZooKeeper's addresses and client port:

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>host004,host005,host006</value>
</property>

<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>

This points HBase at the existing ZooKeeper cluster on host004, host005 and host006, and simply defines the port each client uses to connect.

 

(5) HBase variables and parameters

These are mainly configured in the hbase-env.sh file.

 JAVA_HOME: export JAVA_HOME=/usr/local/java/jdk1.8.0_211

Heapsize: export HBASE_HEAPSIZE=600M

The main directory paths: one for logs, one for runtime PID files:

export HBASE_LOG_DIR=/usr/local/hbase/hbase-1.4.10/Logs

export HBASE_PID_DIR=/usr/local/hbase/hbase-1.4.10/Pids

Whether HBase should manage its own bundled ZooKeeper: export HBASE_MANAGES_ZK=false

 

(6) Define HBase as distributed

In fact this is only one parameter in hbase-site.xml:

<property>
 <name>hbase.cluster.distributed</name>
 <value>true</value>
</property>

 

(7) Configure HDFS

As said at the opening, HBase actually needs an underlying file system to hold the data it organizes, so we need to give HBase an access path to that file system.

HBase simply supports using a file system to store its data, whether an ordinary local file system (configured as something like file:///usr/local/hbase/data) or HDFS. You could even develop support for other file systems yourself, though of course that requires a lot of customization.

If the file system is HDFS, HBase needs to be told about the HDFS configuration so it can resolve the corresponding nameservice correctly. (Note: if HBase does not access HDFS through a nameservice, it does not need to know the HDFS configuration and does not require the information inside core-site.xml and hdfs-site.xml; it is enough to configure the address directly, e.g. hdfs://host001:9000.)

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://HadoopCluster1/hbase</value>
</property>
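For contrast, hedged sketches of the two simpler hbase.rootdir forms mentioned above (a plain local file system, or HDFS addressed directly by NameNode host and port instead of by nameservice); the values are illustrative only:

<!-- plain local file system -->
<property>
  <name>hbase.rootdir</name>
  <value>file:///usr/local/hbase/data</value>
</property>

<!-- HDFS addressed directly: no nameservice, no copied core-site.xml/hdfs-site.xml needed -->
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://host001:9000/hbase</value>
</property>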

Tying in with what was said above: since the HDFS access path uses the cluster nameservice, the configuration descriptions inside core-site.xml and hdfs-site.xml need to be provided, so both files must be copied from the Hadoop cluster into the $HBASE_HOME/conf/ directory. By default HBase reads these two files to resolve the nameservice to the corresponding hosts and ports.
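A minimal sketch of that copy step, assuming the Hadoop configuration lives under /usr/local/hadoop/etc/hadoop (that path is an assumption, not stated in these notes):

# copy the HDFS client configuration so HBase can resolve the nameservice
cp /usr/local/hadoop/etc/hadoop/core-site.xml $HBASE_HOME/conf/
cp /usr/local/hadoop/etc/hadoop/hdfs-site.xml $HBASE_HOME/conf/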

 

(8) Final inspection

The revised files under $HBASE_HOME/conf should include: regionservers, backup-masters, hbase-env.sh, hbase-site.xml.

The most basic configuration is as follows:
hbase-env.sh:

export JAVA_HOME=/usr/local/java/jdk1.8.0_211
export HBASE_HEAPSIZE=600M
export HBASE_LOG_DIR=/usr/local/hbase/hbase-1.4.10/Logs
export HBASE_PID_DIR=/usr/local/hbase/hbase-1.4.10/Pids
export HBASE_MANAGES_ZK=false

hbase-site.xml:

<configuration>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://HadoopCluster1/hbase</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>host004,host005,host006</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
</configuration>

 

regionservers:

host004
host005
host006

backup-masters:

host003

If HDFS is used via a nameservice, core-site.xml and hdfs-site.xml also need to be present in this directory.

 

(9) Distribute the configuration files to all machines in the cluster

Replicate the HBase configuration files (everything under conf) to all machines in the cluster. Whichever machine the cluster is started on becomes the Master, because the Master machine is not specified anywhere. The regionservers and backup-masters, by contrast, are all specified in the files above; the script will ssh to those hostnames and start the appropriate daemons, and with that the whole cluster is started.
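A minimal sketch of that distribution step, assuming HBase is installed under the same path on every node (the host list follows the cluster machines used in these notes):

# push the conf files to the other HBase cluster machines
for h in host003 host004 host005 host006; do
  scp -r "$HBASE_HOME"/conf/* "${h}:$HBASE_HOME/conf/"
done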

On the master:

hbase@host007:~/hbase-1.4.10/bin$ ./start-hbase.sh 

If everything is in order, the cluster starts according to the configuration.
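One way to verify the result (not part of the original notes): check the Java processes on each node with jps, and/or open the Master web UI, which for hbase-1.x listens on port 16010 by default:

jps    # expect HMaster on host007/host003 and HRegionServer on host004-006
# Master web UI: http://host007:16010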

 


Origin www.cnblogs.com/ZhouAlex/p/11387920.html