Using Kettle to periodically extract MySQL data to an HBase cluster table (3)

Why HBase

 

When we extract data into Hadoop on a schedule, we could save it as plain files, but files are very inconvenient to work with when reading the data back for analysis. Storing the data as HBase tables makes it much easier to query.
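To illustrate the kind of table-based access this enables, here is a sketch of an HBase shell session (the table and column names are made up for this example). The commands are collected in a variable so that, once the cluster is running, they can be piped into the real shell with `echo "$HBASE_CMDS" | hbase shell`:

```shell
# Illustrative HBase shell commands (hypothetical table 'mysql_data'
# with one column family 'cf'); printed here, run them via `hbase shell`.
HBASE_CMDS="create 'mysql_data', 'cf'
put 'mysql_data', 'row1', 'cf:name', 'wangkang'
get 'mysql_data', 'row1'
scan 'mysql_data'"
echo "$HBASE_CMDS"
```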

 

HBase cluster installation

 

Preparation:

1. Installation modes: HBase can be installed in standalone, pseudo-distributed, or fully distributed mode. Here we install distributed HBase on top of the Hadoop cluster built earlier.

2. What you need: 1) a fully built Hadoop cluster environment; 2) the HBase tarball for Linux.

 

Installation:

1. On the master host of the Hadoop cluster, copy the HBase tarball to any path (usually the same path as Hadoop; I use /home/wangkang/), then extract it:

$ tar zxvf hbase-1.3.5.tar.gz

2. Add the environment variables on the master host and on each slave machine. Open the /etc/profile file and append:

export HBASE_HOME=/home/wangkang/hbase-1.3.5

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib:$HBASE_HOME/bin

HBASE_HOME is the new entry that needs to be added; PATH was already configured earlier, so just append :$HBASE_HOME/bin to it. Note that HBASE_HOME must come before the PATH line that uses it. After adding the lines, make them take effect immediately:

$ source /etc/profile

This takes effect immediately, but under the CentOS 7 graphical interface the setting will no longer apply after the terminal is closed and reopened; to make it permanent you need to reboot the system. Remember to shut down the Hadoop services before rebooting.
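A quick way to confirm the variables took effect after sourcing the profile (the paths here are this article's; substitute your own):

```shell
# Set the variables as /etc/profile would (paths from this article).
export HBASE_HOME=/home/wangkang/hbase-1.3.5
export PATH="$PATH:$HBASE_HOME/bin"

# Verify: HBASE_HOME resolves and its bin directory is on PATH.
echo "HBASE_HOME=$HBASE_HOME"
case ":$PATH:" in
  *":$HBASE_HOME/bin:"*) PATH_OK=yes ;;
  *) PATH_OK=no ;;
esac
echo "hbase bin on PATH: $PATH_OK"
```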

3. Open /hbase-1.3.5/conf/hbase-env.sh under the HBase installation directory and add the following:

export JAVA_HOME=/home/wangkang/jdk1.8.0_131/           # JDK root directory

export HBASE_CLASSPATH=/home/wangkang/hbase-1.3.5/conf  # HBase configuration directory

export HBASE_MANAGES_ZK=true                            # true: use HBase's bundled ZooKeeper; false: configure your own ZooKeeper separately

export HBASE_HOME=/home/wangkang/hbase-1.3.5/           # HBase root directory

export HADOOP_HOME=/home/wangkang/hadoop-2.7.7          # Hadoop root directory

export HBASE_LOG_DIR=/home/wangkang/hbase-1.3.5/log     # log storage directory

 

4. Open hbase-site.xml in the same directory and add the following:

<!-- master is the hostname of the Hadoop master server; the port is the one already configured for Hadoop, default 9000 -->

     <property>

        <name>hbase.rootdir</name>

        <value>hdfs://master:9000/hbase</value>

    </property>

    <property>

        <name>hbase.cluster.distributed</name>

        <value>true</value>

    </property>

    <property>

        <name>hbase.master</name>

        <value>master:60000</value>

    </property>

    <property>

        <name>hbase.zookeeper.quorum</name>

<!-- master and slave1 are the hostnames of the machines in the cluster; I have only two here -->

 <value>master,slave1</value>

    </property>

<property>

        <name>hbase.zookeeper.property.dataDir</name>

<!-- ZooKeeper data storage path; if left unconfigured it goes under a tmp path that is cleared on every system reboot -->

<value>/home/wangkang/hbase-1.3.5/data</value>

    </property>
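Putting the snippets above together, the whole file (inside its `<configuration>` root element) looks like this; the hostnames and paths are this article's example values:

```xml
<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://master:9000/hbase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.master</name>
        <value>master:60000</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>master,slave1</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/home/wangkang/hbase-1.3.5/data</value>
    </property>
</configuration>
```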

 

5. Open the regionservers file in the same directory and add the hostname of each machine in the cluster, one per line:

master

slave1

 

6. Copy the HBase directory from the master to each of the other machines:

$ scp  -r  ./hbase-1.3.5  192.168.93.132:/home/wangkang/
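With more than one slave, the copy can be done in a loop. This is a sketch that, with DRY_RUN=1, only prints the scp commands it would run (the hostname list here matches this article's two-node setup):

```shell
# Distribute the HBase directory to every slave.
# DRY_RUN=1 prints the commands; set it to 0 to actually copy.
DRY_RUN=1
SLAVES="slave1"
for host in $SLAVES; do
  CMD="scp -r ./hbase-1.3.5 $host:/home/wangkang/"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$CMD"
  else
    $CMD
  fi
done
```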

 

At this point HBase is installed.

 

Startup and verification

 

Enter the bin directory under the HBase installation path and run:

$ start-hbase.sh

 

1. Verify the processes

Run jps on the master; the following processes should appear:

HMaster (the HBase master process)

HQuorumPeer (the ZooKeeper process)

HRegionServer

Run jps on slave1; the following processes should appear:

HQuorumPeer (the ZooKeeper process)

HRegionServer (the HBase slave process)

 

If all of the above processes appear, the startup was successful.
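The jps check can be scripted. In this sketch JPS_OUT holds sample output; on a real node you would set it with `JPS_OUT="$(jps)"` instead:

```shell
# Check that the expected daemons appear in the jps output.
# Sample output for illustration; use JPS_OUT="$(jps)" on a real node.
JPS_OUT="2101 HMaster
2299 HQuorumPeer
2453 HRegionServer"
MISSING=""
for proc in HMaster HQuorumPeer HRegionServer; do
  echo "$JPS_OUT" | grep -q "$proc" || MISSING="$MISSING $proc"
done
if [ -z "$MISSING" ]; then
  echo "all expected HBase processes are running"
else
  echo "missing:$MISSING"
fi
```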

 

2. Verify via the web page

Open the following address in a browser, where the IP is the master host's address:

http://192.168.93.131:16010/master-status
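The same check can be done from the command line. This sketch builds the URL from the master's IP (this article's example address); the commented curl line tests reachability on a live cluster:

```shell
# Build the master web UI URL (port 16010 is the HBase 1.x master UI).
MASTER_IP=192.168.93.131
UI_URL="http://${MASTER_IP}:16010/master-status"
echo "$UI_URL"
# On a running cluster:
# curl -fsS "$UI_URL" >/dev/null && echo "master UI reachable"
```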

 

3. Startup order

With a separately installed ZooKeeper:

Start order: hadoop -> zookeeper -> hbase

Stop order: hbase -> zookeeper -> hadoop

With the built-in ZooKeeper:

Start order: hadoop -> hbase

Stop order: hbase -> hadoop
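For the built-in ZooKeeper case (HBASE_MANAGES_ZK=true, as configured above), the order can be sketched as a dry-run script. It assumes the standard Hadoop 2.x start-dfs.sh/start-yarn.sh scripts; run() only echoes here, and replacing the echo with "$@" would actually execute them:

```shell
# Dry-run helper: prints each command instead of executing it.
run() { echo "would run: $*"; }

# Start: Hadoop first, then HBase (which starts ZooKeeper itself)
run start-dfs.sh
run start-yarn.sh
run start-hbase.sh

# Stop: the reverse order
run stop-hbase.sh
run stop-yarn.sh
run stop-dfs.sh
```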

 

 


Origin blog.csdn.net/github_39538842/article/details/92612271