HBase installation and deployment

Installation and configuration

1. Prerequisites

1. JDK

Like Hadoop, HBase requires JDK 1.6 or higher, so install the JDK first and configure its environment variables.

2. Zookeeper

ZooKeeper is the coordinator of the HBase cluster and provides HMaster failover (avoiding a single point of failure), so ZooKeeper must be installed first.

3. Hadoop

A Hadoop environment is required in cluster mode.

2. Installation and deployment

HBase has two operating modes: stand-alone mode and distributed mode.

1. Cluster mode

[JDK download link](https://www.oracle.com/cn/java/technologies/javase/javase-jdk8-downloads.html) — the JDK build is usually supplied by the development team, and there may be specific version requirements.

[HBase download link](https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/)

Enter the stable directory (the current stable release), then download the binary package ending in tar.gz.

Install the JDK environment

tar xf jdk-8u121-linux-x64.tar.gz -C /opt/
ln -s /opt/jdk1.8.0_121 /opt/java
# Configure environment variables: append the following at the bottom of the file
vim /etc/profile
#java
export JAVA_HOME=/opt/java
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
source /etc/profile
java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
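Since the guide requires JDK 1.6 or higher, a quick sanity check can parse the `java -version` banner and confirm the requirement is met. This is an illustrative sketch; the `jdk_ok` helper is hypothetical, not part of the JDK:

```shell
# Hypothetical helper: parse a `java -version` banner line and check
# that the version is 1.6 or higher, as required by HBase.
jdk_ok() {
  v=${1#*\"}; v=${v%\"*}            # pull out e.g. 1.8.0_121 from between the quotes
  major=${v%%.*}
  rest=${v#*.}; minor=${rest%%.*}
  [ "$major" -gt 1 ] || [ "$minor" -ge 6 ]
}

jdk_ok 'java version "1.8.0_121"' && echo "JDK version OK"
```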

Install Zookeeper

ZooKeeper coordinates the HBase cluster and solves the HMaster single-point-of-failure problem, so it must be installed before HBase.

[ZooKeeper download address](https://archive.apache.org/dist/zookeeper/stable/)

Install and configure zk
# Upload the zk binary package, then:
tar xf zookeeper-3.6.2.tar.gz  -C /opt/
ln -s /opt/zookeeper-3.6.2/ /opt/zookeeper
cd /opt/zookeeper/conf/
# Add the zk environment variables to /etc/profile
export ZK_HOME=/opt/zookeeper
export PATH=$JAVA_HOME/bin:$ZK_HOME/bin:$PATH
# Rename zoo_sample.cfg to zoo.cfg and edit it as follows
vim zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper
clientPort=2181
server.1=master:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
# The server.id entries list the machines that make up the ZooKeeper ensemble; the id is the ServerID and must match the number in each server's myid file. Each entry carries two ports: the first is used by followers for runtime communication and data sync with the leader, the second for voting during leader election.
# Distribute the configuration file to the other hosts
scp zoo.cfg node2:/opt/zookeeper/conf
scp zoo.cfg node3:/opt/zookeeper/conf
Create the myid file
  • It must be created on all 3 servers
  • First create a data directory (any name; this guide uses the dataDir /tmp/zookeeper from zoo.cfg) to store the myid file, which uniquely identifies a zk node, along with the other zk data
  • Then write each node's unique id into its myid file: 1 for master, 2 for node2, 3 for node3

    mkdir /tmp/zookeeper
    # write 1 on master, 2 on node2, 3 on node3
    echo 1 >/tmp/zookeeper/myid
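Each node's myid must agree with the server.N entries in zoo.cfg. A small sketch of that mapping (the `myid_for` helper is illustrative, and the temporary directory stands in for /tmp/zookeeper on a real node):

```shell
# Hypothetical helper mapping the hostnames used in this guide to the
# ServerID that must go into each node's myid file (see server.N in zoo.cfg).
myid_for() {
  case "$1" in
    master) echo 1 ;;
    node2)  echo 2 ;;
    node3)  echo 3 ;;
    *)      echo "unknown host: $1" >&2; return 1 ;;
  esac
}

datadir=$(mktemp -d)          # stand-in for dataDir=/tmp/zookeeper
myid_for node2 > "$datadir/myid"
cat "$datadir/myid"           # prints 2
```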
zk management script
#!/bin/bash

# cluster host list
list=(master node2 node3)

case $1 in
"start"){
        for i in ${list[@]}
        do
          echo ---------- starting zookeeper [ $i ] ------------
                ssh $i "source /etc/profile;zkServer.sh start"
        done
};;
"stop"){
        for i in ${list[@]}
        do
          echo ---------- stopping zookeeper [ $i ] ------------
                ssh $i "source /etc/profile;zkServer.sh stop"
        done
};;
"status"){
        for i in ${list[@]}
        do
          echo ---------- zookeeper [ $i ] status ------------
                ssh $i "source /etc/profile;zkServer.sh status"
        done
};;
esac

Save the script as zk.sh under /opt/zookeeper/bin/ so it can be called from anywhere, and make it executable with chmod +x.

Start the cluster, then check its status
zk.sh start
zk.sh status
---------- zookeeper [master] status ------------
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
---------- zookeeper [node2] status ------------
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
---------- zookeeper [node3] status ------------
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
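A healthy ensemble shows exactly one leader across the three nodes. A sketch of checking that from captured `zkServer.sh status` output (the sample text below mirrors the output above):

```shell
# Count "Mode: leader" lines in combined zkServer.sh status output;
# a healthy 3-node ensemble has exactly one leader.
status_output='Mode: follower
Mode: follower
Mode: leader'

leaders=$(printf '%s\n' "$status_output" | grep -c '^Mode: leader')
if [ "$leaders" -eq 1 ]; then
  echo "healthy ensemble"
else
  echo "expected 1 leader, found $leaders"
fi
```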

Hadoop

In cluster mode, a Hadoop environment is needed.

[Download address](https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/stable2/), then download and unpack:

tar xf hadoop-2.7.4.tar.gz -C /opt/
ln -s /opt/hadoop-2.7.4 /opt/hadoop
Configure environment variables
#hadoop
export HADOOP_HOME=/opt/hadoop
export PATH=$JAVA_HOME/bin:$ZK_HOME/bin:$HADOOP_HOME/bin:$PATH
Modify the configuration
  1. Specify the JDK installation location
vim hadoop-env.sh
export JAVA_HOME=/opt/java/
  2. core-site.xml
<configuration>
    <property>
        <!-- communication address of the NameNode's HDFS filesystem -->
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <!-- directory where the Hadoop cluster stores temporary files -->
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/tmp</value>
    </property>
</configuration>
  3. hdfs-site.xml
<configuration>

   <property>
     <!-- storage locations for NameNode data (i.e. metadata); multiple comma-separated directories can be listed for fault tolerance -->
     <name>dfs.namenode.name.dir</name>
     <value>/mybk/hdfs/name1,/mybk/hdfs/name2,/mybk/namenodebak/namedir</value>
   </property>
   <property>
     <!-- storage location for DataNode data blocks -->
     <name>dfs.datanode.data.dir</name>
     <value>/mybk/hdfs/data</value>
   </property>
   <property>
     <name>dfs.datanode.max.xcievers</name>
     <value>8192</value>
   </property>
   <property>
     <name>dfs.support.append</name>
     <value>true</value>
   </property>
   <property>
     <name>dfs.hosts</name>
     <value>/opt/hadoop/conf/dfs.hosts</value>
   </property>
   <property>
     <name>dfs.hosts.exclude</name>
     <value>/opt/hadoop/conf/dfs.hosts.exclude</value>
   </property>
   <property>
     <name>dfs.replication</name>
     <value>2</value>
   </property>
   <property>
       <name>dfs.balance.bandwidthPerSec</name>
       <value>1048576</value>
   </property>
</configuration>
  4. yarn-site.xml
<configuration>
    <property>
        <!-- auxiliary services run on the NodeManager; must be set to mapreduce_shuffle before MapReduce programs can run on YARN -->
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <!-- hostname of the resourcemanager -->
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
</configuration>
  5. mapred-site.xml
<configuration>
    <property>
        <!-- run mapreduce jobs on yarn -->
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
  6. slaves

    Configure the host name or IP address of every slave node, one per line; the DataNode and NodeManager services will be started on all of the nodes listed here.

    cat slaves 
    master
    node2
    node3
Distribute the program

Distribute the Hadoop directory to the other two servers

scp -r /opt/hadoop node2:/opt/
scp -r /opt/hadoop node3:/opt/
Initialization

Execute the NameNode format command on the master node

hdfs namenode -format
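Re-running `hdfs namenode -format` wipes the cluster's metadata, so the step is worth guarding. A sketch of such a guard (`maybe_format` is a hypothetical wrapper; the temporary directory stands in for a dfs.namenode.name.dir entry, which contains current/VERSION once formatted):

```shell
# Hypothetical guard: only format the NameNode if its name directory
# has not been formatted before (a formatted dir contains current/VERSION).
maybe_format() {
  if [ -f "$1/current/VERSION" ]; then
    echo "already formatted, skipping"
  else
    echo "would run: hdfs namenode -format"
  fi
}

name_dir=$(mktemp -d)     # stand-in for /mybk/hdfs/name1
maybe_format "$name_dir"  # prints: would run: hdfs namenode -format
```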
Start the cluster

Go to the sbin/ directory and start Hadoop. The related services on node2 and node3 will also be started:

# start the HDFS services
/opt/hadoop/sbin/start-dfs.sh
# start the YARN services
/opt/hadoop/sbin/start-yarn.sh
# start the job history server
./mr-jobhistory-daemon.sh start historyserver
View cluster

Use the jps command on each machine to check the service processes, or open the web UI on port 50070.

[root@master sbin]# jps
62560 NameNode
63265 ResourceManager
63377 NodeManager
62899 SecondaryNameNode
63428 Jps
29205 QuorumPeerMain
54859 DataNode
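Checking each daemon by eye is error-prone; a sketch that scans captured jps output for the expected processes (the sample output is the master listing above):

```shell
# Scan jps output for the daemons expected on the master node after
# start-dfs.sh / start-yarn.sh, plus the ZooKeeper QuorumPeerMain.
jps_out='62560 NameNode
63265 ResourceManager
63377 NodeManager
62899 SecondaryNameNode
29205 QuorumPeerMain
54859 DataNode'

missing=""
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager QuorumPeerMain; do
  printf '%s\n' "$jps_out" | grep -qw "$d" || missing="$missing $d"
done

if [ -z "$missing" ]; then echo "all expected daemons running"; else echo "missing:$missing"; fi
```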

Install Hbase

[HBase download link](https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/1.4.13/hbase-1.4.13-bin.tar.gz)

Install Hbase
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/1.4.13/hbase-1.4.13-bin.tar.gz
tar xf hbase-1.4.13-bin.tar.gz  -C /opt/
ln -s /opt/hbase-1.4.13 /opt/hbase
Configure hbase-env.sh environment variables
export JAVA_HOME=/opt/java
export HBASE_CLASSPATH=/opt/hbase/conf
# use the external ZooKeeper ensemble installed above, not the one HBase manages itself
export HBASE_MANAGES_ZK=false
hbase-site.xml configure hbase
<configuration>
<property>
    <name>hbase.rootdir</name>
    <!-- directory where hbase stores its data -->
    <value>hdfs://master:9000/hbase/hbase_db</value>
    <!-- the port must match Hadoop's fs.defaultFS port -->
</property>
<property>
    <name>hbase.cluster.distributed</name>
    <!-- whether this is a distributed deployment -->
    <value>true</value>
</property>
<property>
    <!-- zookeeper quorum addresses; the number of nodes should be odd -->
    <name>hbase.zookeeper.quorum</name>
    <value>master:2181,node2:2181,node3:2181</value>
</property>
<property>
    <!--hbase master -->
    <name>hbase.master</name>
    <value>master</value>
</property>
<property>
    <!--hbase web UI port -->
    <name>hbase.master.info.port</name>
    <value>16666</value>
</property>
</configuration>
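As the comment on hbase.rootdir notes, its host:port must match Hadoop's fs.defaultFS. A tiny sketch of that consistency check, using the values from this guide's configs:

```shell
# Verify that hbase.rootdir lives under the filesystem named by
# fs.defaultFS (same scheme, host, and port), as the configs require.
fs_default="hdfs://master:9000"                    # from core-site.xml
hbase_root="hdfs://master:9000/hbase/hbase_db"     # from hbase-site.xml

case "$hbase_root" in
  "$fs_default"/*) result="ports match" ;;
  *)               result="MISMATCH: $hbase_root is not under $fs_default" ;;
esac
echo "$result"
```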
Specify cluster nodes
[root@master conf]# cat regionservers 
master
node2
node3
[root@master conf]# cat backup-masters
node2

The backup-masters file does not exist by default and must be created. It lists the standby HMaster nodes; there can be more than one. Here we use a single backup as an example.

HDFS client configuration

This is an optional configuration. If you change the HDFS client configuration on the Hadoop cluster, for example setting the replication factor dfs.replication to 5, you must use one of the following methods to make HBase aware of it; otherwise HBase will still create files with the default replication factor of 3:

The first method

Add the location of the Hadoop configuration directory to the HBASE_CLASSPATH attribute in hbase-env.sh, for example:

export HBASE_CLASSPATH=/opt/hadoop/etc/hadoop

The second method

Copy Hadoop's hdfs-site.xml (or hadoop-site.xml) into the /opt/hbase/conf directory, or use symbolic links. If you use this method, it is recommended to copy core-site.xml as well, or symlink both. For example:

# copy
cp core-site.xml hdfs-site.xml /opt/hbase/conf/
# or use symbolic links
ln -s /opt/hadoop/etc/hadoop/core-site.xml /opt/hbase/conf/
ln -s /opt/hadoop/etc/hadoop/hdfs-site.xml /opt/hbase/conf/
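With the symlink approach, a dangling link would silently leave HBase without the HDFS settings, so it is worth verifying the link resolves. A sketch using temporary directories as stand-ins for the real /opt paths:

```shell
# Create a symlink to a Hadoop config file and verify it resolves,
# the same pattern as linking hdfs-site.xml into /opt/hbase/conf/.
hadoop_conf=$(mktemp -d)   # stand-in for /opt/hadoop/etc/hadoop
hbase_conf=$(mktemp -d)    # stand-in for /opt/hbase/conf

printf '<configuration/>\n' > "$hadoop_conf/hdfs-site.xml"
ln -s "$hadoop_conf/hdfs-site.xml" "$hbase_conf/hdfs-site.xml"

# -e follows the link: true only if the target actually exists
[ -e "$hbase_conf/hdfs-site.xml" ] && echo "link ok"
```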

Configure the hbase command environment

vim /etc/profile
export HBASE_HOME=/opt/hbase
#PATH
export PATH=$JAVA_HOME/bin:$ZK_HOME/bin:$HADOOP_HOME/bin:$HBASE_HOME/bin:$PATH
Copy hbase to the other machines
 scp -r /opt/hbase node2:/opt/hbase
 scp -r /opt/hbase node3:/opt/hbase
Start Hbase

Run this on the master server only

start-hbase.sh
View service
[root@master conf]# jps
2304 ResourceManager
4886 HMaster
5031 HRegionServer
1928 DataNode
3833 QuorumPeerMain
2142 SecondaryNameNode
10046 Jps
1807 NameNode

Open the web UI in a browser at ip:16666 (the hbase.master.info.port configured above).

References

https://juejin.cn/post/6844903907735371789#heading-17

https://juejin.cn/post/6844903949728743432
https://blog.csdn.net/m0_37809146/category_9014679.html

Origin blog.51cto.com/13654115/2639961