Building an HBase distributed cluster (zookeeper + hadoop + hbase), super detailed!

Environment preparation

zookeeper:zookeeper-3.4.14

hadoop:hadoop-2.8.5

hbase:hbase-1.4.13

master: namenode, resourcemanager

slave1: secondarynamenode, datanode

slave2: datanode

One, Hadoop cluster construction

1. Decompress the hadoop installation package

tar zxvf hadoop-2.8.5.tar.gz

2. Add JAVA_HOME to the three configuration files hadoop-env.sh, mapred-env.sh, and yarn-env.sh

cd hadoop-2.8.5/etc/hadoop/
export JAVA_HOME=/usr/local/jdk1.8.0_191
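
For convenience, a minimal sketch that appends the same line to all three files (assuming hadoop is unpacked under /usr/local, as in the rest of this tutorial):

cd /usr/local/hadoop-2.8.5/etc/hadoop
for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
    # same JDK path as above
    echo 'export JAVA_HOME=/usr/local/jdk1.8.0_191' >> "$f"
done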

3. Modify core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml

core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadoopdata</value>
    </property>
</configuration>

hdfs-site.xml

<configuration>
    <!--
        These two properties specify where the namenode and the datanodes store their data.
        If they are not set, the data is placed under the hadoop.tmp.dir directory configured in core-site.xml.
    -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/hadoopdata/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/hadoopdata/data</value>
    </property>
    <!-- Number of data replicas; it should not exceed the number of datanodes -->
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <!-- Specify the node that runs the secondarynamenode -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>slave1:50090</value>
    </property>
</configuration>

mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

4. Modify the slaves file and add the datanode hosts (one way to write it is sketched below)

slave1
slave2
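
For reference, a sketch of writing the file from the hadoop configuration directory:

cd /usr/local/hadoop-2.8.5/etc/hadoop
# overwrite slaves with the two datanode hosts
cat > slaves <<'EOF'
slave1
slave2
EOF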

5. Configure password-free SSH login between the cluster nodes (this must be configured, otherwise later steps will fail)

ssh-keygen                        # press Enter four times to accept the defaults
ssh-copy-id [email protected]    # slave1
ssh-copy-id [email protected]    # slave2
# Repeat on every node, copying the local SSH key to the other machines
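
A quick check that key-based login works (it should print the remote hostname without prompting for a password; use the IPs if the hosts entries from step 7 are not in place yet):

ssh slave1 hostname
ssh slave2 hostname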

6. Distribute the hadoop installation package to other nodes

scp -r hadoop-2.8.5 slave1:/usr/local/
scp -r hadoop-2.8.5 slave2:/usr/local/

7. Add the following IP/hostname mappings to /etc/hosts on all nodes

192.168.214.129 master
192.168.214.130 slave1
192.168.214.131 slave2
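
A sketch of appending these entries on each node (run as root; the IPs are the ones used throughout this tutorial):

cat >> /etc/hosts <<'EOF'
192.168.214.129 master
192.168.214.130 slave1
192.168.214.131 slave2
EOF
# verify that the names resolve
ping -c 1 slave1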

8. Add Hadoop environment variables to all nodes

vi /etc/profile
export HADOOP_HOME=/usr/local/hadoop-2.8.5
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile

9. Format the file system on the namenode (master) node

cd bin/
hdfs namenode -format

Output containing a line like "has been successfully formatted" indicates that the format succeeded.

10. Start HDFS on the master node, and start YARN on the resourcemanager node

start-dfs.sh
start-yarn.sh
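
After starting, jps on each node should show roughly the following daemons, given the role assignment at the top of this post:

jps
# expected output (approximately):
#   master: NameNode, ResourceManager
#   slave1: DataNode, SecondaryNameNode, NodeManager
#   slave2: DataNode, NodeManager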

Shut down:

cd sbin
stop-yarn.sh
stop-dfs.sh

11. When Hadoop needs to be reinstalled:

1) Delete the hadoop log in each machine

By default, the logs are under HADOOP_HOME/logs. If they are not deleted, old log files accumulate and eat up disk space.

2) Delete the data and files generated by the original namenode and datanode.

Delete the directory configured as hadoop.tmp.dir; if dfs.datanode.data.dir and dfs.namenode.name.dir are also configured, delete those directories as well.

12. When starting Hadoop, a datanode fails to start

Check the hadoop.tmp.dir parameter in core-site.xml. Under that directory, locate the VERSION file in the current/ subdirectory and change its clusterID to match the namenode's:
/home/hadoop/hadoopdata/name/current
/home/hadoop/hadoopdata/data/current
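
A quick way to compare the two IDs (the paths follow this tutorial's layout):

grep clusterID /home/hadoop/hadoopdata/name/current/VERSION   # on the namenode (master)
grep clusterID /home/hadoop/hadoopdata/data/current/VERSION   # on the failing datanode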

Two, Zookeeper cluster construction

1. Unzip the zookeeper installation package

tar -zxvf zookeeper-3.4.14.tar.gz

2. Copy conf/zoo_sample.cfg to conf/zoo.cfg and modify zoo.cfg; also create new data/ and log/ directories (a command sketch follows the configuration below)

dataDir=/usr/local/zookeeper-3.4.14/data
dataLogDir=/usr/local/zookeeper-3.4.14/log
server.0=192.168.214.129:2888:3888
server.1=192.168.214.130:2888:3888
server.2=192.168.214.131:2888:3888
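
A minimal sketch of the copy and directory creation described in this step (assuming the install path used throughout this tutorial):

cd /usr/local/zookeeper-3.4.14
cp conf/zoo_sample.cfg conf/zoo.cfg
mkdir -p data log
# then edit conf/zoo.cfg and set the dataDir/dataLogDir/server.* lines shown above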

3. Distribute the zookeeper installation package to other nodes

scp -r zookeeper-3.4.14 slave1:/usr/local/
scp -r zookeeper-3.4.14 slave2:/usr/local/

4. On every machine, create a new myid file under /usr/local/zookeeper-3.4.14/data/ containing that server's number (0, 1, or 2) from the server.N lines in step 2 above

vi /usr/local/zookeeper-3.4.14/data/myid
0    (on master)
1    (on slave1)
2    (on slave2)
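
Equivalently, the myid file can be written in one command (shown for master; write 1 on slave1 and 2 on slave2):

echo 0 > /usr/local/zookeeper-3.4.14/data/myid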

5. Start and stop

bin/zkServer.sh start
bin/zkServer.sh stop
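
After starting zookeeper on all three machines, zkServer.sh status reports whether each node is the leader or a follower:

bin/zkServer.sh status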


Three, HBase cluster construction

1. Decompress the hbase installation package

tar -zxvf hbase-1.4.13-bin.tar.gz

2. Modify hbase-env.sh, hbase-site.xml, backup-masters, and regionservers

cd hbase-1.4.13/conf

hbase-env.sh

# Add JAVA_HOME
export JAVA_HOME=/usr/local/jdk1.8.0_191
# Do not use HBase's bundled zookeeper; since a zookeeper cluster was built in section Two, set this to false
export HBASE_MANAGES_ZK=false

hbase-site.xml

First create the hdfs://master:9000/user/hbase directory on HDFS:

hadoop fs -mkdir -p /user/hbase

<configuration>
  <!--
  Optional. If you do configure it, hbase.zookeeper.property.dataDir must be the
  same path as the dataDir set in the zookeeper configuration file zoo.cfg.
  E.g., if zoo.cfg has dataDir=/var/zookeeper, then:
  <property>
      <name>hbase.master</name>
      <value>master</value>
  </property>
  <property>
      <name>hbase.master.port</name>
      <value>15999</value>
  </property>
  <property>
      <name>hbase.zookeeper.property.dataDir</name>
      <value>/var/zookeeper</value>
  </property>
  -->
   
  <!-- Where hbase stores its data on HDFS -->
  <property>
      <name>hbase.rootdir</name>
      <value>hdfs://master:9000/user/hbase</value>
  </property>
   
  <!-- Run hbase as a distributed cluster -->
  <property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
  </property>
   
  <!-- Specify the zookeeper cluster (all three nodes configured in section Two) -->
  <property>
      <name>hbase.zookeeper.quorum</name>
      <value>master:2181,slave1:2181,slave2:2181</value>
  </property>
   
  <!-- Encoding format used when reading hbase data (Thrift framed/compact transport) -->
  <property>
      <name>hbase.regionserver.thrift.framed</name>
      <value>true</value>
  </property>
  <property>
      <name>hbase.regionserver.thrift.compact</name>
      <value>true</value>
  </property>

</configuration>

backup-masters: add the backup HBase master

vi backup-masters
slave1

regionservers: add the list of RegionServer nodes

vi regionservers
# remove localhost
master
slave1
slave2

3. Copy the hadoop configuration files core-site.xml and hdfs-site.xml into the hbase conf directory

cp /usr/local/hadoop-2.8.5/etc/hadoop/core-site.xml ./
cp /usr/local/hadoop-2.8.5/etc/hadoop/hdfs-site.xml ./

4. Distribute the hbase installation package to other nodes

scp -r hbase-1.4.13 slave1:/usr/local/
scp -r hbase-1.4.13 slave2:/usr/local/

5. Add environment variables

vi /etc/profile
export HBASE_HOME=/usr/local/hbase-1.4.13
export PATH=$PATH:$HBASE_HOME/bin
source /etc/profile

6. Start the hbase cluster

start-hbase.sh

Open another session and use the hbase shell to enter the hbase console; if the shell comes up, the installation succeeded. A few commands to verify the cluster are sketched below.
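
A minimal verification sketch (the table name 'test' and column family 'cf' are arbitrary examples, not part of the original tutorial):

hbase shell
# inside the shell:
status                              # shows live servers and regions
create 'test', 'cf'                 # create a table with one column family
put 'test', 'row1', 'cf:a', 'v1'    # write a cell
scan 'test'                         # read it back
disable 'test'
drop 'test'                         # clean up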

7. Error when starting HBase

Caused by: java.io.IOException: Problem binding to /10.12.4.75:60000 : Cannot assign requested address. To switch ports use the 'hbase.master.port' configuration property.

Solution: edit /etc/hosts and, after the machine's local IP, append the name returned by `hostname -f`.
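
For example (the hostname below is hypothetical): if hostname -f prints node1 on the machine whose IP is 10.12.4.75, /etc/hosts needs a line like the following:

hostname -f        # suppose it prints node1 (hypothetical)
# then add to /etc/hosts:
# 10.12.4.75 node1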


Origin: blog.csdn.net/wh672843916/article/details/106060457