Hadoop HA Installation Tutorial

A five-node Hadoop HA installation tutorial. The daemons planned for each node:

Master1    namenode, resourcemanager, nodemanager, datanode, journalnode, DFSZKFailoverController, QuorumPeerMain

Master2    namenode, resourcemanager, nodemanager, datanode, journalnode, DFSZKFailoverController, QuorumPeerMain

Slave1     nodemanager, datanode, journalnode, QuorumPeerMain

Slave2     nodemanager, datanode, journalnode, QuorumPeerMain

Slave3     nodemanager, datanode, journalnode, QuorumPeerMain

1. Install the JDK

Configure the environment variables (e.g. in ~/.bashrc):

##############JAVA

export JAVA_HOME=/home/zhouwang/jdk1.8.0_151

export PATH=$JAVA_HOME/bin:$PATH

export JRE_HOME=$JAVA_HOME/jre

export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
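
After appending these lines (the JDK path shown is the author's; adjust to your install location), reload the shell profile and verify the JDK is picked up:

source ~/.bashrc
java -version      # should report version 1.8.0_151
echo $JAVA_HOME    # should print /home/zhouwang/jdk1.8.0_151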

2. Configure the /etc/hosts file on each node:

192.168.71.128 master1

192.168.71.132 master2

192.168.71.129 slave1

192.168.71.130 slave2

192.168.71.131 slave3
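
A quick check that every hostname resolves and is reachable (a minimal sketch, run on each node):

for h in master1 master2 slave1 slave2 slave3; do
    ping -c 1 $h
done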

3. Configure passwordless SSH login

ssh-keygen -t rsa

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

chmod 644 ~/.ssh/authorized_keys

scp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys zhouwang@master1:~/.ssh

The permissions on authorized_keys must not allow group or world write access, or sshd will refuse key-based logins; the 644 set above (or a stricter 640/600) works.
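
Instead of copying key files by hand, ssh-copy-id (standard OpenSSH client tooling) can distribute the public key to every node in one loop; a sketch assuming the zhouwang account exists on all five hosts:

for h in master1 master2 slave1 slave2 slave3; do
    ssh-copy-id zhouwang@$h    # appends the local id_rsa.pub to the remote ~/.ssh/authorized_keys
done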

4. Install ZooKeeper

Rename zoo_sample.cfg in the conf directory to zoo.cfg, then set the following (new or modified entries):

clientPort=2181

dataDir=/home/zhouwang/zookeeper/data

dataLogDir=/home/zhouwang/zookeeper/log

server.0=master1:2888:3888

server.1=master2:2888:3888

server.2=slave1:2888:3888

server.3=slave2:2888:3888

server.4=slave3:2888:3888

Under the zookeeper directory, create the data and log directories referenced above, plus a myid file inside the data directory (ZooKeeper reads myid from dataDir):

mkdir data

mkdir log

vim data/myid    # enter 0 on master1; each node's myid must equal the N in its server.N line above
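
For illustration, the directories and per-node myid values implied by the zoo.cfg above can be created in one pass from master1 (a sketch; it assumes passwordless SSH already works and every node uses the same paths):

i=0
for h in master1 master2 slave1 slave2 slave3; do
    ssh zhouwang@$h "mkdir -p ~/zookeeper/data ~/zookeeper/log && echo $i > ~/zookeeper/data/myid"
    i=$((i+1))
done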

Configure the environment variables as follows:

##############ZOOKEEPER

export ZOOKEEPER_HOME=/home/zhouwang/zookeeper

export PATH=$PATH:$ZOOKEEPER_HOME/bin

5. Install Hadoop

Modify the following configuration files under etc/hadoop:

(1)     core-site.xml

<configuration>

   <!-- Specify master as the HDFS nameservice -->

    <property>

        <name>fs.defaultFS</name>

        <value>hdfs://master/</value>

    </property>

   <!-- Hadoop temporary directory -->

    <property>

        <name>hadoop.tmp.dir</name>

        <value>/home/zhouwang/hadoop/tmp</value>

    </property>

   <!-- ZooKeeper quorum addresses -->

    <property>

        <name>ha.zookeeper.quorum</name>

        <value>master1:2181,master2:2181,slave1:2181,slave2:2181,slave3:2181</value>

    </property>

</configuration>

(2) hdfs-site.xml

<configuration>

  <property>

        <name>dfs.namenode.name.dir</name>

        <value>/home/zhouwang/hadoop/dfs/name</value>

    </property>

    <property>

        <name>dfs.datanode.data.dir</name>

        <value>/home/zhouwang/hadoop/dfs/data</value>

    </property>

    <property>

        <name>dfs.replication</name>

        <value>3</value>

    </property>

    <!-- HDFS high-availability settings -->

    <!-- The HDFS nameservice id; must match the one in core-site.xml -->

    <property>

        <name>dfs.nameservices</name>

        <value>master</value>

    </property>

    <!-- Logical names of the two NameNodes in the master nameservice -->

    <property>

        <name>dfs.ha.namenodes.master</name>

        <value>nn1,nn2</value>

    </property>

    <!-- RPC addresses of nn1 and nn2 -->

    <property>

        <name>dfs.namenode.rpc-address.master.nn1</name>

        <value>master1:9000</value>

    </property>

    <property>

        <name>dfs.namenode.rpc-address.master.nn2</name>

        <value>master2:9000</value>

    </property>

    <!-- HTTP addresses of nn1 and nn2 -->

    <property>

        <name>dfs.namenode.http-address.master.nn1</name>

        <value>master1:50070</value>

    </property>

    <property>

        <name>dfs.namenode.http-address.master.nn2</name>

        <value>master2:50070</value>

    </property>

    <!-- ========= NameNode synchronization ========= -->

    <!-- Ensures edits can be recovered -->

    <property>

        <name>dfs.journalnode.http-address</name>

        <value>0.0.0.0:8480</value>

    </property>

    <property>

        <name>dfs.journalnode.rpc-address</name>

        <value>0.0.0.0:8485</value>

    </property>

    <property>

        <!-- Where the NameNode's shared edit log is stored on the JournalNodes -->

        <name>dfs.namenode.shared.edits.dir</name>

        <value>qjournal://master1:8485;master2:8485;slave1:8485;slave2:8485;slave3:8485/master</value>

    </property>

    <property>

        <!-- Local directory where each JournalNode stores its data -->

        <name>dfs.journalnode.edits.dir</name>

        <value>/home/zhouwang/hadoop/dfs/journal</value>

    </property>

    <property>

        <!-- Proxy class clients use to locate the active NameNode after a failover -->

        <name>dfs.client.failover.proxy.provider.master</name>

        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

    </property>

    <!-- ========= NameNode fencing ========= -->

    <!-- Fencing methods; separate multiple methods with newlines, one method per line -->

    <property>

        <name>dfs.ha.fencing.methods</name>

        <value>sshfence

                       shell(/bin/true)</value>

    </property>

    <!-- The sshfence mechanism requires passwordless SSH -->

    <property>

        <name>dfs.ha.fencing.ssh.private-key-files</name>

        <value>/home/zhouwang/.ssh/id_rsa</value>

    </property>

    <!-- Connection timeout for the sshfence mechanism (ms) -->

    <property>

        <name>dfs.ha.fencing.ssh.connect-timeout</name>

        <value>30000</value>

    </property>

    <!-- Enable automatic failover based on ZooKeeper and the ZKFC processes, which monitor NameNode liveness -->

    <property>

        <name>dfs.ha.automatic-failover.enabled</name>

        <value>true</value>

    </property>

   

</configuration>
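
Once hdfs-site.xml is in place, the HA wiring can be sanity-checked with hdfs getconf; the keys queried below are exactly the ones set above:

hdfs getconf -confKey dfs.nameservices          # should print master
hdfs getconf -confKey dfs.ha.namenodes.master   # should print nn1,nn2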

(3) mapred-site.xml

<configuration>

       <!-- Run MapReduce on the YARN framework -->

  <property>

    <name>mapreduce.framework.name</name>

    <value>yarn</value>

  </property>       

  <!-- MapReduce JobHistory Server address; default port 10020 -->

  <property>

        <name>mapreduce.jobhistory.address</name>

        <value>master1:10020</value>

  </property>

  <!-- MapReduce JobHistory Server web UI address; default port 19888 -->

  <property>

        <name>mapreduce.jobhistory.webapp.address</name>

        <value>master1:19888</value>

  </property>

</configuration>
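
Note that the JobHistory Server configured above is not launched by start-dfs.sh or start-yarn.sh; in Hadoop 2.x it is started separately on master1, roughly as follows:

mr-jobhistory-daemon.sh start historyserver    # web UI then available at http://master1:19888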

(4) yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->

       <!-- Auxiliary service run on each NodeManager; must be set to mapreduce_shuffle to run MapReduce jobs -->

    <property>

        <name>yarn.nodemanager.aux-services</name>

        <value>mapreduce_shuffle</value>

    </property>

    <property>

        <name>yarn.resourcemanager.connect.retry-interval.ms</name>

        <value>2000</value>

    </property>

    <property>

        <name>yarn.resourcemanager.ha.enabled</name>

        <value>true</value>

    </property>

    <!-- ResourceManager cluster id -->

    <property>

        <name>yarn.resourcemanager.cluster-id</name>

        <value>cluster</value>

    </property>

    <!-- Logical ids of the two ResourceManagers -->

    <property>

        <name>yarn.resourcemanager.ha.rm-ids</name>

        <value>rm1,rm2</value>

    </property>

    <!-- RM host 1 -->

    <property>

        <name>yarn.resourcemanager.hostname.rm1</name>

        <value>master1</value>

    </property>

    <!-- RM host 2 -->

    <property>

        <name>yarn.resourcemanager.hostname.rm2</name>

        <value>master2</value>

    </property>

    <!-- Automatic RM failover -->

    <property>

        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>

        <value>true</value>

    </property>

    <!-- Automatic RM state recovery -->

    <property>

    <name>yarn.resourcemanager.recovery.enabled</name> 

        <value>true</value> 

    </property>

    <!-- RM state store: either in-memory (MemStore) or ZooKeeper-based (ZKStore) -->

    <property>

        <name>yarn.resourcemanager.store.class</name>

        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>

    </property>

    <!-- ZooKeeper ensemble addresses -->

    <property>

        <name>yarn.resourcemanager.zk-address</name>

        <value>master1:2181,master2:2181,slave1:2181,slave2:2181,slave3:2181</value>

    </property>

    <!-- Addresses for resource scheduling requests to the RMs -->

    <property>

        <name>yarn.resourcemanager.scheduler.address.rm1</name>

        <value>master1:8030</value>

    </property>

    <property>

        <name>yarn.resourcemanager.scheduler.address.rm2</name>

        <value>master2:8030</value>

    </property>

    <!-- Addresses through which NodeManagers exchange information with the RMs -->

    <property>

        <name>yarn.resourcemanager.resource-tracker.address.rm1</name>

        <value>master1:8031</value>

    </property>

    <property>

        <name>yarn.resourcemanager.resource-tracker.address.rm2</name>

        <value>master2:8031</value>

    </property>

    <!-- Addresses clients use to submit applications to the RMs -->

    <property>

        <name>yarn.resourcemanager.address.rm1</name>

        <value>master1:8032</value>

    </property>

    <property>

        <name>yarn.resourcemanager.address.rm2</name>

        <value>master2:8032</value>

    </property>

    <!-- Addresses administrators use to send management commands to the RMs -->

    <property>

        <name>yarn.resourcemanager.admin.address.rm1</name>

        <value>master1:8033</value>

    </property>

    <property>

        <name>yarn.resourcemanager.admin.address.rm2</name>

        <value>master2:8033</value>

    </property>

    <!-- RM web UI addresses for viewing cluster information -->

    <property>

        <name>yarn.resourcemanager.webapp.address.rm1</name>

        <value>master1:8088</value>

    </property>

    <property>

        <name>yarn.resourcemanager.webapp.address.rm2</name>

        <value>master2:8088</value>

    </property>

</configuration>
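
Once YARN is up (step 6 below), the HA state of the two ResourceManagers can be queried with yarn rmadmin:

yarn rmadmin -getServiceState rm1    # prints active or standby
yarn rmadmin -getServiceState rm2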

(5) slaves

The slaves file lists one hostname per line:

#localhost

slave1

slave2

slave3

master1

master2

Set the environment variables:

###############HADOOP

export HADOOP_HOME=/home/zhouwang/hadoop

export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop

Distribute the hadoop directory to every node (see the loop sketched below):

scp -r hadoop zhouwang@XXX:~/hadoop
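
For example, a simple loop pushes the directory from master1 to the remaining nodes (a sketch; XXX above stands for each target host):

for h in master2 slave1 slave2 slave3; do
    scp -r ~/hadoop zhouwang@$h:~/
done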

6. Starting the cluster for the first time

On each node where ZooKeeper is configured, start the ZooKeeper service:

zkServer.sh start

Then check ZooKeeper's status with zkServer.sh status; if it reports follower or leader, the service started successfully.

Next, start the JournalNode service on every node that runs a JournalNode: hadoop-daemon.sh start journalnode

Format the NameNode on master1: hadoop namenode -format

Then start the NameNode service: hadoop-daemon.sh start namenode

On master2, sync master1's metadata: hdfs namenode -bootstrapStandby

Then start the NameNode service on master2: hadoop-daemon.sh start namenode

On master1, format the ZKFC state in ZooKeeper:

hdfs zkfc -formatZK

On master1 and master2, run hadoop-daemon.sh start zkfc to start the DFSZKFailoverController service. # ZKFC monitors the active/standby state of the NameNodes

On master1, run hadoop-daemons.sh start datanode to start the DataNode service on all data nodes.

On master1, run start-yarn.sh to start the YARN services. (With ResourceManager HA in Hadoop 2.x, start-yarn.sh starts only the local ResourceManager; the standby on master2 typically has to be started with yarn-daemon.sh start resourcemanager.)
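
Before wrapping up, it is worth checking with jps that each node runs the daemons listed in the role table at the top (a quick verification loop run from master1; QuorumPeerMain is the ZooKeeper process):

for h in master1 master2 slave1 slave2 slave3; do
    echo "== $h =="
    ssh zhouwang@$h jps
done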

At this point the installation is complete.

7. Stopping the cluster for the first time

First stop HDFS: stop-dfs.sh

Then stop YARN: stop-yarn.sh

From then on, the cluster can simply be started and stopped with start-all.sh and stop-all.sh.

8. HDFS administration commands

hadoop dfsadmin -report    # view DataNode status information

hdfs haadmin -getServiceState nn1    # check whether a NameNode is active or standby

hdfs haadmin -transitionToActive --forcemanual nn1 (or -transitionToStandby)    # force a NameNode into the active or standby state
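
A common way to exercise automatic failover end to end (a sketch, not part of the original post) is to kill the active NameNode and watch the standby take over:

hdfs haadmin -getServiceState nn1                     # suppose this reports active
kill -9 $(jps | awk '$2 == "NameNode" {print $1}')    # run on master1 to kill its NameNode
sleep 10
hdfs haadmin -getServiceState nn2                     # should now report active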


Reposted from my.oschina.net/zhouwang93/blog/1819907