Building a high-availability (HA) Hadoop cluster

First, preparation

  1. Seven virtual machines (at least three are required; this guide uses seven as an example). Give each a working IP, add the hostname-to-IP mappings in /etc/hosts, and turn off the firewall (a sketch of these steps follows this list).

  2. Install the JDK and configure the environment variables.
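
As a point of reference, a minimal sketch of the /etc/hosts entries and firewall commands, assuming the seven-machine plan below and a CentOS-style system; the 192.168.1 subnet is only a placeholder, substitute your real addresses:

# /etc/hosts on every node
192.168.1.121 hadoop01
192.168.1.122 hadoop02
192.168.1.123 hadoop03
192.168.1.124 hadoop04
192.168.1.125 hadoop05
192.168.1.126 hadoop06
192.168.1.127 hadoop07

# turn off the firewall (CentOS 7; on CentOS 6 use "service iptables stop" and "chkconfig iptables off")
systemctl stop firewalld
systemctl disable firewalld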

Second, the cluster plan:

Cluster plan (seven machines):
    Hostname     IP              Software installed        Processes to run
    hadoop01     192.168.*.121   JDK, hadoop               NameNode, DFSZKFailoverController (zkfc)
    hadoop02     192.168.*.122   JDK, hadoop               NameNode, DFSZKFailoverController (zkfc)
    hadoop03     192.168.*.123   JDK, hadoop               ResourceManager
    hadoop04     192.168.*.124   JDK, hadoop               ResourceManager
    hadoop05     192.168.*.125   JDK, hadoop, zookeeper    DataNode, NodeManager, JournalNode, QuorumPeerMain
    hadoop06     192.168.*.126   JDK, hadoop, zookeeper    DataNode, NodeManager, JournalNode, QuorumPeerMain
    hadoop07     192.168.*.127   JDK, hadoop, zookeeper    DataNode, NodeManager, JournalNode, QuorumPeerMain

Cluster plan for a three-machine cluster:

    Hostname     IP              Software installed    Processes to run
    hadoop01     192.168.*.201   JDK, hadoop           NameNode, DFSZKFailoverController (zkfc), JournalNode, QuorumPeerMain (zookeeper)
    hadoop02     192.168.*.202   JDK, hadoop           NameNode, DFSZKFailoverController (zkfc), JournalNode, QuorumPeerMain (zookeeper)
    hadoop03     192.168.*.203   JDK, hadoop           DataNode, JournalNode, QuorumPeerMain (zookeeper)

Third, the installation steps

  1. Configure the zookeeper cluster (on hadoop05)

   1.1 unzip 

tar -zxvf zookeeper.tar.gz -C /hadoop/

  1.2 modify the configuration

cd /hadoop/zookeeper/conf/
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg
# modify: dataDir=/hadoop/zookeeper/tmp (point it at the tmp directory created below, where myid will live)
# append at the end:

server.1=hadoop05:2888:3888
server.2=hadoop06:2888:3888
server.3=hadoop07:2888:3888

# save and exit
# then create the tmp folder

mkdir /hadoop/zookeeper/tmp

# then create an empty file

touch /hadoop/zookeeper/tmp/myid

# finally write the id into the file

echo 1 > /hadoop/zookeeper/tmp/myid
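
After these edits the relevant part of zoo.cfg should look roughly like this (a sketch; the other settings from zoo_sample.cfg, such as tickTime and clientPort=2181, keep their defaults):

dataDir=/hadoop/zookeeper/tmp
server.1=hadoop05:2888:3888
server.2=hadoop06:2888:3888
server.3=hadoop07:2888:3888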

   1.3 Copy the configured zookeeper to the other nodes (first create a hadoop directory under the root of hadoop06 and hadoop07: mkdir /hadoop)

scp -r /hadoop/zookeeper/ hadoop06:/hadoop/
scp -r /hadoop/zookeeper/ hadoop07:/hadoop/
            
# Note: change the content of /hadoop/zookeeper/tmp/myid on hadoop06 and hadoop07 accordingly
# hadoop06:
echo 2 > /hadoop/zookeeper/tmp/myid
# hadoop07:
echo 3 > /hadoop/zookeeper/tmp/myid
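
A quick check (run on each of the three nodes): the output should be 1 on hadoop05, 2 on hadoop06 and 3 on hadoop07.

cat /hadoop/zookeeper/tmp/myid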

  

2. Install and configure the hadoop cluster (operate on hadoop01; this guide uses hadoop 3.2.1)

  2.1 unzip

tar -zxvf hadoop-3.2.1.tar.gz -C /hadoop/

  2.2 Configure HDFS (since hadoop 2.0, all configuration files live under $HADOOP_HOME/etc/hadoop)

# Add hadoop to the environment variables
vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.8
export HADOOP_HOME=/hadoop/hadoop-3.2.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
# all hadoop configuration files are under $HADOOP_HOME/etc/hadoop
cd /home/hadoop/app/hadoop-3.2.1/etc/hadoop
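
After saving /etc/profile, reload it so the new variables take effect in the current shell; a quick check (a sketch, assuming the paths above point at real installations):

source /etc/profile
java -version
hadoop version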

  2.2.1 Modify hadoop-env.sh

# point hadoop at the same JDK configured in /etc/profile
export JAVA_HOME=/usr/java/jdk1.8

  2.2.2 modify the core-site.xml

<configuration>
    <!-- Specify ns1 as the nameservice for hdfs -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
    </property>
    <!-- Specify the hadoop temp directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/app/hadoop-3.2.1/tmp</value>
    </property>
    
    <!-- Specify the zookeeper addresses -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop05:2181,hadoop06:2181,hadoop07:2181</value>
    </property>
</configuration>

2.2.3 modify hdfs-site.xml

<configuration>
    <!-- Specify ns1 as the nameservice for hdfs; must match core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <!-- ns1 has two NameNodes: nn1 and nn2 -->
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>hadoop01:9000</value>
    </property>
    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>hadoop01:9870</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>hadoop02:9000</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>hadoop02:9870</value>
    </property>
    <!-- Where the NameNode metadata (edits) is stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop05:8485;hadoop06:8485;hadoop07:8485/ns1</value>
    </property>
    <!-- Where the JournalNode stores its data on the local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/app/hadoop-3.2.1/journaldata</value>
    </property>
    <!-- Enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Failover proxy provider used by clients to find the active NameNode -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Fencing methods, one per line; each method is tried in turn -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>
    <!-- The sshfence method requires passwordless ssh login -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <!-- Timeout for the sshfence method -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>

2.2.4 modify mapred-site.xml

<configuration>
    <!-- Use yarn as the MapReduce framework -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

2.2.5 modify yarn-site.xml

<configuration>
    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <!-- Specify the cluster id of the RMs -->
    <property>
       <name>yarn.resourcemanager.cluster-id</name>
       <value>yrc</value>
    </property>
    <!-- Specify the ids of the RMs -->
    <property>
       <name>yarn.resourcemanager.ha.rm-ids</name>
       <value>rm1,rm2</value>
    </property>
    <!-- Specify the hostname of each RM -->
    <property>
       <name>yarn.resourcemanager.hostname.rm1</name>
       <value>hadoop03</value>
    </property>
    <property>
       <name>yarn.resourcemanager.hostname.rm2</name>
       <value>hadoop04</value>
    </property>
    <!-- Specify the zk cluster addresses -->
    <property>
       <name>yarn.resourcemanager.zk-address</name>
       <value>hadoop05:2181,hadoop06:2181,hadoop07:2181</value>
    </property>
    <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
    </property>
</configuration>

2.2.6 Modify workers (the workers file lists the worker nodes. Because HDFS is started on hadoop01 and YARN on hadoop03, the workers file on hadoop01 specifies where the DataNodes run, and the workers file on hadoop03 specifies where the NodeManagers run.)

In hadoop 2.x this file is named slaves.

hadoop05
hadoop06
hadoop07

2.2.7 Configure passwordless SSH login

# First configure passwordless login from hadoop01 to hadoop02, hadoop05, hadoop06 and hadoop07
# Generate a key pair on hadoop01
ssh-keygen -t rsa
# Copy the public key to the other nodes, including hadoop01 itself
ssh-copy-id hadoop01
ssh-copy-id hadoop02
ssh-copy-id hadoop05
ssh-copy-id hadoop06
ssh-copy-id hadoop07
# Configure passwordless login from hadoop03 to hadoop04, hadoop05, hadoop06 and hadoop07
# Generate a key pair on hadoop03
ssh-keygen -t rsa
# Copy the public key to the other nodes
ssh-copy-id hadoop04
ssh-copy-id hadoop05
ssh-copy-id hadoop06
ssh-copy-id hadoop07
# Note: passwordless ssh is also needed between the two namenodes;
# don't forget to configure passwordless login from hadoop02 to hadoop01
# Generate a key pair on hadoop02
ssh-keygen -t rsa
ssh-copy-id -i hadoop01
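
A quick sanity check from hadoop01 (a sketch): each command should print the remote hostname without prompting for a password.

ssh hadoop02 hostname
ssh hadoop05 hostname
ssh hadoop06 hostname
ssh hadoop07 hostname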

2.4 Copy the configured hadoop to the other nodes

scp -r /hadoop/hadoop-3.2.1/ hadoop02:/home/hadoop/app/
scp -r /hadoop/hadoop-3.2.1/ hadoop03:/home/hadoop/app/
scp -r /hadoop/hadoop-3.2.1/ hadoop04:/home/hadoop/app/
scp -r /hadoop/hadoop-3.2.1/ hadoop05:/home/hadoop/app/
scp -r /hadoop/hadoop-3.2.1/ hadoop06:/home/hadoop/app/
scp -r /hadoop/hadoop-3.2.1/ hadoop07:/home/hadoop/app/

### Note: perform the following steps strictly in order

2.5 Start the zookeeper cluster (start zk on hadoop05, hadoop06 and hadoop07 respectively)

cd /hadoop/zookeeper/bin/
./zkServer.sh start
# check the status: one leader, two followers
./zkServer.sh status
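
On a healthy ensemble the status output looks roughly like this (a sketch; which node is elected leader can vary):

# on two of the three nodes
Mode: follower
# on the remaining node
Mode: leader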

2.6 Start the journalnodes (run on hadoop05, hadoop06 and hadoop07 respectively)

cd /home/hadoop/app/hadoop-3.2.1
bin/hdfs --daemon start journalnode
# run jps to check: hadoop05, hadoop06 and hadoop07 should each show an extra JournalNode process
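
A sketch of what jps should now show on each of hadoop05/06/07 (the process ids will differ):

jps
# 2345 QuorumPeerMain   <- zookeeper
# 2501 JournalNode      <- just started
# 2550 Jps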

2.7 Format HDFS

# Execute on hadoop01:
hdfs namenode -format
# Formatting generates files under the hadoop.tmp.dir configured in core-site.xml,
# here /home/hadoop/app/hadoop-3.2.1/tmp.
# Copy /home/hadoop/app/hadoop-3.2.1/tmp to /home/hadoop/app/hadoop-3.2.1/ on hadoop02:
scp -r tmp/ hadoop02:/home/hadoop/app/hadoop-3.2.1/
## Alternatively (recommended): run "hdfs namenode -bootstrapStandby" on hadoop02, as sketched below
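
A sketch of the recommended bootstrapStandby route instead of copying the tmp directory by hand; note that nn1 must already be running so nn2 can pull the metadata from it over HTTP:

# on hadoop01: start the freshly formatted NameNode
bin/hdfs --daemon start namenode
# on hadoop02: sync the standby NameNode's metadata from nn1
bin/hdfs namenode -bootstrapStandby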

2.8 Format ZKFC (execute on hadoop01)

hdfs zkfc -formatZK

2.9 Start HDFS (execute on hadoop01)

sbin/start-dfs.sh

2.10 Start YARN (##### Note: run start-yarn.sh on hadoop03. The namenode and resourcemanager are kept on separate machines for performance reasons: both consume a lot of resources, so they are started on different machines.)

sbin/start-yarn.sh

2.11 Manually start the second resourcemanager on hadoop04

sbin/yarn --daemon start resourcemanager
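
With both pairs running, a quick command-line state check (a sketch; nn1/nn2 and rm1/rm2 are the ids configured in hdfs-site.xml and yarn-site.xml above):

# NameNode state: one should report active, the other standby
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
# ResourceManager state: one should report active, the other standby
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2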

At this point the hadoop-3.2.1 configuration is complete and you can verify it in a browser:
http://192.168.*.201:9870
NameNode 'hadoop01:9000' (active)
http://192.168.*.202:9870
NameNode 'hadoop02:9000' (standby)

    Verify HDFS HA:
        First upload a file to HDFS:
        hadoop fs -put /etc/profile /profile
        hadoop fs -ls /
        Then kill the active NameNode:
        kill -9 <pid of NN>
        Access via browser: http://192.168.*.202:9870
        NameNode 'hadoop02:9000' (active)
        The NameNode on hadoop02 has now become active.
        Then run:
        hadoop fs -ls /
        -rw-r--r--   3 root supergroup       1926 2014-02-06 15:36 /profile
        The file uploaded earlier is still there!
        Manually restart the NameNode that was killed:
        sbin/hadoop-daemon.sh start namenode
        Access via browser: http://192.168.*.201:9870
        NameNode 'hadoop01:9000' (standby)
    
    Verify YARN:
        Run the WordCount example program that ships with hadoop:
        hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /profile /out

    OK, done!

 
