Big data cluster construction - "High-availability Hadoop" - Rookie Xiaohui



Series overview:
Big data cluster construction (ZooKeeper, high-availability Hadoop, high-availability HBase)
Previous article:
Big data cluster construction - "ZooKeeper"


Three, Hadoop cluster construction (configure on zhiyou001 first, then copy to the other hosts; make sure the ZooKeeper setup from the previous article is complete before starting)

  1. Create the directory: mkdir /opt/hadoop
  2. Enter it: cd /opt/hadoop
  3. Upload hadoop-2.7.3.tar.gz to this directory
  4. Unzip it: tar -zxvf hadoop-2.7.3.tar.gz
  5. Configure the Hadoop environment variables
vi /etc/profile
# append the following hadoop configuration
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin
  6. Refresh the configuration file and test
source /etc/profile
hadoop
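
If the profile was sourced correctly, the hadoop launcher resolves from any directory; a quick check:

hadoop version
# the first line should read "Hadoop 2.7.3", matching the tarball unpacked above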
  7. Modify core-site.xml
cd /opt/hadoop/hadoop-2.7.3/etc/hadoop/
vi core-site.xml

// add the following inside <configuration>
 <!-- set the HDFS nameservice to ns --> 
        <property> 
                <name>fs.defaultFS</name> 
                <value>hdfs://ns</value> 
        </property> 

        <!-- directory where Hadoop stores its data --> 
        <property> 
                <name>hadoop.tmp.dir</name> 
                <value>/zhiyou/hadoop/tmp</value> 
        </property> 

        <property> 
                <name>io.file.buffer.size</name> 
                <value>4096</value> 
        </property> 

        <!-- ZooKeeper quorum addresses --> 
        <property> 
                <name>ha.zookeeper.quorum</name> 
                <value>zhiyou001:2181,zhiyou002:2181,zhiyou003:2181</value> 
        </property> 


        <property>
                <name>ipc.client.connect.max.retries</name>
                <value>100</value>
                <description>Indicates the number of retries a client will make to establish
                a server connection.
                </description>
        </property>

        <property>
                <name>ipc.client.connect.retry.interval</name>
                <value>10000</value>
                <description>Indicates the number of milliseconds a client will wait for
                before retrying to establish a server connection.
                </description>
        </property>

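Before moving on, it is worth confirming that every address listed in ha.zookeeper.quorum is reachable. A minimal sketch, assuming nc (netcat) is installed; "ruok" is one of ZooKeeper's built-in four-letter commands:

echo ruok | nc zhiyou001 2181   # a healthy server answers "imok"
echo ruok | nc zhiyou002 2181
echo ruok | nc zhiyou003 2181
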
  8. hdfs-site.xml
vi hdfs-site.xml
// likewise, add the following inside <configuration>
<!-- the HDFS nameservice is ns; must match core-site.xml -->      
    <property>      
        <name>dfs.nameservices</name>      
        <value>ns</value>      
    </property>    
    <!-- ns has two NameNodes: nn1 and nn2 -->  
    <property>  
       <name>dfs.ha.namenodes.ns</name>  
       <value>nn1,nn2</value>  
    </property>  
    <!-- RPC address of nn1 -->  
    <property>  
       <name>dfs.namenode.rpc-address.ns.nn1</name>  
       <value>zhiyou001:9000</value>  
    </property>  
    <!-- HTTP address of nn1 -->  
    <property>  
        <name>dfs.namenode.http-address.ns.nn1</name>  
        <value>zhiyou001:50070</value>  
    </property>  
    <!-- RPC address of nn2 -->  
    <property>  
        <name>dfs.namenode.rpc-address.ns.nn2</name>  
        <value>zhiyou002:9000</value>  
    </property>  
    <!-- HTTP address of nn2 -->  
    <property>  
        <name>dfs.namenode.http-address.ns.nn2</name>  
        <value>zhiyou002:50070</value>  
    </property>  
    <!-- where the NameNode shared edits are stored on the JournalNodes -->  
    <property>  
         <name>dfs.namenode.shared.edits.dir</name>  
         <value>qjournal://zhiyou001:8485;zhiyou002:8485;zhiyou003:8485/ns</value>  
    </property>  
    <!-- local directory where the JournalNodes store their data -->  
    <property>  
          <name>dfs.journalnode.edits.dir</name>  
          <value>/zhiyou/hadoop/journal</value>  
    </property>  
    <!-- enable automatic failover when a NameNode goes down -->  
    <property>  
          <name>dfs.ha.automatic-failover.enabled</name>  
          <value>true</value>  
    </property>  
    <!-- implementation class clients use to fail over to the active NameNode -->  
    <property>  
            <name>dfs.client.failover.proxy.provider.ns</name>  
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>  
    </property>  
    <!-- fencing method -->  
    <property>  
             <name>dfs.ha.fencing.methods</name>  
             <value>sshfence</value>  
    </property>  
    <!-- sshfence requires passwordless SSH -->  
    <property>  
            <name>dfs.ha.fencing.ssh.private-key-files</name>  
            <value>/root/.ssh/id_rsa</value>  
    </property>  
                                
    <property>      
        <name>dfs.namenode.name.dir</name>      
        <value>file:///zhiyou/hadoop/hdfs/name</value>      
    </property>      
      
    <property>      
        <name>dfs.datanode.data.dir</name>      
        <value>file:///zhiyou/hadoop/hdfs/data</value>      
    </property>      
      
    <property>      
       <name>dfs.replication</name>      
       <value>3</value>      
    </property>     
    <!-- enable WebHDFS (REST API) on the NameNodes and DataNodes; optional -->                                                                      
    <property>      
       <name>dfs.webhdfs.enabled</name>      
       <value>true</value>      
    </property>
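Hadoop creates the directories referenced above (hadoop.tmp.dir, the journal directory, and the name/data directories) on first use, but creating them up front on all three hosts avoids permission surprises. A sketch, assuming everything runs as root as in this series:

mkdir -p /zhiyou/hadoop/tmp /zhiyou/hadoop/journal
mkdir -p /zhiyou/hadoop/hdfs/name /zhiyou/hadoop/hdfs/data
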
  9. mapred-site.xml
mv mapred-site.xml.template mapred-site.xml
vi mapred-site.xml 
// add the following inside <configuration>
<property>      
    <name>mapreduce.framework.name</name> 
    <value>yarn</value> 
</property>
  10. yarn-site.xml
vi yarn-site.xml 
// add the following inside <configuration>
<!-- make the NodeManager load the shuffle service at startup --> 
  <property> 
    <name>yarn.nodemanager.aux-services</name> 
    <value>mapreduce_shuffle</value> 
  </property> 
  <!-- ResourceManager address --> 
  <property> 
    <name>yarn.resourcemanager.hostname</name> 
    <value>zhiyou003</value> 
  </property> 
  11. hadoop-env.sh
// find the line export JAVA_HOME=${JAVA_HOME} and change it to
export JAVA_HOME=/opt/java/jdk1.8.0_141
  12. slaves
// replace localhost with
zhiyou001
zhiyou002
zhiyou003
  13. Copy the contents to zhiyou002 and zhiyou003
//1. hadoop
scp -r /opt/hadoop/ root@zhiyou002:/opt/hadoop/
scp -r /opt/hadoop/ root@zhiyou003:/opt/hadoop/
//2. /etc/profile
scp -r /etc/profile root@zhiyou002:/etc/profile
scp -r /etc/profile root@zhiyou003:/etc/profile
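
Both the scp commands above and the sshfence method in hdfs-site.xml rely on passwordless SSH. If that was not already set up in the ZooKeeper article, a minimal sketch (run it on zhiyou001, and again on zhiyou002 so the two NameNodes can fence each other):

ssh-keygen -t rsa              # accept the defaults; creates /root/.ssh/id_rsa
ssh-copy-id root@zhiyou001
ssh-copy-id root@zhiyou002
ssh-copy-id root@zhiyou003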
  14. Start ZooKeeper on all three hosts
cd /opt/zookeeper/zookeeper-3.4.12/bin/
./zkServer.sh start
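# each host can confirm its quorum role before continuing:
./zkServer.sh status   # one host reports "Mode: leader", the other two "Mode: follower"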
  15. Start the JournalNode cluster on all three hosts
cd /opt/hadoop/hadoop-2.7.3/sbin/
./hadoop-daemon.sh start journalnode
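# jps on each host should now list both QuorumPeerMain (ZooKeeper) and JournalNode:
jps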
  16. Format zkfc on zhiyou001: hdfs zkfc -formatZK
  17. Format HDFS on zhiyou001: hadoop namenode -format
  18. Start the NameNode on zhiyou001: ./hadoop-daemon.sh start namenode
  19. On zhiyou002, synchronize the metadata and start the standby NameNode
hdfs namenode -bootstrapStandby
./hadoop-daemon.sh start namenode
  20. Start the DataNodes from zhiyou001: ./hadoop-daemons.sh start datanode
  21. Start yarn on zhiyou003: ./start-yarn.sh
  22. Start zkfc from zhiyou001: ./hadoop-daemons.sh start zkfc
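
Once the ZKFCs are up, the HA state can also be checked from the command line; nn1 and nn2 are the NameNode IDs defined in hdfs-site.xml:

hdfs haadmin -getServiceState nn1   # expect one "active"
hdfs haadmin -getServiceState nn2   # and one "standby"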
  23. Check the startup status of all processes with jps:
    zhiyou001: 7 processes
    zhiyou002: 7 processes
    zhiyou003: 6 processes
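
Based on the roles configured above, the jps listings should look roughly like this (jps counts itself):

# zhiyou001 / zhiyou002 (7): QuorumPeerMain, JournalNode, NameNode, DataNode,
#                            DFSZKFailoverController, NodeManager, Jps
# zhiyou003 (6):             QuorumPeerMain, JournalNode, DataNode,
#                            ResourceManager, NodeManager, Jps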
  24. The browser can now reach Hadoop (the Windows host needs the same mappings in C:\Windows\System32\drivers\etc\hosts as the virtual machines):
192.168.80.128 zhiyou001
192.168.80.129 zhiyou002
192.168.80.130 zhiyou003
Visit:
zhiyou001:50070
zhiyou002:50070


  25. High availability test:
// install fuser (part of the psmisc package) on both zhiyou001 and zhiyou002 first; sshfence needs it
yum -y install psmisc
// stop hadoop
cd /opt/hadoop/hadoop-2.7.3/sbin/
./stop-all.sh
// start it again
./start-all.sh
  26. Browser test

a. Check status

zhiyou001:50070(active)
zhiyou002:50070(standby)
    b. Kill the NameNode process on the active node (zhiyou001)
    kill -9 <NameNode pid>   # find the pid with jps
    c. Visit zhiyou002:50070 again; the node has been automatically promoted to active
  27. Upload file test
  • Files can only be uploaded and viewed through the active node. When the active node goes down, the standby node is promoted to active and the previously uploaded files are still visible there.
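
A minimal round trip for this test, assuming a local file named test.txt:

hadoop fs -put test.txt /    # upload while zhiyou001 is active
hadoop fs -ls /              # the file should be listed both before and after failover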

Possible problems:

  • During setup the blogger hit an error where the DataNode would not start. The log files showed that
    the IDs recorded in the generated name and data directories were inconsistent, most likely caused by reformatting. The simplest (if crude) fix: delete the hadoop directories on the other two hosts and the /zhiyou folder generated by formatting on all three hosts, then re-copy the hadoop folder from zhiyou001 to the other two hosts. Problem solved. You can also search for "datanode cannot start" together with the name and data directories this tutorial's configuration generates.
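
In commands, the crude fix looks roughly like this (destructive; only sensible on a fresh cluster with no data worth keeping):

# on all three hosts: wipe the data generated by formatting
rm -rf /zhiyou
# on zhiyou002 and zhiyou003: remove the copied installation
rm -rf /opt/hadoop
# then, from zhiyou001, re-copy and re-run the format/start steps above
scp -r /opt/hadoop/ root@zhiyou002:/opt/hadoop/
scp -r /opt/hadoop/ root@zhiyou003:/opt/hadoop/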

Next article :
Big data cluster construction - "Highly Available HBase"

Origin: blog.csdn.net/qq_39231769/article/details/102750693