Big data cluster construction - "High-availability Hadoop" - Rookie Xiaohui



Series overview:
Big data cluster construction (ZooKeeper, high-availability Hadoop, high-availability HBase)
Previous article:
Big data cluster construction - "ZooKeeper"


Three, Hadoop cluster construction (configure on zhiyou001 first, then copy to the other hosts; make sure the ZooKeeper setup from the previous article is complete before starting)

  1. Create the directory: mkdir /opt/hadoop
  2. Enter it: cd /opt/hadoop
  3. Upload hadoop-2.7.3.tar.gz to this directory
  4. Unzip it: tar -zxvf hadoop-2.7.3.tar.gz
  5. Configure the Hadoop environment variables
vi /etc/profile
# append the following hadoop configuration
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin
  6. Refresh the configuration file and test
source /etc/profile
hadoop
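
If the profile was sourced correctly, the hadoop launcher resolves from any directory; a quick check:

hadoop version
# the first line should read "Hadoop 2.7.3", matching the tarball unpacked above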
  7. Modify core-site.xml
cd /opt/hadoop/hadoop-2.7.3/etc/hadoop/
vi core-site.xml

// add the following inside <configuration>
 <!-- set the HDFS nameservice to ns --> 
        <property> 
                <name>fs.defaultFS</name> 
                <value>hdfs://ns</value> 
        </property> 

        <!-- directory where Hadoop stores its data --> 
        <property> 
                <name>hadoop.tmp.dir</name> 
                <value>/zhiyou/hadoop/tmp</value> 
        </property> 

        <property> 
                <name>io.file.buffer.size</name> 
                <value>4096</value> 
        </property> 

        <!-- ZooKeeper quorum addresses --> 
        <property> 
                <name>ha.zookeeper.quorum</name> 
                <value>zhiyou001:2181,zhiyou002:2181,zhiyou003:2181</value> 
        </property> 


        <property>
                <name>ipc.client.connect.max.retries</name>
                <value>100</value>
                <description>Indicates the number of retries a client will make to establish
                a server connection.
                </description>
        </property>

        <property>
                <name>ipc.client.connect.retry.interval</name>
                <value>10000</value>
                <description>Indicates the number of milliseconds a client will wait for
                before retrying to establish a server connection.
                </description>
        </property>

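Before moving on, it is worth confirming that every address listed in ha.zookeeper.quorum is reachable. A minimal sketch, assuming nc (netcat) is installed; "ruok" is one of ZooKeeper's built-in four-letter commands:

echo ruok | nc zhiyou001 2181   # a healthy server answers "imok"
echo ruok | nc zhiyou002 2181
echo ruok | nc zhiyou003 2181
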
  8. hdfs-site.xml
vi hdfs-site.xml
// likewise, add the following inside <configuration>
<!-- the HDFS nameservice is ns; must match core-site.xml -->      
    <property>      
        <name>dfs.nameservices</name>      
        <value>ns</value>      
    </property>    
    <!-- ns has two NameNodes: nn1 and nn2 -->  
    <property>  
       <name>dfs.ha.namenodes.ns</name>  
       <value>nn1,nn2</value>  
    </property>  
    <!-- RPC address of nn1 -->  
    <property>  
       <name>dfs.namenode.rpc-address.ns.nn1</name>  
       <value>zhiyou001:9000</value>  
    </property>  
    <!-- HTTP address of nn1 -->  
    <property>  
        <name>dfs.namenode.http-address.ns.nn1</name>  
        <value>zhiyou001:50070</value>  
    </property>  
    <!-- RPC address of nn2 -->  
    <property>  
        <name>dfs.namenode.rpc-address.ns.nn2</name>  
        <value>zhiyou002:9000</value>  
    </property>  
    <!-- HTTP address of nn2 -->  
    <property>  
        <name>dfs.namenode.http-address.ns.nn2</name>  
        <value>zhiyou002:50070</value>  
    </property>  
    <!-- where the NameNode shared edits are stored on the JournalNodes -->  
    <property>  
         <name>dfs.namenode.shared.edits.dir</name>  
         <value>qjournal://zhiyou001:8485;zhiyou002:8485;zhiyou003:8485/ns</value>  
    </property>  
    <!-- local directory where the JournalNodes store their data -->  
    <property>  
          <name>dfs.journalnode.edits.dir</name>  
          <value>/zhiyou/hadoop/journal</value>  
    </property>  
    <!-- enable automatic failover when a NameNode goes down -->  
    <property>  
          <name>dfs.ha.automatic-failover.enabled</name>  
          <value>true</value>  
    </property>  
    <!-- implementation class clients use to fail over to the active NameNode -->  
    <property>  
            <name>dfs.client.failover.proxy.provider.ns</name>  
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>  
    </property>  
    <!-- fencing method -->  
    <property>  
             <name>dfs.ha.fencing.methods</name>  
             <value>sshfence</value>  
    </property>  
    <!-- sshfence requires passwordless SSH -->  
    <property>  
            <name>dfs.ha.fencing.ssh.private-key-files</name>  
            <value>/root/.ssh/id_rsa</value>  
    </property>  
                                
    <property>      
        <name>dfs.namenode.name.dir</name>      
        <value>file:///zhiyou/hadoop/hdfs/name</value>      
    </property>      
      
    <property>      
        <name>dfs.datanode.data.dir</name>      
        <value>file:///zhiyou/hadoop/hdfs/data</value>      
    </property>      
      
    <property>      
       <name>dfs.replication</name>      
       <value>3</value>      
    </property>     
    <!-- enable WebHDFS (REST API) on the NameNodes and DataNodes; optional -->                                                                      
    <property>      
       <name>dfs.webhdfs.enabled</name>      
       <value>true</value>      
    </property>
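Hadoop creates the directories referenced above (hadoop.tmp.dir, the journal directory, and the name/data directories) on first use, but creating them up front on all three hosts avoids permission surprises. A sketch, assuming everything runs as root as in this series:

mkdir -p /zhiyou/hadoop/tmp /zhiyou/hadoop/journal
mkdir -p /zhiyou/hadoop/hdfs/name /zhiyou/hadoop/hdfs/data
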
  9. mapred-site.xml
mv mapred-site.xml.template mapred-site.xml
vi mapred-site.xml 
// add the following inside <configuration>
<property>      
    <name>mapreduce.framework.name</name> 
    <value>yarn</value> 
</property>
  10. yarn-site.xml
vi yarn-site.xml 
// add the following inside <configuration>
<!-- make the NodeManager load the shuffle service at startup --> 
  <property> 
    <name>yarn.nodemanager.aux-services</name> 
    <value>mapreduce_shuffle</value> 
  </property> 
  <!-- ResourceManager address --> 
  <property> 
    <name>yarn.resourcemanager.hostname</name> 
    <value>zhiyou003</value> 
  </property> 
  11. hadoop-env.sh
// find the line export JAVA_HOME=${JAVA_HOME} and change it to
export JAVA_HOME=/opt/java/jdk1.8.0_141
  12. slaves
// replace localhost with
zhiyou001
zhiyou002
zhiyou003
  13. Copy the contents to zhiyou002 and zhiyou003
//1. hadoop
scp -r /opt/hadoop/ root@zhiyou002:/opt/hadoop/
scp -r /opt/hadoop/ root@zhiyou003:/opt/hadoop/
//2. /etc/profile
scp -r /etc/profile root@zhiyou002:/etc/profile
scp -r /etc/profile root@zhiyou003:/etc/profile
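
Both the scp commands above and the sshfence method in hdfs-site.xml rely on passwordless SSH. If that was not already set up in the ZooKeeper article, a minimal sketch (run it on zhiyou001, and again on zhiyou002 so the two NameNodes can fence each other):

ssh-keygen -t rsa              # accept the defaults; creates /root/.ssh/id_rsa
ssh-copy-id root@zhiyou001
ssh-copy-id root@zhiyou002
ssh-copy-id root@zhiyou003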
  14. Start ZooKeeper on all three hosts
cd /opt/zookeeper/zookeeper-3.4.12/bin/
./zkServer.sh start
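# each host can confirm its quorum role before continuing:
./zkServer.sh status   # one host reports "Mode: leader", the other two "Mode: follower"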
  15. Start the JournalNode cluster on all three hosts
cd /opt/hadoop/hadoop-2.7.3/sbin/
./hadoop-daemon.sh start journalnode
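# jps on each host should now list both QuorumPeerMain (ZooKeeper) and JournalNode:
jps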
  16. Format zkfc on zhiyou001: hdfs zkfc -formatZK
  17. Format HDFS on zhiyou001: hadoop namenode -format
  18. Start the NameNode on zhiyou001: ./hadoop-daemon.sh start namenode
  19. On zhiyou002, synchronize the metadata and start the standby NameNode
hdfs namenode -bootstrapStandby
./hadoop-daemon.sh start namenode
  20. Start the DataNodes from zhiyou001: ./hadoop-daemons.sh start datanode
  21. Start yarn on zhiyou003: ./start-yarn.sh
  22. Start zkfc from zhiyou001: ./hadoop-daemons.sh start zkfc
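
Once the ZKFCs are up, the HA state can also be checked from the command line; nn1 and nn2 are the NameNode IDs defined in hdfs-site.xml:

hdfs haadmin -getServiceState nn1   # expect one "active"
hdfs haadmin -getServiceState nn2   # and one "standby"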
  23. Check the startup status of all processes with jps:
    zhiyou001: 7 processes
    zhiyou002: 7 processes
    zhiyou003: 6 processes
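
Based on the roles configured above, the jps listings should look roughly like this (jps counts itself):

# zhiyou001 / zhiyou002 (7): QuorumPeerMain, JournalNode, NameNode, DataNode,
#                            DFSZKFailoverController, NodeManager, Jps
# zhiyou003 (6):             QuorumPeerMain, JournalNode, DataNode,
#                            ResourceManager, NodeManager, Jps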
  24. The browser can now reach Hadoop (the Windows host needs the same mappings in C:\Windows\System32\drivers\etc\hosts as the virtual machines):
192.168.80.128 zhiyou001
192.168.80.129 zhiyou002
192.168.80.130 zhiyou003
Visit:
zhiyou001:50070
zhiyou002:50070


  25. High availability test:
// install fuser (part of the psmisc package) on both zhiyou001 and zhiyou002 first; sshfence needs it
yum -y install psmisc
// stop hadoop
cd /opt/hadoop/hadoop-2.7.3/sbin/
./stop-all.sh
// start it again
./start-all.sh
  26. Browser test

a. Check status

zhiyou001:50070(active)
zhiyou002:50070(standby)
    b. Kill the NameNode process on the active node (zhiyou001)
    kill -9 <NameNode pid>   # find the pid with jps
    c. Visit zhiyou002:50070 again; the node has been automatically promoted to active
  27. Upload file test
  • Files can only be uploaded and viewed through the active node. When the active node goes down, the standby node is promoted to active and the previously uploaded files are still visible there.
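
A minimal round trip for this test, assuming a local file named test.txt:

hadoop fs -put test.txt /    # upload while zhiyou001 is active
hadoop fs -ls /              # the file should be listed both before and after failover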

Possible problems:

  • During setup the blogger hit an error where the DataNode would not start. The log files showed that
    the IDs recorded in the generated name and data directories were inconsistent, most likely caused by reformatting. The simplest (if crude) fix: delete the hadoop directories on the other two hosts and the /zhiyou folder generated by formatting on all three hosts, then re-copy the hadoop folder from zhiyou001 to the other two hosts. Problem solved. You can also search for "datanode cannot start" together with the name and data directories this tutorial's configuration generates.
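
In commands, the crude fix looks roughly like this (destructive; only sensible on a fresh cluster with no data worth keeping):

# on all three hosts: wipe the data generated by formatting
rm -rf /zhiyou
# on zhiyou002 and zhiyou003: remove the copied installation
rm -rf /opt/hadoop
# then, from zhiyou001, re-copy and re-run the format/start steps above
scp -r /opt/hadoop/ root@zhiyou002:/opt/hadoop/
scp -r /opt/hadoop/ root@zhiyou003:/opt/hadoop/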

Next article :
Big data cluster construction - "Highly Available HBase"

Origin: blog.csdn.net/qq_39231769/article/details/102750693