hadoop-3.0.0 cluster setup

  • download hadoop package
wget -c http://ftp.jaist.ac.jp/pub/apache/hadoop/common/hadoop-3.0.0/hadoop-3.0.0.tar.gz
  • decompress
tar -zxvf hadoop-3.0.0.tar.gz -C /usr/java/
  • configure
    • Configure environment variables: open /etc/profile with vim and append the following
    export HADOOP_HOME=/usr/java/hadoop-3.0.0
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
    
    make the changes take effect immediately
    source /etc/profile
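    
    To confirm the variables took effect (assuming JAVA_HOME is already exported), the hadoop command should now resolve:
    hadoop version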
    
    • configure /etc/hosts (on all three machines)
    192.168.56.101 master
    192.168.56.102 slave1
    192.168.56.103 slave2
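    
    Hostname resolution can then be spot-checked from the master, for example:
    ping -c 1 slave1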
    
    • turn off the firewall (on all three machines)
    systemctl stop firewalld
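    
    To keep it from coming back after a reboot, it can also be disabled permanently:
    systemctl disable firewalld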
    
    • Configure core-site.xml with vim (in the etc/hadoop directory) and add the following; the hadoop.proxyuser.wujinlei.* properties allow the wujinlei user to proxy requests from any host and group
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
      </property>
      <property>
        <name>hadoop.proxyuser.wujinlei.groups</name>
        <value>*</value>
      </property>
      <property>
        <name>hadoop.proxyuser.wujinlei.hosts</name>
        <value>*</value>
      </property>
    </configuration>
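    
    Once saved, the setting can be read back as a sanity check:
    hdfs getconf -confKey fs.defaultFS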
    
    • Configure hdfs-site.xml with vim and add the following configuration.
      • Open port 9870 on the master machine for external access to the web page (NameNode HTTP UI), to view cluster status.
      • Open port 9864 on the slave machines for external access to the web page (DataNode HTTP UI).
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>2</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/wujinlei/hadoop/dfs/name</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/wujinlei/hadoop/dfs/data</value>
      </property>
      <property>
        <name>dfs.namenode.http-address</name>
        <value>master:9870</value>
      </property>
      <property>
        <name>dfs.datanode.http.address</name>
        <!-- bind to all interfaces so the same file works on every DataNode -->
        <value>0.0.0.0:9864</value>
      </property>
    </configuration>
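    
    The name and data directories referenced above can be created ahead of time (the NameNode format creates its own directory, but pre-creating avoids permission surprises):
    mkdir -p /home/wujinlei/hadoop/dfs/name /home/wujinlei/hadoop/dfs/data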
    
    • Configure yarn-site.xml with vim and add the following configuration; open port 8088 on the master machine for external access to the web page, to view cluster task scheduling
    <configuration>
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
      </property>
      <property>
        <name>yarn.application.classpath</name>
        <value>
            /usr/java/hadoop-3.0.0/etc/hadoop,
            /usr/java/hadoop-3.0.0/share/hadoop/common/lib/*,
            /usr/java/hadoop-3.0.0/share/hadoop/common/*,
            /usr/java/hadoop-3.0.0/share/hadoop/hdfs,
            /usr/java/hadoop-3.0.0/share/hadoop/hdfs/lib/*,
            /usr/java/hadoop-3.0.0/share/hadoop/hdfs/*,
            /usr/java/hadoop-3.0.0/share/hadoop/mapreduce/*,
            /usr/java/hadoop-3.0.0/share/hadoop/yarn,
            /usr/java/hadoop-3.0.0/share/hadoop/yarn/lib/*,
            /usr/java/hadoop-3.0.0/share/hadoop/yarn/*,
            /usr/java/jdk1.8.0_45/lib/tools.jar
        </value>
      </property>
      <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
      </property>
    </configuration>
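    
    Rather than maintaining yarn.application.classpath by hand, the value can also be taken from the output of this command, which prints a ready-made classpath for the local installation:
    hadoop classpath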
    
    • Configure mapred-site.xml with vim and add the following configuration
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>0.0.0.0:10020</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>0.0.0.0:19888</value>
      </property>
    </configuration>
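    
    If MapReduce jobs later fail with missing-class errors, the Hadoop 3 documentation also sets the job classpath in this file; a minimal addition, assuming HADOOP_MAPRED_HOME points at the install directory (/usr/java/hadoop-3.0.0 here), would be:
      <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
      </property>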
    
    • Edit the workers file under etc/hadoop (this file was named slaves in Hadoop 2.x) so it lists the worker nodes:
    slave1
    slave2
    
  • start the cluster
    • Copy the Hadoop directory configured above to the other two nodes, slave1 and slave2 (a sketch follows this list)
    • Format the NameNode before the first start, with the command hdfs namenode -format
    • Start individually
      • start DFS with the command start-dfs.sh
      • start YARN with the command start-yarn.sh
    • Start everything at once
      • with the command start-all.sh
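    
    A minimal sketch of the whole sequence, run from the master. This assumes passwordless SSH from master to both slaves, identical paths on every node, and JAVA_HOME also set in etc/hadoop/hadoop-env.sh (daemons started over ssh do not source /etc/profile):
    scp -r /usr/java/hadoop-3.0.0 slave1:/usr/java/
    scp -r /usr/java/hadoop-3.0.0 slave2:/usr/java/
    hdfs namenode -format
    start-dfs.sh
    start-yarn.sh
    jps    # master should list NameNode, SecondaryNameNode, ResourceManager; slaves list DataNode, NodeManager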
  • visit the web pages
  • http://192.168.56.101:8088
  • http://192.168.56.101:9870
  • start history service
mapred historyserver

This service is used to access historical task details
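
mapred historyserver runs in the foreground; on Hadoop 3.x it can also be started as a background daemon with:

mapred --daemon start historyserver

Once running, the history web UI is at http://192.168.56.101:19888 (per mapreduce.jobhistory.webapp.address above).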
