Hadoop installation in three modes

Preparations (performed on the Linux client machines)

  • Install Linux (CentOS 7)

  • Turn off the firewall, map IP addresses to host names (vi /etc/hosts), and set the host name (vi /etc/hostname); see the command sketch after this list

  • Install the JDK

    tar -zxvf jdk-8u144-linux-x64.tar.gz -C /opt/module

    Configure the environment variables:

    vi /etc/profile
    
    #JAVA_HOME
    export JAVA_HOME=/opt/module/jdk1.8.0_144
    export PATH=$PATH:$JAVA_HOME/bin
    
    Make the environment variables take effect:
    source /etc/profile
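
  A minimal sketch of the preparation commands on CentOS 7 (see the bullets above). The host names bigdata111/112/113 and the example IP addresses are placeholders; substitute your own values.

    # Stop and disable the firewall (CentOS 7 uses firewalld)
    systemctl stop firewalld
    systemctl disable firewalld

    # Set the host name (run on each machine with its own name)
    hostnamectl set-hostname bigdata111

    # Map IP addresses to host names (example addresses, adjust to your network)
    echo "192.168.1.111 bigdata111" >> /etc/hosts
    echo "192.168.1.112 bigdata112" >> /etc/hosts
    echo "192.168.1.113 bigdata113" >> /etc/hosts

    # Verify the JDK after sourcing /etc/profile
    java -version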

Hadoop local mode (1 client machine)

  1. Install Hadoop

    tar -zxvf hadoop-2.8.4.tar.gz -C /opt/module
  2. Configure the environment variables

    #HADOOP_HOME
    export HADOOP_HOME=/opt/module/hadoop-2.8.4/
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    
    Make the environment variables take effect:
    source /etc/profile
  3. Configuration files

    hadoop-env.sh

    export JAVA_HOME=/opt/module/jdk1.8.0_144
  4. Test with the example program bundled with Hadoop

    hadoop-mapreduce-examples-2.8.4.jar, located in /opt/module/hadoop-2.8.4/share/hadoop/mapreduce
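
    For example, a quick smoke test with the bundled pi estimator (a sketch; the arguments 10 and 10 are just illustrative map and sample counts):

    # Confirm the environment variables took effect
    hadoop version

    # Run the bundled example jar in local mode
    cd /opt/module/hadoop-2.8.4/share/hadoop/mapreduce
    hadoop jar hadoop-mapreduce-examples-2.8.4.jar pi 10 10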

Hadoop pseudo-distributed mode (1 client machine)

  • Cluster planning

             bigdata111    bigdata112    bigdata113
    HDFS     NN, DN        SN, DN        DN
    YARN     NM            RM, NM        NM

    NN: NameNode  DN: DataNode  SN: SecondaryNameNode

    RM: ResourceManager  NM: NodeManager

  • Passwordless SSH login
    • Generate the public/private key pair: ssh-keygen -t rsa (press Enter three times)
    • ssh-copy-id hostname1
    • ssh-copy-id hostname2
    • ssh-copy-id hostname3
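
  Put together, a sketch of the passwordless-login setup, assuming the three hosts are named bigdata111, bigdata112 and bigdata113 as in the planning table above:

    # Generate the key pair (press Enter three times to accept the defaults)
    ssh-keygen -t rsa

    # Copy the public key to every node, including the local one
    ssh-copy-id bigdata111
    ssh-copy-id bigdata112
    ssh-copy-id bigdata113

    # Verify: this should log in without asking for a password
    ssh bigdata112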
  1. Install Hadoop and configure the environment variables (same as in local mode)

  2. Configuration files

    core-site.xml

    <!-- Address of the NameNode in HDFS -->
    <property>
     <name>fs.defaultFS</name>
     <value>hdfs://hostname1:9000</value>
    </property>
    
    <!-- Directory where Hadoop stores files generated at runtime -->
    <property>
     <name>hadoop.tmp.dir</name>
     <value>/opt/module/hadoop-2.X.X/data/tmp</value>
    </property>

    hdfs-site.xml

    <!-- Replication factor -->
    <property>
     <name>dfs.replication</name>
     <value>3</value>
    </property>
    
    <!-- SecondaryNameNode address -->
    <property>
     <name>dfs.namenode.secondary.http-address</name>
     <value>hostname1:50090</value>
    </property>
    
    <!-- Disable permission checking -->
    <property>
     <name>dfs.permissions</name>
     <value>false</value>
    </property>

    yarn-site.xml

    <!-- How reducers fetch data -->
    <property>
     <name>yarn.nodemanager.aux-services</name>
     <value>mapreduce_shuffle</value>
    </property>
    
    <!-- Host name of the YARN ResourceManager -->
    <property>
     <name>yarn.resourcemanager.hostname</name>
     <value>hostname1</value>
    </property>
    
    <!-- Enable log aggregation -->
    <property>
     <name>yarn.log-aggregation-enable</name>
     <value>true</value>
    </property>
    
    <!-- Keep aggregated logs for 7 days (in seconds) -->
    <property>
     <name>yarn.log-aggregation.retain-seconds</name>
     <value>604800</value>
    </property>

    mapred-site.xml

    <!-- Run MapReduce on YARN -->
    <property>
     <name>mapreduce.framework.name</name>
     <value>yarn</value>
    </property>
    
    <!-- Job history server address -->
    <property>
     <name>mapreduce.jobhistory.address</name>
     <value>hostname1:10020</value>
    </property>
    
    <!-- Job history server web UI address -->
    <property>
     <name>mapreduce.jobhistory.webapp.address</name>
     <value>hostname1:19888</value>
    </property>

    hadoop-env.sh

    export JAVA_HOME=/opt/module/jdk1.8.0_144
  3. Format the NameNode

    hadoop namenode -format
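
    After formatting, the daemons can be started and checked roughly as follows (a sketch using the standard scripts shipped in Hadoop 2.x's sbin directory; the history server corresponds to the mapred-site.xml settings above):

    # Start HDFS: NameNode, DataNode, SecondaryNameNode
    start-dfs.sh

    # Start YARN: ResourceManager, NodeManager
    start-yarn.sh

    # Optionally start the job history server
    mr-jobhistory-daemon.sh start historyserver

    # Check which Java daemons are running
    jps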

Hadoop fully distributed mode (3 client machines)

  • Three machines: the same configuration as pseudo-distributed mode, plus one extra configuration file, slaves, listing
    bigdata111, bigdata112, bigdata113 (the host names you set yourself); see the sketch below
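
  A sketch of the extra step: list the worker host names in etc/hadoop/slaves and copy the finished configuration to the other two machines (scp is used for illustration; rsync works as well):

    # List every worker host in the slaves file, one per line
    cd /opt/module/hadoop-2.8.4/etc/hadoop
    echo "bigdata111" >  slaves
    echo "bigdata112" >> slaves
    echo "bigdata113" >> slaves

    # Distribute the configuration directory to the other nodes
    scp -r /opt/module/hadoop-2.8.4/etc/hadoop bigdata112:/opt/module/hadoop-2.8.4/etc/
    scp -r /opt/module/hadoop-2.8.4/etc/hadoop bigdata113:/opt/module/hadoop-2.8.4/etc/

  When start-dfs.sh and start-yarn.sh are then run on the master node, they read this slaves file to start a DataNode and a NodeManager on every listed host.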
