Building Hadoop in pseudo-distributed mode



Quitters never win and winners never quit.

Operating environment:

  1. Ubuntu 18.10 Server image: Ubuntu-18.10-Live-Server-amd64.iso

  2. Oracle VM VirtualBox

  3. Hadoop, latest version (2.9.2 at the time of writing)

  4. jdk1.8.0_191

Get started:

  1. Create a new virtual machine (running the latest version of Ubuntu, of course; for ease of use the Server edition is recommended)

  2. Set the virtual machine's network to the default mode (network address translation, NAT) or bridged mode; the steps below assume NAT, since they rely on port forwarding

  3. Set up port forwarding on the virtual machine (the host ports can be chosen freely, as long as they do not conflict with other processes; a command-line alternative is sketched below):

    a. host 9000 -> virtual machine 22 (dedicated SSH port; keep it in mind)

    b. host 9001 -> virtual machine 8088 (Hadoop/YARN web UI port)
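
    If you prefer the command line to the VirtualBox GUI, the same two rules can be added with VBoxManage while the VM is powered off. This is only a sketch, and the VM name "hadoop-vm" is a placeholder for your own VM's name:

    $ VBoxManage modifyvm "hadoop-vm" --natpf1 "ssh,tcp,,9000,,22"     # host 9000 -> guest 22
    $ VBoxManage modifyvm "hadoop-vm" --natpf1 "yarn,tcp,,9001,,8088"  # host 9001 -> guest 8088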

  4. Start the virtual machine and configure the SSH connection:

    Start the virtual machine (a headless start is recommended), open a local terminal, and enter the command:

    $ ssh -p 9000 username@127.0.0.1  # "username" is the account created when installing the image

    When prompted, answer yes and enter the password.

    Why a local terminal: mine has a customized, themed interface, so the experience is better than the plain black VirtualBox console, and it is also closer to how a real environment is operated.
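
    To avoid retyping the port and address, the connection details can also be stored in the host's ~/.ssh/config. A minimal sketch, where the alias hadoop-vm and the user name are placeholders:

    # ~/.ssh/config on the host
    Host hadoop-vm
        HostName 127.0.0.1
        Port 9000
        User username
    # afterwards the connection is simply:
    $ ssh hadoop-vm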

    # hostname configuration
    $ sudo hostname bigdata-senior01.chybinmy.com  # temporary change, saves rebooting the VM
    $ sudo vim /etc/hostname  # permanent change, for later
    # change the hostname inside the file to bigdata-senior01.chybinmy.com
    
    # hosts configuration
    $ ifconfig  # get the VM's current network address (the first address after inet; assume it is 10.42.0.32)
    # (if ifconfig is not installed, ip addr shows the same information)
    # configure /etc/hosts
    $ sudo vim /etc/hosts
    # append at the end of the file:
    10.42.0.32 bigdata-senior01.chybinmy.com
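
    A quick way to confirm the hostname and hosts entry took effect (the IP is the one read from ifconfig above):

    $ hostname  # should print bigdata-senior01.chybinmy.com
    $ ping -c 1 bigdata-senior01.chybinmy.com  # should resolve to 10.42.0.32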
  5. Create the hadoop user:

    $ sudo adduser hadoop  # create the hadoop user; enter a password when prompted
  6. Switch to the hadoop user:

    $ su hadoop
  7. Create the install directory and change its owner to hadoop:

    $ sudo mkdir /opt/modules
    
    $ sudo chown -R hadoop:hadoop /opt/modules
  8. Download Hadoop and the JDK and copy them into the /opt/modules/ directory:

    $ scp -P 9000 hadoop-2.9.2.tar.gz hadoop@127.0.0.1:/opt/modules  # the JDK archive is copied the same way

    Tip: you can also transfer the files over SFTP; I used the SFTP support built into the Ubuntu 18.04 file manager on the host.
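
    If you prefer the command line over the file manager, a plain sftp session through the same forwarded port also works (the user name is a placeholder):

    $ sftp -P 9000 hadoop@127.0.0.1
    sftp> put hadoop-2.9.2.tar.gz /opt/modules/  # repeat for the JDK archive
    sftp> quit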

  9. Extract the files (Hadoop and the JDK):

    $ tar -zxvf hadoop-2.9.2.tar.gz  # the JDK is extracted the same way
    # after extracting the JDK, if the VM has no Java environment yet,
    # you need to configure one yourself (see the sketch below)!!!
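
    If the Java environment still has to be set up, a minimal sketch is to append the JDK paths to /etc/profile (assuming the JDK was extracted to /opt/modules as well):

    $ sudo vim /etc/profile
    # add at the end of the file:
    export JAVA_HOME="/opt/modules/jdk1.8.0_191"
    export PATH=$JAVA_HOME/bin:$PATH

    $ source /etc/profile
    $ java -version  # should report 1.8.0_191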
  10. Configure Hadoop:

    a. Environment variables:

    $ sudo vim /etc/profile
    # add the following two lines at the end of the file:
    export HADOOP_HOME="/opt/modules/hadoop-2.9.2"
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

    Run source /etc/profile to make the configuration take effect.

    b. Verify the variables:

    $ echo $HADOOP_HOME  # the value should be /opt/modules/hadoop-2.9.2
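
    If the new PATH entries are in effect and a JDK is configured, the hadoop command itself should also resolve:

    $ hadoop version  # the first line should report Hadoop 2.9.2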

    c. Configure the JAVA_HOME parameter in hadoop-env.sh, mapred-env.sh, and yarn-env.sh:

    $ sudo vim ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh  # repeat for mapred-env.sh and yarn-env.sh

    Change the JAVA_HOME parameter to:

    export JAVA_HOME="/opt/modules/jdk1.8.0_191"

    d. Configure core-site.xml:

    $ sudo vim ${HADOOP_HOME}/etc/hadoop/core-site.xml  # run this command
    # add the following inside <configuration></configuration>:
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://bigdata-senior01.chybinmy.com:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/data/tmp</value>
    </property>

    e. Create the temporary directory:

    $ sudo mkdir -p /opt/data/tmp  # create the directory
    
    $ sudo chown -R hadoop:hadoop /opt/data/tmp  # change the directory owner
  11. Configure HDFS:

    a. Configure hdfs-site.xml:

    $ sudo vim ${HADOOP_HOME}/etc/hadoop/hdfs-site.xml
    # add the following inside <configuration></configuration>:
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>

    b. Format HDFS:

    $ hdfs namenode -format
    # after formatting, check whether a dfs directory exists under /opt/data/tmp
    # if it does, the format succeeded
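
    The same check can be done directly from the shell:

    $ ls /opt/data/tmp  # a dfs directory here means the format succeeded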

    c. Test the result:

    # start the NameNode
    $ ${HADOOP_HOME}/sbin/hadoop-daemon.sh start namenode
    # start the DataNode
    $ ${HADOOP_HOME}/sbin/hadoop-daemon.sh start datanode
    # start the SecondaryNameNode
    $ ${HADOOP_HOME}/sbin/hadoop-daemon.sh start secondarynamenode
    # then run:
    $ jps
    # output similar to the following four lines means success:
    3034 NameNode
    3233 Jps
    3193 SecondaryNameNode
    3110 DataNode
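
    Once the three daemons are running, a quick functional test is to create a directory in HDFS and list it (a sketch, run as the hadoop user):

    $ hdfs dfs -mkdir -p /user/hadoop
    $ hdfs dfs -ls /user  # should show the newly created directory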
    
  12. Configure YARN:

    a. Configure mapred-site.xml:

    $ cp ${HADOOP_HOME}/etc/hadoop/mapred-site.xml.template ${HADOOP_HOME}/etc/hadoop/mapred-site.xml
    $ sudo vim ${HADOOP_HOME}/etc/hadoop/mapred-site.xml
    # add the following inside <configuration></configuration>:
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

    b. Configure yarn-site.xml:

    $ sudo vim ${HADOOP_HOME}/etc/hadoop/yarn-site.xml
    # add the following inside <configuration></configuration>:
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>bigdata-senior01.chybinmy.com</value>
    </property>

    c. Start the ResourceManager:

    $ ${HADOOP_HOME}/sbin/yarn-daemon.sh start resourcemanager

    d. Start the NodeManager:

    $ ${HADOOP_HOME}/sbin/yarn-daemon.sh start nodemanager

    e. Verify:

    # then run:
    $ jps
    # output similar to the following six lines means success:
    3034 NameNode
    4439 NodeManager
    4197 ResourceManager
    4543 Jps
    3193 SecondaryNameNode
    3110 DataNode
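
    With HDFS and YARN both running, an end-to-end check is to submit one of the example jobs bundled with the Hadoop 2.9.2 distribution (a sketch):

    $ yarn jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar pi 2 10
    # the job should finish with an estimate of Pi and appear
    # in the application list of the YARN web UI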
  13. Check whether the pseudo-distributed Hadoop setup succeeded:

    # the YARN web UI's port is 8088;
    # since we set up port forwarding on the virtual machine,
    # the host only needs to visit 127.0.0.1:9001 to view it
    # if the page loads successfully, the setup succeeded
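
    The same check can be scripted from the host instead of using a browser:

    $ curl -s http://127.0.0.1:9001/cluster | head -n 5  # an HTML response means the YARN web UI is reachable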

Lessons learned from the build:

  1. It is best not to restart the virtual machine during the build; if it has been restarted, it is suggested to delete the hadoop user and rebuild Hadoop

  2. If the virtual machine has no Java environment configured, jps will report that the command is not found, so configure the Java environment on the virtual machine in advance

  3. After each change to the environment configuration, it is best to run source /etc/profile to reload it

  4. You can use ${HADOOP_HOME}/sbin/start-all.sh to quickly start all Hadoop services and ${HADOOP_HOME}/sbin/stop-all.sh to quickly stop them

Source: www.cnblogs.com/fofade/p/10977686.html