Hands-On Hadoop: Building a Distributed Cluster Environment for a Big Data Platform

1 Overview

This article describes how to build a distributed environment for the Hadoop big data platform. The nodes are deployed as follows: the NameNode runs on master1, the SecondaryNameNode on master2, and a DataNode on each of slave1, slave2, and slave3.

NN = NameNode (name node)

SND = SecondaryNameNode (secondary NameNode)

DN = DataNode (data node)

2 Preparation

(1) Prepare five servers

For example: master1, master2, slave1, slave2, slave3

(2) Disable the firewall on all servers

$ systemctl stop firewalld
$ systemctl disable firewalld
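
To confirm the firewall is really off on each server, an optional quick check (assuming a systemd-based distribution such as CentOS 7) is:

$ systemctl status firewalld # should report "inactive (dead)"
$ systemctl is-enabled firewalld # should report "disabled"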

(3) Modify the /etc/hosts file on each server as follows:

192.168.56.132 master1
192.168.56.133 master2
192.168.56.134 slave1
192.168.56.135 slave2
192.168.56.136 slave3
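
As an optional sanity check, every hostname should resolve from every server, for example:

$ ping -c 1 master1 # repeat for master2, slave1, slave2 and slave3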

Note: Also modify the /etc/hostname file on each server to master1, master2, slave1, slave2, and slave3 respectively.

(4) Create a regular (non-root) user and group on each server

$ groupadd hadoop # add a new user group
$ useradd hadoop -m -g hadoop # add a new user
$ passwd hadoop # set the password for the hadoop user
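
To confirm the account and group were created as expected, an optional quick check is:

$ id hadoop # should list the uid, gid and the hadoop group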

Switch to the hadoop user: su hadoop

(5) Configure passwordless SSH login between the servers; run the following on each server separately

$ ssh-keygen -t rsa # press Enter at every prompt to generate the public/private key pair
$ ssh-copy-id hadoop@master1 # copy the public key to the master1 server
$ ssh-copy-id hadoop@master2 # copy the public key to the master2 server
$ ssh-copy-id hadoop@slave1 # copy the public key to the slave1 server
$ ssh-copy-id hadoop@slave2 # copy the public key to the slave2 server
$ ssh-copy-id hadoop@slave3 # copy the public key to the slave3 server

Note: The commands above must be run as the hadoop user.
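
To verify that passwordless login works, an optional check from any of the servers is:

$ ssh hadoop@slave1 hostname # should print "slave1" without prompting for a password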

(6) Download the Hadoop package hadoop-2.7.5.tar.gz

Official download address: https://archive.apache.org/dist/hadoop/common/hadoop-2.7.5/

3 Installation and Deployment

(1) Create the Hadoop installation directories

$ mkdir -p /home/hadoop/app/hadoop/{tmp,hdfs/{data,name}}

(2) Unpack the installation package into /home/hadoop/app/hadoop

$ tar -zxf hadoop-2.7.5.tar.gz -C /home/hadoop/app/hadoop

(3) Configure the Hadoop environment variables by modifying /etc/profile

JAVA_HOME=/usr/java/jdk1.8.0_131
JRE_HOME=/usr/java/jdk1.8.0_131/jre
HADOOP_HOME=/home/hadoop/app/hadoop/hadoop-2.7.5
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export JAVA_HOME JRE_HOME HADOOP_HOME PATH

(4) Refresh the environment variables

$ source /etc/profile
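
As an optional check that the variables took effect and Hadoop is on the PATH (assuming the JDK path above matches your installation):

$ echo $HADOOP_HOME # should print /home/hadoop/app/hadoop/hadoop-2.7.5
$ hadoop version # should report Hadoop 2.7.5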

4 Configure Hadoop

(1) Configure core-site.xml

$ vi /home/hadoop/app/hadoop/hadoop-2.7.5/etc/hadoop/core-site.xml
<configuration>
    <property>
        <!-- Node server where the HDFS NameNode runs -->
        <name>fs.defaultFS</name>
        <value>hdfs://master1:9000</value>
    </property>

    <property>
        <!-- Hadoop temporary directory -->
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/app/hadoop/tmp</value>
    </property>
</configuration>

Default configuration reference: http://hadoop.apache.org/docs/r2.7.5/hadoop-project-dist/hadoop-common/core-default.xml

(2) Configure hdfs-site.xml

$ vi /home/hadoop/app/hadoop/hadoop-2.7.5/etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <!-- HDFS replication factor (number of copies of each block) -->
        <name>dfs.replication</name>
        <value>3</value>
    </property>

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/app/hadoop/hdfs/name</value>
    </property>

    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/app/hadoop/hdfs/data</value>
    </property>
   
    <property>
        <!-- HDFS permission checking -->
	    <name>dfs.permissions.enabled</name>
	    <value>false</value>
    </property>

    <property>
        <!-- Address of the SecondaryNameNode node -->
        <name>dfs.namenode.secondary.http-address</name>
        <value>master2:50090</value>
    </property>
</configuration>

Default configuration reference: http://hadoop.apache.org/docs/r2.7.5/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

(3) Configure mapred-site.xml

$ cp /home/hadoop/app/hadoop/hadoop-2.7.5/etc/hadoop/mapred-site.xml.template /home/hadoop/app/hadoop/hadoop-2.7.5/etc/hadoop/mapred-site.xml
$ vi /home/hadoop/app/hadoop/hadoop-2.7.5/etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <!-- Framework that MapReduce jobs run on -->
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Default configuration reference: http://hadoop.apache.org/docs/r2.7.5/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

(4) Configure yarn-site.xml

$ vi /home/hadoop/app/hadoop/hadoop-2.7.5/etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    
    <property>
        <!-- Node server where the ResourceManager runs -->
        <name>yarn.resourcemanager.hostname</name>
        <value>master1</value>
    </property>
    
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master1:8032</value>
    </property>
    
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master1:8088</value>
    </property>
</configuration>

Default configuration reference: http://hadoop.apache.org/docs/r2.7.5/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

(5) Configure the slaves file

$ vi /home/hadoop/app/hadoop/hadoop-2.7.5/etc/hadoop/slaves
slave1
slave2
slave3

The slaves file lists the nodes on which the DataNode service runs.

(6) Configure hadoop-env.sh

Modify the JAVA_HOME environment variable in hadoop-env.sh as follows:

$ vi /home/hadoop/app/hadoop/hadoop-2.7.5/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_131

(7) Configure yarn-env.sh

Modify the JAVA_HOME environment variable in yarn-env.sh as follows:

$ vi /home/hadoop/app/hadoop/hadoop-2.7.5/etc/hadoop/yarn-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_131

(8) Configure mapred-env.sh

Modify the JAVA_HOME environment variable in mapred-env.sh as follows:

$ vi /home/hadoop/app/hadoop/hadoop-2.7.5/etc/hadoop/mapred-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_131

(9) Copy the configured Hadoop directory from master1 to the master2, slave1, slave2, and slave3 servers

$ scp -r /home/hadoop/app/hadoop hadoop@master2:/home/hadoop/app/
$ scp -r /home/hadoop/app/hadoop hadoop@slave1:/home/hadoop/app/
$ scp -r /home/hadoop/app/hadoop hadoop@slave2:/home/hadoop/app/
$ scp -r /home/hadoop/app/hadoop hadoop@slave3:/home/hadoop/app/
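
Since the target layout is identical on every server, a short loop (a sketch, assuming a bash shell) achieves the same thing:

$ for host in master2 slave1 slave2 slave3; do scp -r /home/hadoop/app/hadoop hadoop@$host:/home/hadoop/app/; done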

5 Start and Test

(1) Initialize the Hadoop cluster on the master1 node

$ hadoop namenode -format
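
In Hadoop 2.x the hadoop namenode form is marked as deprecated; the equivalent, non-deprecated command is:

$ hdfs namenode -format # same effect as the command above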

(2) Start Hadoop cluster

$ start-dfs.sh
$ start-yarn.sh
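
One way to check which daemons actually came up on each server is the jps tool shipped with the JDK. With the layout above, the expected processes would roughly be:

$ jps # on master1: NameNode, ResourceManager
$ jps # on master2: SecondaryNameNode
$ jps # on slave1/slave2/slave3: DataNode, NodeManager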

(3) Verify that the cluster is up

Open the NameNode web UI on port 50070 in a browser (http://master1:50070); if the page loads, the cluster has been deployed successfully.
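
The web interfaces can also be probed from the command line (a sketch, assuming curl is installed); the ResourceManager address comes from the yarn-site.xml configuration above:

$ curl -s -o /dev/null -w "%{http_code}\n" http://master1:50070 # NameNode web UI, expect 200
$ curl -s -o /dev/null -w "%{http_code}\n" http://master1:8088 # ResourceManager web UI, expect 200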


