Hadoop 2.5.0 distributed deployment and configuration process

Service planning for the three servers:

| Server (IP) | nameNode | dataNode | resourceManager | nodeManager | jobHistoryServer | secondaryNameNode |
| --- | --- | --- | --- | --- | --- | --- |
| hadoop.node.server01 (192.168.159.130) | ✓ | ✓ |  | ✓ | ✓ |  |
| hadoop.node.server02 (192.168.159.128) |  | ✓ | ✓ | ✓ |  |  |
| hadoop.node.server03 (192.168.159.129) |  | ✓ |  | ✓ |  | ✓ |

1. First configure all properties on server01:

1. Download the hadoop package.

I am using the mature 2.5.0 release. There are two ways to download it: one is the packaged CDH build, available at http://archive.cloudera.com/cdh5/cdh/5/ ; the other is from the Apache website. The latest version on the official site is already 2.7.5, so to get the older 2.5.0 release you have to go to https://archive.apache.org/dist/hadoop/common/

2. Unzip the Hadoop package: tar -zxvf hadoop.tar.gz
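A minimal sketch of the download and unpack, assuming the tarball name used by the Apache archive and the /usr/local/hadoop install location used in the rest of this article:

# download the 2.5.0 tarball from the Apache archive
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.5.0/hadoop-2.5.0.tar.gz
# unpack so that HADOOP_HOME becomes /usr/local/hadoop/hadoop-2.5.0
mkdir -p /usr/local/hadoop
tar -zxvf hadoop-2.5.0.tar.gz -C /usr/local/hadoop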

3. Configure the JAVA_HOME environment variable in the following files.

Configuration file location: ${HADOOP_HOME}/etc/hadoop. The files to edit are:

hadoop-env.sh

mapred-env.sh

yarn-env.sh

In each of the three files above, set: export JAVA_HOME=/opt/modules/jdk1.8
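One simple way to do this from the shell (a sketch; appending the line works even where the default JAVA_HOME entry is commented out, because the last assignment wins):

cd /usr/local/hadoop/hadoop-2.5.0/etc/hadoop
for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
    echo 'export JAVA_HOME=/opt/modules/jdk1.8' >> "$f"
done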

4. Modify core-site.xml and add the following property configuration:

====core-site.xml====

<!-- Specify server01 as the namenode -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop.node.server01:8020</value>
</property>

<property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadoop-2.5.0/data</value>
</property>
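hadoop.tmp.dir is the base directory under which HDFS keeps its data. Formatting the namenode (see below) normally creates it, but creating it up front avoids permission surprises; a sketch, assuming Hadoop is started by the current user:

mkdir -p /usr/local/hadoop/hadoop-2.5.0/data
# make sure the user that will start the daemons owns the directory
chown -R $(whoami) /usr/local/hadoop/hadoop-2.5.0/data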

5. Modify hdfs-site.xml and add the following properties:

<!-- Number of replicas: 3 -->
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>

<!-- secondarynamenode hostname -->
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop.node.server03:50090</value>
</property>

<!-- namenode web UI hostname:port -->
<property>
    <name>dfs.namenode.http-address</name>
    <value>hadoop.node.server01:50070</value>
</property>

<!-- Disable HDFS permission checking -->
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
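As a quick sanity check, the hdfs getconf tool reads the configuration without starting any daemon, so a value from hdfs-site.xml can be verified right away; for example:

/usr/local/hadoop/hadoop-2.5.0/bin/hdfs getconf -confKey dfs.replication
# expected output: 3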

6. Modify yarn-site.xml and add the following properties:

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop.node.server02</value>
</property>

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>

<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
</property>

7. Modify mapred-site.xml

First rename mapred-site.xml.template to mapred-site.xml, and then add the following properties:

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

<property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop.node.server01:10020</value>
</property>

<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop.node.server01:19888</value>
</property>
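One more note: start-dfs.sh and start-yarn.sh decide where to launch DataNodes and NodeManagers from the slaves file in the same configuration directory, so to match the planning table it should list all three hosts (a sketch; like the other files it gets copied to the other servers by the scp step below):

====slaves====
hadoop.node.server01
hadoop.node.server02
hadoop.node.server03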

2. Send the modified configuration files to the other two servers:

Execute the following two commands on server01 to copy the configuration files to server02 and server03, replacing the files there:

scp /usr/local/hadoop/hadoop-2.5.0/etc/hadoop/* hadoop.node.server02:/usr/local/hadoop/hadoop-2.5.0/etc/hadoop/

scp /usr/local/hadoop/hadoop-2.5.0/etc/hadoop/* hadoop.node.server03:/usr/local/hadoop/hadoop-2.5.0/etc/hadoop/

3. Configure passwordless login

On server01, where the namenode runs, and on server02, where the resourceManager runs, configure passwordless SSH login to the other servers.

1) Execute the following commands on server01:

ssh-keygen -t rsa

ssh-copy-id hadoop.node.server02

ssh-copy-id hadoop.node.server03

2) Execute the following commands on server02:

ssh-keygen -t rsa

ssh-copy-id hadoop.node.server01

ssh-copy-id hadoop.node.server03
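After copying the keys, a quick way to confirm that passwordless login works is to run a remote command and check that no password prompt appears, for example from server01:

ssh hadoop.node.server02 hostname
ssh hadoop.node.server03 hostname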

4. Format the namenode on server01, where it will run:

/usr/local/hadoop/hadoop-2.5.0/bin/hdfs namenode -format

5. Start the HDFS services (namenode, datanodes, secondarynamenode)

On server01, execute: /usr/local/hadoop/hadoop-2.5.0/sbin/start-dfs.sh

6. Start the YARN service

On server02, execute: /usr/local/hadoop/hadoop-2.5.0/sbin/start-yarn.sh

7. Start the JobHistoryServer

On server01, execute: /usr/local/hadoop/hadoop-2.5.0/sbin/mr-jobhistory-daemon.sh start historyserver

 

At this point, all services are ready.
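To confirm that the layout matches the planning table at the top, run jps on each server; roughly the following daemons should appear (a sketch of the expected process names, not exact output):

# on server01
jps   # NameNode, DataNode, NodeManager, JobHistoryServer
# on server02
jps   # DataNode, ResourceManager, NodeManager
# on server03
jps   # DataNode, NodeManager, SecondaryNameNode

The web UIs configured above are also reachable: the namenode at http://hadoop.node.server01:50070 and the JobHistoryServer at http://hadoop.node.server01:19888 ; the ResourceManager UI on hadoop.node.server02 listens on its default port (8088), since it was not overridden here.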
