Installation and deployment of Hadoop 2.5.0 under CentOS7

Background

This post records the installation and deployment of Hadoop 2.5.0 (CDH 5.3.6) on CentOS 7.

Steps

1. Create a new cdh folder and unzip the hadoop compressed package into the cdh folder

#mkdir cdh
#tar -zxvf hadoop-2.5.0-cdh5.3.6.tar.gz -C cdh
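The -C flag tells tar to extract into the given directory instead of the current one. A throwaway demo of the behavior, safe to run anywhere (the archive and paths below are placeholders, not the real Hadoop tarball):

```shell
# Demonstrate tar -C with a tiny scratch archive.
mkdir -p /tmp/tar-demo/src && cd /tmp/tar-demo
echo hello > src/file.txt
tar -czf demo.tar.gz src          # pack a small tree
mkdir -p cdh
tar -zxvf demo.tar.gz -C cdh      # -C: extract inside cdh/
ls cdh/src                        # the tree now lives under cdh/
```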

2. Switch to the etc/hadoop directory under the Hadoop extraction directory and modify seven files: hadoop-env.sh, mapred-env.sh, mapred-site.xml.template, hdfs-site.xml, yarn-site.xml, core-site.xml and slaves.

In both env files, only JAVA_HOME needs to be changed (Java 1.8 on my CentOS machine is installed in /home/szc/jdk8_64, so JAVA_HOME is set to /home/szc/jdk8_64):

export JAVA_HOME=/home/szc/jdk8_64
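If you prefer to script the edit instead of opening each file, a sed one-liner works. The demo below runs against a scratch copy of hadoop-env.sh so it is safe to try anywhere; in the real deployment, point sed at etc/hadoop/hadoop-env.sh and etc/hadoop/mapred-env.sh instead:

```shell
# Scratch copy standing in for etc/hadoop/hadoop-env.sh
mkdir -p /tmp/env-demo
printf 'export JAVA_HOME=${JAVA_HOME}\n' > /tmp/env-demo/hadoop-env.sh

# Replace whatever JAVA_HOME line is there with the real JDK path
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/home/szc/jdk8_64|' /tmp/env-demo/hadoop-env.sh

grep JAVA_HOME /tmp/env-demo/hadoop-env.sh
```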

Modify the mapred-site.xml.template file as follows, then rename it to mapred-site.xml:

<configuration>

    <property>

        <name>mapreduce.framework.name</name>

        <value>yarn</value>

    </property>

    <property>

        <name>mapreduce.jobhistory.address</name>

        <value>192.168.57.141:10020</value>

    </property>

    <property>

        <name>mapreduce.jobhistory.webapp.address</name>

        <value>192.168.57.141:19888</value>

    </property>

</configuration>
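The rename itself is a single mv; the demo below uses a scratch directory with an empty placeholder file so it is safe to run anywhere (the real command runs inside etc/hadoop of the extraction directory):

```shell
# Scratch directory standing in for etc/hadoop
mkdir -p /tmp/rename-demo && cd /tmp/rename-demo
touch mapred-site.xml.template

# Hadoop only reads mapred-site.xml, so drop the .template suffix
mv mapred-site.xml.template mapred-site.xml
ls
```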

hdfs-site.xml

<configuration>

    <property>

        <name>dfs.replication</name>

        <value>1</value>

    </property>

    <property>

        <name>dfs.namenode.secondary.http-address</name>

        <value>192.168.57.141:50091</value>

    </property>

</configuration>

yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->

    <property>

        <name>yarn.nodemanager.aux-services</name>

        <value>mapreduce_shuffle</value>

    </property>



    <property>

        <name>yarn.resourcemanager.hostname</name>

        <value>192.168.57.141</value>

    </property>



    <property>

        <name>yarn.log-aggregation-enable</name>

        <value>true</value>

    </property>

    <property>

        <name>yarn.log-aggregation.retain-seconds</name>

        <value>604800</value>

    </property>

</configuration>

core-site.xml. Note that the szc in the proxyuser properties should be replaced with your own user name, and the directory pointed to by hadoop.tmp.dir must be created by yourself:

<configuration>

    <property>

        <name>fs.defaultFS</name>

        <value>hdfs://192.168.57.141:8020</value>

    </property>



    <property>

        <name>hadoop.tmp.dir</name>

        <value>/home/szc/cdh/hadoop-2.5.0-cdh5.3.6/data/tmp</value>

    </property>



    <property>

        <name>hadoop.proxyuser.szc.hosts</name>

        <value>*</value>

    </property>



    <property>

        <name>hadoop.proxyuser.szc.groups</name>

        <value>*</value>

    </property>


    <property>

        <name>hadoop.proxyuser.root.hosts</name>

        <value>*</value>

    </property>



    <property>

        <name>hadoop.proxyuser.root.groups</name>

        <value>*</value>

    </property>

</configuration>
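The directory named by hadoop.tmp.dir is not created automatically, so make it before formatting HDFS. The path under /tmp below is a placeholder so the demo runs anywhere; substitute the actual value from core-site.xml (/home/szc/cdh/hadoop-2.5.0-cdh5.3.6/data/tmp):

```shell
# Placeholder path; in the real deployment use the value from core-site.xml
HADOOP_TMP=/tmp/cdh-demo/hadoop-2.5.0-cdh5.3.6/data/tmp
mkdir -p "$HADOOP_TMP"            # -p creates parent directories as needed
ls -d "$HADOOP_TMP"
```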

slaves

192.168.57.141

All of the IP addresses above are the CentOS machine's own IP.


3. Format hdfs

Switch to the bin directory of the hadoop decompression directory, and then run the command

#hdfs namenode -format

If formatting succeeds, the log output should end with a line saying the storage directory has been successfully formatted.

4. Start the corresponding process

Switch to the sbin directory of the Hadoop extraction directory, run start-dfs.sh and start-yarn.sh to start HDFS and YARN, and then run the following command to start the history server

./mr-jobhistory-daemon.sh start historyserver

You can check that everything came up with jps, which on this single-node setup should list NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager and JobHistoryServer.

5. View the cluster web UI from a Windows browser

First open port 50070 (the NameNode web UI port) in the firewall:

[root@localhost sbin]# firewall-cmd --add-port=50070/tcp --permanent

success

[root@localhost sbin]# firewall-cmd --reload

success

Then enter the CentOS IP followed by :50070 in a Windows browser and press Enter; the NameNode web UI should be displayed. To reach the YARN and JobHistory UIs as well, open ports 8088 and 19888 the same way.

At this point, the Hadoop deployment is complete.

Conclusion

That's all. Thanks for reading.


Origin blog.csdn.net/qq_37475168/article/details/107304245