Building a Hadoop pseudo-distributed cluster (super detailed)

If you need the Java and Hadoop installation packages, you can download them here:

Link: https://pan.baidu.com/s/1sQQ-uiwmJvYFPgpVsftL6Q?

Extraction code: aubt

The first step is to install CentOS 7 and check the network environment.

Import the CentOS 7 image into the virtual machine. The installation of CentOS 7 itself will not be described in detail here; let's go straight to the practical part.

Pay attention to the virtual network settings: set the virtual machine's network adapter to NAT mode.

Check the Linux IP address.

Check the network connection: ping www.baidu.com
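For example (a quick sketch; ifconfig also works if the net-tools package is installed):

ip addr
ping -c 4 www.baidu.com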

If the ping fails, open the network card configuration file for editing (on CentOS 7 it lives under /etc/sysconfig/network-scripts/, e.g. ifcfg-ens33).

After opening, modify it along the lines of the sketch below.
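A rough sketch of the file, assuming the adapter is named ens33 and a static address on a typical VMware NAT subnet (adjust the device name, addresses, and gateway to match your own NAT network):

TYPE=Ethernet
BOOTPROTO=static
NAME=ens33
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.88.101
NETMASK=255.255.255.0
GATEWAY=192.168.88.2
DNS1=114.114.114.114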

Save and exit after editing is complete.

Then restart the network service: service network restart

The second step is to modify the hostname: edit it with vim /etc/hostname

Reboot the machine: reboot
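On CentOS 7 the hostname can also be set with a single command; node1 here is just an example matching the name used in the third step (keep whichever name you choose consistent with the yarn.resourcemanager.hostname value set later):

hostnamectl set-hostname node1
reboot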

The third step is to upload the Java and Hadoop installation packages and complete the environment configuration.

Install the Java JDK

Upload the installation packages to the virtual machine and decompress them.

Create two new directories on the virtual machine node1: software (for the installation packages) and server (for the decompressed files).

mkdir -p /develop/software

mkdir -p /develop/server

Enter /develop/software and upload the installation packages into it (I use Xshell here to transfer the local files).

Check the existing Java installation and uninstall it.

Find the installed Java packages: rpm -qa | grep jdk or rpm -qa | grep java (either works).

Uninstall the Java environment with yum remove, passing the package names returned by the query above.

Run the query again after uninstalling to confirm that nothing remains, as in the sketch below.
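A minimal sketch, assuming the machine came with the usual preinstalled OpenJDK packages (remove whatever your own rpm -qa query actually listed):

rpm -qa | grep java                   # list the installed Java packages
yum -y remove java-1.8.0-openjdk*     # remove the preinstalled OpenJDK (adjust to your own list)
yum -y remove java-1.7.0-openjdk*
rpm -qa | grep java                   # run the query again; it should return nothing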

Install the new Java

Unzip the JDK tar package into the /develop/server directory:

tar -zxvf jdk-8u241-linux-x64.tar.gz -C /develop/server/

Unzip the hadoop installation package to /develop/server

tar -zxvf hadoop-2.7.5.tar.gz -C /develop/server/

View the extracted Java files.

Change the owner of the extracted folders to root: chown -R root:root <directory>

Rename the folders with mv (shorter names make the environment configuration easier).

Enter the jdk directory and copy its absolute path for the environment variables, as sketched below.
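A rough sketch of these steps; the shortened names jdk1.8 and hadoop are my own choice, picked to match the /develop/server/hadoop paths used in the configuration files below:

cd /develop/server
ls                                               # jdk1.8.0_241 and hadoop-2.7.5 after extraction
chown -R root:root jdk1.8.0_241 hadoop-2.7.5     # give root ownership of both folders
mv jdk1.8.0_241 jdk1.8                           # shorter names are easier to reference in /etc/profile
mv hadoop-2.7.5 hadoop
cd jdk1.8 && pwd                                 # copy this path for JAVA_HOME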

Configure environment variables

vim /etc/profile
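Append something along these lines at the end of the file (a sketch assuming the renamed directories above; use whatever paths pwd showed on your machine):

export JAVA_HOME=/develop/server/jdk1.8
export HADOOP_HOME=/develop/server/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin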

Refresh configuration file

source /etc/profile
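You can then verify that both are on the PATH:

java -version
hadoop version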

The fourth step is to modify the Hadoop configuration files.

Set the NameNode address as the entry point of HDFS.

Edit the file core-site.xml:

<configuration>

        <property>

                <name>fs.defaultFS</name>

                <value>hdfs://localhost:9000</value>

        </property>

        <property>

                <name>hadoop.tmp.dir</name>

                <value>/develop/server/hadoop/temp</value>

        </property>

</configuration>

Configure the Java path for Hadoop.

Edit the file: vim hadoop-env.sh

Modify the Java path (JAVA_HOME) in it, as shown below.
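The relevant line, assuming the JDK path from the earlier step:

export JAVA_HOME=/develop/server/jdk1.8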

Modify the hdfs-site.xml file and set the replication factor to 1, since there is only one node here.

     <property>

                <name>dfs.replication</name>

                <value>1</value>

        </property>

        <property>

                <name>dfs.namenode.name.dir</name>

                <value>file:/develop/server/hadoop/temp/name</value>

        </property>

         <property>

                <name>dfs.datanode.data.dir</name>

                <value>file:/develop/server/hadoop/temp/data</value>

        </property>

Copy mapred-site.xml.template to mapred-site.xml and edit it: vim mapred-site.xml
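A sketch of the usual pseudo-distributed change, which runs MapReduce on YARN:

cp mapred-site.xml.template mapred-site.xml

        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>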

Modify the yarn-site.xml file: vim yarn-site.xml

        <property>

                <name>yarn.resourcemanager.hostname</name>

                <value>master</value>

        </property>

        <property>

                <name>yarn.nodemanager.aux-services</name>

                <value>mapreduce_shuffle</value>

        </property>

         <property>

                <name>yarn.resourcemanager.webapp.address</name>

                <value>localhost:8088</value>

        </property>

All of these configuration files, including hadoop-env.sh, live in Hadoop's etc/hadoop directory (/develop/server/hadoop/etc/hadoop); switch there before editing them.

The last step is to start and stop the cluster.

Enter Hadoop's bin directory: cd bin

Format the NameNode: hdfs namenode -format

If the output reports that the storage directory has been successfully formatted, the formatting succeeded.

Start HDFS and YARN

Enter the sbin folder: cd sbin

Start everything: ./start-all.sh

Use jps to check that the daemons are running.
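If everything started, jps should list roughly the following daemons (plus Jps itself):

NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager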

Stop everything: ./stop-all.sh

Use jps again to confirm that the daemons have exited.

Access Hadoop while it is running: enter http://localhost:50070 in the browser to reach the HDFS web UI. If the page loads, the access was successful.
