Ultra-detailed Hadoop installation guide (with SSH passwordless login; fully illustrated, a must-read for beginners)

Hadoop pseudo-distributed installation (with SSH passwordless login — bookmark it for later)

 

Table of Contents

Hadoop pseudo-distributed installation (with SSH passwordless login — bookmark it for later)

One: Preparation before installation

Two: JDK installation and Java environment configuration

Three: Pseudo-distributed Hadoop environment installation

Four: SSH passwordless login configuration [simplified startup]

Five: Summary


 

One: Preparation before installation

Before you install: make sure the virtual machine and Linux system are already set up, and that the Hadoop and JDK installation packages are ready.

The following installation tools may be needed; network-disk download links are below [help yourself]:

1. hadoop-2.7.7.tar.gz
Link: https://pan.baidu.com/s/18tMvslkfjyulrip9-2IitQ
Extraction code: aoye

2. jdk-8u211-linux-x64.tar
Link: https://pan.baidu.com/s/1lGoPs6zwmkysw4uW2PhXag
Extraction code: aoye

3. WinSCP-5.17.5-Setup
Link: https://pan.baidu.com/s/137BmJIktaGzN0p4fylbUhQ
Extraction code: aoye
 

Two: JDK installation and Java environment configuration

1. Upload the hadoop-2.7.7.tar.gz and jdk-8u211-linux-x64.tar installation packages to the Linux system [you can transfer them from Windows to Linux with the WinSCP tool]

   

2. Place the installation packages in the /opt directory [you can choose the target directory when uploading, or move them afterwards]

3. Configure the JDK [create a jdk directory under /opt and extract the JDK archive into it], as sketched below
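A minimal sketch of this step, assuming the archive was uploaded to /opt as in step 2 (adjust the paths to match your setup):

mkdir -p /opt/jdk                                   # create the target directory
tar -xvf /opt/jdk-8u211-linux-x64.tar -C /opt/jdk   # GNU tar auto-detects compression if the archive happens to be gzipped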

4. Configure the environment variables in the /etc/profile file, enter: vi /etc/profile  [Press i to enter insert mode; after editing, press Esc and then enter :wq to save and exit]

#JAVA ENV
export JAVA_HOME=/opt/jdk/jdk1.8.0_211   # adjust to your actual JDK extraction directory
export PATH=$JAVA_HOME/bin:$PATH

5. Refresh to make the edited file take effect, enter: source /etc/profile

6. Check whether the Java environment is installed successfully, enter: java -version
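If the configuration succeeded, the output looks roughly like the following (the build strings may differ slightly depending on the exact JDK build):

java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)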

 

Three: Pseudo-distributed hadoop environment installation

Hadoop installation modes: local (standalone) installation, pseudo-distributed installation, fully distributed installation

Installation process: unzip the installation package → configure the environment → modify the five configuration files → format the NameNode → start Hadoop

 

1. Unzip the hadoop-2.7.7.tar.gz installation package [create a hadoop directory in the /opt directory and extract hadoop-2.7.7.tar.gz into it], as sketched below
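A minimal sketch, again assuming the archive sits in /opt:

mkdir -p /opt/hadoop                              # create the target directory
tar -xzvf /opt/hadoop-2.7.7.tar.gz -C /opt/hadoop # extract the gzipped tarball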

After extraction, the Hadoop home directory (here /opt/hadoop/hadoop-2.7.7) contains bin, etc, include, lib, libexec, sbin, share, and a few license/readme files.

2. Configure the Hadoop environment [edit the /etc/profile file again, enter: vi /etc/profile, press i to enter insert mode; after editing, press Esc and then enter :wq to save and exit]

#HADOOP ENV
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.7   # adjust to your actual Hadoop extraction directory
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

3. Refresh to make the file take effect, and check whether the environment configuration succeeded [refresh: source /etc/profile ; check: hadoop ]
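If the PATH is set correctly, running hadoop with no arguments prints its usage text, which begins roughly like this (exact wording may vary by version):

Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]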

4. Enter the etc/hadoop directory under the Hadoop extraction directory [for example: cd /opt/hadoop/hadoop-2.7.7/etc/hadoop]

5. Modify the first of the five configuration files: hadoop-env.sh [enter: vi hadoop-env.sh ; press i to enter insert mode, after editing press Esc and then enter :wq to save and exit], as sketched below
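In hadoop-env.sh the usual change is to replace the default JAVA_HOME line with an explicit JDK path (assuming the extraction path from Section Two):

# in hadoop-env.sh: use an explicit path instead of the default ${JAVA_HOME}
export JAVA_HOME=/opt/jdk/jdk1.8.0_211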

6. Modify the second of the five configuration files: hdfs-site.xml [in the ../etc/hadoop directory enter: vi hdfs-site.xml , and add the following inside the <configuration> tags]

<property>
    <name>dfs.replication</name>
    <value>1</value>   <!-- one replica is enough in pseudo-distributed mode -->
</property>
<property>
    <name>dfs.permissions</name>
    <value>false</value>   <!-- boolean, not a number; false disables HDFS permission checks -->
</property>

7. Modify the third of the five configuration files: core-site.xml [likewise inside the <configuration> tags]

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://your-hostname:9000</value>   <!-- replace your-hostname with the Linux hostname -->
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <!-- the tmp directory must be created at the corresponding path -->
    <value>/opt/hadoop/hadoop-2.7.7/tmp</value>
</property>
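A one-line sketch for creating that directory (path assumed from the value above):

mkdir -p /opt/hadoop/hadoop-2.7.7/tmp   # create the Hadoop temp directory referenced above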

8. Modify the fourth of the five configuration files: mapred-site.xml [if this file does not exist, generate it by copying mapred-site.xml.template, enter: cp mapred-site.xml.template mapred-site.xml ]

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>   <!-- run MapReduce on YARN -->
</property>

9. Modify the fifth of the five configuration files: yarn-site.xml

<property>
    <name>yarn.resourcemanager.hostname</name>   <!-- note the dot, not a hyphen -->
    <value>your-hostname</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

10. Format the NameNode [go back to the Hadoop extraction directory, enter the bin directory, and run the format command: hdfs namenode -format ]

On success, the log ends with a line like "Storage directory ... has been successfully formatted."

[If formatting fails, troubleshoot using the reported log. In particular, make sure there are no mistakes in the modified configuration files!]

11. Start Hadoop [enter the sbin directory under the Hadoop installation directory and run the start command: ./start-all.sh ]

Note: you will be asked for the password several times, which is tedious; a passwordless login configuration is given at the end (so that starting Hadoop no longer requires entering the password repeatedly). The password is not echoed as you type — just type it and press Enter!

12. After a successful start, the following five processes must be present [NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager]; enter jps to view them, for example:
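Sample jps output (the process IDs are illustrative and will differ on your machine):

4096 NameNode
4213 DataNode
4387 SecondaryNameNode
4534 ResourceManager
4651 NodeManager
4798 Jps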

13. Hadoop is configured and started successfully; you can now open the web UI in a browser at: http://hostname:50070 (replace hostname with your Linux hostname)

If the page cannot be accessed, the following two fixes usually solve it:

1. Check the firewall status and turn the firewall off, as sketched below:
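A sketch assuming a CentOS 7-style system with firewalld (older systems may use service iptables stop instead):

systemctl status firewalld    # check the current firewall status
systemctl stop firewalld      # stop the firewall for this session
systemctl disable firewalld   # keep it off across reboots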

2. Configure the IP mapping on the Windows side [open the directory C:\Windows\System32\drivers\etc, edit the hosts file to add the Linux host's IP and hostname, and remember to save before exiting], for example:
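A sample hosts entry (the IP and hostname here are placeholders; use your own values):

192.168.56.101    your-hostname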

 

Four: SSH passwordless login configuration [simplified startup]

1. Stop Hadoop [in the sbin directory: ./stop-all.sh ]

2. Enter the /root/.ssh directory: cd ~/.ssh

3. Enter ssh-keygen -t rsa [press Enter three times; this generates two files: id_rsa (private key) and id_rsa.pub (public key)]

4. Enter ssh-copy-id <hostname> [this copies the public key to the machine you want to log in to without a password]

5. After configuration, startup no longer asks for a password. The whole sequence is consolidated in the sketch below.
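A consolidated sketch of steps 2–4, assuming a single-node setup where you log in to the local host (replace localhost with your hostname if it differs):

cd ~/.ssh                 # this directory exists once SSH has been used at least once
ssh-keygen -t rsa         # press Enter three times to accept the defaults
ssh-copy-id localhost     # enter the password one last time to install the public key
ssh localhost             # should now log in without prompting for a password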

Five: Summary

The above covers the Hadoop pseudo-distributed installation and passwordless login. If anything is unclear, feel free to leave a comment.

Origin: blog.csdn.net/m0_46440794/article/details/115256984