Hadoop pseudo-distributed installation (with SSH password-free login; bookmark this for later)
Table of Contents
One: Preparation before installation
Two: JDK installation and Java environment configuration
Three: Pseudo-distributed Hadoop environment installation
Four: SSH password-free login configuration [simplified startup]
Five: Summary
One: Preparation before installation
Before installing, make sure the virtual machine and Linux system are already set up, and that the Hadoop and JDK installation packages are ready.
The following installation tools may be needed; network-disk download links:
1. hadoop-2.7.7.tar.gz
Link: https://pan.baidu.com/s/18tMvslkfjyulrip9-2IitQ
Extraction code: aoye
2. jdk-8u211-linux-x64.tar
Link: https://pan.baidu.com/s/1lGoPs6zwmkysw4uW2PhXag
Extraction code: aoye
3. WinSCP-5.17.5-Setup
Link: https://pan.baidu.com/s/137BmJIktaGzN0p4fylbUhQ
Extraction code: aoye
Two: JDK installation and Java environment configuration
1. Upload the hadoop-2.7.7.tar.gz and jdk-8u211-linux-x64.tar installation packages to the Linux system [you can transfer them from Windows with the WinSCP tool]
2. Place the installation packages in the /opt directory [you can choose the target directory when uploading, or move them afterwards]
3. Configure the JDK [create a jdk directory under /opt and unpack the JDK archive into it]
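The step above can be sketched as the following commands (an assumption: the archive was uploaded to /opt; adjust paths to match your system):

```shell
# Run as root; assumes jdk-8u211-linux-x64.tar was uploaded to /opt
mkdir -p /opt/jdk
tar -xvf /opt/jdk-8u211-linux-x64.tar -C /opt/jdk
ls /opt/jdk    # note the unpacked directory name for JAVA_HOME below
```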
4. Configure the environment variables in the /etc/profile file: enter vi /etc/profile [press i to enter edit mode; when done, press Esc and then enter :wq to save and exit], and add:
#JAVA ENV
export JAVA_HOME=<path to the unpacked JDK directory>
export PATH=$JAVA_HOME/bin:$PATH
5. Refresh so the edited file takes effect: source /etc/profile
6. Check whether the Java environment was installed successfully: java -version
Three: Pseudo-distributed Hadoop environment installation
Hadoop installation modes: local, pseudo-distributed, and fully distributed.
Installation process: unpack the archive → configure the environment → modify five configuration files → format the NameNode → start Hadoop.
1. Unpack the hadoop-2.7.7.tar.gz archive [create a hadoop directory under /opt and extract hadoop-2.7.7.tar.gz into it]
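As commands, the step above looks like this (an assumption: the archive was uploaded to /opt; adjust if yours is elsewhere):

```shell
# Run as root; assumes hadoop-2.7.7.tar.gz was uploaded to /opt
mkdir -p /opt/hadoop
tar -xzvf /opt/hadoop-2.7.7.tar.gz -C /opt/hadoop
ls /opt/hadoop    # should show hadoop-2.7.7
```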
The unpacked directory is as follows (screenshot omitted).
2. Configure the Hadoop environment [edit the /etc/profile file: enter vi /etc/profile , press i to enter edit mode; when done, press Esc and then enter :wq to save and exit], adding:
#HADOOP ENV
export HADOOP_HOME=<path to the unpacked Hadoop directory>
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
3. Refresh so the file takes effect, and check whether the environment is configured correctly [refresh: source /etc/profile ; check: hadoop ]
4. Enter the etc/hadoop directory under the Hadoop unpack directory (not the system /etc) [for example: cd /opt/hadoop/hadoop-2.7.7/etc/hadoop]
5. Modify the first of the five configuration files: hadoop-env.sh [press i to enter edit mode; when done, press Esc and then enter :wq to save and exit]
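The step above does not spell out the change; the usual edit in this file is to hard-code JAVA_HOME (the path below is an example: jdk-8u211 unpacks to jdk1.8.0_211, but use your own JDK directory):

```shell
# In hadoop-env.sh, replace the existing JAVA_HOME line with the absolute path:
export JAVA_HOME=/opt/jdk/jdk1.8.0_211
```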
6. Modify the second of the five configuration files: hdfs-site.xml [in the ../etc/hadoop directory, enter: vi hdfs-site.xml ]
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
7. Modify the third of the five configuration files: core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://your-hostname:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name><!-- this tmp directory must be created first -->
<value>/opt/hadoop/hadoop-2.7.7/tmp</value>
</property>
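The tmp directory named above has to exist before the NameNode is formatted; creating it is one command:

```shell
# Create the directory that hadoop.tmp.dir points to
mkdir -p /opt/hadoop/hadoop-2.7.7/tmp
```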
8. Modify the fourth of the five configuration files: mapred-site.xml [if this file does not exist, generate it by copying mapred-site.xml.template; enter: cp mapred-site.xml.template mapred-site.xml ]
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
9. Modify the fifth of the five configuration files: yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>your-hostname</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
10. Format the NameNode [go back to the Hadoop unpack directory, enter the bin directory, and run: hdfs namenode -format ]
After a successful format, the output looks like this (screenshot omitted).
[If formatting fails, troubleshoot from the reported log; make sure there are no errors in the modified configuration files!]
11. Start Hadoop [enter the sbin directory under the Hadoop installation directory and run: ./start-all.sh ]
Note: you will be asked for the password several times, which is tedious; a password-free login configuration is given at the end so that starting Hadoop no longer requires it. The password is not echoed as you type; just press Enter after entering it.
12. After a successful start, the following five processes must be present [NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager]; enter jps to check.
13. Hadoop is now configured and running; you can access the web page in a browser at: http://your-hostname:50070
If the page cannot be accessed, the following may solve it:
1. Check the firewall status and turn the firewall off.
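On CentOS 7 (an assumption; use your distribution's equivalent), this looks like:

```shell
systemctl status firewalld     # check the firewall state
systemctl stop firewalld       # stop it for the current boot
systemctl disable firewalld    # keep it off across reboots
```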
2. Configure the IP mapping on the Windows side [open C:\Windows\System32\drivers\etc, edit the hosts file to add the Linux host's IP and hostname, and remember to save on exit]
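The added line looks like this (the IP and hostname below are made-up examples; use your own):

```
# C:\Windows\System32\drivers\etc\hosts
192.168.1.100  hadoop01
```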
Four: SSH password-free login configuration [simplified startup]
1. Stop Hadoop [in the sbin directory: ./stop-all.sh ]
2. Enter the /root/.ssh directory: cd ~/.ssh
3. Enter ssh-keygen -t rsa [press Enter three times; this generates two files: id_rsa (private key) and id_rsa.pub (public key)]
4. ssh-copy-id hostname [copies the public key to the target machine so it can be logged into without a password]
5. Once configured, starting Hadoop no longer asks for a password (screenshot omitted).
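A minimal sketch of the key-generation step above; ssh-copy-id needs the real target host, so here the key pair is generated into a temporary directory instead of ~/.ssh, just to show what ssh-keygen produces:

```shell
# Generate an RSA key pair with an empty passphrase into a temp dir
demo=$(mktemp -d)
ssh-keygen -t rsa -N '' -f "$demo/id_rsa" -q
ls "$demo"    # id_rsa (private key), id_rsa.pub (public key)
# On the real machine:
#   ssh-keygen -t rsa          (press Enter three times)
#   ssh-copy-id hostname       (appends id_rsa.pub to the host's authorized_keys)
```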
Five: Summary
The above covers the Hadoop pseudo-distributed installation and password-free login. If anything is unclear, feel free to leave a comment.