1. Environment preparation
A CentOS 7 virtual machine
JDK 1.8
hadoop-3.1.3
Both packages can be downloaded from the Huawei open-source mirror: https://mirrors.huaweicloud.com/java/jdk/
2. Java environment installation
Log in as the root user and create the folders:
cd /opt
mkdir app
mkdir soft
Upload the downloaded jdk and hadoop installation packages to the /opt/soft directory of the virtual machine
1. Create the hadoop user and user group
# Create the user group
groupadd hadoop
# Create the user and add it to the hadoop group
useradd -g hadoop hadoop
# Set a password for the hadoop user
passwd hadoop
# Change the owner of /opt to hadoop; otherwise the user has no permission to operate on it
chown -R hadoop:hadoop /opt
2. Edit /etc/sudoers
# vim /etc/sudoers: find the line "root ALL=(ALL) ALL"
# and insert a new line below it with the content "hadoop ALL=(ALL) ALL"
vim /etc/sudoers
# Add the following line
hadoop ALL=(ALL) ALL
Save and exit with :wq!. The ! is required because /etc/sudoers is read-only; without it the write fails.
# Switch to the hadoop user
su hadoop
At this point both installation packages are in /opt/soft. All subsequent environment setup and operations are performed as the hadoop user.
3. Unzip jdk
cd /opt/soft
tar -zxvf jdk-8u151-linux-x64.tar.gz -C ../app/
4. Rename the extracted JDK directory
cd ../app
# Replace with the name of your own JDK package
mv jdk1.8.0_151/ java
5. Configure java environment variables
vim ~/.bashrc
# Add the following lines
export JAVA_HOME=/opt/app/java
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
# Then make the configuration take effect
source ~/.bashrc
# Verify that the Java configuration works
java -version
If version information such as java version "1.8.0_151" is printed, the installation succeeded.
3. hadoop pseudo-distributed installation
1. Unzip the hadoop installation package and configure hadoop environment variables
cd /opt/soft
tar -zxvf hadoop-3.1.3.tar.gz -C ../app/
# Rename the extracted directory
cd ../app
mv hadoop-3.1.3/ hadoop
Configure hadoop environment variables
# Edit ~/.bashrc
vim ~/.bashrc
# Add the following lines
export HADOOP_HOME=/opt/app/hadoop
export PATH=${HADOOP_HOME}/sbin:${HADOOP_HOME}/bin:$PATH
# Make the environment variable file take effect
source ~/.bashrc
Verify that the environment variables are set successfully.
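Once sourced, the variables can be checked in the current shell; a minimal sketch (the exports are repeated from ~/.bashrc here so the check is self-contained):

```shell
# The two lines from ~/.bashrc
export HADOOP_HOME=/opt/app/hadoop
export PATH=${HADOOP_HOME}/sbin:${HADOOP_HOME}/bin:$PATH

# Should print /opt/app/hadoop
echo "$HADOOP_HOME"
```

After the package is extracted below, `hadoop version` should also print the Hadoop version banner.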
2. Modify the hosts file and the hostname
# Change the virtual machine's hostname
sudo hostnamectl set-hostname hadoop
# Check the current hostname
hostname
# Edit hosts: add a line mapping the VM's IP address to the hostname hadoop
sudo vim /etc/hosts
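A minimal example of the line to add (the IP address here is a placeholder; substitute your VM's actual address):

```
# /etc/hosts
192.168.1.100   hadoop
```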
3. Passwordless SSH login
# Keep pressing Enter at every prompt this command shows
ssh-keygen -t rsa
# Append your public key to the authorized_keys file on the remote machine
ssh-copy-id hadoop
Test whether passwordless ssh works:
ssh hadoop
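The key generation above can also be done non-interactively, which is convenient for scripting (an alternative sketch; -N "" sets an empty passphrase, and the guard avoids overwriting an existing key):

```shell
# Make sure ~/.ssh exists with the permissions sshd expects
mkdir -p ~/.ssh && chmod 700 ~/.ssh
# Generate an RSA key pair without any prompts, unless one already exists
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
```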
4. Configure hadoop-env.sh
cd ./hadoop/etc/hadoop
vim hadoop-env.sh
# Add or modify the following line in the file, then save
export JAVA_HOME=/opt/app/java
5. Modify core-site.xml
vim core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/opt/app/hadoop/tmp</value>
</property>
</configuration>
# Replace the hostname and paths with your own values
6. Modify hdfs-site.xml
vim hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop:50090</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/opt/app/hadoop/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/opt/app/hadoop/tmp/dfs/data</value>
</property>
</configuration>
# Replace the hostname and paths with your own values
7. Modify mapred-site.xml
vim mapred-site.xml
# Replace the hostname and paths with your own values
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop:19888</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/opt/app/hadoop</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/opt/app/hadoop</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/opt/app/hadoop</value>
</property>
</configuration>
8. Modify yarn-site.xml
vim yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
9. Modify workers (this file is named slaves in Hadoop 2.x)
vim workers
# Change its content to
hadoop
10. Format namenode
hdfs namenode -format
If the output contains a line such as "Storage directory ... has been successfully formatted", the format succeeded.
11. Start hadoop
start-all.sh
# Or start HDFS and YARN separately
start-dfs.sh
start-yarn.sh
Run jps to verify the startup. If the following five processes appear (in addition to Jps itself), the startup succeeded: NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager.
12. Visit Hadoop's web UI to check the status
HDFS: http://ip:9870
YARN: http://ip:8088
At this point, the Hadoop pseudo-distributed setup is complete.
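As a final smoke test, you can create a home directory in HDFS, upload a file, and list it (a sketch; assumes all daemons are running and the paths above were used):

```shell
# Create the hadoop user's home directory in HDFS
hdfs dfs -mkdir -p /user/hadoop
# Upload one of the config files as sample data
hdfs dfs -put /opt/app/hadoop/etc/hadoop/core-site.xml /user/hadoop
# List the directory; core-site.xml should appear
hdfs dfs -ls /user/hadoop
```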