Building Hadoop in Pseudo-Distributed Mode on CentOS

1. Environment preparation

One CentOS 7 virtual machine
JDK 1.8
hadoop-3.1.3
JDK download address (Huawei open-source mirror): https://mirrors.huaweicloud.com/java/jdk/

2. Java environment installation

Log in as the root user and create the folders:

cd /opt
mkdir app
mkdir soft

Upload the downloaded JDK and Hadoop installation packages to the /opt/soft directory of the virtual machine.

1. Create the hadoop user and user group

# Create the user group
groupadd hadoop
# Create the user (use -g so it joins the existing hadoop group)
useradd -g hadoop hadoop
# Set a password for the hadoop user
passwd hadoop
# Change the owner of /opt to hadoop, otherwise the user has no permission to operate on it
chown -R hadoop:hadoop /opt
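
Optionally, verify the ownership change; the owner and group of /opt should now both be hadoop (the size and date fields will differ on your machine):

ls -ld /opt
# Example output, only the owner/group columns matter here:
# drwxr-xr-x. 4 hadoop hadoop 32 Oct  8 10:00 /opt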

2. Edit /etc/sudoers

# vim /etc/sudoers: find the line "root ALL=(ALL) ALL"
# and insert a new line below it with the following content

vim /etc/sudoers
# Add the line below
hadoop    ALL=(ALL)       ALL

Save and exit with :wq!. The ! is required because /etc/sudoers is read-only; a plain :wq will fail.

# Switch to the hadoop user
su hadoop
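
To confirm that the sudoers entry took effect, an optional quick check (it should print root):

# Enter the hadoop user's password when prompted
sudo whoami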

The installation packages are now in place under /opt/soft. All subsequent setup and operations are performed as the hadoop user.

3. Unpack the JDK

cd /opt/soft
tar -zxvf jdk-8u151-linux-x64.tar.gz -C ../app/

4. Rename the unpacked JDK directory

cd ../app
# Note: use the name of your own unpacked JDK directory
mv jdk1.8.0_151/ java

5. Configure java environment variables

vim ~/.bashrc
# Add the following lines
export JAVA_HOME=/opt/app/java
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
# Then make the configuration take effect
source ~/.bashrc
# Verify that the Java configuration works
java -version

Output like the following indicates that the installation succeeded.
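
A rough sketch of the expected output (the exact build number depends on the package you downloaded):

java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)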

3. Hadoop pseudo-distributed installation

1. Unpack the Hadoop installation package and configure Hadoop environment variables

cd /opt/soft
tar -zxvf hadoop-3.1.3.tar.gz -C ../app/
# Rename the unpacked directory
cd ../app
mv hadoop-3.1.3/ hadoop

Configure hadoop environment variables

# Edit ~/.bashrc
vim ~/.bashrc
# Add the following lines
export HADOOP_HOME=/opt/app/hadoop
export PATH=${HADOOP_HOME}/sbin:${HADOOP_HOME}/bin:$PATH
# Make the environment variable file take effect
source ~/.bashrc

Verify that the environment variables were set successfully.
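
A simple check is to print the Hadoop version; assuming the 3.1.3 package unpacked above, the first line should match:

hadoop version
# Expected first line of output:
# Hadoop 3.1.3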

2. Modify the hostname and the hosts file

# Change the virtual machine's hostname
sudo hostnamectl set-hostname hadoop
# Check the current hostname
hostname
# Edit the hosts file
sudo vim /etc/hosts

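In the hosts file, add a line that maps the virtual machine's IP address to the hostname set above. The IP below is purely a placeholder; substitute your own VM's address:

# 192.168.1.100 is a hypothetical address, replace it with your VM's IP
192.168.1.100   hadoop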

3. Passwordless SSH login

# Press Enter at every prompt this command shows
ssh-keygen -t rsa
# Append your public key to the authorized_keys file on the target machine (here, the local machine itself)
ssh-copy-id hadoop

Test whether passwordless SSH works:

ssh hadoop

If you are logged in without a password prompt, the setup works; type exit to leave the nested SSH session.

4. Configure hadoop-env.sh

cd /opt/app/hadoop/etc/hadoop
vim hadoop-env.sh
# Add or modify the following line in the file, then save
export JAVA_HOME=/opt/app/java

5. Modify core-site.xml

vim core-site.xml
<configuration>
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://hadoop:9000</value>
	</property>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>file:/opt/app/hadoop/tmp</value>
	</property>
</configuration>
# Note: replace the hostname and paths with your own
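
fs.defaultFS is the NameNode address that HDFS clients connect to, and hadoop.tmp.dir is the base directory for Hadoop's local data. Formatting the NameNode later creates the directory automatically, but you can pre-create it as an optional step:

# Optional: pre-create the base data directory
mkdir -p /opt/app/hadoop/tmp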

6. Modify hdfs-site.xml

vim hdfs-site.xml
<configuration>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>hadoop:50090</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/opt/app/hadoop/tmp/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/opt/app/hadoop/tmp/dfs/data</value>
        </property>
</configuration>
# Note: replace the hostname and paths with your own

7. Modify mapred-site.xml

vim mapred-site.xml
# Note: replace the hostname and paths with your own
<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>hadoop:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>hadoop:19888</value>
        </property>
        <property>
                <name>yarn.app.mapreduce.am.env</name>
                <value>HADOOP_MAPRED_HOME=/opt/app/hadoop</value>
        </property>
        <property>
                <name>mapreduce.map.env</name>
                <value>HADOOP_MAPRED_HOME=/opt/app/hadoop</value>
        </property>
        <property>
                <name>mapreduce.reduce.env</name>
                <value>HADOOP_MAPRED_HOME=/opt/app/hadoop</value>
        </property> 	        
</configuration>

8. Modify yarn-site.xml

vim yarn-site.xml
<configuration>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>hadoop</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
</configuration>

9. Modify workers

In Hadoop 3.x the worker-node list file is named workers (it was called slaves in Hadoop 2.x).

vim workers
# Change the content to
hadoop

10. Format namenode

hdfs namenode -format

The output ends with a message indicating that formatting succeeded.
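
The key line looks roughly like this (the timestamp prefix will differ, and the path reflects dfs.namenode.name.dir):

INFO common.Storage: Storage directory /opt/app/hadoop/tmp/dfs/name has been successfully formatted.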

11. Start hadoop

start-all.sh
# Or start HDFS and YARN separately
start-dfs.sh
start-yarn.sh

Run jps to verify the startup. If the following five processes appear in addition to Jps itself, the startup succeeded: NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager.
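
A sketch of typical jps output (the process IDs here are hypothetical):

11706 NameNode
11843 DataNode
12036 SecondaryNameNode
12211 ResourceManager
12315 NodeManager
12408 Jps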

12. Visit the Hadoop web UIs to view the status

HDFS NameNode web UI: http://ip:9870
YARN ResourceManager web UI: http://ip:8088
At this point, the Hadoop pseudo-distributed setup is complete.
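
As an optional sanity check beyond the original steps, you can run the pi example bundled with the distribution; the jar path assumes the hadoop-3.1.3 layout used above:

# Run the example MapReduce pi job with 2 map tasks and 5 samples per map
hadoop jar /opt/app/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar pi 2 5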

Origin: blog.csdn.net/m0_46120209/article/details/127202482