1. Download Hadoop
Hadoop download address: http://hadoop.apache.org/
2. Create a hadoop directory under /home and upload the Hadoop archive into it
cd /home
mkdir hadoop
cd hadoop
## extract
tar -zxf hadoop-3.3.0.tar.gz
3. Create tmp, hdfs/name, and hdfs/data folders under /home/hadoop
mkdir tmp
mkdir hdfs
mkdir hdfs/data
mkdir hdfs/name
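Steps 2 and 3 above can be sketched as a single script. This is a minimal sketch for illustration: HADOOP_BASE is an assumed variable (defaulting to a scratch directory so the script is safe to try); for a real install set it to /home/hadoop and place the downloaded tarball there first.

```shell
# Sketch of steps 2-3: create the directory layout and extract the tarball.
# HADOOP_BASE is an illustrative assumption; use /home/hadoop in a real
# install. The tar step is skipped when the archive is not present.
BASE="${HADOOP_BASE:-$(mktemp -d)}"
mkdir -p "$BASE"/tmp "$BASE"/hdfs/name "$BASE"/hdfs/data
if [ -f "$BASE/hadoop-3.3.0.tar.gz" ]; then
  tar -zxf "$BASE/hadoop-3.3.0.tar.gz" -C "$BASE"
fi
ls "$BASE"
```

`mkdir -p` creates parent directories as needed, so the three mkdir commands collapse into one line.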
4. Set environment variables
vi /etc/profile
#set hadoop path
export HADOOP_HOME=/home/hadoop/hadoop-3.3.0
export PATH=$PATH:$HADOOP_HOME/bin
Make the environment variables take effect:
source /etc/profile
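The two export lines above can be tried out safely before touching /etc/profile. A minimal sketch: PROFILE is an illustrative stand-in that defaults to a temp file; in a real install the lines go into /etc/profile itself, followed by `source /etc/profile`.

```shell
# Sketch: append the Hadoop variables to a profile fragment and source it.
# PROFILE is an illustrative assumption; the real target is /etc/profile.
PROFILE="${PROFILE:-$(mktemp)}"
cat >> "$PROFILE" <<'EOF'
export HADOOP_HOME=/home/hadoop/hadoop-3.3.0
export PATH=$PATH:$HADOOP_HOME/bin
EOF
. "$PROFILE"
echo "HADOOP_HOME is $HADOOP_HOME"
```

Note the quoted heredoc delimiter (`'EOF'`), which keeps `$PATH` and `$HADOOP_HOME` unexpanded in the file so they are evaluated at login time, not at write time.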
5. Modify the 5 configuration files
hadoop-3.3.0/etc/hadoop/hadoop-env.sh
hadoop-3.3.0/etc/hadoop/core-site.xml
hadoop-3.3.0/etc/hadoop/hdfs-site.xml
hadoop-3.3.0/etc/hadoop/mapred-site.xml
hadoop-3.3.0/etc/hadoop/yarn-site.xml
5.1 hadoop-env.sh
# View the JAVA_HOME path: echo $JAVA_HOME
export JAVA_HOME=/usr/java/jdk1.8.0_152
# Note: if you are not running as root, change the following to the corresponding user
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
Alternatively, configure the users in the start/stop scripts:
start-dfs.sh, stop-dfs.sh
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
start-yarn.sh, stop-yarn.sh
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
5.2 core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
<!-- The HDFS URI: filesystem://namenode-host:port -->
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
<!-- Local Hadoop temporary directory on the namenode -->
</property>
</configuration>
5.3 hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>副本个数,配置默认是3,应小于datanode机器数量</description>
</property>
<property>
<name>dfs.http.address</name>
<value>0.0.0.0:50070</value>
<description>Bind to 0.0.0.0 instead of the loopback address so that port 50070 is reachable from outside the host</description>
</property>
</configuration>
5.4 mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
5.5 yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
6. HDFS startup
(1) Format HDFS
bin/hdfs namenode -format
Formatting initializes the distributed file system and stores its initial metadata on the NameNode. This is normally done only once when the cluster is first set up; reformatting an existing cluster erases its metadata, so do not repeat this step on later restarts.
(2) Start the daemons
Start the NameNode:
sbin/hadoop-daemon.sh start namenode
Start the DataNode:
sbin/hadoop-daemon.sh start datanode
Start the SecondaryNameNode:
sbin/hadoop-daemon.sh start secondarynamenode
(3) Use the jps command to check whether startup succeeded; if all three processes are listed, it worked.
NameNode
SecondaryNameNode
DataNode
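The check in step (3) can be scripted. A sketch: `check_daemons` is a hypothetical helper (not part of Hadoop) that scans whatever `jps` prints for the three daemon names.

```shell
# Hypothetical helper: scan jps output for the three HDFS daemons.
check_daemons() {
  local out="$1" d
  for d in NameNode DataNode SecondaryNameNode; do
    # -w matches whole words, so "NameNode" does not match "SecondaryNameNode"
    echo "$out" | grep -qw "$d" || { echo "missing: $d"; return 1; }
  done
  echo "all HDFS daemons running"
}
# Real use on the server: check_daemons "$(jps)"
```

On failure it names the first missing daemon, which points directly at the start command to re-run.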
(4) Visit xxxx:50070 in a browser; if the web interface opens, the setup is working.