Prerequisites
JDK and Hadoop environment variables must already be configured.
The three servers are named:
hadoop112, hadoop113, hadoop114
Edit the /etc/hosts file on each machine to map the three hostnames to their IPs, for example:
192.168.1.112 hadoop112
192.168.1.113 hadoop113
192.168.1.114 hadoop114
Note: the hadoop and jdk directories live under the /opt folder, and the environment variables point there.
If a Hadoop path problem shows up in a configuration file later, adjust the paths to match your actual environment.
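The environment variables themselves are not shown in this post; a minimal sketch, assuming the /opt/module install paths used throughout the rest of this post, might look like:

```shell
# Hypothetical environment setup, e.g. appended to /etc/profile on each node.
# Adjust the version numbers and paths to your actual install.
export JAVA_HOME=/opt/module/jdk1.8.0_211
export HADOOP_HOME=/opt/module/hadoop-3.2.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

After editing /etc/profile, run `source /etc/profile` so the current shell picks up the new variables.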
Configuration map:

Hadoop fully distributed configuration table

     | hadoop112   | hadoop113       | hadoop114
HDFS | NameNode    | DataNode        | SecondaryNameNode
     | DataNode    |                 | DataNode
YARN | NodeManager | ResourceManager | NodeManager
     |             | NodeManager     |
Modify the workers file (the Hadoop 3.x equivalent of the 2.x slaves file); it must not contain spaces or blank lines.
All of the following operations edit configuration files in the hadoop-3.2.1/etc/hadoop folder.
Modify the workers file first! Otherwise the DataNode processes will fail to start later.
vim workers
Write the following (workers lists the DataNode addresses):
hadoop112
hadoop113
hadoop114
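The three lines above can also be written in one step with a here-doc (a sketch; run it from the hadoop-3.2.1/etc/hadoop directory), which avoids accidentally leaving trailing spaces or blank lines:

```shell
# Overwrite the workers file with exactly the three DataNode hostnames;
# the file must contain no extra spaces and no blank lines.
cat > workers <<'EOF'
hadoop112
hadoop113
hadoop114
EOF
```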
(1) Core configuration file
Configure core-site.xml; watch out for path problems!
sudo vim core-site.xml
Write the following inside the <configuration> tag:
<!-- Address of the NameNode in HDFS -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop112:9000</value>
</property>
<!-- Storage directory for files Hadoop generates at runtime -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop-3.2.1/data/tmp</value>
</property>
<!-- I/O buffer size -->
<property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
</property>
(2) HDFS configuration files
2.1 Configure hadoop-env.sh
sudo vim hadoop-env.sh
Add the Java environment variable; the Java home directory can be obtained with echo $JAVA_HOME (provided Java is installed and configured).
Add the following at the beginning of the file:
export JAVA_HOME=/opt/module/jdk1.8.0_211/
2.2 Configure hdfs-site.xml
sudo vim hdfs-site.xml
Note: copy the following inside the <configuration> tag:
<!-- Set the replication factor to 3 -->
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
<!-- Address of the secondary NameNode -->
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop114:50090</value>
</property>
<!-- Path where the NameNode stores its data -->
<property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/module/hadoop-3.2.1/data/tmp</value>
</property>
<!-- Path where the DataNodes store their data -->
<property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/module/hadoop-3.2.1/data/data</value>
</property>
(3) YARN configuration files
3.1 Configure yarn-env.sh
sudo vim yarn-env.sh
Write:
export JAVA_HOME=/opt/module/jdk1.8.0_211/
3.2 Configure yarn-site.xml
sudo vim yarn-site.xml
Note: copy the following inside the <configuration> tag:
<!-- How the Reducer fetches data -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- Hostname of the machine running YARN's ResourceManager -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop113</value>
</property>
(4) MapReduce configuration files
4.1 Configure mapred-env.sh
sudo vim mapred-env.sh
Write:
export JAVA_HOME=/opt/module/jdk1.8.0_211/
4.2 Configure mapred-site.xml
sudo vim mapred-site.xml
Write the following inside the <configuration> tag:
<!-- Run MapReduce on YARN -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
Distributing the configuration files =====
For the xsync script, see this blog post:
https://blog.csdn.net/qq_41813208/article/details/102575933
Distribute with the xsync script.
Run the following command to synchronize the modified Hadoop configuration to the hadoop113 and hadoop114 servers:
xsync /opt/module/hadoop-3.2.1
Check that the configuration files on hadoop113 and hadoop114 are now in sync with hadoop112.
For example, look at the core-site.xml file:
cat /opt/module/hadoop-3.2.1/etc/hadoop/core-site.xml
and check whether it is identical to the one on hadoop112.
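One way to check all three hosts at once is to compare file checksums over ssh (a sketch; it assumes passwordless ssh is already set up and the install path is identical on every node):

```shell
# Print the md5 checksum of core-site.xml on each node; if the three
# checksums match, the file was synchronized correctly.
FILE=/opt/module/hadoop-3.2.1/etc/hadoop/core-site.xml
for host in hadoop112 hadoop113 hadoop114; do
  echo "$host: $(ssh "$host" md5sum "$FILE" | awk '{print $1}')"
done
```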
Starting the cluster
Finally, start the cluster.
First, make sure all DataNode, NameNode, and SecondaryNameNode processes on every server are stopped: run jps and confirm it shows no processes other than Jps itself.
Note!!!
The NameNode must be formatted before HDFS is started for the first time. If this is the first start, run:
hdfs namenode -format
Do not use this command again afterwards; for the reason, see https://blog.csdn.net/qq_41813208/article/details/100753659
(Note: if you formatted earlier, do not format again; the blog post above explains why a NameNode that has already been formatted must not be re-formatted casually.)
Step 1: start HDFS
To stop: stop-dfs.sh
To start:
start-dfs.sh
These scripts live under sbin/ in the Hadoop root directory.
Step 2: start YARN
A huge pitfall to watch out for!
This must be done on hadoop113, because the ResourceManager is on hadoop113!
Run start-yarn.sh
If you are prompted for a password during startup, configure passwordless login.
For passwordless login, see these two blog posts:
https://blog.csdn.net/qq_41813208/article/details/102597273
https://blog.csdn.net/qq_41813208/article/details/102575933
Once everything is started:
starting HDFS launches the processes in the HDFS row below;
starting YARN launches the processes in the YARN row below.

     | hadoop112   | hadoop113       | hadoop114
HDFS | NameNode    | DataNode        | SecondaryNameNode
     | DataNode    |                 | DataNode
YARN | NodeManager | ResourceManager | NodeManager
     |             | NodeManager     |
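To check the running processes against the table above, a small sketch like this (again assuming passwordless ssh) lists the Java processes on every node:

```shell
# Run jps on each node and hide the jps process itself; the remaining
# lines should match the HDFS and YARN rows in the table above.
for host in hadoop112 hadoop113 hadoop114; do
  echo "== $host =="
  ssh "$host" jps | grep -v Jps
done
```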
Finally, test it.
Note: in Hadoop 3.x, the NameNode web UI port changed from 50070 to 9870.
If you cannot access the page, turn off the server's firewall.
On hadoop112, run sudo systemctl stop firewalld.service to stop the firewall; the page below should then be reachable.
To disable the firewall permanently: sudo systemctl disable firewalld.service
In a browser, open hadoop112's address on port 9870; if the page loads, the setup succeeded:
http://hadoop112:9870
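Instead of a browser, the page can also be probed from the command line; a sketch using curl (assuming the firewall is already off):

```shell
# Print the HTTP status code of the NameNode web UI; 200 means the
# page is reachable.
curl -s -o /dev/null -w '%{http_code}\n' http://hadoop112:9870
```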
Troubleshooting
If every other node starts but the NameNode does not come up, the NameNode has probably been formatted more than once. To fix it:
First stop the cluster.
Delete the corresponding directory (the path configured as hadoop.tmp.dir in core-site.xml above), then format again:
hdfs namenode -format
Reason: https://blog.csdn.net/qq_41813208/article/details/100753659
After re-formatting, start HDFS:
start-dfs.sh
You should now find that the NameNode starts up.
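The recovery steps above can be sketched as one sequence. This is a destructive sketch: it wipes all HDFS data, and the directories assume the hadoop.tmp.dir and dfs.datanode.data.dir values from the configs above; adjust them to your environment.

```shell
# 1. Stop the cluster (stop-yarn.sh on hadoop113, stop-dfs.sh on hadoop112).
stop-yarn.sh
stop-dfs.sh
# 2. Delete the data directories on every node so no stale metadata remains.
for host in hadoop112 hadoop113 hadoop114; do
  ssh "$host" rm -rf /opt/module/hadoop-3.2.1/data
done
# 3. Re-format the NameNode and start HDFS again.
hdfs namenode -format
start-dfs.sh
```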