Building a Hadoop cluster platform

Environment:
master:192.168.1.20
slave1:192.168.1.21
slave2:192.168.1.22

Preparation:

# Install the required packages with yum; disable the firewall and SELinux
yum -y install wget vim gcc net-tools curl lrzsz rsync
yum update

systemctl status firewalld
systemctl stop  firewalld 
systemctl disable  firewalld 

vim /etc/selinux/config  ## set SELINUX=disabled
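This edit can also be made non-interactively; a minimal sketch (setenforce 0 turns SELinux off for the running system, the sed edit makes the change permanent at the next boot):

```bash
# turn SELinux off for the current boot (no reboot needed)
setenforce 0
# make it permanent: set SELINUX=disabled in the config file
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
# confirm the on-disk setting
grep '^SELINUX=' /etc/selinux/config
```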

vim /etc/security/limits.conf  # raise the open-file/process limits; append at the end of the file
* soft nofile 65536      # open files  (-n)
* hard nofile 65536
* soft nproc 65565
* hard nproc 65565       # max user processes   (-u)
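The new limits only apply to fresh login sessions; a quick check after logging in again (a sketch):

```bash
# open files, should print 65536
ulimit -n
# max user processes, should print 65565
ulimit -u
```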

Change the hostname:

vim /etc/hostname
# On master and the two slaves, delete the existing hostname and replace it with
master/slave1/slave2
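On a systemd-based system the same change can be made with hostnamectl, which writes /etc/hostname and applies the name immediately; a sketch for the master node (run the matching command on slave1 and slave2):

```bash
# set the static hostname on this node
hostnamectl set-hostname master
# verify
hostnamectl status
```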

Update /etc/hosts:

vim /etc/hosts
# Append the same entries on all three hosts:
192.168.1.20    master
192.168.1.21    slave1
192.168.1.22    slave2
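A quick sanity check that the names resolve to the addresses above and the nodes can reach each other (a sketch, run from any of the three hosts):

```bash
# each host should resolve and answer one ping
for h in master slave1 slave2; do
    ping -c 1 "$h"
done
```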

Install the JDK:

tar -xzvf /usr/local/src/jdk-16_linux-x64_bin.tar.gz -C /usr/local/
vim /etc/profile
## append
export JAVA_HOME=/usr/local/jdk-16
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JRE_HOME=$JAVA_HOME/jre
## reload
source /etc/profile  
## verify
java -version

Create a hadoop user:

# create the user on all three machines
useradd hadoop
# set the password to 123456
passwd hadoop
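passwd prompts interactively; on CentOS/RHEL (where passwd supports --stdin) the same step can be scripted, as sketched below. The 123456 password is only the value used in this walkthrough and should be replaced in any real deployment:

```bash
# create the user and set its password without any prompts
useradd hadoop
echo '123456' | passwd --stdin hadoop
```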

Configure passwordless SSH login [required on every node]

Switch to the hadoop user:

[root@localhost ~]# su - hadoop
[hadoop@localhost ~]$ 

Generate a key pair on each node:

ssh-keygen -t rsa -P ''    # or simply: ssh-keygen
# just confirm through all the prompts
# check that the passphrase-less key pair was generated under the hadoop home directory
[hadoop@localhost .ssh]$ cd /home/hadoop/.ssh
[hadoop@localhost .ssh]$ ls
id_rsa  id_rsa.pub
# append id_rsa.pub to the authorized key file
[hadoop@localhost .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@localhost .ssh]$ ls -a 
.  ..  authorized_keys  id_rsa  id_rsa.pub
# change the permissions of authorized_keys
[hadoop@master .ssh]$ chmod 600 ~/.ssh/authorized_keys 
[hadoop@master .ssh]$ ll 
total 16
-rw------- 1 hadoop hadoop  410 Mar 25 20:43 authorized_keys
-rw------- 1 hadoop hadoop 1679 Mar 25 20:34 id_rsa
-rw-r--r-- 1 hadoop hadoop  410 Mar 25 20:34 id_rsa.pub
-rw-r--r-- 1 hadoop hadoop  171 3月  25 20:50 known_hosts 

Configure the SSH service:

# log in as root
[root@localhost ~]# vim /etc/ssh/sshd_config 
# find the line '#PubkeyAuthentication yes' and remove the leading '#'
PubkeyAuthentication yes

Restart the SSH service:

 systemctl restart sshd

Verify SSH login to the local machine:

# switch to the hadoop user
su - hadoop
[hadoop@master ~]$ ssh localhost
# On the first login, SSH warns that the authenticity of the host cannot be established and shows only its key fingerprint, asking whether to continue; type yes. Later logins should go straight through without any confirmation or password, which means passwordless SSH is configured correctly.

Exchange SSH keys

Exchange keys between master and slave1/slave2 so that master and the slaves can log in to each other over SSH without a password.
Copy master's public key id_rsa.pub to each slave node [run as the hadoop user]:

[hadoop@master ~]$ scp ~/.ssh/id_rsa.pub hadoop@slave1:~/
The authenticity of host 'slave1 (192.168.1.21)' can't be established.
ECDSA key fingerprint is SHA256:jxYxSoANdkaRE8gUXyYb0qmCBDBjg8lBsfbeXl+aM4E.
ECDSA key fingerprint is MD5:25:15:dd:51:12:ee:b5:6e:fd:08:81:b2:78:84:26:3c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,192.168.1.21' (ECDSA) to the list of known hosts.
hadoop@slave1's password: 
Permission denied, please try again.
hadoop@slave1's password: 
id_rsa.pub                                      100%  410   756.8KB/s   00:00    
# do the same for slave2
scp ~/.ssh/id_rsa.pub hadoop@slave2:~/

On each slave node, append the public key copied from master to the authorized_keys file.
Log in as the hadoop user on slave1 and slave2 and run:

[hadoop@localhost ~]$ cat ~/id_rsa.pub >> ~/.ssh/authorized_keys 
# note that the two paths differ (~/id_rsa.pub vs ~/.ssh/authorized_keys)

Delete master's public key file id_rsa.pub on each slave node:

rm -rf ~/id_rsa.pub

Send each slave node's public key to master [repeat on every slave]:

# copy the slave node's public key to master:
[hadoop@localhost .ssh]$ scp ~/.ssh/id_rsa.pub hadoop@master:~/
The authenticity of host 'master (192.168.1.20)' can't be established.
ECDSA key fingerprint is SHA256:AlbOTMHeCJIgoXJOW7d9N9pSMRUs11+z++45WorTBKA.
ECDSA key fingerprint is MD5:14:20:a8:b5:b0:b7:54:f7:5e:07:b2:0b:31:ee:6a:fc.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,192.168.1.20' (ECDSA) to the list of known hosts.
hadoop@master's password: 
id_rsa.pub                              100%  410    24.9KB/s   00:00  
# on master, append the copied slave public key to the authorized_keys file
[hadoop@master ~]$ cat ~/id_rsa.pub >> ~/.ssh/authorized_keys 
# delete the copied slave public key file
[hadoop@master ~]$ rm -rf ~/id_rsa.pub 
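As a side note, the manual scp/cat/rm sequence above can usually be replaced by ssh-copy-id, which appends the key to the remote authorized_keys and sets the permissions in one step; a sketch, run on master as the hadoop user (and likewise on each slave towards master):

```bash
# push this node's public key to the other nodes
# (prompts for the hadoop password once per host)
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave2
```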

Verification:

Master's authorized_keys file should now contain three public keys (master, slave1, slave2); slave1 and slave2 should each contain two (their own plus master's).
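A quick way to confirm those counts (a sketch, assuming RSA keys, so every entry starts with ssh-rsa):

```bash
# expect 3 on master and 2 on each slave
grep -c '^ssh-rsa' ~/.ssh/authorized_keys
```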

# from master, log in to each of the two slaves:
[hadoop@master .ssh]$ ssh hadoop@slave1
Last failed login: Thu Mar 25 22:18:20 CST 2021 from master on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Thu Mar 25 22:10:12 2021 from localhost
[hadoop@localhost ~]$ exit 
logout
Connection to slave1 closed.
[hadoop@master .ssh]$ ssh hadoop@slave2
Last login: Thu Mar 25 22:11:52 2021 from localhost
[hadoop@localhost ~]$ exit 
logout
# from a slave, log in to master:
[hadoop@localhost .ssh]$ ssh hadoop@master
Last failed login: Thu Mar 25 22:42:12 CST 2021 from slave1 on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Thu Mar 25 20:57:34 2021 from localhost
[hadoop@master ~]$ exit 
logout

Install Hadoop on the master node

Download it, extract it, and move it under /usr/local/:

wget https://mirrors.bfsu.edu.cn/apache/hadoop/common/hadoop-3.2.2/hadoop-3.2.2.tar.gz
tar -zxvf /usr/local/src/hadoop-3.2.2.tar.gz -C /usr/local/src/
mv /usr/local/src/hadoop-3.2.2 /usr/local/hadoop   # rename so the path matches the /usr/local/hadoop used below

Configure the Hadoop environment variables:

vim /etc/profile
# append
# hadoop
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
# reload
source /etc/profile
# check
[root@master hadoop]# /usr/local/hadoop/bin/hadoop version
Hadoop 3.2.2
Source code repository Unknown -r 7a3bc90b05f257c8ace2f76d74264906f0f7a932
Compiled by hexiaoqiao on 2021-01-03T09:26Z
Compiled with protoc 2.5.0
From source with checksum 5a8f564f46624254b27f6a33126ff4
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-3.2.2.jar

Edit the hadoop-env.sh configuration file:

cd /usr/local/hadoop/etc/hadoop/
vim hadoop-env.sh
# append
export JAVA_HOME=/usr/local/jdk-16

Configuration files:

Configure hdfs-site.xml:

# add between <configuration> and </configuration>
        <property>
                <name>dfs.namenode.http-address</name>
                <value>master:50070</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/usr/local/hadoop/dfs/name</value>
                <description>Location of the NameNode metadata on the local filesystem</description>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/usr/local/hadoop/dfs/data</value>
                <description>Location of the DataNode blocks on the local filesystem</description>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
                <description>Replication factor of 3</description>
        </property>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>192.168.1.20:50090</value>
                <description>Address and port of the secondary NameNode HTTP server</description>
        </property>
        <property>
                <name>dfs.webhdfs.enabled</name>
                <value>true</value>
                <description>Whether HDFS files can be accessed over HTTP (WebHDFS); enabling it makes the cluster somewhat less secure</description>
        </property>

Configure core-site.xml:

# add between <configuration> and </configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://192.168.1.20:9000</value>
                <description>Default filesystem host and port</description>
        </property>
        <property>
                <name>io.file.buffer.size</name>
                <value>131072</value>
                <description>I/O buffer size for file streams: 128 KB</description>
        </property>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>file:/usr/local/hadoop/tmp</value>
                <description>Temporary directory. If unset, the default is /tmp/hadoop-hadoop, which is wiped when Linux reboots; HDFS would then have to be re-formatted or Hadoop will fail to run.</description>
        </property>
        <!-- set the static web user to root -->
        <property>
                <name>hadoop.http.staticuser.user</name>
                <value>root</value>
        </property>
        <!-- disable permission checking -->
        <property>
                <name>dfs.permissions.enabled</name>
                <value>false</value>
        </property>

Configure mapred-site.xml:

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
                <description>Can be local (default), classic, or yarn; yarn means resources are allocated by the YARN cluster</description>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>master:10020</value>
                <description>Address and port of the job history server, used to look up completed MapReduce jobs</description>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>master:19888</value>
                <description>Address and port of the job history server web UI</description>
        </property>
        <property>
                <name>mapreduce.application.classpath</name>
                <value>
                        /usr/local/hadoop/etc/hadoop,
                        /usr/local/hadoop/share/hadoop/common/*,
                        /usr/local/hadoop/share/hadoop/common/lib/*,
                        /usr/local/hadoop/share/hadoop/hdfs/*,
                        /usr/local/hadoop/share/hadoop/hdfs/lib/*,
                        /usr/local/hadoop/share/hadoop/mapreduce/*,
                        /usr/local/hadoop/share/hadoop/mapreduce/lib/*,
                        /usr/local/hadoop/share/hadoop/yarn/*,
                        /usr/local/hadoop/share/hadoop/yarn/lib/*
                </value>
        </property>
</configuration>

Configure yarn-site.xml:

<configuration>

<!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>master:8032</value>
                <description>Address the ResourceManager exposes to clients; clients use it to submit applications, kill applications, and so on</description>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>master:8030</value>
                <description>Address of the ResourceManager scheduler interface, used by ApplicationMasters to request and release resources</description>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>master:8031</value>
                <description>Address the ResourceManager exposes to NodeManagers; NodeManagers use it to send heartbeats to the RM, receive tasks, and so on</description>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>master:8033</value>
                <description>Address the ResourceManager exposes to administrators for sending management commands to the RM</description>
        </property>
        <property>
                <name>yarn.resourcemanager.webapp.address</name>
                <value>master:8088</value>
                <description>Address of the ResourceManager web UI, where cluster information can be viewed in a browser</description>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
                <description>Auxiliary service run by the NodeManager; mapreduce_shuffle enables the MapReduce shuffle</description>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
                <description>Lets users plug in their own auxiliary services; the MapReduce shuffle is implemented this way, so NodeManagers can be extended with custom services</description>
        </property>
</configuration>

Other Hadoop-related configuration:

  1. Configure the workers file:
vim /usr/local/hadoop/etc/hadoop/workers
# delete localhost and add
master
slave1
slave2
# master and both slaves will all act as DataNodes
  2. Create the directories /usr/local/hadoop/tmp, /usr/local/hadoop/dfs/name, and /usr/local/hadoop/dfs/data:
[root@master hadoop]# mkdir /usr/local/hadoop/tmp
[root@master hadoop]# mkdir /usr/local/hadoop/dfs/name -p
[root@master hadoop]# mkdir /usr/local/hadoop/dfs/data -p 
  3. Change the ownership of /usr/local/hadoop/:
[root@master hadoop]# chown -R hadoop:hadoop /usr/local/hadoop/
  4. Sync the installation to the slave nodes:
[root@master hadoop]# scp -r  /usr/local/hadoop/ root@slave1:/usr/local/
[root@master hadoop]# scp -r  /usr/local/hadoop/ root@slave2:/usr/local/
  5. Configure the Hadoop environment variables on each slave node:
vim /etc/profile  
# append
# hadoop
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
# reload
source /etc/profile
  6. Change the ownership of the Hadoop directory on the slave nodes:
[root@slave1 ~]# chown -R hadoop:hadoop /usr/local/hadoop/
[root@slave2 ~]# chown -R hadoop:hadoop /usr/local/hadoop/ 

7. Switch to the hadoop user on master and the slave nodes:

su - hadoop

Format the NameNode on the master node

Formatting wipes the data on the NameNode. It is only needed before the first start of HDFS; running it again on later starts will cause the DataNodes to be lost. Also, once HDFS has run, Hadoop's working directories contain data; if a re-format really is needed, delete that data first, otherwise problems will follow.

[hadoop@master ~]$ /usr/local/hadoop/bin/hdfs namenode -format
WARNING: /usr/local/hadoop/logs does not exist. Creating.
2021-03-26 19:22:49,107 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/192.168.1.20
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 3.2.2
...(output omitted)
2021-03-27 09:53:13,285 INFO common.Storage: Storage directory /usr/local/hadoop/dfs/name has been successfully formatted.
...(output omitted)
2021-03-26 19:22:50,686 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.1.20
************************************************************/

Start the NameNode:

[hadoop@master hadoop]$ /usr/local/hadoop/sbin/hadoop-daemon.sh start namenode
WARNING: Use of this script to start HDFS daemons is deprecated.
WARNING: Attempting to execute replacement "hdfs --daemon start" instead.

Check the Java processes:

[hadoop@master hadoop]$ jps 
1625 NameNode
1691 Jps

Start the DataNode:

[hadoop@master hadoop]$ /usr/local/hadoop/sbin/hadoop-daemon.sh start datanode
WARNING: Use of this script to start HDFS daemons is deprecated.
WARNING: Attempting to execute replacement "hdfs --daemon start" instead.
[hadoop@master hadoop]$ jps
1825 Jps
1762 DataNode
1625 NameNode

Start the SecondaryNameNode:

[hadoop@master hadoop]$ /usr/local/hadoop/sbin/hadoop-daemon.sh start secondarynamenode
WARNING: Use of this script to start HDFS daemons is deprecated.
WARNING: Attempting to execute replacement "hdfs --daemon start" instead.
[hadoop@master hadoop]$ jps
1762 DataNode
1893 SecondaryNameNode
1926 Jps
1625 NameNode

Check whether the cluster nodes have joined successfully:

[hadoop@master sbin]$ hdfs dfsadmin -report
.........(output omitted)
Live datanodes (1):

Name: 192.168.1.20:9866 (master)
Hostname: master
Decommission Status : Normal
Configured Capacity: 30041706496 (27.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 4121952256 (3.84 GB)
DFS Remaining: 25919746048 (24.14 GB)
DFS Used%: 0.00%
DFS Remaining%: 86.28%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:55 CST 2021
Num of Blocks: 0

Something is wrong: the two slave nodes did not connect, so only one live DataNode is reported.

Fix:

# stop all services with one command
/usr/local/hadoop/sbin/stop-all.sh
# delete the data produced by the earlier format and startup [remove every file inside the following directories]:
/usr/local/hadoop/logs/
/usr/local/hadoop/dfs/data/
/usr/local/hadoop/dfs/name/
/usr/local/hadoop/tmp/
# re-format the NameNode on master, start the services again, and check once more (see the sketch below)
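A minimal sketch of that recovery procedure, run as the hadoop user on master (the same cleanup of logs, dfs/data, dfs/name and tmp would be repeated on slave1 and slave2, since the whole directory tree was copied to them earlier):

```bash
# stop everything
/usr/local/hadoop/sbin/stop-all.sh

# wipe the data left over from the previous format and startup
rm -rf /usr/local/hadoop/logs/* \
       /usr/local/hadoop/dfs/data/* \
       /usr/local/hadoop/dfs/name/* \
       /usr/local/hadoop/tmp/*

# re-format the NameNode and start the services again
/usr/local/hadoop/bin/hdfs namenode -format
/usr/local/hadoop/sbin/start-all.sh

# check the cluster status once more
hdfs dfsadmin -report
```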
[hadoop@master sbin]$ hdfs dfsadmin -report
Configured Capacity: 92271554560 (85.93 GB)
Present Capacity: 81996009472 (76.36 GB)
DFS Remaining: 81995984896 (76.36 GB)
DFS Used: 24576 (24 KB)
DFS Used%: 0.00%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups: 
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 192.168.1.20:9866 (master)
Hostname: master
Decommission Status : Normal
Configured Capacity: 30041706496 (27.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 4121952256 (3.84 GB)
DFS Remaining: 25919746048 (24.14 GB)
DFS Used%: 0.00%
DFS Remaining%: 86.28%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:55 CST 2021
Num of Blocks: 0


Name: 192.168.1.21:9866 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 31114924032 (28.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 3144146944 (2.93 GB)
DFS Remaining: 27970768896 (26.05 GB)
DFS Used%: 0.00%
DFS Remaining%: 89.90%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:49 CST 2021
Num of Blocks: 0


Name: 192.168.1.22:9866 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 31114924032 (28.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 3009445888 (2.80 GB)
DFS Remaining: 28105469952 (26.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.33%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:49 CST 2021
Num of Blocks: 0

Stop the services:

  /usr/local/hadoop/sbin/hadoop-daemon.sh stop secondarynamenode
  /usr/local/hadoop/sbin/hadoop-daemon.sh stop datanode
  /usr/local/hadoop/sbin/hadoop-daemon.sh stop namenode

Start and stop all services with one command:

/usr/local/hadoop/sbin/start-all.sh  # start the Hadoop services (namenode, datanode, secondarynamenode)
/usr/local/hadoop/sbin/stop-all.sh # stop the Hadoop services
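Note that Hadoop 3.x marks start-all.sh/stop-all.sh as deprecated; assuming the same sbin layout, the usual alternative is to manage HDFS and YARN separately:

```bash
# HDFS daemons (NameNode, DataNodes, SecondaryNameNode)
/usr/local/hadoop/sbin/start-dfs.sh
/usr/local/hadoop/sbin/stop-dfs.sh

# YARN daemons (ResourceManager, NodeManagers)
/usr/local/hadoop/sbin/start-yarn.sh
/usr/local/hadoop/sbin/stop-yarn.sh
```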

View the cluster in the web UI:

http://192.168.1.20:50070/


Run Hadoop's wordcount example as a test

First create the /input directory in the HDFS filesystem:

[hadoop@master sbin]$ hdfs dfs -mkdir /input
[hadoop@master sbin]$ hdfs dfs -ls /              
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2021-03-27 16:11 /input
[hadoop@master sbin]$ 

Copy the input data file into the /input directory in HDFS:

[hadoop@master sbin]$ hdfs dfs -put /chenfeng/pzs.log /input 
[hadoop@master sbin]$ hdfs dfs -ls /input 
Found 1 items
-rw-r--r--   3 hadoop supergroup  199205376 2021-03-28 22:31 /input/pzs.log
[hadoop@master sbin]$ 

Viewing it in the browser, the file does not show up, and an error is reported:

`Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error`
This is because javax.activation was removed from the JDK starting with Java 11.

Fix 1:
Download javax.activation from:

https://jar-download.com/?search_box=javax.activation

When downloading, pick an entry with plenty of ratings. The downloaded file is a ZIP archive; extract it and place the jar under hadoop/share/hadoop/common.
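A sketch of applying that fix, assuming the archive unpacks to a jar named javax.activation-api-1.2.0.jar (the actual file name depends on the version chosen on the download page):

```bash
# unpack the downloaded archive (install unzip first if it is missing)
unzip javax.activation.zip -d activation/

# copy the jar into Hadoop's common libraries on every node
cp activation/javax.activation-api-1.2.0.jar /usr/local/hadoop/share/hadoop/common/
scp activation/javax.activation-api-1.2.0.jar root@slave1:/usr/local/hadoop/share/hadoop/common/
scp activation/javax.activation-api-1.2.0.jar root@slave2:/usr/local/hadoop/share/hadoop/common/

# restart the services so the new jar is picked up
/usr/local/hadoop/sbin/stop-all.sh
/usr/local/hadoop/sbin/start-all.sh
```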
Fix 2: switch the Java version to JDK 8:

https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html

This resolved the problem.

Run the wordcount example:
If an /output directory already exists in HDFS, delete it first; otherwise the job cannot create a new /output directory and will fail.

# delete /output if it exists:
hdfs dfs -rm -r -f /output

Start the test:

[hadoop@master sbin]$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.2.jar wordcount /input/pzs.log /output   
2021-03-28 22:56:16,003 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
2021-03-28 22:56:17,202 INFO ipc.Client: Retrying connect to server: master/192.168.1.20:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-03-28 22:56:18,203 INFO ipc.Client: Retrying connect to server: master/192.168.1.20:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-03-28 22:56:19,203 INFO ipc.Client: Retrying connect to server: master/192.168.1.20:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

An error appears:

The message
[Retrying connect to server: master/192.168.1.20:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)]
shows that YARN has not been started. When YARN is running, two kinds of processes come up:
resourcemanager
nodemanager

Start YARN:

[hadoop@master sbin]$ start-yarn.sh
Starting resourcemanager
Starting nodemanagers

Check whether it started successfully:

[hadoop@master sbin]$ jps 
4357 Jps
1750 DataNode
1910 SecondaryNameNode
1630 NameNode

The startup failed: there is no ResourceManager or NodeManager in the jps output.

Also check the logs:

[root@master logs]# tailf hadoop-hadoop-resourcemanager-master.log
[root@master logs]# tailf hadoop-hadoop-nodemanager-master.log

The logs contain output like this:

2021-03-28 23:35:04,775 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: registered UNIX signal handlers for [TERM, HUP, INT]
2021-03-28 23:35:05,212 INFO org.apache.hadoop.conf.Configuration: found resource core-site.xml at file:/usr/local/hadoop/etc/hadoop/core-site.xml
2021-03-28 23:35:05,317 INFO org.apache.hadoop.conf.Configuration: resource-types.xml not found
2021-03-28 23:35:05,317 INFO org.apache.hadoop.yarn.util.resource.ResourceUtils: Unable to find 'resource-types.xml'.
2021-03-28 23:35:05,348 INFO org.apache.hadoop.conf.Configuration: found resource yarn-site.xml at file:/usr/local/hadoop/etc/hadoop/yarn-site.xml
2021-03-28 23:35:05,350 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.RMFatalEventType for class org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMFatalEventDispatcher
2021-03-28 23:35:05,390 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: NMTokenKeyRollingInterval: 86400000ms and NMTokenKeyActivationDelay: 900000ms
2021-03-28 23:35:05,392 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager: ContainerTokenKeyRollingInterval: 86400000ms and ContainerTokenKeyActivationDelay: 900000ms

It reports that resource-types.xml cannot be found.

Cause: Hadoop 3.x needs a number of additional environment variables to be configured.

Solution:

  1. Add the environment variables Hadoop needs to the profile:
vim /etc/profile
# append after the existing java and hadoop entries
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
source /etc/profile

Update the configuration:
Stop the services:

[hadoop@master sbin]$ /usr/local/hadoop/sbin/stop-all.sh
 vim mapred-site.xml 
 # add
 <property>
        <name>mapreduce.application.classpath</name>
        <value>
            /usr/local/hadoop/etc/hadoop,
            /usr/local/hadoop/share/hadoop/common/*,
            /usr/local/hadoop/share/hadoop/common/lib/*,
            /usr/local/hadoop/share/hadoop/hdfs/*,
            /usr/local/hadoop/share/hadoop/hdfs/lib/*,
            /usr/local/hadoop/share/hadoop/mapreduce/*,
            /usr/local/hadoop/share/hadoop/mapreduce/lib/*,
            /usr/local/hadoop/share/hadoop/yarn/*,
            /usr/local/hadoop/share/hadoop/yarn/lib/*
        </value>
    </property>

Start the services:

[hadoop@master sbin]$ /usr/local/hadoop/sbin/start-all.sh

Run the wordcount example again:

[hadoop@master sbin]$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.2.jar wordcount /input/pslstreaming_log1.txt /output
2021-03-30 10:25:36,653 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
2021-03-30 10:25:37,432 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1617071112579_0001
2021-03-30 10:25:37,979 INFO input.FileInputFormat: Total input files to process : 1
2021-03-30 10:25:38,225 INFO mapreduce.JobSubmitter: number of splits:1
2021-03-30 10:25:38,582 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1617071112579_0001
2021-03-30 10:25:38,583 INFO mapreduce.JobSubmitter: Executing with tokens: []
2021-03-30 10:25:38,712 INFO conf.Configuration: resource-types.xml not found
2021-03-30 10:25:38,712 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2021-03-30 10:25:39,091 INFO impl.YarnClientImpl: Submitted application application_1617071112579_0001
2021-03-30 10:25:39,123 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1617071112579_0001/
2021-03-30 10:25:39,123 INFO mapreduce.Job: Running job: job_1617071112579_0001
2021-03-30 10:25:46,264 INFO mapreduce.Job: Job job_1617071112579_0001 running in uber mode : false
2021-03-30 10:25:46,264 INFO mapreduce.Job:  map 0% reduce 0%
2021-03-30 10:25:53,462 INFO mapreduce.Job:  map 100% reduce 0%
2021-03-30 10:25:58,543 INFO mapreduce.Job:  map 100% reduce 100%
2021-03-30 10:25:59,555 INFO mapreduce.Job: Job job_1617071112579_0001 completed successfully
2021-03-30 10:25:59,617 INFO mapreduce.Job: Counters: 54
.........(output omitted)

It is not clear why the message [resource.ResourceUtils: Unable to find 'resource-types.xml'.] still shows up; it is only logged at INFO level, though, and the job now completes successfully.

View the output file:

[hadoop@master sbin]$ hdfs dfs -cat /output/part-r-00000|head 
"",     308
""],    2
"9716168072",   601
"9716168072"},  1
"?arrc=2&linkmode=7",   1
"Count=2        299
"a50_inactive_threshold":       300
"a50_refresh_interval": 119
"a50_state_check_interval":     300
"app_private_data":     299
cat: Unable to write to output stream.
# too much output, only show the first 10 lines; the 'Unable to write to output stream' message is just cat noticing that head closed the pipe

View it in the web UI:

Creating a file from the web page fails with an error; files and directories cannot be created:

`Permission denied: user=dr.who, access=WRITE, inode="/output":hadoop:supergroup:drwxr-xr-x`

Analysis of the problem:
When browsing directories or deleting directories and files from the browser, why does the user show up as dr.who? dr.who is simply the static username Hadoop uses for HTTP access; it has no special meaning, and its default can be seen in core-default.xml:

hadoop.http.staticuser.user=dr.who

It can be changed to the current user by editing core-site.xml:

    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>hadoop</value>
    </property>

In addition, the HDFS default configuration hdfs-default.xml shows that permission checking is enabled by default:

dfs.permissions.enabled=true   # whether permission checking is enabled in HDFS; the default is true

Fix 1:
Change the permissions of the /user directory directly:

hdfs dfs -chmod -R 755 /user
# for some reason this had no effect; this method failed
Fix 2:
Add the following configuration to Hadoop's core-site.xml:

<!-- set the static web user to root -->
<property>
        <name>hadoop.http.staticuser.user</name>
        <value>root</value>
</property>

<!-- disable permission checking -->
<property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
</property>
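For the change to take effect everywhere, the edited core-site.xml has to be copied to the slave nodes and the services restarted; a sketch, run on master:

```bash
# push the updated config to both slaves
scp /usr/local/hadoop/etc/hadoop/core-site.xml root@slave1:/usr/local/hadoop/etc/hadoop/
scp /usr/local/hadoop/etc/hadoop/core-site.xml root@slave2:/usr/local/hadoop/etc/hadoop/

# restart the cluster (as the hadoop user)
/usr/local/hadoop/sbin/stop-all.sh
/usr/local/hadoop/sbin/start-all.sh
```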

Check the result:
Before the permission change (screenshot omitted).
After the permission change, the chenfeng directory can be created (screenshot omitted).
Open 192.168.1.20:8088 to view the jobs running on the YARN cluster (screenshot omitted).


Reposted from blog.csdn.net/qq_40736702/article/details/115217456