Hadoop学习问题记录之基础篇

目的

记录学习hadoop过程中遇到的基础问题,无关大小、无关困扰时间长短。

问题一 全分布式环境中运行mapred程序,报异常:java.net.NoRouteToHostException: 没有到主机的路由

在全分布式环境中运行mapred程序,报异常:java.net.NoRouteToHostException: 没有到主机的路由,但同样的配置、同样的程序,在伪分布式环境中是没有问题的。具体异常信息如下:

2019-09-14 15:37:44,018 INFO mapreduce.Job: Running job: job_1568442070466_0003
2019-09-14 15:43:46,231 INFO mapreduce.Job: Job job_1568442070466_0003 running in uber mode : false
2019-09-14 15:43:46,233 INFO mapreduce.Job:  map 0% reduce 0%
2019-09-14 15:43:46,273 INFO mapreduce.Job: Job job_1568442070466_0003 failed with state FAILED due to: Application application_1568442070466_0003 failed 2 times due to Error launching appattempt_1568442070466_0003_000002. Got exception: java.net.NoRouteToHostException: No Route to Host from  master/192.168.212.132 to slave1:45816 failed on socket timeout exception: java.net.NoRouteToHostException: 没有到主机的路由; For more details see:  http://wiki.apache.org/hadoop/NoRouteToHost
        at sun.reflect.GeneratedConstructorAccessor56.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
        at org.apache.hadoop.ipc.Client.call(Client.java:1457)
        at org.apache.hadoop.ipc.Client.call(Client.java:1367)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy83.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:128)
        at sun.reflect.GeneratedMethodAccessor88.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
        at com.sun.proxy.$Proxy84.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:123)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:308)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.NoRouteToHostException: 没有到主机的路由
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:690)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
        at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:411)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1572)
        at org.apache.hadoop.ipc.Client.call(Client.java:1403)
        ... 19 more
. Failing the application.
2019-09-14 15:43:46,332 INFO mapreduce.Job: Counters: 0
异常信息

问题定位

1、虽然异常信息中只提到了一个slave节点,但已知所有节点配置均一模一样,所以这应该是一个共性问题;

2、根据异常的字面意思,即当前提交工作的master节点找不到slave节点,而找不到无非是以下几种情况:

  • 当前主机的hosts配置错误
    经ping slave1命令测试成功,这个情况可以排除。
  • 目标主机对应端口(45816)未打开
    打开所有slave节点的45816端口后,重新提交一次作业观察运行情况。
    再次提交作业,发现仍然报错,但此时的端口变成了43910,这个端口我们并未处理,自然也是关闭的。由端口的变化,可推测,每次提交作业调度运行时,用到的IPC端口是框架随机使用的,所以再用打开某个端口的方式来解决这个问题已经明显不可取了。
    解决思路有两个:1、为slave节点添加对master节点的IP级别允许通过;2、使框架用到的ipc端口固定,然后用允许端口的方式开放。(暂未找到这种配置方法)
    在使用解决思路1,使用以下命令开放给master节点访问:
    firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.212.132" accept'
    firewall-cmd --reload

再次提交作业,发现作业已经可以运行起来,但仍有NoRouteToHostException报错,但并不影响mapreduce程序的执行。推测是由于网络波动、或者虚拟机的局限性导致的间歇性丢失目标主机,有哪位大神知道,还望指点一二,这里先不深究了。错误信息如下:

2019-09-14 17:31:55,733 INFO mapreduce.Job: Task Id : attempt_1568452715996_0001_m_000013_2, Status : FAILED
Container launch failed for container_1568452715996_0001_01_000056 : java.net.NoRouteToHostException: No Route to Host from  slave3/192.168.212.135 to slave1:42387 failed on socket timeout exception: java.net.NoRouteToHostException: 没有到主机的路由; For more details see:  http://wiki.apache.org/hadoop/NoRouteToHost
        at sun.reflect.GeneratedConstructorAccessor39.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
        at org.apache.hadoop.ipc.Client.call(Client.java:1457)
        at org.apache.hadoop.ipc.Client.call(Client.java:1367)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy84.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:128)
        at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
        at com.sun.proxy.$Proxy85.startContainers(Unknown Source)
        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:160)
        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:394)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.NoRouteToHostException: 没有到主机的路由
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:690)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
        at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:411)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1572)
        at org.apache.hadoop.ipc.Client.call(Client.java:1403)
        ... 19 more
异常信息之一

猜你喜欢

转载自www.cnblogs.com/duanzi6/p/11519701.html
今日推荐