A problem with Oozie job scheduling: SQOOP from Oracle "Connection reset" error (importing data from Oracle into HDFS)

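Recently I have been using Sqoop to import data from an Oracle database into HDFS. Run serially, the imports worked without any problems. To make the best use of the cluster's resources, though, I wanted the imports to run in parallel.
Run in parallel, the imports sometimes succeeded and sometimes failed, and a system this unstable is a real headache. The error looked like this: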

18/10/29 15:01:03 ERROR manager.SqlManager: Error executing statement: java.sql.SQLRecoverableException: IO Error: Connection reset
java.sql.SQLRecoverableException: IO Error: Connection reset
  at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:682)
  at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:711)
  at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:385)
  at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:30)
  at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:558)
  at java.sql.DriverManager.getConnection(DriverManager.java:571)
  at java.sql.DriverManager.getConnection(DriverManager.java:215)
  at org.apache.sqoop.manager.OracleManager.makeConnection(OracleManager.java:329)
  at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
  at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:763)
  at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786)
  at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:289)
  at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:260)
  at org.apache.sqoop.manager.SqlManager.getColumnTypesForQuery(SqlManager.java:253)
  at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:336)
  at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1858)
  at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1657)
  at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106)
  at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:494)
  at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
  at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
  at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
  at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
  at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
  at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.net.SocketException: Connection reset
  at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
  at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
  at oracle.net.ns.DataPacket.send(DataPacket.java:209)
  at oracle.net.ns.NetOutputStream.flush(NetOutputStream.java:215)
  at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:302)
  at oracle.net.ns.NetInputStream.read(NetInputStream.java:249)
  at oracle.net.ns.NetInputStream.read(NetInputStream.java:171)
  at oracle.net.ns.NetInputStream.read(NetInputStream.java:89)
  at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:123)
  at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:79)
  at oracle.jdbc.driver.T4CMAREngineStream.unmarshalUB1(T4CMAREngineStream.java:426)
  at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:390)
  at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:249)
  at oracle.jdbc.driver.T4CTTIoauthenticate.doOSESSKEY(T4CTTIoauthenticate.java:435)
  at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:580)
  ... 25 more
18/10/29 15:01:03 ERROR tool.ImportTool: Import failed: java.io.IOException: No columns to generate for ClassWriter
  at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1663)
  at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106)
  at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:494)
  at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
  at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
  at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
  at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
  at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
  at org.apache.sqoop.Sqoop.main(Sqoop.java:252)

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]

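Seeing this, my first thought was that the Oracle driver was to blame, so I swapped the driver JAR, first replacing ojdbc6.jar and then moving to ojdbc7.jar, but the problem persisted. I then searched websites outside China.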

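There I found someone with the same problem, along with a suggested fix.

The gist of that answer is that the host where the map tasks run lacks a fast random number generation device. The official Sqoop documentation says the same thing: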

Solution: This problem occurs primarily due to the lack of a fast random number generation device on the host where the map tasks execute. On typical Linux systems this can be addressed by setting the following property in the java.security file:
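The quoted sentence stops before the property itself. As far as I can tell from the Sqoop user guide linked in the references, the line it refers to is `securerandom.source=file:/dev/../dev/urandom`, and the java.security file lives under `$JAVA_HOME/jre/lib/security`. To see what a given JVM is currently configured with, a small helper along these lines might do; the function name `securerandom_source` is my own and not part of Sqoop or the JDK:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Sqoop or the JDK): print the value of
# securerandom.source from a java.security file.
# Usage: securerandom_source /path/to/java.security
securerandom_source() {
    # Skip commented-out lines and keep only the last assignment in the file.
    grep -E '^[[:space:]]*securerandom\.source[[:space:]]*=' "$1" \
        | tail -n 1 \
        | sed -E 's/^[^=]*=[[:space:]]*//'
}
```

For example, `securerandom_source "$JAVA_HOME/jre/lib/security/java.security"` prints the configured entropy source, assuming the file is at that location on your hosts.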

So I added these two lines to the script that runs the import:

export HADOOP_OPTS=-Djava.security.egd=file:/dev/../dev/urandom
sqoop import -D mapred.child.java.opts="-Djava.security.egd=file:/dev/../dev/urandom"

With that, my problem was solved, and the whole parallel Sqoop script ran successfully.
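For anyone who wants to see how the two settings fit into a parallel script, here is a minimal bash sketch. It is not my actual script: `import_table` and `run_parallel` are hypothetical helper names, and `ORACLE_URL`, `ORACLE_USER`, `ORACLE_PASSWORD_FILE` and `TARGET_ROOT` are placeholders you would set yourself.

```shell
#!/usr/bin/env bash
# Point the launcher JVM at the non-blocking entropy source (setting from the post).
export HADOOP_OPTS="-Djava.security.egd=file:/dev/../dev/urandom"

# The same setting for the map-task JVMs, passed to Sqoop as a generic -D option.
CHILD_OPTS='-Djava.security.egd=file:/dev/../dev/urandom'

# import_table TABLE: run one Sqoop import (hypothetical helper).
# ORACLE_URL, ORACLE_USER, ORACLE_PASSWORD_FILE and TARGET_ROOT are
# placeholders that the caller must set before use.
import_table() {
    sqoop import -D mapred.child.java.opts="$CHILD_OPTS" \
        --connect "$ORACLE_URL" \
        --username "$ORACLE_USER" \
        --password-file "$ORACLE_PASSWORD_FILE" \
        --table "$1" \
        --target-dir "$TARGET_ROOT/$1"
}

# run_parallel CMD ARG...: start CMD once per ARG in the background,
# wait for every job, and return 1 if any of them failed.
run_parallel() {
    local cmd=$1; shift
    local pids=() failed=0 arg pid
    for arg in "$@"; do
        "$cmd" "$arg" &
        pids+=("$!")
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || failed=1
    done
    return "$failed"
}
```

Calling `run_parallel import_table TABLE_A TABLE_B` (table names made up) would start both imports at once and report failure if either one fails, which makes intermittent errors like the one above easy to spot from the Oozie shell action's exit code.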

Reference articles:

http://stackoverflow.com/questions/2327220/oracle-jdbc-intermittent-connection-issue/

https://community.oracle.com/thread/943911?tstart=0&messageID=3793101

https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_oracle_connection_reset_errors


Reposted from www.cnblogs.com/gxgd/p/9884506.html