Hadoop error summary 01: https://blog.csdn.net/qq_19968255/article/details/82803768
1. A script failed at run time with the following error:
Examining task ID: task_201201061122_0007_m_000002 (and more) from job job_201201061122_0007
Exception in thread "Thread-23" java.lang.RuntimeException: Error while reading from task log url
at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: http://10.200.187.27:50060/tasklog?taskid=attempt_201201061122_0007_m_000000_2&start=-8193
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
at java.net.URL.openStream(URL.java:1010)
at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
... 3 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Copy the tasklog URL (http://xxx:50060/tasklog?taskid=attempt_201201061122_0007_m_000000_2&start=-8193) into the IE browser's address bar, and the following message appears:
Running the job produces a Java heap error. Taken literally, a heap allocation error means the memory allocated dynamically on the heap ran out. So how large should the NameNode's memory be?
The NameNode manages the metadata of every file in the cluster, so it is not realistic to give a precise formula for the required memory based on the file information.
The NameNode's default heap size is 1000 MB, which is sufficient for a few million files; a conservative rule of thumb is 1000 MB of memory per million blocks.
For example, consider this scenario: a cluster of 200 nodes, each node with 24 TB of disk, a Hadoop block size of 128 MB, and 3 replicas of every block, for over ten million blocks in total. Roughly how much memory is needed?
First, calculate the number of blocks:
(200 × 24,000,000 MB) / (128 MB × 3) = 12,500,000
Then conservatively estimate the memory required:
12,500,000 × 1000 MB / 1,000,000 = 12,500 MB
From the result above, setting the NameNode's memory on the order of 12,500 MB is sufficient.
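The arithmetic above can be reproduced in a few lines of shell; all figures come from the scenario in the text, and the 1000 MB-per-million-blocks ratio is the conservative rule of thumb mentioned earlier:

```shell
# NameNode heap estimate: 200 nodes x 24 TB each, 128 MB blocks, 3 replicas.
total_mb=$((200 * 24000000))           # raw cluster capacity in MB
blocks=$((total_mb / (128 * 3)))       # unique blocks after 3-way replication
heap_mb=$((blocks * 1000 / 1000000))   # 1000 MB of heap per million blocks
echo "blocks=$blocks heap_mb=$heap_mb" # blocks=12500000 heap_mb=12500
```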
Having calculated the approximate value, how is it set?
In the Hadoop configuration file hadoop-env.sh there is an option HADOOP_NAMENODE_OPTS, which sets the JVM memory size, for example:
HADOOP_NAMENODE_OPTS=-Xmx2000m
This assigns 2000 MB of heap to the NameNode.
If you change the NameNode's memory size, the SecondaryNameNode's memory should be changed to match; its option is HADOOP_SECONDARYNAMENODE_OPTS.
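Putting both options together in hadoop-env.sh might look like the following sketch; the -Xmx value of 12500m is only illustrative, taken from the estimate worked out above:

```shell
# conf/hadoop-env.sh -- give the NameNode and SecondaryNameNode matching heaps.
export HADOOP_NAMENODE_OPTS="-Xmx12500m"
export HADOOP_SECONDARYNAMENODE_OPTS="-Xmx12500m"
```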
Sqoop: The driver has not received any packets from the server
list-tables and list-databases both work, but import fails. Presumably the map tasks are distributed to the other two Hadoop nodes, which then connect to MySQL themselves, so this is most likely a MySQL permission issue.
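If it really is a grants problem, allowing the worker hosts to connect is the usual fix. A hedged sketch, in which the user, password, database, and host pattern are all placeholders (MySQL 5.x syntax; MySQL 8 requires a separate CREATE USER first):

```sql
-- Allow the map tasks' hosts to reach the database, not just localhost.
GRANT ALL PRIVILEGES ON mydb.* TO 'sqoop'@'%' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;
```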
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2.jdbc.url=jdbc:mysql://localhost:3306/totosea?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&failOverReadOnly=false
autoReconnect
Whether to reconnect automatically when the database connection is broken.
failOverReadOnly
Whether the connection is set to read-only after a successful automatic reconnection.
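The same parameters can be appended to any MySQL JDBC URL, for instance a Sqoop connect string; note that the & separators mean the URL must be quoted on the shell command line. A sketch with placeholder host, database, and credentials:

```shell
sqoop list-tables \
  --connect 'jdbc:mysql://localhost:3306/totosea?autoReconnect=true&failOverReadOnly=false' \
  --username root --password 123
```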
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
3. Hive reserved keyword support
Failed to recognize predicate 'date'. Failed rule: 'identifier' in column specification
Either avoid using the reserved word as an identifier, or disable reserved-keyword support in conf/hive-site.xml:
<property>
<name>hive.support.sql11.reserved.keywords</name>
<value>false</value>
</property>
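Disabling reserved-word checking affects the whole installation; an alternative that needs no configuration change is to quote the identifier with backticks. The table and column names below are illustrative:

```sql
-- Backticks let a reserved word such as `date` be used as a column name.
CREATE TABLE events (`date` STRING, cnt INT);
SELECT `date`, cnt FROM events;
```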
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
4. Solution for "Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep"
14/03/26 23:10:04 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/03/26 23:10:05 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/03/26 23:10:06 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/03/26 23:10:07 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
While using Sqoop to export a Hive table into MySQL, this retry message kept appearing. According to information found online, the problem is an HDFS path that was not specified rigorously enough.
Incorrect command:
sqoop export --connect jdbc:mysql://c6h2:3306/log --username root --password 123 --table dailylog --fields-terminated-by '\001' --export-dir '/user/hive/warehouse/weblog_2013_05_30'
Solution:
sqoop export --connect jdbc:mysql://c6h2:3306/log --username root --password 123 --table dailylog --fields-terminated-by '\001' --export-dir 'hdfs://cluster1:port/user/hive/warehouse/weblog_2013_05_30'
Here the hdfs:// scheme and cluster name are added; my setup is a Hadoop 2 HA cluster.
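Port 10020 is also the default mapreduce.jobhistory.address, so the same retry loop can appear when the MapReduce JobHistory server is unreachable. If the path fix alone does not help, it may be worth checking that the daemon is running and that mapred-site.xml points at it; the hostname below is a placeholder:

```xml
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>cluster1:10020</value>
</property>
```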
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
5. DFSClient: Caught exception
19-10-2018 05:52:19 CST import_initdayuser INFO - java.lang.InterruptedException
19-10-2018 05:52:19 CST import_initdayuser INFO - at java.lang.Object.wait(Native Method)
19-10-2018 05:52:19 CST import_initdayuser INFO - at java.lang.Thread.join(Thread.java:1281)
19-10-2018 05:52:19 CST import_initdayuser INFO - at java.lang.Thread.join(Thread.java:1355)
19-10-2018 05:52:19 CST import_initdayuser INFO - at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:609)
19-10-2018 05:52:19 CST import_initdayuser INFO - at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:370)
19-10-2018 05:52:19 CST import_initdayuser INFO - at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:546)
This is caused by an incompatibility in the native libraries and is currently considered a Hadoop bug; there is no solution, but the exception is harmless and can be ignored.