Hadoop error summary 02

Disclaimer: This is an original article by the blogger (Mang lm) and may not be reproduced without the blogger's permission. https://blog.csdn.net/qq_19968255/article/details/87188510

Hadoop error summary 01: https://blog.csdn.net/qq_19968255/article/details/82803768

1. The following error appears when running a script:

Examining task ID: task_201201061122_0007_m_000002 (and more) from job job_201201061122_0007

Exception in thread "Thread-23" java.lang.RuntimeException: Error while reading from task log url

at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)

at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)

at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)

at java.lang.Thread.run(Thread.java:662)

Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: http://10.200.187.27:50060/tasklog?taskid=attempt_201201061122_0007_m_000000_2&start=-8193

at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)

at java.net.URL.openStream(URL.java:1010)

at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)

... 3 more

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

MapReduce Jobs Launched:

 

Copy http://xxx:50060/tasklog?taskid=attempt_201201061122_0007_m_000000_2&start=-8193 into a browser's address bar, and the task log shows a Java heap error.

Taken literally, a heap allocation error means that a request for dynamically allocated memory on the heap could not be satisfied, i.e. there was not enough memory available. So how large should the namenode's memory be?

The namenode manages the metadata of every file in the cluster, so it is not realistic to give a precise formula that computes the required memory from the file information.

The namenode's default heap size in Hadoop is 1000 MB, which is enough for several million files; a conservative rule of thumb is 1000 MB of memory per million blocks.

For example, consider a cluster of 200 nodes, each with 24 TB of disk, a block size of 128 MB, and 3 replicas, giving a total of roughly two million blocks or more. How much memory is needed, approximately?

First calculate how many blocks there are:

(200 * 24,000,000 MB) / (128 MB * 3) = 12,500,000

Then make a conservative estimate of the memory required:

12,500,000 * 1000 MB / 1,000,000 = 12,500 MB

From the result above, setting the namenode memory at roughly this level, around 12,000 MB, is sufficient.
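The same rule-of-thumb estimate can be reproduced with a small shell calculation. This is only a minimal sketch using the example figures above; the variable names are mine, so adjust the values for your own cluster.

NODES=200
DISK_MB_PER_NODE=24000000                                        # 24 TB per node, expressed in MB
BLOCK_MB=128
REPLICAS=3
BLOCKS=$(( NODES * DISK_MB_PER_NODE / (BLOCK_MB * REPLICAS) ))   # 12,500,000 blocks
HEAP_MB=$(( BLOCKS * 1000 / 1000000 ))                           # 1000 MB per million blocks => 12,500 MB
echo "blocks=$BLOCKS namenode_heap_mb=$HEAP_MB"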

After calculating the approximate value, how do you set it?

The Hadoop configuration file hadoop-env.sh has an option HADOOP_NAMENODE_OPTS, which sets the JVM heap size for the namenode. For example:

HADOOP_NAMENODE_OPTS=-Xmx2000m 

This assigns 2000 MB of heap space to the namenode.

If you change the namenode's memory size, the secondarynamenode's memory should be changed to the same value; its option is HADOOP_SECONDARYNAMENODE_OPTS, as in the sketch below.
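Keeping the two in sync in hadoop-env.sh might look like the following sketch, which reuses the 2000 MB figure from the example above; substitute the value you calculated for your cluster.

# hadoop-env.sh
# Give the namenode and the secondarynamenode the same JVM heap size.
export HADOOP_NAMENODE_OPTS="-Xmx2000m"
export HADOOP_SECONDARYNAMENODE_OPTS="-Xmx2000m"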

sqoop: The driver has not received any packets from the server

Running list-tables and list-databases works fine, but import fails. My guess is that the map tasks are distributed to the other two Hadoop nodes, which then connect to MySQL themselves, so this is most likely a MySQL permission issue.
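If it really is a permission problem, granting the Sqoop user access from the worker hosts usually fixes it. The following is only a sketch; the user name, password, database, and host pattern are placeholders rather than values from my setup.

# Run on the MySQL server (MySQL 5.x syntax); tighten the '%' host pattern if possible.
mysql -u root -p -e "GRANT ALL PRIVILEGES ON mydb.* TO 'sqoop_user'@'%' IDENTIFIED BY 'secret'; FLUSH PRIVILEGES;"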

------------------------------------------------------------------------------------------------------------------------------------------------------------------------

2. jdbc.url=jdbc:mysql://localhost:3306/totosea?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&failOverReadOnly=false

autoReconnect: when the database connection is broken, should the driver automatically reconnect?

failOverReadOnly: after an automatic reconnection succeeds, should the connection be set to read-only?
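To check that such a URL works before running a full import, one option is sqoop eval. This is a sketch reusing the example URL above; quote the string so the shell does not interpret the & characters.

# -P prompts for the password instead of putting it on the command line.
sqoop eval --connect 'jdbc:mysql://localhost:3306/totosea?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&failOverReadOnly=false' --username root -P --query 'SELECT 1'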

------------------------------------------------------------------------------------------------------------------------------------------------------------------------

3. Hive reserved keyword support

Failed to recognize predicate 'date'. Failed rule: 'identifier' in column specification

Either avoid using the reserved word as an identifier, or turn off reserved-keyword support:

In conf/hive-site.xml:

<property>

    <name>hive.support.sql11.reserved.keywords</name>

    <value>false</value>

</property>
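The same setting can also be passed on the command line instead of editing hive-site.xml. A sketch only; my_table is a placeholder table name.

hive --hiveconf hive.support.sql11.reserved.keywords=false -e 'SELECT date FROM my_table LIMIT 10;'

Alternatively, a column whose name collides with a reserved word can be quoted with backticks inside the query.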

------------------------------------------------------------------------------------------------------------------------------------------------------------------------

4. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep (how to solve it)

14/03/26 23:10:04 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

14/03/26 23:10:05 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

14/03/26 23:10:06 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

14/03/26 23:10:07 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

While using Sqoop to export a Hive table to MySQL, the retry messages above kept appearing. After searching online, the cause turned out to be that the HDFS path passed to --export-dir was not specified rigorously enough.

Incorrect command:

sqoop export --connect jdbc:mysql://c6h2:3306/log --username root --password 123 --table dailylog --fields-terminated-by '\001' --export-dir '/user/hive/warehouse/weblog_2013_05_30'

Corrected command:

sqoop export --connect jdbc:mysql://c6h2:3306/log --username root --password 123 --table dailylog --fields-terminated-by '\001' --export-dir 'hdfs://cluster1:<port>/user/hive/warehouse/weblog_2013_05_30'

The fix is to add the hdfs:// scheme and the cluster name; my cluster here is a Hadoop 2 HA setup.
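If you are unsure which prefix to use, the configured default filesystem URI can be read from the client configuration. A sketch; hdfs getconf is available in Hadoop 2.x.

# Prints the value of fs.defaultFS, e.g. hdfs://cluster1
hdfs getconf -confKey fs.defaultFS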

------------------------------------------------------------------------------------------------------------------------------------------------------------------------

5. DFSClient: Caught exception

19-10-2018 05:52:19 CST import_initdayuser INFO - java.lang.InterruptedException

19-10-2018 05:52:19 CST import_initdayuser INFO -    at java.lang.Object.wait(Native Method)

19-10-2018 05:52:19 CST import_initdayuser INFO -    at java.lang.Thread.join(Thread.java:1281)

19-10-2018 05:52:19 CST import_initdayuser INFO -    at java.lang.Thread.join(Thread.java:1355)

19-10-2018 05:52:19 CST import_initdayuser INFO -    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:609)

19-10-2018 05:52:19 CST import_initdayuser INFO -    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:370)

19-10-2018 05:52:19 CST import_initdayuser INFO -    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:546)

This is caused by incompatible native libraries and is currently considered a Hadoop bug; there is no fix, and the message can be ignored.

 
