Spark error summary

1. Escape characters need to be added: regex metacharacters such as `[` must be escaped before being passed to methods like `String.split`, which interpret their argument as a regular expression.
java.util.regex.PatternSyntaxException: Unclosed character class near index 0
java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
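Both exceptions typically appear when a metacharacter like `[` reaches the regex engine unescaped. A minimal sketch of the fix (the sample input string is made up):

```scala
import java.util.regex.Pattern

object RegexEscapeDemo {
  def main(args: Array[String]): Unit = {
    val line = "a[b[c"
    // line.split("[") would throw:
    //   PatternSyntaxException: Unclosed character class near index 0
    val escaped = line.split("\\[")              // escape the metacharacter by hand
    val quoted  = line.split(Pattern.quote("[")) // or let Pattern.quote do it
    assert(escaped.sameElements(Array("a", "b", "c")))
    assert(quoted.sameElements(Array("a", "b", "c")))
    println(escaped.mkString(","))
  }
}
```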

2. Kafka data was not consumed in time and has been lost or expired; the requested offset is outside the range the topic still retains. The maxRatePerPartition value may be set too small. [https://blog.csdn.net/yxgxy270187133/article/details/53666760]
org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions: {newsfeed-100-content-docidlog-1=103944288}
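One way to apply the fix above is to raise the per-partition read rate so the consumer keeps up before the retention window expires. A sketch for Spark Streaming's direct Kafka source (the rate value is illustrative, not a recommendation):

```scala
import org.apache.spark.SparkConf

// Raise how fast the direct Kafka stream may read per partition, so
// consumption keeps up before the broker expires the data.
val conf = new SparkConf()
  .setAppName("kafka-consume")
  .set("spark.streaming.kafka.maxRatePerPartition", "1000") // records/partition/second
```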


3. Memory parameters are too small; increase them, e.g. --executor-memory 8G \ --driver-memory 8G \
Application application_1547156777102_0243 failed 2 times due to AM Container for appattempt_1547156777102_0243_000002 exited with exitCode: -104
For more detailed output, check the application tracking page:https://host-10-11-11-11:26001/cluster/app/application_1547156777102_0243 Then click on links to logs of each attempt.
Diagnostics: Container [pid=5064,containerID=container_e62_1547156777102_0243_02_000001] is running beyond physical memory limits. Current usage: 4.6 GB of 4.5 GB physical memory used; 6.3 GB of 22.5 GB virtual memory used. Killing container.


4. A value is used before the line that defines it
forward reference extends over definition of value xxx
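In Scala this error appears when a statement inside a block reads a val that is only defined further down the same block; reordering the definitions (or making the later definition a `def` or `lazy val`) fixes it. A minimal sketch:

```scala
object ForwardRefDemo {
  def compute(): Int = {
    // This order would not compile:
    //   val total = base + 1  // error: forward reference extends over definition of value base
    //   val base  = 41
    val base  = 41   // define the value first
    val total = base + 1
    total
  }
}
```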

*******************************************************************
https://blog.csdn.net/appleyuchi/article/details/81633335
In a POM, `provided` scope means the dependency is needed at compile time but not at deployment: when we submit with spark-submit, the cluster supplies the spark-streaming package, but when running from IntelliJ the streaming package is still needed at runtime, so `provided` must be removed.
1. Solution: for local runs, remove `<scope>provided</scope>` and reimport the Maven projects
java.lang.ClassNotFoundException: org.apache.spark.SparkConf
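The relevant POM fragment looks like this (artifact and version here are illustrative); with `provided` in place the code compiles but Spark is absent from the IDE's runtime classpath, which produces the ClassNotFoundException above:

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_2.11</artifactId>
  <version>2.4.0</version>
  <!-- keep "provided" for spark-submit (the cluster supplies Spark);
       remove or comment it out when running from IntelliJ -->
  <scope>provided</scope>
</dependency>
```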

2.

[ERROR] E:\git3_commit2\hello\hello\src\main\scala\com\hello\rcm\hello
\textcontent\hello.scala:206: error: No org.json4s.Formats found. Try
to bring an instance of org.json4s.Formats in scope or use the org.json4s.DefaultFormats.
[INFO] val str = write(map)

Add:
implicit val formats: DefaultFormats = DefaultFormats
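In context the fix looks like this (a sketch assuming json4s-native on the classpath; the map contents are made up):

```scala
import org.json4s.DefaultFormats
import org.json4s.native.Serialization.write

// Without this implicit in scope, write(...) fails with
// "No org.json4s.Formats found".
implicit val formats: DefaultFormats = DefaultFormats

val map = Map("id" -> "doc1", "score" -> "0.9")
val str = write(map) // serializes the map to a JSON string
```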

3.
3. Spark 2.0 DataFrame map operation: "Unable to find encoder for type stored in a Dataset", analysis and solution

The problem is mainly the dataframe.map operation: it compiled under Spark 1.x but no longer does under Spark 2.0, because map on a DataFrame now requires an Encoder; changing it to dataframe.rdd.map works.
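In Spark 2.x a DataFrame is a Dataset[Row], and Dataset.map needs an implicit Encoder for its result type; dropping to the RDD API sidesteps that requirement. A sketch (assumes an active SparkSession `spark` and a DataFrame `df` whose first column is a string; both names are hypothetical):

```scala
import spark.implicits._ // brings Encoders into scope for Dataset.map

// Option 1: with the implicits imported, Dataset.map compiles again.
val names1 = df.map(row => row.getString(0))

// Option 2: drop to the RDD API, which needs no Encoder.
val names2 = df.rdd.map(row => row.getString(0))
```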


4.
https://blog.csdn.net/someby/article/details/90715799

When converting a DataFrame to a Dataset, the implicit conversions must be imported, and the case class must be defined at the top level (as a global definition, not inside a method).
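A sketch of that conversion (assumes an active SparkSession `spark` and a DataFrame `df` with matching columns; the Person class is made up):

```scala
// The case class must live at top level (not inside a method),
// otherwise Spark cannot derive an Encoder for it.
case class Person(name: String, age: Int)

// Importing the session's implicits supplies the Encoder for Person.
import spark.implicits._
val people = df.as[Person] // Dataset[Person]
```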


https://stackoverflow.com/questions/30033043/hadoop-job-fails-resource-manager-doesnt-recognize-attemptid/30391973#30391973


The same SQL statement gives inconsistent results in Spark SQL and in the hive shell
https://blog.csdn.net/HappyLin0x29a/article/details/88557168
[To optimize reading of the Parquet file format, Spark by default uses its own Parquet reader instead of Hive's; the data read that way can be wrong, so setting the configuration item spark.sql.hive.convertMetastoreParquet to false fixes it]
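The configuration item can be set when the session is built (a sketch; it can equally be passed with --conf on spark-submit):

```scala
// Read Hive-metastore Parquet tables through Hive's serde instead of
// Spark's built-in Parquet reader.
val spark = org.apache.spark.sql.SparkSession.builder()
  .config("spark.sql.hive.convertMetastoreParquet", "false")
  .enableHiveSupport()
  .getOrCreate()
```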


Spark closures and serialization
https://blog.csdn.net/bluishglc/article/details/50945032

Fields marked @transient are not serialized
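The effect is easy to demonstrate with plain JVM serialization (the Holder class is made up for illustration):

```scala
import java.io._

// The @transient field is skipped by Java serialization and comes back null.
class Holder(val id: Int, @transient val cache: Array[Int]) extends Serializable

object TransientDemo {
  def roundTrip(h: Holder): Holder = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(h); out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    in.readObject().asInstanceOf[Holder]
  }

  def main(args: Array[String]): Unit = {
    val copy = roundTrip(new Holder(7, Array(1, 2, 3)))
    assert(copy.id == 7)       // ordinary field survives serialization
    assert(copy.cache == null) // @transient field does not
  }
}
```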


6. A subclass inherits from a parent class and overrides a member variable;
without lazy initialization, resultArray may not pick up the overridden fruitName
class A {
  lazy val fruitName = "apple"
  lazy val resultArray = Array(fruitName, "2")
}

class B extends A{
override lazy val fruitName="orange"
}
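The difference is easy to check (class names here are made up for the demonstration): with strict vals the parent constructor builds resultArray before the subclass override is initialized, so it captures null; lazy vals defer evaluation until after construction, so the override is seen.

```scala
// Without lazy: StrictA's constructor evaluates resultArray while the
// overriding field in StrictB is still uninitialized, so it captures null.
class StrictA {
  val fruitName = "apple"
  val resultArray = Array(fruitName, "2")
}
class StrictB extends StrictA { override val fruitName = "orange" }

// With lazy: evaluation is deferred until first access, after both
// constructors have run, so the override is visible.
class LazyA {
  lazy val fruitName = "apple"
  lazy val resultArray = Array(fruitName, "2")
}
class LazyB extends LazyA { override lazy val fruitName = "orange" }

object InitOrderDemo {
  def main(args: Array[String]): Unit = {
    assert(new StrictB().resultArray(0) == null)   // override not yet initialized
    assert(new LazyB().resultArray(0) == "orange") // lazy sees the override
  }
}
```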


7.
MetadataFetchFailedException: Missing an output location for shuffle


Origin www.cnblogs.com/ShyPeanut/p/11798913.html