This article records various problems encountered in Spark and their solutions (it will continue to be updated).
1.File does not exist. Holder DFSClient_NONMAPREDUCE_-67513653_1 does not have any open files
This morning I found the program reporting this error. I have not found the cause yet, so I am recording it here first.
2.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions
This error is reported because the start offset is out of range; just add a check on the offset before using it.
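One way to add such a check is to compare the saved offset with the earliest and latest offsets the broker still holds, and clamp it into that range. The sketch below assumes those two bounds are already known. The names OffsetGuard and clamp are hypothetical; in a real job the bounds could be fetched with KafkaConsumer.beginningOffsets and KafkaConsumer.endOffsets. Setting the consumer property auto.offset.reset to earliest or latest (instead of none) is another common way to give Kafka a reset policy.

```java
/** Sketch: keep a requested start offset inside the range the broker still holds. */
final class OffsetGuard {
    private OffsetGuard() {}

    /**
     * Returns the offset to start consuming from for one partition.
     * earliest and latest are assumed inputs (hypothetical here); they stand for
     * the first and last offsets the broker currently reports for that partition.
     */
    static long clamp(long requested, long earliest, long latest) {
        if (earliest > latest) {
            throw new IllegalArgumentException("earliest must not exceed latest");
        }
        if (requested < earliest) {
            return earliest; // the saved offset points at data that no longer exists
        }
        if (requested > latest) {
            return latest;   // the saved offset is ahead of the end of the log
        }
        return requested;    // already in range, use it unchanged
    }

    public static void main(String[] args) {
        System.out.println(clamp(5L, 10L, 20L));
        System.out.println(clamp(15L, 10L, 20L));
        System.out.println(clamp(25L, 10L, 20L));
    }
}
```

Clamping silently skips or replays data, so whether to clamp, fail fast, or alert is a choice that depends on the job.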
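3.Cannot use map-side combining with array keys
This error is reported because map was used where flatMap should have been used. The difference between the two matters here: map produces exactly one output element per input element, so a function that returns an array yields arrays as elements, while flatMap flattens the returned collections into individual elements.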
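The difference can be sketched without Spark, using plain Java streams as a stand-in for RDD operations. This is only an illustration: the class ArrayKeyDemo and its methods are hypothetical. It also shows why arrays make poor keys: two arrays with the same contents are not equal to each other, which is presumably why Spark refuses to combine by array keys.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Sketch: map keeps whole arrays as elements, flatMap flattens them into words. */
final class ArrayKeyDemo {
    private ArrayKeyDemo() {}

    /** map: one output element per input line, so each element is a String[]. */
    static List<String[]> splitWithMap(List<String> lines) {
        return lines.stream()
                .map(line -> line.split(" "))
                .collect(Collectors.toList());
    }

    /** flatMap: the per-line arrays are flattened, so each element is one word. */
    static List<String> splitWithFlatMap(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a b", "a b");
        List<String[]> arrays = splitWithMap(lines);
        // Arrays compare by identity, so equal contents still count as different keys.
        System.out.println(arrays.get(0).equals(arrays.get(1)));
        System.out.println(splitWithFlatMap(lines));
    }
}
```

If an array really is the intended key, converting it to a List or a String before grouping gives it value-based equality.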
4.KafkaConsumer is not safe for multi-threaded access
This error is actually caused by a bug in Spark. It appears in Spark 2.1.0 and Spark 2.2.0 and has been fixed in 2.4.0.
Issue: https://issues.apache.org/jira/browse/SPARK-23636
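For context, the message comes from a guard inside the Kafka consumer that rejects calls from a second thread while another thread is still using it. The following is a minimal sketch of that kind of single-owner guard, written for illustration only. It is not Kafka's source code, and the class SingleThreadGuard with its acquire and release methods is hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative single-owner guard; a sketch, not Kafka's actual implementation. */
final class SingleThreadGuard {
    // Thread that currently owns the resource, or null when it is free.
    private final AtomicReference<Thread> owner = new AtomicReference<>();
    // Nesting depth of acquire calls; only touched by the owning thread.
    private int depth = 0;

    /** Claims the resource for the calling thread or fails if another thread holds it. */
    void acquire() {
        Thread me = Thread.currentThread();
        if (owner.get() != me && !owner.compareAndSet(null, me)) {
            throw new ConcurrentModificationException(
                    "resource is not safe for multi-threaded access");
        }
        depth++;
    }

    /** Releases one level of ownership; the resource is free once depth reaches zero. */
    void release() {
        if (owner.get() != Thread.currentThread()) {
            throw new IllegalStateException("release called by a thread that is not the owner");
        }
        if (--depth == 0) {
            owner.set(null);
        }
    }
}
```

With a guard like this, two Spark tasks that end up sharing one cached consumer would trigger the exception as soon as their calls overlap, which matches the symptom described above.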
5.java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/StringDeserializer
This error is reported because a jar is missing from the classpath; just add the following jar to the project's dependencies.