Usually, we package a Spark job into a jar and submit it with spark-submit. Because Spark jobs run distributed across machines, a ClassNotFoundException is thrown if a jar the job depends on is missing from the machine where the code runs.
There are three ways to fix this:
Method 1: spark-submit --jars
According to the official Spark documentation, you can pass --jars when submitting a job, with the jars separated by commas. The disadvantage is that the jars must be listed on every submission. This is fine for a few jars, but becomes tedious when there are many.
spark-submit --master yarn-client --jars ***.jar,***.jar mysparksubmit.jar  # your jars, separated by commas
If you use sbt and have already declared the dependencies in build.sbt and downloaded them, you can copy the jars your job needs directly from the Ivy cache under your home directory (by default ~/.ivy2/cache/).
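When the list of jars is long, the comma-separated value for --jars can be generated instead of typed by hand. The following is only a minimal sketch under stated assumptions: the function name jars_csv and the idea of collecting the jars in a single directory are illustrative and not part of the original workflow.

```shell
# Print every .jar file in the given directory as one comma-separated line,
# which is the format --jars expects.
# jars_csv is a hypothetical helper name; the body runs in a subshell, so its
# variables do not leak into the calling shell.
jars_csv() (
  list=""
  for jar in "$1"/*.jar; do
    # If the directory holds no jars, the glob stays unexpanded; skip it.
    [ -e "$jar" ] || continue
    # Prepend a comma only when the list already has an entry.
    list="${list:+$list,}$jar"
  done
  printf '%s\n' "$list"
)

# Example (the directory name is an assumption):
#   spark-submit --master yarn-client --jars "$(jars_csv ./lib)" mysparksubmit.jar
```

This sketch assumes the jar paths contain no commas, since --jars splits its value on commas.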
Method 2: extraClassPath
Put all the required jars into one directory, then point to that directory in spark-defaults.conf. This is much more convenient than the previous method:
spark.executor.extraClassPath=/home/hadoop/wzq_workspace/lib/*
spark.driver.extraClassPath=/home/hadoop/wzq_workspace/lib/*
Note that this directory must exist on every machine that may run Spark tasks, and the jars must be copied to all of those machines. The advantage is that you no longer need to write a long list of jars when submitting; the disadvantage is that you have to copy all the jars to every node.
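Because a node with a missing or empty directory only fails once a task is scheduled on it, it can help to check each machine in advance. Below is a rough sketch of such a check, assuming it is run on each node in turn; the function name check_lib_dir is hypothetical and nothing here is provided by Spark itself.

```shell
# Report whether a directory exists and contains at least one jar.
# check_lib_dir is a hypothetical helper name. The body runs in a subshell,
# so "exit" only leaves the function, not the calling shell.
check_lib_dir() (
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    exit 1
  fi
  count=0
  for jar in "$dir"/*.jar; do
    # An unexpanded glob means there are no jars; do not count it.
    if [ -e "$jar" ]; then
      count=$((count + 1))
    fi
  done
  if [ "$count" -eq 0 ]; then
    echo "no jars found in: $dir" >&2
    exit 1
  fi
  echo "$dir: $count jar(s)"
)
```

On each node you would call it with the directory from the configuration above, for example check_lib_dir /home/hadoop/wzq_workspace/lib, and compare the reported counts across machines.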
Method 3: sbt-assembly
If the second method still feels like too much trouble, this method packages your own code together with all of its dependency jars into a single jar (a fat jar). Start sbt in the project root directory and type plugins; you will see that assembly is not installed by default, so the sbt-assembly plugin has to be added to sbt.
Add the following to project/plugins.sbt in your project directory:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")
resolvers += Resolver.url("bintray-sbt-plugins", url("http://dl.bintray.com/sbt/sbt-plugin-releases"))(Resolver.ivyStylePatterns)
Then start sbt in the project root directory again and run plugins to list the plugins. If you see sbtassembly.AssemblyPlugin, the plugin was installed successfully.
You also need to configure conflict resolution for files that appear in more than one dependency (the assemblyMergeStrategy setting in build.sbt), and then run assembly on the sbt interactive command line. The drawback of this method is that the packaged jar becomes very large.