Win10 + IDEA + Spark + Scala + sbt configuration

I have recently been learning Spark, which involves configuring sbt. I took quite a beating from a variety of problems along the way, so here I will go through those issues in detail (I will not cover every configuration step; this is just my personal experience).
Environment:
Windows 10 x64
IDEA Community Edition
Hadoop 2.7.2
Spark 2.4.5 (downloaded from the official website)
Scala 2.11.8
If you have not installed any of these yet, you can read this article: https://blog.csdn.net/a1066196847/article/details/87923496
Before going further, we should look at the documentation on the Spark official website:

First of all, note the Scala version that Spark depends on: do not download the latest Scala; remember that it must be a 2.11.x release. Then note the groupId, artifactId, and version at the bottom of the page (reproduced below); they will be useful in IDEA later.
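
For reference, these are the coordinates the Spark 2.4.5 documentation lists for the core artifact; they match the dependency used in item 5 below:

    groupId = org.apache.spark
    artifactId = spark-core_2.11
    version = 2.4.5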

About Spark and Hadoop

I will not walk through the process; see this link: https://blog.csdn.net/a1066196847/article/details/87923496
Here I will talk about a few of the problems I ran into:
1. Since I had installed PyCharm earlier but had never configured the Python environment variables, pay attention here and add these to Path:
C:\Users\Administrator\Anaconda3
C:\Users\Administrator\Anaconda3\Scripts
and append ;.PY;.PYM to PATHEXT (a quick way to verify this is shown below).
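
A minimal verification sketch from a freshly opened cmd window (the Anaconda paths are assumed to match your install location):

    rem Check that the Anaconda paths and the PATHEXT additions are visible
    echo %PATH%
    echo %PATHEXT%
    rem where should now resolve python to the Anaconda python.exe
    where python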

2. "The system cannot find the path specified" after running the spark-shell command:
The most likely cause, and a well-hidden one, is a problem in the %JAVA_HOME% setting. First check Path and delete the JDK path that the installer added by default, then verify that %JAVA_HOME% is correct (this is the root cause in almost every case; if it is not, try moving the SPARK_HOME entry within Path).
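
A quick sanity check of the Java setup from cmd (nothing here is specific to my machine):

    rem JAVA_HOME should print the JDK root, and java should resolve and run
    echo %JAVA_HOME%
    where java
    java -version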

About configuring sbt and Scala in IDEA

This is the hardest-hit area: we must be patient, for example while sbt finishes its dump, but if the build log shows the process stuck, say for an hour, then there is definitely a problem. One more note: the first time you open the sbt project, open build.sbt and add the Spark dependency; the format is shown in item 5 below, and pay attention to leaving a blank line after the preceding configuration.

1. sbt's "dump project structure" step can be very slow. Many online tutorials say to switch sbt to a domestic mirror, but the several schemes I tried each pointed at a different config file, so in the end I simply went through a proxy. If you can get a mirror working, though, that is naturally best.
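
For reference, the usual mirror approach is a repositories file at C:\Users\<your user>\.sbt\repositories. A minimal sketch, assuming the Aliyun public mirror (any reachable Maven mirror works the same way):

    [repositories]
      local
      aliyun: https://maven.aliyun.com/repository/public
      maven-central

Adding -Dsbt.override.build.repos=true to sbt's VM options makes this file override the resolvers declared by individual builds.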

2. The IDEA build log shows: Waiting for lock on C:\Users\Administrator\.ivy2\.sbt.ivy.lock to be available...
Solution: find all java processes in Task Manager and close them.
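
The cmd equivalent, if you prefer it over Task Manager (careful: this force-kills every java.exe, including any you still need):

    rem Kill all java.exe processes so the ivy lock is released
    taskkill /F /IM java.exe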

3. IDEA gets stuck at "Loading project definition from C:\Users\Administrator\IdeaProjects\untitled3\project".
Solution: in cmd, run attrib -r on the folder path to remove the read-only attribute of the corresponding folders.
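
Concretely, for the project path above (/s recurses into subfolders and /d applies the change to directories as well):

    rem Clear the read-only attribute on the project folder, recursively
    attrib -r "C:\Users\Administrator\IdeaProjects\untitled3\project" /s /d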

4. Matching versions is very important, and we simply have to wait for IDEA to finish dumping the project (the domestic mirror-change methods did not work for me, so consider a proxy to solve it).

5. The Spark dependency in build.sbt: libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.4.5"
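
Put together, a minimal build.sbt for the versions in this post might look like the sketch below (the project name is made up; note the blank lines between settings, as mentioned above). With scalaVersion set to 2.11.8 you could equivalently write "org.apache.spark" %% "spark-core" % "2.4.5" and let sbt append the _2.11 suffix:

    // Minimal build.sbt sketch for Spark 2.4.5 on Scala 2.11.8
    // (the project name is arbitrary)
    name := "spark-demo"

    version := "0.1"

    scalaVersion := "2.11.8"

    libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.4.5"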

6. Remember to create the Scala SDK: File -> Project Structure -> Global Libraries; it will be detected automatically, and you should select the same Scala version (2.11.8 here).
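
Once the SDK and the dependency are in place, a tiny local-mode program (my own sketch, not one of the original steps) can confirm the versions agree; running it should print 55.0:

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical smoke test: sums 1..10 on a local Spark context
    object SmokeTest {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("SmokeTest").setMaster("local[*]")
        val sc = new SparkContext(conf)
        println(sc.parallelize(1 to 10).sum()) // prints 55.0
        sc.stop()
      }
    }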

7. If the build process reports:
FetchError$DownloadingArtifacts: Error fetching artifacts
Solution: clean up the contents of the C:\Users\59404\AppData\Local\Temp directory (delete everything; files that are in use stay behind).
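
From cmd this cleanup is one line. %LOCALAPPDATA% expands to C:\Users\<your user>\AppData\Local, and files that are in use simply fail to delete and are kept, matching the advice above:

    rem Delete every deletable file under the local Temp directory
    del /f /s /q "%LOCALAPPDATA%\Temp\*"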

8. When you hit a "wrong checksum" error, go to the folder shown in the log. Mine was C:\Users\Administrator\AppData\Local\Coursier\cache\v1\https\repo1.maven.org\maven2\org\apache\hadoop\hadoop-hdfs\2.6.5, and the log showed a header checksum error.
Solution: I deleted the files in that folder dated from that day; the build then reported that the jar package did not exist, so I restored the jar from the Recycle Bin, and the build succeeded again.
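
A cleaner alternative that is often suggested (an assumption on my part, not the fix I actually used above) is to delete the whole cached artifact directory and let Coursier re-download it with fresh checksums:

    rem Remove the corrupted cached artifact; the next build re-fetches it
    rmdir /s /q "%LOCALAPPDATA%\Coursier\cache\v1\https\repo1.maven.org\maven2\org\apache\hadoop\hadoop-hdfs\2.6.5"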
