Spark source code reading (spark-submit submission script analysis)

Spark submission script

  • How to read

    First, download the Spark source code from the official website (or from GitHub) and import it into IDEA. I downloaded 2.4.6. It is said that 3.0 switched to Gradle, but I haven't looked at 3.0 anyway.
    Reading source code usually starts from a demo, so I looked at the bin directory. There is indeed a run-example script, but I didn't feel like starting there.
    In day-to-day usage you mostly call spark-shell or spark-submit, so let's start with spark-submit! A typical invocation is sketched below.
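
    As a reference point, here is what a typical spark-submit invocation looks like. The SparkPi example class ships with Spark, but the exact jar path and Scala version suffix depend on your build, so treat the path below as an assumption for a 2.4.6 binary distribution.

    ```bash
    # Run the bundled SparkPi example on 4 local cores.
    # The jar name assumes a Spark 2.4.6 build with Scala 2.11;
    # adjust it to whatever sits in your examples/jars directory.
    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master "local[4]" \
      examples/jars/spark-examples_2.11-2.4.6.jar \
      100
    ```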

  • spark-submit script

    1. Check whether SPARK_HOME is set.
    2. If it is not set, source the find-spark-home script in the same directory to set the SPARK_HOME environment variable.
    3. Run the spark-class script.
    4. Pass org.apache.spark.deploy.SparkSubmit as the first argument, followed by the original arguments (see the sketch after this list).
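
    A minimal sketch of what those four steps amount to, paraphrasing the 2.4.x script (the exact contents of your bin/spark-submit may differ slightly):

    ```bash
    #!/usr/bin/env bash
    # If SPARK_HOME is not set, derive it via the find-spark-home helper
    # that sits next to this script.
    if [ -z "${SPARK_HOME}" ]; then
      source "$(dirname "$0")"/find-spark-home
    fi

    # Delegate to spark-class, prepending the SparkSubmit main class
    # to whatever arguments the user passed on the command line.
    exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit "$@"
    ```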
  • find-spark-home script

    1. Check whether find_spark_home.py exists in the same directory (the English comments in the script explain this clearly). That file is only present when Spark was installed via pip install pyspark; if it is missing, fall back to the default:
    2. Set SPARK_HOME to the parent of the bin directory (see the sketch after this list).
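
    A simplified sketch of that logic. The real 2.4.x find-spark-home has a few more cases for pip-installed PySpark; this version only shows the branch described above, and the call to the Python helper is an assumption about that path.

    ```bash
    #!/usr/bin/env bash
    # Path where a pip-installed PySpark would place its helper script.
    FIND_SPARK_HOME_PYTHON_SCRIPT="$(cd "$(dirname "$0")"; pwd)/find_spark_home.py"

    if [ -z "${SPARK_HOME}" ]; then
      if [ ! -f "$FIND_SPARK_HOME_PYTHON_SCRIPT" ]; then
        # Not a pip install: default SPARK_HOME to the parent of bin/.
        export SPARK_HOME="$(cd "$(dirname "$0")"/..; pwd)"
      else
        # pip-installed PySpark: ask the Python helper where Spark lives.
        export SPARK_HOME=$(python "$FIND_SPARK_HOME_PYTHON_SCRIPT")
      fi
    fi
    ```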
  • spark-class script

    That one is too long for this post, so it will have to wait for the next installment!

Origin: blog.51cto.com/5530261/2553568