Big Data learning day23 ----- spark06 ----- Spark execution flow

1. Spark execution process

1.1 Submitting a job

spark-submit --master spark://feng05:7077 --executor-memory 1g --total-executor-cores 4 --class cn.51doit.spark.WordCount /root/wc.jar hdfs://feng05:9000/wc hdfs://feng05:9000/out0

1.2 Spark task execution modes

  • Client mode (default): the Driver runs inside the SparkSubmit process, on the client machine
  • Cluster mode: the Driver runs inside the cluster, in a process other than SparkSubmit (in standalone mode, a DriverWrapper process on one of the Workers)

1.3 Creating SparkContext

  The spark-submit script starts the SparkSubmit process, which uses reflection to invoke the main method of the class we pass via --class. In that main method we write our business logic and create a SparkContext, which applies to the Master for resources. The Master then communicates with the Workers to start Executors, and all Executors reverse-register with the Driver.
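A minimal sketch of what the class named in the submit command might look like. The exact body of the original cn.51doit.spark.WordCount is an assumption (the input/output paths come in as the two trailing CLI arguments shown in 1.1), but the SparkContext creation it illustrates is the standard pattern:

```scala
// Hypothetical reconstruction of the WordCount driver class; requires a
// Spark cluster (or local master) to actually run.
package cn.`51doit`.spark  // backticks: Scala identifiers cannot start with a digit

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // Creating the SparkContext is what triggers the resource request to the Master
    val conf = new SparkConf().setAppName("WordCount")
    val sc = new SparkContext(conf)

    sc.textFile(args(0))          // read input from HDFS (first CLI argument)
      .flatMap(_.split(" "))      // split each line into words
      .map((_, 1))                // pair each word with a count of 1
      .reduceByKey(_ + _)         // shuffle: sum the counts per word
      .saveAsTextFile(args(1))    // write the result (second CLI argument)

    sc.stop()
  }
}
```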

1.4 Creating RDDs and building the DAG

  A DAG (Directed Acyclic Graph) is a graph that describes the conversion relationships between RDDs. Applying a series of transformations to the original RDDs forms the DAG, which is then cut into different stages according to the dependencies between RDDs: narrow dependencies stay within a stage, while a wide (shuffle) dependency marks a stage boundary.
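The stage cut can be seen in the WordCount lineage itself. In this sketch (assuming an existing SparkContext `sc`; the paths are the ones from the submit command in 1.1), every transformation up to reduceByKey is a narrow dependency, and reduceByKey introduces the wide dependency where the DAG is cut:

```scala
// How a chain of transformations builds a DAG that is split into stages.
val lines  = sc.textFile("hdfs://feng05:9000/wc")   // original RDD
val words  = lines.flatMap(_.split(" "))            // narrow dependency  -- Stage 0
val pairs  = words.map((_, 1))                      // narrow dependency  -- Stage 0
val counts = pairs.reduceByKey(_ + _)               // wide (shuffle) dependency: cut here -- Stage 1
counts.saveAsTextFile("hdfs://feng05:9000/out0")    // action: DAGScheduler splits the DAG into stages
```

Only the action at the end triggers execution; the transformations before it merely record the lineage that the DAGScheduler later walks backwards to find the shuffle boundaries.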

Origin www.cnblogs.com/jj1106/p/12021585.html