Spark Source Code Analysis, Part 4 -- The Creation and Startup Process of the TaskScheduler

  In Spark Source Code Analysis, Part 2 -- The Initialization Process of SparkContext, step 14 and step 16 described the initialization and startup of the TaskScheduler.

  The initialization and startup process of the TaskScheduler is described below in two parts.

 TaskScheduler instantiation

val (sched, ts) = SparkContext.createTaskScheduler(this, master, deployMode)

This is a call to org.apache.spark.SparkContext#createTaskScheduler, whose source code is as follows:

/**
 * Create a task scheduler based on a given master URL.
 * Return a 2-tuple of the scheduler backend and the task scheduler.
 */
private def createTaskScheduler(
    sc: SparkContext,
    master: String,
    deployMode: String): (SchedulerBackend, TaskScheduler) = {
  import SparkMasterRegex._

  // When running locally, don't try to re-execute tasks on failure.
  val MAX_LOCAL_TASK_FAILURES = 1

  master match {
    case "local" =>
      val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
      val backend = new LocalSchedulerBackend(sc.getConf, scheduler, 1)
      scheduler.initialize(backend)
      (backend, scheduler)

    case LOCAL_N_REGEX(threads) =>
      def localCpuCount: Int = Runtime.getRuntime.availableProcessors()
      // local[*] estimates the number of cores on the machine; local[N] uses exactly N threads.
      val threadCount = if (threads == "*") localCpuCount else threads.toInt
      if (threadCount <= 0) {
        throw new SparkException(s"Asked to run locally with $threadCount threads")
      }
      val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
      val backend = new LocalSchedulerBackend(sc.getConf, scheduler, threadCount)
      scheduler.initialize(backend)
      (backend, scheduler)

    case LOCAL_N_FAILURES_REGEX(threads, maxFailures) =>
      def localCpuCount: Int = Runtime.getRuntime.availableProcessors()
      // local[*, M] means the number of cores on the computer with M failures
      // local[N, M] means exactly N threads with M failures
      val threadCount = if (threads == "*") localCpuCount else threads.toInt
      val scheduler = new TaskSchedulerImpl(sc, maxFailures.toInt, isLocal = true)
      val backend = new LocalSchedulerBackend(sc.getConf, scheduler, threadCount)
      scheduler.initialize(backend)
      (backend, scheduler)

    case SPARK_REGEX(sparkUrl) =>
      val scheduler = new TaskSchedulerImpl(sc)
      val masterUrls = sparkUrl.split(",").map("spark://" + _)
      val backend = new StandaloneSchedulerBackend(scheduler, sc, masterUrls)
      scheduler.initialize(backend)
      (backend, scheduler)

    case LOCAL_CLUSTER_REGEX(numSlaves, coresPerSlave, memoryPerSlave) =>
      // Check to make sure memory requested <= memoryPerSlave. Otherwise Spark will just hang.
      val memoryPerSlaveInt = memoryPerSlave.toInt
      if (sc.executorMemory > memoryPerSlaveInt) {
        throw new SparkException(
          "Asked to launch cluster with %d MB RAM / worker but requested %d MB/worker".format(
            memoryPerSlaveInt, sc.executorMemory))
      }

      val scheduler = new TaskSchedulerImpl(sc)
      val localCluster = new LocalSparkCluster(
        numSlaves.toInt, coresPerSlave.toInt, memoryPerSlaveInt, sc.conf)
      val masterUrls = localCluster.start()
      val backend = new StandaloneSchedulerBackend(scheduler, sc, masterUrls)
      scheduler.initialize(backend)
      backend.shutdownCallback = (backend: StandaloneSchedulerBackend) => {
        localCluster.stop()
      }
      (backend, scheduler)

    case masterUrl =>
      val cm = getClusterManager(masterUrl) match {
        case Some(clusterMgr) => clusterMgr
        case None => throw new SparkException("Could not parse Master URL: '" + master + "'")
      }
      try {
        val scheduler = cm.createTaskScheduler(sc, masterUrl)
        val backend = cm.createSchedulerBackend(sc, masterUrl, scheduler)
        cm.initialize(scheduler, backend)
        (backend, scheduler)
      } catch {
        case se: SparkException => throw se
        case NonFatal(e) =>
          throw new SparkException("External scheduler cannot be instantiated", e)
      }
  }
}

  Each master URL pattern maps to a different pair of SchedulerBackend and TaskScheduler implementations, as the match arms above show.
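
  As a quick orientation (a sketch derived from the match arms above, not from the original post's table), here are a few example master URLs and the implementation pair they select; the app name is illustrative:

import org.apache.spark.{SparkConf, SparkContext}

// Illustrative mapping from master URL to the (SchedulerBackend, TaskScheduler) pair
// chosen by the match above:
//   "local"             -> LocalSchedulerBackend (1 thread)        + TaskSchedulerImpl
//   "local[4]"          -> LocalSchedulerBackend (4 threads)       + TaskSchedulerImpl
//   "local[4, 2]"       -> LocalSchedulerBackend, maxFailures = 2  + TaskSchedulerImpl
//   "spark://host:7077" -> StandaloneSchedulerBackend              + TaskSchedulerImpl
//   "yarn"              -> handled by the external cluster manager branch (YarnClusterManager)
val sc = new SparkContext(new SparkConf().setAppName("master-url-demo").setMaster("local[4]"))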

  

   With the instantiation part covered, the rest of this article focuses on how the TaskScheduler is started in yarn-client mode.

 Starting the TaskScheduler in yarn-client mode

  Initializing the scheduling pool

  In yarn-client mode, the TaskScheduler implementation is org.apache.spark.scheduler.cluster.YarnScheduler, and the SchedulerBackend implementation is org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.

  In the org.apache.spark.SparkContext#createTaskScheduler method, the following branch is taken:

  

case masterUrl =>
  val cm = getClusterManager(masterUrl) match {
    case Some(clusterMgr) => clusterMgr
    case None => throw new SparkException("Could not parse Master URL: '" + master + "'")
  }
  try {
    val scheduler = cm.createTaskScheduler(sc, masterUrl)
    val backend = cm.createSchedulerBackend(sc, masterUrl, scheduler)
    cm.initialize(scheduler, backend)
    (backend, scheduler)
  } catch {
    case se: SparkException => throw se
    case NonFatal(e) =>
      throw new SparkException("External scheduler cannot be instantiated", e)
  }

 

  In the cm.initialize(scheduler, backend) call, cm is org.apache.spark.scheduler.cluster.YarnClusterManager, the TaskScheduler implementation is org.apache.spark.scheduler.cluster.YarnScheduler, and the SchedulerBackend implementation is org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend. The initialize method of YarnClusterManager is implemented as follows:

override def initialize(scheduler: TaskScheduler, backend: SchedulerBackend): Unit = {
  scheduler.asInstanceOf[TaskSchedulerImpl].initialize(backend)
}

     YarnScheduler itself does not override initialize; the implementation in its parent class TaskSchedulerImpl is as follows:

def initialize(backend: SchedulerBackend) {
  this.backend = backend
  schedulableBuilder = {
    schedulingMode match {
      case SchedulingMode.FIFO =>
        new FIFOSchedulableBuilder(rootPool)
      case SchedulingMode.FAIR =>
        new FairSchedulableBuilder(rootPool, conf)
      case _ =>
        throw new IllegalArgumentException(s"Unsupported $SCHEDULER_MODE_PROPERTY: " +
          s"$schedulingMode")
    }
  }
  schedulableBuilder.buildPools()
}

 

  As can be seen, an important job of initialize is to set the TaskScheduler's reference to its SchedulerBackend.

  There are two scheduling modes, FIFO and FAIR. The default is FIFO, and it can be changed with the spark.scheduler.mode parameter. A builder pattern is used to create the Pool object.

  Here, org.apache.spark.scheduler.FIFOSchedulableBuilder#buildPools is an empty implementation that does nothing, while org.apache.spark.scheduler.FairSchedulableBuilder#buildPools loads the fair-scheduling allocation policy file. The policy file can be specified with the spark.scheduler.allocation.file parameter; if it is not set, the default fairscheduler.xml file is loaded, and if that does not exist either, nothing is loaded. If the file contains pool configurations, the pools are built according to that configuration and added to the root pool. Finally, a default pool is initialized and added to the root pool.
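
  As a minimal sketch (not from the post; the file path and pool name are hypothetical), the scheduling mode and allocation file could be configured like this when building the SparkConf:

import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch: switch the root pool to FAIR scheduling and point Spark at a
// pool-definition file. "/path/to/fairscheduler.xml" is a hypothetical path.
val conf = new SparkConf()
  .setAppName("fair-scheduling-demo")
  .setMaster("local[4]")
  .set("spark.scheduler.mode", "FAIR") // FIFO is the default
  .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")

val sc = new SparkContext(conf)

// Jobs submitted from this thread go into the named pool; if the pool is not defined
// in the file, a pool with default settings is created for it.
sc.setLocalProperty("spark.scheduler.pool", "production")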

   Setting the TaskScheduler variable in HeartbeatReceiver

_heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)

 

   First, _heartbeatReceiver is an RpcEndpointRef, and requests sent through it are eventually received and processed by the HeartbeatReceiver endpoint, namely by the org.apache.spark.HeartbeatReceiver#receiveAndReply method:

  

case TaskSchedulerIsSet =>
  scheduler = sc.taskScheduler
  context.reply(true)
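
  To connect the two snippets above, here is a conceptual sketch of the ask/reply pattern (assuming a package nested under org.apache.spark, since Spark's rpc classes are private[spark]); the package, object, endpoint, and message names are illustrative, not Spark's actual HeartbeatReceiver:

package org.apache.spark.sketch // hypothetical package; Spark's rpc classes are private[spark]

import scala.concurrent.Future

import org.apache.spark.rpc.{RpcCallContext, RpcEndpoint, RpcEndpointRef, RpcEnv}

object RpcSketch {

  // Illustrative message, playing the role of TaskSchedulerIsSet above.
  case object SchedulerIsSet

  // An endpoint answers ask() requests in receiveAndReply via context.reply.
  class SketchEndpoint(override val rpcEnv: RpcEnv) extends RpcEndpoint {
    override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
      case SchedulerIsSet =>
        context.reply(true) // acknowledge, just as HeartbeatReceiver replies true
    }
  }

  // Caller side: registering the endpoint yields an RpcEndpointRef, and
  // ref.ask[Boolean](...) returns a Future that completes when the endpoint replies.
  def demo(rpcEnv: RpcEnv): Future[Boolean] = {
    val ref: RpcEndpointRef = rpcEnv.setupEndpoint("sketch", new SketchEndpoint(rpcEnv))
    ref.ask[Boolean](SchedulerIsSet)
  }
}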

 

     A detailed discussion of the RPC mechanism will be given in a dedicated article later, so it is not explained further here. // TODO

 Start TaskScheduler

  The org.apache.spark.SparkContext initialization code starts the TaskScheduler with the following line:

_taskScheduler.start()

 

  In yarn-client mode, this calls the start method of org.apache.spark.scheduler.cluster.YarnScheduler, which uses the implementation of its parent class TaskSchedulerImpl:

override def start() {
  // 1. Start the task scheduler backend
  backend.start()
  // 2. Schedule the periodic speculation-check task
  if (!isLocal && conf.getBoolean("spark.speculation", false)) {
    logInfo("Starting speculative execution thread")
    speculationScheduler.scheduleWithFixedDelay(new Runnable {
      override def run(): Unit = Utils.tryOrStopSparkContext(sc) {
        checkSpeculatableTasks()
      }
    }, SPECULATION_INTERVAL_MS, SPECULATION_INTERVAL_MS, TimeUnit.MILLISECONDS)
  }
}

 

  Step 1: starting the task scheduler backend. The org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend#start method is as follows:

/**
 * Create a Yarn client to submit an application to the ResourceManager.
 * This waits until the application is running.
 */
override def start() {
  // 1. Get the driver's host and port
  val driverHost = conf.get("spark.driver.host")
  val driverPort = conf.get("spark.driver.port")
  val hostport = driverHost + ":" + driverPort
  // 2. Set the driver's web UI address
  sc.ui.foreach { ui => conf.set("spark.driver.appUIAddress", ui.webUrl) }

  val argsArrayBuf = new ArrayBuffer[String]()
  argsArrayBuf += ("--arg", hostport)

  logDebug("ClientArguments called with: " + argsArrayBuf.mkString(" "))
  val args = new ClientArguments(argsArrayBuf.toArray)
  totalExpectedExecutors = SchedulerBackendUtils.getInitialTargetExecutorNumber(conf)
  // 3. Start the deploy client, initializing the driverClient RPC environment and
  //    the RPC endpoint ref of the master within that environment
  client = new Client(args, conf)
  // 4. Bind to the YARN application id
  bindToYarn(client.submitApplication(), None)

  // SPARK-8687: Ensure all necessary properties have already been set before
  // we initialize our driver scheduler backend, which serves these properties
  // to the executors
  super.start()
  // 5. Check the YARN application's state; it must not be KILLED, FINISHED, etc.
  waitForApplication()
  // 6. Start the monitoring thread
  monitorThread = asyncMonitorApplication()
  monitorThread.start()
}

 

  The third step is the key one to explain; the source code involved is as follows:

object Client {
  def main(args: Array[String]) {
    // scalastyle:off println
    if (!sys.props.contains("SPARK_SUBMIT")) {
      println("WARNING: This client is deprecated and will be removed in a future version of Spark")
      println("Use ./bin/spark-submit with \"--master spark://host:port\"")
    }
    // scalastyle:on println
    new ClientApp().start(args, new SparkConf())
  }
}

private[spark] class ClientApp extends SparkApplication {

  override def start(args: Array[String], conf: SparkConf): Unit = {
    val driverArgs = new ClientArguments(args)

    if (!conf.contains("spark.rpc.askTimeout")) {
      conf.set("spark.rpc.askTimeout", "10s")
    }
    Logger.getRootLogger.setLevel(driverArgs.logLevel)

    val rpcEnv =
      RpcEnv.create("driverClient", Utils.localHostName(), 0, conf, new SecurityManager(conf))

    val masterEndpoints = driverArgs.masters.map(RpcAddress.fromSparkURL).
      map(rpcEnv.setupEndpointRef(_, Master.ENDPOINT_NAME))
    rpcEnv.setupEndpoint("client", new ClientEndpoint(rpcEnv, driverArgs, masterEndpoints, conf))

    rpcEnv.awaitTermination()
  }

}

 

  As can be seen, the main method of Client instantiates a ClientApp object and calls its start method. The start method first parses the driver arguments, then creates the driver-side RPC environment, then resolves the master addresses and sets up endpoint refs to the master, and finally registers a client endpoint and obtains its endpoint ref.

    Periodic checks for speculative tasks

  Let's continue with the second step of the org.apache.spark.scheduler.cluster.YarnScheduler start method shown earlier. Spark's speculative execution feature is off by default, because when many tasks are delayed it launches duplicate copies of those tasks, which can consume a large amount of cluster resources and have uncontrollable effects on other jobs submitted to the cluster. When it is enabled, a delay timer is started that periodically runs the checkSpeculatableTasks method, shown below:

// Check for speculatable tasks in all our active jobs.
def checkSpeculatableTasks() {
  var shouldRevive = false
  synchronized {
    // 1. Check whether any new speculative task should be run
    shouldRevive = rootPool.checkSpeculatableTasks(MIN_TIME_TO_SPECULATION)
  }
  if (shouldRevive) {
    // 2. Revive offers so the new speculative task can be launched
    backend.reviveOffers()
  }
}

   In step 1, checkSpeculatableTasks has two implementations: one in Pool and one in TaskSetManager. The Pool implementation recursively calls into its child schedulables to collect speculatable tasks. If speculation is needed, step 2 calls the task scheduler backend's reviveOffers method; the general idea is to first obtain the free resources on the executors and then allocate those resources to the speculative tasks.
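
  For completeness, here is a minimal sketch (not from the post) of the configuration that turns speculation on; the property names are standard Spark settings, and the values are shown only as illustration:

import org.apache.spark.SparkConf

// Minimal sketch: enable speculative execution (off by default, as noted above).
val conf = new SparkConf()
  .set("spark.speculation", "true")           // turn the periodic check on
  .set("spark.speculation.interval", "100ms") // how often checkSpeculatableTasks runs
  .set("spark.speculation.multiplier", "1.5") // how much slower than the median a task must be
  .set("spark.speculation.quantile", "0.75")  // fraction of tasks that must finish before checking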

  To summarize, this source-code analysis took the SparkContext startup process in yarn-client mode as an example and analyzed how the task scheduler is created and started.

  The detailed introduction of RpcEnv was skipped here; the next article will be devoted to the overall architecture of Spark's built-in RPC mechanism and how it runs.

 


Origin www.cnblogs.com/johnny666888/p/11122802.html