Spark source code learning (7) - How an application requests resources

Background

         During SparkContext initialization, the SchedulerBackend sends a RegisterApplication message to the Master. After registration succeeds, the Master calls its schedule() method to allocate core resources to the application and notifies the Workers to start executors.


Main text

1. Starting point - receive()

     Because the Master is a message loop, its receive method handles the registration request from the client. After registration succeeds, the schedule() method is called to schedule and allocate resources. The code is as follows:

case RegisterApplication(description, driver) =>
      // TODO Prevent repeated registrations from some driver
      if (state == RecoveryState.STANDBY) {
        // The Master must be ALIVE; a STANDBY Master ignores the request
        // and sends no response
      } else {
        logInfo("Registering app " + description.name)
        val app = createApplication(description, driver)
        // Register the application
        registerApplication(app)
        logInfo("Registered app " + description.name + " with ID " + app.id)
        // Persist the application metadata
        persistenceEngine.addApplication(app)
        // Reply to the driver that registration succeeded
        driver.send(RegisteredApplication(app.id, self))
        // Schedule resources
        schedule()
      }

2. Core method - schedule()

     Role: allocate Worker resources to waiting drivers and to applications (executors).

     Called when: a new application registers, and when Worker resources change.

     Note: in this flow no resources are allocated for the driver, because the driver was already started by the time the application registered; this call mainly allocates Worker resources to executors.

private def schedule(): Unit = {
    

    // Status checks omitted

    // Shuffle the alive workers so load is spread randomly
    val shuffledAliveWorkers = Random.shuffle(workers.toSeq.filter(_.state == WorkerState.ALIVE))
    val numWorkersAlive = shuffledAliveWorkers.size
    var curPos = 0


    // Allocate resources to waiting drivers and launch them (not invoked in this flow, since waitingDrivers is empty)
    for (driver <- waitingDrivers.toList) { // iterate over a copy of waitingDrivers
      // We assign workers to each waiting driver in a round-robin fashion. For each driver, we
      // start from the last worker that was assigned a driver, and continue onwards until we have
      // explored all alive workers.
      var launched = false
      var numWorkersVisited = 0
      while (numWorkersVisited < numWorkersAlive && !launched) {
        val worker = shuffledAliveWorkers(curPos)
        numWorkersVisited += 1
        if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {

          // This worker satisfies the driver's memory and core requirements; launch the driver on it
          launchDriver(worker, driver)
          waitingDrivers -= driver
          launched = true
        }
        curPos = (curPos + 1) % numWorkersAlive
      }
    }

    // Core method: start executors on the workers
    startExecutorsOnWorkers()
  }
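The round-robin driver placement above can be sketched as a small, self-contained model. WorkerInfo, DriverDesc, and placeDrivers here are simplified stand-ins for the real Master structures, not Spark's actual classes:

```scala
// Simplified model: workers are visited circularly, and each waiting driver
// is placed on the first worker with enough free memory and cores.
case class WorkerInfo(id: String, var memoryFree: Int, var coresFree: Int)
case class DriverDesc(mem: Int, cores: Int)

def placeDrivers(workers: Seq[WorkerInfo], drivers: Seq[DriverDesc]): Map[DriverDesc, String] = {
  val numWorkers = workers.size
  var curPos = 0
  var placed = Map.empty[DriverDesc, String]
  for (driver <- drivers) {
    var launched = false
    var visited = 0
    while (visited < numWorkers && !launched) {
      val w = workers(curPos)
      visited += 1
      if (w.memoryFree >= driver.mem && w.coresFree >= driver.cores) {
        w.memoryFree -= driver.mem   // stands in for launchDriver in the real code
        w.coresFree -= driver.cores
        placed += (driver -> w.id)
        launched = true
      }
      curPos = (curPos + 1) % numWorkers
    }
  }
  placed
}

val ws = Seq(WorkerInfo("w1", 1024, 4), WorkerInfo("w2", 4096, 8))
val ds = Seq(DriverDesc(2048, 2), DriverDesc(512, 1))
println(placeDrivers(ws, ds)) // the first driver only fits on w2; the second then lands on w1
```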

3. Core method for starting executors - startExecutorsOnWorkers()

     The scheduleExecutorsOnWorkers method decides, according to the spreadOutApps setting, whether to spread an application's cores across as many workers as possible or to concentrate them on as few workers as possible, and returns the number of cores assigned on each worker.

     The allocateWorkerResourceToExecutors method then, based on the assigned core counts and the coresPerExecutor parameter, decides how many executors to launch on each Worker, and starts them by sending a message to the Worker's RpcEndpoint.

        Steps:

               a. Filter usable Workers

               b. Decide the number of cores to assign on each Worker

               c. Start executors according to the cores assigned on each Worker

private def startExecutorsOnWorkers(): Unit = {
   

    for (app <- waitingApps if app.coresLeft > 0) {
      val coresPerExecutor: Option[Int] = app.desc.coresPerExecutor
      
      // Keep only ALIVE Workers with enough free memory and free cores
      val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
        .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
          worker.coresFree >= coresPerExecutor.getOrElse(1))
        .sortBy(_.coresFree).reverse

      // Decide how many cores to assign on each usable Worker
      val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)

      // Now that we've decided how many cores to allocate on each worker, let's allocate them

      // Start executors according to the assigned cores
      /*
          The behavior here depends on coresPerExecutor:
          if coresPerExecutor is defined, multiple executors may be started on a worker;
          if coresPerExecutor is undefined, a single executor holding all assigned cores is started
      */
      for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
        allocateWorkerResourceToExecutors(
          app, assignedCores(pos), coresPerExecutor, usableWorkers(pos))
      }
    }
  }
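The decision made by scheduleExecutorsOnWorkers can be approximated with the sketch below. This is a simplification that assumes only free cores matter (the real method also checks memory, executor limits, and the application's remaining core count); assignCores is a hypothetical helper, not Spark's API. Cores are handed out in chunks of coresPerExecutor (or 1 if undefined): with spreadOut, one chunk per worker per round; without it, each worker is filled up before moving on.

```scala
// Sketch of the spreadOut decision: distribute coresToAssign over workers
// with the given free cores, in chunks of coresPerExecutor.
def assignCores(freeCores: Array[Int], coresToAssign: Int,
                coresPerExecutor: Option[Int], spreadOut: Boolean): Array[Int] = {
  val chunk = coresPerExecutor.getOrElse(1)
  val assigned = Array.fill(freeCores.length)(0)
  var remaining = coresToAssign
  var progress = true
  while (remaining >= chunk && progress) {
    progress = false
    for (i <- freeCores.indices if remaining >= chunk &&
         freeCores(i) - assigned(i) >= chunk) {
      // keep granting chunks on the same worker unless we are spreading out
      var keepGoing = true
      while (keepGoing && remaining >= chunk && freeCores(i) - assigned(i) >= chunk) {
        assigned(i) += chunk
        remaining -= chunk
        progress = true
        keepGoing = !spreadOut
      }
    }
  }
  assigned
}

println(assignCores(Array(8, 8, 8), 12, Some(4), spreadOut = true).toSeq)  // spread across workers
println(assignCores(Array(8, 8, 8), 12, Some(4), spreadOut = false).toSeq) // packed onto few workers
```

With 12 cores to place on three 8-core workers in chunks of 4, spreadOut = true yields 4 cores on each worker, while spreadOut = false packs 8 cores onto the first worker and 4 onto the second.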

4. After the executor starts

    Once the executor is up, the ExecutorBackend communicates with the driver: the backend registers the executor with the driver. The details were covered in a previous post.

Summary

      When a new Application arrives, or when Worker information changes, the schedule() resource-scheduling method is called.

      When allocating core resources, the spreadOutApps parameter determines whether cores are spread across as many Workers as possible or concentrated on as few Workers as possible.

      The coresPerExecutor parameter determines how many cores each Executor gets, and therefore how many Executors are started on a Worker.
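The effect of coresPerExecutor on one worker's allocation can be sketched as follows; executorsOnWorker is a hypothetical helper for illustration, not Spark's API:

```scala
// If coresPerExecutor is defined, the assigned cores are split into
// assignedCores / coresPerExecutor executors of that size; if it is
// undefined, a single executor holds all assigned cores.
def executorsOnWorker(assignedCores: Int, coresPerExecutor: Option[Int]): Seq[Int] = {
  val numExecutors = coresPerExecutor.map(assignedCores / _).getOrElse(1)
  val coresEach = coresPerExecutor.getOrElse(assignedCores)
  Seq.fill(numExecutors)(coresEach)
}

println(executorsOnWorker(12, Some(4))) // three 4-core executors
println(executorsOnWorker(12, None))    // one 12-core executor
```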

