Background
During the initialization of SparkContext, the SchedulerBackend sends a RegisterApplication message to the Master. After the registration succeeds, the Master calls the schedule() method to allocate cores to the application and notifies the workers to start executors.
Main text
1. Starting point - receive()
Because the Master is a message loop, its receive method receives the application registration request from the client. Once the registration succeeds, the schedule() method is called to schedule and allocate resources. The code is as follows:
case RegisterApplication(description, driver) =>
  // TODO Prevent repeated registrations from some driver
  if (state == RecoveryState.STANDBY) {
    // The Master must be in the ALIVE state to register applications
    // ignore, don't send response
  } else {
    logInfo("Registering app " + description.name)
    val app = createApplication(description, driver)
    // Register the application
    registerApplication(app)
    logInfo("Registered app " + description.name + " with ID " + app.id)
    // Persist the application
    persistenceEngine.addApplication(app)
    // Reply to the driver with a success message
    driver.send(RegisteredApplication(app.id, self))
    // Schedule resources
    schedule()
  }
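The control flow of the handler above can be reproduced in a small standalone sketch. This is not Spark code: ToyMaster, RegisterApp, and the Alive/Standby states are hypothetical stand-ins, and the "scheduling pass" is only a counter. It just illustrates that a standby master ignores the request, while an alive master registers the application and then triggers scheduling.

```scala
object RegistrationSketch {
  sealed trait State
  case object Alive extends State
  case object Standby extends State

  // Hypothetical stand-in for the RegisterApplication message
  final case class RegisterApp(name: String)

  // Toy master: records registered apps and counts scheduling passes
  final class ToyMaster(var state: State) {
    val apps = scala.collection.mutable.ArrayBuffer.empty[String]
    var schedulePasses = 0

    // Stand-in for schedule(): only counts how often it was triggered
    private def schedule(): Unit = schedulePasses += 1

    // Returns Some(appId) when the app was registered, None when the message was ignored
    def receive(msg: Any): Option[String] = msg match {
      case RegisterApp(_) if state == Standby =>
        None // standby master: ignore, send no response
      case RegisterApp(name) =>
        apps += name
        val id = s"app-${apps.size}"
        schedule() // registration succeeded, so run a scheduling pass
        Some(id)
      case _ =>
        None
    }
  }
}
```

For example, new RegistrationSketch.ToyMaster(RegistrationSketch.Alive).receive(RegistrationSketch.RegisterApp("demo")) returns Some("app-1") and increments schedulePasses once.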
2. Core method - schedule()
Role: allocates resources to waiting drivers and allocates worker resources to waiting applications.
Called when: a new application is registered, or worker resources change.
Note: In this call no resources are allocated for a driver, because the driver is already running by the time the application registers. This call mainly allocates worker resources to the application.
private def schedule(): Unit = {
  // State check omitted
  // Shuffle the alive workers
  val shuffledAliveWorkers = Random.shuffle(workers.toSeq.filter(_.state == WorkerState.ALIVE))
  val numWorkersAlive = shuffledAliveWorkers.size
  var curPos = 0
  // Allocate resources for waiting drivers and launch them (nothing to launch in this call)
  for (driver <- waitingDrivers.toList) { // iterate over a copy of waitingDrivers
    // We assign workers to each waiting driver in a round-robin fashion. For each driver, we
    // start from the last worker that was assigned a driver, and continue onwards until we have
    // explored all alive workers.
    var launched = false
    var numWorkersVisited = 0
    while (numWorkersVisited < numWorkersAlive && !launched) {
      val worker = shuffledAliveWorkers(curPos)
      numWorkersVisited += 1
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
        // The worker meets the requirements, so launch the driver on it
        launchDriver(worker, driver)
        waitingDrivers -= driver
        launched = true
      }
      curPos = (curPos + 1) % numWorkersAlive
    }
  }
  // Start executors on the workers
  startExecutorsOnWorkers()
}
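The round-robin driver placement in the loop above can be tried in isolation with the following minimal sketch. The Worker and Driver case classes and the place function are hypothetical simplifications, not Spark's own types. Workers are visited in the order given (the real method shuffles them first), and a successful placement simply subtracts the driver's memory and cores from the worker.

```scala
object DriverPlacementSketch {
  // Hypothetical, simplified stand-ins for the Master's worker and driver records
  final case class Worker(id: String, var memoryFree: Int, var coresFree: Int)
  final case class Driver(id: String, mem: Int, cores: Int)

  // Returns driver id -> worker id for every driver that could be placed.
  def place(workers: IndexedSeq[Worker], drivers: Seq[Driver]): Map[String, String] = {
    val n = workers.size
    var curPos = 0 // shared across drivers, so each search resumes after the last visited worker
    var placed = Map.empty[String, String]
    for (driver <- drivers) {
      var launched = false
      var visited = 0
      // Visit each worker at most once per driver
      while (visited < n && !launched) {
        val w = workers(curPos)
        visited += 1
        if (w.memoryFree >= driver.mem && w.coresFree >= driver.cores) {
          // Reserve the resources on this worker and record the placement
          w.memoryFree -= driver.mem
          w.coresFree -= driver.cores
          placed += (driver.id -> w.id)
          launched = true
        }
        curPos = (curPos + 1) % n
      }
    }
    placed
  }
}
```

With workers w1 (1024 MB, 2 cores) and w2 (4096 MB, 4 cores) and drivers d1 (512 MB, 1 core), d2 (2048 MB, 2 cores), d3 (512 MB, 1 core), the sketch places d1 on w1, d2 on w2, and d3 on w1 again. A driver that fits on no worker is left out of the result, which corresponds to it staying in waitingDrivers.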
3. Starting executors on workers - startExecutorsOnWorkers()
The scheduleExecutorsOnWorkers method decides, based on the spreadOutApps setting, whether to spread the application's cores across as many workers as possible or to pack them onto as few workers as possible, and returns the number of cores assigned on each worker.
The allocateWorkerResourceToExecutors method takes the number of cores assigned on a worker and, depending on whether coresPerExecutor is set, starts one or more executors there. Each executor is started on the Worker by sending a message to the Worker's RpcEndpoint.
Steps:
a. Filter available Workers
b. Allocate the number of cores to each Worker
c. Start the executor according to the cores assigned by each worker
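Step b can be illustrated with a simplified sketch of the two allocation strategies. It assumes one core is handed out at a time and ignores memory and coresPerExecutor, so it is not the actual scheduleExecutorsOnWorkers logic. The assignCores function and its parameters are hypothetical names.

```scala
object CoreAssignmentSketch {
  // coresFree(i) is the free core count of usable worker i.
  // Returns how many cores the application is given on each worker.
  def assignCores(coresFree: Array[Int], coresWanted: Int, spreadOut: Boolean): Array[Int] = {
    val assigned = Array.fill(coresFree.length)(0)
    // Never hand out more cores than the usable workers have in total
    var toAssign = math.min(coresWanted, coresFree.sum)

    // A worker can take another core if cores remain to assign and it still has a free one
    def canTake(pos: Int): Boolean =
      toAssign > 0 && coresFree(pos) - assigned(pos) > 0

    var free = coresFree.indices.filter(canTake)
    while (free.nonEmpty) {
      free.foreach { pos =>
        var keepGoing = true
        while (keepGoing && canTake(pos)) {
          toAssign -= 1
          assigned(pos) += 1
          // Spreading out: take one core, then move on to the next worker.
          // Consolidating: keep filling this worker until it is full.
          if (spreadOut) keepGoing = false
        }
      }
      free = free.filter(canTake)
    }
    assigned
  }
}
```

For three workers with 4 free cores each and an application that wants 6 cores, the sketch yields 2, 2, 2 when spreadOut is true and 4, 2, 0 when it is false.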
private def startExecutorsOnWorkers(): Unit = {
  for (app <- waitingApps if app.coresLeft > 0) {
    val coresPerExecutor: Option[Int] = app.desc.coresPerExecutor
    // Filter out the workers that have enough memory and cores for one executor
    val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
      .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
        worker.coresFree >= coresPerExecutor.getOrElse(1))
      .sortBy(_.coresFree).reverse
    // Decide how many cores to assign on each usable worker
    val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)
    // Now that we've decided how many cores to allocate on each worker, let's allocate them
    /* Start executors according to the assigned cores. The behavior depends on coresPerExecutor:
       if coresPerExecutor is defined, multiple executors may be started on a worker;
       if coresPerExecutor is undefined, a single executor holding all assigned cores is started. */
    for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
      allocateWorkerResourceToExecutors(
        app, assignedCores(pos), coresPerExecutor, usableWorkers(pos))
    }
  }
}
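The effect of coresPerExecutor described in the comment above can be sketched as a small pure function. The name executorCores is hypothetical, and the sketch only computes the sizes of the executors that would be launched on one worker; it does not launch anything.

```scala
object ExecutorSplitSketch {
  // Returns the core count of each executor to launch on one worker.
  def executorCores(assignedCores: Int, coresPerExecutor: Option[Int]): Seq[Int] = {
    // Defined: as many executors as fit into the assigned cores. Undefined: exactly one.
    val numExecutors = coresPerExecutor.map(assignedCores / _).getOrElse(1)
    // Defined: every executor gets coresPerExecutor cores. Undefined: it gets all of them.
    val coresEach = coresPerExecutor.getOrElse(assignedCores)
    Seq.fill(numExecutors)(coresEach)
  }
}
```

With 8 assigned cores, executorCores(8, Some(2)) gives four executors of 2 cores each, while executorCores(8, None) gives a single executor with all 8 cores.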
4. After starting the executor
After the executor starts, the remaining work is communication between the ExecutorBackend and the driver. The backend registers the executor with the driver; the details are covered in the previous blog post.
Summary
When a new Application registers, or when Worker information changes, the schedule() method is called to schedule resources.
When allocating cores, the spreadOutApps parameter determines whether the cores are spread across many workers, which usually suits data-intensive jobs because it helps data locality, or consolidated on a few workers, which suits compute-intensive jobs.
The coresPerExecutor parameter determines how many Executors are started on a Worker.