Spark 3.0 Application Scheduling Algorithm Analysis

The Application scheduling algorithm differs noticeably across Spark versions: from Spark 1.3.0 to Spark 1.6.1, and from Spark 2.x to the latest Spark 3.x, the algorithm has been revised several times. Here we will study the Application scheduling mechanism in the latest version, Spark 3.0.

private def startExecutorsOnWorkers(): Unit = {
  // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
  // in the queue, then the second app, etc.
  for (app <- waitingApps) {
    // If the spark-submit script specified the number of CPU cores per executor,
    // each executor is assigned that many cores;
    // otherwise each executor is assigned only one CPU core by default
    val coresPerExecutor = app.desc.coresPerExecutor.getOrElse(1)
    // If the cores left are fewer than coresPerExecutor, the cores left will not be allocated:
    // the number of cores the app still needs must not be less than the number of cores
    // a single executor is started with
    if (app.coresLeft >= coresPerExecutor) {
      // Filter out workers that don't have enough resources to launch an executor:
      // keep only workers in the ALIVE state that can still launch an executor,
      // sorted by the number of free CPU cores in descending order
      val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
        .filter(canLaunchExecutor(_, app.desc))
        .sortBy(_.coresFree).reverse
      if (waitingApps.length == 1 && usableWorkers.isEmpty) {
        logWarning(s"App ${app.id} requires more resource than any of Workers could have.")
      }
      // spreadOutApps is the default scheduling policy: spread the executor resources the
      // application needs across multiple workers
      val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)

      // Now that we've decided how many cores to allocate on each worker, let's allocate them
      for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
        allocateWorkerResourceToExecutors(
          app, assignedCores(pos), app.desc.coresPerExecutor, usableWorkers(pos))
      }
    }
  }
}
Next, canLaunchExecutor determines whether a worker can launch an executor for this application:

private def canLaunchExecutor(worker: WorkerInfo, desc: ApplicationDescription): Boolean = {
  canLaunch(
    worker,
    desc.memoryPerExecutorMB,
    desc.coresPerExecutor.getOrElse(1),
    desc.resourceReqsPerExecutor)
}
Let us look inside the canLaunch method:

private def canLaunch(
    worker: WorkerInfo,
    memoryReq: Int,
    coresReq: Int,
    resourceRequirements: Seq[ResourceRequirement])
  : Boolean = {
  // the worker's free memory must be greater than or equal to the requested memory
  val enoughMem = worker.memoryFree >= memoryReq
  // the number of idle cores on the worker must be greater than or equal to the requested cores
  val enoughCores = worker.coresFree >= coresReq
  // the worker must satisfy the executor's custom resource requests (e.g. GPUs)
  val enoughResources = ResourceUtils.resourcesMeetRequirements(
    worker.resourcesAmountFree, resourceRequirements)
  enoughMem && enoughCores && enoughResources
}
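The three checks above can be illustrated with a minimal, self-contained sketch. Note that WorkerFree and the plain resource map below are simplified stand-ins of my own for Spark's WorkerInfo, ResourceRequirement and ResourceUtils, not the real API:

```scala
object CanLaunchDemo {
  // Simplified view of a worker's free resources
  case class WorkerFree(memoryFree: Int, coresFree: Int, resourcesFree: Map[String, Int])

  def canLaunch(worker: WorkerFree,
                memoryReq: Int,
                coresReq: Int,
                resourceReqs: Map[String, Int]): Boolean = {
    val enoughMem = worker.memoryFree >= memoryReq
    val enoughCores = worker.coresFree >= coresReq
    // every requested custom resource (e.g. "gpu") must be available in sufficient amount
    val enoughResources = resourceReqs.forall {
      case (name, amount) => worker.resourcesFree.getOrElse(name, 0) >= amount
    }
    enoughMem && enoughCores && enoughResources
  }
}
```

A worker with 4096 MB free, 4 cores and one GPU passes a request for 1024 MB, 2 cores and 1 GPU, but fails if any one of the three dimensions falls short.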

Back to scheduleExecutorsOnWorkers above:

private def scheduleExecutorsOnWorkers(
    app: ApplicationInfo,
    usableWorkers: Array[WorkerInfo],
    spreadOutApps: Boolean): Array[Int] = {
  val coresPerExecutor = app.desc.coresPerExecutor
  val minCoresPerExecutor = coresPerExecutor.getOrElse(1)
  // oneExecutorPerWorker is the default mechanism: only one executor is started per worker.
  // If coresPerExecutor is set in the spark-submit script, multiple executors may be
  // started on each worker, as long as the worker has sufficient resources
  val oneExecutorPerWorker = coresPerExecutor.isEmpty
  val memoryPerExecutor = app.desc.memoryPerExecutorMB
  val resourceReqsPerExecutor = app.desc.resourceReqsPerExecutor
  val numUsable = usableWorkers.length
  val assignedCores = new Array[Int](numUsable) // Number of cores to give to each worker
  val assignedExecutors = new Array[Int](numUsable) // Number of new executors on each worker
  var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)

  // Determines whether the worker at position pos can launch an executor for this app
  def canLaunchExecutorForApp(pos: Int): Boolean = {

    val keepScheduling = coresToAssign >= minCoresPerExecutor
    val enoughCores = usableWorkers(pos).coresFree - assignedCores(pos) >= minCoresPerExecutor
    val assignedExecutorNum = assignedExecutors(pos)

    // If we allow multiple executors per worker, then we can always launch new executors.
    // Otherwise, if there already is an executor on this worker, just give it more cores.

    // launchingNewExecutor is true if coresPerExecutor was set in the spark-submit script,
    // or if this worker has not yet been assigned an executor for this application
    val launchingNewExecutor = !oneExecutorPerWorker || assignedExecutorNum == 0
    // a new executor can be started
    if (launchingNewExecutor) {
      val assignedMemory = assignedExecutorNum * memoryPerExecutor
      val enoughMemory = usableWorkers(pos).memoryFree - assignedMemory >= memoryPerExecutor
      val assignedResources = resourceReqsPerExecutor.map {
        req => req.resourceName -> req.amount * assignedExecutorNum
      }.toMap
      val resourcesFree = usableWorkers(pos).resourcesAmountFree.map {
        case (rName, free) => rName -> (free - assignedResources.getOrElse(rName, 0))
      }
      val enoughResources = ResourceUtils.resourcesMeetRequirements(
        resourcesFree, resourceReqsPerExecutor)
      val underLimit = assignedExecutors.sum + app.executors.size < app.executorLimit
      keepScheduling && enoughCores && enoughMemory && enoughResources && underLimit
    } else {
      // We're adding cores to an existing executor, so no need
      // to check memory and executor limits.
      // When the conditions for starting a new executor are not met,
      // cores are appended to the old executor instead
      keepScheduling && enoughCores
    }
  }

  // Keep launching executors until no more workers can accommodate any
  // more executors, or if we have reached this application's limits

  var freeWorkers = (0 until numUsable).filter(canLaunchExecutorForApp)
  while (freeWorkers.nonEmpty) {
    freeWorkers.foreach { pos =>
      var keepScheduling = true
      while (keepScheduling && canLaunchExecutorForApp(pos)) {
        coresToAssign -= minCoresPerExecutor
        assignedCores(pos) += minCoresPerExecutor

        // If we are launching one executor per worker, then every iteration assigns 1 core
        // to the executor. Otherwise, every iteration assigns cores to a new executor.
        if (oneExecutorPerWorker) {
          // the worker cannot start a new executor, so the minCoresPerExecutor CPU cores
          // (1 by default in this case) are added to the existing executor
          assignedExecutors(pos) = 1
        } else {
          // the worker may start a new executor, so minCoresPerExecutor CPU cores (here the
          // coresPerExecutor value configured in the spark-submit script) go to a new executor
          assignedExecutors(pos) += 1
        }

        // Spreading out an application means spreading out its executors across as
        // many workers as possible. If we are not spreading out, then we should keep
        // scheduling executors on this worker until we use all of its resources.
        // Otherwise, just move on to the next worker.
        if (spreadOutApps) {
          // keepScheduling = false means each worker is assigned cores only once per round;
          // we then move on to assign cores on the next worker, until all workers
          // have been traversed
          keepScheduling = false
        }
      }
    }
    freeWorkers = freeWorkers.filter(canLaunchExecutorForApp)
  }
  // return the number of CPU cores assigned on each worker
  assignedCores
}
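The effect of spreadOutApps on the assignment loop can be reproduced with a toy model. The sketch below is my own simplification (ignoring memory, custom resources and the executor limit) that keeps only the core-assignment loop; it is not the real Spark code:

```scala
object SpreadOutDemo {
  // Toy model of the assignment loop: each worker is described only by its free cores.
  def schedule(coresFree: Array[Int],
               coresLeft: Int,
               minCoresPerExecutor: Int,
               spreadOut: Boolean): Array[Int] = {
    val assigned = new Array[Int](coresFree.length)
    var toAssign = math.min(coresLeft, coresFree.sum)

    // a worker can take more cores while the app still needs cores
    // and the worker still has minCoresPerExecutor free
    def canAssign(pos: Int): Boolean =
      toAssign >= minCoresPerExecutor &&
        coresFree(pos) - assigned(pos) >= minCoresPerExecutor

    var free = coresFree.indices.filter(canAssign)
    while (free.nonEmpty) {
      free.foreach { pos =>
        var keep = true
        while (keep && canAssign(pos)) {
          toAssign -= minCoresPerExecutor
          assigned(pos) += minCoresPerExecutor
          if (spreadOut) keep = false // one allocation per worker per round
        }
      }
      free = free.filter(canAssign)
    }
    assigned
  }
}
```

With three workers of 8 free cores each, 6 cores to assign and minCoresPerExecutor = 2, spreadOut = true yields [2, 2, 2] (round-robin across workers), while spreadOut = false yields [6, 0, 0] (the first worker is filled before moving on).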

Finally, let's analyze allocateWorkerResourceToExecutors:
private def allocateWorkerResourceToExecutors(
    app: ApplicationInfo,
    assignedCores: Int,
    coresPerExecutor: Option[Int],
    worker: WorkerInfo): Unit = {
  // If the number of cores per executor is specified, we divide the cores assigned
  // to this worker evenly among the executors with no remainder.
  // Otherwise, we launch a single executor that grabs all the assignedCores on this worker.
  val numExecutors = coresPerExecutor.map { assignedCores / _ }.getOrElse(1)
  val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
  for (i <- 1 to numExecutors) {
    val allocated = worker.acquireResources(app.desc.resourceReqsPerExecutor)
    // add a new executor for this application
    val exec = app.addExecutor(worker, coresToAssign, allocated)
    // send a LaunchExecutor command to the worker
    launchExecutor(worker, exec)
    app.state = ApplicationState.RUNNING
  }
}
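The division in the first two lines of the method body is worth spelling out, since any remainder cores on a worker are simply left unused when coresPerExecutor is set. A small extract of just that arithmetic (my own helper, not a Spark API):

```scala
object SplitDemo {
  // Returns (number of executors to launch on this worker, cores given to each one),
  // mirroring the numExecutors / coresToAssign computation above.
  def split(assignedCores: Int, coresPerExecutor: Option[Int]): (Int, Int) = {
    val numExecutors = coresPerExecutor.map(assignedCores / _).getOrElse(1)
    val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
    (numExecutors, coresToAssign)
  }
}
```

For example, with 7 cores assigned on a worker and coresPerExecutor = Some(2), three executors of 2 cores each are launched and 1 core is left unused; with coresPerExecutor = None, a single executor takes all 7 cores.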
OK, at this point our analysis of the Application scheduling algorithm in Spark 3.0, the latest version of Spark, is complete!

Origin www.cnblogs.com/guodong1789/p/11976774.html