Spark Core — the best location for a Task

Task best location

  After the stages have been divided (covered in the previous post), each stage is submitted: the submitMissingTasks() method creates the tasks for the stage, wraps them into a TaskSet, and submits it so that the tasks run on executors on the corresponding workers. Let's analyze the specific process:

 private def submitMissingTasks(stage: Stage, jobId: Int) {
    // .....
    // Get the partitions for which tasks need to be created
    val partitionsToCompute: Seq[Int] = stage.findMissingPartitions()
    // Create/reset internal accumulators for the stage (their exact purpose is not important here)
    if (stage.internalAccumulators.isEmpty || stage.numPartitions == partitionsToCompute.size) {
      stage.resetInternalAccumulators()
    }
    // Get the job's properties (e.g. its scheduling priority)
    val properties = jobIdToActiveJob(jobId).properties
    // Add the stage to the runningStages set
    runningStages += stage  
    // Some code omitted
    .....
    
    // First compute the best locations for the tasks this stage will create;
    // this is done for both ShuffleMapStage and ResultStage.
    val taskIdToLocations: Map[Int, Seq[TaskLocation]] = try {
      stage match {
        case s: ShuffleMapStage =>
          partitionsToCompute.map { id => (id, getPreferredLocs(stage.rdd, id))}.toMap
        case s: ResultStage =>
          val job = s.activeJob.get
          partitionsToCompute.map { id =>
            val p = s.partitions(id)
            (id, getPreferredLocs(stage.rdd, p))
          }.toMap
      }
    } catch {
      // .....
    }
    //  Create the broadcast variable for the tasks: the RDD and the operator functions
    //  are broadcast to the nodes that will compute them. Some code omitted .....

    // Create the required number of tasks
    val tasks: Seq[Task[_]] = try {
      stage match {
        case stage: ShuffleMapStage =>
          partitionsToCompute.map { id =>
            // Create one task for each partition.
            // Get the best locations for this task
            val locs = taskIdToLocations(id)
            val part = stage.rdd.partitions(id)
            // Create a ShuffleMapTask
            new ShuffleMapTask(stage.id, stage.latestInfo.attemptId,
              taskBinary, part, locs, stage.internalAccumulators)
          }

        case stage: ResultStage =>
          val job = stage.activeJob.get
          partitionsToCompute.map { id =>
            val p: Int = stage.partitions(id)
            val part = stage.rdd.partitions(p)
            val locs = taskIdToLocations(id)
            new ResultTask(stage.id, stage.latestInfo.attemptId,
              taskBinary, part, locs, id, stage.internalAccumulators)
          }
      }
    } catch {
      // Handle exceptions
     .......
    }

    if (tasks.size > 0) {
      logInfo("Submitting " + tasks.size + " missing tasks from " + stage + " (" + stage.rdd + ")")
      stage.pendingPartitions ++= tasks.map(_.partitionId)
      logDebug("New pending partitions: " + stage.pendingPartitions)
      // Finally, wrap the stage's tasks in a TaskSet and submit it via TaskScheduler.submitTasks()
      taskScheduler.submitTasks(new TaskSet(
        tasks.toArray, stage.id, stage.latestInfo.attemptId, jobId, properties))
      stage.latestInfo.submissionTime = Some(clock.getTimeMillis())
    } else {
      // Code omitted
    }
  }
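
Before looking at the location computation, it helps to see what kind of job exercises both branches of the match above. The following is a minimal driver-side sketch, not taken from the Spark source; the input path, partition counts, and local[2] master are assumptions for illustration. reduceByKey() introduces a shuffle boundary, so the job consists of a ShuffleMapStage followed by a ResultStage, and submitMissingTasks() creates one task per partition of each stage:

 import org.apache.spark.{SparkConf, SparkContext}

 object StageDemo {
   def main(args: Array[String]): Unit = {
     val sc = new SparkContext(new SparkConf().setAppName("stage-demo").setMaster("local[2]"))
     // 4 input partitions -> the ShuffleMapStage gets 4 ShuffleMapTasks
     val words = sc.textFile("/tmp/words.txt", 4)   // hypothetical input path
     val counts = words
       .flatMap(_.split(" "))
       .map(word => (word, 1))
       .reduceByKey(_ + _, 2)                       // shuffle boundary, 2 reduce partitions
     // The collect() action triggers a ResultStage with 2 ResultTasks (one per reduce partition)
     counts.collect().foreach(println)
     sc.stop()
   }
 }

With four input partitions and two reduce partitions, the DAGScheduler would submit four ShuffleMapTasks for the first stage and then two ResultTasks for the final stage of this job.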

 The code above does two important things: it computes the best locations for the tasks, and it wraps the tasks created for the current stage into a TaskSet that is submitted to the TaskScheduler, which then allocates and executes them. Let's start with the more interesting part, the best-location computation in getPreferredLocs(), which delegates to getPreferredLocsInternal():
 

  /**
   * Compute the best locations of the partition each task will process.
   * Starting from the stage's last RDD, look for an RDD whose partition has been
   * cached or checkpointed; the locations of that cached or checkpointed partition
   * are then the task's best locations, because running the task on that node means
   * the earlier RDDs do not need to be recomputed.
   */
  private def getPreferredLocsInternal(
      rdd: RDD[_],
      partition: Int,
      visited: HashSet[(RDD[_], Int)]): Seq[TaskLocation] = {
    // Skip (rdd, partition) pairs that have already been visited
    if (!visited.add((rdd, partition))) {
      // Nil has already been returned for previously visited partitions.
      return Nil
    }
    // Is the current RDD's partition cached? If so, the cache locations are the best locations.
    val cached = getCacheLocs(rdd)(partition)
    if (cached.nonEmpty) {
      return cached
    }
    // Otherwise, ask the RDD for its own preferred locations for this partition (e.g. a checkpointed partition)
    val rddPrefs = rdd.preferredLocations(rdd.partitions(partition)).toList
    if (rddPrefs.nonEmpty) {
      return rddPrefs.map(TaskLocation(_))
    }
    // If neither case applies, recurse into the parent RDDs along narrow dependencies,
    // returning as soon as a cached or checkpointed partition is found.
    rdd.dependencies.foreach {
      case n: NarrowDependency[_] =>
        for (inPart <- n.getParents(partition)) {
          val locs = getPreferredLocsInternal(n.rdd, inPart, visited)
          if (locs != Nil) {
            return locs
          }
        }

      case _ =>
    }
    // If no partition from the stage's last RDD back to its first is cached or checkpointed,
    // the task has no best location, so return Nil.
    Nil
  }
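
The rddPrefs branch above asks the RDD itself for its preferred locations; for an input RDD created by textFile() these come from the underlying storage, for example the hosts holding each HDFS block. A small driver-side probe, assuming an existing SparkContext sc and a hypothetical input path, shows the kind of values that branch sees:

 // Hypothetical HDFS path; on HDFS the preferred locations are the hosts storing each block.
 val logs = sc.textFile("hdfs:///data/events.log")
 logs.partitions.foreach { p =>
   // preferredLocations() is the same public RDD method used by the rddPrefs branch above
   println(s"partition ${p.index}: ${logs.preferredLocations(p).mkString(", ")}")
 }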

To summarize the best-location computation: starting from the last RDD of the stage, look for an RDD in that stage whose partition has been cached or checkpointed; the task's best location is then the location of that cached or checkpointed partition, because running the task on that node means the earlier RDDs do not need to be recomputed. If none of the stage's RDDs are cached or checkpointed, Nil (an empty location list) is returned.
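
The cache branch is the one most easily influenced from user code: once an RDD is cached, getCacheLocs() reports the executors holding its blocks, and later jobs that read it get those executors as their tasks' preferred locations. The sketch below, assuming a local[2] master and an arbitrary application name, registers a SparkListener to log each task as it starts; on the second count() the partitions are already cached, so the cached block locations become the tasks' preferred locations:

 import org.apache.spark.{SparkConf, SparkContext}
 import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskStart}

 object CacheLocalityDemo {
   def main(args: Array[String]): Unit = {
     val sc = new SparkContext(new SparkConf().setAppName("cache-locality").setMaster("local[2]"))

     // Log every task as it starts, together with the locality level it was launched at.
     sc.addSparkListener(new SparkListener {
       override def onTaskStart(taskStart: SparkListenerTaskStart): Unit = {
         val info = taskStart.taskInfo
         println(s"stage ${taskStart.stageId} task ${info.index} locality ${info.taskLocality}")
       }
     })

     val data = sc.parallelize(1 to 100000, 4).map(_ * 2).cache()
     data.count()   // first job: nothing is cached yet, so getPreferredLocsInternal returns Nil
     data.count()   // second job: the partitions are cached, so their block locations become the tasks' preferred locations
     sc.stop()
   }
 }

The same idea applies on a real cluster: tasks of a stage that reads a cached RDD are scheduled preferentially onto the executors that hold the cached partitions.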
 


Origin: blog.csdn.net/qq_32445015/article/details/115307963