Spark basic workflow and YARN cluster mode principle

Please indicate the source for reprint: http://www.cnblogs.com/BYRans/

Spark Basic Workflow

Explanation of related terms

Several terms related to Spark applications:

  • Worker: Any node in the cluster that can run application code, analogous to a NodeManager node in YARN. In Spark on YARN mode it is in fact the NodeManager node;
  • Executor: A process launched for an application on a Worker node. It runs the application's tasks and keeps data in memory or on disk. Each application has its own set of Executors.
  • SparkContext: Started by the user program; communicates with the Executors through the resource scheduling module.
  • Driver: Runs the main() function of the application and creates the SparkContext. The purpose of creating the SparkContext is to prepare the running environment of the Spark application: the SparkContext is responsible for communicating with the ClusterManager, applying for resources, and assigning and monitoring tasks. When the Executors finish running, the Driver closes the SparkContext. The SparkContext is often used to stand in for the Driver;

Basic operation process

Spark applications can run in several modes. The core implementations of the two parts, SparkContext and Executor, are shared across all of them. On top of these two parts, each run and deployment mode (for example: Local[N], YARN cluster, etc.) provides its own scheduling module and the corresponding adaptation code.

Figure 3-1 Basic frame diagram

Specifically, with SparkContext as the common entry point of a program, Spark creates a two-level scheduling structure during SparkContext initialization: the DAGScheduler for job (stage-level) scheduling and the TaskScheduler for task scheduling.

The job scheduling module is a high-level, stage-oriented scheduler. For each Spark job it computes the set of interdependent stages (usually split at shuffle boundaries), builds a group of concrete tasks for each stage (usually taking data locality into account), and then submits them to the task scheduling module in the form of TaskSets for actual execution. The task scheduling module is responsible for actually launching the tasks and for monitoring and reporting their progress.
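The stage-splitting idea above can be sketched with a toy lineage walker that starts a new stage at every shuffle edge. All class and function names here are simplified inventions for illustration, not Spark's real DAGScheduler code.

```python
# Toy illustration of splitting an RDD lineage into stages at shuffle
# boundaries. Not Spark's actual implementation.

class RDD:
    def __init__(self, name, parent=None, shuffle=False):
        self.name = name          # identifier for this dataset
        self.parent = parent      # upstream RDD, if any
        self.shuffle = shuffle    # True if the edge to the parent is a shuffle dependency

def split_into_stages(final_rdd):
    """Walk the lineage backwards; close the current stage at each shuffle edge."""
    stages, current = [], []
    rdd = final_rdd
    while rdd is not None:
        current.append(rdd.name)
        if rdd.shuffle:           # shuffle dependency: this RDD starts a new stage
            stages.append(list(reversed(current)))
            current = []
        rdd = rdd.parent
    stages.append(list(reversed(current)))
    return list(reversed(stages))  # earliest stage first

# Lineage: textFile -> map -> reduceByKey (shuffle) -> mapValues
base = RDD("textFile")
mapped = RDD("map", parent=base)
reduced = RDD("reduceByKey", parent=mapped, shuffle=True)
result = RDD("mapValues", parent=reduced)
print(split_into_stages(result))
# -> [['textFile', 'map'], ['reduceByKey', 'mapValues']]
```

The map-side work lands in the first stage and everything after the shuffle in the second, which mirrors how stage boundaries fall at shuffle dependencies.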

The detailed running flow is:

  1. Build the running environment of the Spark application (start the SparkContext); the SparkContext registers with the resource manager (which can be Standalone, Mesos or YARN) and applies for Executor resources;
  2. The resource manager allocates Executor resources and starts StandaloneExecutorBackend; the Executors report their status to the resource manager through heartbeats;
  3. The SparkContext builds the DAG, decomposes it into stages, and sends the TaskSets to the TaskScheduler. Executors request tasks from the SparkContext; the TaskScheduler dispatches tasks to the Executors, and at the same time the SparkContext ships the application code to the Executors;
  4. Tasks run on the Executors and release all resources when finished.
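The four-step flow above can be sketched as a toy message exchange. Every class here (ResourceManager, Executor, SparkContext) is an illustrative stand-in, not a Spark API.

```python
# Minimal simulation of the flow: register with a resource manager,
# obtain executors, dispatch tasks, collect results.

class ResourceManager:
    def __init__(self, slots):
        self.slots = slots                  # total executor slots available
    def allocate_executors(self, wanted):
        granted = min(wanted, self.slots)   # step 2: grant resources
        self.slots -= granted
        return [Executor(i) for i in range(granted)]

class Executor:
    def __init__(self, exec_id):
        self.exec_id = exec_id
        self.done = []
    def run(self, task):
        self.done.append(task)              # step 4: task runs on the executor
        return f"result-of-{task}"

class SparkContext:
    def __init__(self, rm, num_executors):
        # step 1: register with the resource manager and apply for executors
        self.executors = rm.allocate_executors(num_executors)
    def run_job(self, tasks):
        # step 3: hand tasks to executors round-robin (a stand-in for the
        # real locality-aware TaskScheduler)
        results = []
        for i, task in enumerate(tasks):
            ex = self.executors[i % len(self.executors)]
            results.append(ex.run(task))
        return results

rm = ResourceManager(slots=4)
sc = SparkContext(rm, num_executors=2)
print(sc.run_job(["task-0", "task-1", "task-2"]))
# -> ['result-of-task-0', 'result-of-task-1', 'result-of-task-2']
```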

Figure: detailed Spark running process

The job scheduling module is independent of the concrete deployment and run mode; its logic is the same in all of them. The differences between run modes show up mainly in the task scheduling module: each deployment and run mode, according to its underlying resource scheduling mechanism, implements its own task scheduling module to dispatch tasks to the actual compute resources. The rest of this article focuses on the principle and implementation details of YARN cluster mode.

Internals of the YARN cluster run mode

Spark has several run modes; here we mainly describe the internals of YARN cluster mode. The figure below shows the logical architecture of YARN cluster mode. Compared with the other modes, what is special about this mode is that an external program is needed to help launch the application: the user's application is started through the auxiliary YARN Client class. The difference between YARN cluster mode and YARN client mode lies in where the ApplicationMaster (AM) runs: in YARN client mode the AM runs on the node that submits the job, while in YARN cluster mode YARN picks a node in the cluster to run the AM, which is not necessarily the submitting node. For example, if spark-shell is to run on YARN, it can only use yarn-client mode; it can be started with spark-shell --master yarn-client.

Figure 3-9 Logical architecture diagram of YARN cluster mode

The Client class submits a request through the YARN client API and starts a Spark ApplicationMaster on the Hadoop cluster. The Spark ApplicationMaster first registers itself as a YARN ApplicationMaster and then starts the user program. When the SparkContext is initialized inside the user program, it uses CoarseGrainedSchedulerBackend together with YARNClusterScheduler. YARNClusterScheduler is just a thin wrapper around TaskSchedulerImpl that adds logic such as waiting for Executors to register.
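The "waiting for Executors" idea mentioned above can be sketched as follows: before scheduling the first tasks, wait until a minimum fraction of the requested executors have registered, or a timeout expires. The function name, ratio and timeout here are invented for illustration; the real logic lives inside Spark's scheduler classes.

```python
# Illustrative sketch: block task scheduling until enough executors
# have registered, or a deadline passes.
import time

def wait_for_executors(registered_count, requested, min_ratio=0.8,
                       timeout_s=3.0, poll_s=0.01):
    """registered_count: zero-arg callable returning the current executor count."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if registered_count() >= requested * min_ratio:
            return True          # enough executors registered: start scheduling
        time.sleep(poll_s)
    return False                 # timed out: proceed with whatever registered

registered = [0]
def fake_registry():
    registered[0] += 1           # pretend one executor registers per poll
    return registered[0]

print(wait_for_executors(fake_registry, requested=5))
# -> True  (4 of 5 executors satisfy the 0.8 ratio)
```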

According to the parameters passed by the Client class, the Spark ApplicationMaster starts a number of Containers in the cluster through the YARN ResourceManager/NodeManager interfaces; these Containers are used to run CoarseGrainedExecutorBackend processes. During startup, each CoarseGrainedExecutorBackend registers with the CoarseGrainedSchedulerBackend.

CoarseGrainedSchedulerBackend is a coarse-grained resource scheduling class implemented on top of Akka actors. Throughout the run of a Spark job, CoarseGrainedSchedulerBackend is mainly responsible for:

  • Listening for Executor registrations and holding the registered Executor resources
  • Based on the available Executor resources, handling Executor registration and status updates, and responding to the Scheduler's scheduling requests
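The bookkeeping described in these two points can be sketched as a small registry that holds executors and offers their free cores to a scheduler. The class and field names below are invented for illustration and do not match Spark's internals.

```python
# Toy sketch of the scheduler backend's bookkeeping: hold registered
# executors and advertise free resources to the task scheduler.

class ExecutorInfo:
    def __init__(self, host, cores):
        self.host = host
        self.free_cores = cores

class SchedulerBackend:
    def __init__(self):
        self.executors = {}                 # executor id -> ExecutorInfo
    def register_executor(self, exec_id, host, cores):
        self.executors[exec_id] = ExecutorInfo(host, cores)
    def make_offers(self):
        # Advertise the current free resources to the task scheduler.
        return {eid: e.free_cores for eid, e in self.executors.items()
                if e.free_cores > 0}
    def launch_task(self, exec_id, cores_per_task=1):
        self.executors[exec_id].free_cores -= cores_per_task

backend = SchedulerBackend()
backend.register_executor("exec-1", "node-a", cores=2)
backend.register_executor("exec-2", "node-b", cores=1)
backend.launch_task("exec-2")                # exec-2's only core is now busy
print(backend.make_offers())
# -> {'exec-1': 2}
```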

Implementation details of the modes

Although Spark's run modes differ in how they are launched, where they run and how they schedule, the work they must accomplish is basically the same: manage and run tasks safely and reliably in the right place, according to the user's configuration and the job's needs. The main issues to be addressed during scheduling and execution are:

  • Propagation of environment variables
  • Distribution of JAR packages and other dependency files
  • Task management and serialization
  • User parameter configuration
  • User and permission control

Propagation of environment variables

A large part of Spark's run parameters are set through environment variables, for example the Executor memory settings and library paths. In cluster modes this raises the question of how environment variables are propagated to the individual Worker nodes, and different run modes propagate them in different ways. Note that in Local mode there is no propagation problem.

Here we describe parameter propagation in the YARN-related modes. In these modes, the environment variables are first set by the YARN client into the Spark AM's running environment; when the Spark AM later starts the Executors, it sets the variables into each Executor's environment in turn.
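The two-hop propagation described above can be sketched as two small functions: one for the client-to-AM hop and one for the AM-to-Executor hop. The variable names and the "forward only SPARK_* variables" filtering rule are illustrative assumptions, not Spark's exact behavior.

```python
# Sketch of two-hop environment propagation: client -> AM -> executor.

def client_build_am_env(client_env):
    # Hop 1: the YARN client copies selected variables into the
    # AM container's environment (here: anything starting with SPARK_).
    return {k: v for k, v in client_env.items() if k.startswith("SPARK_")}

def am_build_executor_env(am_env):
    # Hop 2: the Spark AM copies the variables into each executor
    # container's environment (in this toy model, a pass-through).
    return dict(am_env)

client_env = {"SPARK_EXECUTOR_MEMORY": "2g",
              "SPARK_LIBRARY_PATH": "/opt/libs",
              "HOME": "/home/user"}          # non-Spark vars are not forwarded
am_env = client_build_am_env(client_env)
executor_env = am_build_executor_env(am_env)
print(executor_env)
# -> {'SPARK_EXECUTOR_MEMORY': '2g', 'SPARK_LIBRARY_PATH': '/opt/libs'}
```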

Distribution of JARs and dependency files

A Spark program mainly has two kinds of dependencies:

  • The Spark runtime libraries and their dependencies
  • The application's own additional dependencies

In Local mode there is no JAR distribution problem. Here we describe file distribution in YARN mode.

In the YARN-related modes, the runtime libraries and the other files the program depends on are first uploaded through the HDFS client API to the job's .sparkStaging directory; the mapping between files and URLs is then reported to YARN, and YARN's NodeManager downloads those files from the given URLs when starting a Container, making them part of the running environment.

For files that must be further distributed into the Executors' running environment, the Spark YARN client packs the relevant attributes of each file to be distributed (for example: URL, timestamp, size, etc.) into strings and passes them to the Spark AM through specific environment variables (SPARK_YARN_CACHE_XXXX). When the Spark AM creates an Executor's Container, it restores each file from these variables and initializes the Container by calling the setLocalResources function.
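The pack-and-restore round trip described above can be sketched as follows. The delimiter characters and field order here are invented for illustration; only the idea of serializing (URL, timestamp, size) triples into a single environment-variable string comes from the text.

```python
# Sketch of packing file metadata into one string (what the client would
# export as an environment variable) and restoring it on the AM side.

def pack_files(files):
    """files: list of (url, timestamp, size) tuples -> one delimited string."""
    return ",".join(f"{url}#{ts}#{size}" for url, ts, size in files)

def unpack_files(packed):
    """Inverse of pack_files: restore the list of (url, timestamp, size)."""
    out = []
    for entry in packed.split(","):
        url, ts, size = entry.split("#")
        out.append((url, int(ts), int(size)))
    return out

files = [("hdfs:///user/app/dep.jar", 1700000000, 2048),
         ("hdfs:///user/app/conf.xml", 1700000100, 512)]
env_value = pack_files(files)            # set into the AM's environment
print(unpack_files(env_value) == files)  # the AM restores the metadata
# -> True
```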

Task management and serialization

The problems to be solved in the running of Spark tasks are:

  • Running tasks in the correct order, and managing and dispatching them efficiently
  • Efficiently sending tasks, and the data they need to run, to the remote end
  • Collecting the run results

Spark tasks are dispatched by the DAGScheduler calling TaskScheduler.submitTasks, which submits and schedules a related group of tasks as a unit.

A task's result is serialized on the Executor side and sent back to the SchedulerBackend. Because of the Akka frame-size limit, if the result is too large it is stored in the BlockManager instead, and only the BlockID of the data is sent to the SchedulerBackend; the TaskScheduler then calls TaskResultGetter to read the result asynchronously in a thread pool. Finally, the TaskSetManager updates the task's status according to the result (handling failure retries and so on) and reports it to the DAGScheduler and others.
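The direct-versus-indirect result path can be sketched as a size check: small results travel inline in the message, large ones go to a block store and only a block id is sent back. The threshold constant and the BlockStore class below are stand-ins; the real limit is a configurable frame size.

```python
# Sketch of the result path: inline for small results, via a block
# store (standing in for the BlockManager) for large ones.

MAX_DIRECT_RESULT = 128          # bytes; illustrative threshold

class BlockStore:
    def __init__(self):
        self.blocks = {}
    def put(self, block_id, data):
        self.blocks[block_id] = data
    def get(self, block_id):
        return self.blocks[block_id]

def send_result(task_id, data, store):
    if len(data) <= MAX_DIRECT_RESULT:
        return ("direct", data)              # fits in the message frame
    block_id = f"taskresult_{task_id}"
    store.put(block_id, data)                # too big: store it, send the id
    return ("indirect", block_id)

def fetch_result(msg, store):
    kind, payload = msg
    return payload if kind == "direct" else store.get(payload)

store = BlockStore()
small = send_result(1, b"ok", store)
large = send_result(2, b"x" * 1000, store)
print(small[0], large[0])
# -> direct indirect
```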
