【Oozie】(2) An Introduction to Oozie's Architecture and Execution Model

1. Overview of the Oozie Framework

The name "Oozie" means "elephant keeper" (mahout).

Oozie is an open-source framework built around a workflow engine. Contributed to Apache by Cloudera, it provides scheduling and coordination for Hadoop MapReduce and Pig jobs. Oozie must be deployed in a Java Servlet container to run.

Scheduling flows are written as XML and can drive MapReduce, Pig, Hive, shell, jar, and other kinds of tasks.
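
A minimal sketch of what such an XML workflow looks like (all names, paths, and property references here are illustrative):

```xml
<!-- workflow.xml: a minimal workflow that runs a single shell action (illustrative names/paths) -->
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.3">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>echo</exec>
            <argument>hello</argument>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```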

2. Main Oozie Features

Workflow: executes flow nodes in sequence, with support for fork (branching into multiple nodes) and join (merging multiple nodes back into one).

Coordinator: triggers a workflow on a schedule (renamed "Schedule" in Hue 4); a sketch follows after this list.

Bundle Job: bundles multiple Coordinators (Schedules) together.
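
A sketch of a Coordinator that triggers the workflow above once a day (the schedule, dates, and paths are illustrative):

```xml
<!-- coordinator.xml: triggers a workflow once a day (illustrative values) -->
<coordinator-app name="demo-coord" frequency="${coord:days(1)}"
                 start="2020-01-01T00:00Z" end="2020-12-31T00:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
    <action>
        <workflow>
            <app-path>${nameNode}/user/demo/apps/demo-wf</app-path>
        </workflow>
    </action>
</coordinator-app>
```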

Relationship diagram:

(figure omitted: the relationship between Workflow, Coordinator, and Bundle)

Oozie architecture diagram:

(figure omitted)

Oozie node types:

  • Control Flow Nodes:

Control flow nodes are usually defined at the start or end of a workflow, such as start, end, and kill. They also provide the mechanisms that steer the execution path, such as decision, fork, and join, as in the skeleton below.
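
A skeleton (action bodies omitted; all names are illustrative) showing how these control nodes fit together with a fork and a join; a decision node would be written similarly with a `<decision>`/`<switch>` block:

```xml
<!-- Control-flow skeleton: start, fork, join, kill, and end nodes (action bodies omitted) -->
<workflow-app name="control-flow-demo" xmlns="uri:oozie:workflow:0.5">
    <start to="forking"/>
    <fork name="forking">
        <path start="step-a"/>
        <path start="step-b"/>
    </fork>
    <action name="step-a">
        <!-- a map-reduce, pig, hive, or shell action body goes here -->
        <ok to="joining"/>
        <error to="fail"/>
    </action>
    <action name="step-b">
        <!-- a second action body goes here -->
        <ok to="joining"/>
        <error to="fail"/>
    </action>
    <join name="joining" to="end"/>
    <kill name="fail">
        <message>Workflow failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```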

  • Action Nodes:

Put as simply as possible, action nodes perform the actual work. For example, an FS action can delete files or create directories on HDFS, as in the sketch below.
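
A sketch of an FS action that removes an output directory and creates a staging directory (paths and node names are illustrative):

```xml
<!-- FS action: delete a directory and create a new one on HDFS (illustrative paths) -->
<action name="cleanup">
    <fs>
        <delete path="${nameNode}/user/demo/output"/>
        <mkdir path="${nameNode}/user/demo/staging"/>
    </fs>
    <ok to="next-step"/>
    <error to="fail"/>
</action>
```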

  • Summary

When learning the Oozie scheduling framework, if the concepts do not sink in at first, work through the examples even while only half understanding them, then revisit the concepts afterwards; they will fall into place naturally.

3. A Brief Look at Oozie Internals

Oozie is the workflow management system for Hadoop. As the paper "Oozie: towards a scalable workflow management system for Hadoop" puts it, a workflow system provides a declarative framework for efficiently managing a wide variety of jobs, and it has four major requirements: scalability, multi-tenancy, Hadoop security, and operability.

The Oozie architecture diagram is shown below:

(figure omitted)

Oozie exposes a RESTful API through which users submit requests (i.e. submit workflow jobs). In fact, submitting a job from the command line with `oozie job ...` also sends an HTTP request to the Oozie Server under the hood.
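
Roughly speaking, that HTTP request carries an XML configuration document describing the job to start; a sketch of such a payload (the values are illustrative, and the same properties usually go into job.properties when using the CLI):

```xml
<!-- XML configuration submitted to Oozie when starting a workflow job (illustrative values) -->
<configuration>
    <property>
        <name>user.name</name>
        <value>demo</value>
    </property>
    <property>
        <name>oozie.wf.application.path</name>
        <value>hdfs://namenode:8020/user/demo/apps/demo-wf</value>
    </property>
</configuration>
```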

After the workflow submission to Oozie, the workflow engine layer drives the execution and associated transitions. The workflow engine accomplishes these through a set of pre-defined internal sub-tasks called Commands.

After a workflow is submitted, the workflow engine is responsible for executing it and for its state transitions, for example moving from one Action to the next, or the workflow state changing from SUSPENDED to KILLED.

Most of the commands are stored in an internal priority queue from where a pool of worker threads picks up and executes those commands. There are two types of commands: some are executed when the user submits the request and others are executed asynchronously.

In other words, there are two kinds of Commands: some are executed synchronously at the time the user submits a request, while others are executed asynchronously.

The user first deploys the job (e.g. an MR job) on HDFS and then submits a Workflow to Oozie; Oozie in turn submits the job to Hadoop asynchronously. This is why calling Oozie's RESTful submission interface immediately returns a jobId: the client program does not have to wait for the job to finish (large jobs may run for hours or even days). In the background, Oozie asynchronously submits each Action of the workflow to Hadoop for execution.

Oozie splits larger workflow management tasks (not Hadoop jobs) into smaller manageable subtasks and asynchronously processes them using a pre-defined state transition model.

In addition, Oozie provides an access layer for reaching the underlying cluster resources; this is one aspect of Hadoop security.

Oozie provides a generic Hadoop access layer restricted through Kerberos authentication to access Hadoop's Job Tracker and Name Node components.

4. Oozie's Horizontal and Vertical Scalability

Horizontal scalability shows up in the following ways:

① The actual job execution context does not live in the Oozie Server process (this comes up again in the Action execution model below). In other words, the Oozie Server is only responsible for running the workflow itself; the Actions inside the workflow, such as a MapReduce Action or a Java Action, are executed on the cluster. The Oozie Server only queries the execution status and results of those Actions, which keeps its load low.

Oozie needs to execute different types of jobs as part of workflow processing. If the jobs are executed in the context of the server process, there will be two issues: 1) fewer jobs could run simultaneously due to limited resources in a server process, causing a significant penalty in scalability, and 2) the user application could directly impact the Oozie server performance.

By handing the execution of the actual jobs (MR or Java Actions) to Hadoop to manage and run, the Oozie Server only has to query job status… If the number of submitted workflows grows, you simply add more Oozie Servers.

② Job state is persisted to a relational database (with ZooKeeper under consideration for the future). Because job state (for example, the state of an MR Action) is stored in a database rather than in the memory of a single machine, scaling out is easy. And, as noted above, the actual jobs are executed by Hadoop. A configuration sketch of this persistent store follows the quote below.

Oozie stores the job states into a persistent store. This approach enables multiple Oozie servers to run simultaneously from different machines.
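
A sketch of what that persistent store looks like in oozie-site.xml, using the JPAService JDBC settings (the driver, URL, and credentials shown are illustrative):

```xml
<!-- oozie-site.xml: the relational store that job states are persisted to (illustrative JDBC values) -->
<configuration>
    <property>
        <name>oozie.service.JPAService.jdbc.driver</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>oozie.service.JPAService.jdbc.url</name>
        <value>jdbc:mysql://dbhost:3306/oozie</value>
    </property>
    <property>
        <name>oozie.service.JPAService.jdbc.username</name>
        <value>oozie</value>
    </property>
    <property>
        <name>oozie.service.JPAService.jdbc.password</name>
        <value>oozie</value>
    </property>
</configuration>
```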

Vertical scalability shows up in:

① Proper configuration and use of the worker thread pool and of the Commands queue (a configuration sketch follows at the end of this list).

② An asynchronous job submission model, which reduces thread blocking.

Oozie often uses a pre-defined timeout for any external communication. Oozie follows an asynchronous job execution pattern for interaction with external systems. For example, when a job is submitted to the Hadoop Job Tracker, Oozie does not wait for the job to finish since it may take a long time. Instead Oozie quickly returns the worker thread back to the thread pool and later checks for job completion in a separate interaction using a different thread.

③ A transaction model based on memory locks rather than on the persistent store (this part is still a bit unclear to me).

In order to maximize resource usage, the persistent store connections are held for the shortest possible duration. To this end, we chose a memory lock based transaction model instead of a persistent store based one; the latter is often more expensive to hold for a long time.
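
For item ①, a sketch of the relevant oozie-site.xml settings for the command queue and its worker thread pool (the CallableQueueService is the internal queue/worker pool described above; the values shown are illustrative):

```xml
<!-- oozie-site.xml: sizing the internal command queue and its worker thread pool (illustrative values) -->
<configuration>
    <property>
        <name>oozie.service.CallableQueueService.queue.size</name>
        <value>10000</value>
    </property>
    <property>
        <name>oozie.service.CallableQueueService.threads</name>
        <value>10</value>
    </property>
    <property>
        <name>oozie.service.CallableQueueService.callable.concurrency</name>
        <value>3</value>
    </property>
</configuration>
```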

Finally, how does Oozie learn the result of a job from the Hadoop cluster? It uses both a callback and polling.

The callback keeps overhead low; the polling guarantees reliability.

When Oozie starts a MapReduce job, it provides a unique callback URL as part of the MapReduce job configuration; the Hadoop Job Tracker invokes the given URL to notify the completion of the job. For cases where the Job Tracker failed to invoke the callback URL for any reason (i.e. a transient network failure), the system has a mechanism to poll the Job Tracker for determining the completion of the MapReduce job.
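
To make the callback idea concrete: Hadoop MapReduce itself supports a job-end notification URL in the job configuration, and that is the kind of hook the callback relies on. The snippet below is only an illustration of that mechanism; the callback URL shown is made up, since Oozie generates its own unique URL per action:

```xml
<!-- Sketch of Hadoop's job-end notification hook (illustrative URL; Oozie supplies its own per-action callback URL) -->
<configuration>
    <property>
        <name>mapreduce.job.end-notification.url</name>
        <value>http://oozie-server:11000/oozie/callback?id=$jobId&amp;status=$jobStatus</value>
    </property>
</configuration>
```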

5. Oozie's Action Execution Model

A fundamental design principle in Oozie is that the Oozie server never runs user code other than the execution of the workflow itself. This ensures better service stability by isolating user code away from Oozie's code. The Oozie server is also stateless and the launcher job makes it possible for it to stay that way. By leveraging Hadoop for running the launcher, handling job failures and recoverability becomes easier for the stateless Oozie server.

① Oozie never runs user code other than the execution of the workflow itself. The "user code" here is, for example, the MapReduce program written by the user.

② The Oozie server is also stateless, and the launcher job…

The Oozie server's statelessness comes down to the fact that it persists job execution information to the database.

The Action execution model is shown in the figure below:

(figure omitted)

Oozie runs the actual actions through a launcher job, which itself is a Hadoop MapReduce job that runs on the Hadoop cluster. The launcher is a map-only job that runs only one mapper.

Oozie runs a concrete Action through a launcher job. The launcher job is a map-only MR job, and it is not known in advance on which machine in the cluster it will run.

In the figure above, an Oozie Client submits a workflow to the Oozie Server; the workflow contains a concrete Hive job to execute (a Hive Action).

The Oozie Server first starts an MR job, the launcher job, and the launcher job in turn kicks off the actual Hive job (a Hive job is essentially an MR job).

As noted, the launcher job is itself an MR job, so it occupies a slot. That means every submitted workflow job creates a launcher job that takes up a slot. If the underlying Hadoop cluster has few slots while Oozie submits many jobs, the launcher jobs can use up all the slots, leaving no slot for the actual Actions to run in, which results in a deadlock. Of course, Oozie can be configured to keep launcher jobs from crowding out the real work, as sketched below.
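
One commonly used knob is sketched below: properties prefixed with `oozie.launcher.` in an action's configuration are applied to the launcher job, so launchers can be routed to their own queue instead of competing with the actual actions (the queue name is illustrative):

```xml
<!-- Inside an action's <configuration>: send the launcher job to a dedicated queue (illustrative queue name) -->
<configuration>
    <property>
        <name>oozie.launcher.mapred.job.queue.name</name>
        <value>launcher-queue</value>
    </property>
</configuration>
```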

Also, for an MR Action (or a Hive Action), the launcher job does not need to wait for the Action it launched to finish before exiting. In fact, the launcher of an MR Action exits without waiting for the MR job to complete.

The <map-reduce> launcher is the exception and it exits right after launching the actual job instead of waiting for it to complete.

Finally, it is precisely because of this launcher-job mechanism that, when handing a job over to Oozie to manage and run, the job's configuration files must first be deployed on HDFS, after which a RESTful request is sent to the Oozie Server to submit the job.


Reposted from blog.csdn.net/BeiisBei/article/details/104897279