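A Pentaho project has three main parts:

- the pentaho engine (this part rarely changes once it is in place)

- pentaho-solution (the solutions; this is the part you build out for each new requirement)

- pentaho-style (a standalone application dedicated to display styling)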

Notes on the Pentaho home page (Home.jsp)

The home page template is ${solution-path}/system/custom/template-home.html

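Using a template page keeps display style separate from displayed content. In Pentaho, styling is handled by a dedicated part of the project (the pentaho-style application).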

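1. First, initialize the PentahoSession. PentahoSession is a wrapper around HttpSession.

2. Call the PentahoSystem utility method getUITemplater(IPentahoSession session) to obtain the core template interface, IUITemplater. (IUITemplater is used frequently; it is configured in pentaho.xml and cached in the global scope.)

3. The core template class UIUtil (the IUITemplater implementation) calls breakTemplate("template-home.html", "", userSession) to load ${solution-path}/system/custom/template-home.html and convert the file to a string. It then splits the template in two at the {content} marker; the two halves are the intro and the footer.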
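The split in step 3 can be illustrated with a minimal sketch. This is not Pentaho's UIUtil code; the class TemplateSplitSketch and the method breakAtContent are hypothetical names used only to show what "split at {content} into intro and footer" means.

```java
// Hypothetical helper, not the real UIUtil.breakTemplate: it only shows the
// idea of cutting a template string in two at the {content} marker.
class TemplateSplitSketch {
    static final String MARKER = "{content}";

    /** Returns {intro, footer}. If the marker is absent, the whole text is the intro. */
    static String[] breakAtContent(String template) {
        int idx = template.indexOf(MARKER);
        if (idx < 0) {
            return new String[] { template, "" };
        }
        String intro = template.substring(0, idx);
        String footer = template.substring(idx + MARKER.length());
        return new String[] { intro, footer };
    }

    public static void main(String[] args) {
        String[] parts = breakAtContent("<html><body>{content}</body></html>");
        System.out.println("intro  = " + parts[0]);
        System.out.println("footer = " + parts[1]);
    }
}
```

The page is then rendered by writing the intro, the generated content, and the footer in that order.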

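Key Pentaho interfaces and classes

1. IPentahoSession

Wraps HttpSession and PortletSession. Besides the usual methods of those sessions, it provides the following:

- getLocale(): returns the session's Locale object

- isAuthenticated(): reports whether the current session has been authenticated

- setAuthenticated(String name): sets the session name and marks the session as authenticated. For an HTTP or portlet session, name should be the login user name (request.getRemoteUser())

- setNotAuthenticated(): marks the logged-in session as not authenticated

- setBackgroundExecutionAlert()

- getBackgroundExecutionAlert(): checks the state of background execution; returns true if a background execution has triggered an alert

- resetBackgroundExecutionAlert()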

2. ISolutionEngine

Each request has one solution engine, which processes one or more action sequences.

- setParameterProvider(String name, IParameterProvider parameterProvider): sets a source of input parameters

- public IRuntimeContext execute(): runs the action sequence through an IRuntimeContext (there are three execute overloads)

- public void setlistener(IActionCompleteListener listener): sets the action-complete listener, which is notified when the action finishes

- public void setSession(IPentahoSession session): sets the session for the solution engine

- public IRuntimeContext getExecutionContext(): returns the runtime context of the execution

- public int getStatus(): returns the current status of the execution

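- public void init(IPentahoSession session): initializes the SolutionEngine. It is called immediately after the object is constructed (or when solution engines are reused in some other way)

- public void setForcePrompt(boolean forcePrompt): forces the prompt page to be shown

- public void setParameterXsl(String xsl): sets the XSL file that the current component uses to generate the parameter page. The path must either start with "/" and be a full path from the solution root, or be relative to the current action sequence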

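3. IRuntimeContext

This interface defines methods and constants used during action execution to resolve parameters, inputs, outputs, and resources, and to persist runtime data.

- public int executeSequence(IActionCompleteListener listener, boolean async): executes the action sequence

- public static Map createComponentClassMap(): a static method that maintains the mapping between component names and component classes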

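4. ISequenceDefinition

A SequenceDefinition holds the definition of one ActionSequence object; it is the runtime object for an action sequence document (the .xaction file). A SequenceDefinition can contain one or more ActionDefinitions, which are executed automatically one after another.

result-type is one of "none", "report", "rule", and "process".

public Map getInputDefinitions(): returns the input parameters defined in the definition file

public Map getInputDefinitionsForParameterProvider(String parameterProviderName): returns the input parameters defined for a particular parameter provider. For example, if an input named "REGION" may come from the request parameter regn, call this method and pass "request" as parameterProviderName

public Map getOutputDefinitions(): returns the output parameters defined in the definition file

public Map getResourceDefinitions(): returns the resources defined in the definition file

public String getSequenceName(): returns the name of the action sequence file

public String getResultType(): returns the result type of the actions in the action sequence file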

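How a Pentaho solution is executed

1. PentahoSystem.systemEntryPoint()

2. Obtain the IPentahoSession

3. If background execution is not required (doBackgroundExecution(request, response, userSession)), carry out the steps below; otherwise jump straight to the last step

4. Initialize the OutputStream

5. Obtain the ISolutionRepository from the scope configured in pentaho.xml and initialize it

6. Read solution, path, and action from the request parameters; the ISolutionRepository uses them to look up the IActionSequence

7. If the actionSequence has a title attribute, go to step 9; otherwise execute step 8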

8. (not recorded)

9. Initialize the HttpOutputHandler

10. Initialize the HttpMimeTypeListener

11. Register the HttpMimeTypeListener with the HttpOutputHandler

12. Initialize the SimpleUrlFactory

13. Initialize the IParameterProvider

14. setupOutputHandler(outputHandler, requestParameters);

15. If handleSubscriptions is false, initialize an HttpServletRequestHandler and call handleActionRequest

16. PentahoSystem.systemExitPoint()

Breakdown of step 15

Initialize the IRuntimeContext

Obtain the ISolutionEngine and initialize it; calling execute on the ISolutionEngine yields the IRuntimeContext

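Pentaho startup process

SolutionContextListener listens for ServletContext initialization (in public void contextInitialized(ServletContextEvent event)) and then, in order:

- sets the character encoding from web.xml

- sets text-direction from web.xml

- sets the internationalization settings from web.xml

- obtains solutionPath

- obtains base-url

- instantiates the IApplicationContext object from solutionPath, base-url, and related information

- copies the initParameters of the servletContext into the applicationContext object

- if web.xml configures pentaho-system-cfg, uses that value to set the System property SYSTEM_CFG_PATH_KEY (the LiberatedSystemSettings class uses SYSTEM_CFG_PATH_KEY to decide which system configuration file to load, normally pentaho.xml); if pentaho-system-cfg is not configured, the default is used

- initializes PentahoSystem (PentahoSystem.init(applicationContext))

- reports whether Pentaho initialized successfully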

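How PentahoSystem is initialized

- reads solutionPath from applicationContext and stores it in the System properties (System.setProperty("pentaho.solutionpath", propertyPath))

- initializes SystemSettings (subclass PathBasedSystemSettings)

- sets the ACL file suffixes from the acl-files element in pentaho.xml

- sets up the cache manager and puts it into PentahoSystem's globalObjectsMap

- initializes the xmlFactory from the xml-factories node in pentaho.xml, falling back to the default XMLFactory

- checks that the audit class configured in pentaho.xml exists

- initializes a StandaloneSession

- takes a step to guarantee that hostnames in SSL mode are not being spoofed

- initializes PentahoSystem's publishers from the publishers configured in pentaho.xml

- initializes PentahoSystem's listeners from pentaho.xml

- initializes, from pentaho.xml, the list of actions that need to be created in the session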

Action Sequence

Action Sequence XML

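1. Defining inputs

An Action Sequence document recognizes three kinds of parameters: inputs, outputs, and resources. Inputs and outputs are variables of a specific type (string, property-map, and so on). Resources are similar to inputs but have a specific MIME type and path, and they have no default value. Resources typically represent large pieces of data, such as a report definition or an image.

Parameters can be obtained from five sources: runtime, request, session, global, and default.

- Runtime parameters: parameters stored in the RuntimeContext

- Request parameters: parameters passed in the URL as name-value pairs

- Session parameters: parameters held in the user's session; each user has different values

- Global parameters: similar to session parameters, but shared by all users

- Default parameters: defined in the Action Sequence document; the default value is used only when none of the first four sources can supply a value

For example: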

<inputs>

<region type="string">

<sources>

<request>REGION</request>

<runtime>aRegion</runtime>

</sources>

<default-value>Central</default-value>

</region>

</inputs>

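The example shows that, at execution time, the Action Sequence document needs a parameter named region (case-sensitive). The RuntimeContext first checks the request for a parameter named REGION (REGION=xxx in the URL); if it is found, xxx is assigned to region. If not, it checks the runtime context for a value named aRegion. If that is not found either, the default value Central is assigned to region.

Notes:

- The RuntimeContext assigns values in the order the sources are listed in the Action Sequence document, with default-value last. If no value can be assigned, the Action Sequence raises an error and returns.

- There are two implicit parameters, instance-id and solution-id. They need no declaration or configuration and are visible to both inputs and outputs.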
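The lookup order described above can be sketched in plain Java. This is only an illustration of the rule (sources in document order, then default-value, then an error); it is not the RuntimeContext implementation, and the class InputResolutionSketch, the method resolve, and the sample value "Eastern" are all assumptions made up for the example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the input lookup order, not Pentaho code.
class InputResolutionSketch {
    /**
     * sources:   provider name -> key to look up in that provider, in document order.
     * providers: provider name -> the parameters that provider currently holds.
     */
    static String resolve(LinkedHashMap<String, String> sources,
                          Map<String, Map<String, String>> providers,
                          String defaultValue) {
        for (Map.Entry<String, String> source : sources.entrySet()) {
            Map<String, String> provider = providers.get(source.getKey());
            String key = source.getValue();
            if (provider != null && provider.containsKey(key)) {
                return provider.get(key);   // first source that has the key wins
            }
        }
        if (defaultValue != null) {
            return defaultValue;            // default-value is consulted last
        }
        throw new IllegalStateException("no source and no default-value supplied a value");
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> sources = new LinkedHashMap<>();
        sources.put("request", "REGION");
        sources.put("runtime", "aRegion");

        Map<String, Map<String, String>> providers = new HashMap<>();
        providers.put("request", new HashMap<String, String>());  // no REGION in the URL
        Map<String, String> runtime = new HashMap<>();
        runtime.put("aRegion", "Eastern");
        providers.put("runtime", runtime);

        System.out.println(resolve(sources, providers, "Central")); // prints Eastern
    }
}
```

With both providers empty the same call returns Central, and with no default-value it throws, which mirrors the error case in the first note.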

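2. Data types

Types currently supported by the Pentaho BI platform:

- content: large data inside a component, for example a PDF file generated by the reporting component. Content can be of any type and is held internally as a byte stream, so it has no default-value

- long: a Java Long object

- property-map-list: a list of maps whose values are Java Strings

- property-map: a map of Java Strings

- string: a standard Java String

- string-list: a list of Java String objects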

3. Resource types

- file: an absolute path on the file system

<file>

<location>D:/samples/reporting/MyReport.rptdesign</location>

<mime-type>text/xml</mime-type>

</file>

- solution-file: a path relative to the ${solution-path}/system directory

<solution-file>

<location>MyReport.rptdesign</location>

<mime-type>text/xml</mime-type>

</solution-file>

- url: a fully qualified URL

<url>

<location>http://www.myserver.com/logo.png</location>

<mime-type>image/png</mime-type>

</url>

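4. Actions

The Action Sequence document is a description file, the RuntimeContext provides the runtime environment, and components carry the business logic; each component performs one self-contained function. A component has two main responsibilities: validation and execution. Validation checks that the inputs and resources are valid; execution does the actual work.

The action-definition node in the Action Sequence document describes what a component should do, and holds the inputs, outputs, and other configuration the component needs when it executes.

4.1 Action-inputs

action-inputs and action-resources define the parameters a component needs when it executes. If a parameter the component requires cannot be obtained at run time, a runtime error occurs. There are several ways to supply a value: by name, by mapping, as a hard-coded constant, and in some cases by prompting the user.

4.2 Action-outputs

action-outputs define variables that are stored in the RuntimeContext. Once the component has finished executing, its action-outputs are visible to other components and can serve as their action-inputs.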
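How one action's outputs become a later action's inputs can be sketched with a shared map standing in for the RuntimeContext. Everything here is hypothetical (ActionChainingSketch, Context, runAction, and the parameter names); the real platform resolves names and mappings from the action-definition XML rather than from method arguments.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of action-input / action-output chaining, not Pentaho code.
class ActionChainingSketch {
    /** Shared store standing in for the parameters held by the RuntimeContext. */
    static final class Context {
        private final Map<String, Object> values = new HashMap<>();

        Object get(String name) {
            if (!values.containsKey(name)) {
                // A required input that cannot be obtained is a runtime error.
                throw new IllegalStateException("missing input: " + name);
            }
            return values.get(name);
        }

        void put(String name, Object value) {
            values.put(name, value);
        }
    }

    /** Runs one action: reads the mapped input, applies the component, stores the output. */
    static void runAction(Context ctx, String mappedInput, String outputName,
                          Function<Object, Object> component) {
        Object input = ctx.get(mappedInput);
        ctx.put(outputName, component.apply(input));
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        ctx.put("region", "Central");                                  // sequence-level input
        runAction(ctx, "region", "greeting", v -> "Hello, " + v);      // first action
        runAction(ctx, "greeting", "message", v -> v + "!");           // consumes the first output
        System.out.println(ctx.get("message"));                        // prints Hello, Central!
    }
}
```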

5. XML Schema

· <action-sequence> REQUIRED - the root node

o <name> NOT REQUIRED - the name of the Action Sequence file, e.g. example.xaction

o <version> NOT USED - the document version

o <title> NOT REQUIRED - a friendly name for the document, used for display

o <logging-level> NOT REQUIRED - one of TRACE, DEBUG, INFO, WARN, ERROR and FATAL. The default is ERROR

o <documentation> NOT REQUIRED - descriptive information used when generating documentation

§ <author> - NOT REQUIRED - the author

§ <description> - NOT REQUIRED - a description, displayed by the navigation component

§ <help> - NOT REQUIRED - usage instructions for the end user

§ <result-type> - NOT REQUIRED - Type of output this Action Sequence will generate. It is used by the solution navigation component to generate its display. Action Sequences without a result-type will not be displayed by the navigation component. Valid values are: Report, Process, Rule, View and None.

§ <icon> - NOT REQUIRED - Thumbnail image that the navigation component will use for generating its display. The path to the image is relative to the directory that the ActionSequence document is in. For example: Example1_image.png

o <inputs> - NOT REQUIRED - Collection of input parameters.

§ <param-name type="data-type" > - NOT REQUIRED - param-name is the name of a parameter that the Action Sequence is expecting to be available at run time. The type attribute specifies the data type of this parameter. See section 2 (Data Types) for valid data types.

§ <default-value> - NOT REQUIRED - Allows the input parameter to specify a default value if a value has not been supplied. If the default-value node is present but has no value specified, the user will be prompted for the value if possible.

§ <sources> - NOT REQUIRED - list of parameter providers in the order they should be queried to obtain a parameter. Valid values are request, session and runtime. Note: if a param-name is set but default-value and sources are both not specified, a validation error will occur.

o <outputs> - NOT REQUIRED - Collection of output parameters.

§ <param-name type="data-type" > - NOT REQUIRED - param-name is the name of a parameter that the Action Sequence is expecting will be set by the time all action-definitions have executed. The type attribute specifies the data type of this parameter. See section 2 (Data Types) for valid data types.

o <logging-level> NOT REQUIRED - Sets the logging level during this execution of the action-definition. Valid values are: TRACE, DEBUG, INFO, WARN, ERROR and FATAL. If no logging level is set, ERROR will be used.

o <resources> - NOT REQUIRED - Collection of resource parameters.

§ <resource-name > - NOT REQUIRED - resource-name is the name of a resource that the Action Sequence is expecting to use. The type attribute specifies the data type of this parameter. See section 2 (Data Types) for valid data types.

§ <resource-type> - REQUIRED - The name of the type of resource required. Valid values are: solution-file, file and url.

§ <location> - REQUIRED - The path to the resource. For a resource-type of "solution-file", the location is a pathname relative to the top level of the current solution. If the resource-type is "file" then the location is assumed to be a fully qualified path. For a resource-type of "url" the location is assumed to be a fully qualified URL.

§ <mime-type> - NOT REQUIRED - Gives a hint about the mime type of the resource.

o <actions [loop-on="parameter-name"] > - REQUIRED - The actions node contains "action-definition" nodes and optionally more "actions" nodes. The loop-on attribute is optional. When it is used, the nodes within "actions" will be executed multiple times. It is necessary to specify a parameter that is of type list (string-list or property-map-list), and the group of nodes will be executed once for each element in the list. An input parameter will be generated with the same name as the loop-on attribute but it will have the value of one element in the list. For example: if a loop-on attribute named "department" is a string-list with department names, then a parameter named department will be available and be set to a different department name for each iteration.

o <actions [loop-on="parameter-name"] > - NOT REQUIRED - Since a single level of looping is not very fun, actions nodes can be nested within actions nodes to any level desired - no matter how silly it may be to do so.

o <action-definition> - REQUIRED (At least 1) - It defines one complete call to a component for execution of a task.

o <action-inputs> - NOT REQUIRED - Collection of action-input parameters.

§ <input-name type="data-type" mapping="param"> - NOT REQUIRED - input-name is the name of a parameter that the Action Definition is expecting to be available at run time. The type attribute specifies the data type of this parameter. See section 2 (Data Types) for valid data types. The mapping attribute allows this input to be mapped to an Action Sequence input or a previous action-definition output with a different name.

o <action-outputs> - NOT REQUIRED - Collection of action-output parameters.

§ <output-name type="data-type" > - NOT REQUIRED - output-name is the name of a parameter that the Component will have set by the time it finishes executing. The type attribute specifies the data type of this parameter. See section 2 (Data Types) for valid data types.

§ <component-name> - REQUIRED - The name of the java class that executes the action definition.

§ <component-definition> - REQUIRED - The component specific XML definition. See the documentation for the specific component for more information. This node may be empty but it must exist or a validation error will occur.
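The loop-on behaviour described for the actions node can be sketched as follows. This is an illustration of the described semantics only, assuming a plain map of parameters; LoopOnSketch and loopOn are hypothetical names and do not exist in Pentaho.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of loop-on semantics, not Pentaho code.
class LoopOnSketch {
    /**
     * Runs the nested actions once per element of the list parameter. Inside each
     * iteration the parameter with the loop-on name holds a single element.
     */
    static void loopOn(String paramName, Map<String, Object> params,
                       Consumer<Map<String, Object>> actions) {
        Object value = params.get(paramName);
        if (!(value instanceof List)) {
            throw new IllegalArgumentException(paramName + " must be a list parameter");
        }
        for (Object element : (List<?>) value) {
            Map<String, Object> iterationParams = new HashMap<>(params);
            iterationParams.put(paramName, element);  // same name, one element
            actions.accept(iterationParams);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("department", Arrays.asList("Sales", "Finance"));
        loopOn("department", params,
                p -> System.out.println("department = " + p.get("department")));
    }
}
```

Nesting actions inside actions corresponds to calling loopOn again from within the consumer.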

Executing an Action Sequence

A solution can be executed from Design Studio, through a URL, from Java code, or through a Web Service call.

URL

Through ViewAction, for example: http://localhost:8080/pentaho/ViewAction?&solution=samples&path=getting-started&action=HelloWorld.xaction

Parameters:

- solution, path, action*: used to load the Action Sequence document

- instance_id*: the instance ID of a previous Runtime Context

- debug*: set to "true" to write debug information to the execution log
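A ViewAction URL of this shape can be assembled in code. The sketch below is a plain string builder using the JDK's URLEncoder; ViewActionUrlSketch and viewActionUrl are hypothetical names, and the base URL is simply the one from the example above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for building a ViewAction URL, not part of Pentaho.
class ViewActionUrlSketch {
    static String viewActionUrl(String baseUrl, String solution, String path, String action) {
        return baseUrl + "/ViewAction"
                + "?solution=" + encode(solution)
                + "&path=" + encode(path)
                + "&action=" + encode(action);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(viewActionUrl("http://localhost:8080/pentaho",
                "samples", "getting-started", "HelloWorld.xaction"));
    }
}
```

Encoding each value matters once a solution or path name contains spaces or other reserved characters.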

Web Service invocation

Through ServiceAction, for example: http://localhost:8080/pentaho/ServiceAction?solution=samples&path=getting-started&action=HelloWorld.action.xml

The result is an XML SOAP response.

Parameters:

- solution, path, action*: used to load the Action Sequence document

- instance_id*: the instance ID of a previous Runtime Context

- debug*: set to "true" to write debug information to the execution log

Java invocation

Invoke it directly through org.pentaho.test.RuntimeTest

Actions and Component Reference

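Each action in an Action Sequence describes one specific type of task for the Solution Engine to execute. For example, a SQL Query action describes a SQL query; it needs a JNDI or JDBC connection, and the name of the action output in which the result is stored once the component has run.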

1. BIRT Reports

2. Calling an external Action Sequence, executed synchronously

ComponentName: SubActionComponent

Inputs:

solution: required

path: required

action: required

action-url-component: optional; specifies the action URL that replaces "ViewAction?"

session-proxy: optional; the name of the input field to use as a new session name in the sub action

Outputs:

Pentaho source code analysis