1. Calling a Kettle transformation from Java
Before writing the Java program, first design the transformation in Spoon. The example here pulls a CSDN article list into a txt file.
The interface being pulled is https://blog.csdn.net/community/home-api/v1/get-business-list?page=1&size=20&businessType=blog&orderby=&noMore=false&year=&month=&username=qq_43692950
The return format is as follows:
{
  "code": 200,
  "message": "success",
  "traceId": "b1e8ccb0-2e39-4834-bacd-b52a260bb521",
  "data": {
    "list": [
      {
        "articleId": 130450076,
        "title": "ETL工具 - Kettle 查询、连接、统计、脚本算子介绍",
        "description": "连接算子一般将多个数据集通过关键字进行连接,类似 `SQL` 中的连接操作,统计算子可以提供数据的采样和统计功能,脚本算子可以通过程序代码完成一些复杂的操作",
        "url": "https://xiaobichao.blog.csdn.net/article/details/130450076",
        "type": 1,
        "top": false,
        "forcePlan": false,
        "viewCount": 313,
        "commentCount": 0,
        "editUrl": "https://editor.csdn.net/md?articleId=130450076",
        "postTime": "2023-04-30 23:12:13",
        "diggCount": 1,
        "formatTime": "前天 23:12",
        "picList": [
          "https://img-blog.csdnimg.cn/2e817e14046f4cba9663c89978198f12.png"
        ]
      }
    ],
    "total": 287
  }
}
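Because the transformation receives the whole URL as a variable, paging can be handled on the Java side. The following is a minimal sketch and not part of the original project: the class `CsdnListUrls` and its methods are hypothetical names, and it assumes that `total` in the response is the total number of articles and that `page` starts at 1, as in the URL above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds one list URL per page of the interface shown above.
class CsdnListUrls {
    // Endpoint copied from the interface URL in this article.
    static final String BASE =
            "https://blog.csdn.net/community/home-api/v1/get-business-list";

    // Number of pages needed to cover `total` items at `size` items per page.
    static int pageCount(int total, int size) {
        if (total < 0 || size <= 0) {
            throw new IllegalArgumentException("total must be >= 0 and size > 0");
        }
        return (total + size - 1) / size; // integer ceiling division
    }

    // URL for a single page, keeping the same query parameters as the original URL.
    static String pageUrl(String username, int page, int size) {
        String user = URLEncoder.encode(username, StandardCharsets.UTF_8);
        return BASE + "?page=" + page + "&size=" + size
                + "&businessType=blog&orderby=&noMore=false&year=&month=&username=" + user;
    }

    // URLs for pages 1..pageCount (assumes 1-based page numbering).
    static List<String> allPageUrls(String username, int total, int size) {
        List<String> urls = new ArrayList<>();
        int pages = pageCount(total, size);
        for (int page = 1; page <= pages; page++) {
            urls.add(pageUrl(username, page, size));
        }
        return urls;
    }

    public static void main(String[] args) {
        // 287 is the "total" value from the sample response above.
        List<String> urls = allPageUrls("qq_43692950", 287, 20);
        System.out.println(urls.size() + " pages, first: " + urls.get(0));
    }
}
```

Each generated URL could then be passed to the transformation as the value of the url variable, one run per page.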
1.1 Designing the transformation
Here the url is passed in as a variable, and the overall transformation design is as follows:
Get variables:
REST client:
JSON input:
Select values:
Text file output:
After designing, save the ktr script:
1.2 Calling the transformation script from Java
Create a new Maven project and add the following dependencies to the pom:
<dependencies>
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle-core</artifactId>
        <version>9.6.0.0-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle-engine</artifactId>
        <version>9.6.0.0-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>org.pentaho.di.plugins</groupId>
        <artifactId>pdi-core-plugins-impl</artifactId>
        <version>9.6.0.0-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>pentaho</groupId>
        <artifactId>pentaho-capability-manager</artifactId>
        <version>9.6.0.0-SNAPSHOT</version>
        <scope>compile</scope>
    </dependency>
    <dependency>
        <groupId>commons-cli</groupId>
        <artifactId>commons-cli</artifactId>
        <version>1.3.1</version>
    </dependency>
    <dependency>
        <groupId>com.sun.jersey.contribs</groupId>
        <artifactId>jersey-apache-client4</artifactId>
        <version>1.9.1</version>
    </dependency>
    <dependency>
        <groupId>com.sun.jersey</groupId>
        <artifactId>jersey-core</artifactId>
        <version>1.19.1</version>
    </dependency>
    <dependency>
        <groupId>com.sun.jersey</groupId>
        <artifactId>jersey-client</artifactId>
        <version>1.19.1</version>
    </dependency>
    <dependency>
        <groupId>com.sun.jersey</groupId>
        <artifactId>jersey-bundle</artifactId>
        <version>1.19.1</version>
    </dependency>
</dependencies>
<repositories>
    <repository>
        <id>pentaho-public</id>
        <name>Pentaho Public</name>
        <url>https://repo.orl.eng.hitachivantara.com/artifactory/pnt-mvn/</url>
        <releases>
            <enabled>true</enabled>
            <updatePolicy>daily</updatePolicy>
        </releases>
        <snapshots>
            <enabled>true</enabled>
            <updatePolicy>interval:15</updatePolicy>
        </snapshots>
    </repository>
</repositories>
Create a new file named kettle-password-encoder-plugins.xml under resources, with the following content:
<password-encoder-plugins>
    <password-encoder-plugin id="Kettle">
        <description>Kettle Password Encoder</description>
        <classname>org.pentaho.di.core.encryption.KettleTwoWayPasswordEncoder</classname>
    </password-encoder-plugin>
</password-encoder-plugins>
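Java call logic:
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.core.logging.KettleLogStore;
import org.pentaho.di.core.logging.KettleLoggingEvent;
import org.pentaho.di.core.logging.KettleLoggingEventListener;
import org.pentaho.di.core.logging.LogLevel;
import org.pentaho.di.core.parameters.UnknownParamException;
import org.pentaho.di.core.plugins.PluginFolder;
import org.pentaho.di.core.plugins.StepPluginType;
import org.pentaho.di.repository.Repository;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class RunTrans {

    public static void main(String[] args) {
        try {
            // Register the plugin folder; change this to your own installation directory
            StepPluginType.getInstance().getPluginFolders()
                    .add(new PluginFolder("D:/data-integration_9_3/plugins/", false, true));
            // Initialize the Kettle environment
            KettleEnvironment.init();
        } catch (KettleException e) {
            e.printStackTrace();
        }
        String ktrPath = "D:/data/job/trans.ktr";
        String url = "https://blog.csdn.net/community/home-api/v1/get-business-list?page=1&size=20&businessType=blog&orderby=&noMore=false&year=&month=&username=qq_43692950";
        // Variables passed to the transformation
        Map<String, String> variableMap = new HashMap<>();
        variableMap.put("url", url);
        Boolean res = runTrans(ktrPath, variableMap, null);
        System.out.println("Transformation result: " + res);
    }

    private static Boolean runTrans(String ktrPath, Map<String, String> variableMap, Map<String, String> parameterMap) {
        try {
            // Load the ktr file (the cast selects the (String, Repository) constructor)
            TransMeta transMeta = new TransMeta(ktrPath, (Repository) null);
            Trans trans = new Trans(transMeta);
            trans.setLogLevel(LogLevel.MINIMAL);
            // Variables
            if (Objects.nonNull(variableMap) && !variableMap.isEmpty()) {
                variableMap.forEach(trans::setVariable);
            }
            // Named parameters
            if (Objects.nonNull(parameterMap) && !parameterMap.isEmpty()) {
                parameterMap.forEach((k, v) -> {
                    try {
                        trans.setParameterValue(k, v);
                    } catch (UnknownParamException e) {
                        e.printStackTrace();
                    }
                });
                // Apply the parameter values so the transformation can resolve them
                trans.activateParameters();
            }
            // Listen to the execution log
            KettleLogStore.getAppender().addLoggingEventListener(new KettleLoggingEventListener() {
                @Override
                public void eventAdded(KettleLoggingEvent logs) {
                    System.out.println("Kettle log: level = " + logs.getLevel() + " , time = " + logs.getTimeStamp() + " , message = " + logs.getMessage());
                }
            });
            // Execute the transformation
            trans.execute(new String[0]);
            // Wait for it to finish
            trans.waitUntilFinished();
            // Success means no errors were reported
            return trans.getErrors() == 0;
        } catch (Exception e) {
            e.printStackTrace();
        }
        return false;
    }
}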
Go to the output directory to view the results:
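The result can also be checked from Java instead of opening the directory by hand. This is only a sketch under stated assumptions: the class `OutputCheck` is a hypothetical name, the default path `D:/data/job/output.txt` is a placeholder for whatever file name was configured in the Text file output step, and the file is assumed to be UTF-8 encoded.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: prints the first few lines of the transformation's output file.
class OutputCheck {
    // Returns at most `limit` leading lines of the file (assumes UTF-8 content).
    static List<String> head(Path file, int limit) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.limit(limit).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder path: replace it with the file set in the Text file output step.
        Path out = Path.of(args.length > 0 ? args[0] : "D:/data/job/output.txt");
        if (!Files.isRegularFile(out)) {
            System.out.println("Output file not found: " + out);
            return;
        }
        head(out, 5).forEach(System.out::println);
    }
}
```

If the step was configured with a different encoding, the charset passed to `Files.lines` would need to match it.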
2. Calling a Kettle job from Java
The job calls the transformation above, for testing:
Save the kjb file:
Java call logic:
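import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.core.logging.KettleLogStore;
import org.pentaho.di.core.logging.KettleLoggingEvent;
import org.pentaho.di.core.logging.KettleLoggingEventListener;
import org.pentaho.di.core.logging.LogLevel;
import org.pentaho.di.core.parameters.UnknownParamException;
import org.pentaho.di.core.plugins.PluginFolder;
import org.pentaho.di.core.plugins.StepPluginType;
import org.pentaho.di.job.Job;
import org.pentaho.di.job.JobMeta;

import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class RunJob {

    public static void main(String[] args) {
        try {
            // Register the plugin folder; change this to your own installation directory
            StepPluginType.getInstance().getPluginFolders()
                    .add(new PluginFolder("D:/data-integration_9_3/plugins/", false, true));
            // Initialize the Kettle environment
            KettleEnvironment.init();
        } catch (KettleException e) {
            e.printStackTrace();
        }
        String kjbPath = "D:/data/job/job.kjb";
        String url = "https://blog.csdn.net/community/home-api/v1/get-business-list?page=2&size=20&businessType=blog&orderby=&noMore=false&year=&month=&username=qq_43692950";
        // Variables passed to the job
        Map<String, String> variableMap = new HashMap<>();
        variableMap.put("url", url);
        Boolean res = runJob(kjbPath, variableMap, null);
        System.out.println("Job result: " + res);
    }

    private static Boolean runJob(String kjbPath, Map<String, String> variableMap, Map<String, String> parameterMap) {
        try {
            // Load the kjb file without a repository
            JobMeta jobMeta = new JobMeta(kjbPath, null);
            Job job = new Job(null, jobMeta);
            job.setLogLevel(LogLevel.MINIMAL);
            // Variables
            if (Objects.nonNull(variableMap) && !variableMap.isEmpty()) {
                variableMap.forEach(job::setVariable);
            }
            // Named parameters
            if (Objects.nonNull(parameterMap) && !parameterMap.isEmpty()) {
                parameterMap.forEach((k, v) -> {
                    try {
                        job.setParameterValue(k, v);
                    } catch (UnknownParamException e) {
                        e.printStackTrace();
                    }
                });
                // Apply the parameter values so the job can resolve them
                job.activateParameters();
            }
            // Listen to the execution log
            KettleLogStore.getAppender().addLoggingEventListener(new KettleLoggingEventListener() {
                @Override
                public void eventAdded(KettleLoggingEvent logs) {
                    System.out.println("Kettle log: level = " + logs.getLevel() + " , time = " + logs.getTimeStamp() + " , message = " + logs.getMessage());
                }
            });
            // Start the job (Job runs as a thread)
            job.start();
            // Wait for it to finish
            job.waitUntilFinished();
            // Success means no errors were reported
            return job.getErrors() == 0;
        } catch (Exception e) {
            e.printStackTrace();
        }
        return false;
    }
}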
Go to the output directory to view the results: