Scheduling Kettle transformations on Linux
2023-04-07 12:46:14
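Installing and configuring data-integration
1. Download the PDI client package (version 8.2.0.0). The URL contains a space, so it is quoted and the space is encoded as %20
cd /usr/local/src
wget "https://jaist.dl.sourceforge.net/project/pentaho/Pentaho%208.2/client-tools/pdi-ce-8.2.0.0-342.zip"
2. Unzip it in /usr/local/src
unzip pdi-ce-8.2.0.0-342.zip
3. Copy the database driver JAR (mysql-connector-java-5.1.47.jar) into /usr/local/src/data-integration/lib
4. Add the kettle.properties configuration file. Kettle reads it from $KETTLE_HOME/.kettle/ (by default ~/.kettle/). Custom variables can also be defined in this file and are referenced as ${YOUR_PARAMETER}
vi kettle.properties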
## This file was generated by Pentaho Data Integration version 7.0.0.0-25.
#
# Here are a few examples of variables to set:
#
# PRODUCTION_SERVER = hercules
# TEST_SERVER = zeus
# DEVELOPMENT_SERVER = thor
#
# Note: lines like these with a # in front of it are comments
#
#
#Mon Dec 17 15:24:56 CST 2018
KETTLE_COMPATIBILITY_IMPORT_PATH_ADDITION_ON_VARIABLES=N
KETTLE_REDIRECT_STDERR=N
KETTLE_SHARED_OBJECTS=
KETTLE_METRICS_LOG_DB=
KETTLE_DEFAULT_DATE_FORMAT=
KETTLE_JOB_LOG_SCHEMA=
KETTLE_DEFAULT_INTEGER_FORMAT=
KETTLE_AGGREGATION_MIN_NULL_IS_VALUED=N
KETTLE_PLUGIN_CLASSES=
KETTLE_LOG_MARK_MAPPINGS=N
KETTLE_CORE_JOBENTRIES_FILE=
KETTLE_ROWSET_GET_TIMEOUT=50
KETTLE_METRICS_LOG_SCHEMA=
KETTLE_PLUGIN_PACKAGES=
KETTLE_TRANS_LOG_DB=
KETTLE_CHANNEL_LOG_DB=
KETTLE_MAX_LOG_TIMEOUT_IN_MINUTES=1440
KETTLE_JOB_LOG_DB=
KETTLE_JOB_LOG_TABLE=
KETTLE_DISABLE_CONSOLE_LOGGING=N
KETTLE_TRANS_PAN_JVM_EXIT_CODE=
KETTLE_COMPATIBILITY_PUR_OLD_NAMING_MODE=N
KETTLE_DEFAULT_TIMESTAMP_FORMAT=
KETTLE_TRANS_PERFORMANCE_LOG_TABLE=
KETTLE_STEP_LOG_SCHEMA=
KETTLE_ROWSET_PUT_TIMEOUT=50
KETTLE_MAX_JOB_ENTRIES_LOGGED=5000
KETTLE_COMPATIBILITY_DB_IGNORE_TIMEZONE=N
KETTLE_JOBENTRY_LOG_TABLE=
KETTLE_MAX_LOGGING_REGISTRY_SIZE=10000
KETTLE_TRANS_LOG_SCHEMA=
KETTLE_JNDI_ROOT=
KETTLE_COMPATIBILITY_MERGE_ROWS_USE_REFERENCE_STREAM_WHEN_IDENTICAL=N
KETTLE_HIDE_DEVELOPMENT_VERSION_WARNING=N
KETTLE_LENIENT_STRING_TO_NUMBER_CONVERSION=N
# Treat empty strings and NULL as distinct values
KETTLE_EMPTY_STRING_DIFFERS_FROM_NULL=Y
vfs.sftp.userDirIsRoot=false
KETTLE_TRANS_PERFORMANCE_LOG_DB=
KETTLE_STEP_PERFORMANCE_SNAPSHOT_LIMIT=0
KETTLE_FAIL_ON_LOGGING_ERROR=N
KETTLE_MAX_JOB_TRACKER_SIZE=5000
KETTLE_LAZY_REPOSITORY=true
KETTLE_JOBENTRY_LOG_SCHEMA=
KETTLE_BATCHING_ROWSET=N
KETTLE_STEP_LOG_TABLE=
KETTLE_CARTE_OBJECT_TIMEOUT_MINUTES=1440
KETTLE_CHANNEL_LOG_SCHEMA=
KETTLE_JOBENTRY_LOG_DB=
KETTLE_CARTE_JETTY_RES_MAX_IDLE_TIME=
KETTLE_CARTE_JETTY_ACCEPT_QUEUE_SIZE=
PENTAHO_METASTORE_FOLDER=
KETTLE_STEP_LOG_DB=
KETTLE_CARTE_JETTY_ACCEPTORS=
KETTLE_SPLIT_FIELDS_REMOVE_ENCLOSURE=false
KETTLE_DEFAULT_NUMBER_FORMAT=
KETTLE_PASSWORD_ENCODER_PLUGIN=Kettle
KETTLE_LOG_SIZE_LIMIT=0
KETTLE_REDIRECT_STDOUT=N
KETTLE_MAX_LOG_SIZE_IN_LINES=5000
KETTLE_CORE_STEPS_FILE=
KETTLE_DEFAULT_SERVLET_ENCODING=
KETTLE_SYSTEM_HOSTNAME=
KETTLE_CHANNEL_LOG_TABLE=
KETTLE_DEFAULT_BIGNUMBER_FORMAT=
KETTLE_TRANS_LOG_TABLE=
KETTLE_METRICS_LOG_TABLE=
KETTLE_AGGREGATION_ALL_NULLS_ARE_ZERO=N
KETTLE_TRANS_PERFORMANCE_LOG_SCHEMA=
KETTLE_COMPATIBILITY_TEXT_FILE_OUTPUT_APPEND_NO_HEADER=N
Save and exit with :wq
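As a small illustration of step 4, the sketch below reads a value out of a properties file from the shell, which can help confirm that a custom variable is set before a job relies on it. The helper name `get_kettle_prop` and the variable `KETTLE_PROJECT_DIR` are hypothetical and not part of Kettle. It assumes plain `KEY=value` lines without line continuations or escapes.

```shell
#!/bin/sh
# get_kettle_prop FILE KEY
# Prints the value of KEY from a Java-style properties file.
# Assumptions: plain KEY=value lines, no continuations or escapes;
# KEY is used as a regular expression, so a dot matches any character.
get_kettle_prop() {
    file=$1
    key=$2
    # Drop comment lines, keep the last matching assignment, strip "KEY=".
    grep -v '^[[:space:]]*#' "$file" \
        | grep "^[[:space:]]*${key}[[:space:]]*=" \
        | tail -n 1 \
        | sed 's/^[^=]*=[[:space:]]*//'
}

# Demo against a throwaway file; KETTLE_PROJECT_DIR is a made-up custom variable.
demo_props=$(mktemp)
cat > "$demo_props" <<'EOF'
# custom variable, referenced inside a job as ${KETTLE_PROJECT_DIR}
KETTLE_PROJECT_DIR=/usr/local/sbin/kettle
KETTLE_EMPTY_STRING_DIFFERS_FROM_NULL=Y
EOF

project_dir=$(get_kettle_prop "$demo_props" KETTLE_PROJECT_DIR)
echo "KETTLE_PROJECT_DIR resolves to: $project_dir"
rm -f "$demo_props"
```

Against a real installation the first argument would be the actual file, for example `$KETTLE_HOME/.kettle/kettle.properties`.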
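Configuring the Kettle job
1. Create a Kettle project directory under /usr/local/sbin (this article uses /usr/local/sbin/kettle)
2. Upload the Kettle job files, both the .kjb and the .ktr files. Where a .kjb file references a .ktr file, set the file path through a custom variable
3. Add the Kettle execution script
vi kettle.sh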
#!/bin/bash
# Runs every day
export JAVA_HOME=/usr/local/jdk1.8
export JAVA_BIN=$JAVA_HOME/bin
export JAVA_LIB=$JAVA_HOME/lib
export PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_LIB/tools.jar:$JAVA_LIB/dt.jar
export KETTLE_HOME=/usr/local/src/data-integration
# Extract Weichen 4.0 data into customize
/usr/local/src/data-integration/kitchen.sh -file=/usr/local/sbin/kettle/customize/weichen_sync.kjb -level=Basic >> /var/log/kettle_log/weichen_sync.log 2>&1
4. Make kettle.sh executable
chmod u+x kettle.sh
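The script above can be made a little easier to troubleshoot. The following is a minimal sketch, not a replacement for the author's script: it checks that kitchen.sh and the job file exist, creates the log directory, and records Kitchen's exit status in the log. The function names `run_job` and `describe_kitchen_exit` are hypothetical, and the return values 127 and 66 for the pre-flight failures are arbitrary choices for this sketch. The status meanings follow the exit codes documented for Kitchen.

```shell
#!/bin/sh
# describe_kitchen_exit CODE
# Maps a Kitchen exit status to a short human-readable message.
describe_kitchen_exit() {
    case "$1" in
        0) echo "job ran without a problem" ;;
        1) echo "errors occurred during processing" ;;
        2) echo "unexpected error during loading or running of the job" ;;
        7) echo "job could not be loaded from XML or the repository" ;;
        8) echo "error loading steps or plugins" ;;
        9) echo "command line usage printed" ;;
        *) echo "unknown exit status $1" ;;
    esac
}

# run_job KITCHEN JOB_FILE LOG_FILE
# Runs a job through Kitchen, appends all output to LOG_FILE,
# and returns Kitchen's exit status.
run_job() {
    kitchen=$1
    job=$2
    log=$3
    # Pre-flight checks (return values are arbitrary for this sketch).
    [ -x "$kitchen" ] || { echo "not executable: $kitchen" >&2; return 127; }
    [ -f "$job" ] || { echo "job file not found: $job" >&2; return 66; }
    mkdir -p "$(dirname "$log")" || return 1

    "$kitchen" -file="$job" -level=Basic >> "$log" 2>&1
    status=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') exit=$status: $(describe_kitchen_exit "$status")" >> "$log"
    return "$status"
}
```

With the paths used in this article, the call would be `run_job /usr/local/src/data-integration/kitchen.sh /usr/local/sbin/kettle/customize/weichen_sync.kjb /var/log/kettle_log/weichen_sync.log`, placed after the `export` lines of kettle.sh.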
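Setting up the scheduled task
1. Use crontab, which ships with Linux, to add a scheduled task that runs the script every day at 00:30. A crontab entry has exactly five time fields (minute, hour, day of month, month, day of week) followed by the command
crontab -e
`30 0 * * * /usr/local/sbin/kettle/kettle.sh`
2. Reload the cron service. If it is not running, start it with `service crond start`
service crond reload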
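For unattended setups where `crontab -e` is not convenient, the entry can also be added from a script. The sketch below is one possible approach under the assumption that the schedule is the five-field form `30 0 * * *` (daily at 00:30). The helper `ensure_cron_line` is hypothetical. It only filters text, so nothing is installed until its output is piped into `crontab -`.

```shell
#!/bin/sh
# ensure_cron_line LINE
# Reads an existing crontab on stdin and writes it to stdout,
# appending LINE only if an identical line is not already present.
ensure_cron_line() {
    line=$1
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # -F: fixed string (the asterisks are literal), -x: whole-line match.
    if ! printf '%s\n' "$current" | grep -Fxq -- "$line"; then
        printf '%s\n' "$line"
    fi
}

entry='30 0 * * * /usr/local/sbin/kettle/kettle.sh'

# Preview only: show what an empty crontab would become.
printf '' | ensure_cron_line "$entry"
```

To apply it to the current user's crontab, the pipeline would be `crontab -l 2>/dev/null | ensure_cron_line "$entry" | crontab -`. Running it twice leaves a single copy of the entry.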
Reposted from blog.csdn.net/weixin_51981189/article/details/128567764