Before studying Spring Batch in depth, let's write a small demo to get a feel for how it works.
Related information
Development environment:
JDK 1.8.0_201
apache-maven-3.3.9
IntelliJ IDEA 2019.2.1 (Community Edition)
Version information
Spring Batch 3.0.10.RELEASE
Spring test 4.3.21.RELEASE
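Key concepts
Domain object | Description |
---|---|
Job Repository | Stores the state of Jobs and Steps while they execute |
Job launcher | Launches jobs; the entry point for executing a Job |
Job | A job made up of one or more Steps; it encapsulates the whole batch operation |
Step | A single phase of a Job; a Job is assembled from one or more Steps |
Tasklet | The operation that carries out a Step's actual logic; it can be executed repeatedly and can be configured to run synchronously or asynchronously |
Chunk | A collection of a given number of Items; read, process, and write operations and the commit interval can be defined for a Chunk. This is one of the key features of the Spring Batch framework |
Item | A single data record |
ItemReader | Reads Items from a data source (file system, database, queue, etc.) |
ItemProcessor | Processes data before an Item is written to the data source (for example cleaning, transforming, filtering, or validating it) |
ItemWriter | Writes Items in batches to a data source (file system, database, queue, etc.) |
Demo background
Use Spring Batch to read a CSV file, then write the data that was read to another CSV file.
Code walkthrough
Project structure
Configure the pom.xml file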
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.et</groupId>
    <artifactId>spring-batch-demo</artifactId>
    <version>1.0-SNAPSHOT</version>
    <name>spring-batch-demo</name>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
    </properties>
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.batch</groupId>
            <artifactId>spring-batch-core</artifactId>
            <version>3.0.10.RELEASE</version>
        </dependency>
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-test</artifactId>
            <version>4.3.21.RELEASE</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <pluginManagement><!-- lock down plugins versions to avoid using Maven defaults (may be moved to parent pom) -->
            <plugins>
                <!-- clean lifecycle, see https://maven.apache.org/ref/current/maven-core/lifecycles.html#clean_Lifecycle -->
                <plugin>
                    <artifactId>maven-clean-plugin</artifactId>
                    <version>3.1.0</version>
                </plugin>
                <!-- default lifecycle, jar packaging: see https://maven.apache.org/ref/current/maven-core/default-bindings.html#Plugin_bindings_for_jar_packaging -->
                <plugin>
                    <artifactId>maven-resources-plugin</artifactId>
                    <version>3.0.2</version>
                </plugin>
                <plugin>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <version>3.8.0</version>
                </plugin>
                <plugin>
                    <artifactId>maven-surefire-plugin</artifactId>
                    <version>2.22.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-jar-plugin</artifactId>
                    <version>3.0.2</version>
                </plugin>
                <plugin>
                    <artifactId>maven-install-plugin</artifactId>
                    <version>2.5.2</version>
                </plugin>
                <plugin>
                    <artifactId>maven-deploy-plugin</artifactId>
                    <version>2.8.2</version>
                </plugin>
                <!-- site lifecycle, see https://maven.apache.org/ref/current/maven-core/lifecycles.html#site_Lifecycle -->
                <plugin>
                    <artifactId>maven-site-plugin</artifactId>
                    <version>3.7.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-project-info-reports-plugin</artifactId>
                    <version>3.0.0</version>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>
</project>
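Because spring-batch-core already pulls in the Spring core modules it needs as transitive dependencies, they are not declared explicitly in the pom.
Define the domain object
Define the object according to the requirements.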
// Domain object: one record (line) of the CSV file
public class StudentInfo {
    private String name;
    private int age;
    private String gender;
    private String birthday;
    private String address;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public String getGender() {
        return gender;
    }

    public void setGender(String gender) {
        this.gender = gender;
    }

    public String getBirthday() {
        return birthday;
    }

    public void setBirthday(String birthday) {
        this.birthday = birthday;
    }

    public String getAddress() {
        return address;
    }

    public void setAddress(String address) {
        this.address = address;
    }

    // Readable output when the processor prints the item
    @Override
    public String toString() {
        return "StudentInfo{name=" + name + ", age=" + age + ", gender=" + gender
                + ", birthday=" + birthday + ", address=" + address + "}";
    }
}
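Configure the job
Define the batch job in a Spring configuration file.
The Spring Batch namespace declarations must be added to the header of the XML file.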
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:batch="http://www.springframework.org/schema/batch"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
       http://www.springframework.org/schema/beans/spring-beans.xsd
       http://www.springframework.org/schema/context
       http://www.springframework.org/schema/context/spring-context.xsd
       http://www.springframework.org/schema/batch
       http://www.springframework.org/schema/batch/spring-batch-3.0.xsd">
Define the basic job infrastructure
<!-- Keep the state produced during job execution in memory -->
<bean id="jobRepository"
      class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean"></bean>
<!-- Job launcher, used to start the Job -->
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository"/>
</bean>
<!-- Transaction manager for setups without a transactional resource such as a database -->
<bean id="transactionManager"
      class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"></bean>
Define the job
<batch:job id="studentJob">
    <batch:step id="studentStep">
        <batch:tasklet transaction-manager="transactionManager">
            <!-- commit-interval="2" sets the chunk size: items are read and processed two at a time, and each chunk is written and committed as a unit -->
            <batch:chunk reader="csvItemReader" writer="csvItemWriter" processor="studentProcessor"
                         commit-interval="2"></batch:chunk>
        </batch:tasklet>
    </batch:step>
</batch:job>
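To make the effect of commit-interval more concrete, here is a minimal plain-Java sketch of a chunk-oriented loop. It is a simplified model written for this article, not Spring Batch's actual implementation: the class name ChunkSketch, the run method, and the use of Iterator and Function as stand-ins for the reader and processor are all assumptions, and the "write" is just appending the chunk to a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

public class ChunkSketch {

    /**
     * Simplified model of a chunk-oriented step (hypothetical helper, not a Spring Batch API):
     * read up to commitInterval items, process each of them, then hand the whole chunk
     * to the "writer". Returns the chunks in the order they were written.
     */
    public static <I, O> List<List<O>> run(Iterator<I> reader,
                                           Function<I, O> processor,
                                           int commitInterval) {
        List<List<O>> written = new ArrayList<>();
        while (reader.hasNext()) {
            // 1. Read: collect at most commitInterval items
            List<I> inputs = new ArrayList<>();
            while (inputs.size() < commitInterval && reader.hasNext()) {
                inputs.add(reader.next());
            }
            // 2. Process: transform the items one by one
            List<O> outputs = new ArrayList<>();
            for (I item : inputs) {
                outputs.add(processor.apply(item));
            }
            // 3. Write: stand-in for writing the chunk and committing the transaction
            written.add(outputs);
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c", "d", "e");
        List<List<String>> chunks = run(items.iterator(), s -> s.toUpperCase(), 2);
        System.out.println(chunks); // prints [[A, B], [C, D], [E]]
    }
}
```

With five input items and an interval of 2, the sketch performs three writes; the last chunk holds only the leftover item.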
Configure the ItemReader
<bean id="csvItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
    <!-- Input file, resolved from the classpath -->
    <property name="resource" value="classpath:basic/test.csv"/>
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <!-- Splits each line into named fields -->
            <property name="lineTokenizer" ref="lineTokenizer"/>
            <!-- Copies the named fields onto the matching bean properties -->
            <property name="fieldSetMapper">
                <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                    <!-- Refers to a prototype-scoped bean with id "studentInfo" (the StudentInfo class); its definition is not shown in this excerpt -->
                    <property name="prototypeBeanName" value="studentInfo"/>
                </bean>
            </property>
        </bean>
    </property>
</bean>
<bean id="lineTokenizer" class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
    <property name="delimiter" value=","/>
    <!-- Field names, in column order -->
    <property name="names">
        <list>
            <value>name</value>
            <value>age</value>
            <value>gender</value>
            <value>birthday</value>
            <value>address</value>
        </list>
    </property>
</bean>
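The reader configuration above does two things for every line: the tokenizer splits it into named fields, and the field set mapper copies those fields onto a bean. The plain-Java sketch below imitates that idea only roughly. The class LineMappingSketch, its tokenize method, and the sample line are hypothetical, and unlike the real DelimitedLineTokenizer the sketch does not handle quoted fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class LineMappingSketch {

    // Same field names, in the same order, as the "names" list in the XML
    static final String[] NAMES = {"name", "age", "gender", "birthday", "address"};

    /**
     * Splits a line on the delimiter and pairs each token with its field name
     * (hypothetical helper; a rough imitation of tokenizing, without quote handling).
     */
    public static Map<String, String> tokenize(String line, String delimiter) {
        // -1 keeps trailing empty columns instead of silently dropping them
        String[] tokens = line.split(Pattern.quote(delimiter), -1);
        if (tokens.length != NAMES.length) {
            throw new IllegalArgumentException(
                    "Expected " + NAMES.length + " fields but found " + tokens.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            fields.put(NAMES[i], tokens[i].trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up sample line, for illustration only
        Map<String, String> fields = tokenize("Tom,18,male,2001-01-01,Beijing", ",");
        // A field set mapper would also convert types, e.g. the "age" text to an int
        int age = Integer.parseInt(fields.get("age"));
        System.out.println(fields.get("name") + " is " + age); // prints Tom is 18
    }
}
```

Because the mapping is driven by names, the entries in the names list have to match the property names of StudentInfo.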
Configure the ItemProcessor
<bean id="studentProcessor" scope="step" class="com.et.basic.StudentInfoProcessor"></bean>
Create a Java class that implements the processing logic
import org.springframework.batch.item.ItemProcessor;

public class StudentInfoProcessor implements ItemProcessor<StudentInfo, StudentInfo> {
    @Override
    public StudentInfo process(StudentInfo studentInfo) throws Exception {
        // Print each item as it passes through, then return it unchanged
        System.out.println(studentInfo);
        return studentInfo;
    }
}
Configure the ItemWriter
<bean id="csvItemWriter" class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
    <!-- Output file, relative to the working directory -->
    <property name="resource" value="file:target/basic/outputFile.csv"/>
    <property name="lineAggregator">
        <bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
            <property name="delimiter" value=","/>
            <!-- Reads the listed bean properties, in this order, to build each output line -->
            <property name="fieldExtractor">
                <bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                    <property name="names" value="name,age,gender,birthday,address"/>
                </bean>
            </property>
        </bean>
    </property>
</bean>
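On the writing side the direction is reversed: the field extractor pulls the listed properties out of each item, and the line aggregator joins them with the delimiter. The sketch below shows only the joining step in plain Java. The class LineAggregatorSketch, its aggregate method, and the sample values are hypothetical and stand in for what the configured beans do.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class LineAggregatorSketch {

    /**
     * Joins already-extracted field values into one delimited line
     * (hypothetical helper, not a Spring Batch API).
     */
    public static String aggregate(List<?> fieldValues, String delimiter) {
        return fieldValues.stream()
                .map(value -> String.valueOf(value)) // non-string values such as age become text
                .collect(Collectors.joining(delimiter));
    }

    public static void main(String[] args) {
        // Made-up values in the order name, age, gender, birthday, address
        List<Object> values = Arrays.<Object>asList("Tom", 18, "male", "2001-01-01", "Beijing");
        System.out.println(aggregate(values, ",")); // prints Tom,18,male,2001-01-01,Beijing
    }
}
```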
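Run the job
Once the input file (basic/test.csv on the classpath) is ready, the job can be run in either of the following ways; watch the console for the output.
Launching from Java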
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class JobLaunch {
    public static void main(String[] args) {
        // Load the job definition; try-with-resources closes the context afterwards
        try (ClassPathXmlApplicationContext context =
                     new ClassPathXmlApplicationContext("basic/job.xml")) {
            JobLauncher launcher = context.getBean("jobLauncher", JobLauncher.class);
            Job job = context.getBean("studentJob", Job.class);
            // Run the job and print the resulting execution summary
            JobExecution result = launcher.run(job, new JobParameters());
            System.out.println(result);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
JUnit unit test
import static org.junit.Assert.assertEquals;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = "/basic/job.xml")
public class JobLaunchTest {
    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    @Qualifier("studentJob")
    private Job job;

    @Test
    public void batchTest() throws Exception {
        JobExecution result = jobLauncher.run(job, new JobParameters());
        System.out.println(result);
        // Fail the test unless the job ran to completion
        assertEquals(BatchStatus.COMPLETED, result.getStatus());
    }
}
Summary
The code makes the overall flow fairly clear, so I have not added much explanation.
Reference links:
The complete example has been uploaded to GitHub.