Source: blog.csdn.net/qq_35387940/article/details/108193473
Foreword

I won't dwell on concepts. Briefly: Spring Batch is a robust, easy-to-use batch processing framework.

It is easy to use because it is built on Spring: integration is simple, the code is easy to follow, and the processing flow is clear.

It is robust because it provides the things you have to consider when processing large volumes of data: log tracking, transaction granularity, controllable execution, failure handling, retry mechanisms, data reading and writing, and so on.
Main text

So, what will this article bring you? (Explained with examples, of course.)

In terms of business scenarios, there are two:

Read data from a csv file, apply business processing, and store it
Read data from the database, apply business processing, and store it

In other words: the data cleaning, data filtering, data migration, and backup tasks we run into all the time. With a large amount of data, rolling your own batch processing means worrying about too many things, so using the Spring Batch framework is a good choice.

Before diving into the example, let's look at what we need to code when integrating the Spring Batch framework with Spring Boot.
A simple diagram helps to understand it:
When you see this diagram, you may be reminded of a scheduled-task framework. It does look a bit similar, but to be clear: this is a batch processing framework, not a scheduling framework. That said, as mentioned above, it offers controllable execution, so you can obviously extend it yourself and combine it with a scheduling framework to get what you have in mind.

OK, back to the topic. The diagram shows plainly what we need to implement in this example, so I won't describe each widget at length. Without further ado, let's start the tutorial.
First, prepare a database and create a simple table in it for writing, storing, and reading the example data.

The `bloginfo` table; the relevant CREATE TABLE statement:
CREATE TABLE `bloginfo` (
`id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
`blogAuthor` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'blog author id',
`blogUrl` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'blog URL',
`blogTitle` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'blog title',
`blogItem` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'blog category',
PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 89031 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
Core dependencies in the pom file:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<!-- spring batch -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<!-- hibernate validator -->
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-validator</artifactId>
<version>6.0.7.Final</version>
</dependency>
<!-- mybatis -->
<dependency>
<groupId>org.mybatis.spring.boot</groupId>
<artifactId>mybatis-spring-boot-starter</artifactId>
<version>2.0.0</version>
</dependency>
<!-- mysql -->
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<scope>runtime</scope>
</dependency>
<!-- druid data source driver; version 1.1.10+ resolves issues migrating Spring Boot 1.x to 2.x -->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>druid-spring-boot-starter</artifactId>
<version>1.1.18</version>
</dependency>
yml file:
spring:
batch:
job:
#set to false: the job must be triggered explicitly via jobLauncher.run
enabled: false
initialize-schema: always
# table-prefix: my-batch
datasource:
druid:
username: root
password: root
url: jdbc:mysql://localhost:3306/hellodemo?useSSL=false&useUnicode=true&characterEncoding=UTF-8&serverTimezone=GMT%2B8&zeroDateTimeBehavior=convertToNull
driver-class-name: com.mysql.cj.jdbc.Driver
initialSize: 5
minIdle: 5
maxActive: 20
maxWait: 60000
timeBetweenEvictionRunsMillis: 60000
minEvictableIdleTimeMillis: 300000
validationQuery: SELECT 1 FROM DUAL
testWhileIdle: true
testOnBorrow: false
testOnReturn: false
poolPreparedStatements: true
maxPoolPreparedStatementPerConnectionSize: 20
useGlobalDataSourceStat: true
connectionProperties: druid.stat.mergeSql=true;druid.stat.slowSqlMillis=5000
server:
port: 8665
ps: We use the druid database connection pool here. There is actually a small pitfall with it, which we'll get to later in the article.

Since the final output of this example is written to the database (you could of course also output to a file, etc.), we built a table earlier and added mybatis to the pom. So before integrating the core Spring Batch code, let's quickly go over the database-related pieces.
pojo layer
BlogInfo.java :
/**
* @Author : JCccc
* @Description :
**/
public class BlogInfo {
private Integer id;
private String blogAuthor;
private String blogUrl;
private String blogTitle;
private String blogItem;
@Override
public String toString() {
return "BlogInfo{" +
"id=" + id +
", blogAuthor='" + blogAuthor + '\'' +
", blogUrl='" + blogUrl + '\'' +
", blogTitle='" + blogTitle + '\'' +
", blogItem='" + blogItem + '\'' +
'}';
}
public Integer getId() {
return id;
}
public void setId(Integer id) {
this.id = id;
}
public String getBlogAuthor() {
return blogAuthor;
}
public void setBlogAuthor(String blogAuthor) {
this.blogAuthor = blogAuthor;
}
public String getBlogUrl() {
return blogUrl;
}
public void setBlogUrl(String blogUrl) {
this.blogUrl = blogUrl;
}
public String getBlogTitle() {
return blogTitle;
}
public void setBlogTitle(String blogTitle) {
this.blogTitle = blogTitle;
}
public String getBlogItem() {
return blogItem;
}
public void setBlogItem(String blogItem) {
this.blogItem = blogItem;
}
}
mapper layer
BlogMapper.java :
ps: As you can see, I used annotations in this example to save effort, and skipped the service and impl layers for the same reason. The focus of this article isn't on those, so don't pick up these bad habits.
import com.example.batchdemo.pojo.BlogInfo;
import org.apache.ibatis.annotations.*;
import java.util.List;
import java.util.Map;
/**
* @Author : JCccc
* @Description :
**/
@Mapper
public interface BlogMapper {
@Insert("INSERT INTO bloginfo ( blogAuthor, blogUrl, blogTitle, blogItem ) VALUES ( #{blogAuthor}, #{blogUrl},#{blogTitle},#{blogItem}) ")
@Options(useGeneratedKeys = true, keyProperty = "id")
int insert(BlogInfo bloginfo);
@Select("select blogAuthor, blogUrl, blogTitle, blogItem from bloginfo where blogAuthor < #{authorId}")
List<BlogInfo> queryInfoById(Map<String , Integer> map);
}
Next, the highlight: we start coding the various components from the diagram above.

First, create a configuration class, MyBatchConfig.java.

As the name suggests, essentially all the configuration components involved in integrating Spring Batch will be written here.

Looking back at the diagram, it contains:
JobRepository — registers and stores Jobs
JobLauncher — runs Jobs
Job — a batch task, containing one or more Steps
Step — contains an ItemReader, an ItemProcessor, and an ItemWriter
ItemReader — reads data
ItemProcessor — processes data
ItemWriter — writes data out
First, add annotations to the MyBatchConfig class:

@Configuration

tells Spring that this is a custom configuration class whose beans should be loaded into the Spring container;

@EnableBatchProcessing

enables batch support.

Then we start writing the various widgets inside the MyBatchConfig class.
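To make the structure concrete, the outer shell of the class (which the article never shows explicitly) would look roughly like this; a minimal sketch, assuming the bean methods from the following sections are pasted into its body:

```java
@Configuration
@EnableBatchProcessing
public class MyBatchConfig {
    // the JobRepository, JobLauncher, Job, Step, reader, processor,
    // and writer beans from the sections below all go in here
}
```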
JobRepository
Written in the MyBatchConfig class
/**
 * JobRepository: the registry/store for Jobs; also talks to the database (transaction management, etc.)
 * @param dataSource
 * @param transactionManager
 * @return
 * @throws Exception
 */
@Bean
public JobRepository myJobRepository(DataSource dataSource, PlatformTransactionManager transactionManager) throws Exception{
JobRepositoryFactoryBean jobRepositoryFactoryBean = new JobRepositoryFactoryBean();
jobRepositoryFactoryBean.setDatabaseType("mysql");
jobRepositoryFactoryBean.setTransactionManager(transactionManager);
jobRepositoryFactoryBean.setDataSource(dataSource);
return jobRepositoryFactoryBean.getObject();
}
JobLauncher
Written in the MyBatchConfig class
/**
 * jobLauncher: the launcher for Jobs, bound to the jobRepository
 * @param dataSource
 * @param transactionManager
 * @return
 * @throws Exception
 */
@Bean
public SimpleJobLauncher myJobLauncher(DataSource dataSource, PlatformTransactionManager transactionManager) throws Exception{
SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
// set the jobRepository
jobLauncher.setJobRepository(myJobRepository(dataSource, transactionManager));
return jobLauncher;
}
Job
Written in the MyBatchConfig class
/**
 * Define the Job
 * @param jobs
 * @param myStep
 * @return
 */
@Bean
public Job myJob(JobBuilderFactory jobs, Step myStep){
return jobs.get("myJob")
.incrementer(new RunIdIncrementer())
.flow(myStep)
.end()
.listener(myJobListener())
.build();
}
For the Job's execution, we can also configure a listener.
JobListener
Written in the MyBatchConfig class
/**
 * Register the Job listener
 * @return
 */
@Bean
public MyJobListener myJobListener(){
return new MyJobListener();
}
This is our own custom listener, so we create MyJobListener.java separately:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;
/**
 * @Author : JCccc
 * @Description : listens to Job execution; implements JobExecutionListener and is bound to the Job bean in the batch configuration class
 **/
public class MyJobListener implements JobExecutionListener {
private Logger logger = LoggerFactory.getLogger(MyJobListener.class);
@Override
public void beforeJob(JobExecution jobExecution) {
logger.info("job started, id={}",jobExecution.getJobId());
}
@Override
public void afterJob(JobExecution jobExecution) {
logger.info("job finished, id={}",jobExecution.getJobId());
}
}
Step (ItemReader, ItemProcessor, ItemWriter)

A Step contains three small components: the data reader, the data processor, and the data writer. We'll implement them one by one.

As mentioned at the start, this article covers two scenarios: reading a large amount of data from a csv file for processing, and reading a large amount of data from a database table for processing.
Read data from CSV file
ItemReader
Written in the MyBatchConfig class
/**
 * ItemReader: reads the file data and maps it onto the entity class
 * @return
 */
@Bean
public ItemReader<BlogInfo> reader(){
// use a FlatFileItemReader to read the csv file; one line = one record
FlatFileItemReader<BlogInfo> reader = new FlatFileItemReader<>();
// path where the file lives
reader.setResource(new ClassPathResource("static/bloginfo.csv"));
// map the csv columns onto the entity
reader.setLineMapper(new DefaultLineMapper<BlogInfo>() {
{
setLineTokenizer(new DelimitedLineTokenizer() {
{
setNames(new String[]{"blogAuthor","blogUrl","blogTitle","blogItem"});
}
});
setFieldSetMapper(new BeanWrapperFieldSetMapper<BlogInfo>() {
{
setTargetType(BlogInfo.class);
}
});
}
});
return reader;
}
A quick note on the code: for the data reader ItemReader, we also arrange a read listener. Create MyReadListener.java:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.ItemReadListener;
import com.example.batchdemo.pojo.BlogInfo;
import static java.lang.String.format;
/**
 * @Author : JCccc
 * @Description :
 **/
public class MyReadListener implements ItemReadListener<BlogInfo> {
private Logger logger = LoggerFactory.getLogger(MyReadListener.class);
@Override
public void beforeRead() {
}
@Override
public void afterRead(BlogInfo item) {
}
@Override
public void onReadError(Exception ex) {
try {
logger.info(format("%s%n", ex.getMessage()));
} catch (Exception e) {
e.printStackTrace();
}
}
}
ItemProcessor
Written in the MyBatchConfig class
/**
 * Register the ItemProcessor: processes and validates the data
 * @return
 */
@Bean
public ItemProcessor<BlogInfo, BlogInfo> processor(){
MyItemProcessor myItemProcessor = new MyItemProcessor();
// set the validator
myItemProcessor.setValidator(myBeanValidator());
return myItemProcessor;
}
The data processor is our own custom class: it contains the business logic for handling the data, and we also set a data validator on it. Here we use a JSR-303 Validator.
Validator
Written in the MyBatchConfig class
/**
 * Register the validator
 * @return
 */
@Bean
public MyBeanValidator myBeanValidator(){
return new MyBeanValidator<BlogInfo>();
}
Create MyItemProcessor.java.

ps: My processing logic takes each record's blogItem field; if it is "springboot", the blogTitle value is replaced. It just simulates a simple data-processing scenario:
import com.example.batchdemo.pojo.BlogInfo;
import org.springframework.batch.item.validator.ValidatingItemProcessor;
import org.springframework.batch.item.validator.ValidationException;
/**
* @Author : JCccc
* @Description :
**/
public class MyItemProcessor extends ValidatingItemProcessor<BlogInfo> {
@Override
public BlogInfo process(BlogInfo item) throws ValidationException {
/**
 * super.process(item) must be called for the custom validator to run
 */
super.process(item);
/**
 * simple processing of the data
 */
if (item.getBlogItem().equals("springboot")) {
item.setBlogTitle("springboot 系列还请看看我Jc");
} else {
item.setBlogTitle("未知系列");
}
return item;
}
}
Create MyBeanValidator.java:
import org.springframework.batch.item.validator.ValidationException;
import org.springframework.batch.item.validator.Validator;
import org.springframework.beans.factory.InitializingBean;
import javax.validation.ConstraintViolation;
import javax.validation.Validation;
import javax.validation.ValidatorFactory;
import java.util.Set;
/**
* @Author : JCccc
* @Description :
**/
public class MyBeanValidator<T> implements Validator<T>, InitializingBean {
private javax.validation.Validator validator;
@Override
public void validate(T value) throws ValidationException {
/**
 * validate the data using the Validator's validate method
 */
Set<ConstraintViolation<T>> constraintViolations =
validator.validate(value);
if (constraintViolations.size() > 0) {
StringBuilder message = new StringBuilder();
for (ConstraintViolation<T> constraintViolation : constraintViolations) {
message.append(constraintViolation.getMessage() + "\n");
}
throw new ValidationException(message.toString());
}
}
/**
 * We use a JSR-303 Validator to validate our data; it is initialized here.
 * @throws Exception
 */
@Override
public void afterPropertiesSet() throws Exception {
ValidatorFactory validatorFactory =
Validation.buildDefaultValidatorFactory();
validator = validatorFactory.usingContext().getValidator();
}
}
ps: This article doesn't actually exercise the validator. To use it, add constraint annotations such as @NotNull, @Max, or @Email to the entity class. I prefer to handle everything directly in the processor, keeping all the data-handling code in one place.
ItemWriter
Written in the MyBatchConfig class
/**
 * ItemWriter: binds the datasource, sets the batch insert SQL, and writes to the database
 * @param dataSource
 * @return
 */
@Bean
public ItemWriter<BlogInfo> writer(DataSource dataSource){
// use a JdbcBatchItemWriter to write the data to the database
JdbcBatchItemWriter<BlogInfo> writer = new JdbcBatchItemWriter<>();
// set the parameterized sql statement
writer.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<BlogInfo>());
String sql = "insert into bloginfo "+" (blogAuthor,blogUrl,blogTitle,blogItem) "
+" values(:blogAuthor,:blogUrl,:blogTitle,:blogItem)";
writer.setSql(sql);
writer.setDataSource(dataSource);
return writer;
}
A quick note on the code: similarly, for the data writer ItemWriter we arrange a write listener. Create MyWriteListener.java:
import com.example.batchdemo.pojo.BlogInfo;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.ItemWriteListener;
import java.util.List;
import static java.lang.String.format;
/**
* @Author : JCccc
* @Description :
**/
public class MyWriteListener implements ItemWriteListener<BlogInfo> {
private Logger logger = LoggerFactory.getLogger(MyWriteListener.class);
@Override
public void beforeWrite(List<? extends BlogInfo> items) {
}
@Override
public void afterWrite(List<? extends BlogInfo> items) {
}
@Override
public void onWriteError(Exception exception, List<? extends BlogInfo> items) {
try {
logger.info(format("%s%n", exception.getMessage()));
for (BlogInfo message : items) {
logger.info(format("Failed writing BlogInfo : %s", message.toString()));
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
With the three widgets ItemReader, ItemProcessor, and ItemWriter all implemented, the next step is to bind them to our Step.

Written in the MyBatchConfig class
/**
 * Step definition, comprising:
 * ItemReader    - reading
 * ItemProcessor - processing
 * ItemWriter    - writing
 * @param stepBuilderFactory
 * @param reader
 * @param writer
 * @param processor
 * @return
 */
@Bean
public Step myStep(StepBuilderFactory stepBuilderFactory, ItemReader<BlogInfo> reader,
ItemWriter<BlogInfo> writer, ItemProcessor<BlogInfo, BlogInfo> processor){
return stepBuilderFactory
.get("myStep")
.<BlogInfo, BlogInfo>chunk(65000) // chunk mechanism: read and process one item at a time, accumulate, then hand the whole chunk to the writer at once
.reader(reader).faultTolerant().retryLimit(3).retry(Exception.class).skip(Exception.class).skipLimit(2)
.listener(new MyReadListener())
.processor(processor)
.writer(writer).faultTolerant().skip(Exception.class).skipLimit(2)
.listener(new MyWriteListener())
.build();
}
A little explanation of this Step.

As mentioned earlier, the Spring Batch framework provides mechanisms for transaction control, restart, skip detection, and so on. They are all configured at this Step stage.

First, look at chunk(65000): the chunk mechanism reads one item at a time, processes it, and once a certain number of items have accumulated, hands them all to the writer in one batch. So for the whole Step, that is reading, processing, and finally writing.

The 65000 we pass in tells it to keep reading and processing until 65,000 items have accumulated, then perform the write as one batch. The right value depends on your business: it could be 500 at a time, 1,000 at a time, or even 20 or 50 at a time.
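To help understand, the chunk mechanism can be sketched in plain Java. This is an illustrative simulation of the accumulate-then-write behavior, not the Spring Batch API:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: items are read and processed one at a time; the
// writer is invoked once `chunkSize` items have accumulated, plus once
// more for any final partial chunk.
public class ChunkDemo {
    static int countWrites(int totalItems, int chunkSize) {
        int writes = 0;
        List<Integer> buffer = new ArrayList<>();
        for (int i = 0; i < totalItems; i++) {
            buffer.add(i);               // read + process one item
            if (buffer.size() == chunkSize) {
                writes++;                // writer receives the whole chunk
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) writes++; // flush the last partial chunk
        return writes;
    }

    public static void main(String[] args) {
        // 80,000 rows with chunk size 65,000 -> 2 writer invocations
        // (one full chunk plus the 15,000-row remainder)
        System.out.println(countWrites(80_000, 65_000));
    }
}
```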
When processing a large amount of data, some known or unknown factor will inevitably cause an individual record to fail, whether during reading or writing.

If we configure nothing, does one failed record fail the whole run? That would be far too unforgiving, so Spring Batch provides retry and skip settings (and restart as well), which handle failure scenarios gracefully.

retryLimit(3).retry(Exception.class)

This configures retries: how many times to retry when an exception occurs. We set 3, meaning a failing item is retried up to 3 times; if it still fails, it counts as a failure, and if skip is also configured (recommended), the failed record is left for skip to handle.

skip(Exception.class).skipLimit(2)

Skip means tolerating failed items: with skipLimit(2), up to 2 failed items are tolerated, and once the failures exceed that number, the Step is interrupted.

For failed data, we have the listeners record the exception information so it can be remediated manually later.
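The interaction between retry and skip can also be sketched in plain Java. Again this is an illustrative simulation of the semantics, not the Spring Batch implementation:

```java
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch: each failing item is retried up to `retryLimit`
// times; an item that still fails is skipped, and the step aborts once
// more than `skipLimit` items have been skipped.
public class FaultToleranceDemo {
    static int processed;
    static int skipped;

    static boolean run(List<String> items, Predicate<String> op,
                       int retryLimit, int skipLimit) {
        processed = 0;
        skipped = 0;
        for (String item : items) {
            boolean ok = false;
            for (int attempt = 0; attempt < retryLimit && !ok; attempt++) {
                ok = op.test(item);      // retry the same item
            }
            if (ok) {
                processed++;
            } else {
                skipped++;               // give up on this item
                if (skipped > skipLimit) {
                    return false;        // too many failures: abort the step
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "bad" items always fail; with skipLimit = 2, two failures are
        // tolerated and the third aborts the run.
        List<String> items = List.of("a", "bad", "b", "bad", "c", "bad", "d");
        boolean completed = run(items, s -> !s.equals("bad"), 3, 2);
        // prints: false processed=3 skipped=3
        System.out.println(completed + " processed=" + processed + " skipped=" + skipped);
    }
}
```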
Now, remember that we need to launch this batch job. We trigger it through an HTTP interface; create a new controller, TestController.java:
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.JobParametersInvalidException;
import org.springframework.batch.core.launch.support.SimpleJobLauncher;
import org.springframework.batch.core.repository.JobExecutionAlreadyRunningException;
import org.springframework.batch.core.repository.JobInstanceAlreadyCompleteException;
import org.springframework.batch.core.repository.JobRestartException;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
/**
 * @Author : JCccc
 * @Description :
 **/
@RestController
public class TestController {
@Autowired
SimpleJobLauncher jobLauncher;
@Autowired
Job myJob;
@GetMapping("testJob")
public void testJob() throws JobParametersInvalidException, JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException {
// job parameters: bind parameters via JobParameters' addLong, addString, etc.
JobParameters jobParameters = new JobParametersBuilder().toJobParameters();
jobLauncher.run(myJob, jobParameters);
}
}
By the way, I prepared a csv file, bloginfo.csv, with about 80,000 rows in it for batch testing. The file's path must match the path configured in the data reader.
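The article doesn't show the file's contents. Given the reader configuration above (four comma-delimited columns, no header), each line would look roughly like this; these sample rows are hypothetical:

```csv
10001,https://blog.example.com/post/1,some title,springboot
10002,https://blog.example.com/post/2,some title,java
```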
Our database starts out unchanged. Next we start the project; Spring Batch generates its own tracking tables for the job in the database. Then we call the testJob interface and look at the database again: all of the data has been processed and inserted.
So far, we have completed the integration of Spring Batch with Spring Boot and implemented the business scenario of reading data from a csv file, processing it, and storing it.
Reading data from the database

ps: a heads-up that there is a pitfall with druid here; more on that shortly.

The next scenario: read the data from a database table, process it, and output it to a new table.
Based on the integration above, we have already implemented:

JobRepository — registers and stores Jobs
JobLauncher — runs Jobs
Job — a batch task, containing one or more Steps
Step — contains an ItemReader, an ItemProcessor, and an ItemWriter
ItemReader — reads data
ItemProcessor — processes data
ItemWriter — writes data out
Job listener
reader listener
writer listener
processor data validator
So to write a new job for a new scenario, do we need to rewrite all of it? Obviously not, although writing a completely new set is also possible.

For this article's new scenario, moving from reading a csv file to reading a database table, we re-create:

Data reader: previously FlatFileItemReader; now MyBatisCursorItemReader
Data processor: for the new scenario, to keep the business logic separate, it's best to create a new one
Data writer: likewise, best to create a new one
Step binding: new scenario, new business, so best to create a new one
Job: of course we write a new one

Everything else — JobRepository, JobLauncher, and the various listeners — we simply reuse for now.
Create the new MyItemProcessorNew.java:
import com.example.batchdemo.pojo.BlogInfo;
import org.springframework.batch.item.validator.ValidatingItemProcessor;
import org.springframework.batch.item.validator.ValidationException;
/**
* @Author : JCccc
* @Description :
**/
public class MyItemProcessorNew extends ValidatingItemProcessor<BlogInfo> {
@Override
public BlogInfo process(BlogInfo item) throws ValidationException {
/**
 * super.process(item) must be called for the custom validator to run
 */
super.process(item);
/**
 * simple processing of the data
 */
Integer authorId= Integer.valueOf(item.getBlogAuthor());
if (authorId<20000) {
item.setBlogTitle("这是都是小于20000的数据");
} else if (authorId>20000 && authorId<30000){
item.setBlogTitle("这是都是小于30000但是大于20000的数据");
}else {
item.setBlogTitle("旧书不厌百回读");
}
return item;
}
}
Then other redefined widgets are written in the MyBatchConfig class:
/**
 * Define the job
 * @param jobs
 * @param stepNew
 * @return
 */
@Bean
public Job myJobNew(JobBuilderFactory jobs, Step stepNew){
return jobs.get("myJobNew")
.incrementer(new RunIdIncrementer())
.flow(stepNew)
.end()
.listener(myJobListener())
.build();
}
@Bean
public Step stepNew(StepBuilderFactory stepBuilderFactory, MyBatisCursorItemReader<BlogInfo> itemReaderNew,
ItemWriter<BlogInfo> writerNew, ItemProcessor<BlogInfo, BlogInfo> processorNew){
return stepBuilderFactory
.get("stepNew")
.<BlogInfo, BlogInfo>chunk(65000) // chunk mechanism: read and process one item at a time, accumulate, then hand the whole chunk to the writer at once
.reader(itemReaderNew).faultTolerant().retryLimit(3).retry(Exception.class).skip(Exception.class).skipLimit(10)
.listener(new MyReadListener())
.processor(processorNew)
.writer(writerNew).faultTolerant().skip(Exception.class).skipLimit(2)
.listener(new MyWriteListener())
.build();
}
@Bean
public ItemProcessor<BlogInfo, BlogInfo> processorNew(){
MyItemProcessorNew csvItemProcessor = new MyItemProcessorNew();
// set the validator
csvItemProcessor.setValidator(myBeanValidator());
return csvItemProcessor;
}
@Autowired
private SqlSessionFactory sqlSessionFactory;
@Bean
@StepScope
// Spring Batch provides a special bean scope, StepScope, which ties beans to the lifecycle of a Step: the bean is instantiated only when the Step starts, which also lets you pass step-specific configuration and parameters (such as job parameters) into it.
public MyBatisCursorItemReader<BlogInfo> itemReaderNew(@Value("#{jobParameters[authorId]}") String authorId) {
System.out.println("starting the database query");
MyBatisCursorItemReader<BlogInfo> reader = new MyBatisCursorItemReader<>();
reader.setQueryId("com.example.batchdemo.mapper.BlogMapper.queryInfoById");
reader.setSqlSessionFactory(sqlSessionFactory);
Map<String , Object> map = new HashMap<>();
map.put("authorId" , Integer.valueOf(authorId));
reader.setParameterValues(map);
return reader;
}
/**
 * ItemWriter: binds the datasource, sets the batch insert SQL, and writes to the database
 * @param dataSource
 * @return
 */
@Bean
public ItemWriter<BlogInfo> writerNew(DataSource dataSource){
// use a JdbcBatchItemWriter to write the data to the database
JdbcBatchItemWriter<BlogInfo> writer = new JdbcBatchItemWriter<>();
// set the parameterized sql statement
writer.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<BlogInfo>());
String sql = "insert into bloginfonew "+" (blogAuthor,blogUrl,blogTitle,blogItem) "
+" values(:blogAuthor,:blogUrl,:blogTitle,:blogItem)";
writer.setSql(sql);
writer.setDataSource(dataSource);
return writer;
}
Points to note about the code:

The data reader MyBatisCursorItemReader references the mapper method queryInfoById defined earlier in BlogMapper.

The data processor MyItemProcessorNew contains the new business logic.

The data writer inserts into a different table, created specially for this test: to support this scenario, our database gets a new table, bloginfonew.
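The article doesn't show the DDL for the new table. Assuming it mirrors bloginfo (the writer's insert statement uses the same four columns), it would look roughly like this:

```sql
CREATE TABLE `bloginfonew` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `blogAuthor` varchar(255) NULL DEFAULT NULL COMMENT 'blog author id',
  `blogUrl` varchar(255) NULL DEFAULT NULL COMMENT 'blog URL',
  `blogTitle` varchar(255) NULL DEFAULT NULL COMMENT 'blog title',
  `blogItem` varchar(255) NULL DEFAULT NULL COMMENT 'blog category',
  PRIMARY KEY (`id`)
) ENGINE = InnoDB CHARACTER SET = utf8;
```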
Next, we write a new interface to execute the new job:
@Autowired
SimpleJobLauncher jobLauncher;
@Autowired
Job myJobNew;
@GetMapping("testJobNew")
public void testJobNew(@RequestParam("authorId") String authorId) throws JobParametersInvalidException, JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException {
JobParameters jobParametersNew = new JobParametersBuilder().addLong("timeNew", System.currentTimeMillis())
.addString("authorId",authorId)
.toJobParameters();
jobLauncher.run(myJobNew,jobParametersNew);
}
OK, let's call this interface and take a look at the console: it fails. The failure is related to druid; a database feature used when reading the data is not supported. My preliminary conclusion is that the druid connection pool does not support MyBatisCursorItemReader.
So all we need to do is:

Comment out the druid connection pool dependency
Replace the connection pool configuration in the yml

In fact, we don't need to configure any other pool at all: Spring Boot 2.x ships with HikariCP integrated as the default connection pool, and it is the officially recommended one. Unless you need druid's monitoring dashboard, SQL analysis, and so on, HikariCP is the clear first choice.
In the official words:

We prefer HikariCP for its performance and concurrency. If HikariCP is available, we always choose it.

So we don't configure any other connection pool and simply use the default HikariCP.
Of course, if you do want to configure it explicitly, that's also possible:
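A minimal sketch of what the yml datasource section might look like after dropping druid. The property names follow Spring Boot's standard spring.datasource.hikari keys; the pool sizes here are illustrative, not from the original article:

```yaml
spring:
  datasource:
    username: root
    password: root
    url: jdbc:mysql://localhost:3306/hellodemo?useSSL=false&useUnicode=true&characterEncoding=UTF-8&serverTimezone=GMT%2B8&zeroDateTimeBehavior=convertToNull
    driver-class-name: com.mysql.cj.jdbc.Driver
    hikari:
      minimum-idle: 5
      maximum-pool-size: 20
      connection-timeout: 60000
```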
So, with the druid connection pool removed, let's call the new interface again. This time the job succeeds: the data is fetched from the database, batch processed, and written out, and the rows inserted into the new table have been transformed by the processing logic we wrote.

And that's it: Spring Boot integrated with the Spring Batch batch processing framework.