Spring Batch batch processing framework

1. Introduction to Spring Batch

Spring Batch is a lightweight, comprehensive batch processing framework designed to support the development of robust batch processing applications that are critical to the day-to-day operation of enterprise systems. Spring Batch builds on the expected Spring Framework features (productivity, POJO-based development approach, and general ease of use) while making it easy for developers to access and use higher-level enterprise services when necessary.

Spring Batch is not a scheduling framework. There are many good enterprise schedulers (e.g., Quartz, Tivoli, Control-M) in both the commercial and open-source worlds. Spring Batch is intended to be used in conjunction with a scheduler, not to replace it.

2. Business scenarios

We often encounter batch-processing requirements of this kind in business development.

Spring Batch supports the following business scenarios:

  • Submit batches periodically.
  • Concurrent batch processing: Process jobs in parallel.
  • Staged enterprise message-driven processing.
  • Massively parallel batch processing.
  • Manual or scheduled restart after failure.
  • Sequential processing of related steps (extended to workflow-driven batches).
  • Partial processing: Skip records (e.g. on rollback).
  • Whole-batch transactions, for cases with a small batch size or existing stored procedures/scripts.

3. Basic knowledge

3.1. Overall structure

Official documentation: https://docs.spring.io/spring-batch/docs/current/reference/html/index-single.html#domainLanguageOfBatch

  • JobRepository: provides the persistence mechanism for all of the batch stereotypes (Job, JobInstance, Step).
  • JobLauncher: a simple interface for launching a Job with a given set of JobParameters.
  • Job: an entity that encapsulates an entire batch process.
  • Step: a domain object that encapsulates an independent, sequential phase of a batch job.

3.2. Core interface

  • ItemReader: an abstraction that represents the retrieval of input for a
    Step, one item at a time.
  • ItemProcessor: an abstraction that represents the business processing
    of an item.
  • ItemWriter: an abstraction that represents the output of a Step,
    one batch or chunk of items at a time.

In general the flow is input → processing → output. A Job defines multiple Steps, and a Step typically combines an ItemReader, an ItemProcessor, and an ItemWriter.
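The chunk-oriented loop those three interfaces form can be sketched in plain Java. This is a minimal simulation, not Spring Batch itself: the nested interfaces only mirror Spring Batch's names, and the chunk size of 2 is arbitrary.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ChunkLoopSketch {
    interface ItemReader<T> { T read(); }               // returns null when input is exhausted
    interface ItemProcessor<I, O> { O process(I item); }
    interface ItemWriter<T> { void write(List<? extends T> chunk); }

    // Read items one at a time, process each, and write them a chunk at a time.
    static <I, O> void run(ItemReader<I> reader, ItemProcessor<I, O> processor,
                           ItemWriter<O> writer, int chunkSize) {
        List<O> chunk = new ArrayList<>();
        I item;
        while ((item = reader.read()) != null) {
            chunk.add(processor.process(item));
            if (chunk.size() == chunkSize) {            // a full chunk: write it out
                writer.write(chunk);
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {                         // flush the final partial chunk
            writer.write(chunk);
        }
    }

    public static void main(String[] args) {
        Iterator<Integer> source = List.of(1, 2, 3, 4, 5).iterator();
        ItemReader<Integer> reader = () -> source.hasNext() ? source.next() : null;
        ItemProcessor<Integer, String> processor = i -> "item-" + i;
        ItemWriter<String> writer = chunk -> System.out.println("write " + chunk);
        run(reader, processor, writer, 2);
        // prints:
        // write [item-1, item-2]
        // write [item-3, item-4]
        // write [item-5]
    }
}
```

Note that writes happen once per chunk, not once per item; in real Spring Batch each chunk is also a transaction boundary.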

4. Basic practice

4.0. Introducing Spring Batch

Add the Spring Boot parent to the pom file:

<parent>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-parent</artifactId>
  <version>2.2.5.RELEASE</version>
  <relativePath/> <!-- lookup parent from repository -->
</parent>

Then add spring-batch and related dependencies:

<dependencies>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-batch</artifactId>
    </dependency>

    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-validation</artifactId>
    </dependency>

    <dependency>
      <groupId>mysql</groupId>
      <artifactId>mysql-connector-java</artifactId>
    </dependency>

    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-jdbc</artifactId>
    </dependency>
  </dependencies>


The SQL script that creates the metadata tables Spring Batch depends on ships inside the spring-batch-core jar. For MySQL: ...\maven\repository\org\springframework\batch\spring-batch-core\4.2.1.RELEASE\spring-batch-core-4.2.1.RELEASE.jar!\org\springframework\batch\core\schema-mysql.sql
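Alternatively, Spring Boot can run that script automatically at startup. A minimal application.properties sketch — the datasource URL, username, and password below are placeholders for your own environment:

```properties
spring.datasource.url=jdbc:mysql://localhost:3306/batch_demo?useSSL=false&serverTimezone=UTC
spring.datasource.username=root
spring.datasource.password=secret
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
# create the Spring Batch metadata tables on startup (Spring Boot 2.x property)
spring.batch.initialize-schema=always
# optional: do not auto-run every Job bean on startup; trigger jobs manually instead
spring.batch.job.enabled=false
```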

Mark the startup class with @EnableBatchProcessing:

@SpringBootApplication
@EnableBatchProcessing
public class SpringBatchStartApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringBatchStartApplication.class, args);
    }
}

FirstJobDemo

@Component
public class FirstJobDemo {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;
    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public Job firstJob() {
        return jobBuilderFactory.get("firstJob")
                .start(step())
                .build();
    }

    private Step step() {
        return stepBuilderFactory.get("step")
                .tasklet((contribution, chunkContext) -> {
                    System.out.println("Executing step...");
                    return RepeatStatus.FINISHED;
                }).build();
    }
}

4.1. Process control

A. Multi-step tasks

@Bean
public Job multiStepJob() {
    return jobBuilderFactory.get("multiStepJob")
            .start(step1())
            .on(ExitStatus.COMPLETED.getExitCode()).to(step2())
            .from(step2())
            .on(ExitStatus.COMPLETED.getExitCode()).to(step3())
            .from(step3()).end()
            .build();
}

private Step step1() {
    return stepBuilderFactory.get("step1")
            .tasklet((stepContribution, chunkContext) -> {
                System.out.println("Executing step 1...");
                return RepeatStatus.FINISHED;
            }).build();
}

private Step step2() {
    return stepBuilderFactory.get("step2")
            .tasklet((stepContribution, chunkContext) -> {
                System.out.println("Executing step 2...");
                return RepeatStatus.FINISHED;
            }).build();
}

private Step step3() {
    return stepBuilderFactory.get("step3")
            .tasklet((stepContribution, chunkContext) -> {
                System.out.println("Executing step 3...");
                return RepeatStatus.FINISHED;
            }).build();
}

B. Parallel execution
Two flows are created: flow1 (containing step1 and step2) and flow2 (containing step3). The job builder's split method then runs flow1 and flow2 on an asynchronous task executor, i.e. in parallel.

@Component
public class SplitJobDemo {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;
    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public Job splitJob() {
        return jobBuilderFactory.get("splitJob")
                .start(flow1())
                .split(new SimpleAsyncTaskExecutor()).add(flow2())
                .end()
                .build();
    }

    private Step step1() {
        return stepBuilderFactory.get("step1")
                .tasklet((stepContribution, chunkContext) -> {
                    System.out.println("Executing step 1...");
                    return RepeatStatus.FINISHED;
                }).build();
    }

    private Step step2() {
        return stepBuilderFactory.get("step2")
                .tasklet((stepContribution, chunkContext) -> {
                    System.out.println("Executing step 2...");
                    return RepeatStatus.FINISHED;
                }).build();
    }

    private Step step3() {
        return stepBuilderFactory.get("step3")
                .tasklet((stepContribution, chunkContext) -> {
                    System.out.println("Executing step 3...");
                    return RepeatStatus.FINISHED;
                }).build();
    }

    private Flow flow1() {
        return new FlowBuilder<Flow>("flow1")
                .start(step1())
                .next(step2())
                .build();
    }

    private Flow flow2() {
        return new FlowBuilder<Flow>("flow2")
                .start(step3())
                .build();
    }
}

C. Task decision
A decider routes the job down different paths depending on runtime conditions. For example: if today is a weekend, the job executes step1 and step2; on a weekday, it executes step1 and step3.

@Component
public class MyDecider implements JobExecutionDecider {

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        LocalDate now = LocalDate.now();
        DayOfWeek dayOfWeek = now.getDayOfWeek();

        if (dayOfWeek == DayOfWeek.SATURDAY || dayOfWeek == DayOfWeek.SUNDAY) {
            return new FlowExecutionStatus("weekend");
        } else {
            return new FlowExecutionStatus("workingDay");
        }
    }
}

// the MyDecider bean is injected into the enclosing configuration class
@Autowired
private MyDecider myDecider;

@Bean
public Job deciderJob() {
    return jobBuilderFactory.get("deciderJob")
            .start(step1())
            .next(myDecider)
            .from(myDecider).on("weekend").to(step2())
            .from(myDecider).on("workingDay").to(step3())
            .from(step3()).on("*").to(step4())
            .end()
            .build();
}

private Step step1() {
    return stepBuilderFactory.get("step1")
            .tasklet((stepContribution, chunkContext) -> {
                System.out.println("Executing step 1...");
                return RepeatStatus.FINISHED;
            }).build();
}

private Step step2() {
    return stepBuilderFactory.get("step2")
            .tasklet((stepContribution, chunkContext) -> {
                System.out.println("Executing step 2...");
                return RepeatStatus.FINISHED;
            }).build();
}

private Step step3() {
    return stepBuilderFactory.get("step3")
            .tasklet((stepContribution, chunkContext) -> {
                System.out.println("Executing step 3...");
                return RepeatStatus.FINISHED;
            }).build();
}

private Step step4() {
    return stepBuilderFactory.get("step4")
            .tasklet((stepContribution, chunkContext) -> {
                System.out.println("Executing step 4...");
                return RepeatStatus.FINISHED;
            }).build();
}

D. Task nesting
Besides composing a Job from Steps or Flows, we can also wrap a Job as a special Step and hand it to another Job — this is job nesting.

@Component
public class NestedJobDemo {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;
    @Autowired
    private StepBuilderFactory stepBuilderFactory;
    @Autowired
    private JobLauncher jobLauncher;
    @Autowired
    private JobRepository jobRepository;
    @Autowired
    private PlatformTransactionManager platformTransactionManager;

    // parent job
    @Bean
    public Job parentJob() {
        return jobBuilderFactory.get("parentJob")
                .start(childJobOneStep())
                .next(childJobTwoStep())
                .build();
    }

    // wrap a job as a special step
    private Step childJobOneStep() {
        return new JobStepBuilder(new StepBuilder("childJobOneStep"))
                .job(childJobOne())
                .launcher(jobLauncher)
                .repository(jobRepository)
                .transactionManager(platformTransactionManager)
                .build();
    }

    // wrap a job as a special step
    private Step childJobTwoStep() {
        return new JobStepBuilder(new StepBuilder("childJobTwoStep"))
                .job(childJobTwo())
                .launcher(jobLauncher)
                .repository(jobRepository)
                .transactionManager(platformTransactionManager)
                .build();
    }

    // child job one
    private Job childJobOne() {
        return jobBuilderFactory.get("childJobOne")
                .start(
                    stepBuilderFactory.get("childJobOneStep")
                            .tasklet((stepContribution, chunkContext) -> {
                                System.out.println("Child job one executing its step...");
                                return RepeatStatus.FINISHED;
                            }).build()
                ).build();
    }

    // child job two
    private Job childJobTwo() {
        return jobBuilderFactory.get("childJobTwo")
                .start(
                    stepBuilderFactory.get("childJobTwoStep")
                            .tasklet((stepContribution, chunkContext) -> {
                                System.out.println("Child job two executing its step...");
                                return RepeatStatus.FINISHED;
                            }).build()
                ).build();
    }
}

4.2. Reading data

Define the model TestData (used throughout the following sections):

@Data
public class TestData {
    private int id;
    private String field1;
    private String field2;
    private String field3;
}

Spring Batch provides readers for text files, databases, XML, JSON, and more; consult the reference documentation for the other variants.

Text data reading demo:

@Component
public class FileItemReaderDemo {

    // job builder factory
    @Autowired
    private JobBuilderFactory jobBuilderFactory;
    // step builder factory
    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public Job fileItemReaderJob() {
        return jobBuilderFactory.get("fileItemReaderJob")
                .start(step())
                .build();
    }

    private Step step() {
        return stepBuilderFactory.get("step")
                .<TestData, TestData>chunk(2)
                .reader(fileItemReader())
                .writer(list -> list.forEach(System.out::println))
                .build();
    }

    private ItemReader<TestData> fileItemReader() {
        FlatFileItemReader<TestData> reader = new FlatFileItemReader<>();
        reader.setResource(new ClassPathResource("reader/file")); // location of the file resource
        reader.setLinesToSkip(1); // skip the first (header) line

        // One of the three AbstractLineTokenizer implementations; splits each
        // line on a fixed delimiter. The no-arg constructor uses a comma;
        // a different delimiter can be passed to the other constructor.
        DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();

        // set the field names, similar to column headers
        tokenizer.setNames("id", "field1", "field2", "field3");

        // maps each line of text to a TestData object
        DefaultLineMapper<TestData> mapper = new DefaultLineMapper<>();
        // set the LineTokenizer
        mapper.setLineTokenizer(tokenizer);

        // set how the parsed fields are mapped onto the POJO
        mapper.setFieldSetMapper(fieldSet -> {
            TestData data = new TestData();
            data.setId(fieldSet.readInt("id"));
            data.setField1(fieldSet.readString("field1"));
            data.setField2(fieldSet.readString("field2"));
            data.setField3(fieldSet.readString("field3"));
            return data;
        });
        reader.setLineMapper(mapper);
        return reader;
    }

}
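The demo assumes a comma-delimited file at src/main/resources/reader/file with a header line (skipped by setLinesToSkip(1)). The actual file is not shown in the source; it would look something like:

```
id,field1,field2,field3
1,a1,b1,c1
2,a2,b2,c2
3,a3,b3,c3
```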

4.3. Writing data

Writers are likewise available for text files, databases, XML, JSON, and so on.

@Component
public class FileItemWriterDemo {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;
    @Autowired
    private StepBuilderFactory stepBuilderFactory;
    @Resource(name = "writerSimpleReader")
    private ListItemReader<TestData> writerSimpleReader;

    @Bean
    public Job fileItemWriterJob() throws Exception {
        return jobBuilderFactory.get("fileItemWriterJob")
                .start(step())
                .build();
    }

    private Step step() throws Exception {
        return stepBuilderFactory.get("step")
                .<TestData, TestData>chunk(2)
                .reader(writerSimpleReader)
                .writer(fileItemWriter())
                .build();
    }

    private FlatFileItemWriter<TestData> fileItemWriter() throws Exception {
        FlatFileItemWriter<TestData> writer = new FlatFileItemWriter<>();

        FileSystemResource file = new FileSystemResource("D:/code/spring-batch-demo/src/main/resources/writer/writer-file");
        Path path = Paths.get(file.getPath());
        if (!Files.exists(path)) {
            Files.createFile(path);
        }
        // set the output file location
        writer.setResource(file);

        // serialize each TestData object to a JSON line;
        // one ObjectMapper is reused rather than created per item
        ObjectMapper mapper = new ObjectMapper();
        LineAggregator<TestData> aggregator = item -> {
            try {
                return mapper.writeValueAsString(item);
            } catch (JsonProcessingException e) {
                e.printStackTrace();
            }
            return "";
        };

        writer.setLineAggregator(aggregator);
        writer.afterPropertiesSet();
        return writer;
    }

}

4.5. Processing data

@Component
public class ValidatingItemProcessorDemo {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;
    @Autowired
    private StepBuilderFactory stepBuilderFactory;
    @Resource(name = "processorSimpleReader")
    private ListItemReader<TestData> processorSimpleReader;

    @Bean
    public Job validatingItemProcessorJob() throws Exception {
        return jobBuilderFactory.get("validatingItemProcessorJob")
                .start(step())
                .build();
    }

    private Step step() throws Exception {
        return stepBuilderFactory.get("step")
                .<TestData, TestData>chunk(2)
                .reader(processorSimpleReader)
                .processor(beanValidatingItemProcessor())
                .writer(list -> list.forEach(System.out::println))
                .build();
    }

//    private ValidatingItemProcessor<TestData> validatingItemProcessor() {
//        ValidatingItemProcessor<TestData> processor = new ValidatingItemProcessor<>();
//        processor.setValidator(value -> {
//            // validate each record
//            if ("".equals(value.getField3())) {
//                // if field3 is an empty string, throw an exception
//                throw new ValidationException("field3 has an invalid value");
//            }
//        });
//        return processor;
//    }

    private BeanValidatingItemProcessor<TestData> beanValidatingItemProcessor() throws Exception {
        BeanValidatingItemProcessor<TestData> beanValidatingItemProcessor = new BeanValidatingItemProcessor<>();
        // enable filtering: records failing validation are skipped instead of failing the step
//        beanValidatingItemProcessor.setFilter(true);
        beanValidatingItemProcessor.afterPropertiesSet();
        return beanValidatingItemProcessor;
    }

}

4.6. Task scheduling

Spring Batch can be paired with a scheduler such as Quartz or XXL-JOB for timed execution. The controller below triggers a job manually, passing parameters:

@RestController
@RequestMapping("job")
public class JobController {

    @Autowired
    private Job job;
    @Autowired
    private JobLauncher jobLauncher;

    @GetMapping("launcher/{message}")
    public String launcher(@PathVariable String message) throws Exception {
        JobParameters parameters = new JobParametersBuilder()
                .addString("message", message)
                .toJobParameters();
        // pass the parameters to the job
        jobLauncher.run(job, parameters);
        return "success";
    }
}
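One scheduling gotcha: a Job plus an identical set of JobParameters identifies the same JobInstance, and a successfully completed instance will not run again. Schedulers therefore typically add a unique parameter (such as a timestamp) on every trigger. The identity rule can be illustrated in plain Java, using map equality as a stand-in for Spring Batch's parameter comparison; the class and `run.id` key below are illustrative, not Spring Batch API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JobInstanceIdentitySketch {
    // Stand-in for JobParameters: identity is the full key/value map.
    static Map<String, String> params(String message, String runId) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("message", message);
        if (runId != null) {
            p.put("run.id", runId); // unique per trigger, like a timestamp parameter
        }
        return p;
    }

    public static void main(String[] args) {
        // Same message, no unique parameter: equal maps, i.e. the same
        // "instance" -- a second launch would be rejected as already complete.
        System.out.println(params("hello", null).equals(params("hello", null))); // true

        // Adding a per-trigger value makes every launch a distinct instance.
        System.out.println(params("hello", "1").equals(params("hello", "2")));   // false
    }
}
```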


Origin blog.csdn.net/weixin_43114209/article/details/131527661