Spring Batch 2.1.8 Chinese Documentation (Part 2)

Chapter 2 What's New in Spring Batch

The features of the Spring Batch 2.0 release fall into six themes:

Java 5
Non-sequential step execution
Chunk-oriented processing
Metadata enhancements
Scalability
Configuration

2.1 Java 5

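The 1.x releases of Spring Batch were all based on Java 1.4, so the framework could not use the enhancements introduced in Java 5, such as generics and parameterized types. The framework has now been updated to take full advantage of these features, and as a result Java 1.4 is no longer supported. Most of the interfaces that developers work with have been updated to generic form. For example, the ItemReader interface in 1.1 looked like this: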

public interface ItemReader {

    Object read() throws Exception;

    void mark() throws MarkFailedException;
 
    void reset() throws ResetFailedException;
}

In this version the read method returns Object. The 2.0 version is:

public interface ItemReader<T> {

    T read() throws Exception, UnexpectedInputException, ParseException;

}

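ItemReader now returns the generic type T. Note also that the mark and reset methods have been removed. This is because the step processing strategy has changed, as discussed in the sections below. Many other interfaces have been updated in a similar way.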
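To make the benefit of the generic signature concrete, here is a minimal sketch. The ItemReader<T> interface is redeclared locally, with the throws clause reduced to Exception, so that the example compiles without Spring Batch on the classpath. ListBackedReader and GenericReaderDemo are hypothetical names introduced only for this illustration. Returning null at the end of the input follows the Spring Batch convention for signalling that a reader is exhausted.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Local stand-in for the 2.0 interface shown above (assumption: throws clause simplified).
interface ItemReader<T> {
    T read() throws Exception;
}

// Hypothetical reader that serves items from an in-memory list.
class ListBackedReader<T> implements ItemReader<T> {
    private final Iterator<T> iterator;

    ListBackedReader(List<T> items) {
        this.iterator = items.iterator();
    }

    // Returns the next item, or null once the input is exhausted.
    public T read() {
        return iterator.hasNext() ? iterator.next() : null;
    }
}

class GenericReaderDemo {
    public static void main(String[] args) {
        ListBackedReader<String> reader = new ListBackedReader<>(Arrays.asList("a", "b"));
        String first = reader.read();   // no cast needed, unlike the Object-returning 1.x read()
        System.out.println(first.toUpperCase());   // prints "A"
    }
}
```

With the 1.x signature the caller would have had to cast the result of read() to String, and a wrong cast would only surface at runtime.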

2.2 Chunk-Oriented Processing

Previously, the default processing strategy provided by Spring Batch was item-oriented processing.

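In item-oriented processing, the ItemReader returns one object (the item), which is handed to the ItemWriter, and the results are committed once the number of items reaches the commit interval. For example, if the commit interval is 5, the ItemReader and the ItemWriter are each called 5 times. In simplified code: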

for(int i = 0; i < commitInterval; i++){
    Object item = itemReader.read();
    itemWriter.write(item);
}

The ItemReader and ItemWriter interfaces were designed entirely around this approach:

public interface ItemReader {

    Object read() throws Exception;

    void mark() throws MarkFailedException;
 
    void reset() throws ResetFailedException;
}

public interface ItemWriter {

    void write(Object item) throws Exception;

    void flush() throws FlushFailedException;

    void clear() throws ClearFailedException;
}

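Because the scope of processing was a single item, supporting rollback required additional methods, which is where mark, reset, flush, and clear came in. For example, if 2 items have been read and written successfully and an error occurs while writing the third, the whole transaction has to be rolled back. The clear method on the writer is called to empty its buffer, and reset is called on the ItemReader to return it to the position recorded by the last call to mark.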
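The reader side of that rollback scenario can be sketched as follows. This is not the 1.x implementation: MarkableListReader and RollbackDemo are hypothetical classes that only imitate the mark/reset contract over an in-memory list, and the checked exceptions of the real interface are left out for brevity.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical reader imitating the 1.x mark/reset contract over an in-memory list.
class MarkableListReader {
    private final List<String> items;
    private int position = 0;   // index of the next item to read
    private int marked = 0;     // position saved by the last mark()

    MarkableListReader(List<String> items) {
        this.items = items;
    }

    // Returns the next item, or null when the list is exhausted.
    Object read() {
        return position < items.size() ? items.get(position++) : null;
    }

    // Remembers the current position (called at a transaction boundary).
    void mark() {
        marked = position;
    }

    // Returns to the position remembered by the last mark() (called on rollback).
    void reset() {
        position = marked;
    }
}

class RollbackDemo {
    public static void main(String[] args) {
        MarkableListReader reader = new MarkableListReader(Arrays.asList("a", "b", "c"));
        reader.mark();      // transaction starts here
        reader.read();      // "a", written successfully
        reader.read();      // "b", written successfully
        reader.read();      // "c", suppose writing this item fails
        reader.reset();     // rollback: go back to the marked position
        System.out.println(reader.read());   // prints "a"
    }
}
```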

In 2.0, this strategy has been changed to a chunk-oriented approach.

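Using the same example as above, if the commit interval is 5, read is called 5 times and write is called once. The items that are read are aggregated into a list, which is then written out in one call. In simplified code: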

List items = new ArrayList();
for(int i = 0; i < commitInterval; i++){
    items.add(itemReader.read());
}
itemWriter.write(items);
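The difference in call counts can be checked with a small runnable sketch. CountingWriter and ChunkLoopDemo are hypothetical names, and a plain Iterator stands in for the reader. None of this is Spring Batch code. It only mirrors the loop above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical writer that only counts how it is called.
class CountingWriter {
    int calls = 0;          // number of write(...) invocations
    int itemsWritten = 0;   // total items received across all calls

    void write(List<String> items) {
        calls++;
        itemsWritten += items.size();
    }
}

class ChunkLoopDemo {
    // Processes one chunk: up to commitInterval reads, then a single write.
    static void processChunk(Iterator<String> source, CountingWriter writer, int commitInterval) {
        List<String> items = new ArrayList<>();
        for (int i = 0; i < commitInterval && source.hasNext(); i++) {
            items.add(source.next());
        }
        writer.write(items);
    }

    public static void main(String[] args) {
        Iterator<String> source = Arrays.asList("a", "b", "c", "d", "e", "f").iterator();
        CountingWriter writer = new CountingWriter();
        processChunk(source, writer, 5);
        // prints "1 write call(s), 5 item(s)"
        System.out.println(writer.calls + " write call(s), " + writer.itemsWritten + " item(s)");
    }
}
```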

This approach is not only simpler and more scalable, it also makes the ItemReader and ItemWriter interfaces much cleaner:

public interface ItemReader<T> {

    T read() throws Exception, UnexpectedInputException, ParseException;

}

public interface ItemWriter<T> {

    void write(List<? extends T> items) throws Exception;

}

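As the code shows, the interfaces no longer contain the mark, reset, flush, and clear methods, which makes creating readers and writers much more straightforward. In the case of ItemReader, the interface is very simple: the framework buffers the items that have been read on the developer's behalf in case a rollback occurs (there are exceptions when the underlying resource is transactional, see Section 5.1.7.1, Transactional Readers). ItemWriter is simpler as well. Instead of receiving one item at a time, it receives the entire chunk of items at once, and it can decide when to flush resources (such as a file or a Hibernate session) before returning control to the Step. More information on chunk-oriented processing can be found in "Section 5.1 Chunk-Oriented Processing". Information on reader and writer implementations can be found in "Chapter 6 ItemReaders and ItemWriters".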
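As an illustration of a writer that receives the whole chunk and flushes once before returning, consider the sketch below. The ItemWriter<T> interface is redeclared locally so that the example compiles on its own. LineItemWriter, its flushCount field, and ChunkWriterDemo are hypothetical, and a StringWriter stands in for a real file.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

// Local stand-in for the 2.0 interface shown above.
interface ItemWriter<T> {
    void write(List<? extends T> items) throws Exception;
}

// Hypothetical writer: one line per item, one flush per chunk.
class LineItemWriter implements ItemWriter<String> {
    private final PrintWriter out;
    int flushCount = 0;   // counts flushes, for demonstration only

    LineItemWriter(PrintWriter out) {
        this.out = out;
    }

    public void write(List<? extends String> items) {
        for (String item : items) {
            out.print(item);
            out.print('\n');
        }
        out.flush();      // flush once, before control returns to the caller
        flushCount++;
    }
}

class ChunkWriterDemo {
    public static void main(String[] args) {
        StringWriter sink = new StringWriter();   // stands in for a file
        LineItemWriter writer = new LineItemWriter(new PrintWriter(sink));
        writer.write(Arrays.asList("a", "b", "c"));
        System.out.println(writer.flushCount);    // prints "1"
    }
}
```

Under the item-oriented contract the same three items would have reached the writer in three separate calls, with flush and clear driven from outside.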

2.2.1 ItemProcessor

Originally, a Step depended on only two objects, ItemReader and ItemWriter.

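This basic configuration is fairly robust. However, in many scenarios an item needs to be transformed before it is written. In 1.x this was achieved with the composite pattern.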

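This pattern works, but it requires an extra layer between the reader or writer and the Step. In addition, the ItemWriter has to be registered separately with the Step as an ItemStream. For this reason, ItemTransformer was renamed to ItemProcessor and moved up to the same level as ItemReader and ItemWriter.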
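A minimal sketch of how the three collaborators fit together at the same level is shown below. The three interfaces are redeclared locally so that the example compiles without Spring Batch, with the reader's throws clause simplified to Exception. ProcessorDemo, processChunk, and runDemo are hypothetical helpers, not framework code, and the loop only approximates what a chunk-oriented Step does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Local stand-ins for the Spring Batch 2.0 interfaces.
interface ItemReader<T> { T read() throws Exception; }
interface ItemProcessor<I, O> { O process(I item) throws Exception; }
interface ItemWriter<T> { void write(List<? extends T> items) throws Exception; }

class ProcessorDemo {
    // Hypothetical helper: reads up to commitInterval items, transforms each, writes the chunk once.
    static <I, O> int processChunk(ItemReader<I> reader, ItemProcessor<I, O> processor,
                                   ItemWriter<O> writer, int commitInterval) throws Exception {
        List<O> outputs = new ArrayList<>();
        for (int i = 0; i < commitInterval; i++) {
            I item = reader.read();
            if (item == null) {
                break;                              // input exhausted
            }
            outputs.add(processor.process(item));   // transformation happens between read and write
        }
        writer.write(outputs);
        return outputs.size();
    }

    // Converts the strings "1", "2", "3" to integers through a reader, a processor and a writer.
    static List<Integer> runDemo() {
        Iterator<String> source = Arrays.asList("1", "2", "3").iterator();
        ItemReader<String> reader = () -> source.hasNext() ? source.next() : null;
        ItemProcessor<String, Integer> toInteger = Integer::valueOf;
        List<Integer> sink = new ArrayList<>();
        ItemWriter<Integer> writer = sink::addAll;
        try {
            processChunk(reader, toInteger, writer, 5);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(runDemo());   // prints "[1, 2, 3]"
    }
}
```

Because the processor is a separate collaborator, the reader and writer stay unaware of the transformation, and no wrapping layer is needed around either of them.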
