Best practices for implementing the pipeline pattern with Spring

This is the second article in a series on design patterns; the first presented a recommended routine for the strategy pattern: Using the Strategy Pattern Happily.

The Pipeline pattern is a commonly used variant of the Chain of Responsibility pattern. In the pipeline pattern, the pipeline acts as a conduit that passes data through a sequence of processing steps: after each step finishes with the data, it hands the data to the next step, until all steps have run.

PS: In the pure Chain of Responsibility pattern, exactly one handler in the chain ends up processing the data, whereas in the pipeline pattern multiple handlers each process the data in turn.
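That core idea can be sketched in a few lines of plain Java. This is a minimal, framework-free illustration (the `Stage` interface and all names here are invented for the sketch, not part of the article's later code): every stage gets to process the data, and returning false stops the flow early.

```java
import java.util.List;

// Minimal, framework-free sketch of the pipeline idea: every stage gets a
// chance to process the data, and returning false stops the flow early.
public class MiniPipeline {

    interface Stage {
        boolean handle(StringBuilder data);
    }

    static boolean run(List<Stage> pipeline, StringBuilder data) {
        for (Stage stage : pipeline) {
            if (!stage.handle(data)) {
                return false; // a stage vetoed: stop passing data downstream
            }
        }
        return true; // every stage processed the data
    }

    public static void main(String[] args) {
        List<Stage> pipeline = List.of(
                d -> { d.append("validated;"); return true; },
                d -> { d.append("created;");   return true; },
                d -> { d.append("saved;");     return true; }
        );
        StringBuilder data = new StringBuilder();
        System.out.println(run(pipeline, data)); // prints: true
        System.out.println(data);                // prints: validated;created;saved;
    }
}
```

Note how all three stages touch the same data object, which is exactly what distinguishes this from a pure chain of responsibility.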

When to use the pipeline pattern


Consider the pipeline pattern when a task is complex enough to be split into multiple sub-steps, especially when you expect to add new sub-steps, remove old ones, or reorder them at any position in the future.

Using the pipeline pattern happily


▐ Background

When the model platform was first built, creating a model instance involved three steps: "validate the input data -> create a model instance from the input -> save the model instance to the relevant DB tables". That was not particularly complicated, so the code at the time looked like this:

public class ModelServiceImpl implements ModelService {

    /**
     * Submit the model (build a model instance)
     */
    public CommonResponse<Long> buildModelInstance(InstanceBuildRequest request) {
        // Validate the input data
        validateInput(request);
        // Create a model instance from the input
        ModelInstance instance = createModelInstance(request);
        // Save the instance to the relevant DB tables
        saveInstance(instance);
        // Return the id of the new instance (assuming ModelInstance exposes its id)
        return CommonResponse.success(instance.getId());
    }
}

However, it didn't take long before we found that the format of the form input did not fully meet the model's input requirements, so we wanted to add a "form data preprocessing" step. Before that work even started, some business parties asked to do their own data processing as well (for example, generating additional business data from the merchant's form input to use as model input).

Therefore, after "input data validation", we needed to add "form input preprocessing" and "business-side custom data processing (optional)". At that point I faced a choice: should I keep implementing these new steps by adding methods to buildModelInstance? The advantage is that I could be lazy now, but the disadvantages are:

  1. ModelService should only receive HSF requests and should not carry business logic. Writing the model-submission logic in this class violates the single responsibility principle and will make the class explode in size.

  2. In the future, every time a processing step is added or removed, I would have to modify the buildModelInstance method, which ought to stay cohesive, and that violates the open-closed principle.

So, to avoid digging a pit for my future self, I had to come up with a more complete plan. My little head started spinning, and suddenly Netty's ChannelPipeline flashed through it: yes, the pipeline pattern, exactly what I needed!

There are many ways to implement the pipeline pattern. Based on the background above, I will share my current "best routine" for implementing it with Spring (if you have a better routine, please share it so we can discuss).

▐ Define the pipeline context

/**
 * The context passed through the pipeline
 */
@Getter
@Setter
public class PipelineContext {

    /**
     * Time at which processing started
     */
    private LocalDateTime startTime;

    /**
     * Time at which processing ended
     */
    private LocalDateTime endTime;

    /**
     * Human-readable name of this context
     */
    public String getName() {
        return this.getClass().getSimpleName();
    }
}

▐ Define the context handler


/**
 * A context handler in the pipeline
 */
public interface ContextHandler<T extends PipelineContext> {

    /**
     * Process the given context data
     *
     * @param context the context data being processed
     * @return true to let the next ContextHandler continue processing, false to stop
     */
    boolean handle(T context);
}

For convenience, let's now define the context and handlers for the earliest version of the "submit model" logic:

/**
 * Context for building a model instance
 */
@Getter
@Setter
public class InstanceBuildContext extends PipelineContext {

    /**
     * Model id
     */
    private Long modelId;

    /**
     * User id
     */
    private long userId;

    /**
     * Form input
     */
    private Map<String, Object> formInput;

    /**
     * Records the id after the model instance is saved
     */
    private Long instanceId;

    /**
     * Error message when model creation fails
     */
    private String errorMsg;

    // other fields

    @Override
    public String getName() {
        return "model instance build context";
    }
}

Handler for input data validation:

@Component
public class InputDataPreChecker implements ContextHandler<InstanceBuildContext> {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @Override
    public boolean handle(InstanceBuildContext context) {
        logger.info("-- validating input data --");

        Map<String, Object> formInput = context.getFormInput();

        if (MapUtils.isEmpty(formInput)) {
            context.setErrorMsg("form input must not be empty");
            return false;
        }

        String instanceName = (String) formInput.get("instanceName");

        if (StringUtils.isBlank(instanceName)) {
            context.setErrorMsg("form input must contain an instance name");
            return false;
        }

        return true;
    }
}

Handler for creating the model instance from the input:

@Component
public class ModelInstanceCreator implements ContextHandler<InstanceBuildContext> {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @Override
    public boolean handle(InstanceBuildContext context) {
        logger.info("-- creating model instance from input data --");

        // pretend to create a model instance

        return true;
    }
}

Handler for saving the model instance to the relevant DB tables:

@Component
public class ModelInstanceSaver implements ContextHandler<InstanceBuildContext> {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @Override
    public boolean handle(InstanceBuildContext context) {
        logger.info("-- saving model instance to DB tables --");

        // pretend to save the model instance

        return true;
    }
}

At this point a question arises: how should the ContextHandlers for the same Context be chained into a pipeline? Let's think it through:

  1. Add a setNext method to ContextHandler and have each implementation name its successor. The drawbacks are obvious: inserting a new ContextHandler in the middle of an existing pipeline forces you to modify the setNext call of the previous ContextHandler. Moreover, code is written for people to read, and with this scheme you cannot see the whole processing chain at a glance; you have to open each related ContextHandler to find out.

  2. Annotate each ContextHandler with @Order and let the numbers determine the order. Start with large gaps between the numbers (say 10, 20, 30); a ContextHandler added later can then take a number such as 11, 21, or 31, which avoids the code-modification problem of the previous scheme. But you still cannot see the full processing chain without opening every related ContextHandler.

  3. Write a routing table up front that maps each Context to its pipeline (represented as a List<ContextHandler>) and fixes the order of the handlers within it. At startup, Spring builds a Map from this routing table whose key is the Context type and whose value is the pipeline (the List<ContextHandler>). To understand any pipeline's processing chain, you just read the routing table; it is clear at a glance. The drawback is that every new ContextHandler requires a small change to the corresponding pipeline in the routing table, but if that keeps the code easy to read, the trade-off is worthwhile and acceptable~

▐ Building the pipeline routing table

With Spring's Java-based configuration, we can easily build the routing table for our pipelines:

/**
 * Configuration of the pipeline routes
 */
@Configuration
public class PipelineRouteConfig implements ApplicationContextAware {

    /**
     * Routes from data type to the list of handler types in its pipeline
     */
    private static final
    Map<Class<? extends PipelineContext>,
        List<Class<? extends ContextHandler<? extends PipelineContext>>>> PIPELINE_ROUTE_MAP = new HashMap<>(4);

    /*
     * Configure the processing pipeline for each context type here:
     * the key is the context type, the value is the list of handler types
     */
    static {
        PIPELINE_ROUTE_MAP.put(InstanceBuildContext.class,
                               Arrays.asList(
                                       InputDataPreChecker.class,
                                       ModelInstanceCreator.class,
                                       ModelInstanceSaver.class
                               ));

        // pipeline configuration for other Contexts in the future
    }

    /**
     * At Spring startup, build the pipeline mapping from the routing table
     */
    @Bean("pipelineRouteMap")
    public Map<Class<? extends PipelineContext>, List<? extends ContextHandler<? extends PipelineContext>>> getHandlerPipelineMap() {
        return PIPELINE_ROUTE_MAP.entrySet()
                                 .stream()
                                 .collect(Collectors.toMap(Map.Entry::getKey, this::toPipeline));
    }

    /**
     * Build a pipeline from the given list of ContextHandler types
     */
    private List<? extends ContextHandler<? extends PipelineContext>> toPipeline(
            Map.Entry<Class<? extends PipelineContext>, List<Class<? extends ContextHandler<? extends PipelineContext>>>> entry) {
        return entry.getValue()
                    .stream()
                    .map(appContext::getBean)
                    .collect(Collectors.toList());
    }

    private ApplicationContext appContext;

    @Override
    public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
        appContext = applicationContext;
    }
}
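Stripped of Spring, the idea behind this routing table is simply a `Map` from context type to an ordered list of handlers. A runnable sketch (all class names here are invented for illustration; Spring's extra job above is turning handler classes into handler beans at startup):

```java
import java.util.List;
import java.util.Map;

// Framework-free sketch of the routing table: the key is the context type,
// the value is the ordered pipeline for that type.
public class RouteTableSketch {

    interface Handler { boolean handle(Object context); }

    static class BuildContext { }
    static class DeleteContext { }

    static final Map<Class<?>, List<Handler>> ROUTES = Map.of(
            BuildContext.class,  List.<Handler>of(ctx -> true, ctx -> true, ctx -> true),
            DeleteContext.class, List.<Handler>of(ctx -> true)
    );

    // Look up the pipeline for a context object by its runtime class
    static int pipelineLength(Object context) {
        List<Handler> pipeline = ROUTES.get(context.getClass());
        return pipeline == null ? 0 : pipeline.size();
    }

    public static void main(String[] args) {
        System.out.println(pipelineLength(new BuildContext()));  // prints: 3
        System.out.println(pipelineLength(new DeleteContext())); // prints: 1
    }
}
```

Reading the whole `ROUTES` map in one place is exactly the "clear at a glance" property option 3 promises.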

▐ Define the pipeline executor


The last step is to define the pipeline executor, which finds the pipeline matching the type of the incoming context data and then runs the context through it.

/**
 * Pipeline executor
 */
@Component
public class PipelineExecutor {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    /**
     * References the pipelineRouteMap defined in PipelineRouteConfig
     */
    @Resource
    private Map<Class<? extends PipelineContext>,
                List<? extends ContextHandler<? super PipelineContext>>> pipelineRouteMap;

    /**
     * Synchronously process the given context data.<br/>
     * Returns true only if the context flows through to the last handler
     * and that handler returns true; otherwise returns false.
     *
     * @param context the context data to process
     * @return whether the pipeline stayed unblocked: true if it did, false otherwise
     */
    public boolean acceptSync(PipelineContext context) {
        Objects.requireNonNull(context, "context must not be null");
        // determine the data type
        Class<? extends PipelineContext> dataType = context.getClass();
        // look up the processing pipeline for that type
        List<? extends ContextHandler<? super PipelineContext>> pipeline = pipelineRouteMap.get(dataType);

        if (CollectionUtils.isEmpty(pipeline)) {
            logger.error("no pipeline found for {}", dataType.getSimpleName());
            return false;
        }

        // whether the pipeline is still unblocked
        boolean lastSuccess = true;

        for (ContextHandler<? super PipelineContext> handler : pipeline) {
            try {
                // let the current handler process the data; it returns whether to continue
                lastSuccess = handler.handle(context);
            } catch (Throwable ex) {
                lastSuccess = false;
                logger.error("[{}] handler error, handler={}", context.getName(), handler.getClass().getSimpleName(), ex);
            }

            // stop processing further down the pipeline
            if (!lastSuccess) { break; }
        }

        return lastSuccess;
    }
}


▐ Using the pipeline pattern


At this point, we can rewrite the initial buildModelInstance as:

public CommonResponse<Long> buildModelInstance(InstanceBuildRequest request) {
    InstanceBuildContext data = createPipelineData(request);
    boolean success = pipelineExecutor.acceptSync(data);

    // the model instance was created successfully
    if (success) {
        return CommonResponse.success(data.getInstanceId());
    }

    logger.error("failed to create model instance: {}", data.getErrorMsg());
    return CommonResponse.failed(data.getErrorMsg());
}

Let's simulate the process of creating a model instance, once with normal parameters and once with invalid parameters (the original article shows the log output as screenshots).

Now we add two new ContextHandlers for InstanceBuildContext: FormInputPreprocessor (form input preprocessing) and BizSideCustomProcessor (business-side custom data processing).

@Component
public class FormInputPreprocessor implements ContextHandler<InstanceBuildContext> {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @Override
    public boolean handle(InstanceBuildContext context) {
        logger.info("-- preprocessing form input data --");

        // pretend to preprocess the form input data

        return true;
    }
}

@Component
public class BizSideCustomProcessor implements ContextHandler<InstanceBuildContext> {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @Override
    public boolean handle(InstanceBuildContext context) {
        logger.info("-- business-side custom data processing --");

        // first check whether custom processing exists; if not, return true directly

        // invoke the business side's custom form data processing

        return true;
    }
}

At this point, buildModelInstance needs no changes at all. We only need to add these two ContextHandlers to the pipeline associated with InstanceBuildContext in the routing table, and at startup Spring automatically builds the pipeline for each Context:

Simulating the model-instance creation process again, the two new handlers now run in the pipeline (the log output is again shown as a screenshot in the original article).

▐ Asynchronous processing


In the pipeline executor PipelineExecutor, acceptSync is a synchronous method.

Xiao Mi: the name alone gives away that you are quietly setting something up.

For tasks with many steps, we often need asynchronous processing, for example for time-consuming tasks. Making pipeline processing asynchronous is simple. First define a thread pool, for example:

<!-- Thread pool dedicated to executing pipeline tasks -->
<bean id="pipelineThreadPool"
      class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="corePoolSize" value="4" /> <!-- core pool size -->
    <property name="maxPoolSize" value="8" />  <!-- maximum pool size -->
    <property name="keepAliveSeconds" value="960" />  <!-- max idle time in seconds (tune to your pipeline usage) -->
    <property name="queueCapacity" value="256" />     <!-- task queue capacity (tune to your pipeline usage) -->
    <property name="threadNamePrefix" value="pipelineThreadPool" />
    <property name="rejectedExecutionHandler">
        <bean class="java.util.concurrent.ThreadPoolExecutor.CallerRunsPolicy" />
    </property>
</bean>
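For reference, roughly the same pool expressed with the plain JDK API (a sketch with the same sizing; the Spring bean above additionally handles thread naming and container lifecycle):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PipelinePool {

    // Same sizing as the XML bean: 4 core threads, 8 max, 960 s keep-alive,
    // a 256-slot queue, and caller-runs as the saturation policy.
    static ThreadPoolExecutor newPipelinePool() {
        return new ThreadPoolExecutor(
                4, 8, 960, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(256),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```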

Then add an asynchronous processing method to PipelineExecutor:

/**
 * Pipeline thread pool
 */
@Resource
private ThreadPoolTaskExecutor pipelineThreadPool;

/**
 * Asynchronously process the given context data
 *
 * @param context  the context data
 * @param callback callback invoked when processing completes
 */
public void acceptAsync(PipelineContext context, BiConsumer<PipelineContext, Boolean> callback) {
    pipelineThreadPool.execute(() -> {
        boolean success = acceptSync(context);

        if (callback != null) {
            callback.accept(context, success);
        }
    });
}
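Stripped of Spring, the shape of acceptAsync is just "run the synchronous method on a pool, then hand the result to a callback". A self-contained sketch (all names invented; a plain `ExecutorService` stands in for the Spring-managed pool):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.BiConsumer;

public class AsyncSketch {

    static final ExecutorService POOL = Executors.newFixedThreadPool(2);

    // Pretend synchronous pipeline run: succeeds when the input is non-empty.
    static boolean acceptSync(String context) {
        return !context.isEmpty();
    }

    // Same contract as acceptAsync: run on the pool, then invoke the callback
    // with the context and the result (the callback may be null).
    static void acceptAsync(String context, BiConsumer<String, Boolean> callback) {
        POOL.execute(() -> {
            boolean success = acceptSync(context);
            if (callback != null) {
                callback.accept(context, success);
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        acceptAsync("ctx", (ctx, ok) -> {
            System.out.println(ctx + " -> " + ok); // prints: ctx -> true
            done.countDown();
        });
        done.await();
        POOL.shutdown();
    }
}
```

The latch in main is only there to keep the demo alive until the callback fires; in the real executor, the caller decides whether and how to wait.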


▐ Generic processing


Suppose we want to record the processing time of every pipeline and print logs before and after processing. We can provide two generic ContextHandlers placed at the head and tail of every pipeline:

@Component
public class CommonHeadHandler implements ContextHandler<PipelineContext> {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @Override
    public boolean handle(PipelineContext context) {
        logger.info("pipeline started: context={}", JSON.toJSONString(context));

        // record the start time
        context.setStartTime(LocalDateTime.now());

        return true;
    }
}

@Component
public class CommonTailHandler implements ContextHandler<PipelineContext> {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @Override
    public boolean handle(PipelineContext context) {
        // record the end time
        context.setEndTime(LocalDateTime.now());

        logger.info("pipeline finished: context={}", JSON.toJSONString(context));

        return true;
    }
}

The generic head and tail handlers could be registered in the routing table, but repeating that for every new PipelineContext seems unnecessary, so instead we modify acceptSync in PipelineExecutor directly:

@Component
public class PipelineExecutor {

    ......

    @Autowired
    private CommonHeadHandler commonHeadHandler;

    @Autowired
    private CommonTailHandler commonTailHandler;

    public boolean acceptSync(PipelineContext context) {
        ......

        // run the generic head handler
        commonHeadHandler.handle(context);

        // whether the pipeline is still unblocked
        boolean lastSuccess = true;

        for (ContextHandler<? super PipelineContext> handler : pipeline) {
            try {
                // let the current handler process the data; it returns whether to continue
                lastSuccess = handler.handle(context);
            } catch (Throwable ex) {
                lastSuccess = false;
                logger.error("[{}] handler error, handler={}", context.getName(), handler.getClass().getSimpleName(), ex);
            }

            // stop processing further down the pipeline
            if (!lastSuccess) { break; }
        }

        // run the generic tail handler
        commonTailHandler.handle(context);

        return lastSuccess;
    }
}
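What the head/tail pair buys is a guaranteed before/after hook around every pipeline, regardless of where processing stops. A compact framework-free sketch of that wrapping (names invented; `Predicate<StringBuilder>` stands in for a handler, and elapsed time is measured instead of logged):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.function.Predicate;

public class WrappedPipeline {

    // Runs head logic, then the pipeline (stopping at the first false),
    // then tail logic; the tail always runs, mirroring the modified acceptSync.
    static boolean run(List<Predicate<StringBuilder>> pipeline, StringBuilder ctx) {
        Instant start = Instant.now();          // head: record the start time
        boolean lastSuccess = true;
        for (Predicate<StringBuilder> handler : pipeline) {
            lastSuccess = handler.test(ctx);
            if (!lastSuccess) { break; }
        }
        Duration elapsed = Duration.between(start, Instant.now()); // tail: measure
        ctx.append("|took=").append(elapsed.toMillis()).append("ms");
        return lastSuccess;
    }

    public static void main(String[] args) {
        StringBuilder ctx = new StringBuilder("start");
        boolean ok = run(List.of(c -> { c.append("|step1"); return true; },
                                 c -> { c.append("|step2"); return false; },
                                 c -> { c.append("|never"); return true; }),
                         ctx);
        System.out.println(ok);   // prints: false
        System.out.println(ctx);  // e.g. start|step1|step2|took=0ms
    }
}
```

Note that even though step2 blocks the pipeline, the timing (tail) logic still runs, which is exactly why the head and tail live in the executor rather than in the routing table.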



Summary


With the pipeline pattern, we have greatly reduced coupling and improved the cohesion and extensibility of the system:

  • ModelService is only responsible for handling HSF requests and does not care about the specific business logic

  • PipelineExecutor only executes pipelines and does not care about the details of any particular pipeline

  • Each ContextHandler is responsible only for its own slice of business logic; it neither needs to know the pipeline's structure nor is coupled to the business logic of the other ContextHandlers

  • Adding, deleting, or reordering sub-steps only requires changing the routing table configuration; the original calling code stays untouched

Taobao Technology Department, Global Marketing Team: we're hiring

We work in the core area of Alibaba e-commerce, connecting supply and demand and supporting the products, platforms, and solutions of the e-commerce marketing domain, including Juhuasuan, the tens-of-billions subsidy program, Tmall Uxian, Tmall Black Box, Tmall new-product incubation, and other heavyweight businesses. We are deeply involved in the big annual promotions such as Double 11, 618, and 99, and keep challenging the limits of technology! Team members come from diverse backgrounds, from seasoned e-commerce veterans to energetic newcomers. We look forward to being joined by the curious and thoughtful you!

[Open positions] Java technical expert, data engineer

If you are interested, please send your resume to [email protected] or add the author's WeChat wx_zhou_mi for details. Welcome aboard~


Author | Zhou Mi (Leaf)

Editor | Orange

Produced by | Alibaba New Retail Technology


Origin blog.csdn.net/Taobaojishu/article/details/110211924