Geek Time - The Beauty of Design Patterns: How to implement a sensitive-word filtering framework whose algorithms can be flexibly extended?

Today we cover the principle and implementation of the chain of responsibility pattern. I will also use the pattern to walk you through implementing a sensitive-word filtering framework whose filtering algorithms can be flexibly extended. In the next lesson we will get closer to real-world practice and analyze how Servlet Filter and Spring Interceptor use the chain of responsibility pattern to implement the filters and interceptors commonly found in frameworks.

Principle and implementation of the chain of responsibility pattern

The full English name of the pattern is Chain of Responsibility Design Pattern. In GoF's "Design Patterns", it is defined as follows:

Avoid coupling the sender of a request to its receiver by giving more than one object a chance to handle the request. Chain the receiving objects and pass the request along the chain until an object handles it.

In other words, the pattern decouples the sending of a request from its receiving, so that multiple receivers get a chance to process the request. These receiving objects are strung into a chain, and the request is passed along the chain until some receiving object on the chain can handle it.

This is rather abstract, so let me explain it in plainer words.

In the chain of responsibility pattern, multiple handlers (the "receiving objects" in the definition above) process the same request in turn. A request is first processed by handler A and then passed to handler B; after handler B processes it, it is passed to handler C, and so on, forming a chain. Each handler in the chain takes on its own processing responsibility, hence the name chain of responsibility pattern.

Let's look at the code implementation first; with the code in hand, the definition becomes easier to understand. There are many ways to implement the pattern. Here we introduce two of the more common ones.

The first implementation is shown below. Handler is the abstract parent class of all handler classes, and handle() is an abstract method. The handle() function of each concrete handler class (HandlerA, HandlerB) has a similar structure: if the handler can handle the request, it does not pass the request on; if it cannot, it hands the request to the next handler (that is, it calls successor.handle()). HandlerChain is the handler chain. From a data structure point of view, it is a linked list that records both the head and the tail of the chain; the tail is recorded to make appending a handler convenient.


public abstract class Handler {
  protected Handler successor = null;

  public void setSuccessor(Handler successor) {
    this.successor = successor;
  }

  public abstract void handle();
}

public class HandlerA extends Handler {
  @Override
  public void handle() {
    boolean handled = false;
    //...
    // Pass the request on only if this handler could not handle it.
    if (!handled && successor != null) {
      successor.handle();
    }
  }
}

public class HandlerB extends Handler {
  @Override
  public void handle() {
    boolean handled = false;
    //...
    if (!handled && successor != null) {
      successor.handle();
    }
  }
}

public class HandlerChain {
  private Handler head = null;
  private Handler tail = null;

  public void addHandler(Handler handler) {
    handler.setSuccessor(null);

    if (head == null) {
      head = handler;
      tail = handler;
      return;
    }

    // Append to the tail of the linked list.
    tail.setSuccessor(handler);
    tail = handler;
  }

  public void handle() {
    if (head != null) {
      head.handle();
    }
  }
}

// Usage example
public class Application {
  public static void main(String[] args) {
    HandlerChain chain = new HandlerChain();
    chain.addHandler(new HandlerA());
    chain.addHandler(new HandlerB());
    chain.handle();
  }
}

In fact, the implementation above is not elegant enough. The handle() function of each handler class contains not only its own business logic but also the call to the next handler, that is, successor.handle(). A programmer unfamiliar with this code structure may forget to call successor.handle() in handle() when adding a new handler class, which introduces a bug.

To address this, we refactor the code with the template method pattern: the logic that calls successor.handle() is pulled out of the concrete handler classes and into the abstract parent class, so each concrete handler class only needs to implement its own business logic. The refactored code is as follows:


public abstract class Handler {
  protected Handler successor = null;

  public void setSuccessor(Handler successor) {
    this.successor = successor;
  }

  // Template method: subclasses cannot override the chaining logic.
  public final void handle() {
    boolean handled = doHandle();
    if (successor != null && !handled) {
      successor.handle();
    }
  }

  // Returns true if this handler handled the request.
  protected abstract boolean doHandle();
}

public class HandlerA extends Handler {
  @Override
  protected boolean doHandle() {
    boolean handled = false;
    //...
    return handled;
  }
}

public class HandlerB extends Handler {
  @Override
  protected boolean doHandle() {
    boolean handled = false;
    //...
    return handled;
  }
}

// The HandlerChain and Application code is unchanged.

Now let's look at the second implementation, shown below. It is simpler. The HandlerChain class stores all handlers in an array-backed list (ArrayList) instead of a linked list, and the handle() function of HandlerChain calls the handle() function of each handler in turn.


import java.util.ArrayList;
import java.util.List;

public interface IHandler {
  // Returns true if this handler handled the request.
  boolean handle();
}

public class HandlerA implements IHandler {
  @Override
  public boolean handle() {
    boolean handled = false;
    //...
    return handled;
  }
}

public class HandlerB implements IHandler {
  @Override
  public boolean handle() {
    boolean handled = false;
    //...
    return handled;
  }
}

public class HandlerChain {
  private List<IHandler> handlers = new ArrayList<>();

  public void addHandler(IHandler handler) {
    this.handlers.add(handler);
  }

  public void handle() {
    for (IHandler handler : handlers) {
      boolean handled = handler.handle();
      // Stop at the first handler that handles the request.
      if (handled) {
        break;
      }
    }
  }
}

// Usage example
public class Application {
  public static void main(String[] args) {
    HandlerChain chain = new HandlerChain();
    chain.addHandler(new HandlerA());
    chain.addHandler(new HandlerB());
    chain.handle();
  }
}
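
The skeletons so far carry no request at all, which can make the early-exit behavior hard to picture. The following is a minimal sketch of the array-based implementation with a concrete request. The names RequestHandler, PrefixHandler, and RequestHandlerChain, the String request parameter, and the prefix-matching rule are all assumptions introduced for illustration; they are not part of the original skeleton.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical handler interface: unlike the skeleton, handle() receives the request.
interface RequestHandler {
  // Returns true if this handler dealt with the request.
  boolean handle(String request);
}

// Hypothetical handler that only deals with requests starting with a given prefix.
class PrefixHandler implements RequestHandler {
  private final String prefix;
  private final List<String> log;

  PrefixHandler(String prefix, List<String> log) {
    this.prefix = prefix;
    this.log = log;
  }

  @Override
  public boolean handle(String request) {
    if (request.startsWith(prefix)) {
      log.add(prefix + " handled " + request);
      return true;
    }
    return false; // not ours, let the chain try the next handler
  }
}

class RequestHandlerChain {
  private final List<RequestHandler> handlers = new ArrayList<>();

  void addHandler(RequestHandler handler) {
    handlers.add(handler);
  }

  // Returns true if some handler in the chain handled the request.
  boolean handle(String request) {
    for (RequestHandler handler : handlers) {
      if (handler.handle(request)) {
        return true; // stop at the first handler that handles the request
      }
    }
    return false;
  }
}
```

With handlers for the prefixes "A:" and "B:" added in that order, a request such as "B:ping" is rejected by the first handler and handled by the second, while a request with any other prefix falls off the end of the chain unhandled.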

In the definition given by GoF, once a handler in the chain can handle the request, the request is not passed any further. In practice there is also a variant of the pattern in which the request is processed by every handler, with no early termination. This variant likewise has two implementations, one that stores the handlers in a linked list and one that stores them in an array; they are similar to the two implementations above and need only slight modification.

I give only the linked-list version here, as shown below. You can write the array version yourself by adapting the implementation above.


public abstract class Handler {
  protected Handler successor = null;

  public void setSuccessor(Handler successor) {
    this.successor = successor;
  }

  // Every handler runs; the request is always passed on.
  public final void handle() {
    doHandle();
    if (successor != null) {
      successor.handle();
    }
  }

  protected abstract void doHandle();
}

public class HandlerA extends Handler {
  @Override
  protected void doHandle() {
    //...
  }
}

public class HandlerB extends Handler {
  @Override
  protected void doHandle() {
    //...
  }
}

public class HandlerChain {
  private Handler head = null;
  private Handler tail = null;

  public void addHandler(Handler handler) {
    handler.setSuccessor(null);

    if (head == null) {
      head = handler;
      tail = handler;
      return;
    }

    tail.setSuccessor(handler);
    tail = handler;
  }

  public void handle() {
    if (head != null) {
      head.handle();
    }
  }
}

// Usage example
public class Application {
  public static void main(String[] args) {
    HandlerChain chain = new HandlerChain();
    chain.addHandler(new HandlerA());
    chain.addHandler(new HandlerB());
    chain.handle();
  }
}
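
For reference, the array-based version of this variant might look like the sketch below. It is one possible adaptation of the second implementation rather than the author's own code: handle() returns void because no handler can stop the chain, and the loop has no break.

```java
import java.util.ArrayList;
import java.util.List;

// Same role as IHandler in the second implementation, but with no return value,
// because in this variant a handler cannot terminate the chain.
interface IHandler {
  void handle();
}

class HandlerChain {
  private final List<IHandler> handlers = new ArrayList<>();

  public void addHandler(IHandler handler) {
    this.handlers.add(handler);
  }

  // Calls every handler in the order it was added; there is no early exit.
  public void handle() {
    for (IHandler handler : handlers) {
      handler.handle();
    }
  }
}
```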

Application scenario of the chain of responsibility pattern

Now that we have covered the principle and implementation of the chain of responsibility pattern, let's use a practical example to see where it applies.

For applications that support UGC (User Generated Content), such as forums, the content users generate (such as forum posts) may contain sensitive words (pornographic, advertising, reactionary, and so on). In this scenario we can use the chain of responsibility pattern to filter these sensitive words.

There are two ways to deal with content that contains sensitive words: one is to refuse to publish it outright, and the other is to mask the sensitive words (for example, replace them with ***) and then publish. The first approach matches the GoF definition of the chain of responsibility pattern, and the second is the variant of the pattern.

Here we give a code example of the first approach only, as shown below. It is just a skeleton; the concrete sensitive-word filtering algorithms are not given.


import java.util.ArrayList;
import java.util.List;

public interface SensitiveWordFilter {
  // Returns true if the content passes this filter.
  boolean doFilter(Content content);
}

public class SexyWordFilter implements SensitiveWordFilter {
  @Override
  public boolean doFilter(Content content) {
    boolean legal = true;
    //...
    return legal;
  }
}

// The PoliticalWordFilter and AdsWordFilter classes have a code structure similar to SexyWordFilter.

public class SensitiveWordFilterChain {
  private List<SensitiveWordFilter> filters = new ArrayList<>();

  public void addFilter(SensitiveWordFilter filter) {
    this.filters.add(filter);
  }

  // return true if content doesn't contain sensitive words.
  public boolean filter(Content content) {
    for (SensitiveWordFilter filter : filters) {
      if (!filter.doFilter(content)) {
        return false;
      }
    }
    return true;
  }
}

public class ApplicationDemo {
  public static void main(String[] args) {
    SensitiveWordFilterChain filterChain = new SensitiveWordFilterChain();
    filterChain.addFilter(new AdsWordFilter());
    filterChain.addFilter(new SexyWordFilter());
    filterChain.addFilter(new PoliticalWordFilter());

    boolean legal = filterChain.filter(new Content());
    if (!legal) {
      // Do not publish
    } else {
      // Publish
    }
  }
}
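
The second approach mentioned earlier, masking sensitive words before publishing, is not shown in the original. A minimal sketch of it, under stated assumptions, might look like this. The Content class with getText()/setText(), the MaskingFilter interface, and the naive case-sensitive String.replace matching are all hypothetical choices made for illustration; a real framework would plug in a proper matching algorithm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal Content type; the original leaves Content undefined.
class Content {
  private String text;

  Content(String text) {
    this.text = text;
  }

  String getText() {
    return text;
  }

  void setText(String text) {
    this.text = text;
  }
}

// Hypothetical filter interface for the variant: filters modify the content
// instead of returning a verdict, so no filter can stop the chain.
interface MaskingFilter {
  void doFilter(Content content);
}

// Replaces every literal, case-sensitive occurrence of the given words with "***".
class WordListMaskingFilter implements MaskingFilter {
  private final List<String> words;

  WordListMaskingFilter(List<String> words) {
    this.words = words;
  }

  @Override
  public void doFilter(Content content) {
    String text = content.getText();
    for (String word : words) {
      text = text.replace(word, "***");
    }
    content.setText(text);
  }
}

class MaskingFilterChain {
  private final List<MaskingFilter> filters = new ArrayList<>();

  void addFilter(MaskingFilter filter) {
    filters.add(filter);
  }

  // Runs every filter in order; the content is always published afterwards.
  void filter(Content content) {
    for (MaskingFilter filter : filters) {
      filter.doFilter(content);
    }
  }
}
```

Because every filter must get a chance to mask its own words, this version follows the variant of the pattern in which the chain never terminates early.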

After reading the SensitiveWordFilterChain implementation, you might say: I could also implement sensitive-word filtering as follows, and the code would be simpler. Why must I use the chain of responsibility pattern? Isn't this over-design?


public class SensitiveWordFilter {
  // return true if content doesn't contain sensitive words.
  public boolean filter(Content content) {
    if (!filterSexyWord(content)) {
      return false;
    }

    if (!filterAdsWord(content)) {
      return false;
    }

    if (!filterPoliticalWord(content)) {
      return false;
    }

    return true;
  }

  private boolean filterSexyWord(Content content) {
    //...
    return true; // placeholder so the skeleton compiles; the real check is omitted
  }

  private boolean filterAdsWord(Content content) {
    //...
    return true; // placeholder
  }

  private boolean filterPoliticalWord(Content content) {
    //...
    return true; // placeholder
  }
}

As we have said many times, design patterns are applied mainly to cope with code complexity, to make code satisfy the open-closed principle, and to improve extensibility. The chain of responsibility pattern is no exception. In fact, we discussed a similar question when we covered the strategy pattern: why use the strategy pattern at all? The reasons given then are almost the same as the reasons for applying the chain of responsibility pattern here, so you can read the two explanations side by side.

First, let's see how the chain of responsibility pattern copes with code complexity.

Splitting large blocks of logic into functions and large classes into small classes is the common way to cope with code complexity. By applying the chain of responsibility pattern, we go one step further and split each sensitive-word filtering function out into its own class, which simplifies the SensitiveWordFilter class so that it does not grow too large or too complicated.

Second, let's see how the chain of responsibility pattern lets the code satisfy the open-closed principle and improves its extensibility.

Suppose we want to add a new filtering algorithm, for example one that filters special symbols. With the implementation that does not use the chain of responsibility pattern, we have to modify the code of SensitiveWordFilter, which violates the open-closed principle. Such changes are, however, fairly localized and therefore acceptable. The chain of responsibility implementation is more elegant: we only need to add a new Filter class and register it with the FilterChain through addFilter(). No other code needs to change.
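
As a rough illustration of that extension step, the sketch below adds a special-symbol filter without touching the interface or the chain. SpecialSymbolFilter, the minimal Content class, and the rule for which characters count as acceptable are assumptions made up for this example; the interface and chain are repeated only so the block compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal Content type; the original leaves Content undefined.
class Content {
  private final String text;

  Content(String text) {
    this.text = text;
  }

  String getText() {
    return text;
  }
}

// Framework code, unchanged from the skeleton.
interface SensitiveWordFilter {
  boolean doFilter(Content content);
}

class SensitiveWordFilterChain {
  private final List<SensitiveWordFilter> filters = new ArrayList<>();

  public void addFilter(SensitiveWordFilter filter) {
    this.filters.add(filter);
  }

  // return true if content passes every filter.
  public boolean filter(Content content) {
    for (SensitiveWordFilter filter : filters) {
      if (!filter.doFilter(content)) {
        return false;
      }
    }
    return true;
  }
}

// The new extension: one extra class, no change to the code above.
class SpecialSymbolFilter implements SensitiveWordFilter {
  // Assumed rule: only ASCII letters, digits, whitespace, and . , ! ? are allowed.
  @Override
  public boolean doFilter(Content content) {
    return content.getText().matches("[\\p{Alnum}\\s.,!?]*");
  }
}
```

The client then registers it with a single extra call such as filterChain.addFilter(new SpecialSymbolFilter()).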

However, you might object that even with the chain of responsibility pattern, adding a new filtering algorithm still requires modifying the client code (ApplicationDemo), so the open-closed principle is not fully satisfied.

Looking more closely, we can divide the code above into two categories: framework code and client code. ApplicationDemo is client code, that is, code that uses the framework. Everything other than ApplicationDemo is the code of the sensitive-word filtering framework.

Suppose the sensitive-word filtering framework is not developed and maintained by us but is a third-party framework we have brought in. We want to add a new filtering algorithm, but we cannot modify the framework's source code directly. In this situation the chain of responsibility pattern delivers what we described at the beginning: new functionality is added through the extension points the pattern provides, without modifying the framework source code. In other words, we have achieved the open-closed principle within the scope of the framework code.

In addition, compared with the implementation that does not use the pattern, the chain of responsibility pattern has another advantage: configuring the filtering algorithms is more flexible, because you can choose to use only some of them.
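
One way to picture this flexibility is to build the chain from a list of enabled filter names, for instance names read from a configuration file. The sketch below is only an assumption about how that could be done: FilterChainFactory, its register() and build() methods, and the minimal Content class are hypothetical and do not come from the original framework.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical minimal Content type; the original leaves Content undefined.
class Content {
  private final String text;

  Content(String text) {
    this.text = text;
  }

  String getText() {
    return text;
  }
}

interface SensitiveWordFilter {
  boolean doFilter(Content content);
}

class SensitiveWordFilterChain {
  private final List<SensitiveWordFilter> filters = new ArrayList<>();

  public void addFilter(SensitiveWordFilter filter) {
    this.filters.add(filter);
  }

  public boolean filter(Content content) {
    for (SensitiveWordFilter filter : filters) {
      if (!filter.doFilter(content)) {
        return false;
      }
    }
    return true;
  }
}

// Hypothetical helper: maps filter names to factories and assembles a chain
// that contains only the filters named in the configuration.
class FilterChainFactory {
  private final Map<String, Supplier<SensitiveWordFilter>> registry = new HashMap<>();

  void register(String name, Supplier<SensitiveWordFilter> supplier) {
    registry.put(name, supplier);
  }

  SensitiveWordFilterChain build(List<String> enabledNames) {
    SensitiveWordFilterChain chain = new SensitiveWordFilterChain();
    for (String name : enabledNames) {
      Supplier<SensitiveWordFilter> supplier = registry.get(name);
      if (supplier == null) {
        throw new IllegalArgumentException("Unknown filter: " + name);
      }
      chain.addFilter(supplier.get());
    }
    return chain;
  }
}
```

With this arrangement, enabling or disabling a filtering algorithm becomes a change to the list of names rather than a change to the filtering code.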