Summary of the Beauty of Design Patterns (Design Principles)

date: 2022-10-27 17:31:42
tags: Design pattern
categories: Design pattern




The previous article introduced object-oriented knowledge. This article introduces some classic design principles, including SOLID, KISS, YAGNI, DRY, and LOD.

1. Single Responsibility Principle (SRP)

1.1 How to understand the single responsibility principle?

In fact, SOLID is not a single principle but a collection of five design principles: the single responsibility principle, the open-closed principle, the Liskov substitution principle, the interface segregation principle, and the dependency inversion principle, corresponding in turn to the five letters S, O, L, I, and D.

The single responsibility principle is abbreviated as SRP, from its English name, Single Responsibility Principle. Its definition:

A class or module should have a single responsibility.

That is, a class or module should be responsible for completing only one responsibility (or function).

This principle describes two kinds of objects: classes (Class) and modules (Module). There are two ways to understand the relationship between these two concepts:

  • One understanding: a module is a more abstract concept than a class, and a class can also be regarded as a module
  • Another understanding: a module is a coarser-grained code block than a class; a module contains multiple classes, and multiple classes make up a module

Either way, the single responsibility principle means the same thing when applied to both. The discussion below looks only at how to apply this principle from the perspective of "class" design; extending it to "modules" is analogous and left to the reader.

The definition of the single responsibility principle is simple and not hard to understand: a class is responsible for completing only one responsibility or function. In other words, do not design large, all-encompassing classes; design fine-grained classes with a single function. Looked at from the other direction, if a class contains two or more unrelated business functions, its responsibility is not single enough, and it should be split into multiple finer-grained classes, each with a more focused function.

For example, suppose a class contains both order operations and user operations. Orders and users are two independent business domain models; putting two unrelated functions into the same class violates the single responsibility principle. To satisfy the principle, the class should be split into two finer-grained, single-purpose classes: an order class and a user class.
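A minimal sketch of that split; the class and method names (and the string return values, used only to keep the example verifiable) are illustrative, not from the original text:

```java
// Before: one class mixes two unrelated business domains (violates SRP).
class OrderAndUserService {
    String createOrder(long userId, double amount) { return "order:" + userId; }
    String updateUserEmail(long userId, String email) { return "user:" + userId; }
}

// After: two finer-grained classes, each with a single responsibility.
class OrderService {
    String createOrder(long userId, double amount) { return "order:" + userId; }
}

class UserService {
    String updateUserEmail(long userId, String email) { return "user:" + userId; }
}
```

After the split, order logic and user logic can change, be tested, and be reused independently of each other.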

1.2 How to judge whether the responsibility of the class is single enough?

From the example above, the single responsibility principle does not seem difficult to apply. That is because the example is rather extreme: it is obvious at a glance that orders have nothing to do with users. In most cases, though, it is not so easy to decide whether the methods in a class belong to the same type of function or to two unrelated ones. In real software development, judging whether a class has a single responsibility is hard. For example:

In a social product, use the following UserInfo class to record user information. Do you think the design of the UserInfo class satisfies the single responsibility principle?

public class UserInfo {
    private long userId;
    private String username;
    private String email;
    private String telephone;
    private long createTime;
    private long lastLoginTime;
    private String avatarUrl;
    private String provinceOfAddress; // province
    private String cityOfAddress;     // city
    private String regionOfAddress;   // district
    private String detailedAddress;   // detailed address
    // ... other properties and methods omitted ...
}

There are two different views on this question. One view is that the UserInfo class contains only information related to the user, all of its attributes and methods belong to the user business model, and it therefore satisfies the single responsibility principle. The other view is that address information accounts for a relatively large proportion of the UserInfo class and can be split out into an independent UserAddress class, with UserInfo keeping everything except the address; after the split, both classes have more focused responsibilities.

In fact, the choice cannot be made apart from the concrete application scenario. If, in this social product, the user's address information is only displayed like the other information, then the current design of UserInfo is reasonable. But if the product later does well and adds an e-commerce module, so that the address information is also used for e-commerce logistics, then it is best to separate the address information from UserInfo into an independent class for logistics information (or address information, shipping information, etc.).
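As a hedged illustration of what such a split might look like once address information becomes an independent domain concept (the class names, constructors, and `fullAddress()` helper are simplified inventions, not from the original text):

```java
// Address information split out of UserInfo so it can evolve
// independently (e.g. for e-commerce logistics).
class UserAddress {
    private final String province;
    private final String city;
    private final String detailedAddress;

    UserAddress(String province, String city, String detailedAddress) {
        this.province = province;
        this.city = city;
        this.detailedAddress = detailedAddress;
    }

    String fullAddress() {
        return province + " " + city + " " + detailedAddress;
    }
}

// UserInfo keeps the remaining user information and references
// the address object instead of embedding its fields.
class UserInfo {
    private final long userId;
    private final String username;
    private final UserAddress address;

    UserInfo(long userId, String username, UserAddress address) {
        this.userId = userId;
        this.username = username;
        this.address = address;
    }

    UserAddress getAddress() { return address; }
}
```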

Going a step further: if the company behind this social product grows and builds more and more other products (think of them as other apps), and wants a unified account system so that one account can log in to every product in the company, then UserInfo needs to be split again, extracting the information related to identity authentication (such as email and telephone) into an independent class.

From this we can conclude that whether the responsibility of the same class counts as single may differ across application scenarios and stages of requirements. In one scenario, or under the current requirements, a class's design may already satisfy the single responsibility principle; in another scenario, or under some future requirements, it may not, and the class will need to be split into finer-grained classes.

In addition, viewing the same class from different business levels leads to different judgments about whether its responsibility is single. Take the UserInfo class in the example: from the business level of "user", everything UserInfo contains belongs to the user, which satisfies the single responsibility principle; from the finer-grained business levels of "user display information", "address information", "login authentication information", and so on, UserInfo should be split further.

To sum up, there is no clear, quantifiable standard for judging whether a class's responsibility is single enough; it is quite subjective, and opinions differ from person to person. In real software development, there is no need to be overly aggressive or to over-design. You can first write a coarse-grained class that meets the business needs; as the business grows, if that class becomes too large and its code too long, you can then split it into several finer-grained classes. This is called continuous refactoring.

There are also some heuristics that can help you judge, indirectly, whether a class's responsibility is single enough. Personally, I find the following criteria more instructive and more actionable than subjectively pondering whether a class has a single responsibility:

  • The class has too many lines of code, functions, or attributes, which hurts the readability and maintainability of the code; consider splitting it
  • The class depends on too many other classes, or too many other classes depend on it, which does not fit the design idea of high cohesion and low coupling; consider splitting it
  • The class has too many private methods; consider whether those private methods can be moved into a new class and made public for more classes to use, improving code reusability
  • It is hard to give the class a fitting name: hard to summarize it with a business term, or only possible to name it with generic words such as Manager or Context; this suggests the class's responsibilities are not defined clearly enough
  • A large number of the class's methods concentrate on operating a few of its attributes; for example, in the UserInfo case, if half of the methods operate on address information, consider splitting those attributes and their corresponding methods out

At this point you may wonder: according to the criteria above, a class with too many lines of code, functions, or attributes may not satisfy the single responsibility principle. But how many lines count as too many? How many functions and attributes count as too many?

This question is not easy to answer quantitatively. A rough, quantifiable rule of thumb is that a class should not exceed 200 lines of code or about 10 functions and attributes. From another angle, there are clear warning signs: reading the class makes your head spin; you do not know which function to use to implement some feature, or cannot find the function you want; or using one small feature requires pulling in the whole class (which contains many functions unrelated to that feature). These all suggest that the class has too many lines, functions, and attributes.

1.3 Is it better to make a class's responsibility as single as possible?

To satisfy the single responsibility principle, is it better to split classes into ever smaller pieces? The answer is no. In the following example, the Serialization class implements serialization and deserialization for a simple protocol. The specific code is as follows:

import java.util.Collections;
import java.util.Map;

import com.google.gson.Gson;

public class Serialization {

    private static final String IDENTIFIER_STRING = "UEUEUE;";
    private Gson gson;

    public Serialization() {
        this.gson = new Gson();
    }

    public String serialize(Map<String, String> object) {
        StringBuilder textBuilder = new StringBuilder();
        textBuilder.append(IDENTIFIER_STRING);
        textBuilder.append(gson.toJson(object));
        return textBuilder.toString();
    }

    public Map<String, String> deserialize(String text) {
        if (!text.startsWith(IDENTIFIER_STRING)) {
            return Collections.emptyMap();
        }
        String gsonStr = text.substring(IDENTIFIER_STRING.length());
        return gson.fromJson(gsonStr, Map.class);
    }
}

To make the responsibilities even more single, you could further split Serialization into a Serializer class responsible only for serialization and a Deserializer class responsible only for deserialization. The code after the split is as follows:

public class Serializer {

    private static final String IDENTIFIER_STRING = "UEUEUE;";
    private Gson gson;

    public Serializer() {
        this.gson = new Gson();
    }

    public String serialize(Map<String, String> object) {
        StringBuilder textBuilder = new StringBuilder();
        textBuilder.append(IDENTIFIER_STRING);
        textBuilder.append(gson.toJson(object));
        return textBuilder.toString();
    }
}

public class Deserializer {

    private static final String IDENTIFIER_STRING = "UEUEUE;";
    private Gson gson;

    public Deserializer() {
        this.gson = new Gson();
    }

    public Map<String, String> deserialize(String text) {
        if (!text.startsWith(IDENTIFIER_STRING)) {
            return Collections.emptyMap();
        }
        String gsonStr = text.substring(IDENTIFIER_STRING.length());
        return gson.fromJson(gsonStr, Map.class);
    }
}

Although the split makes the responsibilities of the Serializer and Deserializer classes more single, it also introduces new problems. If the protocol format changes, say the data identifier changes from "UEUEUE" to "DFDFDF", or the serialization format changes from JSON to XML, then both the Serializer class and the Deserializer class have to be modified, and the cohesion of the code is clearly lower than in the original Serialization class. Moreover, if you modify the protocol in the Serializer class but forget to modify the Deserializer class, serialization and deserialization will no longer match and the program will misbehave. In other words, after the split, the maintainability of the code becomes worse.
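If you do decide the split is worthwhile, one possible way to soften this maintenance problem (not discussed in the original text) is to concentrate the shared protocol details in a single place that both classes depend on. A simplified sketch without the Gson dependency, where the Protocol class name is illustrative:

```java
// Shared protocol details concentrated in one class, so a protocol
// change touches a single place.
class Protocol {
    static final String IDENTIFIER_STRING = "UEUEUE;";
}

class Serializer {
    String serialize(String payloadJson) {
        return Protocol.IDENTIFIER_STRING + payloadJson;
    }
}

class Deserializer {
    String deserialize(String text) {
        if (!text.startsWith(Protocol.IDENTIFIER_STRING)) {
            return ""; // not our protocol
        }
        return text.substring(Protocol.IDENTIFIER_STRING.length());
    }
}
```

Changing the identifier now touches only Protocol. Even so, the serializer and deserializer still form one logical protocol, which is why keeping them together in a single Serialization class can be the more cohesive choice.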

In fact, whether you are applying design principles or design patterns, the ultimate goal is to improve the readability, extensibility, reusability, and maintainability of the code. When weighing whether applying a certain design principle is reasonable, this can serve as the final yardstick.

2. Open-Closed Principle (OCP)

The author personally feels that the open-closed principle is the hardest of the SOLID principles to understand and to master, yet also the most useful:

  • It is hard to understand because questions such as "What kinds of code changes count as 'extension'? What kinds count as 'modification'? How does a change satisfy or violate the open-closed principle? Does modifying code necessarily violate it?" are genuinely hard to answer
  • It is hard to master because questions such as "How do you achieve 'open for extension, closed for modification'? How do you apply the open-closed principle flexibly in a project without harming readability while pursuing extensibility?" are hard to put into practice
  • It is the most useful because extensibility is one of the most important measures of code quality; among the 23 classic design patterns, most exist to solve code extensibility problems, and the main design principle they follow is the open-closed principle

2.1 How to understand "open for extension, closed for modification"?

The open-closed principle is abbreviated as OCP, from its English name, Open Closed Principle. Its definition:

Software entities (modules, classes, functions, etc.) should be open for extension, but closed for modification.
Software entities (modules, classes, methods, etc.) should be "open for extension, closed for modification"

This description is fairly terse. Expressed in more detail: adding a new function should mean extending the existing code (adding modules, classes, methods, etc.) rather than modifying the existing code (changing existing modules, classes, methods, etc.). As an example, here is a piece of code that monitors API interfaces and raises alerts.

In it, AlertRule stores alert rules, which can be set freely. Notification is the alert notification class, supporting multiple channels such as email, SMS, WeChat, and phone. NotificationEmergencyLevel indicates the urgency of the notification: SEVERE (severe), URGENCY (urgent), NORMAL (normal), and TRIVIAL (trivial). Different urgency levels correspond to different notification channels.

public class Alert {

    private AlertRule rule;
    private Notification notification;

    public Alert(AlertRule rule, Notification notification) {
        this.rule = rule;
        this.notification = notification;
    }

    public void check(String api, long requestCount, long errorCount, long durationOfSeconds) {
        long tps = requestCount / durationOfSeconds;
        if (tps > rule.getMatchedRule(api).getMaxTps()) {
            notification.notify(NotificationEmergencyLevel.URGENCY, "...");
        }
        if (errorCount > rule.getMatchedRule(api).getMaxErrorCount()) {
            notification.notify(NotificationEmergencyLevel.SEVERE, "...");
        }
    }
}

The code above is straightforward, with the business logic concentrated in the check() function. When the TPS of an interface exceeds a preset maximum, or the number of request errors on an interface exceeds the maximum allowed, an alert is triggered to notify the owner or team responsible for that interface.

Now suppose we need to add a feature: when the number of timed-out interface requests per second exceeds a preset maximum threshold, an alert should also be triggered. How should the code change? There are two main changes: first, modify the parameter list of the check() function to add a new statistic, timeoutCount, the number of timed-out interface requests; second, add new alert logic inside check(). The changed code is as follows:

public class Alert {

    // ... AlertRule/Notification fields and constructor omitted ...

    // Change 1: add the timeoutCount parameter
    public void check(String api, long requestCount, long errorCount, long durationOfSeconds, long timeoutCount) {
        long tps = requestCount / durationOfSeconds;
        if (tps > rule.getMatchedRule(api).getMaxTps()) {
            notification.notify(NotificationEmergencyLevel.URGENCY, "...");
        }
        if (errorCount > rule.getMatchedRule(api).getMaxErrorCount()) {
            notification.notify(NotificationEmergencyLevel.SEVERE, "...");
        }
        // Change 2: add interface-timeout handling logic
        long timeoutTps = timeoutCount / durationOfSeconds;
        if (timeoutTps > rule.getMatchedRule(api).getMaxTimeoutTps()) {
            notification.notify(NotificationEmergencyLevel.URGENCY, "...");
        }
    }
}

There are quite a few problems with modifying the code this way. On the one hand, the interface itself changed, so every caller of the interface has to be changed accordingly. On the other hand, since the logic of the check() function changed, the corresponding unit tests have to be modified as well.

The code changes above implement the new feature through "modification". Following the open-closed principle, that is, "open for extension, closed for modification", how can the same feature be implemented through "extension" instead?

First, refactor the previous Alert code to make it more scalable. The refactoring content mainly includes two parts:

  1. Encapsulate the multiple input parameters of the check() function into an ApiStatInfo class;
  2. Introduce the concept of a handler and move each piece of if-judgment logic into its own handler

The specific code implementation is as follows:

public class Alert {

    private List<AlertHandler> alertHandlers = new ArrayList<>();

    public void addAlertHandler(AlertHandler alertHandler) {
        this.alertHandlers.add(alertHandler);
    }

    public void check(ApiStatInfo apiStatInfo) {
        for (AlertHandler handler : alertHandlers) {
            handler.check(apiStatInfo);
        }
    }
}

public class ApiStatInfo {
    // constructor/getter/setter methods omitted
    private String api;
    private long requestCount;
    private long errorCount;
    private long durationOfSeconds;
}

public abstract class AlertHandler {

    protected AlertRule rule;
    protected Notification notification;

    public AlertHandler(AlertRule rule, Notification notification) {
        this.rule = rule;
        this.notification = notification;
    }

    public abstract void check(ApiStatInfo apiStatInfo);
}

public class TpsAlertHandler extends AlertHandler {

    public TpsAlertHandler(AlertRule rule, Notification notification) {
        super(rule, notification);
    }

    @Override
    public void check(ApiStatInfo apiStatInfo) {
        long tps = apiStatInfo.getRequestCount() / apiStatInfo.getDurationOfSeconds();
        if (tps > rule.getMatchedRule(apiStatInfo.getApi()).getMaxTps()) {
            notification.notify(NotificationEmergencyLevel.URGENCY, "...");
        }
    }
}

public class ErrorAlertHandler extends AlertHandler {

    public ErrorAlertHandler(AlertRule rule, Notification notification) {
        super(rule, notification);
    }

    @Override
    public void check(ApiStatInfo apiStatInfo) {
        if (apiStatInfo.getErrorCount() > rule.getMatchedRule(apiStatInfo.getApi()).getMaxErrorCount()) {
            notification.notify(NotificationEmergencyLevel.SEVERE, "...");
        }
    }
}

The code above is the refactoring of Alert. How is the refactored Alert used? As shown below, ApplicationContext is a singleton class responsible for creating, assembling (injecting alertRule and notification into), and initializing (adding handlers to) the Alert object.

public class ApplicationContext {

    private AlertRule alertRule;
    private Notification notification;
    private Alert alert;

    public void initializeBeans() {
        alertRule = new AlertRule(/* params omitted */); // some initialization code omitted
        notification = new Notification(/* params omitted */); // some initialization code omitted
        alert = new Alert();
        alert.addAlertHandler(new TpsAlertHandler(alertRule, notification));
        alert.addAlertHandler(new ErrorAlertHandler(alertRule, notification));
    }

    public Alert getAlert() {
        return alert;
    }

    // eagerly initialized singleton
    private static final ApplicationContext instance = new ApplicationContext();

    private ApplicationContext() {
        initializeBeans();
    }

    public static ApplicationContext getInstance() {
        return instance;
    }
}

public class Demo {

    public static void main(String[] args) {
        ApiStatInfo apiStatInfo = new ApiStatInfo();
        // ... code that sets apiStatInfo values omitted ...
        ApplicationContext.getInstance().getAlert().check(apiStatInfo);
    }
}

Now look again: based on the refactored code, how would we add the feature mentioned earlier, raising an alert when the number of timed-out interface requests per second exceeds a maximum threshold? The main changes are as follows:

  1. Add a new timeoutCount property to the ApiStatInfo class
  2. Add a new TimeoutAlertHandler class
  3. In the initializeBeans() method of the ApplicationContext class, register a TimeoutAlertHandler with the alert object
  4. When using the Alert class, set the timeoutCount value on the apiStatInfo object passed into the check() function

public class Alert {
    // unchanged ...
}

public class ApiStatInfo {
    // constructor/getter/setter methods omitted
    private String api;
    private long requestCount;
    private long errorCount;
    private long durationOfSeconds;
    private long timeoutCount; // Change 1: add new field
}

public abstract class AlertHandler {
    // unchanged ...
}

public class TpsAlertHandler extends AlertHandler {
    // unchanged ...
}

public class ErrorAlertHandler extends AlertHandler {
    // unchanged ...
}

// Change 2: add a new handler
public class TimeoutAlertHandler extends AlertHandler {
    // code omitted ...
}

public class ApplicationContext {

    private AlertRule alertRule;
    private Notification notification;
    private Alert alert;

    public void initializeBeans() {
        alertRule = new AlertRule(/* params omitted */); // some initialization code omitted
        notification = new Notification(/* params omitted */); // some initialization code omitted
        alert = new Alert();
        alert.addAlertHandler(new TpsAlertHandler(alertRule, notification));
        alert.addAlertHandler(new ErrorAlertHandler(alertRule, notification));
        // Change 3: register the new handler
        alert.addAlertHandler(new TimeoutAlertHandler(alertRule, notification));
    }

    // ... other unchanged code omitted ...
}

public class Demo {

    public static void main(String[] args) {
        ApiStatInfo apiStatInfo = new ApiStatInfo();
        // ... code that sets the other apiStatInfo fields omitted ...
        apiStatInfo.setTimeoutCount(289); // Change 4: set the timeoutCount value
        ApplicationContext.getInstance().getAlert().check(apiStatInfo);
    }
}

The refactored code is more flexible and extensible. To add new alert logic, you only need to create a new handler class, based on extension, without changing the logic of the original check() function. Moreover, you only need to add unit tests for the new handler class; the existing unit tests keep passing and need no changes.

2.2 Does modifying the code mean violating the principle of opening and closing?

Having read the refactored code above, you may still have doubts: when adding the new alert logic, change 2 (adding a new handler class) is done by extension rather than modification, but changes 1, 3, and 4 look like modification rather than extension. Don't changes 1, 3, and 4 violate the open-closed principle?

1. Change 1: Add a new attribute timeoutCount to the ApiStatInfo class

In fact, not only is a property added to the ApiStatInfo class; the corresponding getter/setter methods are added as well. So the question becomes: does adding new properties and methods to a class count as "modification" or as "extension"?

Recall the definition of the open-closed principle: software entities (modules, classes, methods, etc.) should be "open for extension, closed for modification". As the definition shows, the principle can be applied at different code granularities: a module, a class, or a method (and its properties). The same code change can be regarded as a "modification" at a coarse granularity and as an "extension" at a fine granularity. In change 1, adding properties and methods amounts to modifying the class, so at the class level this change counts as "modification"; but the change does not alter any existing properties or methods, so at the level of methods (and their properties) it counts as "extension".

In fact, there is no need to obsess over whether a given change is "modification" or "extension", let alone over whether it violates the open-closed principle. Going back to the original intent of this principle: as long as a change does not break the original code's normal operation or its existing unit tests, it is fair to call it a qualified code change.

2. Changes 3 and 4: in the initializeBeans() method, register a new timeoutAlertHandler with the alert object; when using the Alert class, set the timeoutCount value on the apiStatInfo object passed into the check() function

These two changes happen inside methods. At whatever level you look (module, class, or method), they cannot be counted as "extension"; they are out-and-out "modification". However, some modification is unavoidable and acceptable. Why?

In the refactored Alert code, the core logic is concentrated in the Alert class and its handlers. When adding new alert logic, the Alert class needs no changes at all; only a new handler class is added by extension. If the Alert class together with its handler classes is regarded as a "module", then the module itself fully satisfies the open-closed principle when a new feature is added.

Moreover, it is impossible to add a new feature without "modifying" the code of any module, class, or method at all. Classes need to be created, assembled, and initialized before they form a runnable program, and changes to that part of the code are unavoidable. What we should do is keep the modifications as concentrated, as few, and as high-level as possible, and make the core, most complex parts of the logic satisfy the open-closed principle.

2.3 How to achieve "open for extension, closed for modification"?

In the example just now, the open-closed principle is supported by introducing a set of handlers. Without much experience designing and developing complex code, you might not come up with such a design on your own; it relies on theoretical knowledge and practical experience that must be learned and accumulated over time.

In fact, the open-closed principle is about the extensibility of code, and it is the "gold standard" for judging whether a piece of code is easy to extend. If a piece of code can stay "open for extension, closed for modification" in the face of future requirement changes, it has good extensibility. So asking how to achieve "open for extension, closed for modification" is roughly equivalent to asking how to write extensible code.

Before getting to concrete methodology, consider some higher-level guiding ideas. To write code with good extensibility, you must constantly maintain an awareness of extension, abstraction, and encapsulation. These "instincts" may matter more than any particular development skill.

After writing code, spend extra time thinking about which requirements may change in the future, how to design the code structure accordingly, and which extension points to reserve in advance, so that when requirements do change, new code can be flexibly plugged into the extension points with minimal changes and without touching the overall structure. That is how to stay "open for extension, closed for modification".

Also, once you have identified the variable and invariant parts of the code, encapsulate the variable parts to isolate change, and expose an abstract, stable interface to upper-layer systems. When a concrete implementation changes, you only need to extend a new implementation based on the same abstract interface and swap out the old one; the code of the upstream systems barely needs to change.

Having covered the high-level guiding ideas for realizing the open-closed principle, let's look at some more concrete methodology that supports it.

As mentioned earlier, extensibility is one of the most important criteria for evaluating code quality. In fact, many design principles, design ideas, and design patterns aim at improving extensibility. In particular, most of the 23 classic design patterns were distilled to solve code extensibility problems, and the main design principle they follow is the open-closed principle.

Among the many design principles, ideas, and patterns, the most commonly used ways to improve extensibility are: polymorphism, dependency injection, programming to interfaces rather than implementations, and most of the design patterns (such as Decorator, Strategy, Template Method, Chain of Responsibility, and State). Next, we focus on how to use polymorphism, dependency injection, and programming to interfaces rather than implementations to achieve "open for extension, closed for modification".

In fact, polymorphism, dependency injection, programming to interfaces rather than implementations, and the abstraction awareness mentioned earlier all express the same design idea from different angles and levels. This also reflects the point that "many design principles, ideas, and patterns are interconnected".

In the following example, the code sends asynchronous messages through Kafka. To develop such a function well, learn to abstract it into a set of asynchronous messaging interfaces that are independent of any specific message queue (here, Kafka). All upper-level systems program against this abstract interface and call it through dependency injection. When switching to a new message queue, for example replacing Kafka with RocketMQ, you can simply unplug the old message-queue implementation and plug in the new one. The code is as follows:

// This part reflects abstraction awareness
public interface MessageQueue {
    //...
}
public class KafkaMessageQueue implements MessageQueue {
    //...
}
public class RocketMQMessageQueue implements MessageQueue {
    //...
}

public interface MessageFormatter {
    //...
}
public class JsonMessageFormatter implements MessageFormatter {
    //...
}
public class ProtoBufMessageFormatter implements MessageFormatter {
    //...
}

public class Demo {
    private MessageQueue msgQueue; // programming to an interface, not an implementation

    public Demo(MessageQueue msgQueue) { // dependency injection
        this.msgQueue = msgQueue;
    }

    // msgFormatter: polymorphism, dependency injection
    public void sendNotification(Notification notification, MessageFormatter msgFormatter) {
        //...
    }
}
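As a minimal, self-contained sketch of this plug-and-swap idea (the `send()` method, the receipt strings, and the `NotificationSender` class are invented for illustration and are not the article's API):

```java
// Illustrative sketch: send() and NotificationSender are assumptions, not the article's API.
interface MessageQueue {
    String send(String message); // returns a receipt string so the effect is observable
}

class KafkaMessageQueue implements MessageQueue {
    @Override
    public String send(String message) {
        return "kafka:" + message; // stand-in for a real Kafka producer call
    }
}

class RocketMQMessageQueue implements MessageQueue {
    @Override
    public String send(String message) {
        return "rocketmq:" + message; // stand-in for a real RocketMQ producer call
    }
}

class NotificationSender {
    private final MessageQueue msgQueue; // programming to an interface, not an implementation

    NotificationSender(MessageQueue msgQueue) { // dependency injection
        this.msgQueue = msgQueue;
    }

    String sendNotification(String text) {
        return msgQueue.send(text); // unchanged no matter which queue is plugged in
    }
}
```

Switching queues is then a one-line change at the composition root: pass `new RocketMQMessageQueue()` instead of `new KafkaMessageQueue()` to the constructor; `NotificationSender` itself stays closed for modification.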

2.4 How to flexibly apply the principle of opening and closing in projects?

The key to writing code that supports "open for extension, closed for modification" is to reserve extension points. The question then is how to identify all possible extension points?

If you are developing a business system, such as a finance, e-commerce, or logistics system, identifying as many extension points as possible requires a sufficient understanding of the business: you must know which needs it must support now and which it may need to support in the future. If you are developing a business-independent, general-purpose, low-level system such as a framework, component, or class library, you need to answer questions like: how will it be used? What functions do you plan to add later? What further functional requirements might users have?

However, as the saying goes, "the only constant is change itself". Even with sufficient understanding of the business and the system, it is impossible to identify every extension point. And even if you could identify them all, reserving extension points everywhere would cost more than it is worth. There is no need to invest up front and over-design for remote, uncertain requirements

The most reasonable approach is this: for changes that are fairly likely in the short term, or whose requirements would significantly affect the code structure, or whose extension points are cheap to implement, design for extensibility up front when writing the code. For requirements that may never materialize, or extension points that would be complicated to build, wait until the need actually arises and then support them by refactoring the code

Also, the open-closed principle doesn't come for free. In some cases, extensibility conflicts with readability, as in the earlier Alert example: to improve extensibility, the code was refactored, and the refactored version is considerably more complex and harder to understand than the original. Often there is a trade-off between extensibility and readability. In scenarios where extensibility matters most, some readability can be sacrificed; where readability matters more, some extensibility can be sacrificed instead

In the Alert example, if there are only a few simple alert rules, the check() function contains only a few if statements, its logic is simple, and its line count is small; the original, straightforward implementation is then easy to read and the more reasonable choice. Conversely, if the alert rules are numerous and complex, check() accumulates many complicated if branches, the line count grows, and readability and maintainability suffer; in that case the refactored implementation is the more reasonable choice
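To make the contrast concrete, here is a rough, reconstructed sketch of the refactored Alert design (class names, fields, and thresholds such as `ApiStatInfo` and `TpsAlertHandler` are illustrative, not the exact code of the previous article). Each rule lives in its own handler, so adding a rule means adding and registering one handler class; the check() loop itself never changes:

```java
import java.util.ArrayList;
import java.util.List;

// Reconstructed sketch of the extensible Alert design; names are illustrative.
class ApiStatInfo {
    long requestCount;
    long errorCount;
    long durationOfSeconds;
}

interface AlertHandler {
    List<String> check(ApiStatInfo info); // returns the alert messages this rule triggers
}

class TpsAlertHandler implements AlertHandler {
    private final long maxTps;
    TpsAlertHandler(long maxTps) { this.maxTps = maxTps; }
    @Override
    public List<String> check(ApiStatInfo info) {
        List<String> alerts = new ArrayList<>();
        if (info.requestCount / info.durationOfSeconds > maxTps) alerts.add("TPS too high");
        return alerts;
    }
}

class ErrorAlertHandler implements AlertHandler {
    private final long maxErrors;
    ErrorAlertHandler(long maxErrors) { this.maxErrors = maxErrors; }
    @Override
    public List<String> check(ApiStatInfo info) {
        List<String> alerts = new ArrayList<>();
        if (info.errorCount > maxErrors) alerts.add("too many errors");
        return alerts;
    }
}

class Alert {
    private final List<AlertHandler> handlers = new ArrayList<>();
    void addAlertHandler(AlertHandler handler) { handlers.add(handler); } // extension point
    List<String> check(ApiStatInfo info) {
        List<String> alerts = new ArrayList<>();
        for (AlertHandler h : handlers) alerts.addAll(h.check(info)); // never modified for new rules
        return alerts;
    }
}
```

A new alert rule is supported by writing one more `AlertHandler` implementation and registering it, without touching `Alert.check()`.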

3. Liskov Substitution Principle (LSP)

3.1 How to understand the "Liskov substitution principle"?

The Liskov Substitution Principle is abbreviated as LSP. It was first proposed by Barbara Liskov in 1987, and is commonly stated as follows:

If S is a subtype of T, then objects of type T may be replaced with objects of type S, without breaking the program.

In 1996, Robert Martin re-described this principle in his SOLID principles, the original text is as follows:

Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.

Combining the two descriptions and putting it plainly: objects of a subtype (derived class) should be able to replace objects of the base (parent) class anywhere in the program, without changing the program's original logical behavior or breaking its correctness

In the following example, the parent class Transporter uses the HttpClient class from the org.apache.http library to transmit network data. The subclass SecurityTransporter extends Transporter and adds support for transmitting the appId and appToken security-authentication information

public class Transporter {
    private HttpClient httpClient;

    public Transporter(HttpClient httpClient) {
        this.httpClient = httpClient;
    }

    public Response sendRequest(Request request) {
        // ...use httpClient to send request
    }
}

public class SecurityTransporter extends Transporter {
    private String appId;
    private String appToken;

    public SecurityTransporter(HttpClient httpClient, String appId, String appToken) {
        super(httpClient);
        this.appId = appId;
        this.appToken = appToken;
    }

    @Override
    public Response sendRequest(Request request) {
        if (StringUtils.isNotBlank(appId) && StringUtils.isNotBlank(appToken)) {
            request.addPayload("app-id", appId);
            request.addPayload("app-token", appToken);
        }
        return super.sendRequest(request);
    }
}

public class Demo {
    public void demoFunction(Transporter transporter) {
        Request request = new Request();
        //... code that sets the data in the request is omitted...
        Response response = transporter.sendRequest(request);
        //... other logic omitted...
    }
}

// Liskov substitution
Demo demo = new Demo();
demo.demoFunction(new SecurityTransporter(/* arguments omitted */));

In the above code, the design of the subclass SecurityTransporter fully conforms to the Liskov substitution principle: it can replace the parent class anywhere it appears, and the logical behavior of the original code remains unchanged and its correctness intact.

Looked at this way, isn't this code design just a simple use of object-oriented polymorphism? Are polymorphism and the Liskov substitution principle the same thing? Judging from the example and the definitions, they do look somewhat alike, but they are in fact different things. Why?

Suppose sendRequest() needs a slight modification. Before the change, if appId or appToken is not set, no verification is performed; after the change, if appId or appToken is not set, a NoAuthorizationRuntimeException is thrown directly. The before-and-after comparison is as follows:

// Before the change:
public class SecurityTransporter extends Transporter {
    //... other code omitted...
    @Override
    public Response sendRequest(Request request) {
        if (StringUtils.isNotBlank(appId) && StringUtils.isNotBlank(appToken)) {
            request.addPayload("app-id", appId);
            request.addPayload("app-token", appToken);
        }
        return super.sendRequest(request);
    }
}

// After the change:
public class SecurityTransporter extends Transporter {
    //... other code omitted...
    @Override
    public Response sendRequest(Request request) {
        if (StringUtils.isBlank(appId) || StringUtils.isBlank(appToken)) {
            throw new NoAuthorizationRuntimeException(...);
        }
        request.addPayload("app-id", appId);
        request.addPayload("app-token", appToken);
        return super.sendRequest(request);
    }
}

With the modified code, if a parent-class Transporter object is passed into demoFunction(), no exception is thrown; but if a subclass SecurityTransporter object is passed in, an exception may be thrown. Although the exception is a runtime exception that the code is not required to catch explicitly, substituting the subclass for the parent class changes the logical behavior of the whole program

Although the modified code can still use Java's polymorphism to dynamically substitute the subclass SecurityTransporter for the parent class Transporter, and it compiles and runs without error, from a design standpoint SecurityTransporter no longer conforms to the Liskov substitution principle

Although polymorphism and Liskov substitution look similar in definition and code form, they focus on different things. Polymorphism is a major feature of object-oriented programming and a syntax mechanism of object-oriented languages; it is an implementation technique. Liskov substitution is a design principle that guides how subclasses should be designed within an inheritance hierarchy: a subclass must be able to replace its parent class without changing the original program's logic or breaking its correctness

3.2 Which codes clearly violate the LSP?

In fact, there is another formulation of the Liskov substitution principle that is more practical and instructive: "Design by Contract."

Although that sounds abstract, it can be interpreted further as: when designing a subclass, it must abide by the parent class's behavioral contract. The parent class defines the behavioral contract of a function; the subclass may change the function's internal implementation logic, but not its original behavioral contract. The behavioral contract here includes: the functionality the function declares it will provide; the conventions on inputs, outputs, and exceptions; and even any special instructions listed in the comments. The parent-child relationship in the definition can also be replaced by the relationship between an interface and its implementation classes. Here are a few examples of violations of the Liskov substitution principle:

1. The subclass violates the functionality declared by the parent class

The parent class provides an order-sorting function sortOrdersByAmount() that sorts orders by amount from smallest to largest, while the subclass overrides sortOrdersByAmount() to sort orders by creation date instead. This subclass design violates the Liskov substitution principle
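A minimal sketch of this violation (the `Order` fields and the `OrderService` classes are invented for illustration): the parent's contract is "sorted by amount, ascending", and the subclass silently changes it to "sorted by creation date", which breaks any caller relying on the parent's contract even though the code compiles and runs:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative classes; field names are assumptions, not from the original article.
class Order {
    final long amount;
    final long createTime;
    Order(long amount, long createTime) { this.amount = amount; this.createTime = createTime; }
}

class OrderService {
    // Contract: returns orders sorted by amount, smallest first.
    List<Order> sortOrdersByAmount(List<Order> orders) {
        List<Order> sorted = new ArrayList<>(orders);
        sorted.sort(Comparator.comparingLong(o -> o.amount));
        return sorted;
    }
}

class BadOrderService extends OrderService {
    // LSP violation: same signature, but the behavioral contract silently
    // changes from "sorted by amount" to "sorted by creation date".
    @Override
    List<Order> sortOrdersByAmount(List<Order> orders) {
        List<Order> sorted = new ArrayList<>(orders);
        sorted.sort(Comparator.comparingLong(o -> o.createTime));
        return sorted;
    }
}
```

Substituting `BadOrderService` where an `OrderService` is expected changes the observable ordering, so callers that depend on the parent's contract break.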

2. The subclass violates the parent class's contract on inputs, outputs, and exceptions

Suppose the parent class's contract for a function is: return null when the operation fails, and return an empty collection when no data is found. After the subclass overrides the function, the implementation changes: it throws an exception on failure and returns null when no data is found. This subclass design violates the Liskov substitution principle

Suppose the parent class's contract says a function accepts any integer as input, but the subclass's implementation only allows positive integers and throws an exception for negatives; that is, the subclass validates input more strictly than the parent class. This subclass design violates the Liskov substitution principle

Suppose the parent class's contract says a function will only throw ArgumentNullException; then the subclass's design and implementation may likewise only throw ArgumentNullException. Throwing any other exception makes the subclass violate the Liskov substitution principle.

3. The subclass violates special instructions listed in the parent class's comments

The comment on the withdraw() function defined in the parent class says: "the user's withdrawal amount must not exceed the account balance...". If the subclass overrides withdraw() to support overdrafts for VIP accounts, that is, to allow the withdrawal amount to exceed the account balance, this subclass design does not conform to the Liskov substitution principle

These are three typical ways to violate the Liskov substitution principle. There is also a trick for judging whether a subclass's design and implementation violates it: run the parent class's unit tests against the subclass's code. If some of those tests fail, the subclass probably does not fully honor the parent class's contract and may violate the Liskov substitution principle
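A small illustration of this trick (`KeyValueStore` and its "return null when the key is absent" contract are invented for this example): the contract test is written against the base type, so the very same test can be rerun against every subclass, and a subclass that changes the agreed failure behavior fails it:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative classes; the contract is an assumption made for this sketch.
class KeyValueStore {
    private final Map<String, String> data = new HashMap<>();
    void put(String key, String value) { data.put(key, value); }
    // Contract: returns null (does not throw) when the key is absent.
    String get(String key) { return data.get(key); }
}

class StrictKeyValueStore extends KeyValueStore {
    // LSP violation: changes the agreed failure behavior from "return null" to "throw".
    @Override
    String get(String key) {
        String v = super.get(key);
        if (v == null) throw new IllegalStateException("missing key: " + key);
        return v;
    }
}

class KeyValueStoreContractTest {
    // Written against the base type, so the same test runs on any subclass.
    static boolean passes(KeyValueStore store) {
        try {
            return store.get("no-such-key") == null;
        } catch (RuntimeException e) {
            return false; // throwing where the contract says "return null" is a violation
        }
    }
}
```

The parent passes its own contract test, while the subclass fails it, which is exactly the signal that the subclass breaks Liskov substitution.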

In practice, the Liskov substitution principle is quite loose; in general, code is unlikely to violate it

4. Interface Segregation Principle (ISP)

4.1 How to understand the "interface segregation principle"?

The English translation of the interface segregation principle is "Interface Segregation Principle", abbreviated as ISP. Robert Martin defines it this way in the SOLID principles:

Clients should not be forced to depend upon interfaces that they do not use.

The "client" can be understood as the caller or user of the interface

In fact, the term "interface" is used in many contexts. In software development, it can mean a set of abstract conventions, the API between systems, or the interface construct in an object-oriented programming language. The key to understanding the interface segregation principle is to understand the word "interface". In this principle, "interface" can be understood as any of the following three things:

  • A collection of API interfaces
  • A single API interface or function
  • Interface concept in OOP

4.1.1 Understand "interface" as a set of API interfaces

In the following example, the microservice user system provides a set of user-related APIs for other systems to use, such as registration, login, and user information acquisition. The specific code is as follows:

public interface UserService {
    boolean register(String cellphone, String password);
    boolean login(String cellphone, String password);
    UserInfo getUserInfoById(long id);
    UserInfo getUserInfoByCellphone(String cellphone);
}
public class UserServiceImpl implements UserService {
    //...
}

Now, the back-office management system needs to implement a delete-user function and hopes the user system will provide a delete-user interface. What should be done? You might say: just add a new deleteUserByCellphone() or deleteUserById() interface to UserService. That would solve the problem, but it also hides a security risk

Deleting a user is a very sensitive operation that should only be performed by the back-office management system, so this interface should be limited to that system. If it is placed in UserService, every system that uses UserService can call it, and unrestricted calls from other business systems could lead to users being deleted by mistake

Of course, the best solution is to restrict interface calls at the architecture level through interface authentication. But if no authentication framework is available, you can still guard against misuse at the code-design level. Following the interface segregation principle, callers should not be forced to depend on interfaces they do not need: put the delete interfaces into a separate interface, RestrictedUserService, and provide RestrictedUserService only to the back-office management system. The code is as follows:

public interface UserService {
    boolean register(String cellphone, String password);
    boolean login(String cellphone, String password);
    UserInfo getUserInfoById(long id);
    UserInfo getUserInfoByCellphone(String cellphone);
}
public interface RestrictedUserService {
    boolean deleteUserByCellphone(String cellphone);
    boolean deleteUserById(long id);
}
public class UserServiceImpl implements UserService, RestrictedUserService {
    // ... implementation omitted...
}

In the example just now, the "interface" in the interface segregation principle is understood as a set of interfaces, such as the interface of a microservice or of a class library. When designing the interface of a microservice or class library, if some of its interfaces are used by only some callers, isolate that part and provide it to those callers alone, instead of forcing the other callers to depend on interfaces they will never use
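A trimmed-down sketch of how this split plays out at the call sites (`BusinessSystem` and `AdminSystem` are invented client classes, and the method bodies are stubs): both clients receive the same `UserServiceImpl` object, but each is typed against only the interface it needs, so an ordinary business system cannot even compile a call to `deleteUserById()`:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Trimmed-down interfaces; client classes below are invented for illustration.
interface UserService {
    boolean login(String cellphone, String password);
}

interface RestrictedUserService {
    boolean deleteUserById(long id);
}

class UserServiceImpl implements UserService, RestrictedUserService {
    private final Set<Long> users = new HashSet<>(Arrays.asList(1L, 2L));

    @Override
    public boolean login(String cellphone, String password) {
        return true; // stub: real credential checking omitted
    }

    @Override
    public boolean deleteUserById(long id) {
        return users.remove(id); // true only if the user existed
    }
}

// Ordinary business system: typed to UserService, so deleteUserById() is invisible to it.
class BusinessSystem {
    private final UserService userService;
    BusinessSystem(UserService userService) { this.userService = userService; }
    boolean loginUser(String cellphone, String password) {
        return userService.login(cellphone, password);
    }
}

// Back-office admin system: the only client handed the restricted interface.
class AdminSystem {
    private final RestrictedUserService restrictedService;
    AdminSystem(RestrictedUserService restrictedService) { this.restrictedService = restrictedService; }
    boolean removeUser(long id) {
        return restrictedService.deleteUserById(id);
    }
}
```

The isolation is enforced by the type system rather than by discipline: handing `BusinessSystem` the implementation object as a `UserService` is enough to keep the delete operation out of its reach.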

4.1.2 Understanding "interface" as a single API interface or function

Now think of an interface as a single API or function (simply called a "function" below). The interface segregation principle can then be understood as: a function's design should be single-purpose; do not implement several unrelated pieces of functional logic in one function. For example:

public class Statistics {
    private Long max;
    private Long min;
    private Long average;
    private Long sum;
    private Long percentile99;
    private Long percentile999;
    //... constructor/getter/setter methods omitted...
}
public Statistics count(Collection<Long> dataSet) {
    Statistics statistics = new Statistics();
    //... calculation logic omitted...
    return statistics;
}

In the above code, the responsibility of count() is not single enough: it bundles many different statistics, such as the maximum, minimum, and average. Following the interface segregation principle, count() should be split into several finer-grained functions, each responsible for one independent statistic. The code after splitting is as follows:

public Long max(Collection<Long> dataSet) {
    //...
}
public Long min(Collection<Long> dataSet) {
    //...
}
public Long average(Collection<Long> dataSet) {
    //...
}
// ... other statistics functions omitted...

However, in a sense, count() could also be regarded as having a single responsibility; after all, everything it does is statistics-related. As mentioned when discussing the single responsibility principle, judging whether a function's responsibility is single is not only somewhat subjective but also depends on the specific scenario

If every statistics requirement in the project involves all the information defined in Statistics, then the design of count() is reasonable. Conversely, if each requirement involves only part of it, for example one needs only max, min, and average, and another only average and sum, then count() computes every statistic each time and does a great deal of useless work, which inevitably hurts performance, especially when the amount of data is large. In that application scenario, the design of count() is somewhat unreasonable, and it should be split into finer-grained statistics functions following the second design idea.

Here we can see that the interface segregation principle somewhat resembles the single responsibility principle, but with differences. The single responsibility principle targets the design of modules, classes, and interfaces. The interface segregation principle focuses more on interface design, and it thinks from a different perspective: it provides a criterion for judging whether an interface's responsibility is single, indirectly, through how callers use the interface. If a caller uses only part of an interface, or part of an interface's functionality, then the interface's design is not single-purpose enough

4.1.3 Understand "interface" as the interface concept in OOP

Understand "interface" as the interface concept in OOP, such as interface in Java, as in the following example:

Assume that three external systems are used in the project: Redis, MySQL, and Kafka. Each system corresponds to a series of configuration information, such as address, port, access timeout, etc. In order to store these configuration information in memory for use by other modules in the project, three Configuration classes are designed and implemented: RedisConfig, MysqlConfig, and KafkaConfig. As follows:

public class RedisConfig {
    private ConfigSource configSource; // configuration center (e.g. zookeeper)
    private String address;
    private int timeout;
    private int maxTotal;
    // other configuration omitted: maxWaitMillis, maxIdle, minIdle...

    public RedisConfig(ConfigSource configSource) {
        this.configSource = configSource;
    }

    public String getAddress() {
        return this.address;
    }

    //... other get() and init() methods omitted...

    public void update() {
        // load configuration from configSource into address/timeout/maxTotal...
    }
}
public class KafkaConfig {
    //... omitted...
}
public class MysqlConfig {
    //... omitted...
}

Now, there is a new functional requirement, hoping to support the hot update of Redis and Kafka configuration information. The so-called "hot update" means that if the configuration information is changed in the configuration center, it is hoped that the latest configuration information can be loaded into the memory (that is, RedisConfig, KafkaConfig classes) without restarting the system. However, for some reasons, it is not desirable to hot update the configuration information of MySQL

To implement this requirement, a ScheduledUpdater class is designed that periodically calls the update() method to refresh the configuration. The code is as follows:

public interface Updater {
    void update();
}
public class RedisConfig implements Updater {
    //... other fields and methods omitted...
    @Override
    public void update() {
        //...
    }
}
public class KafkaConfig implements Updater {
    //... other fields and methods omitted...
    @Override
    public void update() {
        //...
    }
}
public class MysqlConfig {
    //... other fields and methods omitted...
}

public class ScheduledUpdater {
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    private long initialDelayInSeconds;
    private long periodInSeconds;
    private Updater updater;

    public ScheduledUpdater(Updater updater, long initialDelayInSeconds, long periodInSeconds) {
        this.updater = updater;
        this.initialDelayInSeconds = initialDelayInSeconds;
        this.periodInSeconds = periodInSeconds;
    }

    public void run() {
        executor.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                updater.update();
            }
        }, this.initialDelayInSeconds, this.periodInSeconds, TimeUnit.SECONDS);
    }
}

public class Application {
    public static final ConfigSource configSource = new ZookeeperConfigSource(/* arguments omitted */);
    public static final RedisConfig redisConfig = new RedisConfig(configSource);
    public static final KafkaConfig kafkaConfig = new KafkaConfig(configSource);
    public static final MysqlConfig mysqlConfig = new MysqlConfig(configSource);

    public static void main(String[] args) {
        ScheduledUpdater redisConfigUpdater = new ScheduledUpdater(redisConfig, 300, 300);
        redisConfigUpdater.run();
        ScheduledUpdater kafkaConfigUpdater = new ScheduledUpdater(kafkaConfig, 60, 60);
        kafkaConfigUpdater.run();
    }
}

The hot-update requirement is now settled. Next comes a new monitoring requirement: viewing configuration information in Zookeeper from the command line is cumbersome, so a more convenient way to view it is desired

An embedded SimpleHttpServer can be developed in the project to expose the project's configuration at a fixed HTTP address, such as http://127.0.0.1:2389/config; entering this address in a browser displays the system's configuration. However, for various reasons, only the MySQL and Redis configuration should be exposed, not Kafka's. To achieve this, the code above needs further modification, as follows:

public interface Updater {
    void update();
}
public interface Viewer {
    String outputInPlainText();
    Map<String, String> output();
}

public class RedisConfig implements Updater, Viewer {
    //... other fields and methods omitted...
    @Override
    public void update() {
        //...
    }
    @Override
    public String outputInPlainText() {
        //...
    }
    @Override
    public Map<String, String> output() {
        //...
    }
}

public class KafkaConfig implements Updater {
    //... other fields and methods omitted...
    @Override
    public void update() {
        //...
    }
}

public class MysqlConfig implements Viewer {
    //... other fields and methods omitted...
    @Override
    public String outputInPlainText() {
        //...
    }
    @Override
    public Map<String, String> output() {
        //...
    }
}

public class SimpleHttpServer {
    private String host;
    private int port;
    private Map<String, List<Viewer>> viewers = new HashMap<>();

    public SimpleHttpServer(String host, int port) {
        //...
    }
    public void addViewer(String urlDirectory, Viewer viewer) {
        if (!viewers.containsKey(urlDirectory)) {
            viewers.put(urlDirectory, new ArrayList<Viewer>());
        }
        this.viewers.get(urlDirectory).add(viewer);
    }
    public void run() {
        //...
    }
}

public class Application {
    public static final ConfigSource configSource = new ZookeeperConfigSource(/* arguments omitted */);
    public static final RedisConfig redisConfig = new RedisConfig(configSource);
    public static final KafkaConfig kafkaConfig = new KafkaConfig(configSource);
    public static final MysqlConfig mysqlConfig = new MysqlConfig(configSource);

    public static void main(String[] args) {
        ScheduledUpdater redisConfigUpdater = new ScheduledUpdater(redisConfig, 300, 300);
        redisConfigUpdater.run();
        ScheduledUpdater kafkaConfigUpdater = new ScheduledUpdater(kafkaConfig, 60, 60);
        kafkaConfigUpdater.run();
        SimpleHttpServer simpleHttpServer = new SimpleHttpServer("127.0.0.1", 2389);
        simpleHttpServer.addViewer("/config", redisConfig);
        simpleHttpServer.addViewer("/config", mysqlConfig);
        simpleHttpServer.run();
    }
}

At this point, both the hot-update and monitoring requirements are implemented, using two interfaces with very focused responsibilities: Updater and Viewer. ScheduledUpdater depends only on Updater, the interface related to hot updates, and is not forced to depend on the Viewer interface it does not need, satisfying the interface segregation principle. Likewise, SimpleHttpServer depends only on Viewer, the interface related to viewing information, and not on the unneeded Updater interface, also satisfying the principle

What if, instead of following the interface segregation principle and designing the two small interfaces Updater and Viewer, we designed one big, all-encompassing Config interface, had RedisConfig, KafkaConfig, and MysqlConfig all implement it, and replaced the Updater passed to ScheduledUpdater and the Viewer passed to SimpleHttpServer with Config? What would be the problem? Let's look at the code written this way

public interface Config {
    void update();
    String outputInPlainText();
    Map<String, String> output();
}
public class RedisConfig implements Config {
    //... must implement all three Config methods: update/outputInPlainText/output
}
public class KafkaConfig implements Config {
    //... must implement all three Config methods: update/outputInPlainText/output
}
public class MysqlConfig implements Config {
    //... must implement all three Config methods: update/outputInPlainText/output
}

public class ScheduledUpdater {
    //... other fields and methods omitted...
    private Config config;

    public ScheduledUpdater(Config config, long initialDelayInSeconds, long periodInSeconds) {
        this.config = config;
        //...
    }
    //...
}

public class SimpleHttpServer {
    private String host;
    private int port;
    private Map<String, List<Config>> viewers = new HashMap<>();

    public SimpleHttpServer(String host, int port) {
        //...
    }
    public void addViewer(String urlDirectory, Config config) {
        if (!viewers.containsKey(urlDirectory)) {
            viewers.put(urlDirectory, new ArrayList<Config>());
        }
        viewers.get(urlDirectory).add(config);
    }
    public void run() {
        //...
    }
}

This design also works, but comparing the two, with similar code size, implementation complexity, and readability, the first design is clearly better than the second. Why? There are two main reasons

1. The first design idea is more flexible, easy to expand, and easy to reuse

Because the responsibilities of Updater and Viewer are more focused, and focused means general-purpose and reusable. For example, suppose a new requirement arrives: develop a Metrics performance-statistics module and display the metrics on a web page through SimpleHttpServer for easy viewing. Although Metrics has nothing to do with RedisConfig and the other configuration classes, the Metrics classes can still implement the very general Viewer interface and reuse SimpleHttpServer's implementation. The code is as follows:

public class ApiMetrics implements Viewer {
    //...
}
public class DbMetrics implements Viewer {
    //...
}

public class Application {
    private static final ConfigSource configSource = new ZookeeperConfigSource();
    public static final RedisConfig redisConfig = new RedisConfig(configSource);
    public static final KafkaConfig kafkaConfig = new KafkaConfig(configSource);
    public static final MysqlConfig mysqlConfig = new MysqlConfig(configSource);
    public static final ApiMetrics apiMetrics = new ApiMetrics();
    public static final DbMetrics dbMetrics = new DbMetrics();

    public static void main(String[] args) {
        SimpleHttpServer simpleHttpServer = new SimpleHttpServer("127.0.0.1", 2);
        simpleHttpServer.addViewer("/config", redisConfig);
        simpleHttpServer.addViewer("/config", mysqlConfig);
        simpleHttpServer.addViewer("/metrics", apiMetrics);
        simpleHttpServer.addViewer("/metrics", dbMetrics);
        simpleHttpServer.run();
    }
}

2. The second design idea has done some useless work in code implementation

Because the Config interface contains two kinds of unrelated methods: one is update(), and the other is output() and outputInPlainText(). In theory, KafkaConfig only needs to implement update(), not the output()-related methods. In the same way, MysqlConfig only needs to implement the output()-related methods, and does not need to implement update(). But the second design requires RedisConfig, KafkaConfig, and MysqlConfig to implement all of Config's methods (update, output, outputInPlainText) at the same time. In addition, if you want to add a new method to Config, all implementation classes must change. On the contrary, if the interface granularity is smaller, fewer classes are involved in each change.
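By contrast, the first design idea splits Config's two kinds of methods into finer-grained interfaces. A minimal sketch of that split, with interface and method names taken from the discussion above (the method bodies are illustrative only):

```java
// 职责更单一的细粒度接口(示意):热更新与展示各自独立
interface Updater {
    void update();                 // 只有需要热更新配置的类实现
}

interface Viewer {
    String outputInPlainText();    // 只有需要在网页上展示的类实现
}

// KafkaConfig 只需要热更新,不再被迫实现展示相关的方法
class KafkaConfig implements Updater {
    @Override
    public void update() { /* 从配置源拉取最新配置 */ }
}

// MysqlConfig 只需要展示,不再被迫实现 update()
class MysqlConfig implements Viewer {
    @Override
    public String outputInPlainText() { return "mysql config"; }
}
```

With this split, adding a method to Updater touches only the classes that actually hot-reload, and Viewer stays untouched.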

5. Dependency Inversion (DIP)

As mentioned earlier, the single responsibility principle and the open-closed principle are relatively simple in concept, but harder to use well in practice. The Dependency Inversion Principle is just the opposite: it is relatively simple to use, but the concept is harder to understand. Consider, for example, the following questions:

  • The concept of "dependency inversion" refers to the "what dependency" of "who and whom" is reversed? How should the word "reverse" be understood?
  • There are two other concepts: "Inversion of Control" and "Dependency Injection". What is the difference and connection between these two concepts and "dependency inversion"? Are they saying the same thing?
  • What does IOC in the Spring framework have to do with these concepts?

5.1 Inversion of Control (IOC)

The full English name of inversion of control is Inversion Of Control, abbreviated as IOC. For example:

public class UserServiceTest {
    public static boolean doTest() {
        // ...
    }
    public static void main(String[] args) {
        // 这部分逻辑可以放到框架中
        if (doTest()) {
            System.out.println("Test succeed.");
        } else {
            System.out.println("Test failed.");
        }
    }
}

In the above code, the entire flow is controlled by the programmer. If we abstract a framework like the one below, the same function can be implemented with the framework. The specific code is as follows:

public abstract class TestCase {
    public void run() {
        if (doTest()) {
            System.out.println("Test succeed.");
        } else {
            System.out.println("Test failed.");
        }
    }
    public abstract boolean doTest();
}

public class JunitApplication {
    private static final List<TestCase> testCases = new ArrayList<>();

    public static void register(TestCase testCase) {
        testCases.add(testCase);
    }
    public static void main(String[] args) {
        for (TestCase testCase : testCases) {
            testCase.run();
        }
    }
}

After introducing this simplified version of the test framework into a project, you only need to fill in the specific test code at the extension point reserved by the framework, that is, implement the abstract doTest() method. The previous functionality is achieved without writing the main() function that drives the execution flow. The specific code is as follows:

public class UserServiceTest extends TestCase {
    @Override
    public boolean doTest() {
        // ...
    }
}
// 注册操作还可以通过配置的方式来实现,不需要程序员显式调用 register()
JunitApplication.register(new UserServiceTest());

The above example is a typical example of implementing "inversion of control" through the framework. The framework provides an extensible code skeleton for assembling objects and managing the entire execution process. When programmers use the framework for development, they only need to add code related to their own business to the reserved extension points, and then they can use the framework to drive the execution of the entire program flow

The "control" here refers to control of the program's execution flow, and the "inversion" is this: before using a framework, the programmer controls the execution of the entire program; after using the framework, the framework controls the execution flow. Control of the process is "inverted" from the programmer to the framework.

In fact, there are many ways to achieve inversion of control. In addition to the methods similar to the template design pattern shown above, there are also methods such as dependency injection below. Therefore, inversion of control is not a specific implementation technique. It is a relatively general design idea, which is generally used to guide the design at the framework level

5.2 Dependency Injection (DI)

Unlike inversion of control, dependency injection is a concrete coding technique. Its full English name is Dependency Injection, abbreviated as DI. There is a vivid saying about this concept: dependency injection is a 25-dollar term for a 5-cent concept. In other words, it sounds lofty, but is actually very simple to understand and apply.

Dependency injection can be summed up in one sentence: instead of creating objects of dependent classes inside the class via new, the dependent objects are created externally and then passed (or injected) into the class through constructor parameters, function parameters, and so on.

In the following example, the Notification class is responsible for message push and relies on the MessageSender class to push messages such as product promotions and verification codes to users. Below it is implemented first without and then with dependency injection. The specific code is as follows:

// 非依赖注入实现方式
public class Notification {
    private MessageSender messageSender;

    public Notification() {
        this.messageSender = new MessageSender(); // 此处有点像 hardcode
    }
    public void sendMessage(String cellphone, String message) {
        //... 省略校验逻辑等...
        this.messageSender.send(cellphone, message);
    }
}
public class MessageSender {
    public void send(String cellphone, String message) {
        //....
    }
}
// 使用 Notification
Notification notification = new Notification();

// 依赖注入的实现方式
public class Notification {
    private MessageSender messageSender;

    // 通过构造函数将 messageSender 传递进来
    public Notification(MessageSender messageSender) {
        this.messageSender = messageSender;
    }
    public void sendMessage(String cellphone, String message) {
        //... 省略校验逻辑等...
        this.messageSender.send(cellphone, message);
    }
}
// 使用 Notification
MessageSender messageSender = new MessageSender();
Notification notification = new Notification(messageSender);

Passing in the dependent object through dependency injection improves the extensibility of the code, making it possible to replace dependencies flexibly. This was also mentioned when discussing the open-closed principle. Of course, the code above can go further and define MessageSender as an interface, programming to the interface rather than the implementation. The modified code is as follows:

public class Notification {
    private MessageSender messageSender;

    public Notification(MessageSender messageSender) {
        this.messageSender = messageSender;
    }
    public void sendMessage(String cellphone, String message) {
        this.messageSender.send(cellphone, message);
    }
}
public interface MessageSender {
    void send(String cellphone, String message);
}

// 短信发送类
public class SmsSender implements MessageSender {
    @Override
    public void send(String cellphone, String message) {
        //....
    }
}
// 站内信发送类
public class InboxSender implements MessageSender {
    @Override
    public void send(String cellphone, String message) {
        //....
    }
}
// 使用 Notification
MessageSender messageSender = new SmsSender();
Notification notification = new Notification(messageSender);

In fact, just mastering the example just mentioned is equivalent to fully mastering dependency injection. Despite its simplicity, dependency injection is extremely useful and is the most effective means of writing testable code
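To illustrate the testability point, the self-contained sketch below injects a test double into Notification so its behavior can be verified without actually sending anything. RecordingSender is a hypothetical stub introduced here for illustration, not a class from the original example:

```java
interface MessageSender {
    void send(String cellphone, String message);
}

class Notification {
    private final MessageSender messageSender;

    Notification(MessageSender messageSender) {
        this.messageSender = messageSender; // 依赖从外部注入
    }
    void sendMessage(String cellphone, String message) {
        messageSender.send(cellphone, message);
    }
}

// 测试替身(假设的类名):只记录调用参数,不真正发送
class RecordingSender implements MessageSender {
    String lastCellphone;
    String lastMessage;

    @Override
    public void send(String cellphone, String message) {
        this.lastCellphone = cellphone;
        this.lastMessage = message;
    }
}
```

In a unit test, `new Notification(new RecordingSender())` replaces the real sender, and the test asserts on the recorded arguments; no SMS gateway or network is involved.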

5.3 Dependency Injection Framework (DI Framework)

In the Notification class implemented with dependency injection, although there is no longer a need to hard-code the creation of MessageSender objects via new inside the class, the work of creating and assembling (injecting) objects is merely moved to higher-level code; programmers still have to do it themselves. The specific code is as follows:

public class Demo {
    public static void main(String[] args) {
        MessageSender sender = new SmsSender(); // 创建对象
        Notification notification = new Notification(sender); // 依赖注入
        notification.sendMessage("13918942177", " 短信验证码:2346");
    }
}

In actual software development, a project may involve dozens, hundreds, or even thousands of classes, and creating class objects and injecting dependencies becomes very complicated. If this work is done by programmers writing their own code, it is error-prone and the development cost is relatively high. Since object creation and dependency injection have nothing to do with specific business logic, they can be abstracted into a framework that completes them automatically.

This framework is a "dependency injection framework". You simply configure, through the extension points the framework provides, all the class objects that need to be created and the dependencies between them, and the framework then automatically creates the objects, manages their life cycles, and performs dependency injection, all things the programmer would otherwise have to do by hand.
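The heart of such a framework can be sketched in a few lines: a container records which creator to use for each abstraction, and business code asks the container for instances instead of new-ing them itself. This is a toy illustration only, not the API of Spring or Guice:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

interface MessageSender { void send(String cellphone, String message); }

class SmsSender implements MessageSender {
    @Override
    public void send(String cellphone, String message) { /* ... */ }
}

// 极简“依赖注入容器”示意:注册 抽象 -> 创建方式,由容器统一创建对象
class TinyContainer {
    private final Map<Class<?>, Supplier<?>> bindings = new HashMap<>();

    <T> void bind(Class<T> type, Supplier<? extends T> creator) {
        bindings.put(type, creator);
    }

    @SuppressWarnings("unchecked")
    <T> T getInstance(Class<T> type) {
        return (T) bindings.get(type).get();
    }
}
```

All bindings are configured in one place (`container.bind(MessageSender.class, SmsSender::new)`); a real framework adds lifecycle management, configuration files or annotations, and recursive constructor injection on top of this idea.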

In fact, there are many ready-made dependency injection frameworks, such as Google Guice, Java Spring, PicoContainer, and Butterfly Container. However, the Spring framework describes itself as an IOC container (Inversion Of Control Container).

In fact, both descriptions are accurate. "Inversion of control container" is just a very broad description, while "dependency injection framework" is more specific and targeted. As mentioned earlier, there are many ways to achieve inversion of control besides dependency injection, such as the template pattern, but the inversion of control in the Spring framework is mainly realized through dependency injection. That said, the distinction is neither very obvious nor very important.

5.4 Dependency Inversion Principle (DIP)

The English name of the dependency inversion principle is Dependency Inversion Principle, abbreviated as DIP. The original description is as follows:

High-level modules shouldn't depend on low-level modules. Both modules should depend on abstractions. In addition, abstractions shouldn't depend on details. Details depend on abstractions.

Translated: high-level modules should not depend on low-level modules; both should depend on abstractions. In addition, abstractions should not depend on specific implementation details; specific implementation details should depend on abstractions.

The so-called division of high-level modules and low-level modules simply means that in the call chain, the caller belongs to the high-level, and the callee belongs to the low-level. In normal business code development, there is no problem with high-level modules relying on low-level modules. In fact, this principle is mainly used to guide the design at the framework level, similar to the inversion of control mentioned earlier. Take the Servlet container Tomcat as an example to explain

Tomcat is a container for running Java web applications. The written web application code only needs to be deployed under the Tomcat container, and then it can be invoked and executed by the Tomcat container. According to the previous division principle, Tomcat is the high-level module, and the written Web application code is the low-level module. There is no direct dependency between Tomcat and the application code, both rely on the same "abstraction", which is the Servlet specification. The Servlet specification does not depend on the implementation details of specific Tomcat containers and applications, while Tomcat containers and applications depend on the Servlet specification
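The dependency directions can be sketched as follows. MiniServlet below is an invented stand-in for the Servlet specification (not the real javax.servlet API): the container (high-level module) and the application code (low-level module) each depend only on this shared abstraction, never on each other:

```java
// “抽象”:极度简化的 Servlet 规范替身(并非真实的 javax.servlet 接口)
interface MiniServlet {
    String service(String request);
}

// 高层模块:容器只依赖抽象 MiniServlet,不依赖任何具体应用类
class MiniContainer {
    String dispatch(MiniServlet servlet, String request) {
        return servlet.service(request);
    }
}

// 低层模块:应用代码实现抽象,同样不依赖容器的实现细节
class HelloServlet implements MiniServlet {
    @Override
    public String service(String request) {
        return "hello, " + request;
    }
}
```

Swapping in a different container, or a different application, requires no change on the other side, because both sides only know the abstraction.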

6. KISS principle

There are several versions of the English description of the KISS principle, such as the following:

  • Keep It Simple and Stupid.
  • Keep It Short and Simple.
  • Keep It Simple and Straightforward.

However, if you look carefully, you will find that they all convey the same meaning: try to keep things as simple as possible.

The KISS principle is a panacea design principle that can be applied in many scenarios. It is often used not only to guide software development, but also to guide broader system design, product design, etc., for example, the design of refrigerators, buildings, iPhones, and so on. However, the focus here is on how to apply this principle in code development

Code readability and maintainability are two very important criteria for measuring code quality. The KISS principle is an important means to keep the code readable and maintainable. The code is simple enough, which means it is easy to read and understand, and bugs are harder to hide. Even if there is a bug, it is relatively simple to fix

However, this principle only tells us to keep the code "Simple and Stupid"; it does not say what kind of code counts as "Simple and Stupid", nor does it give a particularly clear methodology for developing such code. Therefore, although it sounds simple, it is not easy to put into practice.

6.1 Is fewer lines of code "simple"?

Let's look at an example first. The following three pieces of code achieve the same function: checking whether the input string ipAddress is a valid IP address. A valid IP address consists of four groups of numbers separated by ".". Each group ranges from 0 to 255, and the first group is special: it is not allowed to be 0.

// 第一种实现方式: 使用正则表达式
public boolean isValidIpAddressV1(String ipAddress) {
    if (StringUtils.isBlank(ipAddress)) return false;

    String regex = "^(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|[1-9])\\."
        + "(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|\\d)\\."
        + "(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|\\d)\\."
        + "(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|\\d)$";
    return ipAddress.matches(regex);
}

// 第二种实现方式: 使用现成的工具类
public boolean isValidIpAddressV2(String ipAddress) {
    if (StringUtils.isBlank(ipAddress)) return false;

    String[] ipUnits = StringUtils.split(ipAddress, '.');
    if (ipUnits.length != 4) {
        return false;
    }
    for (int i = 0; i < 4; ++i) {
        int ipUnitIntValue;
        try {
            ipUnitIntValue = Integer.parseInt(ipUnits[i]);
        } catch (NumberFormatException e) {
            return false;
        }
        if (ipUnitIntValue < 0 || ipUnitIntValue > 255) {
            return false;
        }
        if (i == 0 && ipUnitIntValue == 0) {
            return false;
        }
    }
    return true;
}

// 第三种实现方式: 不使用任何工具类
public boolean isValidIpAddressV3(String ipAddress) {
    char[] ipChars = ipAddress.toCharArray();
    int length = ipChars.length;
    int ipUnitIntValue = -1;
    boolean isFirstUnit = true;
    int unitsCount = 0;

    for (int i = 0; i < length; ++i) {
        char c = ipChars[i];
        if (c == '.') {
            if (ipUnitIntValue < 0 || ipUnitIntValue > 255) return false;
            if (isFirstUnit && ipUnitIntValue == 0) return false;
            if (isFirstUnit) isFirstUnit = false;
            ipUnitIntValue = -1;
            unitsCount++;
            continue;
        }
        if (c < '0' || c > '9') {
            return false;
        }
        if (ipUnitIntValue == -1) ipUnitIntValue = 0;
        ipUnitIntValue = ipUnitIntValue * 10 + (c - '0');
    }
    if (ipUnitIntValue < 0 || ipUnitIntValue > 255) return false;
    if (unitsCount != 3) return false;
    return true;
}

The first implementation uses a regular expression and solves the problem in only three lines of code. It has the fewest lines, so is it the most in line with the KISS principle? The answer is no. Although it appears to be the simplest, with the fewest lines of code, it is actually quite complex, precisely because it uses a regular expression.

On the one hand, the regular expression itself is relatively complicated, and it is quite challenging to write a regular expression without bugs; on the other hand, not every programmer is proficient in regular expressions. For colleagues who don't know much about regular expressions, it is more difficult to understand and maintain this regular expression. This implementation method will lead to poor readability and maintainability of the code. Therefore, from the original design intention of the KISS principle, this implementation method does not comply with the KISS principle.

The second implementation uses ready-made utility functions from the StringUtils and Integer classes to process the IP address string. The third implementation uses no utility functions at all, processing the characters of the IP address one by one to determine validity. In terms of line count, the two are about the same. However, the third is harder to implement than the second and more likely to contain bugs; in terms of readability, the logic of the second implementation is clearer and easier to understand. Therefore, of these two, the second implementation is more "simple" and more in line with the KISS principle.

However, one might argue that although the third implementation is somewhat more complicated, its performance is higher than the second's. From a performance point of view, is the third implementation the better choice?

Generally speaking, utility functions are relatively general and all-encompassing, so their implementations must consider and handle more details, which affects execution efficiency. The third implementation operates on the underlying characters directly and handles only input in IP address format, with few redundant function calls or other unnecessary processing. Therefore, in terms of execution efficiency, this kind of tailor-made processing code is indeed faster than the general utility classes.

However, even though the third implementation performs better, the second is still the preferred choice, because the third is actually an over-optimization. Unless isValidIpAddress() is bottleneck code that affects system performance, the return on such optimization is low: it increases the implementation difficulty and sacrifices readability, while the performance improvement is not obvious.

6.2 Does complex code logic violate the KISS principle?

As mentioned earlier, it is not that the fewer the number of lines of code, the "simple" it is. Logical complexity, implementation difficulty, and code readability must also be considered. If the logic of a piece of code is complicated, the implementation is difficult, and the readability is not very good, does it necessarily violate the KISS principle? First look at the following code:

// KMP 算法: a, b 分别是主串和模式串;n, m 分别是主串和模式串的长度。
public static int kmp(char[] a, int n, char[] b, int m) {
    int[] next = getNexts(b, m);
    int j = 0;
    for (int i = 0; i < n; ++i) {
        while (j > 0 && a[i] != b[j]) { // 一直退到 a[i] 和 b[j] 能匹配的位置
            j = next[j - 1] + 1;
        }
        if (a[i] == b[j]) {
            ++j;
        }
        if (j == m) { // 找到匹配模式串的了
            return i - m + 1;
        }
    }
    return -1;
}

// b 表示模式串,m 表示模式串的长度
private static int[] getNexts(char[] b, int m) {
    int[] next = new int[m];
    next[0] = -1;
    int k = -1;
    for (int i = 1; i < m; ++i) {
        while (k != -1 && b[k + 1] != b[i]) {
            k = next[k];
        }
        if (b[k + 1] == b[i]) {
            ++k;
        }
        next[i] = k;
    }
    return next;
}

This code has exactly the characteristics just mentioned: complex logic, difficult implementation, and poor readability. Yet it does not violate the KISS principle. Why?

The KMP algorithm is known for being fast and efficient. When you need to handle matching in long texts (say, matching in content hundreds of MB in size), when string matching is a core feature of the product (for example, text editors such as Vim and Word), or when the string matching algorithm is a system performance bottleneck, you should choose an algorithm as efficient as possible, such as KMP. The KMP algorithm is itself complex in logic, difficult to implement, and hard to read, but using it to solve a problem that is inherently complex does not violate the KISS principle.

However, most of the string matching problems involved in the usual project development are for relatively small text. In this case, it is sufficient to directly call the ready-made string matching functions provided by the programming language. If you have to use the KMP algorithm and BM algorithm to achieve string matching, it really violates the KISS principle. In other words, the same code that satisfies the KISS principle in a certain business scenario may not be satisfied in another application scenario

6.3 How to write code that meets the KISS principle?

  1. Don't implement code using techniques your colleagues may not understand. For example, the regular expressions in the previous example, as well as the overly advanced syntax in some programming languages, etc.
  2. Don't reinvent the wheel, but be good at using existing tool libraries. Experience has proved that if you implement these class libraries by yourself, the probability of bugs will be higher, and the cost of maintenance will be higher
  3. Don't over-optimize. Don't overuse tricks (for example, bit operations instead of arithmetic operations, complex conditional expressions instead of if-else, overly low-level functions, etc.) to optimize the code at the expense of its readability

In fact, whether the code is simple enough is a very subjective judgment. The same code, some people think it is simple, some people think it is not simple enough. And often the code you write yourself will feel simple enough. Therefore, there is another effective indirect method to judge whether the code is simple, and that is code review. If your colleagues have a lot of questions about your code during code review, it means that your code may not be "simple" enough and needs to be optimized

When doing development, you must not over-design, and don't think that simple things have no technical content. In fact, the more complex problems can be solved with simple methods, the more it can reflect a person's ability

7. YAGNI

The full English name of the YAGNI principle is: You Ain't Gonna Need It. The literal translation is: you aren't going to need it. This principle is also a panacea. Applied to software development, it means: don't design features that are not currently needed, and don't write code that is not currently used. In essence, the core idea of this principle is: don't over-design.

For example, the system only uses Redis to store configuration information temporarily, and ZooKeeper may be used in the future. According to the YAGNI principle, there is no need to write this part of code in advance before ZooKeeper is used. Of course, this does not mean that there is no need to consider the scalability of the code. It is still necessary to reserve extension points, and then implement the part of the code for ZooKeeper storage configuration information when needed
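A minimal sketch of "reserve the extension point, but implement only what is needed now" (the names follow the example above, and the Redis implementation is simulated here with an in-memory map):

```java
import java.util.HashMap;
import java.util.Map;

// 预留的扩展点:抽象出配置源接口
interface ConfigSource {
    String get(String key);
}

// 当下只需要 Redis,因此只实现它(此处用内存 Map 模拟存储)
class RedisConfigSource implements ConfigSource {
    private final Map<String, String> store = new HashMap<>();

    void put(String key, String value) { store.put(key, value); }

    @Override
    public String get(String key) { return store.get(key); }
}

// 遵循 YAGNI:ZookeeperConfigSource 暂不编写,等真正需要时再实现此接口
```

The interface costs almost nothing to keep, while the unwritten ZooKeeper class costs nothing to maintain; when the requirement actually arrives, a new implementation slots in without touching callers.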

For another example, do not introduce development packages that do not require dependencies in advance in the project. For Java programmers, Maven or Gradle is often used to manage dependent class libraries (library). Some colleagues frequently modify Maven or Gradle configuration files in order to avoid missing library packages during development, and introduce a large number of commonly used library packages into the project in advance. In fact, such an approach is also against the YAGNI principle

The YAGNI principle is not the same thing as the KISS principle. The KISS principle is about "how to do it" (keep it as simple as possible), while the YAGNI principle is about "doing it or not" (don't do it if you don't need it now)

8. DRY principle

The English description of the DRY principle is: Don't Repeat Yourself. Literally: don't repeat yourself. Applied to programming, it can be understood as: don't write duplicate code.

But if two pieces of code merely look the same, is that a violation of the DRY principle? The answer is no. This is a common misunderstanding of the principle. In fact, duplicated code does not necessarily violate the DRY principle, and some seemingly non-duplicated code may violate it. There are three typical cases of code duplication:

  1. Implementation logic duplication
  2. Functional semantic duplication
  3. Code execution duplication

Of these three kinds of duplication, some seem to violate DRY but actually do not, while others seem not to violate it but actually do.

8.1 Implementation Logic Duplication

public class UserAuthenticator {
    public void authenticate(String username, String password) {
        if (!isValidUsername(username)) {
            // ...throw InvalidUsernameException...
        }
        if (!isValidPassword(password)) {
            // ...throw InvalidPasswordException...
        }
        //... 省略其他代码...
    }
    private boolean isValidUsername(String username) {
        // check not null, not empty
        if (StringUtils.isBlank(username)) {
            return false;
        }
        // check length: 4~64
        int length = username.length();
        if (length < 4 || length > 64) {
            return false;
        }
        // contains only lowercase characters
        if (!StringUtils.isAllLowerCase(username)) {
            return false;
        }
        // contains only a~z, 0~9, dot
        for (int i = 0; i < length; ++i) {
            char c = username.charAt(i);
            if (!((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '.')) {
                return false;
            }
        }
        return true;
    }
    private boolean isValidPassword(String password) {
        // check not null, not empty
        if (StringUtils.isBlank(password)) {
            return false;
        }
        // check length: 4~64
        int length = password.length();
        if (length < 4 || length > 64) {
            return false;
        }
        // contains only lowercase characters
        if (!StringUtils.isAllLowerCase(password)) {
            return false;
        }
        // contains only a~z, 0~9, dot
        for (int i = 0; i < length; ++i) {
            char c = password.charAt(i);
            if (!((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '.')) {
                return false;
            }
        }
        return true;
    }
}

In the above code, there are two obviously repeated code fragments: the isValidUsername() function and the isValidPassword() function. The duplicate code was typed twice, or simply copy-pasted, which seems to clearly violate the DRY principle. To remove the duplication, we refactor the code, merging isValidUsername() and isValidPassword() into a more general function isValidUsernameOrPassword(). The refactored code looks like this:

public class UserAuthenticatorV2 {
    public void authenticate(String userName, String password) {
        if (!isValidUsernameOrPassword(userName)) {
            // ...throw InvalidUsernameException...
        }
        if (!isValidUsernameOrPassword(password)) {
            // ...throw InvalidPasswordException...
        }
    }
    private boolean isValidUsernameOrPassword(String usernameOrPassword) {
        // 省略实现逻辑
        // 跟原来的 isValidUsername() 或 isValidPassword() 的实现逻辑一样...
        return true;
    }
}

After refactoring, the number of lines of code is reduced and there is no longer any duplicated code. Is it better? The answer is no.

From the name alone, it can be seen that the merged isValidUsernameOrPassword() function is responsible for two things, verifying the user name and verifying the password, which violates the single responsibility principle and the interface segregation principle. In fact, even setting that aside, merging the two functions into isValidUsernameOrPassword() is still problematic.

Because although isValidUsername() and isValidPassword() look repetitive in implementation logic, they are not semantically repetitive. "Semantically non-repetitive" means that, from a functional point of view, the two functions do two completely different things: one verifies the user name, the other verifies the password. Although the two verification logics happen to be identical in the current design, merging them creates a latent problem: if one day the password verification logic is modified, for example to allow uppercase characters or password lengths of 8 to 64 characters, the implementation logic of isValidUsername() and isValidPassword() will diverge, and the merged function will have to be split back into the original two.

Although the implementation logic of the code is the same, the semantics are different, and it is determined that it does not violate the DRY principle. Problems involving repetitive code can be solved by abstracting into finer-grained functions. For example, encapsulate the logic of checking only including a~z、0~9dot into boolean onlyContains(String str, String charlist);a function
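That finer-grained abstraction can be sketched as follows. The helper name onlyContains comes from the text; the surrounding class name and the exact character sets are illustrative assumptions, not part of the original design.

```java
public class CharsetValidator {
    // Returns true if every character of str appears in charlist.
    public static boolean onlyContains(String str, String charlist) {
        if (str == null || str.isEmpty()) return false;
        for (int i = 0; i < str.length(); i++) {
            if (charlist.indexOf(str.charAt(i)) < 0) return false;
        }
        return true;
    }

    // Hypothetical character set: a~z, 0~9, and the dot, as in the text.
    private static final String ALLOWED = "abcdefghijklmnopqrstuvwxyz0123456789.";

    // Both validators reuse the same low-level check but keep separate semantics,
    // so the password rules are free to diverge later without touching usernames.
    public static boolean isValidUsername(String userName) {
        return onlyContains(userName, ALLOWED);
    }

    public static boolean isValidPassword(String password) {
        return onlyContains(password, ALLOWED); // identical today, independent tomorrow
    }

    public static void main(String[] args) {
        System.out.println(isValidUsername("jack.ma42")); // true
        System.out.println(isValidUsername("Jack Ma"));   // false: uppercase and space
    }
}
```

With this shape, DRY is satisfied at the level where the repetition is real (the character-set scan) while the two semantically distinct validators remain separate.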

8.2 Functional Semantic Duplication

The same project contains the following two functions: isValidIp() and checkIfIpValid(). Although the names differ and the implementation logic differs, their function is the same: both determine whether an IP address is valid.

The reason two functions with the same purpose exist in the same project is that they were written by two different colleagues: one of them, not knowing that isValidIp() already existed, implemented checkIfIpValid() with the same functionality. So, do these two functions violate the DRY principle?

public boolean isValidIp(String ipAddress) {
    if (StringUtils.isBlank(ipAddress)) return false;
    String regex = "^(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|[1-9])\\."
        + "(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|\\d)\\."
        + "(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|\\d)\\."
        + "(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|\\d)$";
    return ipAddress.matches(regex);
}

public boolean checkIfIpValid(String ipAddress) {
    if (StringUtils.isBlank(ipAddress)) return false;
    String[] ipUnits = StringUtils.split(ipAddress, '.');
    if (ipUnits.length != 4) {
        return false;
    }
    for (int i = 0; i < 4; ++i) {
        int ipUnitIntValue;
        try {
            ipUnitIntValue = Integer.parseInt(ipUnits[i]);
        } catch (NumberFormatException e) {
            return false;
        }
        if (ipUnitIntValue < 0 || ipUnitIntValue > 255) {
            return false;
        }
        if (i == 0 && ipUnitIntValue == 0) {
            return false;
        }
    }
    return true;
}

This example is the exact opposite of the previous one. In the previous example the implementation logic was repeated but the semantics were not, so it was not a DRY violation. Here, although the implementation logic of the two pieces of code is not repeated, the semantics, that is, the functionality, are repeated, so it does violate the DRY principle. Within a project, one implementation should be adopted, and everywhere the code needs to judge whether an IP address is valid, that one function should be called.

If the implementations are not unified, and isValidIp() is called in some places while checkIfIpValid() is called in others, the code looks strange, which is equivalent to burying a pit in the code and raising the reading difficulty for colleagues unfamiliar with this part. A colleague might study the two functions for a long time, conclude that they do the same thing, wonder whether there was some deeper reason for defining two near-identical functions, and finally discover it was simply a code design problem.

Moreover, if one day the rules for judging IP validity change, the developer may modify only isValidIp() without knowing that checkIfIpValid() also exists (or the other way around). Some code would then still use the old validation logic, leading to inexplicable bugs.
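One common way to unify the two implementations, sketched here as a suggestion rather than something the text prescribes, is to keep a single canonical function and turn the duplicate into a thin deprecated alias until all call sites are migrated. The regex is the one from isValidIp() above; the StringUtils.isBlank check is replaced by a plain null/empty test so the sketch is self-contained.

```java
public class IpValidator {
    // The single canonical implementation (regex taken from isValidIp() above).
    public static boolean isValidIp(String ipAddress) {
        if (ipAddress == null || ipAddress.trim().isEmpty()) return false;
        String regex = "^(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|[1-9])\\."
            + "(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|\\d)\\."
            + "(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|\\d)\\."
            + "(1\\d{2}|2[0-4]\\d|25[0-5]|[1-9]\\d|\\d)$";
        return ipAddress.matches(regex);
    }

    // The duplicate survives only as a deprecated delegate, so the two
    // implementations can no longer drift apart while callers migrate.
    @Deprecated
    public static boolean checkIfIpValid(String ipAddress) {
        return isValidIp(ipAddress);
    }

    public static void main(String[] args) {
        System.out.println(isValidIp("192.168.1.1"));   // true
        System.out.println(isValidIp("256.1.1.1"));     // false
        System.out.println(checkIfIpValid("10.0.0.1")); // true, same answer by construction
    }
}
```

Once every caller uses isValidIp(), the deprecated alias can be deleted outright.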

8.3 Duplication of Code Execution

Of the first two examples, one concerned implementation-logic repetition and the other semantic repetition. Now look at a third example, in which the login() function checks whether a user login succeeds: on failure it throws an exception, and on success it returns the user information. The code is as follows:

public class UserService {
    private UserRepo userRepo; // injected via dependency injection or an IOC framework

    public User login(String email, String password) {
        boolean existed = userRepo.checkIfUserExisted(email, password);
        if (!existed) {
            // ... throw AuthenticationFailureException...
        }
        User user = userRepo.getUserByEmail(email);
        return user;
    }
}

public class UserRepo {
    public boolean checkIfUserExisted(String email, String password) {
        if (!EmailValidation.validate(email)) {
            // ... throw InvalidEmailException...
        }
        if (!PasswordValidation.validate(password)) {
            // ... throw InvalidPasswordException...
        }
        //...query db to check if email & password exist...
    }

    public User getUserByEmail(String email) {
        if (!EmailValidation.validate(email)) {
            // ... throw InvalidEmailException...
        }
        //...query db to get user by email...
    }
}

The above code has neither logical duplication nor semantic duplication, yet it still violates the DRY principle because of "execution duplication". Which code is executed repeatedly?

The most obvious repeated execution is in the login() function, where the email validation logic runs twice: once when checkIfUserExisted() is called and once when getUserByEmail() is called. The fix is simple: move the validation logic out of UserRepo and into UserService.

There is also a more hidden execution repetition in the code: login() does not actually need to call checkIfUserExisted() at all. It only needs to call getUserByEmail() once to fetch the user's email, password, and other information from the database, then compare the stored email and password against what the user entered to decide whether the login succeeds.

This optimization is worthwhile: both checkIfUserExisted() and getUserByEmail() query the database, and I/O operations such as database queries are time-consuming, so such operations should be minimized when writing code.

Following this idea, refactor the code to remove the repeated execution: validate the email and password only once, and query the database only once. The refactored code looks like this:

public class UserService {
    private UserRepo userRepo; // injected via dependency injection or an IOC framework

    public User login(String email, String password) {
        if (!EmailValidation.validate(email)) {
            // ... throw InvalidEmailException...
        }
        if (!PasswordValidation.validate(password)) {
            // ... throw InvalidPasswordException...
        }
        User user = userRepo.getUserByEmail(email);
        if (user == null || !password.equals(user.getPassword())) {
            // ... throw AuthenticationFailureException...
        }
        return user;
    }
}

public class UserRepo {
    public boolean checkIfUserExisted(String email, String password) {
        //...query db to check if email & password exist...
    }

    public User getUserByEmail(String email) {
        //...query db to get user by email...
    }
}

8.4 Code Reusability

8.4.1 What is code reusability?

First, distinguish three concepts: code reuse (Code Reuse), code reusability (Code Reusability), and the DRY principle

  • Code reuse is a behavior: when developing new features, try to reuse existing code
  • Code reusability is a characteristic or capability of a piece of code: when writing code, make it as reusable as possible
  • The DRY principle is a principle: do not write repeated code

From the definition and description, they seem to be somewhat similar, but when you look deeper, the difference between the three is quite big

"No repetition" does not mean "reusable"

A project's codebase may contain no repeated code at all, yet that does not mean it contains reusable code. Non-repetitive and reusable are two entirely different concepts. From this angle, the DRY principle and code reusability are two different matters.

"Reuse" and "reusability" focus on different angles

Code "reusability" is from the perspective of code developers, and "reuse" is from the perspective of code users. For example, colleague A wrote a UrlUtils class, and the "reusability" of the code is very good. Colleague B directly "reuses" the UrlUtils class written by colleague A when developing new functions

Although reuse, reusability, and the DRY principle differ in meaning, their practical goals are similar: reduce the amount of code and improve its readability and maintainability. In addition, reusing old code that has already been tested tends to introduce fewer bugs than redeveloping from scratch.

The idea of reuse does not only guide the design and development of fine-grained modules, classes, and functions. Frameworks, class libraries, and components, such as the Spring framework, the Google Guava library, and UI components, also exist for the sake of reuse.

8.4.2 How to improve code reusability?

  1. Reduce code coupling
    Highly coupled code resists reuse: when you try to extract one piece of functionality into an independent module, class, or function, you find that moving a little code drags along a lot of related code. High coupling therefore hurts reusability, and coupling should be minimized.
  2. Satisfy the single responsibility principle
    If responsibilities are not single and modules and classes are designed large and all-encompassing, more code depends on them and they depend on more code, which increases coupling and, per the previous point, hurts reusability. Conversely, the finer-grained the code, the more general it is and the easier it is to reuse.
  3. Modularize
    "Module" here means not only a group of classes but also a single class or function. Be good at encapsulating independent functionality into modules. Independent modules are like building blocks: they are easier to reuse and can be assembled directly into more complex systems.
  4. Separate business from non-business logic
    Business-independent code is easier to reuse; business-specific code is harder. To reuse code unrelated to the business, separate business from non-business logic and extract the latter into common frameworks, class libraries, and components.
  5. Sink general-purpose code
    In a layered architecture, the lower a layer sits, the more modules call it and the more general and reusable it should be designed to be. In general, after layering, to avoid a confusing call graph caused by cross-calls, only upper layers may call lower layers and calls within the same layer are allowed, while lower layers must not call upper layers. General-purpose code should therefore sink to the lower layers as much as possible.
  6. Use inheritance, polymorphism, abstraction, and encapsulation
    As mentioned when discussing object-oriented features, inheritance lets common code be pulled up into a parent class so that subclasses reuse its properties and methods. Polymorphism lets part of a piece of logic be replaced dynamically, making the surrounding code reusable. Abstraction and encapsulation, understood broadly rather than only as object-oriented features, also help: the more abstract the code and the less it depends on concrete implementations, the easier it is to reuse; and code encapsulated into modules that hide variable details behind stable interfaces is likewise easier to reuse.
  7. Apply design patterns such as the template method
    Some design patterns also improve code reusability. For example, the template method pattern uses polymorphism to let part of the code be replaced flexibly while the overall process template is reused.
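The template-method point can be illustrated with a minimal sketch (the class and step names here are invented for illustration): the abstract parent class fixes the overall flow and reuses it, while subclasses supply the variable steps via polymorphism.

```java
// Template method: the reusable process skeleton lives in the abstract class.
abstract class ReportGenerator {
    // The fixed, reusable flow: parse, then render. Marked final so the
    // template itself cannot be overridden, only its steps.
    public final String generate(String rawData) {
        String parsed = parse(rawData);
        return render(parsed);
    }
    // Variable steps supplied by subclasses.
    protected abstract String parse(String rawData);
    protected abstract String render(String parsed);
}

class UpperCaseReport extends ReportGenerator {
    @Override protected String parse(String rawData) { return rawData.trim(); }
    @Override protected String render(String parsed) { return parsed.toUpperCase(); }
}

class BracketReport extends ReportGenerator {
    @Override protected String parse(String rawData) { return rawData.trim(); }
    @Override protected String render(String parsed) { return "[" + parsed + "]"; }
}

public class TemplateDemo {
    public static void main(String[] args) {
        System.out.println(new UpperCaseReport().generate(" sales up ")); // SALES UP
        System.out.println(new BracketReport().generate(" sales up "));  // [sales up]
    }
}
```

The flow code in generate() is written once and reused by every subclass; only the plug-in steps vary.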

Besides the points above, some programming-language features, such as generic programming, also improve code reusability. Beyond specific techniques, an awareness of reuse matters just as much: when writing code, think about whether a given piece can be extracted as an independent module, class, or function for use in multiple places, and consider the reusability of every module, class, and function you design, just as if you were designing an external API.
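The generic-programming point can be sketched as follows (a hypothetical example, not from the text): one generic method replaces several type-specific copies of the same logic.

```java
import java.util.Arrays;
import java.util.List;

public class GenericReuse {
    // One generic implementation reused for any element type, instead of
    // writing firstOrDefaultForString, firstOrDefaultForInteger, and so on.
    public static <T> T firstOrDefault(List<T> list, T defaultValue) {
        return (list == null || list.isEmpty()) ? defaultValue : list.get(0);
    }

    public static void main(String[] args) {
        System.out.println(firstOrDefault(Arrays.asList("a", "b"), "none")); // a
        System.out.println(firstOrDefault(Arrays.<Integer>asList(), -1));    // -1
    }
}
```

The type parameter T carries the reuse: callers get type safety without the method author knowing the element type in advance.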

8.5 Dialectical thinking and flexible application

In fact, writing reusable code is not easy. If a concrete reuse scenario already exists when the code is written, developing reusable code for that requirement may not be hard. But when there is no reuse requirement yet, and you merely hope that the code you write now will be reusable when a colleague develops a new feature in the future, predicting how the code will be reused is far more challenging.

In fact, unless the reuse requirement is very clear, spending a lot of time, energy, and development cost on reuse needs that do not yet exist is not recommended; it also violates the YAGNI principle discussed earlier.

There is also a well-known principle called the "Rule of Three", which applies in many industries and scenarios. Applied here, it says: when writing code for the first time, if there is no reuse requirement, the future need for reuse is not particularly clear, and the cost of making the code reusable is high, then there is no need to design for reusability. If, while developing a new feature later, the previously written code turns out to be reusable, refactor it at that point to make it reusable.

In other words, do not consider reusability when the code is first written; refactor for reusability the second time, when a reuse scenario is actually encountered. Note that the "Three" in "Rule of Three" is not a literal three; here it effectively means "the second time".

9. Law of Demeter (LOD)

9.1 What is "high cohesion and loose coupling"?

"High cohesion, loose coupling" is a very important design idea. It effectively improves the readability and maintainability of code and reduces the scope of code changes triggered by a functional change. In fact, many design principles aim at achieving high cohesion and loose coupling, such as the single responsibility principle and programming to interfaces rather than implementations.

In fact, "high cohesion and loose coupling" is a relatively general design idea, which can be used to guide the design and development of different granularity codes, such as systems, modules, classes, and even functions, and can also be applied to different development scenarios In, such as microservices, frameworks, components, class libraries, etc. Here, "class" is used as the application object of this design idea to explain

In this design idea, "high cohesion" is used to guide the design of the class itself, and "loose coupling" is used to guide the design of dependencies between classes. However, the two are not completely independent. High cohesion helps loose coupling, and loose coupling requires high cohesion support

What is "high cohesion"?

So-called high cohesion means that similar functionality is placed in the same class and dissimilar functionality is not. Similar functionality is often modified together; kept in one class, the modifications are concentrated and the code is easier to maintain. The single responsibility principle introduced earlier is a very effective principle for achieving high cohesion.

What is "loose coupling"?

So-called loose coupling means that the dependencies between classes are simple and clear. Even when two classes are dependent, a code change in one causes no, or few, code changes in the other. The dependency injection, interface segregation, programming to interfaces rather than implementations, and Law of Demeter discussed in this series all serve loose coupling.

The relationship between "cohesion" and "coupling"

"High cohesion" contributes to "loose coupling", and similarly, "low cohesion" also leads to "tight coupling". As shown in the figure below, the code structure in the left part of the figure is "high cohesion, loose coupling"; the right part is just the opposite, it is "low cohesion, tight coupling"

(Figure: left, high cohesion and loose coupling; right, low cohesion and tight coupling)

In the code design on the left of the figure, class granularity is relatively fine and each class has a relatively single responsibility. Similar functionality is placed in one class, dissimilar functionality in separate classes. Such classes are more independent and the code is more cohesive. Because responsibilities are single, each class is depended on by fewer classes and the code is loosely coupled: a modification to one class affects only a single dependent class, and only that class needs to be retested.

In the code design on the right of the figure, class granularity is coarse: cohesion is low, the class is large and all-encompassing, and dissimilar functionality is placed together in one class. As a result, many other classes depend on this class, and modifying one piece of its functionality affects the multiple classes that depend on it, each of which must then be retested. This is the so-called "pull one hair and the whole body moves".

In addition, it can also be seen from the figure that the code structure with high cohesion and low coupling is simpler and clearer, and accordingly, it is indeed much better in terms of maintainability and readability

9.2 Theoretical description of "Demeter's law"

The English name of this principle is the Law of Demeter, abbreviated LOD. From the name alone it is impossible to guess what the principle is about. It has, however, a more descriptive name: the principle of least knowledge (The Least Knowledge Principle). The original statement is as follows:

Each unit should have only limited knowledge about other units: only units “closely” related to the current unit. Or: Each unit should only talk to its friends; Don’t talk to strangers.

Each module (unit) should know only limited knowledge about the units closely related to it. In other words, each module only "talks" to its friends and not to strangers.

Most design principles and ideas are abstract and open to varied interpretation; applying them flexibly in real development takes accumulated practical experience, and the Law of Demeter is no exception. So here is a restatement of the definition, with "module" replaced by "class" for uniformity:

There should be no dependencies between classes that should not have direct dependencies; between classes that do have dependencies, try to depend only on the necessary interfaces (the "limited knowledge" in the definition)

As this description shows, the Law of Demeter has two halves, and the two halves say two different things.

9.2.1 There should be no dependencies between classes that should not have direct dependencies

Take the example below, which implements a simplified version of a search engine's web crawler. The code contains three main classes: NetworkTransporter is responsible for low-level network communication and fetches data for a request; HtmlDownloader fetches web pages by URL; Document represents a web page document, on which subsequent content extraction, word segmentation, and indexing are based. The code is as follows:

public class NetworkTransporter {
    // other fields and methods omitted...
    public Byte[] send(HtmlRequest htmlRequest) {
        //...
    }
}

public class HtmlDownloader {
    private NetworkTransporter transporter; // injected via constructor or IOC framework

    public Html downloadHtml(String url) {
        Byte[] rawHtml = transporter.send(new HtmlRequest(url));
        return new Html(rawHtml);
    }
}

public class Document {
    private Html html;
    private String url;

    public Document(String url) {
        this.url = url;
        HtmlDownloader downloader = new HtmlDownloader();
        this.html = downloader.downloadHtml(url);
    }
    //...
}

Although this code "works" and achieves the desired functionality, it is not "easy to use": it has several design flaws.

1. First look at the NetworkTransporter class

As a low-level network communication class, its functionality should be as general as possible rather than serving only HTML downloading, so it should not depend directly on the overly specific send object HtmlRequest. In this respect the design of NetworkTransporter violates the Law of Demeter: it depends on a class, HtmlRequest, with which it should have no direct dependency.

How should NetworkTransporter be refactored to satisfy the Law of Demeter? A vivid metaphor helps: when you buy something in a store, you do not hand your wallet to the cashier and let the cashier take the money out; you take the money out of the wallet yourself and hand it over. Here the HtmlRequest object is the wallet, and the address and content inside it are the money. The address and content should be handed to NetworkTransporter, instead of passing HtmlRequest to it directly. Following this idea, the refactored NetworkTransporter looks like this:

public class NetworkTransporter {
    // other fields and methods omitted...
    public Byte[] send(String address, Byte[] data) {
        //...
    }
}

2. Next, look at the HtmlDownloader class

The design of this class itself is fine. However, the signature of NetworkTransporter's send() function has changed, and since this class uses send(), it must be modified accordingly. The modified code is as follows:

public class HtmlDownloader {
    private NetworkTransporter transporter; // injected via constructor or IOC framework

    // HtmlDownloader must be updated to match the new send() signature
    public Html downloadHtml(String url) {
        HtmlRequest htmlRequest = new HtmlRequest(url);
        Byte[] rawHtml = transporter.send(
                htmlRequest.getAddress(), htmlRequest.getContent().getBytes());
        return new Html(rawHtml);
    }
}

3. Finally, look at the Document class

This class has three main problems. First, downloader.downloadHtml() involves complex, time-consuming logic and should not be placed in the constructor, where it hurts the testability of the code. Second, the HtmlDownloader object is created with new inside the constructor, which violates the idea of programming to interfaces rather than implementations and likewise hurts testability. Third, from a business point of view a Document (web page document) has no need to depend on HtmlDownloader, which violates the Law of Demeter.

Although the Document class has several problems, the fix is relatively simple: one change resolves them all. The modified code is as follows:

public class Document {
    private Html html;
    private String url;

    public Document(String url, Html html) {
        this.html = html;
        this.url = url;
    }
    //...
}

// create Document objects via a factory method
public class DocumentFactory {
    private HtmlDownloader downloader;

    public DocumentFactory(HtmlDownloader downloader) {
        this.downloader = downloader;
    }

    public Document createDocument(String url) {
        Html html = downloader.downloadHtml(url);
        return new Document(url, html);
    }
}
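A hypothetical caller wiring the factory together might look like the sketch below. HtmlDownloader is stubbed with a canned response here so the example is self-contained; the real class would delegate to NetworkTransporter over the network.

```java
// Minimal stand-ins so the wiring can run without a network.
class Html {
    private final byte[] raw;
    Html(byte[] raw) { this.raw = raw; }
    byte[] raw() { return raw; }
}

class HtmlDownloader {
    // Stubbed: a real implementation would delegate to NetworkTransporter.
    public Html downloadHtml(String url) {
        return new Html(("<html>stub for " + url + "</html>").getBytes());
    }
}

class Document {
    private final Html html;
    private final String url;
    public Document(String url, Html html) { this.url = url; this.html = html; }
    public String getUrl() { return url; }
    public Html getHtml() { return html; }
}

class DocumentFactory {
    private final HtmlDownloader downloader;
    public DocumentFactory(HtmlDownloader downloader) { this.downloader = downloader; }
    public Document createDocument(String url) {
        return new Document(url, downloader.downloadHtml(url));
    }
}

public class FactoryWiringDemo {
    public static void main(String[] args) {
        // The downloader is injected once, into the factory;
        // Document itself never sees it, satisfying the Law of Demeter.
        DocumentFactory factory = new DocumentFactory(new HtmlDownloader());
        Document doc = factory.createDocument("https://example.com");
        System.out.println(doc.getUrl()); // prints https://example.com
    }
}
```

Because the constructor now only stores its arguments, a test can build a Document directly from a fake Html without touching the downloader at all.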

9.2.2 Between classes with dependencies, try to only rely on the necessary interfaces

In the following example, the Serialization class is responsible for the serialization and deserialization of objects

public class Serialization {
    public String serialize(Object object) {
        String serializedResult = ...;
        //...
        return serializedResult;
    }

    public Object deserialize(String str) {
        Object deserializedResult = ...;
        //...
        return deserializedResult;
    }
}

Looking at the class design alone, there is no problem at all. But placed in a particular application scenario, there is still room for optimization. Suppose that in the project some classes use only the serialization operation while others use only deserialization. Based on the second half of the Law of Demeter, "between classes with dependencies, try to depend only on the necessary interfaces", classes that use only serialization should not depend on the deserialization interface, and classes that use only deserialization should not depend on the serialization interface.

According to this idea, the Serialization class should be split into two finer-grained classes: one responsible only for serialization (the Serializer class) and one only for deserialization (the Deserializer class). After the split, classes that serialize depend only on Serializer, and classes that deserialize depend only on Deserializer. The split code is as follows:

public class Serializer {
    public String serialize(Object object) {
        String serializedResult = ...;
        //...
        return serializedResult;
    }
}

public class Deserializer {
    public Object deserialize(String str) {
        Object deserializedResult = ...;
        //...
        return deserializedResult;
    }
}

Although the split code better satisfies the Law of Demeter, it violates the idea of high cohesion. High cohesion asks that similar functionality live in one class so that modifications are not scattered. In the example above, if the serialization implementation changes, say from JSON to XML, the deserialization logic must change as well. Before the split, only one class needs modification; after it, two. The scope of code changes under this design has clearly grown.

If you want to violate neither high cohesion nor the Law of Demeter, the problem is easily solved by introducing two interfaces. The code is as follows:

public interface Serializable {
    String serialize(Object object);
}

public interface Deserializable {
    Object deserialize(String text);
}

public class Serialization implements Serializable, Deserializable {
    @Override
    public String serialize(Object object) {
        String serializedResult = ...;
        //...
        return serializedResult;
    }

    @Override
    public Object deserialize(String str) {
        Object deserializedResult = ...;
        //...
        return deserializedResult;
    }
}

public class DemoClass_1 {
    private Serializable serializer;

    public DemoClass_1(Serializable serializer) {
        this.serializer = serializer;
    }
    //...
}

public class DemoClass_2 {
    private Deserializable deserializer;

    public DemoClass_2(Deserializable deserializer) {
        this.deserializer = deserializer;
    }
    //...
}

Although the constructor of DemoClass_1 must still be passed a Serialization object, which implements both serialization and deserialization, the declared dependency is the Serializable interface, which exposes only the serialization operation. DemoClass_1 therefore cannot call, and has no awareness of, the deserialization method of the Serialization class, which satisfies the "depend only on limited interfaces" requirement in the second half of the Law of Demeter.

In fact, this implementation also reflects the principle of "programming to interfaces rather than implementations". Combining it with the Law of Demeter yields a new guideline: program to the minimal interface rather than the maximal implementation. New design principles and patterns are, in the end, routines distilled from the pain points of a great deal of practice.

9.3 Dialectical thinking and flexible application

Returning to the serialization example: the class contains only two operations, serialization and deserialization. Users of only the serialization operation can perceive just one extra function, deserialization, and the harm in that is small. Is splitting such a simple class into two interfaces, merely to satisfy the Law of Demeter, a bit of over-design?

Design principles themselves are neither right nor wrong; the only question is whether they are applied appropriately. Do not apply a design principle for its own sake; analyze the specific problem concretely.

For the Serialization class as it stood, with only two operations, there is indeed no need to split it into two interfaces. But suppose more functionality is added, implementing richer serialization and deserialization methods; then the question deserves a second look. The expanded class looks like this:

public class Serialization {
    // modeled on a typical JSON library's interface
    public String serialize(Object object) { /*...*/ }
    public String serializeMap(Map map) { /*...*/ }
    public String serializeList(List list) { /*...*/ }

    public Object deserialize(String objectString) { /*...*/ }
    public Map deserializeMap(String mapString) { /*...*/ }
    public List deserializeList(String listString) { /*...*/ }
}

In this scenario, the split design is better. As in the earlier usage scenario, most code needs only the serialization functionality, and those users have no need for any "knowledge" of deserialization; yet in the expanded Serialization class, that knowledge has grown from one function to three. Once any deserialization operation changes, all code that depends on the Serialization class must be checked and retested. To reduce coupling and testing effort, deserialization should be separated from serialization in accordance with the Law of Demeter.

10. For the development of business systems, how to do demand analysis and design?

For an engineer pursuing long-term growth, it is not enough to stay in the role of an executor, a mere code implementer. You should be able to take independent responsibility for a system and develop a complete system end to end. That work includes early requirement communication and analysis, mid-term code design and implementation, and later online maintenance of the system

Most engineers do business development, and many feel it has no technical depth and no growth: it is "just CRUD", translating business logic, with no use for the design principles, ideas, and patterns covered in this column

Here, through the actual development of a points-exchange system, we will walk through the whole routine of building a business system, from requirement analysis to online maintenance, so that it can be applied by analogy to the development of other systems, and at the same time show which design principles, ideas, and patterns business development actually involves

10.1 Requirements Analysis

Points are a common marketing method. Many products use it to promote consumption and increase user stickiness, such as Taobao points, credit card points, shopping mall consumption points, etc. Suppose you are an engineer on an e-commerce platform like Taobao, and the platform does not yet have a points system. Leader wants you to be responsible for developing such a system, how would you do it?

As a technical person, how do you do product design? First of all, don't try to dream it up alone: on the one hand, it is hard to think comprehensively that way; on the other hand, designing from scratch is time-consuming. Learn to "borrow". As the saying often attributed to Einstein goes, "The secret to creativity is knowing how to hide your sources"

You can find several similar products, such as Taobao, see how they design their points systems, and adapt that to your own product. You can also use Taobao yourself to see how points work there, or directly search Baidu for the "Taobao points rules". Based on these inputs, it is basically possible to figure out how to design a points system. In addition, you must fully understand your own company's product, incorporate what you have borrowed, and add appropriate micro-innovations

Generally speaking, a points system boils down to two major function points: earning points and consuming points. The earning side includes the earning channels (placing an order, daily check-in, comments, etc.) and the earning rules (the conversion ratio from order amount to points, how many points a daily check-in grants, etc.). The consuming side includes the consumption channels (deducting from the order amount, redeeming coupons, points-plus-cash purchases, spending points to join activities, etc.) and the consumption rules (how many points convert into how much order deduction, how many points a coupon costs, etc.)

The above are only very general, rough functional requirements. In actual situations there are always business details to consider, such as the validity period of points. It is hard to think of all such details on your own, and omissions are likely, but there are ways to find them. Besides the "borrowing" approach just mentioned, you can refine the business process through product wireframes, use cases, or user stories, and dig out function points that are not easy to think of

Use cases are somewhat similar to unit test cases. They focus on context: simulating how users actually use the product and describing a complete business operation flow in a specific application scenario. They therefore contain more detail and are easier to understand. For example, the use cases related to the validity period of points can be designed as follows:

  • When the user gets the points, the validity period of the points will be informed
  • When users use points, they will give priority to using points that are about to expire
  • When the user queries the point details, the validity period and status of the points will be displayed (whether expired or not)
  • When users query the total available points, expired points will be excluded

10.1.1 Points earning and redemption rules

Points earning channels include: placing an order, daily check-in, comments, etc.

The earning rules may be quite general: for example, each check-in earns 10 points, or 10% of the order amount converts into points, i.e. a 100-yuan order accumulates 10 points. They can also be more fine-grained: for example, different stores and different products can set different conversion ratios

For the validity period of points, different validity periods can be set according to different channels. Points will be invalid after they expire; when consuming points, give priority to using points that are about to expire

10.1.2 Point consumption and redemption rules

Consumption channels for points include: deducting order amount, redeeming coupons, redeeming points for purchases, deducting points for participating in activities, etc.

Different redemption rules can be set for different consumption channels. For example, points convert into order deductions at a ratio where 10 points deduct 1 yuan; 100 points can be exchanged for a 15-yuan coupon; and so on

10.1.3 Points and their details query

Query the user's total points, as well as the history of earning points and consuming points

10.2 System Design

Object-oriented design focuses on the code level (mainly for classes), while system design focuses on the architecture level (mainly for modules). There are many similarities between the two. Many design principles and ideas can be applied not only to code design, but also to architecture design. In fact, you can also learn from the four steps of object-oriented design to do system design

10.2.1 Reasonable division of functions into different modules

The essence of object-oriented design is to put the right code in the right class. Reasonably dividing the code can achieve high cohesion and low coupling of the code, the interaction between classes is simple and clear, and the overall structure of the code is clear at a glance, so the quality of the code will not be bad. Analogous to object-oriented design, system design is actually putting the right functions into the right modules. Reasonable division of modules can also achieve high cohesion and low coupling at the module level, and the structure is clean and clear

For all the function points listed above, there are the following three module division methods:

1. The management and maintenance of point earning channels and redemption rules, consumption channels and redemption rules (addition, deletion, modification and query) are not divided into the points system, but placed in the upper-level marketing system

In this way, the points system becomes very simple: it is only responsible for adding points, deducting points, querying points, querying point details, and so on

For example, users earn points by placing orders. The order system informs the marketing system that the order transaction is successful by sending a message asynchronously or calling the interface synchronously. Based on the received order information, the marketing system queries the point exchange rules corresponding to the order (conversion ratio, validity period, etc.), calculates the number of points that can be redeemed for the order, and then calls the interface of the point system to add points to the user

2. The management and maintenance of point earning channels and redemption rules, consumption channels and redemption rules are scattered in various related business systems, such as order system, comment system, sign-in system, redemption mall, coupon system, etc.

After the user places an order successfully, the order system calculates the number of points that can be exchanged according to the point conversion ratio corresponding to the product, and then directly calls the point system to add points to the user

3. All functions are divided into the points system, including the management and maintenance of points earning channels and redemption rules, consumption channels and redemption rules

After the user places an order successfully, the order system directly informs the point system that the order transaction is successful, and the point system queries the point exchange rules based on the order information, and adds points to the user

How do we judge which module division is reasonable? Check whether it exhibits high cohesion and low coupling: if modifying or adding a function routinely requires work across teams, projects, and systems, then the module division is not reasonable enough, responsibilities are not clear, and coupling is too heavy

In addition, in order to avoid the coupling of business knowledge and make the lower-level system more general, generally speaking, we do not want the lower-level system (that is, the called system) to contain too much business information of the upper-level system (that is, the calling system), but , it is acceptable that the upper-level system contains the business information of the lower-level system. For example, the order system, coupon system, redemption mall, etc., as the upper-level system that calls the point system, can contain some point-related business information. However, in turn, it is best not to include too much information related to orders, coupons, redemptions, etc. in the point system

Therefore, all things considered, the first and second module divisions are preferable. Whichever of the two is chosen, the points system's responsibilities are the same: only the addition, deduction, and query of points, plus the recording and querying of point details
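Under that narrow division of responsibility, the points system's surface can be sketched as follows. The class, method, and field names are illustrative assumptions, simplified to a single user's balance to keep the sketch short (a real system would key everything by user ID in storage):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the points system's narrow responsibilities:
// add, deduct, query balance, and query the detail records. All redemption
// rules live in the upper-level systems, not here.
class PointsSystem {
    private long balance = 0;                       // simplified: one user
    private final List<String> details = new ArrayList<>();

    void addPoints(long userId, long amount, String channel) {
        balance += amount;
        details.add("+" + amount + " via " + channel);
    }

    void deductPoints(long userId, long amount, String channel) {
        if (amount > balance) {
            throw new IllegalArgumentException("insufficient points");
        }
        balance -= amount;
        details.add("-" + amount + " via " + channel);
    }

    long queryBalance(long userId) { return balance; }

    List<String> queryDetails(long userId) { return details; }
}
```

Note that nothing here knows about orders, coupons, or conversion ratios; that business knowledge stays in the calling systems.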

10.2.2 Interaction between design modules

In object-oriented design, after the classes are designed, the interaction between classes needs to be designed. Analogy to system design, after the system responsibilities are divided, the next step is to design the interaction between the systems, that is, to determine which systems interact with the point system and how to interact

There are two common modes of interaction between systems: synchronous interface calls, and asynchronous calls through message middleware. The first is simple and direct; the second decouples better

For example, after the user successfully places an order, the order system pushes a message to the message middleware, and the marketing system subscribes to the order success message, triggering the execution of the corresponding point exchange logic. In this way, the order system is completely decoupled from the marketing system. The order system does not need to know any logic related to points, and the marketing system does not need to directly interact with the order system.

In addition, calls between upper- and lower-level systems tend to go through synchronous interfaces, while interactions between systems at the same level tend to use asynchronous messages. For example, the marketing system and the points system have an upper-lower relationship, so synchronous interface calls between them are recommended
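The decoupling effect of the message-based style can be illustrated with a tiny in-memory stand-in for message middleware. This is only a sketch with hypothetical names; in production the bus would be Kafka, RocketMQ, or similar:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Minimal in-memory stand-in for message middleware: the order system only
// publishes an "order succeeded" event and knows nothing about points.
class OrderEventBus {
    private final List<Consumer<String>> subscribers = new ArrayList<>();

    void subscribe(Consumer<String> handler) { subscribers.add(handler); }

    void publishOrderSuccess(String orderId) {
        subscribers.forEach(h -> h.accept(orderId));
    }
}
```

Usage mirrors the flow described above: the marketing system subscribes, and when an order succeeds it runs its point-redemption logic, while the order system remains unaware of the marketing system entirely.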

10.2.3 Interface, database and business model of the design module

After completing the functional division of the modules and the design of the interaction between the modules, let's look at how to design the modules themselves. In fact, the design of the business system itself is nothing more than three aspects of work: interface design, database design and business model design

10.3 Code implementation

In the previous part, the management and maintenance of point earning and consumption channels and rules were assigned to the upper-level system, so the points system's own functions became very simple and, correspondingly, the code is simple to implement. With some project development experience, such a system is not hard to build. So the focus here is not how to implement each function and interface of the points system, let alone how to write CRUD SQL statements, but to show some more general development ideas: for example, why develop in three MVC layers, and why define different data objects for each layer, along with the design principles and ideas behind those choices, so that knowing both the what and the why leads to a thorough understanding

What does business development include?

In fact, the design and development of business systems usually involves three aspects of work: interface design, database design, and business model design (that is, business logic)

The design of database and interface is very important. Once designed and put into use, these two parts cannot be easily changed. Changing the database table structure needs to involve data migration and adaptation; changing the interface needs to push the user of the interface to make corresponding code modifications. In both cases, even small changes can be cumbersome to implement. Therefore, when designing interfaces and databases, you must spend more thought and time, and you must not be too casual. On the contrary, the business logic code focuses on internal implementation, does not involve externally dependent interfaces, and does not contain persistent data, so it is more tolerant to changes

10.3.1 Designing the database

The design of the database is relatively simple. In fact, all you need is a table that records the flow details of points. The table records the earning and consumption of points. Various statistical data of user points, such as total points, total available points, etc., can be calculated through this table

(image omitted: design of the points flow-details table)
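The flow-details table described above might map to an entity like the sketch below. The field names are assumptions for illustration, not the article's actual schema; the sign of the amount distinguishes earning from spending:

```java
import java.time.LocalDateTime;

// Hypothetical mapping of the points flow-details table: one row per
// earn/consume event. Statistics such as total points are derived from it.
class PointsFlowEntity {
    Long id;
    Long userId;
    long amount;              // positive = earned, negative = consumed
    String channel;           // e.g. "order", "sign-in", "coupon"
    LocalDateTime createTime;
    LocalDateTime expireTime; // only meaningful for earned points
}
```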

10.3.2 Design interface

Interface design should conform to the single responsibility principle: the finer the granularity, the better the reusability. But overly fine-grained interfaces bring problems too. Implementing one function may require calling several small interfaces; if those calls go over the network (especially the public network), the multiple remote calls hurt performance. Moreover, an operation that should be atomic within one interface, once split across several small interfaces, may run into distributed-transaction data-consistency problems (one interface succeeds while another fails). Therefore, to balance ease of use and performance, we can borrow the facade design pattern and encapsulate a layer of coarse-grained interfaces, for external use, on top of the single-responsibility fine-grained interfaces
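A minimal facade sketch along those lines is shown below. The names are illustrative, not the article's actual API; the point is that a remote caller makes one coarse-grained call instead of two round-trips:

```java
// Fine-grained, single-responsibility operations (hypothetical names)
interface PointsSystemApi {
    void deduct(long userId, long amount);
    long balance(long userId);
}

// Facade: one coarse-grained method composed from the fine-grained ones,
// so external callers pay for a single remote call.
class PointsFacade {
    private final PointsSystemApi api;

    PointsFacade(PointsSystemApi api) { this.api = api; }

    long deductAndQuery(long userId, long amount) {
        api.deduct(userId, amount);
        return api.balance(userId);
    }
}
```

In a real system the facade would also be the natural place to make the composed operation transactional.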

The points system needs to design the following interfaces:

(image omitted: list of points-system interfaces)

10.3.3 Business Model Design

From the perspective of code implementation, the development of most business systems can be divided into three layers: Controller, Service, and Repository. The Controller layer is responsible for interface exposure, the Repository layer is responsible for data reading and writing, and the Service layer is responsible for core business logic, which is the business model mentioned here

In addition, there are two development modes mentioned earlier: the traditional development mode based on the anemic domain model, and the DDD development mode based on the rich domain model (see: Summary of the Beauty of Design Modes (Object-Oriented)_Fan 223's blog). The former is a procedure-oriented programming style, the latter an object-oriented one. Whether it is DDD or OOP, advanced development modes generally exist to cope with complex systems and their complexity. For the points system developed here, the business is relatively simple, so the simple traditional mode based on the anemic model is enough

From a development perspective, the point system can be developed independently as an independent project, or it can be developed in the same project as other business codes (such as marketing systems). From the perspective of operation and maintenance, it can be deployed together with other businesses, or it can be deployed independently as a microservice. Which development and deployment method to choose specifically can be decided by referring to the company's current technical architecture

In fact, the points system's business is relatively simple and the amount of code is small, so it leans toward being developed and deployed in one project together with the marketing system. As long as the code is well modularized and decoupled, with a clear boundary and little coupling between the points-related code and other business code, splitting it out later into an independent project for development and deployment will not be difficult

10.4 Why is MVC three-layer development necessary?

The development of most business systems can be divided into three layers: the Controller layer, the Service layer, and the Repository layer. Most people agree with this layering; it has even become a development habit. But why develop in layers? Many businesses are relatively simple, so wouldn't it be fine to handle data access, business logic, and interface exposure all in one layer of code?

1. Layering can play a role in code reuse

The same Repository may be called by multiple Services, and the same Service by multiple Controllers. For example, `getUserById()` encapsulates the logic of fetching user information by ID, and this logic may be used by several Controllers such as UserController and AdminController. Without a Service layer, each Controller would have to re-implement this logic, clearly violating the DRY principle

2. Layering can play a role in isolating changes

Layering embodies the design ideas of abstraction and encapsulation. For example, the Repository layer encapsulates database access and provides an abstract data-access interface. Following the idea of programming to an interface rather than an implementation, the Service layer uses the Repository layer's interface without caring which specific database lies underneath. When the database needs replacing, say from MySQL to Oracle, or from Oracle to Redis, only the Repository layer's code changes; the Service layer's code needs no modification at all
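That "program to an interface" relationship can be sketched in a few lines. The names are hypothetical; the in-memory repository stands in for any concrete storage implementation (MySQL, Redis, etc.), and the same injection point is what lets tests substitute a mock data source:

```java
import java.util.HashMap;
import java.util.Map;

// The Service depends only on this interface; swapping MySQL for Redis
// means swapping the implementation behind it.
interface UserRepository {
    String findNameById(long id);
}

// Stand-in implementation; a MySqlUserRepository or RedisUserRepository
// would implement the same interface.
class InMemoryUserRepository implements UserRepository {
    private final Map<Long, String> table = new HashMap<>();
    InMemoryUserRepository() { table.put(1L, "alice"); }
    @Override public String findNameById(long id) { return table.get(id); }
}

class UserService {
    private final UserRepository repo; // injected; no knowledge of storage engine

    UserService(UserRepository repo) { this.repo = repo; }

    String getUserName(long id) { return repo.findNameById(id); }
}
```

Because `UserService` receives the repository through its constructor, replacing the database (or injecting a test double) never touches Service-layer code.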

In addition, the Controller, Service, and Repository layers differ in how stable they are and in what causes them to change, so organizing code in three layers effectively isolates change. For example, the Repository layer is based on database tables, which rarely change, so its code is the most stable; the Controller layer provides interfaces adapted to external callers, so its code changes often. After layering, frequent changes in the Controller layer do not affect the stable Repository layer

3. Layering can play a role in isolating concerns

The Repository layer is concerned only with reading and writing data. The Service layer focuses only on business logic, not on where the data comes from. The Controller layer focuses only on dealing with the outside world (data validation, encapsulation, format conversion), not on business logic. The three layers have different concerns; after layering, responsibilities are clearly defined, which better fits the single responsibility principle, and the code is more cohesive

4. Layering can improve the testability of the code

Unit tests should not depend on uncontrollable external components such as databases. After layering, the Repository layer's code is used by the Service layer through dependency injection, so when testing the Service layer's core business logic, a mock data source can be injected in place of the real database

5. Layering can cope with the complexity of the system

If all code is put into one class, that class will expand without limit as requirements iterate. When a class or a function grows too large, readability and maintainability deteriorate, and it must be split. Splitting has both a horizontal and a vertical direction: splitting horizontally by business is modularization; splitting vertically by processing flow is the layering discussed here

Whether it is layering, modularization, or OOP, DDD, and the various design patterns, principles, and ideas, they all exist to cope with complex systems and their complexity. For a simple system they add little value, hence the saying about "using a sledgehammer to crack a nut"

10.5 What is the significance of the existence of BO, VO, and Entity?

For the Controller, Service, and Repository layers, each layer defines its corresponding data objects: VO (View Object), BO (Business Object), and Entity, e.g. UserVo, UserBo, and UserEntity. In actual development, VO, BO, and Entity may share a large number of fields, or even be field-for-field identical, yet the development process requires repeatedly defining three nearly identical classes, which is clearly a kind of duplicated labor

Is it better to define a common data object than each layer to define its own data object?

In fact, the design in which each layer defines its own data objects is the more recommended one, mainly for the following three reasons:

  1. VO, BO, and Entity are not exactly the same. For example, a password field can be defined in UserEntity and UserBo, but obviously not in UserVo, or the user's password would be exposed
  2. Although the code of VO, BO, and Entity is duplicated, their functional semantics are not: their responsibilities differ. So this cannot be counted as a violation of the DRY principle. As discussed earlier under the DRY principle, if they were merged into one class, changing requirements would later force a split anyway
  3. To minimize coupling between layers and draw clear responsibility boundaries, each layer maintains its own data objects, and layers interact through interfaces. As data passes from a lower layer to a higher one, the lower layer's data object is converted into the higher layer's before processing continues. This design is somewhat cumbersome (each layer defines its own data objects, and conversions are needed between them), but the layering is clear. For very large projects, clarity of structure comes first!

Since VO, BO, and Entity cannot be merged, how to solve the problem of code duplication?

From a design point of view, keeping VO, BO, and Entity separate does not violate the DRY principle, and for clear layering and low coupling the cost of maintaining a few extra classes is acceptable. Still, there are ways to reduce the code duplication

Inheritance can solve the problem of code duplication. The public fields can be defined in the parent class, so that VO, BO, and Entity all inherit from this parent class, and each only defines unique fields. Because the inheritance level here is very shallow and uncomplicated, using inheritance does not affect the readability and maintainability of the code. Later, if due to business needs, some fields need to be moved from the parent class to the subclass, or extracted from the subclass to the parent class, the code modification is not complicated
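The inheritance approach can be sketched as follows. The field and class names are assumptions for illustration: shared fields live in an abstract parent, and each layer's object adds only what is unique to it:

```java
// Shared fields extracted into a parent class (hypothetical names)
abstract class UserBase {
    Long id;
    String name;
}

class UserEntity extends UserBase {
    String password; // persisted, but must never reach the view layer
}

class UserVo extends UserBase {
    // intentionally no password field: the view object must not leak credentials
}
```

The hierarchy is deliberately shallow, so the usual readability concerns about inheritance do not really apply here.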

When discussing the design idea of "prefer composition over inheritance", it was mentioned that composition can also solve code duplication. So the shared fields can likewise be extracted into a common class, which VO, BO, and Entity reuse through a composition relationship

The problem of code duplication is solved, so how to convert data objects between different layers?

After lower-layer data is returned to an upper layer through an interface call, it must be converted into that layer's data-object type. For example, after the Service layer obtains an Entity from the Repository layer, it converts it into a BO and then continues the business-logic processing. So the whole development process involves two conversions: Entity to BO, and BO to VO

The simplest conversion is manual copying: writing code that assigns the two objects field by field. But that is obviously low-level labor with no technical content. Java provides various data-object conversion tools, such as BeanUtils and Dozer (see: MapStruct Summary_Fan 223's blog), which greatly simplify the tedious conversion work. If you develop in another language, you can borrow the design ideas of these Java tools and implement a conversion utility class in your own project
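For concreteness, the hand-written form of an "Entity to BO" conversion looks like the sketch below; tools such as BeanUtils, Dozer, or MapStruct essentially generate this field-by-field copying for you. The classes here are minimal hypothetical stand-ins, not the article's real types:

```java
// Minimal stand-in types (hypothetical)
class OrderEntity {
    long id;
    long amountCents;
}

class OrderBo {
    long id;
    long amountCents;
}

// Hand-rolled converter: what conversion tools automate
class OrderConverter {
    static OrderBo toBo(OrderEntity e) {
        OrderBo bo = new OrderBo();
        bo.id = e.id;
        bo.amountCents = e.amountCents;
        return bo;
    }
}
```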

VO, BO, and Entity are all based on the anemic model, and for compatibility with frameworks or development libraries (such as MyBatis) they must also define setter methods for every field. This runs against OOP's encapsulation characteristic and allows the data to be modified at will. So what should we do?

As mentioned earlier, the life cycles of Entity and VO are limited to their own layers, and the corresponding Repository and Controller layers contain little business logic, so there is not much code that could modify the data arbitrarily. Even designed anemically, with a setter for every field, they are relatively safe

The Service layer, however, contains more business-logic code, so BO does risk being modified arbitrarily. But design problems have no optimal solutions, only trade-offs: for convenience of use we can only compromise, give up BO's encapsulation, and rely on the programmers to use these data objects correctly

10.6 Design principles and ideas used

| Principle / idea | How it is applied here |
| --- | --- |
| High cohesion, loose coupling | Divide different functions into different modules so that each module is highly cohesive and modules are loosely coupled |
| Single responsibility principle | Modules should have as single a responsibility as possible; one purpose of layering is also to better satisfy the single responsibility principle |
| Dependency injection | In the MVC three-layer implementation, the next layer's class is injected into the previous layer's code through dependency injection |
| Dependency inversion principle | If object creation and life cycles are managed by a container such as Spring IoC, the dependency inversion principle is being used |
| Programming to interfaces, not implementations | The Service layer uses the interface provided by the Repository layer without caring which specific database lies underneath |
| Encapsulation and abstraction | Layering embodies abstraction and encapsulation, which isolates change and separates concerns |
| DRY, inheritance, and composition | VO, BO, and Entity duplicate code but differ in functional semantics, so they do not violate DRY; inheritance or composition can remove the remaining duplication |
| Object-oriented design | System design can follow the steps of object-oriented design: OO design puts the right code into the right classes, and system design puts the right functions into the right modules |

11. For non-business general framework development, how to do demand analysis and design?

11.1 Project Background

The goal is to design and develop a small framework that can collect various statistics about interface calls, such as the maximum (max), minimum (min), average (avg), and percentile values of the response time, the number of calls (count), and the call frequency (TPS), and that supports outputting the statistics in various display formats (e.g. JSON, web page, custom formats) to various terminals (the console command line, an HTTP web page, email, log files, custom output terminals, etc.) for easy viewing

If such a general framework were developed and integrated into various business systems to support real-time computation and viewing of these statistics, how should it be designed and implemented?

11.2 Requirements Analysis

Since a performance counter has nothing to do with any particular business, it can be developed as an independent framework or class library and integrated into many business systems. For a reusable framework, non-functional requirements matter as much as functional ones, so the requirement analysis covers both aspects:

11.2.1 Functional requirements analysis

Compared with a long wall of text, the human brain finds short, organized, categorized list items much easier to understand. The requirement description above clearly does not follow this rule, so it needs to be broken down into concise items:

  • Interface statistics: including statistics of interface response time, statistics of interface calls, etc.
  • Types of statistics: max, min, avg, percentile, count, tps, etc.
  • Statistical information display format: Json, Html, custom display format
  • Statistical information display terminal: Console, Email, HTTP web page, log, custom display terminal

In addition, you can draw the display style of the final data with the help of wireframe diagrams that are often used when designing products, which will be more clear at a glance. The specific wireframe is as follows:

(image omitted: wireframe of the statistics display)

In fact, from the wireframe diagram, the following hidden requirements can be found:

  • Statistics trigger methods: both active and passive. Active means counting data periodically at a set frequency and pushing it to a display terminal, such as email push. Passive means the user triggers the statistics, for example by selecting a time range on a web page, triggering the computation, and viewing the result
  • Statistics time range: the framework must support custom time ranges, such as the TPS and access count of an interface over the last 10 minutes, or the maximum, minimum, and average response time of an interface between 00:00 on December 11 and 00:00 on December 12
  • Statistics trigger interval: for actively triggered statistics, the trigger interval should also be configurable, i.e. how often the statistics are computed and displayed; for example, counting interface information every 10 s and printing it to the command line, or sending a statistics email every 24 hours

11.2.2 Non-functional requirements analysis

For the development of such a general framework, many non-functional requirements need to be considered. Specifically, there are the following important aspects:

  1. Ease of use
    Ease of use sounds more like a criterion for judging a product, but product awareness matters when developing a technical framework too. Whether the framework is easy to integrate, easy to plug in and remove, loosely coupled with the business code, and whether the interfaces it exposes are flexible enough all deserve careful thought and design. Sometimes, how well the documentation is written may even decide whether a framework becomes popular

  2. Performance
    For a framework to be integrated into business systems, the execution efficiency of the framework's own code must not noticeably affect the performance of the business system. Specifically for the performance counter framework: on the one hand it should be low-latency, that is, the statistics code should not (or barely) affect the response time of the interface itself; on the other hand, the framework's own memory consumption should not be too large

  3. Extensibility
    The extensibility discussed here is somewhat similar to the code extensibility mentioned before, which means adding new functionality without modifying, or barely modifying, existing code. But there is a difference: the earlier notion is from the perspective of the framework's developers, while the extensibility here is from the perspective of the framework's users, who should be able to add new capabilities without modifying the framework's source code, or even obtaining it. This is a bit like developing plugins for a framework. For example:
    Feign is an HTTP client framework. Without modifying its source code, you can plug in your own codec, logger, interceptor, and so on:

    Feign feign = Feign.builder()
            .logger(new CustomizedLogger())
            .encoder(new FormEncoder(new JacksonEncoder()))
            .decoder(new JacksonDecoder())
            .errorDecoder(new ResponseErrorDecoder())
            .requestInterceptor(new RequestHeadersInterceptor()).build();
    
    public class RequestHeadersInterceptor implements RequestInterceptor {
        @Override
        public void apply(RequestTemplate template) {
            template.header("appId", "...");
            template.header("version", "...");
            template.header("timestamp", "...");
            template.header("token", "...");
            template.header("idempotent-token", "...");
            template.header("sequence-id", "...");
        }
    }
    
    public class CustomizedLogger extends feign.Logger {
        //...
    }
    
    public class ResponseErrorDecoder implements ErrorDecoder {
        @Override
        public Exception decode(String methodKey, Response response) {
            //...
        }
    }
    
  4. Fault tolerance
    For the performance counter framework, an exception inside the framework itself must never cause an interface request to fail. Therefore, every abnormal situation the framework may encounter must be considered, and all runtime and non-runtime exceptions thrown across its externally exposed interfaces must be caught and handled

  5. Versatility
    To improve the framework's reusability so that it can be applied flexibly to many scenarios, design it to be as general as possible. Think about what other scenarios it could serve beyond interface statistics: for example, could it also process statistics of other events, such as SQL request times or business metrics (such as payment success rate), and so on
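The fault-tolerance requirement above can be made concrete with a minimal sketch. The helper below is hypothetical (the name `SafeRunner` and its shape are assumptions, not part of the framework's design): it wraps a framework-internal task so that no exception can propagate into the business code that triggered it.

```java
// Hypothetical helper (assumed name, not from the original design): run a
// framework-internal task and swallow any failure so that metrics collection
// can never break the business request that triggered it.
public class SafeRunner {
    public static void runQuietly(Runnable task) {
        try {
            task.run();
        } catch (Throwable t) { // catches runtime and non-runtime errors alike
            // in a real framework: log the failure here; never rethrow to business code
        }
    }
}
```

An externally exposed API could then delegate its body to `SafeRunner.runQuietly(...)`, satisfying the rule that the framework catches everything it throws.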

11.3 Framework Design

For the development of a slightly complex system, many people feel they don't know where to start. The author personally likes to borrow from the ideas of TDD (Test-Driven Development) and prototyping: first focus on a simple application scenario and implement a minimal prototype for it. Although this minimal prototype is far from perfect in both functional and non-functional terms, it is concrete and tangible rather than abstract, it effectively helps in understanding more complex design ideas, and it serves as the foundation for iterative design

This is like solving an algorithm problem: instead of trying to produce the optimal solution in one shot, first write a few sets of test data, look for patterns, and start with the simplest algorithm that works. Although this simplest algorithm may be unsatisfactory in time and space complexity, it can then be optimized step by step, which keeps the thinking flowing smoothly

For the development of the performance counter framework, you can first focus on a very specific and simple application scenario, such as counting the maximum and average response time of the two interfaces for user registration and login, and the number of interface calls. The results are output to the command line in JSON format. Now this requirement is simple, specific, and clear, and the difficulty of design and implementation has been greatly reduced.

// Use case: collect the response time and access count of the two APIs below (register and login)
public class UserController {
    public void register(UserVo user) {
        //...
    }

    public UserVo login(String telephone, String password) {
        //...
    }
}

To output the maximum value, average value, and number of interface calls of an interface, the response time of each interface request must first be collected and stored, then aggregated and counted according to a certain time interval, and finally the results are output. In the code implementation of the prototype system, all the code can be stuffed into a class, and there is no need to consider any code quality, thread safety, performance, scalability, etc.

The code implementation of the minimal prototype is shown below. The two functions recordResponseTime() and recordTimestamp() record the response time and the access time of an interface request respectively, and the startRepeatedReport() function computes the statistics at a specified frequency and outputs the result

public class Metrics {
    // The Map key is the API name; the value is the list of response times or timestamps of its requests
    private Map<String, List<Double>> responseTimes = new HashMap<>();
    private Map<String, List<Double>> timestamps = new HashMap<>();
    private ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();

    public void recordResponseTime(String apiName, double responseTime) {
        responseTimes.putIfAbsent(apiName, new ArrayList<>());
        responseTimes.get(apiName).add(responseTime);
    }

    public void recordTimestamp(String apiName, double timestamp) {
        timestamps.putIfAbsent(apiName, new ArrayList<>());
        timestamps.get(apiName).add(timestamp);
    }

    public void startRepeatedReport(long period, TimeUnit unit) {
        executor.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                Gson gson = new Gson();
                Map<String, Map<String, Double>> stats = new HashMap<>();
                for (Map.Entry<String, List<Double>> entry : responseTimes.entrySet()) {
                    String apiName = entry.getKey();
                    List<Double> apiRespTimes = entry.getValue();
                    stats.putIfAbsent(apiName, new HashMap<>());
                    stats.get(apiName).put("max", max(apiRespTimes));
                    stats.get(apiName).put("avg", avg(apiRespTimes));
                }
                for (Map.Entry<String, List<Double>> entry : timestamps.entrySet()) {
                    String apiName = entry.getKey();
                    List<Double> apiTimestamps = entry.getValue();
                    stats.putIfAbsent(apiName, new HashMap<>());
                    stats.get(apiName).put("count", (double) apiTimestamps.size());
                }
                System.out.println(gson.toJson(stats));
            }
        }, 0, period, unit);
    }

    private double max(List<Double> dataset) { /* implementation omitted */ }

    private double avg(List<Double> dataset) { /* implementation omitted */ }
}

A minimal prototype was implemented in less than 50 lines of code. Next, let's see how to use it to count the response time and number of visits of registration and login interfaces. The specific code is as follows:

// Use case: collect the response time and access count of the two APIs below (register and login)
public class UserController {
    private Metrics metrics = new Metrics();

    public UserController() {
        metrics.startRepeatedReport(60, TimeUnit.SECONDS);
    }

    public void register(UserVo user) {
        long startTimestamp = System.currentTimeMillis();
        metrics.recordTimestamp("register", startTimestamp);
        //...
        long respTime = System.currentTimeMillis() - startTimestamp;
        metrics.recordResponseTime("register", respTime);
    }

    public UserVo login(String telephone, String password) {
        long startTimestamp = System.currentTimeMillis();
        metrics.recordTimestamp("login", startTimestamp);
        //...
        long respTime = System.currentTimeMillis() - startTimestamp;
        metrics.recordResponseTime("login", respTime);
    }
}

Although the code of the minimal prototype is crude, it helps a great deal in straightening out the train of thought, and the final framework design can now be made on that basis. Below is a rough system design diagram for the performance counter framework. Diagrams reflect design ideas very intuitively and free up brain capacity for thinking about other details

(system design diagram omitted)

As shown in the figure, the whole framework is divided into four modules: data acquisition, storage, aggregation statistics, and display. The work that each module is responsible for is briefly listed as follows:

1. Data collection

Responsible for collecting raw data, including recording the response time and request time of each interface request. The data collection process must be highly fault-tolerant and cannot affect the usability of the interface itself. In addition, because this part of the function is exposed to the users of the framework, when designing the data collection API, try to consider its ease of use

2. Storage

Responsible for saving the collected raw data for later aggregation statistics. There are many ways to store data, such as: Redis, MySQL, HBase, logs, files, memory, etc. Data storage is time-consuming. In order to minimize the impact on interface performance (such as response time), the collection and storage process is completed asynchronously

3. Aggregated Statistics

Responsible for aggregating raw data into statistical data such as max, min, avg, percentile, count, tps, etc. To support more aggregation rules later, this part of the code should be as flexible and extensible as possible

4. Display

Responsible for displaying statistical data to the terminal in a certain format, such as: output to the command line, mail, web page, custom display terminal, etc.

When object-oriented analysis, design, and implementation were discussed earlier, it was mentioned that the final output of the design phase is the design of classes, and also that software design and development is an iterative process in which the boundaries between the analysis, design, and implementation stages are not clear-cut

11.4 Run in small steps and iterate step by step

Above, the entire framework was divided into four modules: data collection, storage, aggregation statistics, and display. In addition, the design of the statistics trigger method (active push vs. passively triggered statistics), the statistical time interval (which period of data to count), and the statistics reporting interval (for the active push mode, how often statistics are pushed) was also simplified

While the minimal prototype gave a basis for iterative development, it is still a long way from the framework we ultimately want. When writing this article, the author tried to implement all the functional requirements listed above at once, hoping to produce a perfect framework, and found it to be an extremely brain-burning exercise: while writing the code there was a constant feeling that the brain was "not enough". If that is true for someone with more than ten years of experience, then for developers without much experience, implementing all the requirements at once is very challenging, and failing to do so may bring a strong sense of frustration and a slide into self-doubt

However, even for someone capable of implementing all the requirements, it could consume a great deal of design and development time with no visible output for a long while, giving the team lead a strong sense of things being out of control. For today's Internet projects, running in small steps and iterating gradually is the better development model. This framework should therefore be improved gradually over multiple versions: the first version implements some basic functions, with more advanced and complex functions, as well as non-functional requirements, deferred and iteratively optimized in subsequent v2.0, v3.0... versions

For the development of this framework, in the v1.0 version, only the following functions are temporarily implemented. The remaining functions stay in v2.0, v3.0 versions

  1. Data collection: Responsible for collecting raw data, including recording the response time and request time of each interface request
  2. Storage: responsible for saving the collected raw data for later aggregated statistics. There are many ways to store data, and only Redis is supported for the time being, and the two processes of collection and storage are executed synchronously
  3. Aggregated Statistics: Responsible for aggregating raw data into statistical data, including the maximum, minimum, average, 99.9th percentile, 99th percentile of response time, and the number of interface requests and tps
  4. Display: responsible for displaying statistical data on a terminal in a certain format. For now it only supports active push to the command line and email. The command line computes and displays the data of the last m seconds at an interval of n seconds (for example, displaying the statistics of the last 60s every 60s), and the email reports the previous day's data once a day

The requirements of this version are more specific and simpler than before, and thus easier to implement. In fact, learning to combine concrete needs with reasonable predictions, assumptions, and trade-offs, and to plan versions for iterative design and development, is also a must-have skill for a senior engineer

Before, the object-oriented design and implementation were explained separately, and the boundaries were more obvious. In actual software development, these two processes are often interleaved. Generally, there is a rough design first, and then the implementation is started. Problems are found during the implementation process, and then the design is supplemented and modified later. Therefore, for the development of this framework, put the design and implementation together to explain

For the implementation of the minimal prototype, all codes are coupled in one class, which is obviously unreasonable. Next, follow the steps of object-oriented design mentioned earlier to re-divide and design classes

11.4.1 Divide responsibilities and identify the classes

According to the requirements description, the following interfaces and classes can first be roughly identified. This step is not difficult; it is essentially a direct translation of the requirements

  • The MetricsCollector class is responsible for providing APIs to collect the raw data of interface requests. An interface could be abstracted for MetricsCollector, but it is not necessary, since for now only one implementation of MetricsCollector is foreseen
  • The MetricsStorage interface is responsible for storing the raw data, and the RedisMetricsStorage class implements the MetricsStorage interface. This makes it easy to flexibly add new storage methods later, such as using HBase for storage
  • The Aggregator class is responsible for computing statistics from the raw data
  • The ConsoleReporter class and the EmailReporter class are responsible for computing statistics at a certain frequency and sending them to the command line and to email respectively. Whether ConsoleReporter and EmailReporter should share a reusable abstract class or a common interface is not yet certain

11.4.2 Defining classes and their relationships

The next step is to define classes, attributes and methods, and define the relationship between classes. These two steps cannot be separated, so let's combine them to explain

After roughly identifying the core classes, you can create them in the IDE and start trying to define their properties and methods. While designing the classes and the interactions between them, keep using the previously learned design principles and ideas to examine whether the design is reasonable: does it satisfy the single responsibility principle, the open-closed principle, dependency injection, the KISS principle, the DRY principle, and the Law of Demeter; does it conform to programming to an interface rather than an implementation; is the code high-cohesion and low-coupling; can reusable code be abstracted out; and so on

The definition of the MetricsCollector class is very simple; the code is as follows. Compared with the minimal prototype, MetricsCollector encapsulates the raw data by introducing the RequestInfo class, and replaces the previous two record functions with a single collection function

public class MetricsCollector {
    private MetricsStorage metricsStorage; // program to an interface, not an implementation

    // dependency injection
    public MetricsCollector(MetricsStorage metricsStorage) {
        this.metricsStorage = metricsStorage;
    }

    // a single function replaces the two functions in the minimal prototype
    public void recordRequest(RequestInfo requestInfo) {
        if (requestInfo == null || StringUtils.isBlank(requestInfo.getApiName())) {
            return;
        }
        metricsStorage.saveRequestInfo(requestInfo);
    }
}

public class RequestInfo {
    private String apiName;
    private double responseTime;
    private long timestamp;
    //... constructor/getter/setter methods omitted...
}

The properties and methods of the MetricsStorage interface and the RedisMetricsStorage class are also fairly clear; the code is as follows. Note that fetching too long a time range of data in one call may pull too much data into memory and even exhaust it; in Java this may trigger OOM (OutOfMemoryError). Even without OOM, while memory remains barely sufficient, the memory pressure can cause frequent Full GCs, which slow down the handling of the system's interface requests, or even cause them to time out

public interface MetricsStorage {
    void saveRequestInfo(RequestInfo requestInfo);

    List<RequestInfo> getRequestInfos(String apiName, long startTimeInMillis, long endTimeInMillis);

    Map<String, List<RequestInfo>> getRequestInfos(long startTimeInMillis, long endTimeInMillis);
}

public class RedisMetricsStorage implements MetricsStorage {
    //... attributes, constructors, etc. omitted...
    @Override
    public void saveRequestInfo(RequestInfo requestInfo) {
        //...
    }

    @Override
    public List<RequestInfo> getRequestInfos(String apiName, long startTimeInMillis, long endTimeInMillis) {
        //...
    }

    @Override
    public Map<String, List<RequestInfo>> getRequestInfos(long startTimeInMillis, long endTimeInMillis) {
        //...
    }
}
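To mitigate the OOM risk just mentioned, one option (a sketch with an assumed helper name, not part of the book's design) is to split a long query range into fixed-size windows and call getRequestInfos() once per window, aggregating incrementally instead of materializing a whole day of raw data at once:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: split [startMillis, endMillis) into windows of at most
// chunkMillis each, so every storage query pulls a bounded amount of data.
public class TimeRangeChunker {
    public static List<long[]> chunks(long startMillis, long endMillis, long chunkMillis) {
        if (chunkMillis <= 0) {
            throw new IllegalArgumentException("chunkMillis must be positive");
        }
        List<long[]> result = new ArrayList<>();
        for (long s = startMillis; s < endMillis; s += chunkMillis) {
            long e = Math.min(s + chunkMillis, endMillis);
            result.add(new long[]{s, e}); // window [s, e)
        }
        return result;
    }
}
```

A caller could loop over `chunks(start, end, 60_000)` and feed each window's records into the aggregation step, keeping peak memory bounded by the window size.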

The design ideas of the MetricsCollector class and the MetricsStorage interface are relatively simple, and the designs produced by different people should be quite similar. The statistics and display functions are different, however, with many possible design approaches. If the function logic that statistics display must complete is subdivided, it mainly includes the following four points:

  1. According to the given time interval, pull data from the database
  2. According to the original data, calculate the statistical data
  3. Display statistics to terminal (command line or mail)
  4. Timing triggers the execution of the above three processes

In fact, summed up in one sentence, what object-oriented design and implementation must do is put the right code in the right class. The work to be done now is therefore to divide the above four pieces of logic sensibly among several classes. There are many ways to divide them. For example, the first two pieces of logic can go in one class, the third in another, and the fourth in an upper-level "god class" that combines the first two classes and triggers the execution of the first three pieces of logic. Alternatively, the second piece of logic can live alone in one class, with the 1st, 3rd, and 4th placed together in another

As for which combination to choose, the judging criterion is whether the code satisfies the design principles and ideas mentioned before, such as low coupling, high cohesion, single responsibility, and being open to extension but closed to modification, so that the design achieves code reusability, readability, extensibility, and maintainability

Here we tentatively choose to put the 1st, 3rd, and 4th pieces of logic into the ConsoleReporter and EmailReporter classes, and the 2nd into the Aggregator class. The logic the Aggregator class is responsible for is relatively simple, so it is designed as a utility class containing only static methods. The code is as follows:

public class Aggregator {
    public static RequestStat aggregate(List<RequestInfo> requestInfos, long durationInMillis) {
        double maxRespTime = Double.MIN_VALUE;
        double minRespTime = Double.MAX_VALUE;
        double avgRespTime = -1;
        double p999RespTime = -1;
        double p99RespTime = -1;
        double sumRespTime = 0;
        long count = 0;

        for (RequestInfo requestInfo : requestInfos) {
            ++count;
            double respTime = requestInfo.getResponseTime();
            if (maxRespTime < respTime) {
                maxRespTime = respTime;
            }
            if (minRespTime > respTime) {
                minRespTime = respTime;
            }
            sumRespTime += respTime;
        }
        if (count != 0) {
            avgRespTime = sumRespTime / count;
        }
        // use floating-point math so long division does not truncate tps to 0
        long tps = (long) (count * 1000.0 / durationInMillis);
        Collections.sort(requestInfos, new Comparator<RequestInfo>() {
            @Override
            public int compare(RequestInfo o1, RequestInfo o2) {
                double diff = o1.getResponseTime() - o2.getResponseTime();
                if (diff < 0.0) {
                    return -1;
                } else if (diff > 0.0) {
                    return 1;
                } else {
                    return 0;
                }
            }
        });
        int idx999 = (int) (count * 0.999);
        int idx99 = (int) (count * 0.99);
        if (count != 0) {
            p999RespTime = requestInfos.get(idx999).getResponseTime();
            p99RespTime = requestInfos.get(idx99).getResponseTime();
        }
        RequestStat requestStat = new RequestStat();
        requestStat.setMaxResponseTime(maxRespTime);
        requestStat.setMinResponseTime(minRespTime);
        requestStat.setAvgResponseTime(avgRespTime);
        requestStat.setP999ResponseTime(p999RespTime);
        requestStat.setP99ResponseTime(p99RespTime);
        requestStat.setCount(count);
        requestStat.setTps(tps);
        return requestStat;
    }
}

public class RequestStat {
    private double maxResponseTime;
    private double minResponseTime;
    private double avgResponseTime;
    private double p999ResponseTime;
    private double p99ResponseTime;
    private long count;
    private long tps;
    //... getter/setter methods omitted...
}

The ConsoleReporter class is equivalent to a god class. It regularly fetches data from the database according to a given time interval, completes statistical work with the help of the Aggregator class, and outputs the statistical results to the command line. The specific code implementation is as follows:

public class ConsoleReporter {
    private MetricsStorage metricsStorage;
    private ScheduledExecutorService executor;

    public ConsoleReporter(MetricsStorage metricsStorage) {
        this.metricsStorage = metricsStorage;
        this.executor = Executors.newSingleThreadScheduledExecutor();
    }

    // logic 4: periodically trigger the execution of logics 1, 2, and 3
    public void startRepeatedReport(long periodInSeconds, long durationInSeconds) {
        executor.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                // logic 1: pull data from the database for the given time range
                long durationInMillis = durationInSeconds * 1000;
                long endTimeInMillis = System.currentTimeMillis();
                long startTimeInMillis = endTimeInMillis - durationInMillis;
                Map<String, List<RequestInfo>> requestInfos =
                        metricsStorage.getRequestInfos(startTimeInMillis, endTimeInMillis);
                Map<String, RequestStat> stats = new HashMap<>();
                for (Map.Entry<String, List<RequestInfo>> entry : requestInfos.entrySet()) {
                    String apiName = entry.getKey();
                    List<RequestInfo> requestInfosPerApi = entry.getValue();
                    // logic 2: compute the statistics from the raw data
                    RequestStat requestStat = Aggregator.aggregate(requestInfosPerApi, durationInMillis);
                    stats.put(apiName, requestStat);
                }
                // logic 3: display the statistics on the terminal (command line or email)
                System.out.println("Time Span: [" + startTimeInMillis + ", " + endTimeInMillis + "]");
                Gson gson = new Gson();
                System.out.println(gson.toJson(stats));
            }
        }, 0, periodInSeconds, TimeUnit.SECONDS);
    }
}

public class EmailReporter {
    private static final Long DAY_HOURS_IN_SECONDS = 86400L;
    private MetricsStorage metricsStorage;
    private EmailSender emailSender;
    private List<String> toAddresses = new ArrayList<>();

    public EmailReporter(MetricsStorage metricsStorage) {
        this(metricsStorage, new EmailSender(/* parameters omitted */));
    }

    public EmailReporter(MetricsStorage metricsStorage, EmailSender emailSender) {
        this.metricsStorage = metricsStorage;
        this.emailSender = emailSender;
    }

    public void addToAddress(String address) {
        toAddresses.add(address);
    }

    public void startDailyReport() {
        // first trigger: midnight of the next day
        Calendar calendar = Calendar.getInstance();
        calendar.add(Calendar.DATE, 1);
        calendar.set(Calendar.HOUR_OF_DAY, 0);
        calendar.set(Calendar.MINUTE, 0);
        calendar.set(Calendar.SECOND, 0);
        calendar.set(Calendar.MILLISECOND, 0);
        Date firstTime = calendar.getTime();
        Timer timer = new Timer();
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                long durationInMillis = DAY_HOURS_IN_SECONDS * 1000;
                long endTimeInMillis = System.currentTimeMillis();
                long startTimeInMillis = endTimeInMillis - durationInMillis;
                Map<String, List<RequestInfo>> requestInfos =
                        metricsStorage.getRequestInfos(startTimeInMillis, endTimeInMillis);
                Map<String, RequestStat> stats = new HashMap<>();
                for (Map.Entry<String, List<RequestInfo>> entry : requestInfos.entrySet()) {
                    String apiName = entry.getKey();
                    List<RequestInfo> requestInfosPerApi = entry.getValue();
                    RequestStat requestStat = Aggregator.aggregate(requestInfosPerApi, durationInMillis);
                    stats.put(apiName, requestStat);
                }
                // TODO: format as HTML and send by email
            }
        }, firstTime, DAY_HOURS_IN_SECONDS * 1000);
    }
}
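The Calendar arithmetic in startDailyReport() computes the first trigger time: midnight of the following day. The same calculation can be sketched more directly with java.time (shown here as an alternative for clarity; the class name is an assumption, not the book's code):

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Alternative sketch of EmailReporter's first-trigger computation using java.time.
public class NextMidnight {
    // Returns how many milliseconds remain from `now` until the next midnight.
    public static long millisUntilNextMidnight(LocalDateTime now) {
        LocalDateTime nextMidnight = now.toLocalDate().plusDays(1).atStartOfDay();
        return Duration.between(now, nextMidnight).toMillis();
    }
}
```

Scheduling the first run after this delay, then repeating every 24 hours, reproduces the Calendar/Timer behavior above.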

11.4.3 Assemble classes and provide execution entry

Because this framework is somewhat special, there are two execution entrances: one is the MetricsCollector class, which provides a set of APIs to collect raw data; the other is the ConsoleReporter class and EmailReporter class, which are used to trigger statistical display. The specific usage of the framework is as follows:

public class Demo {
    public static void main(String[] args) {
        MetricsStorage storage = new RedisMetricsStorage();
        ConsoleReporter consoleReporter = new ConsoleReporter(storage);
        consoleReporter.startRepeatedReport(60, 60);

        EmailReporter emailReporter = new EmailReporter(storage);
        emailReporter.addToAddress("[email protected]");
        emailReporter.startDailyReport();

        MetricsCollector collector = new MetricsCollector(storage);
        collector.recordRequest(new RequestInfo("register", 123, 10234));
        collector.recordRequest(new RequestInfo("register", 223, 11234));
        collector.recordRequest(new RequestInfo("register", 323, 12334));
        collector.recordRequest(new RequestInfo("login", 23, 12434));
        collector.recordRequest(new RequestInfo("login", 1223, 14234));
        try {
            Thread.sleep(100000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

11.4.4 Review design and implementation

Earlier articles covered design principles such as SOLID, KISS, DRY, YAGNI, and LOD, as well as design ideas such as programming to an interface rather than an implementation, preferring composition over inheritance, and high cohesion with low coupling. Now let's examine whether the code above conforms to these principles and ideas

1. MetricsCollector

MetricsCollector is responsible for collecting and saving data, so its responsibility is reasonably single. It programs to an interface rather than an implementation: the MetricsStorage object is passed in through dependency injection, so different storage methods can be swapped in flexibly without modifying its code, which satisfies the open-closed principle

2. MetricsStorage and RedisMetricsStorage

The design of MetricsStorage and RedisMetricsStorage is fairly simple. To support a new storage backend, you only need to implement the MetricsStorage interface. Because every place that uses the storage is programmed against the MetricsStorage interface, nothing needs to change except the assembly code (replacing RedisMetricsStorage with the new implementation class); all other calls through the interface stay untouched, which satisfies the open-closed principle.
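The point about the open-closed principle can be sketched as follows. The real MetricsStorage interface and RequestInfo class are not shown in this section, so simplified, assumed versions are used here; the in-memory backend is a hypothetical stand-in for a "new storage method":

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the framework's RequestInfo (assumed fields).
class RequestInfo {
    private final String apiName;
    private final double responseTime;
    private final long timestamp;

    RequestInfo(String apiName, double responseTime, long timestamp) {
        this.apiName = apiName;
        this.responseTime = responseTime;
        this.timestamp = timestamp;
    }

    String getApiName() { return apiName; }
    long getTimestamp() { return timestamp; }
    double getResponseTime() { return responseTime; }
}

// Simplified stand-in for the framework's MetricsStorage interface.
interface MetricsStorage {
    void saveRequestInfo(RequestInfo info);
    List<RequestInfo> getRequestInfos(String apiName, long startTime, long endTime);
}

// A new backend only needs to implement the interface; code programmed
// against MetricsStorage does not change at all.
class InMemoryMetricsStorage implements MetricsStorage {
    private final List<RequestInfo> records = new ArrayList<>();

    @Override
    public void saveRequestInfo(RequestInfo info) {
        records.add(info);
    }

    @Override
    public List<RequestInfo> getRequestInfos(String apiName, long startTime, long endTime) {
        List<RequestInfo> result = new ArrayList<>();
        for (RequestInfo r : records) {
            if (r.getApiName().equals(apiName)
                    && r.getTimestamp() >= startTime && r.getTimestamp() <= endTime) {
                result.add(r);
            }
        }
        return result;
    }
}

public class StorageDemo {
    public static void main(String[] args) {
        // Only this assembly line changes when the backend is swapped.
        MetricsStorage storage = new InMemoryMetricsStorage();
        storage.saveRequestInfo(new RequestInfo("register", 123, 10234));
        storage.saveRequestInfo(new RequestInfo("login", 23, 12434));
        System.out.println(storage.getRequestInfos("register", 0, 20000).size());
    }
}
```

Swapping backends is then a one-line change in the assembly code, which is exactly the "open for extension, closed for modification" behavior described above.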

3. Aggregator

The Aggregator class is a utility class containing only one static function of roughly 50 lines, responsible for computing the various statistics. Whenever a new statistic needs to be supported, the aggregate() function has to be modified, and as statistics accumulate, the function keeps growing and its readability and maintainability deteriorate. So from this analysis, the class likely has problems: its responsibility is not single enough and it is hard to extend. Its structure will need to be optimized in later versions.
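One possible direction for that optimization (a hedged sketch with hypothetical names, not the book's actual refactoring) is to give each statistic its own small unit, so that adding a statistic means adding a class or lambda instead of editing an ever-growing aggregate() function:

```java
import java.util.List;

// Hypothetical interface: one implementation per statistic, so new
// statistics extend the system rather than modify existing code.
interface StatCalculator {
    double calculate(List<Double> responseTimes);
}

public class AggregatorDemo {
    public static void main(String[] args) {
        List<Double> times = List.of(10.0, 20.0, 30.0);

        // Each statistic lives in its own small, independently testable unit.
        StatCalculator max = ts -> ts.stream().mapToDouble(Double::doubleValue).max().orElse(0);
        StatCalculator avg = ts -> ts.stream().mapToDouble(Double::doubleValue).average().orElse(0);

        System.out.println(max.calculate(times));
        System.out.println(avg.calculate(times));
    }
}
```

With this shape, the responsibility of each calculator is single, and the growth problem described above disappears: the set of calculators grows, but no existing function does.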

4. ConsoleReporter and EmailReporter

ConsoleReporter and EmailReporter contain duplicated code: the logic for fetching data from storage and computing statistics is the same in both classes, and it should be extracted and reused, otherwise the DRY principle is violated. Moreover, each class is responsible for too many things, so its responsibilities are not single enough. The display part in particular may become complex (for example, how an email report is rendered), so it is best to split the display logic into an independent class. In addition, because the code involves thread operations and calls the static function of Aggregator, its testability is poor.
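The duplication could be removed by pulling the shared "fetch, aggregate, then display" skeleton into a common base class, with only the display step varying per subclass. This is a minimal sketch with simplified, assumed types (the real reporters fetch from MetricsStorage and call Aggregator; here a trivial in-line aggregation stands in for both):

```java
import java.util.List;

// Simplified stand-in for the framework's statistics result type.
class RequestStat {
    final double maxResponseTime;
    final long count;

    RequestStat(double maxResponseTime, long count) {
        this.maxResponseTime = maxResponseTime;
        this.count = count;
    }
}

// Shared skeleton: the fetch-and-aggregate logic lives in one place;
// subclasses only decide HOW the stats are rendered.
abstract class ScheduledReporter {
    protected void doStatAndReport(List<Double> responseTimes) {
        // Shared logic (simplified stand-in for fetching from storage
        // and calling Aggregator).
        double max = 0;
        for (double t : responseTimes) {
            max = Math.max(max, t);
        }
        display(new RequestStat(max, responseTimes.size()));
    }

    protected abstract void display(RequestStat stat);
}

class ConsoleReporter extends ScheduledReporter {
    @Override
    protected void display(RequestStat stat) {
        System.out.println("console: max=" + stat.maxResponseTime + " count=" + stat.count);
    }
}

public class ReporterDemo {
    public static void main(String[] args) {
        new ConsoleReporter().doStatAndReport(List.of(10.0, 25.0, 7.5));
    }
}
```

An EmailReporter would then also extend ScheduledReporter and override only display(), so the duplicated fetch-and-aggregate logic exists in exactly one place.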

There are still many problems in the implementation given here; they will be optimized step by step later to show the whole design-evolution process, which is far more instructive than presenting the final optimal solution directly. In fact, excellent code is refactored into being, and complex code accumulates gradually.

Origin blog.csdn.net/ACE_U_005A/article/details/127412458