Summary of the Beauty of Design Patterns (Refactoring)


title: Summary of the Beauty of Design Patterns (Refactoring)
date: 2022-10-27 17:31:42
tags:

  • Design pattern

categories:

  • Design pattern

cover: https://cover.png
feature: false



See the first two articles:

1 Overview

1.1 The purpose of refactoring: why refactor (why)?

Software design guru Martin Fowler defines refactoring as follows: "Refactoring is an improvement to the internal structure of software, with the purpose of making it easier to understand and less expensive to modify without changing the visible behavior of the software."

In fact, many books cite this definition when discussing refactoring. One point in it is worth emphasizing: "refactoring does not change the externally visible behavior". Refactoring can be understood as, under the premise of keeping the functionality unchanged, using design ideas, principles, patterns, coding conventions, and other theory to optimize the code, fix design flaws, and improve code quality

Why Code Refactoring?

1. Refactoring is an extremely effective means to ensure code quality at all times

The project keeps evolving and code keeps piling up. If no one takes responsibility for code quality, the code will always drift toward greater and greater chaos. When the chaos reaches a certain level, quantitative change becomes qualitative change: the cost of maintaining the project exceeds the cost of rewriting it from scratch, and by then no one is able to refactor it even if they want to

Excellent code or architecture is not fully designed at the beginning, just as excellent companies and products are also iterated. We cannot foresee future requirements 100%, and we do not have enough energy, time, and resources to pay for the distant future in advance. Therefore, as the system evolves, refactoring the code is inevitable

2. Refactoring is an effective means to avoid over-design

Refactor the code only when you actually run into problems while maintaining it. This effectively avoids spending too much time on over-design up front and keeps the design targeted at real needs

3. Refactoring is also of great significance to the growth of an engineer's own technology

Refactoring is an application of classic design ideas, design principles, design patterns, and coding conventions. It is a very good scenario for putting this theoretical knowledge into practice and exercises our ability to use it proficiently

Beyond that, refactoring ability is also an effective measure of an engineer's coding ability. As the saying goes, "junior engineers maintain code, senior engineers design code, and expert engineers refactor code." It means that junior engineers fix bugs and add features within an existing code framework; senior engineers design the code structure and build the framework from scratch; and expert engineers take responsibility for code quality, spotting problems in the code, refactoring it, and keeping the code quality in a controllable state at all times (of course, junior, senior, and expert here are relative concepts, not specific job grades)

1.2 The object of refactoring: what exactly to refactor (what)?

According to its scale, refactoring can be roughly divided into large-scale, high-level refactoring (hereinafter "large refactoring") and small-scale, low-level refactoring (hereinafter "small refactoring")

Large refactoring refers to refactoring of the top-level code design, including systems, modules, code structure, and the relationships between classes. The means include layering, modularization, decoupling, abstracting reusable components, and so on. The tools for this kind of refactoring are the design ideas, principles, and patterns covered earlier. This kind of refactoring touches a lot of code and has a large impact, so it is difficult, takes a long time, and carries a relatively high risk of introducing bugs

Small refactoring refers to refactoring of code details, mainly at the level of classes, functions, and variables, such as standardizing names and comments, eliminating oversized classes or functions, extracting duplicated code, and so on. Small refactoring relies more on coding conventions. The changes involved are concentrated and simple, highly operable, short in duration, and the risk of introducing bugs is relatively small

1.3 The timing of refactoring: when to refactor (when)?

Having clarified why to refactor and what to refactor, let's look at when to refactor. Should we wait until the code has rotted to a certain degree? Of course not. By the time the code is truly rotten, with symptoms like "low development efficiency, lots of new hires, overtime every day yet little output, frequent online bugs, furious leadership, helpless middle managers, constantly complaining engineers, and bugs that are hard to track down," refactoring can basically no longer solve the problem

Personally, I am strongly opposed to ignoring code quality, piling up bad code, and only refactoring or even rewriting when it can no longer be maintained. Sometimes the project code base is so large that a thorough refactoring is difficult, and the result is a half-finished "neither-fish-nor-fowl" monster, which is even more troublesome! So hoping that a concentrated refactoring after the code has rotted will solve all problems is unrealistic; a sustainable, evolvable approach must be found

The recommended strategy is continuous refactoring. When you have spare time, look for code in the project that is poorly written and can be optimized, and refactor it proactively. Or, when modifying or adding a feature, take the opportunity to refactor the poorly designed code that does not meet the coding conventions. In short, just as unit testing and code review are part of development, if continuous refactoring also becomes part of development and a development habit, it will be very beneficial to both the project and yourself

Although refactoring ability is important, the awareness of continuous refactoring is even more important. We need a correct view of code quality and refactoring: technology keeps updating, requirements keep changing, people come and go, code quality will always decline, code will never be perfect, and refactoring will never stop. Only with the awareness of continuous refactoring can we avoid over-design in the early stage of development and also avoid quality decay during maintenance

1.4 Refactoring method: how to refactor (how)?

According to the scale of refactoring, refactoring can be broadly divided into large refactoring and small refactoring. For these two different scales of refactoring, treat them differently:

For large refactoring, because many modules and much code are involved, if the project's code quality is poor and coupling is severe, a change often ripples through the whole system. A refactoring you expected to finish in a day turns out to get messier the more you change, and cannot be finished in a week or two. Meanwhile new feature development conflicts with the refactoring, so you give up halfway, revert all the changes, and go back to piling up bad code in frustration

When carrying out large refactoring, make a complete refactoring plan in advance and proceed in orderly stages. In each stage, complete the refactoring of a small portion of the code, then commit, test, and run it; only after no problems are found do you proceed to the next stage, ensuring that the code in the repository is always in a runnable, logically correct state. At each stage, control the scope of code affected by the refactoring, consider how to stay compatible with the old code logic, and write transitional compatibility code when necessary. Only then can each stage be kept short (preferably completable within a day) and avoid conflicting with new feature development

Large-scale, high-level refactoring must be organized, planned, and very prudent, and should be led by experienced colleagues who are familiar with the business. For small-scale, low-level refactoring, because the scope of influence is small and the changes take little time, you can do it whenever you want and have the time. Besides finding low-level quality problems manually, you can also use mature static code analysis tools (such as CheckStyle, FindBugs, PMD) to automatically find problems in the code, and then refactor and optimize in a targeted way

Senior engineers and project leaders should take responsibility for refactoring, refactoring code whenever there is spare time and keeping code quality in a good state at all times. Otherwise, once the "broken windows effect" sets in, one person piles in some bad code and then more people pile in even worse code; after all, the cost of dumping bad code into a project is very low. That said, the best way to maintain code quality is to create a good technical atmosphere that drives everyone to care about code quality and keep refactoring

2. Unit Testing

Many programmers agree with the practice of refactoring and, facing the bad code in a project, also want to refactor it, but they worry that problems will appear after the refactoring and their effort will be thankless. Indeed, if the code to be refactored was written by other colleagues and you are not particularly familiar with it, the risk of introducing bugs is quite high without any safety net

So how do we make sure refactoring does not go wrong? You need to be proficient in the various design principles, ideas, and patterns, and understand the business and the code being refactored well enough. Beyond these personal factors, the most practical and effective way to ensure refactoring does not go wrong is **unit testing**. If the new code still passes the unit tests after the refactoring is complete, it means the correctness of the code's original logic has not been broken and the original externally visible behavior has not changed, which satisfies the definition of refactoring

2.1 What is unit testing?

Unit tests are written by development engineers themselves to test the correctness of the code they write. Unit testing is often compared with integration testing. Compared with integration testing, unit testing has a much finer granularity. The object of an integration test is the whole system or a functional module, for example testing whether user registration and login work correctly; this is end-to-end testing. The object of a unit test is a class or a function, checking whether that class or function executes according to the expected logic; this is code-level testing. For example:

public class Text {

    private String content;

    public Text(String content) {
        this.content = content;
    }

    /**
     * Converts the string to a number, ignoring leading and trailing spaces.
     * Returns null if the string contains non-numeric characters other than
     * leading and trailing spaces.
     */
    public Integer toNumber() {
        if (content == null || content.isEmpty()) {
            return null;
        }
        //... implementation omitted ...
        return null;
    }
}

How should we write unit tests to verify the correctness of the toNumber() function?

In fact, writing unit tests does not require any advanced technique. It is more a test of how thorough the programmer is: whether they can design test cases covering the various normal and abnormal situations, so that the code runs correctly under any expected or unexpected input

To make the tests comprehensive, the following test cases need to be designed for the toNumber() function:

  • If the string contains only digits, e.g. "123", toNumber() returns the corresponding integer 123
  • If the string is empty or null, toNumber() returns null
  • If the string contains a leading or trailing space, e.g. " 123", "123 ", " 123 ", toNumber() returns the corresponding integer 123
  • If the string contains multiple leading or trailing spaces, e.g. "  123  ", toNumber() returns the corresponding integer 123
  • If the string contains non-numeric characters, e.g. "123a4" or "123 4", toNumber() returns null

Once the test cases are designed, the rest is translating them into code. The translation is very simple, as shown below (no test framework is used here):

public class Assert {

    public static void assertEquals(Integer expectedValue, Integer actualValue) {
        // compare values with equals(): == would compare Integer references
        if (actualValue == null || !actualValue.equals(expectedValue)) {
            String message = String.format(
                    "Test failed, expected: %d, actual: %d.", expectedValue, actualValue);
            System.out.println(message);
        } else {
            System.out.println("Test succeeded.");
        }
    }

    public static boolean assertNull(Integer actualValue) {
        boolean isNull = actualValue == null;
        if (isNull) {
            System.out.println("Test succeeded.");
        } else {
            System.out.println("Test failed, the value is not null: " + actualValue);
        }
        return isNull;
    }
}
public class TestCaseRunner {

    public static void main(String[] args) {
        System.out.println("Run testToNumber()");
        new TextTest().testToNumber();

        System.out.println("Run testToNumber_nullorEmpty()");
        new TextTest().testToNumber_nullorEmpty();

        System.out.println("Run testToNumber_containsLeadingAndTrailingSpaces()");
        new TextTest().testToNumber_containsLeadingAndTrailingSpaces();

        System.out.println("Run testToNumber_containsMultiLeadingAndTrailingSpaces()");
        new TextTest().testToNumber_containsMultiLeadingAndTrailingSpaces();

        System.out.println("Run testToNumber_containsInvalidCharacters()");
        new TextTest().testToNumber_containsInvalidCharacters();
    }
}

public class TextTest {

    public void testToNumber() {
        Text text = new Text("123");
        Assert.assertEquals(123, text.toNumber());
    }

    public void testToNumber_nullorEmpty() {
        Text text1 = new Text(null);
        Assert.assertNull(text1.toNumber());
        Text text2 = new Text("");
        Assert.assertNull(text2.toNumber());
    }

    public void testToNumber_containsLeadingAndTrailingSpaces() {
        Text text1 = new Text(" 123");
        Assert.assertEquals(123, text1.toNumber());
        Text text2 = new Text("123 ");
        Assert.assertEquals(123, text2.toNumber());
        Text text3 = new Text(" 123 ");
        Assert.assertEquals(123, text3.toNumber());
    }

    public void testToNumber_containsMultiLeadingAndTrailingSpaces() {
        Text text1 = new Text("  123");
        Assert.assertEquals(123, text1.toNumber());
        Text text2 = new Text("123  ");
        Assert.assertEquals(123, text2.toNumber());
        Text text3 = new Text("  123  ");
        Assert.assertEquals(123, text3.toNumber());
    }

    public void testToNumber_containsInvalidCharacters() {
        Text text1 = new Text("123a4");
        Assert.assertNull(text1.toNumber());
        Text text2 = new Text("123 4");
        Assert.assertNull(text2.toNumber());
    }
}

2.2 Why write unit tests?

Besides effectively safeguarding refactoring, unit testing is also one of the two most effective means of ensuring code quality (the other is code review). The benefits of unit testing are as follows:

  1. Unit tests can effectively find bugs in code
    Whether you can write bug-free code is one of the important criteria for judging an engineer's coding ability, and unit tests reveal many places in the code where not every case was considered
  2. Writing unit tests can expose problems in code design
    The testability of code is an important criterion for judging code quality. If it is hard to write unit tests for a piece of code, or writing them is a struggle that requires advanced features of the unit testing framework, it often means the code is not well designed: for example, dependency injection is not used, static functions or global variables are used heavily, or the code is highly coupled
  3. Unit tests are a powerful complement to integration tests
    Bugs in running programs often appear under boundary conditions and abnormal situations, such as a divisor not being checked for zero or a network timeout, and most abnormal situations are hard to simulate in a test environment. Unit tests can use mocks, having the mocked object return the exception to be simulated, to test how the code behaves in these abnormal situations
    Besides, for complex systems, integration tests cannot achieve full coverage either. Complex systems often have many modules, each with various inputs, outputs, and abnormal situations; combined, the whole system has countless test scenarios to simulate and countless test cases to design, which no test team, however strong, can enumerate exhaustively
    Although unit tests cannot completely replace integration tests, if every class and every function is guaranteed to execute as expected and there are fewer low-level bugs, the probability of problems in the assembled system is correspondingly reduced
  4. The process of writing unit tests is itself a process of refactoring code
    Writing unit tests is in fact an effective way to practice continuous refactoring. It is hard to think everything through when designing and implementing code; writing unit tests is like doing a self code review, during which you can find design problems (such as untestable code) and coding problems (such as improper handling of boundary conditions), and then refactor in a targeted way
  5. Reading unit tests helps you become familiar with code quickly
    The most effective way to read code is to understand its business background and design ideas first and then look at the code, which makes reading much easier. But programmers do not like writing documentation and comments, and most code is hard to make "self-explanatory". In the absence of documentation and comments, unit tests serve as a substitute. Unit test cases are in effect user cases, reflecting what the code does and how it is used. With unit tests, you do not need to read the code in depth to know what functionality it implements, what special situations it considers, and what boundary conditions it handles
  6. Unit testing is an improved, practical version of TDD
    Test-Driven Development (TDD) is a development model that is often mentioned but rarely put into practice. Its core guiding idea is that test cases are written before the code. However, it is quite difficult for programmers to fully accept and get used to this mode; after all, many programmers are too lazy to write unit tests at all, let alone write test cases before writing code. Unit testing is exactly a workable improvement on TDD: write the code first, then write the unit tests, then find problems from the test feedback and go back to refactor the code. This process is easier to accept and easier to implement, while retaining the advantages of TDD

2.3 How to write unit tests?

Writing unit tests is the process of designing test cases covering various inputs, exceptions, and boundary conditions for the code, and translating these test cases into code

When translating test cases into code, you can use a unit testing framework to simplify the test code. Well-known unit testing frameworks in Java include JUnit, TestNG, Spring Test, and so on. These frameworks provide common execution scaffolding (such as a TestCaseRunner for executing test cases) and utility libraries (such as various assert functions). With them, you only need to focus on writing the test cases themselves. Rewritten with JUnit, the tests look like this:

import org.junit.Assert;
import org.junit.Test;

public class TextTest {

    @Test
    public void testToNumber() {
        Text text = new Text("123");
        Assert.assertEquals(Integer.valueOf(123), text.toNumber());
    }

    @Test
    public void testToNumber_nullorEmpty() {
        Text text1 = new Text(null);
        Assert.assertNull(text1.toNumber());
        Text text2 = new Text("");
        Assert.assertNull(text2.toNumber());
    }

    @Test
    public void testToNumber_containsLeadingAndTrailingSpaces() {
        Text text1 = new Text(" 123");
        Assert.assertEquals(Integer.valueOf(123), text1.toNumber());
        Text text2 = new Text("123 ");
        Assert.assertEquals(Integer.valueOf(123), text2.toNumber());
        Text text3 = new Text(" 123 ");
        Assert.assertEquals(Integer.valueOf(123), text3.toNumber());
    }

    @Test
    public void testToNumber_containsMultiLeadingAndTrailingSpaces() {
        Text text1 = new Text("  123");
        Assert.assertEquals(Integer.valueOf(123), text1.toNumber());
        Text text2 = new Text("123  ");
        Assert.assertEquals(Integer.valueOf(123), text2.toNumber());
        Text text3 = new Text("  123  ");
        Assert.assertEquals(Integer.valueOf(123), text3.toNumber());
    }

    @Test
    public void testToNumber_containsInvalidCharacters() {
        Text text1 = new Text("123a4");
        Assert.assertNull(text1.toNumber());
        Text text2 = new Text("123 4");
        Assert.assertNull(text2.toNumber());
    }
}

2.4 Summary

2.4.1 Is it really time-consuming to write unit tests?

Although the amount of unit test code may be one to two times that of the code under test and the writing process is tedious, it is not very time-consuming. After all, there is no need to consider many design issues, test code is relatively simple to implement, and the code for different test cases often differs little, so it can be copied, pasted, and tweaked

2.4.2 Are there any requirements for the code quality of unit tests?

After all, unit test code does not run in production, and the test code for each class is relatively independent and basically does not depend on other tests. Therefore, compared with the code under test, the quality requirements for unit test code can be relaxed a bit: slightly irregular naming or a bit of duplicated code is not a problem

2.4.3 Is high coverage enough for unit testing?

Unit test coverage is a metric that is relatively easy to quantify and is often used as a criterion for how well unit tests are written. There are many off-the-shelf tools dedicated to coverage statistics, for example JaCoCo, Cobertura, Emma, and Clover. There are several ways to compute coverage: the simpler one is statement coverage, and the more advanced ones are condition coverage, decision coverage, and path coverage

No matter how advanced the coverage calculation method is, it is unreasonable to use coverage as the only criterion for measuring the quality of unit tests. In fact, it is more important to see whether the test cases cover all possible cases, especially some corner cases. For example:

public double cal(double a, double b) {
    if (b != 0) {
        return a / b;
    }
}

For the code above, a single test case such as cal(10.0, 2.0) achieves 100% coverage, but that does not mean the testing is comprehensive enough. It still needs to be considered whether the code behaves as expected when the divisor equals 0
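To cover that corner case, one more test is needed. Below is only a minimal sketch, assuming a JUnit 4 environment and a hypothetical Calculator class that contains cal() and is expected to return 0.0 when the divisor is 0 (the snippet above leaves that behavior undefined):

import org.junit.Assert;
import org.junit.Test;

public class CalculatorTest {

    @Test
    public void testCal_divisorIsZero() {
        Calculator calculator = new Calculator(); // hypothetical class holding cal()
        // writing this test forces the contract for b == 0 to be decided: return 0.0, throw, etc.
        Assert.assertEquals(0.0, calculator.cal(10.0, 0.0), 0.0001);
    }
}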

In fact, paying excessive attention to unit test coverage leads developers to write a lot of unnecessary test code just to raise the number; for example, getters and setters are so simple that there is no need to test them. From past experience, a project can go live with a unit test coverage of 60-70%. If the project has relatively high code quality requirements, the coverage requirement can be raised accordingly

2.4.4 Do I need to understand the implementation logic of the code when writing unit tests?

Unit tests should not depend on the concrete implementation logic of the function under test; they only care about what functionality the function provides. Never read the code line by line in pursuit of coverage and then write unit tests against the implementation logic. Otherwise, once the code is refactored and the implementation changes while the external behavior stays the same, the original unit tests will fail, they will no longer protect the refactoring, and the purpose of writing them in the first place will be defeated

2.4.5 How to choose a unit testing framework?

Writing unit tests itself does not require very sophisticated technology, and most unit testing frameworks can meet the need. Within a company, the framework should at least be unified within a team. If the code you write cannot be tested with the chosen unit testing framework, it is most likely because the code is not well written and its testability is poor. In that case you should refactor your own code to make it easier to test, rather than look for a more advanced unit testing framework

2.4.6 Why is unit testing difficult to implement?

Many books mention that unit testing is an effective means of ensuring refactoring does not go wrong, and many people have realized its importance. But how many projects actually have sound, high-quality unit tests? Very, very few

Writing unit tests is indeed a test of patience. In general, the amount of unit test code is greater than the code under test, sometimes several times greater. Many people feel that writing unit tests is tedious and not much of a challenge, so they are unwilling to do it. Many teams and projects execute fairly seriously when they first start practicing unit testing, but when development schedules get tight, the bar is lowered, and once the broken windows effect appears everyone simply stops writing tests. This is very common

Another situation is that, due to historical reasons, the original code has no unit tests and has already grown beyond a hundred thousand lines, making it impossible to add tests for everything retroactively. In this case, first ensure that newly written code has unit tests, and second, whenever a class is changed, add tests for it along the way if it has none. This requires engineers to have a strong sense of ownership; after all, it is hard to enforce many things only through the supervision of a leader

In addition, some people think that with a dedicated test team, writing unit tests is a waste of time and unnecessary. Programming should be a knowledge-intensive industry, yet many companies, including some large ones, turn it into a labor-intensive one: in the development process there is neither unit testing nor a code review process, or if there is, it is done half-heartedly. Code is written and submitted directly, then thrown to black-box testing; detected problems are fed back to the developers to fix, and undetected problems are left to surface in production and get patched afterwards

In such a development mode, the team often feels that there is no need to write unit tests, but if the unit tests are written well, Code Review is done well, and code quality is emphasized, the investment in black-box testing can be greatly reduced.

3. Code testability

3.1 How to write code with good testability?

As shown below, Transaction is a simplified, abstracted transaction class from an e-commerce system, used to record the status of each order transaction. The execute() function in the Transaction class is responsible for performing the transfer, moving money from the buyer's wallet to the seller's wallet. The real transfer is done by calling the WalletRpcService RPC service. The code also involves a distributed lock singleton class, DistributedLock, which prevents a Transaction from being executed concurrently and the user's money from being transferred out twice

public class Transaction {

    private String id;
    private Long buyerId;
    private Long sellerId;
    private Long productId;
    private String orderId;
    private Long createdTimestamp;
    private Double amount;
    private STATUS status;
    private String walletTransactionId;

    // ...get() methods...

    public Transaction(String preAssignedId, Long buyerId, Long sellerId, Long productId, String orderId) {
        if (preAssignedId != null && !preAssignedId.isEmpty()) {
            this.id = preAssignedId;
        } else {
            this.id = IdGenerator.generateTransactionId();
        }
        if (!this.id.startsWith("t_")) {
            this.id = "t_" + preAssignedId;
        }
        this.buyerId = buyerId;
        this.sellerId = sellerId;
        this.productId = productId;
        this.orderId = orderId;
        this.status = STATUS.TO_BE_EXECUTED;
        this.createdTimestamp = System.currentTimeMillis();
    }

    public boolean execute() throws InvalidTransactionException {
        if (buyerId == null || sellerId == null || amount < 0.0) {
            throw new InvalidTransactionException(...);
        }
        if (status == STATUS.EXECUTED) return true;
        boolean isLocked = false;
        try {
            isLocked = RedisDistributedLock.getSingletonInstance().lockTransaction(id);
            if (!isLocked) {
                return false; // locking failed, return false and let the fallback job handle it
            }
            if (status == STATUS.EXECUTED) return true; // double check
            long executionInvokedTimestamp = System.currentTimeMillis();
            if (executionInvokedTimestamp - createdTimestamp > 14 * 24 * 3600 * 1000L) { // expired after 14 days
                this.status = STATUS.EXPIRED;
                return false;
            }
            WalletRpcService walletRpcService = new WalletRpcService();
            String walletTransactionId = walletRpcService.moveMoney(id, buyerId, sellerId, amount);
            if (walletTransactionId != null) {
                this.walletTransactionId = walletTransactionId;
                this.status = STATUS.EXECUTED;
                return true;
            } else {
                this.status = STATUS.FAILED;
                return false;
            }
        } finally {
            if (isLocked) {
                RedisDistributedLock.getSingletonInstance().unlockTransaction(id);
            }
        }
    }
}

Compared with the earlier Text class, this code is much more complicated. If you were to write unit tests for it, how should you write them?

In the Transaction class, the main logic is concentrated in the execute() function, so it is the key object of testing. To cover the various normal and abnormal situations as comprehensively as possible, the following 6 test cases are designed for this function:

  1. In the normal case, the transaction executes successfully, the walletTransactionId used for reconciliation (matching the transaction against the wallet's transaction flow) is backfilled, the transaction status is set to EXECUTED, and the function returns true
  2. If buyerId or sellerId is null, or amount is less than 0, an InvalidTransactionException is thrown
  3. If the transaction has expired (createdTimestamp is more than 14 days old), the transaction status is set to EXPIRED and false is returned
  4. If the transaction has already been executed (status == EXECUTED), the transfer logic is not executed again and true is returned
  5. If the wallet (WalletRpcService) fails to transfer the money, the transaction status is set to FAILED and the function returns false
  6. If the transaction is currently being executed, it will not be executed again and the function returns false directly

The test cases are designed, and everything seems to be going smoothly so far. But in fact, when you try to turn the test cases into concrete code, you will find many places where it does not work. Of the cases above, number 2 is very simple to implement, so the focus is on 1 and 3; cases 4, 5, and 6 are similar to 3

The code for test case 1 is as follows:

public void testExecute() {
    Long buyerId = 123L;
    Long sellerId = 234L;
    Long productId = 345L;
    String orderId = "456";

    Transaction transaction = new Transaction(null, buyerId, sellerId, productId, orderId);
    boolean executedResult = transaction.execute();
    assertTrue(executedResult);
}

The execution of execute() depends on two external services: RedisDistributedLock and WalletRpcService. This causes the unit test code above to have the following problems:

  • To make this unit test runnable, a Redis service and the wallet RPC service must be set up, which is relatively expensive to build and maintain
  • It must also be guaranteed that, after the fake transaction data is sent to the wallet RPC service, the expected result is returned correctly; however, the wallet RPC service may be a third-party service (developed and maintained by another team) and is not under our control. In other words, we cannot make it return whatever data we want
  • Transaction execution communicates with Redis and the RPC service over the network, which may take a long time and affects the execution performance of the unit tests themselves
  • Network interruptions and timeouts, or the unavailability of Redis or the RPC service, all affect the execution of the unit tests

Go back to the definition of unit testing. Unit tests mainly test the correctness of the code logic written by the programmer; they are not end-to-end integration tests and do not need to test the logical correctness of the external systems the code depends on (the distributed lock, the wallet RPC service). So if the code depends on external systems or uncontrollable components, such as a database, network communication, or the file system, the code under test needs to be decoupled from them, and this decoupling technique is called "mocking". A mock replaces a real service with a "fake" one that is completely under our control and simulates whatever output data we want

So how do we mock a service? There are two main approaches: mocking by hand and mocking with a framework. Using a framework merely simplifies the code, and every framework mocks in a somewhat different way. Only manual mocking is shown here

The mock is implemented by extending the WalletRpcService class and overriding its moveMoney() function, as shown below. Through the mock, moveMoney() can return whatever data we want, entirely under our control, without any real network communication

public class MockWalletRpcServiceOne extends WalletRpcService {

    public String moveMoney(Long id, Long fromUserId, Long toUserId, Double amount) {
        return "123bac";
    }
}

public class MockWalletRpcServiceTwo extends WalletRpcService {

    public String moveMoney(Long id, Long fromUserId, Long toUserId, Double amount) {
        return null;
    }
}

Now, how do we replace the real WalletRpcService in the code with MockWalletRpcServiceOne and MockWalletRpcServiceTwo?

Because WalletRpcService is created with new inside the execute() function, it cannot be replaced dynamically. In other words, the testability of the execute() method in the Transaction class is poor, and the class needs to be refactored to make it easier to test. How should this code be refactored?

Dependency injection is the most effective means of achieving code testability. Applying dependency injection, the creation of the WalletRpcService object is inverted to the upper-level logic: it is created outside and then injected into the Transaction class. The refactored Transaction class looks like this:

public class Transaction {

    //...
    // add a member variable and its setter
    private WalletRpcService walletRpcService;

    public void setWalletRpcService(WalletRpcService walletRpcService) {
        this.walletRpcService = walletRpcService;
    }
    // ...
    public boolean execute() {
        // ...
        // delete the following line:
        // WalletRpcService walletRpcService = new WalletRpcService();
        // ...
    }
}

Now it is very easy to replace WalletRpcService with MockWalletRpcServiceOne or MockWalletRpcServiceTwo in the unit tests. The unit test for the refactored code looks like this:

public void testExecute() {
    Long buyerId = 123L;
    Long sellerId = 234L;
    Long productId = 345L;
    String orderId = "456";

    Transaction transaction = new Transaction(null, buyerId, sellerId, productId, orderId);
    // use a mock object in place of the real RPC service
    transaction.setWalletRpcService(new MockWalletRpcServiceOne());
    boolean executedResult = transaction.execute();
    assertTrue(executedResult);
    assertEquals(STATUS.EXECUTED, transaction.getStatus());
}
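As an aside, a mocking framework can produce the same kind of stub without a hand-written subclass. Below is only a sketch assuming Mockito is on the classpath; the framework is not used in the original example:

import org.mockito.Mockito;

public class TransactionMockitoTest {

    public void testExecute_withMockito() {
        // program a stub of WalletRpcService instead of writing MockWalletRpcServiceOne by hand
        WalletRpcService mockService = Mockito.mock(WalletRpcService.class);
        Mockito.when(mockService.moveMoney(Mockito.any(), Mockito.any(), Mockito.any(), Mockito.any()))
               .thenReturn("123bac");

        Transaction transaction = new Transaction(null, 123L, 234L, 345L, "456");
        transaction.setWalletRpcService(mockService);
        // the remaining assertions are the same as in the hand-written version above
    }
}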

The mocking and replacement of WalletRpcService is solved; now look at RedisDistributedLock. Its mocking and replacement are more complicated, mainly because RedisDistributedLock is a singleton class. A singleton is effectively a global variable: it cannot be mocked (its methods cannot be inherited and overridden), nor can it be replaced through dependency injection

If RedisDistributedLock were maintained by us and could be modified and refactored freely, we could change it to a non-singleton mode, or define an interface such as IDistributedLock and have RedisDistributedLock implement it. Then RedisDistributedLock could be replaced with a MockRedisDistributedLock in the same way WalletRpcService was replaced. But if RedisDistributedLock is not maintained by us and we have no right to modify that code, what then?

We can wrap the transaction-locking logic in a new class. The code is as follows:

public class TransactionLock {

    public boolean lock(String id) {
        return RedisDistributedLock.getSingletonInstance().lockTransaction(id);
    }

    public void unlock(String id) {
        RedisDistributedLock.getSingletonInstance().unlockTransaction(id);
    }
}

public class Transaction {

    //...
    private TransactionLock lock;

    public void setTransactionLock(TransactionLock lock) {
        this.lock = lock;
    }

    public boolean execute() {
        //...
        try {
            isLocked = lock.lock(id);
            //...
        } finally {
            if (isLocked) {
                lock.unlock(id);
            }
        }
        //...
    }
}

For the refactored code, the unit test is modified as follows. In this way, the logic of the real RedisDistributedLock distributed lock is isolated from the unit test code

public void testExecute() {
    Long buyerId = 123L;
    Long sellerId = 234L;
    Long productId = 345L;
    String orderId = "456";

    TransactionLock mockLock = new TransactionLock() {
        public boolean lock(String id) {
            return true;
        }

        public void unlock(String id) {
        }
    };

    Transaction transaction = new Transaction(null, buyerId, sellerId, productId, orderId);
    transaction.setWalletRpcService(new MockWalletRpcServiceOne());
    transaction.setTransactionLock(mockLock);
    boolean executedResult = transaction.execute();
    assertTrue(executedResult);
    assertEquals(STATUS.EXECUTED, transaction.getStatus());
}

At this point, test case 1 is written. Through dependency injection and mock, the unit test code does not depend on any uncontrollable external services

Now let's look at test case 3: the transaction has expired (createdTimestamp is more than 14 days old), the transaction status is set to EXPIRED, and false is returned. For this test case, let's write the code first and then analyze it

public void testExecute_with_TransactionIsExpired() {
    Long buyerId = 123L;
    Long sellerId = 234L;
    Long productId = 345L;
    String orderId = "456";

    Transaction transaction = new Transaction(null, buyerId, sellerId, productId, orderId);
    transaction.setCreatedTimestamp(System.currentTimeMillis() - 14 * 24 * 3600 * 1000L); // 14 days ago
    boolean actualResult = transaction.execute();
    assertFalse(actualResult);
    assertEquals(STATUS.EXPIRED, transaction.getStatus());
}

The code above does not seem to have any problem: set the transaction's createdTimestamp to 14 days ago, so that when the unit test runs, the transaction is guaranteed to be in the expired state. But what if the Transaction class does not expose a setter for the createdTimestamp member variable (that is, there is no setCreatedTimestamp() function)?

At this point you might say: if there is no setter for createdTimestamp, just add one! In fact, this violates the encapsulation of the class. In the design of the Transaction class, createdTimestamp is the system time obtained automatically when the transaction is created (in the constructor) and should not be modified casually. So although exposing a setter for createdTimestamp brings flexibility, it also brings uncontrollability: we cannot control whether callers will use the setter to reset createdTimestamp, and resetting it is not an expected behavior

If there is no setter for createdTimestamp, how can test case 3 be implemented? This is actually a fairly common class of problem: the code contains "pending behavior" logic related to time. The usual approach is to wrap this pending behavior. For the Transaction class, we only need to encapsulate the expiration check in an isExpired() function, as follows:

public class Transaction {

    protected boolean isExpired() {
        long executionInvokedTimestamp = System.currentTimeMillis();
        return executionInvokedTimestamp - createdTimestamp > 14 * 24 * 3600 * 1000L; // 14 days
    }

    public boolean execute() throws InvalidTransactionException {
        //...
        if (isExpired()) {
            this.status = STATUS.EXPIRED;
            return false;
        }
        //...
    }
}

For the refactored code, the code implementation of test case 3 is as follows:

public void testExecute_with_TransactionIsExpired() {
    Long buyerId = 123L;
    Long sellerId = 234L;
    Long productId = 345L;
    String orderId = "456";

    Transaction transaction = new Transaction(null, buyerId, sellerId, productId, orderId) {
        protected boolean isExpired() {
            return true;
        }
    };
    boolean actualResult = transaction.execute();
    assertFalse(actualResult);
    assertEquals(STATUS.EXPIRED, transaction.getStatus());
}

Through refactoring, the testability of the Transaction code has been improved, and all of the test cases listed earlier can now be implemented. However, the design of the Transaction constructor is still somewhat questionable:

public Transaction(String preAssignedId, Long buyerId, Long sellerId, Long productId, String orderId) {
    if (preAssignedId != null && !preAssignedId.isEmpty()) {
        this.id = preAssignedId;
    } else {
        this.id = IdGenerator.generateTransactionId();
    }
    if (!this.id.startsWith("t_")) {
        this.id = "t_" + preAssignedId;
    }
    this.buyerId = buyerId;
    this.sellerId = sellerId;
    this.productId = productId;
    this.orderId = orderId;
    this.status = STATUS.TO_BE_EXECUTED;
    this.createdTimestamp = System.currentTimeMillis();
}

The constructor does not contain only simple assignments. The assignment logic for the transaction id is a bit more complicated, and it is best to test it to ensure its correctness. To make it easier to test, the id assignment logic can be extracted into a separate function, as follows:

public Transaction(String preAssignedId, Long buyerId, Long sellerId, Long productId, String orderId) {
    //...
    fillTransactionId(preAssignedId);
    //...
}

protected void fillTransactionId(String preAssignedId) {
    if (preAssignedId != null && !preAssignedId.isEmpty()) {
        this.id = preAssignedId;
    } else {
        this.id = IdGenerator.generateTransactionId();
    }
    if (!this.id.startsWith("t_")) {
        this.id = "t_" + preAssignedId;
    }
}

At this point, step by step, Transaction has been refactored from untestable code into code with good testability. But you may still wonder: doesn't isExpired() need to be tested? The logic of isExpired() is so simple that its correctness can be verified by eye, so there is no need to write unit tests for it

In fact, code with poor testability is not well designed; it fails to follow many of the design principles and ideas mentioned earlier, such as "program to an interface, not an implementation" and the dependency inversion principle. The refactored code is not only more testable, but also follows classic design principles and ideas from the perspective of code design. This confirms that the testability of code can indirectly reflect whether the code design is reasonable. In everyday development, we should also keep asking, while writing code, whether the code we are writing is easy to unit test; this too helps us design good code

3.2 Other common Anti-Patterns

To sum up, here are the typical, common kinds of code with poor testability, often called anti-patterns:

3.2.1 Pending actions

So-called pending behavior logic means the output of the code is random or nondeterministic, for example code that depends on time or random numbers
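For instance, code that draws a random number is hard to assert on directly. A common remedy (a sketch with hypothetical names, not from the original text) is to inject the source of randomness, so a test can pass a fixed seed or a stub:

import java.util.Random;

public class Dice {

    private final Random random;

    // the source of randomness is injected, so a test can pass new Random(fixedSeed)
    // or a stubbed Random and get a deterministic, assertable result
    public Dice(Random random) {
        this.random = random;
    }

    public int roll() {
        return random.nextInt(6) + 1; // 1..6
    }
}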

3.2.2 Global variables

Global variables are a procedural programming style with various drawbacks. In fact, misusing global variables also makes unit tests hard to write, as in the following example:

RangeLimiter represents the range [-5, 5]; the position starts at 0, and the move() function moves the position. position is a static global variable. The RangeLimiterTest class contains unit tests designed for it; however, there is a big problem with them

import java.util.concurrent.atomic.AtomicInteger;

public class RangeLimiter {

    private static AtomicInteger position = new AtomicInteger(0);
    public static final int MAX_LIMIT = 5;
    public static final int MIN_LIMIT = -5;

    public boolean move(int delta) {
        int currentPos = position.addAndGet(delta);
        boolean betweenRange = (currentPos <= MAX_LIMIT) && (currentPos >= MIN_LIMIT);
        return betweenRange;
    }
}

public class RangeLimiterTest {

    public void testMove_betweenRange() {
        RangeLimiter rangeLimiter = new RangeLimiter();
        assertTrue(rangeLimiter.move(1));
        assertTrue(rangeLimiter.move(3));
        assertTrue(rangeLimiter.move(-5));
    }

    public void testMove_exceedRange() {
        RangeLimiter rangeLimiter = new RangeLimiter();
        assertFalse(rangeLimiter.move(6));
    }
}

The unit tests above may fail. Suppose the unit testing framework executes testMove_betweenRange() and then testMove_exceedRange() in order. After the first test case runs, position becomes -1; when the second runs, position becomes 5, move() returns true, and the assertFalse assertion fails. So the second test case fails

Of course, if the RangeLimiter class exposed a function to reset the position, we could reset position to 0 before each test case runs, which would solve the problem just described
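A minimal sketch of that idea, assuming JUnit 4 and a hypothetical RangeLimiter.resetPosition() method (neither appears in the original example):

import org.junit.Before;

public class RangeLimiterTest {

    @Before
    public void setUp() {
        // hypothetical reset method: puts the shared static position back to 0 before every test,
        // so the test cases no longer interfere with each other through global state
        RangeLimiter.resetPosition();
    }

    // ... testMove_betweenRange() and testMove_exceedRange() as above ...
}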

However, different unit testing frameworks may execute test cases differently: some run them sequentially, some concurrently. With concurrent execution, resetting position to 0 each time still does not help. If the two test cases run concurrently, the calls to move() in the two tests may interleave and affect each other's results

3.2.3 Static methods

Static methods, like global variables, are also a procedural programming mindset. Calling static methods in code sometimes makes the code hard to test, mainly because static methods are also hard to mock. But this depends on the situation: only when a static method takes too long to run, depends on external resources, has complex logic, or contains pending behavior do we need to mock it in unit tests. If it is just a simple static method like Math.abs(), it does not affect testability, because it does not need to be mocked
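When such a static call does need to be isolated, one common workaround (a sketch that reuses the IdGenerator.generateTransactionId() call from the earlier example purely for illustration) is to wrap it in an instance method that can be overridden or injected:

// the wrapper isolates the static call, so tests can subclass it or inject a stub
// instead of having to mock a static method
public class TransactionIdGenerator {

    public String generate() {
        return IdGenerator.generateTransactionId();
    }
}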

3.2.4 Complex inheritance

Compared with the composition relationship, the code structure of the inheritance relationship is more coupled and inflexible, and it is less easy to expand and maintain. In fact, inheritance relationships are also more difficult to test. This also confirms the correlation between code testability and code quality

If a parent class needs to mock a dependency in its unit tests, then every subclass, and every subclass of those subclasses, must also mock that dependency when writing their unit tests. For an inheritance hierarchy that is deep (vertical depth in the class diagram) and complex in structure (horizontal breadth in the class diagram), the lower subclasses may have more and more objects to mock, so when writing unit tests for them you have to mock many dependencies one by one and also read the parent class code to figure out how to mock them

If you use composition instead of inheritance to organize the relationship between classes, the structural hierarchy between classes is relatively flat. When writing unit tests, you only need to mock the objects that the class depends on.

3.2.5 Highly coupled code

If a class has heavy responsibilities and needs to rely on more than a dozen external objects to complete the work, and the code is highly coupled, then when writing unit tests, you may need to mock these more than a dozen dependent objects. This is unreasonable both from the point of view of code design and from the point of view of writing unit tests

4. Decoupling

As mentioned earlier, refactoring can be divided into large-scale high-level refactoring (referred to as "large refactoring") and small-scale low-level refactoring (referred to as "small refactoring"). Large-scale refactoring is the refactoring of top-level code designs such as systems, modules, code structures, and relationships between classes. For large-scale refactoring, one of the most effective means is "decoupling". The purpose of decoupling is to achieve high code cohesion and loose coupling

4.1 Why is "decoupling" so important?

One of the most important tasks of software design and development is dealing with complexity. Humans are limited in their ability to deal with complexity. Overly complex code is often unfriendly in terms of readability and maintainability. So how to control the complexity of the code? There are many methods, and I personally think that the most important thing is decoupling to ensure loose coupling and high cohesion of the code. If refactoring is an effective means to ensure that the code quality will not be corrupted to the point of hopelessness, then the use of decoupling method to refactor the code is an effective means to ensure that the code will not be too complicated to be uncontrollable

"High cohesion and loose coupling" is a relatively general design idea, which can not only guide the design of fine-grained classes and the relationship between classes, but also guide the design of coarse-grained systems, architectures, and modules. Compared with coding standards, it can improve the readability and maintainability of code at a higher level

Whether reading or modifying code, "high cohesion, loose coupling" lets us focus on one module or class without needing to know much about the code of other modules or classes, so our attention does not diverge too much, which reduces the difficulty of reading and modifying code. Moreover, because dependencies are simple and coupling is low, modifying the code does not ripple through the whole system; the changes stay relatively concentrated and the risk of introducing bugs is greatly reduced. At the same time, "high cohesion, loose coupling" code is also more testable: external modules or classes are easy to mock, or rarely need to be mocked at all

In addition, "high cohesion, loose coupling" code means the code structure is clear, layering and modularization are reasonable, dependencies are simple, and coupling between modules or classes is low, so the overall quality of the code is good. Even if a specific class or module is not well designed and its quality is not high, the scope of influence is very limited; you can focus on that module or class and do a corresponding small refactoring, which, with its relatively concentrated changes, is much easier than adjusting the overall code structure

4.2 Does the code need to be "decoupled"?

How to judge the degree of coupling of the code? In other words, how to judge whether the code conforms to "high cohesion and loose coupling"? In other words, how to judge whether the system needs to be decoupled and refactored?

There are many indirect measurement standards, for example, whether modifying the code will affect the whole body. In addition, there is a direct measurement standard, which is to draw the dependencies between modules and between classes, and judge whether decoupling refactoring is needed according to the complexity of the dependency graph.

If the dependencies are complex and confusing, the readability and maintainability of the code structure are definitely not very good, so it is necessary to consider whether the dependencies can be made clear and simple through decoupling. Of course, this kind of judgment still has a relatively strong subjective color, but it can be used as a reference and a means of sorting out dependencies, and it can be used together with indirect measurement standards

4.3 How to "decouple" the code?

4.3.1 Encapsulation and abstraction

Encapsulation and abstraction, as two very general design ideas, can be applied in many design scenarios, such as the design of systems, modules, libs, components, interfaces, classes, etc. Encapsulation and abstraction can effectively hide the complexity of implementation, isolate the variability of implementation, and provide stable and easy-to-use abstract interfaces for dependent modules

For example, the Unix file open() function is very simple to use, but its underlying implementation is very complicated, involving permission control, concurrency control, physical storage, and so on. By encapsulating this complexity behind an abstract open() function, the spread of code complexity is effectively controlled and the complexity stays local. In addition, because open() is defined based on abstraction rather than a concrete implementation, changing its underlying implementation does not require changing the upper-level code that depends on it, which also satisfies the "high cohesion, loose coupling" criterion mentioned earlier
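As a rough Java analogy (all names here are hypothetical, not the POSIX API), callers depend only on a simple open() method while permission checks and storage details stay hidden inside the class:

public class FileStore {

    // the simple, stable entry point that callers depend on
    public FileHandle open(String path) {
        checkPermission(path);        // complexity hidden behind the abstraction
        return loadFromStorage(path);
    }

    private void checkPermission(String path) {
        // permission control details omitted
    }

    private FileHandle loadFromStorage(String path) {
        // physical storage details omitted
        return new FileHandle(path);
    }
}

public class FileHandle {

    private final String path;

    public FileHandle(String path) {
        this.path = path;
    }
}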

4.3.2 Middle layer

Introducing an intermediate layer simplifies the dependencies between modules or classes. The figure below compares the dependencies before and after introducing a middle layer. Before the data-storage middle layer is introduced, modules A, B, and C all depend on three modules: the in-memory level-1 cache, the Redis level-2 cache, and the DB persistent storage. After the middle layer is introduced, the three modules only depend on the single data-storage module. As the figure shows, the middle layer significantly simplifies the dependencies and makes the code structure clearer

(Figure: module dependencies before and after introducing the data-storage middle layer)
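A sketch of that middle layer in code (the interface and class names are assumptions for illustration): modules A, B, and C depend only on DataStorage, which internally decides whether to hit the level-1 cache, the level-2 cache, or the DB:

// a minimal abstraction over the three storage back ends
public interface KeyValueStore {
    String get(String key);
}

public class DataStorage {

    private final KeyValueStore level1Cache;   // in-memory cache
    private final KeyValueStore level2Cache;   // Redis cache
    private final KeyValueStore db;            // persistent storage

    public DataStorage(KeyValueStore level1Cache, KeyValueStore level2Cache, KeyValueStore db) {
        this.level1Cache = level1Cache;
        this.level2Cache = level2Cache;
        this.db = db;
    }

    public String get(String key) {
        String value = level1Cache.get(key);
        if (value != null) {
            return value;
        }
        value = level2Cache.get(key);
        if (value != null) {
            return value;
        }
        return db.get(key);                    // callers never touch the back ends directly
    }
}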

In addition, during refactoring, an intermediate layer can serve as a transition, allowing development and refactoring to proceed in parallel without interfering with each other. For example, suppose an interface is badly designed and its definition needs to change, which means all code calling the interface must change too. If newly developed code also uses this interface, development conflicts with refactoring. To let the refactoring proceed in small steps, the interface change can be completed in the following four stages:

  1. Introduce an intermediate layer, wrap the old interface, and provide a new interface definition
  2. The newly developed code relies on the new interface provided by the middle layer
  3. Change the code that relies on the old interface to call the new interface
  4. After ensuring that all code calls the new interface, delete the old interface

In this way, the development workload of each stage will not be very large, and can be completed in a short period of time. The probability of refactoring and development conflicts is also reduced
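A sketch of stages 1 and 2 (the names OldService and NewService are hypothetical): the middle layer wraps the old interface and exposes the new definition, so newly developed code depends only on the new interface while old callers are migrated gradually:

// hypothetical old interface whose definition needs to change
public interface OldService {
    String query(long id);
}

// the new interface definition that all code should eventually use
public interface NewService {
    String query(String id);
}

// stage 1: the middle layer wraps the old interface and provides the new one
public class NewServiceAdapter implements NewService {

    private final OldService oldService;

    public NewServiceAdapter(OldService oldService) {
        this.oldService = oldService;
    }

    @Override
    public String query(String id) {
        return oldService.query(Long.parseLong(id)); // delegate to the old implementation
    }
}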

4.3.3 Modularity

Modularity is a common means of building complex systems. It is very useful not only in the software industry but also in construction, machinery manufacturing, and other industries. For a large, complex system, no single person can control all the details. The main reason such a system can be built and maintained is that it is divided into independent modules with different people responsible for different modules, so that, even without knowing all the details, a manager can coordinate the modules and make the whole system work effectively

Focusing on software development, the reason large software projects (such as Windows) can be developed collaboratively and in an orderly way by hundreds or thousands of people is also good modularity. Different modules communicate through APIs, coupling between modules is very low, and each small team focuses on developing an independent, highly cohesive module; in the end the modules are assembled like building blocks into a super complex system

Now focus on the code level. Reasonable module division can effectively decouple code and improve its readability and maintainability. Therefore, when writing code, you must have a sense of modularity: develop each module as an independent lib that only exposes interfaces, encapsulating internal implementation details, for other modules to use. This reduces the coupling between different modules

In fact, the idea of ​​modularization is ubiquitous, such as SOA, microservices, lib library, module division in the system, and even the design of classes and functions, all embody the idea of ​​modularization. If you go back to the source, the more essential thing of modular thinking is to divide and conquer

4.4 Other design ideas and principles

"High cohesion and loose coupling" is a very important design idea, which can effectively improve the readability and maintainability of the code, and reduce the scope of code changes caused by functional changes. In fact, in the previous chapters, this design idea has been mentioned many times. Many design principles aim to achieve "high cohesion and loose coupling" of the code, as follows:

4.4.1 Single Responsibility Principle

Cohesion and coupling are not independent. High cohesion makes the code more loosely coupled, and an important guiding principle to achieve high cohesion is the single responsibility principle. If the responsibility of a module or class is designed to be single, rather than large and comprehensive, then there will be fewer classes that depend on it and the classes it depends on, and the code coupling will be reduced accordingly

4.4.2 Interface-based rather than implementation-based programming

Interface-based rather than implementation-based programming can isolate changes from specific implementations through an intermediary layer such as interfaces. The advantage of this is that between two modules or classes that have dependencies, changes in one module or class will not affect the other module or class. In fact, this is equivalent to decoupling a strong dependency (strong coupling) into a weak dependency (weak coupling)

4.4.3 Dependency Injection

Similar to the idea of ​​programming based on interfaces rather than implementation, dependency injection also changes the strong coupling between codes into weak coupling. Although dependency injection cannot decouple two classes that should have dependencies into no dependencies, it can make the coupling relationship less tight and easy to plug and replace
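A small before/after sketch (Notifier, SmsNotifier, and OrderService are hypothetical names): instead of creating the dependency with new inside the class, it is passed in from outside, so it can be replaced with another implementation or a mock in tests.

public interface Notifier {
    void sendMessage(String message);
}

public class SmsNotifier implements Notifier {
    @Override
    public void sendMessage(String message) {
        // send an SMS ...
    }
}

// Hard-coded dependency: OrderService is strongly coupled to SmsNotifier
public class OrderService {
    private final Notifier notifier = new SmsNotifier(); // cannot be replaced without editing this class
}

// Dependency injection: the dependency is passed in, easy to plug, replace, or mock
public class OrderServiceWithDi {

    private final Notifier notifier;

    public OrderServiceWithDi(Notifier notifier) {
        this.notifier = notifier;
    }
}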

4.4.4 More use of composition and less use of inheritance

Inheritance is a strong dependency relationship. The parent class and the subclass are highly coupled, and this coupling relationship is very fragile. Every change of the parent class will affect all subclasses. On the contrary, the composition relationship is a weak dependency relationship, which is more flexible. Therefore, for code with a complex inheritance structure, using composition to replace inheritance is also an effective means of decoupling
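A small sketch of replacing inheritance with composition (the Flyable, FlyAbility, and Pigeon names are made up): the behavior is delegated to a member object rather than inherited from a base class, so it is easier to change or reuse.

public interface Flyable {
    void fly();
}

// the "fly" capability is implemented once and reused through composition
public class FlyAbility implements Flyable {
    @Override
    public void fly() {
        // default flying behavior ...
    }
}

public class Pigeon implements Flyable {

    // composition: delegate to a member instead of inheriting from a FlyableBird base class
    private final Flyable flyAbility = new FlyAbility();

    @Override
    public void fly() {
        flyAbility.fly();
    }
}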

4.4.5 Law of Demeter

The Law of Demeter says that there should be no dependencies between classes that do not need to depend on each other directly; and between classes that do have dependencies, try to depend only on the necessary interfaces. From the definition, it is obvious that the purpose of this principle is to achieve loose coupling of code
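A tiny sketch of the idea (Checkout, Customer, and Wallet are hypothetical names): instead of reaching through one object into its internals, talk only to your direct acquaintance.

public class Wallet {
    private long balanceInCents;

    public void deduct(long amountInCents) {
        balanceInCents -= amountInCents;
    }
}

public class Customer {
    private final Wallet wallet = new Wallet();

    // exposing the wallet invites callers to depend on an indirect acquaintance
    public Wallet getWallet() {
        return wallet;
    }

    // preferred: hide the wallet and expose only what callers need
    public void pay(long amountInCents) {
        wallet.deduct(amountInCents);
    }
}

public class Checkout {
    // violates the Law of Demeter: Checkout reaches through Customer into Wallet
    public void settleBad(Customer customer, long amountInCents) {
        customer.getWallet().deduct(amountInCents);
    }

    // follows the Law of Demeter: Checkout only talks to its direct dependency, Customer
    public void settle(Customer customer, long amountInCents) {
        customer.pay(amountInCents);
    }
}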

In addition to the design ideas and principles mentioned above, there are also some design patterns for decoupling dependencies, such as the observer pattern

5. Programming Specifications

There are many books about coding standards and how to write readable code. Here is a summary of the 20 coding standards that I find most useful; mastering them is the quickest way to improve code quality. They are divided into three parts: naming and comments, code style, and coding tips

5.1 Naming

From project names, module names, package names, and externally exposed interfaces, down to class names, function names, variable names, and parameter names, as long as you are doing development you cannot escape naming. The quality of naming is very important to the readability of code; it can even be said to play a decisive role. Naming ability also reflects a programmer's basic programming literacy

5.1.1 What is the most appropriate length for naming?

There are two typical styles here. The first prefers very long names: the name must be accurate and expressive, and it does not matter if it is long, so the project is full of very long class and function names. The second prefers short names and uses abbreviations as much as possible, so the project is full of names containing various abbreviations. Which of these two naming styles is more recommended?

Although long names can carry more information and express the intent more accurately and intuitively, if function and variable names are very long, the statements built from them also become very long. When the column width of the code is limited, a statement often has to be wrapped onto two lines, which actually hurts readability

In fact, the shorter the naming, the better, if it is enough to express its meaning. However, in most cases, short names are not as expressive as long ones. Therefore, many books or articles do not recommend using abbreviations when naming. But for some default and well-known words, it is recommended to use abbreviations. In this way, on the one hand, the name can be shortened, and on the other hand, it does not affect reading comprehension. For example, sec means second, str means string, num means number, and doc means document. In addition, for variables with relatively small scope, you can use relatively short names, such as temporary variables in some functions. On the contrary, for a class name with a relatively large scope, it is recommended to use a long naming method

In short, one of the principles of naming is to aim at expressing the meaning accurately. However, for the code writers, they are very clear about the logic of the code, and always feel that any name can be used to express their ideas. In fact, for colleagues who are not familiar with the code, they may not think so. Therefore, when naming, you must learn to empathize. Assuming that you are not familiar with the code, consider whether the naming is intuitive enough from the perspective of the code reader
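A small illustrative sketch of naming by scope (the names are made up): short names for short-lived local variables, longer self-explanatory names for members with a wider scope.

import java.util.List;

public class NamingExample {

    // class-level scope: a longer, self-explanatory name is worth the extra characters
    private static final int MAX_RETRY_COUNT_FOR_REMOTE_CALL = 3;

    public int countNonBlankLines(List<String> lines) {
        // function-local scope: short names such as cnt and s are acceptable here
        int cnt = 0;
        for (String s : lines) {
            if (s != null && !s.trim().isEmpty()) {
                cnt++;
            }
        }
        return cnt;
    }
}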

5.1.2 Using context to simplify naming

For example:

public class User {
    
    
    private String userName;
    private String userPassword;
    private String userAvatarUrl;
    //...
}

In the context of the User class, there is no need to repeatedly add a prefix word such as "user" in the naming of member variables, but directly name them name, password, and avatarUrl. When using these attributes, the context of the object can be used, and the meaning is clear enough. The specific code is as follows:

User user = new User();
user.getName(); // the user object provides the context

In addition to classes, function parameters can also use the function context to simplify naming, as in the following example:

public void uploadUserAvatarImageToAliyun(String userAvatarImageUri);
// simplified with the help of the function context:
public void uploadUserAvatarImageToAliyun(String imageUri);

5.1.3 Naming should be readable and searchable

What does it mean for a name to be readable? The "readable" here means not using particularly uncommon, hard-to-pronounce English words in names

I once took part in two projects, one called plateaux and the other called eyrie. From kickoff to the end, very few people could pronounce the two project names correctly, and whenever someone mentioned them in a conversation there was an awkward pause. We do not reject distinctive names, but at the very least most people should be able to tell how to read them at a glance

When writing code in an IDE, we often rely on keyword completion and search. For example, typing ".get" after an object, we expect the IDE to list all of the object's methods starting with get; or we type "Array" in the IDE search box to find the array-related classes in the JDK. Therefore, names should follow the naming conventions of the whole project. If everyone uses selectXXX for queries, do not use queryXXX; if everyone uses insertXXX to insert a record, do not use addXXX. A unified convention is very important and avoids a lot of unnecessary trouble

5.1.4 How to name interfaces and abstract classes?

For interface naming there are two common conventions. One is to add the prefix "I" for Interface, for example IUserService, with the implementation class named UserService. The other is to use no prefix, for example UserService, and add the suffix "Impl" to the implementation class, for example UserServiceImpl

For abstract classes there are also two conventions: one adds the prefix "Abstract", for example AbstractConfiguration; the other does not. In fact, either convention is fine for interfaces and abstract classes, as long as it is consistent within the project
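A quick sketch of the two interface naming conventions and the abstract class prefix mentioned above (all names are hypothetical; in a real project each public type would live in its own file):

// Convention 1: prefix the interface with "I"
public interface IUserService { }
public class UserService implements IUserService { }

// Convention 2: no prefix on the interface, "Impl" suffix on the implementation
public interface OrderService { }
public class OrderServiceImpl implements OrderService { }

// Abstract class: with or without the "Abstract" prefix, as long as the project is consistent
public abstract class AbstractConfiguration { }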

5.2 Comments

Naming is important, and comments are just as important. Many books argue that good naming can completely replace comments: if you need a comment, the name is not good enough, and you should work on the name instead of adding a comment. Personally, I think this view is a bit extreme. However good a name is, it is limited in length and cannot be fully detailed, and that is exactly where comments are a good supplement

5.2.1 What exactly should comments say?

The purpose of a comment is to make the code easier to understand. Anything that meets this requirement can go into a comment. To summarize, the content of a comment mainly covers three aspects: what it does, why, and how it does it. For example:

/**
 * (what) Bean factory to create beans.
 *
 * (why) The class likes Spring IOC framework, but is more lightweight.
 *
 * (how) Create objects from different sources sequentially:
 * user specified object > SPI > configuration > default object.
 */
public class BeansFactory {
    
    
    // ...
}

Some people believe that comments should provide information the code does not already carry, so there is no need to write "what" and "how", since both can be seen from the code; just explain "why" to show the design intent. Personally I do not fully agree, for the following three reasons:

1. Comments carry more information than code

The main purpose of naming is to explain "what it does". For example, void increaseWalletAvailableBalance(BigDecimal amount) indicates that the function increases the wallet's available balance, and boolean isValidatedPassword indicates that the variable marks whether a password is valid. If functions and variables are well named, we may indeed not need to explain "what" in the comment. But a class contains much more information, and a simple name is not comprehensive enough; in that case it is perfectly reasonable to describe "what" in the comment

2. Comments serve as summaries and documentation

There are no secrets under the code. By reading the code you can clearly see "how" it is implemented, so do we still need to describe "how" in comments? Actually, we can. In comments we can write summary notes about the implementation approach and explanations of special cases, so that readers can roughly grasp the implementation from the comments alone, which makes the code easier to read

In fact, for some complex classes or interfaces, the comment may also need to explain "how to use", with some simple quick-start examples, so that users can quickly learn how to use them without reading the code

3. Summary comments make the code structure clearer

For complex logic or long functions that are hard to refine or split into smaller functions, summary comments can make the code structure clearer and more organized, as in the following example:

public boolean isValidPassword(String password) {
    
    
        // check if password is null or empty
        if (StringUtils.isBlank(password)) {
    
    
            return false;
        }
        // check if the length of password is between 4 and 64
        int length = password.length();
        if (length < 4 || length > 64) {
    
    
            return false;
        }
        // check if password contains only lowercase characters
        if (!StringUtils.isAllLowerCase(password)) {
    
    
            return false;
        }
        // check if password contains only a~z,0~9,dot
        for (int i = 0; i < length; ++i) {
    
    
            char c = password.charAt(i);
            if (!((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '.')) {
    
    
                return false;
            }
        }
        return true;
}

5.2.2 Are more comments always better?

Too many comments and too few comments are both problems. Too many may mean the code itself is not readable enough and has to be propped up with lots of comments; excessive comments also interfere with reading the code, and their maintenance cost is high: sometimes the code is changed but the comments are not updated, which confuses readers even more. Of course, if there is not a single comment in the code, it only shows that the programmer is being lazy and should be urged to add the necessary comments

According to experience, comments must be written for classes and functions, and they should be as comprehensive and detailed as possible, while comments inside functions are relatively fewer; there we generally rely on good naming, well-extracted functions, explanatory variables, and summary comments to improve readability

5.3 Code Style (Code Style)

5.3.1 What is the appropriate size for classes and functions?

Generally speaking, a class or function should have neither too many lines of code nor too few. If a class has thousands of lines or a function has hundreds, the logic is too complex, and while reading the code you easily forget the beginning by the time you reach the end. On the contrary, if classes and functions are too small, then for the same total amount of code the number of classes and functions grows accordingly and the call relationships become more complicated; reading one piece of logic requires frequently jumping between many classes or many functions, and the reading experience is poor

How many lines of code is the most appropriate for a class or function?

It is difficult to give an exact quantitative value. Regarding the maximum limit on the number of lines of function code, there is a saying on the Internet that it should not exceed the vertical height of a display screen. For example, on my computer, if the code of a function is to be completely displayed in the IDE, the maximum number of lines of code cannot exceed 50. I think this statement is quite reasonable. Because after more than one screen, when reading the code, in order to connect the code logic before and after, you may need to scroll up and down the screen frequently. Not to mention the bad reading experience, it is also prone to errors

For the maximum number of lines in a class, it is even harder to give an exact value. An indirect criterion given earlier is: when reading the code of a class makes you feel overwhelmed, when you cannot figure out which function to use to implement a certain feature or it takes a long time to find it, or when you only need a small piece of functionality but have to pull in the whole class (which contains many functions unrelated to that feature), it means the class has too many lines of code

5.3.2 What is the most suitable length for a line of code?

In the Google Java Style Guide, a line of code is limited to at most 100 characters. Different programming languages, different standards, and different teams may set different limits, but whatever the limit is, the general principle is that a line of code should not exceed the width the IDE can display. Having to scroll horizontally to read the rest of a line is obviously bad for reading. Of course, the limit should not be too small either; otherwise slightly longer statements get wrapped onto two lines, which also hurts the cleanliness and readability of the code

5.3.3 Use blank lines to separate blocks

For a relatively long function, if it can logically be divided into several independent blocks but it is inconvenient to extract them into small functions, then besides summary comments, you can also use blank lines to separate the blocks and make the logic clearer

Besides that, between a class's member variables and its functions, between static and ordinary member variables, between functions, and even between member variables, adding blank lines makes the boundaries between these different parts of the code clearer. Writing code is like writing an article: using blank lines well makes the overall structure look clearer and more organized

5.3.4 Four-space or two-space indentation?

"Is PHP the best programming language in the world?" and "Should code be indented with four spaces or two?" are probably the two topics programmers argue about the most. As far as I know, the Java community leans toward two-space indentation and the PHP community toward four-space indentation. As for which one to use, I think it comes down to personal preference, as long as it is consistent within the project

Of course, there is another selection criterion: stay consistent with the style recommended in the industry and with well-known open-source projects. Then, when we need to copy some open-source code into the project, the imported code keeps the same style as our own code

Personally, however, I recommend two-space indentation, which saves horizontal space. Especially when the nesting is deep, the accumulated indentation easily causes a statement to be wrapped onto two lines, which hurts readability

In addition, it is worth emphasizing that whether you use two-space or four-space indentation, do not use the Tab key to indent, because the display width of a Tab differs between IDEs: some show it as four spaces, some as two. If colleagues on the same project mix space indentation and Tab indentation, some code may be displayed with two-space indentation and some with four-space indentation, making the code messy (even though most IDEs do let you configure how wide a Tab is displayed)

5.3.5 Should curly braces be placed on a new line?

Should the left curly brace start on a new line? This too is controversial. As far as I know, PHP programmers like to start a new line, and Java programmers like to put it together with the previous statement. The specific code example is as follows:

// PHP
class ClassName
{
    
    
    public function foo()
    {
    
    
        // method body
    }
}
// Java
public class ClassName {
    
    
    public void foo() {
    
    
        // method body
    }
}

Personally, I recommend putting the brace on the same line as the statement, for a reason similar to the one above: it saves lines. But putting the opening brace on a new line also has its advantages: the opening and closing braces align vertically, so which code belongs to which block is clearer at a glance

However, the same principle applies: whether the brace goes on the same line as the statement or on a new line, as long as the team is consistent and roughly in line with the industry and well-known open-source projects, there is no absolute right or wrong

5.3.6 Arrangement order of members in a class

In a Java file, first write the package the class belongs to, then list the dependencies introduced by import. In the Google Java Style Guide, imported classes are arranged in alphabetical order

In a class, member variables come before functions. Between member variables or between functions, they are arranged in the manner of "static first (static function or static member variable), then ordinary (non-static function or non-static member variable)". In addition, between member variables or functions, they will be arranged in order of scope from large to small, first write public member variables or functions, then protected, and finally private

However, in different programming languages, the arrangement of members inside a class can differ quite a bit. For example, in C++, member variables are habitually placed after the functions. Functions may also be ordered by scope from large to small. In fact, there is another common habit: put functions that call each other together, for example a public function right next to the private function it calls
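A skeleton following the ordering described above (the package and member names are made up; this is just one possible convention, not a rule):

package com.example.demo;

import java.util.List;

public class UserManager {

    public static final int DEFAULT_PAGE_SIZE = 20;   // static members first
    private static int instanceCount;

    public String name;                                // then ordinary member variables,
    protected List<String> roles;                      // ordered public -> protected -> private
    private long lastLoginTimestamp;

    public void login() { }                            // functions after member variables
    protected void refreshRoles() { }
    private void recordLoginTime() { }
}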

5.4 Programming Tips

5.4.1 Divide the code into smaller unit blocks

Most people read code by first looking at the whole and then at the details. Therefore, we must think in a modular and abstract way, be good at extracting large blocks of complex logic into classes or functions, and hide the details, so that readers do not get lost in them; this greatly improves readability. However, it is only worth extracting a class or function when the logic is relatively complex. After all, if the extracted function contains only two or three lines of code, readers still have to jump into it, which increases the reading cost

In the following example, before refactoring, the first piece of code in the invest() function that handles dates is hard to understand. After refactoring, this logic is extracted into a function named isLastDayOfMonth; from the name alone, you can tell that it determines whether today is the last day of the month. Here, extracting complex logic into a function greatly improves the readability of the code

// code before refactoring
public void invest(long userId, long financialProductId) {
    
    
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(new Date());
        calendar.set(Calendar.DATE, (calendar.get(Calendar.DATE) + 1));

        if (calendar.get(Calendar.DAY_OF_MONTH) == 1) {
    
    
            return;
        }
        //...
}

// code after refactoring: the logic is clearer after extracting the function
public void invest(long userId, long financialProductId) {
    
    
        if (isLastDayOfMonth(new Date())) {
    
    
            return;
        }
        //...
}
public boolean isLastDayOfMonth(Date date) {
    
    
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(date);
        calendar.set(Calendar.DATE, (calendar.get(Calendar.DATE) + 1));

        if (calendar.get(Calendar.DAY_OF_MONTH) == 1) {
    
    
            return true;
        }
        return false;
}

5.4.2 Avoid too many function parameters

When the function contains 3 or 4 parameters, it is still acceptable. When it is greater than or equal to 5, it feels that there are too many parameters, which will affect the readability of the code and is inconvenient to use. For the situation of too many parameters, there are generally two processing methods:

1. Consider whether the function has a single responsibility, and whether parameters can be reduced by splitting it into multiple functions

public void getUser(String username, String telephone, String email);
// split into multiple functions
public void getUserByUsername(String username);
public void getUserByTelephone(String telephone);
public void getUserByEmail(String email);

2. Encapsulate the parameters of the function into objects

public void postBlog(String title, String summary, String keywords, String content, String category, long authorId);
// encapsulate the parameters into an object
public class Blog {
    
    
    private String title;
    private String summary;
    private String keywords;
    private String content;
    private String category;
    private long authorId;
}
public void postBlog(Blog blog);

In addition, if the function is a remote interface exposed to the outside world, encapsulating the parameters into an object can also improve the compatibility of the interface. When adding new parameters to the interface, the old remote interface caller may not need to modify the code to be compatible with the new interface

5.4.3 Do not use function parameters to control logic

Do not use Boolean-type identification parameters in functions to control internal logic. When true, use this logic, and when false, use another logic. This clearly violates the Single Responsibility Principle and the Interface Segregation Principle. It is recommended to split it into two functions for better readability. For example:

public void buyCourse(long userId, long courseId, boolean isVip);
// split it into two functions
public void buyCourse(long userId, long courseId);
public void buyCourseForVip(long userId, long courseId);

However, if the function is a private function with limited scope of influence, or the two functions after splitting are often called at the same time, you can consider retaining the identification parameter as appropriate. The sample code is as follows:

// calling style after splitting into two functions
boolean isVip = false;
//... other logic omitted ...
if (isVip) {
    
    
    buyCourseForVip(userId, courseId);
} else {
    
    
    buyCourse(userId, courseId);
}
// keeping the flag parameter makes the call site more concise
boolean isVip = false;
//... other logic omitted ...
buyCourse(userId, courseId, isVip);

In addition to the case where the boolean type is used as an identification parameter to control the logic, there is also a case of "according to whether the parameter is null" to control the logic. In this case, it should also be split into multiple functions. The function responsibilities after splitting are clearer and less likely to be used incorrectly. The specific code example is as follows:

public List<Transaction> selectTransactions(Long userId, Date startDate, Date endDate) {
    
    
    if (startDate != null && endDate != null) {
    
    
        // query transactions between startDate and endDate
    }
    if (startDate != null && endDate == null) {
    
    
        // query all transactions after startDate
    }
    if (startDate == null && endDate != null) {
    
    
        // query all transactions before endDate
    }
    if (startDate == null && endDate == null) {
    
    
        // query all transactions
    }
}

// split into multiple public functions, which are clearer and easier to use
public List<Transaction> selectTransactionsBetween(Long userId, Date startDate, Date endDate) {
    
    
    return selectTransactions(userId, startDate, endDate);
}
public List<Transaction> selectTransactionsStartWith(Long userId, Date startDate) {
    
    
    return selectTransactions(userId, startDate, null);
}
public List<Transaction> selectTransactionsEndWith(Long userId, Date endDate) {
    
    
    return selectTransactions(userId, null, endDate);
}
public List<Transaction> selectAllTransactions(Long userId) {
    
    
    return selectTransactions(userId, null, null);
}
private List<Transaction> selectTransactions(Long userId, Date startDate, Date endDate) {
    
    
    // ...
}

5.4.4 Function design should have a single responsibility

When we discussed the single responsibility principle earlier, it was aimed at classes and modules. In fact, function design should also satisfy the single responsibility principle. Compared with classes and modules, functions are smaller in granularity and contain fewer lines of code, so applying the principle to functions is much less ambiguous than applying it to classes or modules: just keep each function as simple as possible, as in the following example:

public boolean checkUserIfExisting(String telephone, String username, String email) {
    
    
        if (!StringUtils.isBlank(telephone)) {
    
    
            User user = userRepo.selectUserByTelephone(telephone);
            return user != null;
        }
        if (!StringUtils.isBlank(username)) {
    
    
            User user = userRepo.selectUserByUsername(username);
            return user != null;
        }
        if (!StringUtils.isBlank(email)) {
    
    
            User user = userRepo.selectUserByEmail(email);
            return user != null;
        }
        return false;
}
// split into three functions
public boolean checkUserIfExistingByTelephone(String telephone);
public boolean checkUserIfExistingByUsername(String username);
public boolean checkUserIfExistingByEmail(String email);

5.4.5 Remove too deep nesting levels

Too deep code nesting is often caused by excessive nesting of if-else, switch-case, and for loops. Personally, it is recommended that the nesting should not exceed two layers. After more than two layers, you should think about whether you can reduce the nesting. Too deep nesting itself is more difficult to understand. In addition, too deep nesting is easy to indent the code multiple times, causing the statement inside the nest to exceed the length of one line and fold into two lines, which affects the cleanliness of the code.

The method of solving too deep nesting is relatively mature, and there are the following four common ideas:

1. Remove redundant if or else statements

// Example 1
public double calculateTotalAmount(List<Order> orders) {
        if (orders == null || orders.isEmpty()) {
            return 0.0;
        } else { // this else can be removed
            double amount = 0.0;
            for (Order order : orders) {
                if (order != null) {
                    amount += (order.getCount() * order.getPrice());
                }
            }
            return amount;
        }
}

// Example 2
public List<String> matchStrings(List<String> strList,String substr) {
    
    
        List<String> matchedStrings = new ArrayList<>();
        if (strList != null && substr != null) {
    
    
            for (String str : strList) {
    
    
                if (str != null) {
    
     // this if can be merged with the one below
                    if (str.contains(substr)) {
    
    
                        matchedStrings.add(str);
                    }
                }
            }
        }
        return matchedStrings;
}

2. Use the continue, break, and return keywords provided by the programming language to exit the nesting in advance

// code before refactoring
public List<String> matchStrings(List<String> strList,String substr) {
    
    
        List<String> matchedStrings = new ArrayList<>();
        if (strList != null && substr != null){
    
    
            for (String str : strList) {
    
    
                if (str != null && str.contains(substr)) {
    
    
                    matchedStrings.add(str);
                    // another 10 lines of code here ...
                }
            }
        }
        return matchedStrings;
}

// code after refactoring: use continue to exit early
public List<String> matchStrings(List<String> strList,String substr) {
    
    
        List<String> matchedStrings = new ArrayList<>();
        if (strList != null && substr != null){
    
    
            for (String str : strList) {
    
    
                if (str == null || !str.contains(substr)) {
    
    
                    continue;
                }
                matchedStrings.add(str);
                // another 10 lines of code here ...
            }
        }
        return matchedStrings;
}

3. Adjust the execution order to reduce nesting

// code before refactoring
public List<String> matchStrings(List<String> strList,String substr) {
    
    
        List<String> matchedStrings = new ArrayList<>();
        if (strList != null && substr != null) {
    
    
            for (String str : strList) {
    
    
                if (str != null) {
    
    
                    if (str.contains(substr)) {
    
    
                        matchedStrings.add(str);
                    }
                }
            }
        }
        return matchedStrings;
}

// code after refactoring: do the null check first, then the normal logic
public List<String> matchStrings(List<String> strList,String substr) {
    
    
        if (strList == null || substr == null) {
    
     // null check first
            return Collections.emptyList();
        }
        List<String> matchedStrings = new ArrayList<>();
        for (String str : strList) {
    
    
            if (str != null) {
    
    
                if (str.contains(substr)) {
    
    
                    matchedStrings.add(str);
                }
            }
        }
        return matchedStrings;
}

4. Encapsulate part of the nesting logic into function calls to reduce nesting

// code before refactoring
public List<String> appendSalts(List<String> passwords) {
    
    
        if (passwords == null || passwords.isEmpty()) {
    
    
            return Collections.emptyList();
        }
        List<String> passwordsWithSalt = new ArrayList<>();
        for (String password : passwords) {
    
    
            if (password == null) {
    
    
                continue;
            }
            if (password.length() < 8) {
    
    
                // ...
            } else {
    
    
                // ...
            }
        }
        return passwordsWithSalt;
}

// code after refactoring: extract part of the logic into a function
public List<String> appendSalts(List<String> passwords) {
    
    
        if (passwords == null || passwords.isEmpty()) {
    
    
            return Collections.emptyList();
        }
        List<String> passwordsWithSalt = new ArrayList<>();
        for (String password : passwords) {
    
    
            if (password == null) {
    
    
                continue;
            }
            passwordsWithSalt.add(appendSalt(password));
        }
        return passwordsWithSalt;
}
private String appendSalt(String password) {
    
    
        String passwordWithSalt = password;
        if (password.length() < 8) {
    
    
            // ...
        } else {
    
    
            // ...
        }
        return passwordWithSalt;
}

In addition, polymorphism is commonly used to replace if-else and switch-case conditionals; this approach involves changes to the code structure

5.4.6 Learn to use explanatory variables

There are two common situations where explanatory variables are used to improve the readability of code:

1. Constants instead of magic numbers

public double CalculateCircularArea(double radius) {
    
    
    return (3.1415) * radius * radius;
}
// replace the magic number with a constant
public static final Double PI = 3.1415;
public double CalculateCircularArea(double radius) {
    
    
    return PI * radius * radius;
}

2. Use explanatory variables to explain complex expressions

if (date.after(SUMMER_START) && date.before(SUMMER_END)) {
    
    
    // ...
} else {
    
    
    // ...
}

// the logic is clearer after introducing an explanatory variable
boolean isSummer = date.after(SUMMER_START)&&date.before(SUMMER_END);
if (isSummer) {
    
    
    // ...
} else {
    
    
    // ...
}

6. Through a piece of ID generator code, learn how to find code quality problems

6.1 Requirement background introduction

"ID" is short for "identifier". The concept can be seen everywhere in life and work, such as ID cards, product barcodes, QR codes, license plate numbers, and driver's license numbers. In software development, an ID is often the unique identifier of some business data, such as an order number or a database primary key, for example the ID field of an address table (which has no business meaning itself and is transparent to the business)

Assume that in the development of a back-end business system, in order to facilitate troubleshooting when a request goes wrong, logs will be printed on the critical path when writing code. After a request goes wrong, it is hoped that all the logs corresponding to the request can be searched to find the cause of the problem. In fact, in the log file, the logs of different requests will be intertwined. There is no way to correlate all logs for the same request without something to identify which logs belong to the same request

This sounds a bit like call chain tracing in microservices. However, the call chain tracking in microservices is the tracking between services, and what we need to implement now is the tracking within the service

Drawing on the implementation idea of ​​microservice call chain tracking, a unique ID can be assigned to each request and stored in the request context (Context), for example, in the local variable of the worker thread that processes the request. In the Java language, you can store the ID in the ThreadLocal of the Servlet thread, or use the MDC (Mapped Diagnostic Contexts) of the Slf4j log framework to implement (in fact, the underlying principle is also based on the ThreadLocal of the thread). Every time the log is printed, the request ID is taken out from the request context and output together with the log. In this way, all logs of the same request contain the same request ID information, and all logs of the same request can be searched by the request ID
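A minimal sketch of this idea using Slf4j's MDC (the key name "traceId", the RequestHandler class, and the handleRequest() method are assumptions for illustration; the IdGenerator is the one implemented in the next section):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class RequestHandler {

    private static final Logger logger = LoggerFactory.getLogger(RequestHandler.class);

    public void handleRequest() {
        // store the request ID in the log context of the current worker thread
        MDC.put("traceId", IdGenerator.generate());
        try {
            logger.info("start handling request");
            // ... business logic; every log printed on this thread can carry the same traceId
        } finally {
            // clear it so the ID does not leak into the next request handled by this thread
            MDC.remove("traceId");
        }
    }
}

With a conversion word such as %X{traceId} in the Logback/Log4j pattern layout, every log line printed by this thread carries the request ID, so all logs of one request can be found by searching for that ID.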

6.2 A "working" code implementation

public class IdGenerator {
    
    
    private static final Logger logger = LoggerFactory.getLogger(IdGenerator.class);

    public static String generate() {
    
    
        String id = "";
        try {
    
    
            String hostName = InetAddress.getLocalHost().getHostName();
            String[] tokens = hostName.split("\\.");
            if (tokens.length > 0) {
    
    
                hostName = tokens[tokens.length - 1];
            }
            char[] randomChars = new char[8];
            int count = 0;
            Random random = new Random();
            while (count < 8) {
    
    
                int randomAscii = random.nextInt(122);
                if (randomAscii >= 48 && randomAscii <= 57) {
    
    
                    randomChars[count] = (char)('0' + (randomAscii - 48));
                    count++;
                } else if (randomAscii >= 65 && randomAscii <= 90) {
    
    
                    randomChars[count] = (char)('A' + (randomAscii - 65));
                    count++;
                } else if (randomAscii >= 97 && randomAscii <= 122) {
    
    
                    randomChars[count] = (char)('a' + (randomAscii - 97));
                    count++;
                }
            }
            id = String.format("%s-%d-%s", hostName,
                    System.currentTimeMillis(), new String(randomChars));
        } catch (UnknownHostException e) {
    
    
            logger.warn("Failed to get the host name.", e);
        }
        return id;
    }
}

An example of an ID generated by the code above is shown below. The entire ID consists of three parts. The first part is the last field of the hostname. The second part is the current timestamp, accurate to milliseconds. The third part is an 8-bit random string containing uppercase and lowercase letters and numbers. Although the ID generated in this way is not absolutely unique and may be duplicated, in fact the probability of duplication is very low. For log tracking, a very small probability of ID duplication is perfectly acceptable

103-1577456311467-3nR3Do45
103-1577456311468-0wnuV5yw
103-1577456311468-sdrnkFxN
103-1577456311468-8lwk0BP0

6.3 How to find code quality problems?

If you look at the big picture, you can refer to the code quality evaluation criteria mentioned above to see if this code is readable, extensible, maintainable, flexible, concise, reusable, testable, etc. To implement the specific details, you can examine the code from the following aspects:

  • Is the directory setting reasonable, is the module division clear, and does the code structure satisfy "high cohesion and loose coupling"?
  • Does it follow classic design principles and design ideas (SOLID, DRY, KISS, YAGNI, LOD, etc.)?
  • Are design patterns applied properly? Is there over-engineering?
  • Is the code easily extensible? If new functionality is to be added, is it easy to implement?
  • Can the code be reused? Is it possible to reuse existing project code or class library? Is there reinvention of the wheel?
  • Is the code easy to test? Do the unit tests comprehensively cover normal and abnormal cases?
  • Is the code easy to read? Does it conform to coding standards (such as whether the naming and comments are appropriate, whether the code style is consistent, etc.)?

The above are some general concerns, which can be used as general inspection items and applied to any code refactoring. In addition, it is also necessary to pay attention to whether the code implementation meets the unique functional and non-functional requirements of the business itself. Here is a list of some of the more common issues, as follows. This list may not be comprehensive enough, and the rest needs to be analyzed specifically for specific businesses and specific codes:

  • Does the code implement the expected business requirements?
  • Is the logic correct? Are exceptions handled?
  • Are the logs printed properly? Is it convenient to debug and troubleshoot problems?
  • Is the interface easy to use? Does it support idempotence, transactions, etc.?
  • Does the code have concurrency issues? Is it thread safe?
  • Is there room for optimization in performance, for example, can SQL and algorithms be optimized?
  • Are there security holes? For example, is the input and output verification comprehensive?

Now, against the checklist above, let's see what problems the code in section 6.2 has

1. The code of IdGenerator is relatively simple, with only one class, so it does not involve directory settings, module division, code structure issues, and does not violate basic design principles such as SOLID, DRY, KISS, YAGNI, and LOD. It does not apply design patterns, so there is no problem of unreasonable use and overdesign

2. IdGenerator is designed as an implementation class instead of an interface, and the caller directly relies on the implementation instead of the interface, which violates the design idea of ​​programming based on interfaces rather than implementation. In fact, it is not a big problem to design IdGenerator as an implementation class without defining an interface. If one day the ID generation algorithm changes, you only need to directly modify the code of the implementation class. However, if two ID generation algorithms need to exist in the project at the same time, that is to say, two IdGenerator implementation classes must exist at the same time. For example, this framework needs to be used by more systems. When the system is in use, it can flexibly choose the generation algorithm it needs. At this time, you need to define IdGenerator as an interface, and define different implementation classes for different generation algorithms

3. Defining the generate() function of IdGenerator as a static function affects the testability of the code that uses it. Moreover, the implementation of generate() depends on the runtime environment (the hostname), the time function, and the random function, so the testability of generate() itself is also poor and a fairly large refactoring is needed. In addition, no unit tests were written; they need to be added during refactoring

4. Although IdGenerator contains only one function with not many lines of code, its readability is not good, especially the part that generates the random string: on the one hand there are no comments at all and the generation algorithm is hard to understand, and on the other hand the code is full of magic numbers, which seriously hurts readability. Improving the readability of this part is a key point of the refactoring

We have just evaluated the code against general, business-independent code quality concerns. Now, let's re-examine it against the functional and non-functional requirements of the business itself

1. As mentioned earlier, although the generated IDs are not absolutely unique, a small probability of ID collisions is acceptable for log tracing, so the code meets the expected business requirement. However, the logic that obtains hostName seems a bit problematic: it does not handle the case where hostName is empty. Besides, although the code handles the exception when the hostname cannot be obtained, the exception is swallowed inside IdGenerator, which just prints a warning log and does not propagate it upward. Is this exception handling appropriate?

2. The logging is appropriate: the log messages describe the problem accurately, are convenient for debugging, and there are no redundant logs. IdGenerator exposes only a single generate() interface, whose definition is simple and clear, so there is no usability problem. The generate() function involves no shared variables, so the code is thread-safe and calling generate() in a multi-threaded environment causes no concurrency problems

3. In terms of performance, ID generation does not rely on external storage and happens in memory, and logs are not printed very frequently, so the code is good enough for the current scenario. However, the hostname is fetched every time an ID is generated, and fetching the hostname is relatively time-consuming, so this part can be optimized. Also, randomAscii ranges from 0 to 122, but only three sub-ranges of values (those for 0-9, a-z, and A-Z) are usable; in the extreme case many invalid values outside these ranges are generated and the loop has to run many times to produce the random string, so the random string generation algorithm can also be optimized

Some code quality problems are not generic and cannot all be listed; they have to be analyzed against the specific business and the specific code. For this particular code, can we find any other concrete problems?

Inside the while loop of the generate() function, the code in the three if branches is very similar and the implementation is slightly over-complicated; it can be simplified further by merging the three ifs

6.4 Refactoring

When discussing system design and implementation earlier, we repeatedly emphasized proceeding step by step in small, quick iterations. Refactoring should follow the same idea: change a little at a time, and once that change is done, move on to the next round of optimization, so that each change is not too large and can be completed in a short time. Therefore, the code quality problems found above are addressed in four rounds of refactoring, as follows:

  1. Improve the readability of the code
  2. Improve the testability of the code
  3. Write complete unit tests
  4. Add comments after all the refactoring is done

6.4.1 Improving code readability

First, address the most obvious and most urgent readability problems. Specifically:

  • The hostName variable should not be reused, especially when its two uses carry different meanings
  • Extract the code that obtains hostName into a function named getLastfieldOfHostName()
  • Remove the magic numbers in the code, such as 57, 90, 97, and 122
  • Extract the random string generation code into a function named generateRandomAlphameric()
  • The three if branches in the generate() function are repetitive and over-complicated, so simplify them
  • Rename the IdGenerator class and abstract out a corresponding interface

Here we focus on the last change. For the ID generator code, there are the following three ways to name the classes. Which one is more appropriate?

(Figure: the three candidate naming schemes for the ID generator interface and implementation class)

1. Name the interface IdGenerator and the implementation class LogTraceIdGenerator. This is probably the naming that first comes to mind for many people. When naming, however, we need to consider how the two classes will be used and extended in the future, and from that perspective this naming is not reasonable

First, if we extend a new log ID generation algorithm, that is, create another implementation class, the existing implementation class is already called LogTraceIdGenerator; that name is too generic, so it is hard to give the new implementation class a parallel name

Second, you might ask: suppose there is no need to extend the log ID algorithm, but we need to extend ID generation for other businesses, for example UserIdGenerator for users and OrderIdGenerator for orders; is the first naming reasonable then? The answer is still no. The main purpose of programming based on interfaces rather than implementations is to make it easy to replace the implementation class later. Judging from their names, LogTraceIdGenerator, UserIdGenerator, and OrderIdGenerator involve completely different businesses and can never be substituted for one another. In other words, the replacement shown below would never happen in the logging-related code, so having these three classes implement the same interface is meaningless

IdGenerator idGenerator = new LogTraceIdGenerator();
// replaced with:
IdGenerator idGenerator = new UserIdGenerator();

2. Is the second naming reasonable then? The answer is also no. The interface name LogTraceIdGenerator is reasonable, but the implementation class name HostNameMillisIdGenerator exposes too many implementation details: as soon as the code changes a little, the name may have to change to match the implementation

3. The third naming method is the recommended one. In the current implementation, the generated ID is a random ID rather than an incrementally ordered one, so RandomIdGenerator is a more reasonable name; even if the internal generation algorithm changes, as long as the IDs are still random, the name does not need to change. If a new ID generation algorithm needs to be added later, for example an incrementally ordered one, it can be named SequenceIdGenerator

In fact, a better naming method is to abstract two interfaces, one is IdGenerator and the other is LogTraceIdGenerator, and LogTraceIdGenerator inherits IdGenerator. The implementation class implements the interface LogTraceIdGenerator, named as RandomIdGenerator, SequenceIdGenerator, etc. In this way, the implementation class can be reused in multiple business modules, such as the aforementioned user and order

According to the above optimization strategy, the first round of refactoring is performed on the code. The code after refactoring is as follows:

public interface IdGenerator {
    
    
    String generate();
}

public interface LogTraceIdGenerator extends IdGenerator {
    
    
}

public class RandomIdGenerator implements LogTraceIdGenerator {
    
    
    private static final Logger logger = LoggerFactory.getLogger(RandomIdGenerator.class);

    @Override
    public String generate() {
    
    
        String substrOfHostName = getLastfieldOfHostName();
        long currentTimeMillis = System.currentTimeMillis();
        String randomString = generateRandomAlphameric(8);
        String id = String.format("%s-%d-%s",
                substrOfHostName, currentTimeMillis, randomString);
        return id;
    }
    private String getLastfieldOfHostName() {
    
    
        String substrOfHostName = null;
        try {
    
    
            String hostName = InetAddress.getLocalHost().getHostName();
            String[] tokens = hostName.split("\\.");
            substrOfHostName = tokens[tokens.length - 1];
            return substrOfHostName;
        } catch (UnknownHostException e) {
    
    
            logger.warn("Failed to get the host name.", e);
        }
        return substrOfHostName;
    }
    private String generateRandomAlphameric(int length) {
    
    
        char[] randomChars = new char[length];
        int count = 0;
        Random random = new Random();
        while (count < length) {
    
    
            int maxAscii = 'z';
            int randomAscii = random.nextInt(maxAscii);
            boolean isDigit= randomAscii >= '0' && randomAscii <= '9';
            boolean isUppercase= randomAscii >= 'A' && randomAscii <= 'Z';
            boolean isLowercase= randomAscii >= 'a' && randomAscii <= 'z';
            if (isDigit|| isUppercase || isLowercase) {
    
    
                randomChars[count] = (char) (randomAscii);
                ++count;
            }
        }
        return new String(randomChars);
    }
}

// example usage
LogTraceIdGenerator logTraceIdGenerator = new RandomIdGenerator();

6.4.2 Improve code testability

Questions about code testability mainly include the following two aspects:

  1. The generate() function is defined as a static function, which affects the testability of the code that uses it
  2. The implementation of the generate() function depends on the runtime environment (the hostname), the time function, and the random function, so the testability of generate() itself is also poor

The first point was solved in the first round of refactoring: generate() was redefined as an ordinary (non-static) function in the RandomIdGenerator class. The caller can create a RandomIdGenerator object and inject it into its own code through dependency injection, which solves the testability problem caused by calling a static function

For the second point, refactoring needs to be done on the basis of the first round of refactoring. The refactored code is as follows, mainly including the following code changes:

  1. The more complex logic is stripped out of the getLastfieldOfHostName() function and defined as a new function getLastSubstrSplittedByDot(). getLastfieldOfHostName() still depends on the local hostname, but with the main logic stripped out it becomes so simple that it no longer needs to be tested; the testing focus moves to getLastSubstrSplittedByDot()
  2. The access level of generateRandomAlphameric() and getLastSubstrSplittedByDot() is set to protected, so that the two functions can be called directly on the object in unit tests
  3. Google Guava's @VisibleForTesting annotation is added to generateRandomAlphameric() and getLastSubstrSplittedByDot(). The annotation has no runtime effect; it only serves as a marker, telling others that these two functions should really have private access and that their access level was raised to protected only for testing, so they should only be called from unit tests

public class RandomIdGenerator implements LogTraceIdGenerator {
    
    
  private static final Logger logger = LoggerFactory.getLogger(RandomIdGenerator.class);

  @Override
  public String generate() {
    
    
    String substrOfHostName = getLastfieldOfHostName();
    long currentTimeMillis = System.currentTimeMillis();
    String randomString = generateRandomAlphameric(8);
    String id = String.format("%s-%d-%s",
            substrOfHostName, currentTimeMillis, randomString);
    return id;
  }

  private String getLastfieldOfHostName() {
    
    
    String substrOfHostName = null;
    try {
    
    
      String hostName = InetAddress.getLocalHost().getHostName();
      substrOfHostName = getLastSubstrSplittedByDot(hostName);
    } catch (UnknownHostException e) {
    
    
      logger.warn("Failed to get the host name.", e);
    }
    return substrOfHostName;
  }

  @VisibleForTesting
  protected String getLastSubstrSplittedByDot(String hostName) {
    
    
    String[] tokens = hostName.split("\\.");
    String substrOfHostName = tokens[tokens.length - 1];
    return substrOfHostName;
  }

  @VisibleForTesting
  protected String generateRandomAlphameric(int length) {
    
    
    char[] randomChars = new char[length];
    int count = 0;
    Random random = new Random();
    while (count < length) {
    
    
      int maxAscii = 'z';
      int randomAscii = random.nextInt(maxAscii);
      boolean isDigit= randomAscii >= '0' && randomAscii <= '9';
      boolean isUppercase= randomAscii >= 'A' && randomAscii <= 'Z';
      boolean isLowercase= randomAscii >= 'a' && randomAscii <= 'z';
      if (isDigit|| isUppercase || isLowercase) {
    
    
        randomChars[count] = (char) (randomAscii);
        ++count;
      }
    }
    return new String(randomChars);
  }
}

The Logger object that prints the log is defined as static final and created inside the class. Does this affect the testability of the code? Should the Logger object be injected into the class through dependency injection?

The reason dependency injection improves testability is mainly that it makes it easy to replace a real dependency with a mock object. And why do we mock an object at all? Because it participates in the logic execution (for example, its output is needed by later computation) but is not controllable. The Logger object, however, only has data written into it; we never read from it, it does not participate in the business logic, and it does not affect the correctness of the code, so there is no need to mock it

In addition, some value objects that are only used to store data, such as String, Map, and UserVo, do not need to be created through dependency injection. They can be created directly in the class through new

6.4.3 Writing a good unit test

After the above refactoring, the more obvious problems in the code have basically been solved. Now complete the unit tests for the code. There are 4 functions in the RandomIdGenerator class:

public String generate();
private String getLastfieldOfHostName();
@VisibleForTesting
protected String getLastSubstrSplittedByDot(String hostName);
@VisibleForTesting
protected String generateRandomAlphameric(int length);

Let's look at the latter two functions first. The logic contained in these two functions is more complicated and is the focus of the test. Moreover, in the previous step of refactoring, in order to improve the testability of the code, these two parts of the code have been isolated from uncontrollable components (local name, random function, time function). Therefore, you only need to design a complete unit test case. The specific code implementation is as follows (note that the JUnit test framework is used here):

public class RandomIdGeneratorTest {
    
    
  @Test
  public void testGetLastSubstrSplittedByDot() {
    
    
    RandomIdGenerator idGenerator = new RandomIdGenerator();
    String actualSubstr = idGenerator.getLastSubstrSplittedByDot("field1.field2.field3");
    Assert.assertEquals("field3", actualSubstr);

    actualSubstr = idGenerator.getLastSubstrSplittedByDot("field1");
    Assert.assertEquals("field1", actualSubstr);

    actualSubstr = idGenerator.getLastSubstrSplittedByDot("field1#field2#field3");
    Assert.assertEquals("field1#field2#field3", actualSubstr);
  }

  // this unit test will fail because the code does not handle the case where hostName is null or an empty string
  @Test
  public void testGetLastSubstrSplittedByDot_nullOrEmpty() {
    
    
    RandomIdGenerator idGenerator = new RandomIdGenerator();
    String actualSubstr = idGenerator.getLastSubstrSplittedByDot(null);
    Assert.assertNull(actualSubstr);

    actualSubstr = idGenerator.getLastSubstrSplittedByDot("");
    Assert.assertEquals("", actualSubstr);
  }

  @Test
  public void testGenerateRandomAlphameric() {
    
    
    RandomIdGenerator idGenerator = new RandomIdGenerator();
    String actualRandomString = idGenerator.generateRandomAlphameric(6);
    Assert.assertNotNull(actualRandomString);
    Assert.assertEquals(6, actualRandomString.length());
    for (char c : actualRandomString.toCharArray()) {
    
    
         Assert.assertTrue(('0' <= c && c <= '9') || ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z'));
    }
  }

  // this unit test will fail because the code does not handle the case where length <= 0
  @Test
  public void testGenerateRandomAlphameric_lengthEqualsOrLessThanZero() {
    
    
    RandomIdGenerator idGenerator = new RandomIdGenerator();
    String actualRandomString = idGenerator.generateRandomAlphameric(0);
    Assert.assertEquals("", actualRandomString);

    actualRandomString = idGenerator.generateRandomAlphameric(-1);
    Assert.assertNull(actualRandomString);
  }
}

Now let's look at the generate() function. It is the only public function exposed for external use. Although its logic is relatively simple, it is still best to test it. But it relies on the hostname, the random function, and the time function; how do we test it? Do we need to mock these functions?

Actually, it depends. As mentioned earlier, when writing unit tests, what we test is the behavior a function is defined to have, not its specific implementation logic, so that the tests still work after the implementation changes. So what behavior does generate() define? That is entirely up to the code author

For example, for the same implementation of generate(), there can be three different behavior definitions, corresponding to three different unit tests:

  1. If the behavior of generate() is defined as "generate a random, unique ID", then we only need to test whether the IDs produced by multiple calls to generate() are unique
  2. If the behavior is defined as "generate a unique ID that contains only digits, uppercase and lowercase letters, and dashes", then we must test not only the uniqueness of the ID but also whether the generated ID contains only those characters
  3. If the behavior is defined as "generate a unique ID in the format {hostname substr}-{timestamp}-{8-character random string}, and return null-{timestamp}-{8-character random string} when the hostname cannot be obtained", then we must test not only the uniqueness of the ID but also whether it fully complies with the format requirements

How we write unit test cases depends on how the function's behavior is defined. For the first two definitions of generate(), there is no need to mock the hostname lookup, the random function, or the time function; but for the third definition, we need to mock the hostname lookup and make it return null, to test whether the code behaves as expected
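As an illustration of the first definition, a unit test might simply call generate() many times and check that no duplicate IDs appear. This is a minimal sketch; the iteration count of 10000 is an arbitrary choice, and strictly speaking it only verifies that collisions are rare, which is what the requirement accepts:

import java.util.HashSet;
import java.util.Set;

import org.junit.Assert;
import org.junit.Test;

public class RandomIdGeneratorGenerateTest {

  @Test
  public void testGenerate_idsAreUnique() {
    RandomIdGenerator idGenerator = new RandomIdGenerator();
    Set<String> generatedIds = new HashSet<>();
    for (int i = 0; i < 10000; ++i) {
      String id = idGenerator.generate();
      Assert.assertNotNull(id);
      // Set.add() returns false if the ID was already generated before
      Assert.assertTrue(generatedIds.add(id));
    }
  }
}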

Finally, let's look at the getLastfieldOfHostName() function. It is actually not easy to test, because it calls a static function (InetAddress.getLocalHost().getHostName()) that depends on the runtime environment. However, its implementation is so simple that obvious bugs can basically be ruled out by inspection, so there is no need to write unit tests for it. After all, the purpose of writing unit tests is to reduce bugs, not to write unit tests for their own sake

Of course, if you really want to test it, there is a way. One approach is to use a more advanced testing framework. For example, PowerMock, which can mock static functions. Another way is to repackage the logic of obtaining the hostname as a new function. However, the latter method will cause the code to be too fragmented, and will also slightly affect the readability of the code. This requires you to weigh the pros and cons to make a choice

6.4.4 Adding comments

There should be neither too many nor too few comments, and they are mainly added to classes and functions. Some say that good naming can replace comments and express the meaning clearly. That may be true for variable names, but not necessarily for classes or functions: the logic in a class or function is often complex, and a name alone can hardly express clearly what it does, so comments are needed as a supplement. For example, the three behavior definitions of the generate() function mentioned above cannot be conveyed by the name and need to be written in the comments

As for how to write comments, the main thing is to state clearly: what it does, why, how it works, and how to use it; explain boundary conditions and special cases; and describe the function's inputs, outputs, and exceptions

/**
 * Id Generator that is used to generate random IDs.
 *
 * <p>
 * The IDs generated by this class are not absolutely unique,
 * but the probability of duplication is very low.
 */
public class RandomIdGenerator implements LogTraceIdGenerator {
    
    
  private static final Logger logger = LoggerFactory.getLogger(RandomIdGenerator.class);

  /**
   * Generate the random ID. The IDs may be duplicated only in extreme situation.
   *
   * @return an random ID
   */
  @Override
  public String generate() {
    
    
    //...
  }

  /**
   * Get the local hostname and
   * extract the last field of the name string splitted by delimiter '.'.
   *
   * @return the last field of hostname. Returns null if hostname is not obtained.
   */
  private String getLastfieldOfHostName() {
    
    
    //...
  }

  /**
   * Get the last field of {@hostName} splitted by delimiter '.'.
   *
   * @param hostName should not be null
   * @return the last field of {@hostName}. Returns empty string if {@hostName} is empty string.
   */
  @VisibleForTesting
  protected String getLastSubstrSplittedByDot(String hostName) {
    
    
    //...
  }

  /**
   * Generate random string which
   * only contains digits, uppercase letters and lowercase letters.
   *
   * @param length should not be less than 0
   * @return the random string. Returns empty string if {@length} is 0
   */
  @VisibleForTesting
  protected String generateRandomAlphameric(int length) {
    
    
    //...
  }
}

6.5 Exception handling

The result of a function can be divided into two categories. One is the expected result: what the function outputs under normal circumstances. The other is the unexpected result: what the function outputs under abnormal (or error) conditions. Take the function above that obtains the local hostname: under normal circumstances it returns the hostname as a string; under abnormal circumstances, when the hostname cannot be obtained, an UnknownHostException is thrown

Under normal circumstances, the type of data returned by the function is very clear, but in exceptional cases, the type of data returned by the function is very flexible, and there are many choices. In addition to the exception objects like UnknownHostException just mentioned, functions can also return error codes, NULL values, special values ​​(such as -1), empty objects (such as empty strings, empty collections), etc.

Each of these error return types has its own characteristics and applicable scenarios, and sometimes it is not easy to decide which one a function should return in an abnormal situation. For example, what should the generate() function return when it fails to obtain the host name? An exception? An empty string? A NULL value? Or some other special value (such as "null-15293834874-fd3A9KBn", where "null" indicates that the host name was not obtained)?

A function is a very important unit of code, and its error handling must always be kept in mind while writing it. Therefore, how to design what a function returns in abnormal cases is very important

Earlier, a very simple ID generator was refactored from "usable" to "easy to use". The final code may seem perfect, but if you look more carefully, there is still room to improve the way errors are handled in it

public class RandomIdGenerator implements IdGenerator {

    private static final Logger logger = LoggerFactory.getLogger(RandomIdGenerator.class);

    @Override
    public String generate() {
        String substrOfHostName = getLastFiledOfHostName();
        long currentTimeMillis = System.currentTimeMillis();
        String randomString = generateRandomAlphameric(8);
        String id = String.format("%s-%d-%s",
                substrOfHostName, currentTimeMillis, randomString);
        return id;
    }

    private String getLastFiledOfHostName() {
        String substrOfHostName = null;
        try {
            String hostName = InetAddress.getLocalHost().getHostName();
            substrOfHostName = getLastSubstrSplittedByDot(hostName);
        } catch (UnknownHostException e) {
            logger.warn("Failed to get the host name.", e);
        }
        return substrOfHostName;
    }

    @VisibleForTesting
    protected String getLastSubstrSplittedByDot(String hostName) {
        String[] tokens = hostName.split("\\.");
        String substrOfHostName = tokens[tokens.length - 1];
        return substrOfHostName;
    }

    @VisibleForTesting
    protected String generateRandomAlphameric(int length) {
        char[] randomChars = new char[length];
        int count = 0;
        Random random = new Random();
        while (count < length) {
            int maxAscii = 'z';
            int randomAscii = random.nextInt(maxAscii);
            boolean isDigit = randomAscii >= '0' && randomAscii <= '9';
            boolean isUppercase = randomAscii >= 'A' && randomAscii <= 'Z';
            boolean isLowercase = randomAscii >= 'a' && randomAscii <= 'z';
            if (isDigit || isUppercase || isLowercase) {
                randomChars[count] = (char) (randomAscii);
                ++count;
            }
        }
        return new String(randomChars);
    }
}

There are four functions in this code. Regarding their error handling, there are several questions:

  1. For the generate() function, what does it return if obtaining the host name fails? Is such a return value reasonable?
  2. For the getLastFiledOfHostName() function, should the UnknownHostException be swallowed inside the function (try-catch and log it)? Or should the exception be thrown upward? If it is thrown upward, should the UnknownHostException be thrown as-is, or wrapped into a new exception?
  3. For the getLastSubstrSplittedByDot(String hostName) function, what should it return if hostName is NULL or an empty string?
  4. For the generateRandomAlphameric(int length) function, what should it return if length is less than or equal to 0?

6.5.1 What should the function return when an error occurs?

Regarding the data types a function can return on error, four cases are summarized: error code, NULL value, empty object, and exception object

6.5.1.1 Returning an error code

C has no syntactic mechanism such as exceptions, so returning an error code is the most common error handling approach there. In relatively new programming languages such as Java and Python, exceptions are used to handle function errors in most cases, and error codes are rarely used

In C, there are two ways to return an error code: one is to use the function's return value for the error code directly and put the normal result in an output parameter; the other is to store the error code in a global variable, from which the caller reads it when the function fails. Examples of both are shown below:

// Error code style 1: pathname/flags/mode are input parameters; fd is an output parameter that stores the opened file handle.
int open(const char *pathname, int flags, mode_t mode, int* fd) {
    if (/* file does not exist */) {
        return EEXIST;
    }
    if (/* no access permission */) {
        return EACCESS;
    }
    if (/* file opened successfully */) {
        return SUCCESS; // C macro definition: #define SUCCESS 0
    }
    // ...
}
// Usage example
int fd;
int result = open("c:\\test.txt", O_RDWR, S_IRWXU|S_IRWXG|S_IRWXO, &fd);
if (result == SUCCESS) {
    // use fd
} else if (result == EEXIST) {
    //...
} else if (result == EACCESS) {
    //...
}

// Error code style 2: the function returns the opened file handle; the error code is stored in errno.
int errno; // thread-safe global variable
int open(const char *pathname, int flags, mode_t mode){
    if (/* file does not exist */) {
        errno = EEXIST;
        return -1;
    }
    if (/* no access permission */) {
        errno = EACCESS;
        return -1;
    }
    // ...
}
// Usage example
int hFile = open("c:\\test.txt", O_RDWR, S_IRWXU|S_IRWXG|S_IRWXO);
if (-1 == hFile) {
    printf("Failed to open file, error no: %d.\n", errno);
    if (errno == EEXIST) {
        // ...
    } else if (errno == EACCESS) {
        // ...
    }
    // ...
}

In fact, if the language you are working in has an exception mechanism, try not to use error codes. Compared with error codes, exceptions have many advantages; for example, they can carry more error information (a message, the stack trace, and so on)

6.5.1.2 Returning NULL values

In most programming languages, NULL is used to represent the semantics of "not existing". However, many people on the Internet do not recommend that functions return NULL values. They think this is a bad design idea. There are two main reasons:

  1. If a function may return a NULL value and the caller forgets to check for it, a null pointer exception (NPE) may be thrown
  2. If many functions may return NULL, the code becomes filled with NULL-check logic. On the one hand this is tedious to write; on the other hand, the checks are interleaved with normal business logic, which hurts the readability of the code

For example:

public class UserService {

    private UserRepo userRepo; // dependency injection

    public User getUser(String telephone) {
        // return null if the user does not exist
        return null;
    }
}

// Using the getUser() function
User user = userService.getUser("18917718965");
if (user != null) { // NULL check, otherwise an NPE may be thrown
    String email = user.getEmail();
    if (email != null) { // NULL check, otherwise an NPE may be thrown
        String escapedEmail = email.replaceAll("@", "#");
    }
}

Could the NULL value be replaced by an exception, letting the function throw a UserNotFoundException when the user being searched for does not exist? Personally, I think that although returning NULL has many drawbacks, for lookup functions whose names start with get, find, select, search, query, and so on, the absence of data is not an abnormal situation but a normal outcome. Therefore, returning a NULL value that means "does not exist" is more reasonable than throwing an exception

That said, the reasoning above is not particularly conclusive. For the case where the data being searched for does not exist, whether the function should return NULL or throw an exception, an important reference is how other similar lookup functions in the project are defined; the whole project just needs to follow one unified convention. If the project is developed from scratch and there is no convention or existing code to refer to, either choice is fine. You only need to state clearly in the function's comment what will be returned when the data does not exist, so that callers know what to expect

One more point: besides returning a data object, some lookup functions return an index, such as indexOf(), which returns the position of the first occurrence of a substring in a string. The return type of such a function is the primitive type int, so a NULL value cannot be used to represent "not found". There are two options: throw a NotFoundException, or return a special value such as -1. Here -1 is clearly more reasonable, for the same reason as before: "not found" is normal rather than abnormal behavior
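For example, Java's own String.indexOf() follows exactly this convention; a small usage sketch:

public class IndexOfExample {
    public static void main(String[] args) {
        String hostName = "node1.example.com";
        int index = hostName.indexOf("us-west"); // substring is not present
        if (index == -1) {
            // "not found" is a normal, expected outcome, signalled by the special value -1
            System.out.println("Substring not found.");
        }
    }
}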

6.5.1.3 Returning an empty object

As mentioned above, returning NULL has various drawbacks. A classic strategy for this problem is to apply the Null Object design pattern. Here are two relatively simple, special kinds of empty objects: the empty string and the empty collection

When the data returned by a function is a string or a collection, an empty string or an empty collection can be used instead of a NULL value to represent the "not found" case. This way, callers of the function no longer need to do NULL checks. Sample code is as follows:

// Use an empty collection instead of NULL
public class UserService {

    private UserRepo userRepo; // dependency injection

    public List<User> getUsers(String telephonePrefix) {
        // no data found
        return Collections.emptyList();
    }
}

// Usage example of getUsers()
List<User> users = userService.getUsers("189");
for (User user : users) { // no NULL check needed here
    // ...
}

// Use an empty string instead of NULL
public String retrieveUppercaseLetters(String text) {
    // if text contains no uppercase letters, return an empty string rather than NULL
    return "";
}

// Usage example of retrieveUppercaseLetters()
String uppercaseLetters = retrieveUppercaseLetters("wangzheng");
int length = uppercaseLetters.length(); // no NULL check needed
System.out.println("Contains " + length + " upper case letters.");

6.5.1.4 Throwing an exception object

Although several error return types have been discussed above, the most common way to handle function errors is to throw an exception. An exception can carry more error information, such as the function call stack. In addition, exceptions separate the handling of normal logic from error logic, which makes the code more readable

The exception syntax differs slightly between programming languages. C++ and most dynamic languages (Python, Ruby, JavaScript, etc.) define only one kind of exception: the runtime exception. Java, besides runtime exceptions, also defines another kind: the compile-time exception

For runtime exceptions, you do not have to try-catch them proactively; the compiler does not check whether the code handles them when compiling. In contrast, compile-time exceptions must be try-caught or declared in the function signature, otherwise compilation fails. Therefore, runtime exceptions are also called unchecked exceptions, and compile-time exceptions are also called checked exceptions
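As a minimal illustration (the readFirstLine and parsePort methods are hypothetical examples, not from the ID generator code): IOException is a checked exception and must be declared or caught, while NumberFormatException is an unchecked (runtime) exception that the compiler does not force callers to handle.

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ExceptionKinds {
    // IOException is a checked exception: it must be declared (or caught),
    // otherwise the code will not compile.
    public String readFirstLine(String path) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path))) {
            return reader.readLine();
        }
    }

    // NumberFormatException is an unchecked (runtime) exception: the compiler
    // does not force callers to declare or catch it.
    public int parsePort(String port) {
        return Integer.parseInt(port);
    }
}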

If the language you are familiar with defines only one kind of exception, things are actually simpler. If it defines two kinds (as Java does), then when an error occurs, which should you throw: a checked exception or an unchecked exception?

For code bugs (such as array out-of-bounds access) and unrecoverable errors (such as a failed database connection), not much can be done even if the exception is caught, so unchecked exceptions are preferred. For recoverable errors and business exceptions, such as a withdrawal amount larger than the balance, checked exceptions are preferred, explicitly telling the caller to catch and handle them

In the following example, when the Redis address (the address parameter) is not set, the default address (for example, the local address and default port) is used; when the Redis address is malformed, we want the program to fail fast, that is, treat it as an unrecoverable error, throw a runtime exception directly, and terminate the program

// address format: "192.131.2.33:7896"
public void parseRedisAddress(String address) {
    this.host = RedisConfig.DEFAULT_HOST;
    this.port = RedisConfig.DEFAULT_PORT;
    if (StringUtils.isBlank(address)) {
        return;
    }
    String[] ipAndPort = address.split(":");
    if (ipAndPort.length != 2) {
        throw new RuntimeException("...");
    }
    this.host = ipAndPort[0];
    // parseInt() throws the runtime exception NumberFormatException if parsing fails
    this.port = Integer.parseInt(ipAndPort[1]);
}

In fact, Java's checked exceptions have long been criticized, and many people argue that all exceptional cases should use unchecked exceptions. The main arguments are the following three:

  1. Checked exceptions must be declared explicitly in the function signature. If a function throws many checked exceptions, its signature becomes very long, which hurts readability and makes the function inconvenient to use
  2. The compiler forces every checked exception to be caught explicitly, which makes the implementation verbose. Unchecked exceptions are the opposite: they do not need to be declared, and you are free to decide whether to catch and handle them
  3. Using checked exceptions violates the open-closed principle. If a new checked exception is added to a function, every function above it on the call chain must be modified until some function on the chain try-catches the new exception. Adding an unchecked exception requires no changes along the call chain; you can flexibly choose where to handle it centrally, for example in a Spring AOP aspect

However, unchecked exceptions have drawbacks too; their advantage is precisely their disadvantage. As described above, unchecked exceptions are more flexible and leave the initiative to the programmer. As mentioned earlier, too much flexibility brings loss of control: since unchecked exceptions do not have to be declared in the function signature, you have to read the code to know which exceptions a function may throw; and since they do not have to be caught, programmers may miss exceptions that should have been handled

There is a great deal of debate online about whether to use checked or unchecked exceptions, but there is no sufficiently strong argument that one is definitively better than the other. So simply follow the team's habits and define a unified exception handling convention within a project

Having covered the two kinds of exceptions, let's look at how to handle an exception thrown by a function. In general, there are three approaches:

1. Swallow it directly

public void func1() throws Exception1 {
    // ...
}

public void func2() {
    //...
    try {
        func1();
    } catch (Exception1 e) {
        log.warn("...", e); // swallow it: try-catch and log
    }
    //...
}

2. Re-throw it as-is

public void func1() throws Exception1 {
    // ...
}

public void func2() throws Exception1 {
    //...
    try {
        func1();
    } catch (Exception1 e) {
        throw e; // re-throw Exception1 as-is
    }
    //...
}

3. Wrap it in a new exception and re-throw

public void func1() throws Exception1 {
    // ...
}

public void func2() throws Exception2 {
    //...
    try {
        func1();
    } catch (Exception1 e) {
        throw new Exception2("...", e); // wrap it in a new Exception2 and re-throw
    }
    //...
}

When faced with a function throwing an exception, which of the above processing methods should you choose? The following three reference principles are summarized here:

  1. If the exception thrown by func1() is recoverable, and the caller of func2() does not care about it, the exception thrown by func1() can simply be swallowed inside func2()
  2. If the exception thrown by func1() is understandable and relevant to the caller of func2(), and is related to it in business terms, you can re-throw the exception thrown by func1() as-is
  3. If the exception thrown by func1() is too low-level for the caller of func2() to understand, and is unrelated in business terms, repackage it into a new exception the caller can understand, and then re-throw

In short, whether to throw the exception upward depends on whether the upper-level code cares about it: throw it if it does, swallow it otherwise. Whether to wrap it in a new exception before throwing depends on whether the upper-level code can understand the exception and whether it is related to the business: if it can be understood and is business-related, throw it as-is; otherwise wrap it in a new exception and throw that

6.5.2 Refactoring the exception handling code of each function in the ID generator project

In everyday software design and development, besides making sure the logic runs correctly under normal conditions, a large amount of extra code has to be written to deal with possible abnormal situations, so that the code stays under our control in all circumstances and does not produce unexpected results. Bugs often appear in boundary conditions and abnormal situations, so the quality of exception handling directly affects the robustness of the code. Handling all kinds of exceptions comprehensively and reasonably can effectively reduce bugs and is an important means of ensuring code quality

1. Refactoring the generate() function

First, for the generate() function: if obtaining the local host name fails, what does the function return? Is such a return value reasonable?

public String generate() {
    String substrOfHostName = getLastFiledOfHostName();
    long currentTimeMillis = System.currentTimeMillis();
    String randomString = generateRandomAlphameric(8);
    String id = String.format("%s-%d-%s",
        substrOfHostName, currentTimeMillis, randomString);
    return id;
}

The ID consists of three parts: host name, timestamp, and random string. The timestamp and random-string generation will not fail, but obtaining the host name may. In the current implementation, if obtaining the host name fails and substrOfHostName is NULL, the generate() function returns data like "null-16723733647-83Ab3uK6". If obtaining the host name fails and substrOfHostName is an empty string, the generate() function returns data like "-16723733647-83Ab3uK6"

Is it reasonable to return these two special ID formats in abnormal situations? That is hard to say; it depends on how the specific business is designed. However, it is preferable to notify the caller of the abnormal situation explicitly. Therefore, it is better to throw a checked exception here instead of returning a special value

Following this idea, we refactor the generate() function. The code after refactoring looks like this:

public String generate() throws IdGenerationFailureException {
    String substrOfHostName = getLastFiledOfHostName();
    if (substrOfHostName == null || substrOfHostName.isEmpty()) {
        throw new IdGenerationFailureException("host name is empty.");
    }
    long currentTimeMillis = System.currentTimeMillis();
    String randomString = generateRandomAlphameric(8);
    String id = String.format("%s-%d-%s",
        substrOfHostName, currentTimeMillis, randomString);
    return id;
}

2. Refactoring the getLastFiledOfHostName() function

For the getLastFiledOfHostName() function, should the UnknownHostException be swallowed inside the function (try-catch and log), or should it be thrown upward? If it is thrown upward, should the UnknownHostException be thrown as-is, or wrapped into a new exception?

private String getLastFiledOfHostName() {
    String substrOfHostName = null;
    try {
        String hostName = InetAddress.getLocalHost().getHostName();
        substrOfHostName = getLastSubstrSplittedByDot(hostName);
    } catch (UnknownHostException e) {
        logger.warn("Failed to get the host name.", e);
    }
    return substrOfHostName;
}

The current handling is that when obtaining the host name fails, getLastFiledOfHostName() returns a NULL value. As mentioned earlier, whether to return a NULL value or throw an exception depends on whether failing to obtain the data is normal or abnormal behavior. Failing to obtain the host name affects the subsequent processing and is not what we expect, so it is abnormal behavior. It is better to throw an exception here rather than return NULL

As for whether to throw the UnknownHostException directly or repackage it into a new exception, that depends on whether the function is related to the exception in business terms. getLastFiledOfHostName() obtains the last field of the host name, and UnknownHostException indicates that obtaining the host name failed; the two are related, so the UnknownHostException can be thrown directly without being repackaged

Following this idea, the getLastFiledOfHostName() function is refactored. The refactored code looks like this:

private String getLastFiledOfHostName() throws UnknownHostException {
    String substrOfHostName = null;
    String hostName = InetAddress.getLocalHost().getHostName();
    substrOfHostName = getLastSubstrSplittedByDot(hostName);
    return substrOfHostName;
}

After getLastFiledOfHostName() is modified, the generate() function must be modified accordingly: it needs to catch the UnknownHostException thrown by getLastFiledOfHostName(). When this exception is caught, how should it be handled?

According to the previous analysis, when ID generation fails, the caller needs to be informed explicitly, so the UnknownHostException cannot be swallowed inside generate(). Should it be thrown as-is, or wrapped into a new exception?

The latter is chosen here. In generate(), the UnknownHostException is caught, rewrapped into a new exception IdGenerationFailureException, and thrown upward. This is done for three reasons:

  1. The caller of generate() only needs to know that it generates a random, unique ID and does not care how the ID is generated; this is programming to an abstraction rather than an implementation. If generate() threw UnknownHostException directly, it would expose implementation details
  2. From the perspective of encapsulation, it is undesirable to expose a relatively low-level exception such as UnknownHostException to the code that calls generate(). Moreover, when the caller receives this exception, it cannot understand what it means or know how to deal with it
  3. The UnknownHostException has no business relationship with the generate() function

Following this idea, the generate() function is refactored again. The refactored code looks like this:

public String generate() throws IdGenerationFailureException {
    String substrOfHostName = null;
    try {
        substrOfHostName = getLastFiledOfHostName();
    } catch (UnknownHostException e) {
        throw new IdGenerationFailureException("host name is empty.");
    }
    long currentTimeMillis = System.currentTimeMillis();
    String randomString = generateRandomAlphameric(8);
    String id = String.format("%s-%d-%s",
        substrOfHostName, currentTimeMillis, randomString);
    return id;
}

3. Refactoring the getLastSubstrSplittedByDot() function

For the getLastSubstrSplittedByDot(String hostName) function, what should it return if hostName is NULL or an empty string?

@VisibleForTesting
protected String getLastSubstrSplittedByDot(String hostName) {
    String[] tokens = hostName.split("\\.");
    String substrOfHostName = tokens[tokens.length - 1];
    return substrOfHostName;
}

Theoretically, the correctness of arguments should be guaranteed by the programmer, and there should be no need to check for NULL values or empty strings. The caller should never pass a NULL value or an empty string to getLastSubstrSplittedByDot(); if one is passed, it is a code bug that needs to be fixed. But, having said that, there is no guarantee that programmers will never pass NULL or an empty string. So should the function check for them or not?

If the function is private to the class and only called inside it, it is completely under your own control, and you can make sure you never pass a NULL value or an empty string when calling it, so there is no need for such checks in a private function. If the function is public, you have no control over who calls it and how (a colleague might carelessly pass in a NULL value; that does happen), so to make the code as robust as possible, it is best to check for NULL values and empty strings in public functions

You might point out that getLastSubstrSplittedByDot() is protected: it is neither a private nor a public function. So should it check for NULL values and empty strings?

It was made protected in order to make unit testing easier, and unit tests may need to cover corner cases such as a NULL value or an empty string as input. Therefore, it is best to add the NULL/empty-string checks here. They are somewhat redundant, but a few extra checks make the code safer

Following this idea, we refactor the getLastSubstrSplittedByDot() function. The code after refactoring looks like this:

@VisibleForTesting
protected String getLastSubstrSplittedByDot(String hostName) {
    if (hostName == null || hostName.isEmpty()) {
        throw new IllegalArgumentException("..."); // runtime exception
    }
    String[] tokens = hostName.split("\\.");
    String substrOfHostName = tokens[tokens.length - 1];
    return substrOfHostName;
}

As discussed above, the caller of this function must make sure it never passes in a NULL value or an empty string. Therefore, the code of the getLastFiledOfHostName() function should be modified accordingly. The modified code is as follows:

private String getLastFiledOfHostName() throws UnknownHostException {
    String substrOfHostName = null;
    String hostName = InetAddress.getLocalHost().getHostName();
    if (hostName == null || hostName.isEmpty()) { // check here
        throw new UnknownHostException("...");
    }
    substrOfHostName = getLastSubstrSplittedByDot(hostName);
    return substrOfHostName;
}

4. Refactoring the generateRandomAlphameric() function

For the generateRandomAlphameric(int length) function, what should it return if length < 0 or length = 0?

@VisibleForTesting
protected String generateRandomAlphameric(int length) {
    char[] randomChars = new char[length];
    int count = 0;
    Random random = new Random();
    while (count < length) {
        int maxAscii = 'z';
        int randomAscii = random.nextInt(maxAscii);
        boolean isDigit = randomAscii >= '0' && randomAscii <= '9';
        boolean isUppercase = randomAscii >= 'A' && randomAscii <= 'Z';
        boolean isLowercase = randomAscii >= 'a' && randomAscii <= 'z';
        if (isDigit || isUppercase || isLowercase) {
            randomChars[count] = (char) (randomAscii);
            ++count;
        }
    }
    return new String(randomChars);
}

Let's first look at the case of length < 0. Generating a random string with a negative length is illogical and anomalous. Therefore, when the incoming parameter length < 0, an IllegalArgumentException is thrown

Now look at the case of length = 0. Is length = 0 abnormal behavior? That depends on how you define it. You can define it as abnormal and throw an IllegalArgumentException, or define it as normal and let the function return an empty string when length = 0. Whichever you choose, the most important point is to state clearly in the function's comment what will be returned when length = 0

RandomIdGenerator code after refactoring

public class RandomIdGenerator implements IdGenerator {

    private static final Logger logger = LoggerFactory.getLogger(RandomIdGenerator.class);

    @Override
    public String generate() throws IdGenerationFailureException {
        String substrOfHostName = null;
        try {
            substrOfHostName = getLastFiledOfHostName();
        } catch (UnknownHostException e) {
            throw new IdGenerationFailureException("...", e);
        }
        long currentTimeMillis = System.currentTimeMillis();
        String randomString = generateRandomAlphameric(8);
        String id = String.format("%s-%d-%s",
                substrOfHostName, currentTimeMillis, randomString);
        return id;
    }

    private String getLastFiledOfHostName() throws UnknownHostException {
        String substrOfHostName = null;
        String hostName = InetAddress.getLocalHost().getHostName();
        if (hostName == null || hostName.isEmpty()) {
            throw new UnknownHostException("...");
        }
        substrOfHostName = getLastSubstrSplittedByDot(hostName);
        return substrOfHostName;
    }

    @VisibleForTesting
    protected String getLastSubstrSplittedByDot(String hostName) {
        if (hostName == null || hostName.isEmpty()) {
            throw new IllegalArgumentException("...");
        }
        String[] tokens = hostName.split("\\.");
        String substrOfHostName = tokens[tokens.length - 1];
        return substrOfHostName;
    }

    @VisibleForTesting
    protected String generateRandomAlphameric(int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("...");
        }
        char[] randomChars = new char[length];
        int count = 0;
        Random random = new Random();
        while (count < length) {
            int maxAscii = 'z';
            int randomAscii = random.nextInt(maxAscii);
            boolean isDigit = randomAscii >= '0' && randomAscii <= '9';
            boolean isUppercase = randomAscii >= 'A' && randomAscii <= 'Z';
            boolean isLowercase = randomAscii >= 'a' && randomAscii <= 'z';
            if (isDigit || isUppercase || isLowercase) {
                randomChars[count] = (char) (randomAscii);
                ++count;
            }
        }
        return new String(randomChars);
    }
}

7. Summary

This summary also covers the content of the first two articles.


7.1 Code quality evaluation criteria

7.1.1 How to evaluate the quality of the code?

The evaluation of code quality is highly subjective, and there are many words to describe code quality, such as readability, maintainability, flexibility, elegance, and simplicity. These vocabularies evaluate code quality from different dimensions. They interact and are not independent. For example, good readability and scalability of the code means good maintainability of the code. The quality of the code is a conclusion obtained by combining various factors. We cannot evaluate the quality of a piece of code through a single dimension

7.1.2 What are the most commonly used evaluation criteria?

The most commonly used criteria for judging code quality are: maintainability, readability, scalability, flexibility, simplicity, reusability, and testability. Among them, maintainability, readability, and scalability are the most mentioned and most important three evaluation criteria

7.1.3 How can I write high-quality code?

To write high-quality code, you need to master some more detailed and practical programming methodology, which includes object-oriented design ideas, design principles, design patterns, coding specifications, refactoring techniques, etc.


7.2 Object Orientation

7.2.1 Object Oriented Overview

Now, there are three mainstream programming paradigms or programming styles, which are process-oriented, object-oriented, and functional programming. The object-oriented programming style is the most mainstream among them. Most of the more popular programming languages ​​are object-oriented programming languages. Most projects are also developed based on object-oriented programming style. Because of its rich features (encapsulation, abstraction, inheritance, polymorphism), object-oriented programming can realize many complex design ideas, and is the basis for many design principles and design pattern coding implementations

7.2.2 Four Object-Oriented Features

Encapsulation is also known as information hiding or data access protection. By exposing a limited access interface, the class authorizes the outside to access internal information or data only through the methods provided by the class. It requires the programming language to provide permission access control syntax to support, such as private, protected, public keywords in Java. The significance of the existence of the encapsulation feature, on the one hand, is to protect the data from being modified at will and improve the maintainability of the code; on the other hand, it only exposes limited necessary interfaces to improve the usability of the class

If encapsulation is mainly about how to hide information and protect data, then abstraction is about how to hide the concrete implementation of methods, so that users only need to care about what functionality a method provides, not how it is implemented. Abstraction can be realized through interfaces or abstract classes. Its significance is, on the one hand, that changing an implementation does not require changing the definition; on the other hand, it is an effective means of dealing with complex systems, filtering out information that does not need attention

Inheritance is used to represent the is-a relationship between classes and comes in two forms: single inheritance and multiple inheritance. Single inheritance means a subclass inherits only one parent class; multiple inheritance means a subclass may inherit several parent classes. To support inheritance, a programming language has to provide special syntax. Inheritance is mainly used to solve the problem of code reuse

Polymorphism means that a subclass can replace its parent class and that, when the code actually runs, the subclass's method implementation is the one invoked. Polymorphism also requires special language mechanisms, such as inheritance, interfaces, or duck typing. It improves the extensibility and reusability of code and is the implementation basis of many design patterns, design principles, and programming techniques

7.2.3 Object-Oriented vs. Procedural

Object-oriented programming has three main advantages over procedural programming:

  1. For large-scale, complex programs, the processing flow is not a single main line but an intricate network. Object-oriented programming copes with this kind of complexity better than procedural programming
  2. Object-oriented programming has richer features (encapsulation, abstraction, inheritance, polymorphism). Code written with these features is easier to extend, reuse, and maintain
  3. From the way programming languages have evolved in how they interact with machines, object-oriented languages are more human-friendly, higher-level, and more intelligent than procedural languages

Object-oriented programming is usually done with an object-oriented language, but you can still program in an object-oriented style without one. Conversely, even with an object-oriented language, the code you write is not necessarily object-oriented in style; it may well be procedural

The object-oriented and procedural styles are not black-and-white opposites. In software developed with object-oriented languages, procedural-style code is not uncommon; even some standard libraries (such as the JDK, Apache Commons, and Google Guava) contain plenty of procedural-style code

Whichever style you use, the ultimate goal is to write high-quality code that is maintainable, readable, reusable, and extensible. As long as you avoid the drawbacks of the procedural style, control its side effects, and keep them within your grasp, there is nothing wrong with writing procedural-style code in object-oriented programming

7.2.4 Object-Oriented Analysis, Design, and Programming

Object-oriented analysis (OOA), object-oriented design (OOD), and object-oriented programming (OOP) are the three main stages of object-oriented development. Simply put, analysis figures out what to do, design figures out how to do it, and programming translates the results of analysis and design into code

Requirements analysis is actually an iterative refinement process. Don't try to produce a perfect solution in one step; first give a rough, basic solution as a starting point for iteration, and then refine it gradually. This way of thinking keeps us from getting stuck with nowhere to start

What object-oriented design and implementation do is put the right code into the right classes. Whichever way you divide things up, the criterion is to make the code satisfy design principles and ideas such as "loose coupling, high cohesion", single responsibility, and open for extension but closed for modification, and to make the code as reusable, readable, extensible, and maintainable as possible

The output of object-oriented analysis is a detailed requirements description. The output of object-oriented design is classes. In the design stage, the requirements description is turned into concrete class designs. This stage can be broken down into the following four parts:

  1. Divide responsibilities and identify the classes
    Based on the requirements description, list the functional points involved one by one, then see which points have similar responsibilities and operate on the same attributes, and whether they can be grouped into the same class
  2. Define the classes and their attributes and methods
    Identify the verbs in the requirements description as candidate methods and filter them down to the real methods; take the nouns involved in the functional points as candidate attributes and filter them in the same way
  3. Define the interactions between classes
    UML defines six kinds of relationships between classes: generalization, realization, association, aggregation, composition, and dependency. From a perspective closer to programming, the relationships are adjusted and four are kept: generalization, realization, composition, and dependency
  4. Assemble the classes and provide an execution entry point
    Assemble all the classes together and provide an entry point, which might be a main() function or a set of APIs exposed to the outside. Through this entry point the whole program can be run

7.2.5 Interface vs. Abstract Class

An abstract class cannot be instantiated; it can only be inherited. It may contain attributes and methods. A method may or may not contain an implementation; methods without an implementation are called abstract methods. A subclass that inherits an abstract class must implement all of its abstract methods

An interface cannot contain attributes (in Java it can define static constants) and can only declare methods, which cannot contain implementations (Java 8 and later allows default implementations). A class that implements an interface must implement all of the methods the interface declares

An abstract class is an abstraction over member variables and methods; it represents an is-a relationship and exists to solve code reuse. An interface is an abstraction over methods only; it represents a has-a relationship, expressing that something has a certain set of behaviors, and exists to solve decoupling, separating the interface from the concrete implementation and improving the extensibility of code

When should you use an abstract class, and when an interface? The criterion is actually simple: if you want to represent an is-a relationship and solve code reuse, use an abstract class; if you want to represent a has-a relationship and solve abstraction rather than code reuse, use an interface
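As a small, hypothetical illustration (the Logger and Reloadable names are made up for this sketch): the abstract class abstracts both state and behavior for reuse, while the interface only declares a behavior contract.

// Abstract class: abstracts both state (the name field) and behavior; subclasses
// reuse the shared log() logic (is-a relationship, code reuse).
public abstract class AbstractLogger {
    protected String name;

    public AbstractLogger(String name) {
        this.name = name;
    }

    public void log(String message) {
        doLog("[" + name + "] " + message);
    }

    protected abstract void doLog(String formattedMessage);
}

// Interface (defined in its own file): declares behavior only, no state; any class
// that "has" this capability can implement it (has-a relationship, decoupling).
public interface Reloadable {
    void reload();
}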

7.2.6 Program to an Interface, Not an Implementation

Applying this principle separates the interface from the implementation, encapsulates the unstable implementation, and exposes a stable interface. Upstream systems program against the interface rather than the implementation and do not depend on unstable implementation details, so when the implementation changes, the upstream code barely needs to change. This reduces coupling and improves extensibility

In fact, another way to phrase "program to an interface, not an implementation" is "program to an abstraction, not an implementation". The latter better reflects the original intent of the principle. In software development, one of the biggest challenges is constantly changing requirements, which is also one test of how good a code design is

The more abstract, the more top-level, and the more detached from a specific implementation design, the more flexible the code can be, and the better it can respond to future demand changes. A good code design can not only meet the current needs, but also be able to respond flexibly without destroying the original code design when the needs change in the future. Abstraction is one of the most effective means to improve code scalability, flexibility, and maintainability

7.2.7 Use more composition and less inheritance

Why is inheritance discouraged?

Inheritance is one of the four major characteristics of object-oriented. It is used to represent the is-a relationship between classes and can solve the problem of code reuse. Although inheritance has many functions, if the inheritance level is too deep and too complicated, it will also affect the maintainability of the code. In this case, it should be used sparingly or even without inheritance

What are the advantages of composition over inheritance?

Inheritance has three main purposes: expressing an is-a relationship, supporting polymorphism, and code reuse. All three can be achieved through composition, interfaces, and delegation. In addition, using composition also avoids the maintainability problems caused by deep and complex inheritance hierarchies

How to judge whether to use composition or inheritance?

Although using more composition and less inheritance is encouraged, composition is not perfect and inheritance is not useless. In actual project development, you still need to choose between them according to the situation. If the inheritance structure between classes is stable, the hierarchy is shallow, and the relationships are not complex, you can boldly use inheritance; otherwise, try to use composition instead. In addition, some design patterns and special scenarios always use inheritance or composition
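As a reminder of what this looks like in code, here is a minimal sketch under hypothetical names (Flyable, FlyAbility, Sparrow): inheritance is replaced by an interface, a reusable ability class, composition, and delegation.

public interface Flyable {
    void fly();
}

// Reusable implementation of the flying behavior
// (takes the place of a shared base class in an inheritance hierarchy).
class FlyAbility implements Flyable {
    @Override
    public void fly() {
        // shared flying logic
    }
}

class Sparrow implements Flyable {
    private final FlyAbility flyAbility = new FlyAbility(); // composition

    @Override
    public void fly() {
        flyAbility.fly(); // delegation
    }
}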

7.2.8 Anemic Model vs. Rich Model

Most Web business development is based on the MVC three-tier architecture with an anemic domain model, which is called the traditional development mode. It is "traditional" relative to the emerging DDD development mode based on a rich domain model. The traditional anemic-model development mode is a typical procedural programming style, while the rich-model DDD development mode is a typical object-oriented programming style

However, DDD is no silver bullet. For systems whose business is not complex, the traditional anemic-model mode is simple and sufficient, while the rich-model DDD mode would be overkill and cannot show its strengths. Conversely, for systems with complex business, the rich-model DDD mode requires more design time and effort up front to improve the reusability and maintainability of the code, so it has more advantages than the anemic-model mode

Compared with the traditional anemic-model mode, the rich-model DDD mode differs mainly in the Service layer. In the rich-model mode, part of the business logic originally in the Service class is moved into a rich Domain class, so the Service class implementation depends on the Domain class. The Service class is not removed entirely, however; it remains responsible for functionality that does not fit in the Domain class, such as interacting with the Repository layer, aggregating business logic across domain models, and non-functional work such as idempotence and transactions

Compared with the traditional anemic-model mode, the code of the Controller layer and the Repository layer is basically unchanged. This is because the life cycle of the Repository layer's Entity is limited and the Controller layer's VO is simply a DTO; the business logic in these two parts is not complex, since business logic is concentrated mainly in the Service layer. It is therefore fine for the Repository and Controller layers to keep using the anemic-model design


7.3 Design principles

7.3.1 SOLID Principle: SRP Single Responsibility Principle

A class should be responsible for only one responsibility or function. The single responsibility principle avoids designing large, all-encompassing classes that couple unrelated functionality together, and thereby improves class cohesion. At the same time, a class with a single responsibility depends on, and is depended on by, fewer other classes, which reduces coupling and achieves high cohesion and loose coupling. However, if the split is too fine-grained, it backfires: cohesion drops and the maintainability of the code suffers

Different application scenarios, demand backgrounds at different stages, and different business levels may have different judgment results on whether the responsibility of the same class is single. In fact, some side judgment indicators are more instructive and executable. For example, the following situations may indicate that this type of design does not meet the single responsibility principle:

  • Too many lines of code, functions, or attributes in a class
  • The class depends on too many other classes, or too many other classes depend on it
  • Too many private methods
  • It is hard to give the class a suitable name
  • A large number of the class's methods all operate on only a few of its attributes

7.3.2 SOLID Principle: OCP Open-Closed Principle

How to understand "open for extension, closed for modification"?

Adding a new feature should be done by extending the existing code (adding modules, classes, methods, attributes, etc.) rather than by modifying it (changing existing modules, classes, methods, attributes, etc.). Two points about this definition are worth noting. First, the open-closed principle does not mean eliminating modification altogether, but completing new feature development at the cost of minimal code modification. Second, the same code change may be considered a "modification" at a coarse code granularity and an "extension" at a fine granularity

How to achieve "open for extension, closed for modification"?

Always keep extension, abstraction, and encapsulation in mind. When writing code, spend more time thinking about which requirements may change in the future for this code and how to design the code structure, and reserve extension points in advance, so that when requirements change later, new code can be plugged flexibly into those extension points with minimal changes and without touching the overall structure

Many design principles, ideas, and patterns aim to improve the extensibility of code. In particular, most of the 23 classic design patterns were summarized to solve extensibility problems and are based on the open-closed principle. The most commonly used ways to improve extensibility are: polymorphism, dependency injection, programming to an interface rather than an implementation, and most of the design patterns (for example decorator, strategy, template method, chain of responsibility, and state)
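For instance, a minimal, hypothetical sketch of leaving an extension point open (the Alert and AlertHandler names are illustrative, not from this article's code): new alert rules are added by registering new handler classes, without modifying the existing check logic.

import java.util.ArrayList;
import java.util.List;

public interface AlertHandler {
    void check(long requestCount, long errorCount);
}

class Alert {
    private final List<AlertHandler> handlers = new ArrayList<>();

    // Extension point: a new alert rule is added by registering a new
    // AlertHandler implementation, without modifying check() below.
    public void addAlertHandler(AlertHandler handler) {
        handlers.add(handler);
    }

    public void check(long requestCount, long errorCount) {
        for (AlertHandler handler : handlers) {
            handler.check(requestCount, errorCount);
        }
    }
}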

7.3.3 SOLID Principle: LSP Liskov Substitution Principle

Objects of a subtype/derived class should be able to replace objects of the base/parent class anywhere in the program, while keeping the logical behavior of the original program unchanged and its correctness intact

The Liskov substitution principle guides how to design subclasses in an inheritance relationship. The key to understanding it is the phrase "design by contract": the parent class defines the "contract" (the protocol) of a function, and a subclass may change the internal implementation logic of the function but must not change its original contract. The "contract" here includes: the functionality the function declares it implements; the conventions on input, output, and exceptions; and even any special notes listed in the comments

To understand this principle, it also helps to understand the difference between the Liskov substitution principle and polymorphism. Although they look similar in their definitions and in code, they focus on different things. Polymorphism is a major feature of object-oriented programming and a piece of object-oriented language syntax; it is a way of implementing code. Liskov substitution is a design principle used to guide how subclasses are designed in an inheritance relationship: a subclass's design must ensure that when it replaces the parent class, the logic of the original program is not changed and its correctness is not broken

7.3.4 SOLID principle: ISP interface isolation principle

The description of the interface segregation principle is: A client should not be forced to depend on interfaces it does not need. The "client" can be understood as the caller or user of the interface. The key to understanding the "interface segregation principle" is to understand the word "interface". There are three different interpretations here:

  1. If you understand "interface" as a set of interfaces, it can be an interface of a microservice or an interface of a class library. If some interfaces are only used by some callers, you need to isolate this part of the interface and use it for this part of the caller alone, without forcing other callers to also rely on this part of the interface that will not be used
  2. If "interface" is understood as a single API interface or function, and some callers only need part of the functions in the function, then the function needs to be split into multiple functions with finer granularity, so that the caller only depends on the detail it needs. granular function
  3. If "interface" is understood as an interface in OOP, it can also be understood as an interface syntax in an object-oriented programming language. The design of the interface should be as simple as possible, and the implementation class and caller of the interface should not rely on unnecessary interface functions

The single responsibility principle is aimed at the design of modules, classes, and interfaces. Compared with the single responsibility principle, the interface isolation principle focuses more on the design of the interface on the one hand, and on the other hand, its thinking angle is also different. The interface segregation principle provides a standard for judging whether the responsibility of an interface is single: indirectly by how the caller uses the interface. If the caller only uses part of the interface or part of the function of the interface, then the design of the interface is not enough to have a single responsibility

7.3.5 SOLID Principle: DIP Dependency Inversion Principle

Inversion of control: In fact, inversion of control is a relatively general design idea, not a specific implementation method, and is generally used to guide the design at the framework level. The "control" mentioned here refers to the control of the program execution flow, and "reversal" refers to the fact that the programmer controls the execution of the entire program before using the framework. After using the framework, the execution flow of the entire program is controlled by the framework. Control of the process is "inverted" from the programmer to the framework

Dependency Injection: in contrast to inversion of control, dependency injection is a concrete coding technique. Instead of creating objects of dependent classes inside the class with new, the dependency objects are created outside and then passed (or "injected") into the class through the constructor, function parameters, and so on
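A minimal sketch of constructor injection, using hypothetical Notification and MessageSender classes:

public interface MessageSender {
    void send(String cellphone, String message);
}

class Notification {
    private final MessageSender messageSender;

    // The dependency is created outside and injected through the constructor,
    // instead of being new-ed inside this class.
    public Notification(MessageSender messageSender) {
        this.messageSender = messageSender;
    }

    public void sendMessage(String cellphone, String message) {
        messageSender.send(cellphone, message);
    }
}

Because the dependency is passed in, a unit test can inject a mock MessageSender, which is also why dependency injection is later called the most effective means of writing testable code.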

Dependency injection framework: through the extension points provided by a dependency injection framework, you simply configure the required classes and the dependencies between them, and the framework automatically creates the objects, manages their life cycles, injects the dependencies, and so on, things the programmer would otherwise have to do

Dependency Inversion Principle: the dependency inversion principle is also called the dependency reversal principle. It is somewhat similar to inversion of control and is mainly used to guide design at the framework level. High-level modules should not depend on low-level modules; both should depend on the same abstraction. Abstractions should not depend on concrete implementation details; the details should depend on the abstractions

7.3.6 KISS, YAGNI principles

The KISS principle can be summed up as: keep it as simple as possible. It is an important means of keeping code readable and maintainable. "Simple" here is not measured by lines of code: fewer lines do not necessarily mean simpler code; logical complexity, implementation difficulty, readability, and so on also matter. Moreover, solving a complex problem in a complex way does not violate the KISS principle. In addition, the same code may satisfy the KISS principle in one business scenario but not in another

For how to write code that meets the KISS principle, the following guiding principles are summarized:

  • Don't implement code using techniques your colleagues may not understand
  • Don't reinvent the wheel; be good at using existing tool libraries
  • Don't over-optimize

The full English name of the YAGNI principle is: You Ain't Gonna Need It, literally: you won't need it. This principle is broadly applicable as well. In software development it means: don't design features that are not currently needed, and don't write code that is not currently used. The core idea is: don't over-design

The YAGNI principle is not the same thing as the KISS principle. The KISS principle is about "how to do it" (keep it as simple as possible), while the YAGNI principle is about "doing it or not" (don't do it if you don't need it now)

7.3.7 DRY principle

The DRY principle says: Don't repeat yourself. Applied to programming, it can be understood as: don't write duplicated code. There are three kinds of code duplication: duplicated implementation logic, duplicated functional semantics, and duplicated code execution

  • Code whose implementation logic is duplicated but whose functional semantics are not does not necessarily violate the DRY principle
  • Code whose implementation logic is not duplicated but whose functional semantics are duplicated does violate the DRY principle
  • Repeated code execution also violates the DRY principle

In addition, some ways to improve code reusability were also mentioned, including: reducing code coupling, satisfying the single responsibility principle, modularization, separating business from non-business logic, sinking general-purpose code to lower layers, inheritance, polymorphism, abstraction, encapsulation, and applying templates and other design patterns. Awareness of reuse is also very important: when designing a module, class, or function, think about its reusability as if you were designing an external API

When writing code for the first time, if there is no need for reuse at present, and the need for future reuse is not particularly clear, and the cost of developing reusable code is relatively high, then there is no need to consider the reusability of the code. When developing new functions later, if you find that the code you wrote before can be reused, then refactor the code to make it more reusable

Compared with the reusability of code, the DRY principle is more applicable. You can not write reusable code, but you must not write repetitive code

7.3.8 LOD principles

How to understand "high cohesion, loose coupling"?

"High cohesion and loose coupling" is a very important design idea, which can effectively improve the readability and maintainability of the code, and reduce the scope of code changes caused by functional changes. "High cohesion" is used to guide the design of the class itself, and "loose coupling" is used to guide the design of dependencies between classes. The so-called high cohesion means that similar functions should be placed in the same class, and dissimilar functions should not be placed in the same class. Similar functions are often modified at the same time, put them in the same class, and the modification will be more concentrated. The so-called "loose coupling" means that in the code, the dependencies between classes are simple and clear. Even if two classes have dependencies, code changes in one class will not or rarely cause code changes in dependent classes

How to understand "Demeter's Law"?

The Law of Demeter says: there should be no dependency between classes that should not depend on each other directly; and between classes that do depend on each other, depend only on the necessary interfaces. The Law of Demeter aims to reduce the coupling between classes and make classes as independent as possible. Each class should know as little as possible about the rest of the system, so that when a change occurs, fewer classes need to know about it


7.4 Specification and refactoring

7.4.1 Refactoring overview

The purpose of refactoring: Why refactor (why)?

For the project, refactoring can keep the code quality in a controllable state, and it will not be corrupted to the point of no cure. For individuals, refactoring is a great exercise in one's code ability, and it is a very fulfilling thing. It is a training ground for theoretical knowledge such as classic design ideas, principles, patterns, and programming specifications that we learn

Objects of refactoring: what to refactor?

According to the scale of refactoring, refactoring can be roughly divided into large-scale high-level refactoring and small-scale low-level refactoring. Large-scale high-level refactoring includes code layering, modularization, decoupling, sorting out the interaction between classes, abstracting and reusing components, and so on. This part of the work uses more abstract and top-level design ideas, principles, and patterns. Small-scale and low-level refactoring includes standard naming, annotations, correcting too many function parameters, eliminating super-large classes, extracting repetitive codes and other programming details, mainly for refactoring at the class and function levels. Small-scale and low-level refactoring is more about using the theoretical knowledge of coding standards

The timing of refactoring: when to refactor (when)?

Be sure to establish continuous refactoring awareness, and integrate refactoring into development as an essential part of development, instead of waiting until there are big problems in the code, and then refactoring drastically

Refactoring method: how to refactor (how)?

Large-scale, high-level refactoring is relatively difficult, and it needs to be carried out in an organized and planned manner, taking small steps in stages, and keeping the code in a runnable state at all times. And small-scale low-level refactoring, because the scope of influence is small, the change takes a short time, so as long as you are willing and have time, you can do it anytime, anywhere

7.4.2 Unit testing

What is unit testing?

Unit testing is a code-level test that is used to test the logical correctness of the code written by "self". As the name suggests, unit testing is to test a "unit", which is generally a class or function, not a module or system

Why write unit tests?

Unit testing can effectively find bugs in the code and problems in the code design. The process of writing unit tests is itself a process of refactoring the code. Unit testing is a powerful complement to integration testing and helps us get familiar with code quickly. It is also a workable compromise for putting TDD into practice

How to write unit tests?

Writing unit tests is the process of designing test cases that cover various inputs, exceptions, and boundary conditions for the code, and translating them into code (a small example follows the list below). Testing frameworks can be used to simplify writing the test code. For unit testing, the following points of understanding need to be established:

  1. Writing unit tests, while tedious, is not too time consuming
  2. The quality requirements of unit tests can be slightly lowered
  3. It is unreasonable to use coverage as the only criterion for measuring the quality of unit testing
  4. Writing unit tests generally does not require understanding the implementation logic of the code
  5. The failure of the unit test framework to test is mostly due to the poor testability of the code
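As mentioned above the list, here is a minimal JUnit 4-style sketch (assuming JUnit is on the classpath and the test class sits in the same package as RandomIdGenerator, so the protected method is accessible):

import org.junit.Assert;
import org.junit.Test;

public class RandomIdGeneratorTest {
    @Test
    public void testGetLastSubstrSplittedByDot() {
        RandomIdGenerator idGenerator = new RandomIdGenerator();
        // Boundary-free happy path: the last field after splitting by '.' is expected
        String actualSubstr = idGenerator.getLastSubstrSplittedByDot("field1.field2.field3");
        Assert.assertEquals("field3", actualSubstr);
    }
}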

Why is unit testing difficult to implement?

On the one hand, writing unit tests is tedious and not technically challenging, so many programmers are unwilling to write them. On the other hand, domestic development often leans toward "fast and rough", and a tight schedule easily squeezes unit testing out. Finally, without a correct understanding of unit testing, people feel it is dispensable, and relying on supervision alone makes it hard to carry out well

7.4.3 Code Testability

What is code testability?

Roughly speaking, the so-called testability of the code is the ease of writing unit tests for the code. For a piece of code, if it is difficult to write a unit test for it, or the unit test is very laborious to write and needs to rely on the very advanced features of the unit test framework, it often means that the code design is not reasonable enough and the testability of the code is not good

The most efficient means of writing testable code

Dependency injection is the most effective technique for writing testable code. With dependency injection, uncontrollable dependencies can be replaced by controllable mocks when writing unit test code, which is also the most technically challenging part of writing unit tests. Besides mocking, secondary encapsulation (wrapping) can be used when the behavior of some code is otherwise uncontrollable
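A minimal sketch of this idea, assuming a made-up `Notification` class that depends on an `EmailSender` interface (neither name comes from the original text):

```java
// The dependency that is hard to control in tests (network, third-party service, etc.)
interface EmailSender {
    boolean send(String address, String message);
}

// The dependency is injected through the constructor instead of being created
// inside the class, so a test can pass in a controllable implementation.
class Notification {
    private final EmailSender emailSender;

    Notification(EmailSender emailSender) {
        this.emailSender = emailSender;
    }

    boolean notifyUser(String address, String message) {
        if (address == null || address.isEmpty()) {
            return false;
        }
        return emailSender.send(address, message);
    }
}

// A hand-written mock used only in unit tests: no real email is sent,
// and the previously uncontrollable behavior becomes fully controllable.
class AlwaysSucceedingSender implements EmailSender {
    @Override
    public boolean send(String address, String message) {
        return true;
    }
}
```

In a test, `new Notification(new AlwaysSucceedingSender())` can then be exercised without touching any real external service.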

Common Anti-Patterns

Typical, common test-unfriendly code falls into the following five categories (a sketch of the first category follows the list):

  1. Code containing undetermined behavior (results that depend on time, random numbers, and other uncertain factors)
  2. Misuse of mutable global variables
  3. Abuse of static methods
  4. Use of complex inheritance relationships
  5. Highly coupled code
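The sketch below (hypothetical `Coupon` classes, not from the original text) shows logic whose result depends on the current date, and one common way to make it controllable by injecting a `Clock`:

```java
import java.time.Clock;
import java.time.LocalDate;

// Undetermined behavior: the result depends on today's date,
// so a unit test that passes today may fail tomorrow.
class CouponHardToTest {
    boolean isExpired(LocalDate expireDate) {
        return LocalDate.now().isAfter(expireDate);
    }
}

// A testable variant: the clock is injected, so tests can fix "now"
// with Clock.fixed(...) and get a deterministic result.
class CouponTestable {
    private final Clock clock;

    CouponTestable(Clock clock) {
        this.clock = clock;
    }

    boolean isExpired(LocalDate expireDate) {
        return LocalDate.now(clock).isAfter(expireDate);
    }
}
```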

7.4.4 Large Refactoring: Decoupling

Why is "decoupling" so important?

Overly complex code is often poor in readability and maintainability. Decoupling, that is, keeping code loosely coupled and highly cohesive, is an effective means of controlling code complexity. If the code is highly cohesive and loosely coupled, meaning the structure is clear, layered, and modular, the dependencies are simple, and the coupling between modules or classes is small, then the overall quality of the code will not be bad

Does the code need to be "decoupled"?

There are many indirect criteria, for example: whether changing the code of one module or class forces many other modules or classes to change, whether the modules or classes it depends on also need to change, and whether the code is testable. The direct criterion is to draw the dependency graph between modules and between classes, and judge from the complexity of that graph whether decoupling refactoring is needed

How to "decouple" the code?

Techniques for decoupling code include encapsulation and abstraction, introducing an intermediate layer, and modularization, as well as a number of design ideas and principles, such as the single responsibility principle, programming to an interface rather than an implementation, dependency injection, preferring composition over inheritance, and the Law of Demeter. Certain design patterns, such as the observer pattern, also help
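As one sketch of decoupling with the observer pattern, the example below uses made-up `UserController` and `RegistrationObserver` names; the registration logic no longer calls other modules directly, so new post-registration behavior can be added without modifying it:

```java
import java.util.ArrayList;
import java.util.List;

// Other modules depend only on this small interface, not on UserController.
interface RegistrationObserver {
    void onRegistered(long userId);
}

// The registration logic only notifies observers; sending coupons, emails, etc.
// live in separate observer classes, which keeps the coupling low.
class UserController {
    private final List<RegistrationObserver> observers = new ArrayList<>();

    void addObserver(RegistrationObserver observer) {
        observers.add(observer);
    }

    long register(String phone, String password) {
        long userId = 1L; // placeholder for the real registration logic
        for (RegistrationObserver observer : observers) {
            observer.onRegistered(userId);
        }
        return userId;
    }
}
```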

7.4.5 Small refactoring: Coding conventions

Naming and Comments

  1. The key to naming is to express the meaning accurately. Names of different scopes can have different lengths: for small scopes, such as temporary variables, shorter names are fine. Well-known abbreviations can also be used in names
  2. Use the class's information to simplify the naming of its attributes and functions, and use the function's information to simplify the naming of its parameters
  3. Names should be readable and searchable. Do not use obscure or hard-to-pronounce English words. Naming must also follow the project's unified conventions, and counter-intuitive names should be avoided
  4. Interfaces can be named in two ways: prefix the interface with "I", or suffix the implementation class with "Impl". Both are acceptable; the key is to stay consistent within the project. For abstract classes, the "Abstract" prefix is preferred (see the sketch after this list)
  5. The purpose of a comment is to make the code easier to understand; as long as it serves that purpose, it is worth writing. In summary, comments cover three aspects: what the code does, why, and how. For some complex classes and interfaces, "how to use" may also need to be written
  6. Comments carry a maintenance cost, so more is not always better. Classes and functions must have comments, written as completely and in as much detail as necessary, while comments inside functions should be relatively few; good naming, extracted functions, explanatory variables, and summary comments are generally used instead to keep the code readable
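A small sketch of the interface and abstract-class naming conventions from point 4 (all class names here are invented for illustration):

```java
// Style 1: "I" prefix on the interface, plain name for the implementation
interface IUserService {
    void register(String phone);
}

class UserService implements IUserService {
    public void register(String phone) { /* ... */ }
}

// Style 2: plain interface name, "Impl" suffix on the implementation
interface OrderService {
    void create(long userId);
}

class OrderServiceImpl implements OrderService {
    public void create(long userId) { /* ... */ }
}

// Abstract classes: "Abstract" prefix is preferred
abstract class AbstractPaymentHandler {
    abstract void pay(long orderId);
}
```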

Programming techniques

  1. Distill complex logic into functions and classes
  2. Deal with a function that has too many parameters by splitting it into multiple functions
  3. Deal with too many parameters by encapsulating them into an object
  4. Do not use function parameters to control the execution logic of the code
  5. Remove overly deep nesting. Methods include removing redundant if or else statements, using continue, break, or return to exit nesting early, adjusting the execution order to reduce nesting, and extracting part of the nested logic into a function (several of these techniques are shown in the sketch after this list)
  6. Replace magic numbers with literal constants
  7. Use explanatory variables to explain complex expressions
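Several of these techniques are combined in the sketch below (a hypothetical `PasswordChecker`, not from the original text): literal constants replace magic numbers, guard clauses keep the nesting shallow, and an explanatory variable names a less obvious expression:

```java
import java.util.regex.Pattern;

class PasswordChecker {
    // Technique 6: literal constants instead of magic numbers
    private static final int MIN_LENGTH = 8;
    private static final int MAX_LENGTH = 64;
    private static final Pattern CONTAINS_DIGIT = Pattern.compile(".*\\d.*");

    // Technique 5: guard clauses (early returns) instead of deep nesting
    boolean isValid(String password) {
        if (password == null) {
            return false;
        }
        if (password.length() < MIN_LENGTH || password.length() > MAX_LENGTH) {
            return false;
        }
        // Technique 7: explanatory variable for the regex check
        boolean containsDigit = CONTAINS_DIGIT.matcher(password).matches();
        return containsDigit;
    }
}
```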

