"Philosophy of Software Design" (19) [The code should be obvious]

Chapter 18 Code Should be Obvious

Obscurity is one of the two main causes of complexity described in Section 2.3. Obscurity occurs when important information about a system is not obvious to new developers. The solution to the obscurity problem is to write code in a way that makes it obvious; this chapter discusses some of the factors that make code more or less obvious.

If code is obvious, it means that someone can read the code quickly, without much thought, and their first guesses about the behavior or meaning of the code will be correct. If code is obvious, a reader doesn’t need to spend much time or effort to gather all the information they need to work with the code. If code is not obvious, then a reader must expend a lot of time and energy to understand it. Not only does this reduce their efficiency, but it also increases the likelihood of misunderstanding and bugs. Obvious code needs fewer comments than nonobvious code.

“Obvious” is in the mind of the reader: it’s easier to notice that someone else’s code is nonobvious than to see problems with your own code. Thus, the best way to determine the obviousness of code is through code reviews. If someone reading your code says it’s not obvious, then it’s not obvious, no matter how clear it may seem to you. By trying to understand what made the code nonobvious, you will learn how to write better code in the future.

18.1 Things that make code more obvious

Two of the most important techniques for making code obvious have already been discussed in previous chapters. The first is choosing good names (Chapter 14). Precise and meaningful names clarify the behavior of the code and reduce the need for documentation. If a name is vague or ambiguous, then readers will have to read through the code in order to deduce the meaning of the named entity; this is time-consuming and error-prone. The second technique is consistency (Chapter 17). If similar things are always done in similar ways, then readers can recognize patterns they have seen before and immediately draw (safe) conclusions without analyzing the code in detail.

Here are a few other general-purpose techniques for making code more obvious:

Judicious use of white space. The way code is formatted can impact how easy it is to understand. Consider the following parameter documentation, in which whitespace has been squeezed out:

/**
 *  ...
 *  @param numThreads The number of threads that this manager should
 *  spin up in order to manage ongoing connections. The MessageManager
 *  spins up at least one thread for every open connection, so this
 *  should be at least equal to the number of connections you expect
 *  to be open at once. This should be a multiple of that number if
 *  you expect to send a lot of messages in a short amount of time.
 *  @param handler Used as a callback in order to handle incoming
 *  messages on this MessageManager's open connections. See
 *  {@code MessageHandler} and {@code handleMessage} for details.
 */

It’s hard to see where the documentation for one parameter ends and the next begins. It’s not even obvious how many parameters there are, or what their names are. If a little whitespace is added, the structure suddenly becomes clear and the documentation is easier to scan:

/**
 *  @param numThreads
 *           The number of threads that this manager should spin up in
 *           order to manage ongoing connections. The MessageManager spins
 *           up at least one thread for every open connection, so this
 *           should be at least equal to the number of connections you
 *           expect to be open at once. This should be a multiple of that
 *           number if you expect to send a lot of messages in a short
 *           amount of time.
 *  @param handler
 *           Used as a callback in order to handle incoming messages on
 *           this MessageManager's open connections. See
 *           {@code MessageHandler} and {@code handleMessage} for details.
 */

Blank lines are also useful to separate major blocks of code within a method, such as in the following example:

void* Buffer::allocAux(size_t numBytes) {
    // Round up the length to a multiple of 8 bytes, to ensure alignment.
    uint32_t numBytes32 = (downCast<uint32_t>(numBytes) + 7) & ~0x7;
    assert(numBytes32 != 0);

    // If there is enough memory at firstAvailable, use that. Work down
    // from the top, because this memory is guaranteed to be aligned
    // (memory at the bottom may have been used for variable-size chunks).
    if (availableLength >= numBytes32) {
        availableLength -= numBytes32;
        return firstAvailable + availableLength;
    }

    // Next, see if there is extra space at the end of the last chunk.
    if (extraAppendBytes >= numBytes32) {
        extraAppendBytes -= numBytes32;
        return lastChunk->data + lastChunk->length + extraAppendBytes;
    }

    // Must create a new space allocation; allocate space within it.
    uint32_t allocatedLength;
    firstAvailable = getNewAllocation(numBytes32, &allocatedLength);
    availableLength = allocatedLength - numBytes32;
    return firstAvailable + availableLength;
}

This approach works particularly well if the first line after each blank line is a comment describing the next block of code: the blank lines make the comments more visible.

White space within a statement helps to clarify the structure of the statement. Compare the following two statements, one of which has whitespace and one of which doesn’t:

for(int pass=1;pass>=0&&!empty;pass--) {

for (int pass = 1; pass >= 0 && !empty; pass--) {

Comments. Sometimes it isn’t possible to avoid code that is nonobvious. When this happens, it’s important to use comments to compensate by providing the missing information. To do this well, you must put yourself in the position of the reader and figure out what is likely to confuse them, and what information will clear up that confusion. The next section shows a few examples.

18.2 Things that make code less obvious

There are many things that can make code nonobvious; this section provides a few examples. Some of these, such as event-driven programming, are useful in some situations, so you may end up using them anyway. When this happens, extra documentation can help to minimize reader confusion.

Event-driven programming. In event-driven programming, an application responds to external occurrences, such as the arrival of a network packet or the press of a mouse button. One module is responsible for reporting incoming events. Other parts of the application register interest in certain events by asking the event module to invoke a given function or method when those events occur.

Event-driven programming makes it hard to follow the flow of control. The event handler functions are never invoked directly; they are invoked indirectly by the event module, typically using a function pointer or interface. Even if you find the point of invocation in the event module, it still isn’t possible to tell which specific function will be invoked: this will depend on which handlers were registered at runtime. Because of this, it’s hard to reason about event-driven code or convince yourself that it works.

To compensate for this obscurity, use the interface comment for each handler function to indicate when it is invoked, as in this example:

/**
 * This method is invoked in the dispatch thread by a transport if a
 * transport-level error prevents an RPC from completing.
 */
void Transport::RpcNotifier::failed() {
    ...
}
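
The difficulty described above, and the compensating interface comments, can be sketched in Java. This is only an illustration under stated assumptions: EventModule, PacketHandler, and PacketLog are hypothetical names invented for the sketch and do not come from the system quoted in the text.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical handler interface; all names here are illustrative only. */
interface PacketHandler {
    /**
     * Invoked by EventModule.dispatch, on the thread that calls dispatch,
     * once for each incoming packet.
     */
    void packetArrived(String packet);
}

/** Hypothetical event module: reports incoming packets to registered handlers. */
class EventModule {
    private final List<PacketHandler> handlers = new ArrayList<>();

    /** Registers interest in packet events. */
    void register(PacketHandler handler) {
        handlers.add(handler);
    }

    /** Reports one packet to every registered handler, in registration order. */
    void dispatch(String packet) {
        for (PacketHandler handler : handlers) {
            // The call site alone does not reveal which code runs here;
            // that depends on what was registered at runtime.
            handler.packetArrived(packet);
        }
    }
}

/** Records every packet it is given. */
class PacketLog implements PacketHandler {
    final List<String> packets = new ArrayList<>();

    /**
     * Invoked by EventModule.dispatch for each incoming packet, but only
     * after this log has been registered with the module.
     */
    @Override
    public void packetArrived(String packet) {
        packets.add(packet);
    }
}
```

Nothing in dispatch names PacketLog, so a reader of PacketLog.packetArrived has only its interface comment to learn when the method runs.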

Red Flag: Nonobvious Code

If the meaning and behavior of code cannot be understood with a quick reading, it is a red flag. Often this means that there is important information that is not immediately clear to someone reading the code.

Generic containers. Many languages provide generic classes for grouping two or more items into a single object, such as Pair in Java or std::pair in C++. These classes are tempting because they make it easy to pass around several objects with a single variable. One of the most common uses is to return multiple values from a method, as in this Java example:

return new Pair<Integer, Boolean>(currentTerm, false);

Unfortunately, generic containers result in nonobvious code because the grouped elements have generic names that obscure their meaning. In the example above, the caller must reference the two returned values with result.getKey() and result.getValue(), which give no clue about the actual meaning of the values.

Thus, it’s better not to use generic containers. If you need a container, define a new class or structure that is specialized for the particular use. You can then use meaningful names for the elements, and you can provide additional documentation in the declaration, which is not possible with the generic container.

This example illustrates a general rule: software should be designed for ease of reading, not ease of writing. Generic containers are expedient for the person writing the code, but they create confusion for all the readers that follow. It’s better for the person writing the code to spend a few extra minutes to define a specific container structure, so that the resulting code is more obvious.
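
As a rough sketch of the alternative, the Pair return above could be replaced with a small dedicated class. The text does not say what the boolean in the Pair means, so the names VoteResult and voteGranted below are guesses used purely for illustration, as is the helper VoteExample.rejectVote.

```java
/**
 * Hypothetical result type. The class and field names are assumptions
 * made for this sketch; they are not taken from the original code.
 */
final class VoteResult {
    /** The responder's current term number. */
    final int currentTerm;

    /** True if the responder granted its vote (assumed meaning). */
    final boolean voteGranted;

    VoteResult(int currentTerm, boolean voteGranted) {
        this.currentTerm = currentTerm;
        this.voteGranted = voteGranted;
    }
}

class VoteExample {
    /** Builds a result that reports the given term and a refused vote. */
    static VoteResult rejectVote(int currentTerm) {
        return new VoteResult(currentTerm, false);
    }
}
```

A caller now reads result.currentTerm and result.voteGranted rather than result.getKey() and result.getValue(), and each field carries its own documentation.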

Different types for declaration and allocation. Consider the following Java example:

private List<Message> incomingMessageList;
...
incomingMessageList = new ArrayList<Message>();

The variable is declared as a List, but the actual value is an ArrayList. This code is legal, since ArrayList implements the List interface, but it can mislead a reader who sees the declaration but not the actual allocation. The actual type may impact how the variable is used (ArrayLists have different performance and thread-safety properties than other implementations of List), so it is better to match the declaration with the allocation.
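
A minimal sketch of the matched form follows. The Message and Inbox classes are hypothetical stand-ins, defined only so that the fragment is self-contained.

```java
import java.util.ArrayList;

/** Hypothetical message type, defined only to keep the sketch self-contained. */
class Message {
    final String text;

    Message(String text) {
        this.text = text;
    }
}

/** Hypothetical owner of the list from the example above. */
class Inbox {
    // Declared with the concrete type that is actually allocated, so a
    // reader who sees only the declaration knows this is an ArrayList.
    private final ArrayList<Message> incomingMessageList = new ArrayList<Message>();

    /** Appends a message to the end of the list. */
    void add(Message message) {
        incomingMessageList.add(message);
    }

    /** Returns the number of messages received so far. */
    int size() {
        return incomingMessageList.size();
    }
}
```

Many Java codebases deliberately declare the interface type so that the implementation can be swapped later; the sketch follows the recommendation in the text instead, trading that flexibility for a declaration that tells the reader exactly what they are working with.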

Code that violates reader expectations. Consider the following code, which is the main program for a Java application:

public static void main(String[] args) {
    ...
    new RaftClient(myAddress, serverAddresses);
}

Most applications exit when their main programs return, so readers are likely to assume that will happen here. However, that is not the case. The constructor for RaftClient creates additional threads, which continue to operate even though the application’s main thread finishes. This behavior should be documented in the interface comment for the RaftClient constructor, but the behavior is nonobvious enough that it’s worth putting a short comment at the end of main as well. The comment should indicate that the application will continue executing in other threads. Code is most obvious if it conforms to the conventions that readers will be expecting; if it doesn’t, then it’s important to document the behavior so readers aren’t confused.
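
The suggested documentation might look something like the sketch below. BackgroundClient is a hypothetical stand-in written for this illustration, not the real RaftClient, and its worker thread simply sleeps in place of doing real work.

```java
/**
 * Hypothetical stand-in for a client like RaftClient; not the real class.
 *
 * The constructor starts a non-daemon worker thread, so the JVM keeps
 * running after the constructing thread returns, until the worker ends.
 */
class BackgroundClient {
    private final Thread worker;

    BackgroundClient() {
        worker = new Thread(() -> {
            try {
                Thread.sleep(500);  // Stands in for ongoing work.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
    }

    /** Returns true while the worker thread has not yet finished. */
    boolean isRunning() {
        return worker.isAlive();
    }
}

class ClientMain {
    public static void main(String[] args) {
        new BackgroundClient();
        // main returns here, but the application does not exit: the
        // client's worker thread keeps the process alive until it finishes.
    }
}
```

The comment at the end of main is brief; the full explanation lives in the interface comment on the class and its constructor, where readers of other call sites will also find it.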

18.3 Conclusion

Another way of thinking about obviousness is in terms of information. If code is nonobvious, that usually means there is important information about the code that the reader does not have: in the RaftClient example, the reader might not know that the RaftClient constructor created new threads; in the Pair example, the reader might not know that result.getKey() returns the number of the current term.

To make code obvious, you must ensure that readers always have the information they need to understand it. You can do this in three ways. The best way is to reduce the amount of information that is needed, using design techniques such as abstraction and eliminating special cases. Second, you can take advantage of information that readers have already acquired in other contexts (for example, by following conventions and conforming to expectations) so readers don’t have to learn new information for your code. Third, you can present the important information to them in the code, using techniques such as good names and strategic comments.

Origin blog.csdn.net/WuLex/article/details/108618100