[Repost] Four Strategies for Organizing Code

Reprinted from: http://www.infoq.com/cn/articles/four-strategies-for-organizing-code?utm_source=infoq&utm_medium=popular_widget&utm_campaign=popular_content_list&utm_content=homepage

 


Author: Martin Sandin. Translator: Zhang Tianlei. Posted on April 27, 2016.

 

This article is translated from a well-received article on Medium by Martin Sandin, and the translation has been authorized by the author.

This article introduces four strategies for organizing code: component organization, toolbox organization, layer organization, and category organization. I think these four strategies form a hierarchy, each targeting a different kind of code cohesion. In my experience they cover every situation you are likely to meet when organizing code in practice. There are of course countless other ways to organize code, but I have never seen anyone organize the packages in a project by creation date, or the classes in a package by their first letter.

Reasons and Definitions for Organizing Code

Interestingly, most advice on how to develop software is really advice on how to organize code, not on computing itself. As far as the computer is concerned, coupling and cohesion are largely irrelevant: it does not care whether you write all your code in one file, sort your classes alphabetically, or give every variable a single-letter name. The purpose of organizing code well is not to help the computer understand it, but to let other people read it easily and then maintain and extend it efficiently and with some confidence.

 

Programs must be written for people to read, and only incidentally for machines to execute.
- "Structure and Interpretation of Computer Programs" (Abelson, Sussman)

When a piece of code grows too long and contains too many elements, it becomes so complex that people struggle to locate information, get an overview, or understand what each part does. The best remedy is to divide and conquer: break a large, complex piece of code into smaller pieces, each of which can be understood on its own. For classes, this helps us create cohesive logical objects, and it applies to domain models as well. For separately compiled projects, we must eliminate circular dependencies and make sure there is a logical, stable interface between projects. At the level between projects and classes (packages in Java, namespaces in C#) there are many ways to organize code. In my experience, many developers pick one without much thought and cannot explain why they use it.
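The rule about circular dependencies can be checked mechanically. The following is a minimal sketch in Python, assuming the dependencies between code units have already been collected into a dictionary; the function name `find_cycle` and the unit names in the usage note are hypothetical and not part of the original article.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of unit names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {unit: WHITE for unit in deps}
    path: List[str] = []

    def visit(unit: str) -> Optional[List[str]]:
        state[unit] = GREY
        path.append(unit)
        for dep in deps.get(unit, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path, so the loop is closed.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[unit] = BLACK
        return None

    for unit in list(deps):
        if state[unit] == WHITE:
            found = visit(unit)
            if found:
                return found
    return None
```

Called with `{"graph": ["graph_storage"], "graph_storage": ["graph"]}` the function reports the loop between the two units; called with a layout in which only a higher-level unit depends on the others, it returns `None`.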

The first three strategies described in this article apply at every level: classes, packages, projects, and so on. The last one, category organization, is more or less specific to the package level.

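Strategy 1: Component Organization

Component organization minimizes the complexity of the code. It is mainly concerned with the external coherence and internal cohesion of a code unit such as a package. External coherence means that the package exposes a minimal interface whose functionality is closely tied to the service the component provides. Internal cohesion means that the code inside the package is strongly interrelated.

[Figure: Completely independent electronic components]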

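Many articles could be written, and many already have been, about what a good code abstraction should contain. Discussing the principles of abstraction here, even a small part of them, would make this article far too long. Suffice it to say that the best introduction to code abstraction is to study the SOLID principles, and that while studying them it is essential to practice and to think about the reasoning behind each step. In this article I will only describe what, in my own experience, is the most common cause of rapidly growing complexity in a code base: many people do try to divide and conquer their code base, but in the end they fail to break their packages down into components.

Creating a new code unit usually means identifying part of the functionality of one or more existing packages and turning it into a new abstraction. The total number of code units goes up, each unit gets smaller, and the code becomes easier to digest. This is only the first step, however, and the overall complexity has not yet gone down. Next we need to eliminate dependencies.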

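In my view, packages that depend on each other cannot be regarded as independent code units, because you cannot fully understand the code of one package by looking at that package alone. In the example above you can see directly that the Graph class is coupled to the GraphStorage class, and that the GraphStorage class may not be modified. Not only does the graph_storage package depend on much of the domain model of the graph package, the packages also depend on each other. The easiest dependency to eliminate is usually the one from the newly created package to the old one:

The main reason this counts as an improvement is that when we read the storage code, we can be sure that everything it works with is captured by the Storable interface.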

Clients should not depend on interfaces they do not need.
- The Interface Segregation Principle
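A minimal sketch of this first step, in Python, with both "packages" shown in one file for brevity. Only the names Graph and Storable come from the article; `InMemoryStorage`, `storage_key`, `serialize`, and the serialization format are assumptions made for illustration. The point is that the storage code mentions Storable and nothing from the graph domain model.

```python
from typing import Dict, List, Protocol, Tuple


# --- storage "package": knows only the Storable interface ---
class Storable(Protocol):
    def storage_key(self) -> str: ...
    def serialize(self) -> bytes: ...


class InMemoryStorage:
    """Hypothetical storage backend that keeps serialized blobs in a dict."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def save(self, item: Storable) -> None:
        self._blobs[item.storage_key()] = item.serialize()

    def load(self, key: str) -> bytes:
        return self._blobs[key]


# --- graph "package": satisfies Storable without storage knowing Graph ---
class Graph:
    def __init__(self, name: str) -> None:
        self.name = name
        self.edges: List[Tuple[str, str]] = []

    def add_edge(self, source: str, target: str) -> None:
        self.edges.append((source, target))

    def storage_key(self) -> str:
        return self.name

    def serialize(self) -> bytes:
        # Assumed wire format: "a->b;b->c"
        return ";".join(f"{a}->{b}" for a, b in self.edges).encode("utf-8")
```

Reading `InMemoryStorage` now requires understanding only the two methods of `Storable`, which is the improvement the paragraph above describes.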

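The next step is to eliminate the direct dependency of the graph package on the storage package. One way to do this, for example, is to create a GraphPersister interface in the graph package and let a higher-level package connect it to the graph package. Once again the biggest benefit is that the storage functionality the graph package relies on becomes clear and explicit.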
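One possible shape for this second step, sketched in Python with all three "packages" in a single file. GraphPersister and Graph are the article's names; `KeyValueStorage`, `StorageGraphPersister`, the `persist` signature, and the key format are hypothetical choices for this sketch.

```python
from typing import Dict, List, Protocol, Tuple


# --- graph "package": owns the interface describing what it needs ---
class GraphPersister(Protocol):
    def persist(self, name: str, edges: List[Tuple[str, str]]) -> None: ...


class Graph:
    def __init__(self, name: str, persister: GraphPersister) -> None:
        self.name = name
        self.edges: List[Tuple[str, str]] = []
        self._persister = persister

    def add_edge(self, source: str, target: str) -> None:
        self.edges.append((source, target))

    def save(self) -> None:
        self._persister.persist(self.name, list(self.edges))


# --- storage "package": a generic store that knows nothing about graphs ---
class KeyValueStorage:
    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self.blobs[key] = value


# --- higher-level "package": the only place that sees both sides ---
class StorageGraphPersister:
    def __init__(self, storage: KeyValueStorage) -> None:
        self._storage = storage

    def persist(self, name: str, edges: List[Tuple[str, str]]) -> None:
        payload = ";".join(f"{a}->{b}" for a, b in edges).encode("utf-8")
        self._storage.put(f"graph/{name}", payload)
```

The graph code refers only to `GraphPersister`, and the storage code refers to neither; the adapter in the higher-level package is the single place where the two meet.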

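This process sounds simple in theory, but in practice identifying the right components and the right way to separate them takes a lot of effort. Along the way you will often discover that the abstraction you extracted was wrong, and that all the changes you made have to be undone and started over. The payoff from properly separated components is huge, though: code that is easy to understand and easy to upgrade, test, and reuse.

Strategy 2: Toolbox Organization

Toolbox organization focuses mainly on external coherence: it provides a stable toolbox from which users can pick what they need. The strategy presupposes that the code is highly cohesive. A toolbox usually consists of complementary implementations of the same interfaces. Users pick the implementation they need or combine several, rather than using all of them at the same time.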

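  • A collections library is the typical example of toolbox organization. It contains complementary implementations of a set of collection interfaces, which differ in characteristics such as time complexity and memory footprint. A toolbox may also have a unifying theme, such as containing only disk-based data structures.
  • A logging library is not necessarily a toolbox itself, but it usually contains a toolbox of log writers.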
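The log-writer case can be sketched in Python as follows. All class names here (`LogWriter`, `MemoryWriter`, `StreamWriter`, `TeeWriter`, `Logger`) are hypothetical and do not belong to any real logging library; the sketch only shows the shape of a toolbox, that is, several complementary implementations of one interface that can be chosen between or composed.

```python
import io
from typing import List, Protocol, TextIO


class LogWriter(Protocol):
    def write(self, message: str) -> None: ...


class MemoryWriter:
    """Keeps messages in a list, for example for inspection in tests."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def write(self, message: str) -> None:
        self.lines.append(message)


class StreamWriter:
    """Writes each message as one line to a text stream."""

    def __init__(self, stream: TextIO) -> None:
        self._stream = stream

    def write(self, message: str) -> None:
        self._stream.write(message + "\n")


class TeeWriter:
    """Composes several writers by forwarding each message to all of them."""

    def __init__(self, *writers: LogWriter) -> None:
        self._writers = writers

    def write(self, message: str) -> None:
        for writer in self._writers:
            writer.write(message)


class Logger:
    """The logging library proper; it depends only on the LogWriter interface."""

    def __init__(self, writer: LogWriter) -> None:
        self._writer = writer

    def info(self, message: str) -> None:
        self._writer.write("INFO " + message)
```

A caller might write `Logger(TeeWriter(MemoryWriter(), StreamWriter(io.StringIO())))` to combine two tools, or pass a single writer when one is enough.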

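Toolboxes come about because they are convenient for the user, and because each "tool" in the box, although independent, is not large enough to warrant a code unit of its own. Each component in a GUI library might have its own package, but creating a project for each one would be needlessly wasteful. Similarly, each collection implementation might be a candidate for a package of its own, yet putting each one in a separate package would add a great deal of overhead. In this example, though, a collection implementation made up of several classes does need a package of its own, so that the toolbox keeps a tidy appearance in line with the principle of external coherence.

[Figure: The DiskList toolbox, organized for external coherence]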

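Strategy 3: Layer Organization

Layer organization is mainly about promoting workflow cohesion rather than reducing the complexity of the code by minimizing coupling across units. It draws layer boundaries according to rules such as the deployment scenario and splits the code along them. Unlike toolbox organization, there is no minimal, coherent interface between layers. A layer's interface has many constituent parts, each of which is accessed by its counterpart in the consuming layer.

[Figure: Component coupling across layers]

The typical characteristic of layer organization is that logical coupling is stronger between logical components across layers than between the logical components within a single layer. The most common way the strategy fails is that, in practice, files have to be created in every layer, which is a textbook instance of tight coupling.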

Given two units of code, A and B: if B must change whenever A changes, then A and B are coupled.
- Definition from the C2 Wiki

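In this situation, the dependencies within the logical components usually turn the several decoupled layers into one very complex unit.

[Figure: Several decoupled layers forming one extremely complex unit]

In practice layer organization should be used with caution, because it often raises rather than lowers the overall complexity of a system. In some cases, however, its benefits far outweigh its drawbacks. It is then well worth isolating the dependency on the layer in a single place in the consuming code.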
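One way to read "a single place in the consuming code" is a gateway class that alone talks to the layer. The Python sketch below assumes a hypothetical persistence layer with a broad interface; `PersistenceLayer`, `CustomerGateway`, `render_report`, and the sample data are invented for illustration.

```python
from typing import Dict, List


# --- persistence layer (hypothetical): broad interface, many entry points ---
class PersistenceLayer:
    def __init__(self) -> None:
        self._users: Dict[int, Dict] = {1: {"id": 1, "name": "Ada"}}
        self._orders: Dict[int, List[Dict]] = {
            1: [{"id": 10, "total": 25}, {"id": 11, "total": 17}],
        }

    def fetch_user_row(self, user_id: int) -> Dict:
        return self._users[user_id]

    def fetch_order_rows(self, user_id: int) -> List[Dict]:
        return self._orders.get(user_id, [])


# --- consuming code: the only class that touches the layer directly ---
class CustomerGateway:
    def __init__(self, layer: PersistenceLayer) -> None:
        self._layer = layer

    def customer_summary(self, user_id: int) -> str:
        user = self._layer.fetch_user_row(user_id)
        orders = self._layer.fetch_order_rows(user_id)
        total = sum(order["total"] for order in orders)
        return f"{user['name']}: {len(orders)} orders, total {total}"


# --- rest of the consuming code: depends on the gateway, not on the layer ---
def render_report(gateway: CustomerGateway, user_id: int) -> str:
    return "Report for " + gateway.customer_summary(user_id)
```

If the layer's entry points change, only `CustomerGateway` has to follow; `render_report` and anything else built on the gateway stays as it is.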

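Strategy 4: Category Organization

Category organization is used to tidy up code units that have become too complex, by sorting their parts into buckets according to the category of class or interface. The sorting ignores dependencies and conceptual relationships, and it typically produces packages with names such as exception, interface, manager, helper, and entity.

Category organization also differs from toolbox organization: it gives up any pretense that the classes in a package are complementary, are interchangeable, or together form a sensible library. No one I know advocates using category organization at the class or project level.

[Figure: A project organized by category]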

I think category organization is unsuitable for organizing code, because it hides the real problem of complex code. It misleads developers into thinking the problem has been fixed when in fact it has not been solved and the overall complexity has not gone down. Another big disadvantage is that, taken to the extreme, every class has to be assigned to some category. I have run into this extreme, where odd constructs such as managers and helpers are created throughout the code base just so that every piece of code has a matching package.
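The claim that sorting by category leaves the complexity untouched can be made concrete by counting dependencies that cross package boundaries. The Python sketch below uses a deliberately contrived set of classes and dependencies, all of them hypothetical; real code bases will have some dependencies between components as well, so the numbers only illustrate the direction of the effect.

```python
from typing import Dict, List, Tuple

# Hypothetical classes: name -> (component, category)
CLASSES: Dict[str, Tuple[str, str]] = {
    "Graph": ("graph", "entity"),
    "GraphException": ("graph", "exception"),
    "GraphManager": ("graph", "manager"),
    "Invoice": ("billing", "entity"),
    "InvoiceException": ("billing", "exception"),
    "InvoiceManager": ("billing", "manager"),
}

# Hypothetical dependencies: (dependent class, class it depends on)
DEPENDENCIES: List[Tuple[str, str]] = [
    ("GraphManager", "Graph"),
    ("GraphManager", "GraphException"),
    ("Graph", "GraphException"),
    ("InvoiceManager", "Invoice"),
    ("InvoiceManager", "InvoiceException"),
    ("Invoice", "InvoiceException"),
]


def cross_package_dependencies(package_of: Dict[str, str]) -> int:
    """Count the dependencies whose two ends live in different packages."""
    return sum(1 for a, b in DEPENDENCIES if package_of[a] != package_of[b])


# Two layouts of the same classes: one package per component or per category.
by_component = {name: component for name, (component, _) in CLASSES.items()}
by_category = {name: category for name, (_, category) in CLASSES.items()}
```

In this example every dependency stays inside its package when the classes are grouped by component, and every dependency crosses a package boundary when they are grouped by category, even though both layouts contain the same number of packages or fewer classes per package.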

I see category organization as a "code smell", yet in my experience it is widely used in commercial software (mostly written in Java or C#). Category organization offers a simple way to split up large packages, and for most people package size is not the main issue; the method seems to settle the question of keeping each part of the code independent, which is why it is so often used in commercial software development.

Summary

For software developers, organizing code is a core skill. As with any skill, the quickest way to improve is to reflect carefully on your earlier choices and why you abandoned them. There are many strategies for organizing code, and the most important thing is to learn to tell which of them are effective and which are dangerous.

