How Google Writes Design Documents

Author: Malte is the CTO of Vercel. Before that, Malte was lead engineer responsible for Google's search rendering, and director of engineering for Search on Laptops, Tablets, and Desktop.

The translator, Xu Xiaobin, is currently a senior technical expert at Alibaba and the author of "Maven in Action".

This article is published on Technical Trivia with the authorization of the author and the translator.

One of the key elements of Google's software engineering culture is the use of design documents to define software designs. These are relatively informal documents that the primary author or authors of a software system or application write before they start coding. A design document records the high-level implementation strategy and the key design decisions, with emphasis on the trade-offs considered in making those decisions.

As software engineers, our job is not to produce code itself, but to solve problems. Unstructured text, such as a design document, may be the better tool for solving problems early in a project, because it can be more concise and easier to understand than code and communicates problems and solutions at a higher level.

Besides serving as the original record of a software design, design documents fulfill the following functions in the software development life cycle:

● Identify design flaws early, when changes are still relatively cheap.

● Build consensus around design in the organization.

● Ensure cross-cutting concerns are fully considered.

● Spread the knowledge of senior engineers in the organization.

● Form an organizational memory of design decisions.

● Serve as a summary artifact in the technical portfolio of the software designer(s).

1. Composition of design documents

Design documents are informal, so their content does not need to follow strict guidelines. The first rule is: write the document in whatever form makes the most sense for the project at hand.

Beyond this rule, a certain document structure has proven effective at Google.

1.1 Context and scope

This section gives the reader a very rough overview of the landscape in which the new system is being built and of what is actually being built. It is not a requirements document. Keep it short and to the point! The goal is to bring readers up to speed quickly. You can assume some prerequisite knowledge and link to detailed information. This section should focus entirely on objective background facts.

1.2 Goals and non-goals

Give a short list of the system's goals and, sometimes more importantly, its non-goals. Note that non-goals are not negated goals such as "the system should not crash"; they are things that could reasonably be goals but are explicitly chosen not to be. A good example is "ACID compliance": when designing a database, you certainly want to know whether it is a goal or a non-goal. Even when it is a non-goal, you may still choose a solution that provides it, as long as doing so does not introduce trade-offs that get in the way of achieving the goals.

1.3 The actual design

This section should start with an overview and gradually expand into details.

The design document is where you write down the trade-offs you made while designing your software. Focus on those trade-offs to produce a document with long-term value. Specifically, given the context (facts) and the goals and non-goals (requirements), the design document should propose solutions and explain why a particular solution best satisfies those goals.

The point of writing a document, as opposed to using a more formal medium, is the flexibility to express the problem at hand in whatever way fits it. There is therefore no explicit guidance on how to describe the design itself.

Having said that, a few best practices and recurring themes make sense for most design documents:

1.3.1 System context diagram

System context diagrams are useful for many documents. This type of diagram shows the current system as part of a larger technical picture, allowing readers to understand the new design in a context they are already familiar with.

Figure: Example of a system context diagram

1.3.2 APIs

If the system being designed exposes an API, it is usually a good idea to sketch that API. In most cases, though, resist the urge to copy-paste formal interface and data definitions into the document: they are often verbose, contain unnecessary detail, and quickly become out of date. Focus instead on the parts of the API that are relevant to the design and its trade-offs.
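As an illustration only, an API draft in a design document might be limited to the one or two calls whose shape reflects a design decision. Everything in the sketch below is hypothetical: the `ThumbnailService` name, its request type, and the trade-off described in the comments are invented for this example and do not come from any Google system.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ThumbnailRequest:
    """Hypothetical request type; only design-relevant fields are shown."""
    image_id: str
    max_width_px: int


class ThumbnailService(Protocol):
    """Hypothetical API draft as it might appear in a design document."""

    def get_thumbnail(self, request: ThumbnailRequest) -> bytes:
        """Design decision worth recording: the call is synchronous and
        renders on demand, trading latency for not storing every size."""
        ...


class FakeThumbnailService:
    """In-memory stand-in so the draft can be exercised in a prototype."""

    def get_thumbnail(self, request: ThumbnailRequest) -> bytes:
        return f"{request.image_id}@{request.max_width_px}".encode()


service: ThumbnailService = FakeThumbnailService()
thumbnail = service.get_thumbnail(ThumbnailRequest("img-1", 64))
```

The full field list, error codes, and wire format are deliberately left out; in keeping with the advice above, they would live in the interface definition itself rather than in the document.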

1.3.3 Data storage

Systems that store data should discuss how and in what rough form the data is stored. As with the advice on APIs, and for the same reasons, avoid copying and pasting complete schema definitions. Focus instead on the parts that are relevant to the design and its trade-offs.

1.3.4 Code and pseudo-code

Design documents should rarely contain code or pseudo-code, except where a new algorithm needs to be described. Where appropriate, link to a prototype implementation that shows the design is feasible.

1.3.5 Degree of constraint

A major influence on the shape of a software design (and thus design documentation) is the constraints in the solution space.

At one extreme is the "greenfield" software project, where all we know are the goals and the solution can be whatever makes the most sense. Such a document may be wide-ranging, but it also needs to quickly define a set of rules that let readers zoom in on a manageable set of solutions.

At the other extreme are systems where the possible solutions are very well defined, but it is not at all obvious how they could be combined to achieve the goals. This may be a legacy system that is difficult to change and was not designed for the problem at hand, or a library design that has to work within the constraints of its host programming language.

In this situation you may be able to enumerate everything that can be done relatively easily, but you need to combine those things creatively to achieve the goals. There may be several solutions, none of them particularly good, so the document should focus on selecting the best one given all the identified trade-offs.

1.4 Alternatives considered

This section lists alternative designs that would have achieved similar outcomes. The focus should be on the trade-offs each design makes and on how those trade-offs led to the design that was chosen, which is the primary topic of the document.

Although the description of each alternative can be brief, this section is one of the most important, because it shows very explicitly why the selected solution is the best one given the project goals, and how the other solutions introduce trade-offs that are less desirable given those goals. How these trade-off judgments were made is what readers of the document care about most.

1.5 Cross-cutting concerns

This is where an organization can ensure that cross-cutting concerns such as security, privacy, and observability are always taken into consideration. The section is usually relatively short and simply explains how the design affects each concern and how that impact is addressed. Teams should standardize on which concerns apply in their case.

For example, because privacy and security are so important, Google projects are required to have a dedicated privacy design document, and there are dedicated reviews for privacy and security. These reviews only need to be completed by the time the project launches, but it is best to involve the privacy and security teams early so that their input is taken into account from the start. The main design document does not have to repeat all the details of these topics; a reference to the specialized documents is sometimes enough.
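The structure described in sections 1.1 to 1.5 can be captured as a small skeleton generator. This is a minimal sketch, not a Google tool: the function name `design_doc_skeleton`, the one-line prompts, and the example title are assumptions made for illustration; only the section names and the three example concerns come from the text above.

```python
from typing import Sequence

# Section names follow the article; the prompts are paraphrased reminders.
SECTIONS: list[tuple[str, str]] = [
    ("Context and scope", "Rough overview of the landscape and what is being built."),
    ("Goals and non-goals", "Short lists; non-goals are deliberate exclusions."),
    ("The actual design", "Overview first, then details; record the trade-offs."),
    ("Alternatives considered", "Other designs and why they were not chosen."),
    ("Cross-cutting concerns", "How the design affects each concern below."),
]


def design_doc_skeleton(
    title: str,
    concerns: Sequence[str] = ("security", "privacy", "observability"),
) -> str:
    """Return a plain-text skeleton with one numbered heading per section."""
    lines = [title, ""]
    for number, (heading, prompt) in enumerate(SECTIONS, start=1):
        lines.append(f"{number}. {heading}")
        lines.append(f"   {prompt}")
        if heading == "Cross-cutting concerns":
            # Teams are expected to standardize this list for their own case.
            lines.extend(f"   - {concern}:" for concern in concerns)
        lines.append("")
    return "\n".join(lines)


skeleton = design_doc_skeleton("Design: example ledger service")
```

Because the first rule is to write in whatever form suits the project, such a skeleton is a starting point to edit freely rather than a form to fill in.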

1.6 The length of a design doc

Design documents should be sufficiently detailed, yet short enough to actually be read by busy people. For a larger project, the sweet spot seems to be around 10 to 20 pages. If you go well beyond that, it may make sense to split the problem into more manageable sub-problems. It is also entirely possible to write a 1-3 page "mini design doc". This is especially helpful for incremental improvements or sub-tasks in agile projects: you still go through the same steps as for a longer document, but keep things terser and focused on a limited problem set.

2. When not to write a design doc

Writing a design document has a cost, so deciding whether to write one is itself a trade-off. On one side are the benefits of organizational consensus around the design, documentation, and senior review; on the other is the extra work of creating the document. At the heart of the decision is whether the solution to the design problem is ambiguous, whether because of problem complexity, solution complexity, or both. If it is not, there is little value in going through the process of writing a design document.

A clear sign that a design document might not be necessary is that it is really an implementation manual. If the document basically says "here is how we are going to implement it" without discussing trade-offs and alternatives or explaining the decisions (or if the solution is so obvious that there are no trade-offs to discuss), then it would probably have been better to write the actual code right away.

Finally, the overhead of creating and reviewing a design document may not be compatible with rapid prototyping and iteration. However, most software projects do have a set of actually known problems. Embracing agile methods is not an excuse for not taking the time to find the right solution to a known problem. Moreover, prototyping may itself be part of creating the design document: "I tried it out and it works" is one of the best arguments for choosing a design.

3. The design doc lifecycle

The life cycle of a design document includes the following phases:

1. Creation and rapid iteration

2. Review (possibly in multiple rounds)

3. Implementation and iteration

4. Maintenance and learning

3.1 Creation and rapid iteration

You write the document, sometimes together with a set of co-authors.

This phase quickly evolves into a period of rapid iteration: the document is shared with the colleagues who know the problem space best (usually members of the same team), and their clarifying questions and suggestions drive it to a first relatively stable version.

You will certainly find engineers and even teams who prefer version control and code review tools for creating documents, but the vast majority of design documents at Google are created in Google Docs and make heavy use of its collaboration features.

3.2 Review

During the review phase, the design document is shared with a wider audience than the original authors and their close collaborators. Reviews can add a lot of value, but they are also a dangerous source of overhead, so treat them wisely.

Reviews can take many forms. The lightest version is simply to send the document to the wider team so that everyone has a chance to take a look; the discussion then takes place mainly in comment threads on the document. At the heavier end is a formal design review meeting, in which the author presents the document (often with a dedicated presentation) to an audience of, usually, very senior engineers. Many teams at Google hold recurring meetings for this purpose, and engineers can sign up for a review. Naturally, waiting for such a meeting can slow development down considerably. Engineers can mitigate this by seeking the most critical feedback directly and not blocking progress on the wider review.

When Google was a smaller company, it was customary to send designs to a single central mailing list, where senior engineers reviewed them as their time allowed. This may well be a good way to handle things for your company. One benefit was that it established a relatively uniform software design culture across the company. But as the company grew into a much larger engineering organization, maintaining the centralized approach became infeasible.

The main value such reviews add is that they form an opportunity for the combined experience of the organization to be incorporated into the design. Most consistently, the review stage can ensure that the design takes cross-cutting concerns such as observability, security, and privacy into account. The primary value of the review is not that issues are discovered at all, but that they are discovered relatively early in the development life cycle, when changes are still relatively cheap.

3.3 Implementation and iteration

When things have progressed far enough that further reviews are unlikely to require major changes to the design, it is time to begin implementation. As plans collide with reality, it is inevitable that shortcomings, unaddressed requirements, or educated guesses that turn out to be wrong surface and require changing the design. Updating the design document in this case is strongly recommended. As a rule of thumb, if the designed system has not shipped yet, definitely update the document. In practice, we humans are bad at updating documents and, for other practical reasons, changes often end up isolated in new documents. This leads to an eventual state more akin to the U.S. Constitution with its amendments than to one coherent document. Links to these amendments from the original document are immensely helpful to the poor future maintenance programmer who tries to understand the target system through design-document archaeology.
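The idea of linking amendments from the original document can be sketched as a small data structure. This is an illustration under stated assumptions, not a description of any Google tooling: the `DesignDoc` class, its fields, and the `example.com` URLs are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DesignDoc:
    """Hypothetical record of a design document and its later amendments."""
    title: str
    url: str
    written_on: date
    amendments: list["DesignDoc"] = field(default_factory=list)

    def add_amendment(self, doc: "DesignDoc") -> None:
        """Link a follow-up document from the original."""
        self.amendments.append(doc)

    def reading_order(self) -> list["DesignDoc"]:
        """Original first, then amendments from oldest to newest."""
        return [self] + sorted(self.amendments, key=lambda d: d.written_on)


original = DesignDoc("Ledger design", "https://example.com/ledger", date(2021, 3, 1))
original.add_amendment(
    DesignDoc("Ledger: sharding change", "https://example.com/ledger-2", date(2022, 6, 1))
)
original.add_amendment(
    DesignDoc("Ledger: retry policy", "https://example.com/ledger-1", date(2021, 9, 1))
)
titles = [doc.title for doc in original.reading_order()]
```

A maintainer who starts from the original document can then walk the amendments in the order in which the design actually changed.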

3.4 Maintenance and learning

When Google engineers first approach a system, their first question is usually "Where is the design document?". While design documents, like all other documents, will become inconsistent with reality over time, they are still the best starting material for learning the thinking behind the system when it was first created.

As the author, do yourself a favor and re-read your own design documents a year or two later. What did you get right? What did you get wrong? What would you decide differently today? Answering these questions is an excellent way for an engineer to keep improving their software design skills.

4. Conclusions

In software projects, design documents are an excellent way to gain clarity and build consensus around solving the hardest problems. They save money, because upfront investigation helps avoid coding rabbit holes that fail to achieve the project goals; they also cost money, because writing and reviewing them takes time. So choose wisely for your project.

When considering whether to write a design document, consider the following questions:

● Are you unsure about the right software design, and would it make sense to spend upfront time to gain certainty?

● Would it help to involve senior engineers, who may not have time to review every code change, in the design?

● Is the software design ambiguous or even contentious, such that achieving organizational consensus around it would be valuable?

● Does your team sometimes forget to consider privacy, security, logging, or other cross-cutting concerns in the design?

● Is there a strong need in the organization for documents that give high-level insight into the design of legacy systems?

If you answered "yes" to three or more of these questions, a design document is probably a good way to start your next software project.
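The rule of thumb above amounts to counting "yes" answers against a threshold of three. A minimal sketch, assuming the five questions are answered in the order listed; the function name `design_doc_recommended` and the shortened question texts are illustrative only.

```python
from typing import Sequence

# Shortened paraphrases of the five questions, in the order listed above.
QUESTIONS = (
    "Unsure about the right software design?",
    "Would involving senior engineers in the design help?",
    "Is the design ambiguous or contentious?",
    "Does the team sometimes forget cross-cutting concerns?",
    "Is high-level documentation of legacy systems needed?",
)


def design_doc_recommended(answers: Sequence[bool], threshold: int = 3) -> bool:
    """Return True when at least `threshold` questions were answered 'yes'."""
    if len(answers) != len(QUESTIONS):
        raise ValueError(f"expected {len(QUESTIONS)} answers, got {len(answers)}")
    return sum(bool(answer) for answer in answers) >= threshold


recommended = design_doc_recommended([True, True, False, True, False])
not_recommended = design_doc_recommended([True, False, False, True, False])
```

The result is a prompt for judgment rather than a verdict; the cost of writing and reviewing the document still has to be weighed for the project at hand.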

Origin blog.csdn.net/u013527895/article/details/128739250