42 | Hands-on practice (2): The backend of the "Paint" program

In the previous chapter, we implemented a mock version of the server. The code is as follows:

https://github.com/qiniu/qpaint/tree/v31/paintdom

Next, we will iterate step by step and turn it into a production-level server program.

We have mentioned before that the business logic of a server program is divided into two layers. The bottom layer is the implementation layer of the business logic, which we usually deliberately organize as a DOM tree. The upper layer is the RESTful API layer, which receives users' network requests and converts them into method calls on the underlying DOM tree.

In the previous lecture we focused on the RESTful API layer. To implement it, we introduced the RPC framework restrpc and the unit testing framework qiniutest.

In this lecture, we focus on the underlying business logic implementation layer.

User interface (interface)

Let’s first look at the user interface (interface) of this layer. From the perspective of the DOM tree, before this lecture, its logical structure was as follows:

<Drawing1>
  <Shape11>
  ...
  <Shape1M>
...
<DrawingN>

At the top level, the hierarchy has only three layers:

Document => Drawing => Shape

So, what changes will happen to the DOM tree after the introduction of multi-tenancy (that is, multiple users, each user has its own uid)?

For example, should we turn it into four layers:

Document => User => Drawing => Shape

<User1>
  <Drawing11>
    <Shape111>
    ...
    <Shape11M>
  ...
  <Drawing1N>
...
<UserK>

My answer is: multi-tenancy should not affect the structure of the DOM tree. So the correct design should be:

<Drawing1>, belongs to some <uid>
  <Shape11>
  ...
  <Shape1M>
  ...
<DrawingN>, belongs to some <uid>

In other words, multi-tenancy only adds some extra conventions to the DOM tree. In general, we should treat it as a security convention that prevents a user from accessing resources they have no permission to access.

So multi-tenancy will not lead to changes in the DOM hierarchy, but it will lead to changes in interface methods. For example, let's look at the methods of the Document class. Previously, the Document class interface looked like this:

func (p *Document) Add() (drawing *Drawing, err error)
func (p *Document) Get(dgid string) (drawing *Drawing, err error)
func (p *Document) Delete(dgid string) (err error)

Now it becomes:

// Add creates a new drawing.
func (p *Document) Add(uid UserID) (drawing *Drawing, err error)

// Get retrieves a drawing.
// We check whether the drawing is owned by uid; if it is not, Get fails.
func (p *Document) Get(uid UserID, dgid string) (drawing *Drawing, err error)

// Delete deletes a drawing.
// We check whether the drawing is owned by uid; if it is not, Delete fails.
func (p *Document) Delete(uid UserID, dgid string) (err error)

As the comments say, passing in uid acts as a constraint: whether we get or delete a drawing, we check whether the drawing belongs to that user.
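The ownership check described in the comments can be sketched with an in-memory stand-in. This is a minimal illustration, not the QPaint implementation: the map-based storage, the NewDocument constructor, the ErrNotFound value, and the definition of UserID as uint64 are all assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// UserID and ErrNotFound are assumptions made for this sketch.
type UserID uint64

var ErrNotFound = errors.New("drawing not found")

// Drawing keeps only what the ownership check needs.
type Drawing struct {
	id  string
	uid UserID
}

func (p *Drawing) GetID() string { return p.id }

// Document is an in-memory stand-in for the database-backed type.
type Document struct {
	next     int
	drawings map[string]*Drawing
}

func NewDocument() *Document {
	return &Document{drawings: make(map[string]*Drawing)}
}

// Add creates a new drawing owned by uid.
func (p *Document) Add(uid UserID) (*Drawing, error) {
	p.next++
	dg := &Drawing{id: strconv.Itoa(p.next), uid: uid}
	p.drawings[dg.id] = dg
	return dg, nil
}

// Get fails when the drawing is missing or belongs to another user.
func (p *Document) Get(uid UserID, dgid string) (*Drawing, error) {
	dg, ok := p.drawings[dgid]
	if !ok || dg.uid != uid {
		return nil, ErrNotFound
	}
	return dg, nil
}

// Delete applies the same ownership check before removing the drawing.
func (p *Document) Delete(uid UserID, dgid string) error {
	if _, err := p.Get(uid, dgid); err != nil {
		return err
	}
	delete(p.drawings, dgid)
	return nil
}

func main() {
	doc := NewDocument()
	dg, _ := doc.Add(1)
	_, err := doc.Get(2, dg.GetID())
	fmt.Println(err)                       // drawing not found
	fmt.Println(doc.Delete(2, dg.GetID())) // drawing not found
	fmt.Println(doc.Delete(1, dg.GetID())) // <nil>
}
```

Returning the same error for "missing" and "owned by someone else" is a deliberate choice in this sketch: a caller cannot use the error to probe whether another user's drawing exists.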

For the QPaint program, the interfaces of other classes except the Document class have not changed. For example, the interface of the Drawing class is as follows:

func (p *Drawing) GetID() string
func (p *Drawing) Add(shape Shape) (err error)
func (p *Drawing) List() (shapes []Shape, err error)
func (p *Drawing) Get(id ShapeID) (shape Shape, err error)
func (p *Drawing) Set(id ShapeID, shape Shape) (err error)
func (p *Drawing) SetZorder(id ShapeID, zorder string) (err error)
func (p *Drawing) Delete(id ShapeID) (err error)
func (p *Drawing) Sync(shapes []ShapeID, changes []Shape) (err error)

But this is just because the business logic of the QPaint program is relatively simple. Although we need to try our best to avoid interface changes due to multi-tenancy, this impact is sometimes unavoidable.

In addition, when describing the user interface of a class, we cannot describe only the language-level conventions. For example, the Drawing class above refers to Shape objects through a Go interface, as follows:

type ShapeID = string

type Shape interface {
    GetID() ShapeID
}

But, is this interface all the constraints of Shape?

The answer is obviously not.

Let's look at the most basic constraint first. The Shape instances returned by the List and Get methods of the Drawing class are returned directly as results of the RESTful API. Therefore, one known constraint on Shape is that its json.Marshal result must meet the expectations of the API layer.
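To make the json.Marshal constraint concrete, here is a small sketch. The Line and Point types, their field names, their JSON tags, and the encode helper are hypothetical and chosen only for illustration; the actual shapes and wire format in QPaint may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type ShapeID = string

type Shape interface {
	GetID() ShapeID
}

// Point and Line are hypothetical; the JSON tags stand in for
// "what the API layer expects".
type Point struct {
	X float64 `json:"x"`
	Y float64 `json:"y"`
}

type Line struct {
	ID  ShapeID `json:"id"`
	Pt1 Point   `json:"pt1"`
	Pt2 Point   `json:"pt2"`
}

func (p *Line) GetID() ShapeID { return p.ID }

// encode shows what a caller holding only the Shape interface would
// send back: json.Marshal serializes the concrete value behind it.
func encode(shape Shape) (string, error) {
	b, err := json.Marshal(shape)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	var shape Shape = &Line{ID: "1", Pt1: Point{0, 0}, Pt2: Point{3, 4}}
	s, err := encode(shape)
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // {"id":"1","pt1":{"x":0,"y":0},"pt2":{"x":3,"y":4}}
}
```

The Go interface only requires GetID, so nothing at compile time stops a Shape implementation from producing JSON the API layer does not expect. That part of the contract has to be documented and tested separately.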

As for the complete set of constraints that this lecture's code places on Shape, feel free to leave a comment to discuss.

Data structure

With the user interface clarified, the next step is to consider the implementation. You have probably heard this saying:

Program = data structure + algorithm

It is a good guiding principle. So when we talk about the implementation of a program, we always describe it along two dimensions: data structure and algorithm.

Let's look at the data structure first.

For server-side programs, the data structure is not completely within our control. In the lecture "36 | Business Status and Storage Middleware" we said that storage is the data structure. Therefore, the most important thing about the data structure of the server program is to choose the appropriate storage middleware. Then we organize our data on top of that storage middleware.

For the QPaint server program, we chose mongodb.

Why mongodb and not some kind of relational database?

The most important reason is the openness of Shape objects. Because there are many kinds of shapes, we cannot predict their schema in advance. A document database is therefore a better fit.

Having chosen mongodb as the storage middleware, our next step is to define the table structure. Strictly speaking, "table" is the relational database term, and in mongodb it is called a collection. But by convention we still often say "define the table structure" to describe what we are doing.

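We define two tables (collections): drawing and shape. The drawing table records all drawings, and the shape table records all shapes. In outline, each drawing document holds _id (the dgid), uid, and shapes (the list of spids in the drawing), and each shape document holds dgid, spid, and shape (the shape's content).

Here we focus on the design of the indexes.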

In the drawing table, we have an index for uid. This is easier to understand: although we currently do not provide a method to list all drawings of a user, this will happen sooner or later.

In the shape table, we create a compound unique index on (dgid, spid). This is because spid, as the ShapeID, is unique only within a drawing, not globally. So it must be combined with dgid to form a unique index.

Algorithm

With the data structure covered, let's turn to the algorithm.

In the statement "program = data structure + algorithm", what does "algorithm" refer to?

During requirements analysis, the first phase of the architecture process, we focus on expressing user needs precisely. We introduce roles, which are the various participants in the system, and user stories, which are the ways roles interact.

In the detailed design stage, roles and user stories become the user interface (interface) of a subsystem, module, class, or function. We have emphasized before that the user interface (interface) should naturally reflect business needs, because the program exists to serve user needs. This also gives architectural design a natural continuity from requirements analysis through the subsequent outline design, detailed design, and later stages.

So an algorithm, in its most straightforward sense, is the implementation mechanism behind a user story.

Data structure + algorithm exists to satisfy the roles and user stories defined at the outset, and this is the core concern of the detailed design phase. Here are some typical user stories:

Create new drawing (uid):

dgid = newObjectId()
db.drawing.insert({_id: dgid, uid: uid, shapes:[]})
return dgid

Get the content of drawing (uid, dgid):

doc = db.drawing.findOne({_id: dgid, uid: uid})
shapes = []
foreach spid in doc.shapes {
    o = db.shape.findOne({dgid: dgid, spid: spid})
    shapes.push(o.shape)
}
return shapes

Delete drawing (uid, dgid):

if db.drawing.remove({_id: dgid, uid: uid}) { // make sure the user may delete this drawing
    db.shape.remove({dgid: dgid})
}

Create new shape (uid, dgid, shape):

if db.drawing.find({_id: dgid, uid: uid}) { // make sure the user may operate on this drawing
    db.shape.insert({dgid: dgid, spid: shape.id, shape: shape})
    db.drawing.update({_id: dgid}, {$push: {shapes: shape.id}})
}

Delete shape (uid, dgid, spid):

if db.drawing.find({_id: dgid, uid: uid}) { // make sure the user may operate on this drawing
    if db.drawing.update({_id: dgid}, {$pull: {shapes: spid}}) {
        db.shape.remove({dgid: dgid, spid: spid})
    }
}

These algorithms are expressed in a kind of pseudocode, though not entirely. If you have used the mongo shell, you will recognize that each database operation in them is a real, valid mongo shell operation.

In addition, strictly speaking, every modification in the algorithms above should be performed as a transaction. Take the code that deletes a drawing, for example:

if db.drawing.remove({_id: dgid, uid: uid}) { // make sure the user may delete this drawing
    db.shape.remove({dgid: dgid})
}

If the remove on the drawing table in the first statement succeeds, but the process shuts down before the remove on the shape table completes, then everything looks normal from the perspective of the user's business logic. From the perspective of system maintenance, however, some orphan shape objects are left in the system and will never get a chance to be cleaned up.
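The failure mode can be reproduced in a small simulation. The sketch below uses hypothetical in-memory maps in place of the two collections, and a crash flag that stands in for a shutdown between the two remove steps. It is not how QPaint talks to mongodb; it only shows where the orphans come from.

```go
package main

import "fmt"

// shapeKey identifies a shape by (dgid, spid).
type shapeKey struct{ dgid, spid string }

// store is a hypothetical in-memory stand-in for the two collections.
type store struct {
	drawings map[string]uint64 // dgid -> owner uid
	shapes   map[shapeKey]bool // set of existing shapes
}

// deleteDrawing removes the drawing first and its shapes second.
// crash=true simulates a shutdown between the two steps.
func (s *store) deleteDrawing(uid uint64, dgid string, crash bool) bool {
	owner, ok := s.drawings[dgid]
	if !ok || owner != uid {
		return false // not found, or not owned by uid
	}
	delete(s.drawings, dgid)
	if crash {
		return true // the drawing is gone, its shapes are not
	}
	for k := range s.shapes {
		if k.dgid == dgid {
			delete(s.shapes, k)
		}
	}
	return true
}

// orphans counts shapes whose drawing no longer exists.
func (s *store) orphans() int {
	n := 0
	for k := range s.shapes {
		if _, ok := s.drawings[k.dgid]; !ok {
			n++
		}
	}
	return n
}

func newStore() *store {
	return &store{
		drawings: map[string]uint64{"d1": 1, "d2": 1},
		shapes: map[shapeKey]bool{
			{"d1", "1"}: true, {"d1", "2"}: true, {"d2", "1"}: true,
		},
	}
}

func main() {
	s := newStore()
	s.deleteDrawing(1, "d1", true)  // interrupted after the first step
	fmt.Println(s.orphans())        // 2
	s.deleteDrawing(1, "d2", false) // runs to completion
	fmt.Println(s.orphans())        // 2
}
```

Once the drawing record is gone, a later delete request for the same dgid fails the ownership check and never reaches the shape cleanup, so the two orphans from the interrupted call stay in place.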

Network protocol

Since the underlying business logic implementation layer now supports multi-tenancy, our network protocol needs to change accordingly. In this lecture we make only the simplest adjustment and introduce a mock authorization mechanism, as follows:

Authorization: QPaintStub <uid>

Now that we have Authorization, we can no longer use restrpc.Env as the environment for RPC requests. We implement an Env ourselves, as follows:

type Env struct {
    restrpc.Env
    UID UserID
}

func (p *Env) OpenEnv(rcvr interface{}, w *http.ResponseWriter, req *http.Request) error {
    auth := req.Header.Get("Authorization")
    pos := strings.Index(auth, " ")
    if pos < 0 || auth[:pos] != "QPaintStub" {
        return errBadToken
    }
    uid, err := strconv.Atoi(auth[pos+1:])
    if err != nil {
        return errBadToken
    }
    p.UID = UserID(uid)
    return p.Env.OpenEnv(rcvr, w, req)
}

By replacing every restrpc.Env with our own Env and making a few small adjustments to the code (adding an env.UID argument to calls on the Document class), we complete the basic multi-tenant transformation.
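To see the mock authorization from the client's side, the following sketch builds a request that carries the header and then parses it with the same checks OpenEnv performs. The helper names newRequest and parseUID, the local errBadToken value, and the request method and URL are assumptions for this example. It uses only the standard library and does not depend on restrpc.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

var errBadToken = errors.New("bad token")

// newRequest builds a client request carrying the stub authorization.
// The method and URL are placeholders for this sketch.
func newRequest(uid int) (*http.Request, error) {
	req, err := http.NewRequest("POST", "http://localhost:9999/drawings", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "QPaintStub "+strconv.Itoa(uid))
	return req, nil
}

// parseUID applies the same checks as OpenEnv to the Authorization header.
func parseUID(req *http.Request) (int, error) {
	auth := req.Header.Get("Authorization")
	pos := strings.Index(auth, " ")
	if pos < 0 || auth[:pos] != "QPaintStub" {
		return 0, errBadToken
	}
	uid, err := strconv.Atoi(auth[pos+1:])
	if err != nil {
		return 0, errBadToken
	}
	return uid, nil
}

func main() {
	req, err := newRequest(42)
	if err != nil {
		panic(err)
	}
	uid, err := parseUID(req)
	fmt.Println(uid, err) // 42 <nil>

	req.Header.Set("Authorization", "Bearer 42")
	_, err = parseUID(req)
	fmt.Println(err) // bad token
}
```

Because the uid is taken from the header as-is, this scheme offers no real protection. It is a stub that lets the rest of the multi-tenant code be exercised before a real authorization mechanism exists.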

The complete RESTful API layer code after transformation is as follows:

https://github.com/qiniu/qpaint/blob/v42/paintdom/service.go

Conclusion

To summarize: today we mainly transformed the underlying business logic implementation layer.

On the one hand, we made the user interface (interface) multi-tenant. At the network protocol level, this mainly means adding authorization (Authorization). At the underlying DOM interface level, it mainly means adding a uid parameter to the methods of the Document class.

On the other hand, we completed a new implementation based on mongodb and described its data structures and algorithms in detail. For a fuller understanding of the implementation details, see the following two files:

https://github.com/qiniu/qpaint/blob/v42/paintdom/README_IMPL.md

https://github.com/qiniu/qpaint/blob/v42/paintdom/drawing.go