Distributed software architecture - RESTful service

RESTful (Representational State Transfer)

REST is a design style and development approach for web applications. It is based on HTTP, and its data can be exchanged in XML or JSON format. REST suits scenarios in which mobile Internet vendors expose business interfaces, so that third-party OTT applications can call mobile network resources. The operations consist of creating, modifying, and deleting the resources being called. It is a dominant style of remote service access.

Comparison of REST and RPC

The core difference in thinking between REST and RPC lies in what they abstract: REST follows resource-oriented programming ideas, while RPC follows procedure-oriented ones.

Because of this conceptual difference, REST is not a remote service invocation protocol; in fact, it is not a protocol at all. Protocols are standardized and mandatory, so a protocol should at least have a specification document. JSON-RPC, for example, however simple it is, has the "JSON-RPC 2.0 Specification" to specify the protocol's format details, exceptions, response codes, and other information. REST defines none of these; it has some guiding principles, but nothing enforces them.

REST

The concept of REST comes from Roy Fielding's doctoral dissertation, published in 2000: "Architectural Styles and the Design of Network-based Software Architectures". For the Chinese version, see Architectural Style and Network Software Architecture Design.

Roy Fielding is a highly accomplished software engineer. His main credentials are as follows:
1) Core developer of the Apache server, and later a co-founder of the Apache Software Foundation
2) Member of the expert group for the HTTP 1.0 protocol (released in 1996)
3) Lead of the HTTP 1.1 protocol (released in 1999)
4) The theory and ideas that guided the design of HTTP 1.1 were first circulated among the expert group members as a memo; that memo was in effect the prototype of REST.

What is "Representational State Transfer", that is, what is REST?

Hypertext

First recall what HTTP is, then draw an analogy with some practical examples. You will find that "REST" is in effect a further abstraction of "HTT" (Hyper Text Transfer); the two are related much like an interface and its implementation class. The term "hypertext" used in HTTP was proposed by the American sociologist Ted H. Nelson (Theodor Holm Nelson) in the 1967 article "Brief Words on the Hypertext".

Nelson's revised definition in 1992:

By now the word "hypertext" has become generally accepted for branching and responding text, but the corresponding word "hypermedia", meaning complexes of branching and responding graphics, movies and sound – as well as text – is much less used. Instead they use the strange term "interactive multimedia": this is four syllables longer, and does not express the idea of extending hypertext.
Theodor Holm Nelson, Literary Machines, 1992

In other words, the "hypertext" (or hypermedia) described here is text (or sound, images, etc.) that can judge and respond to manipulation.

Resource

The content itself (which can be regarded as some kind of information or data) is called a "resource".

Representation

A "representation" is the form in which information is presented when interacting with the user. For example, the same resource can be delivered as:

  • Data in HTML format returned by the server to the browser
  • Data in PDF format returned by the server to the browser
  • Data in Markdown format returned by the server to the browser
  • Data in RSS format returned by the server to the browser
  • ...

All of the above are different representations of the same resource.
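
As a rough illustration of one resource with several representations, the following Python sketch renders the same in-memory article in different formats depending on the requested media type. The `ARTICLE` data, the `render` function, and the set of supported media types are assumptions made up for this example; they do not come from any framework.

```python
import json

# Hypothetical resource: the information itself, independent of any format.
ARTICLE = {"id": 42, "title": "REST basics", "body": "Resources and representations."}


def render(resource: dict, media_type: str) -> str:
    """Return one representation of the resource for the given media type."""
    if media_type == "application/json":
        return json.dumps(resource, sort_keys=True)
    if media_type == "text/html":
        return f"<h1>{resource['title']}</h1><p>{resource['body']}</p>"
    if media_type == "text/markdown":
        return f"# {resource['title']}\n\n{resource['body']}"
    raise ValueError(f"unsupported media type: {media_type}")


# Same resource, three representations.
html = render(ARTICLE, "text/html")
markdown = render(ARTICLE, "text/markdown")
as_json = render(ARTICLE, "application/json")
```

The resource (`ARTICLE`) never changes; only the form handed to the client does.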

State

Context information that can only arise in a specific context is called "state". For example, after reading this article you want to read the next one, so you send the server a request: "give me the next article". But "next article" is a relative concept that depends on which article you are currently reading; only with that context can the server respond correctly.

Stateful and stateless are defined relative to the server:

  1. If the server remembers the user's state, the interaction is stateful
  2. If the client remembers the state and explicitly tells the server in each request, the interaction is stateless
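
The two cases can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `ARTICLES` list, the `StatefulServer` class, and the `stateless_next` function are hypothetical names invented for the illustration.

```python
ARTICLES = ["intro", "rest", "rpc"]  # hypothetical article ids, in reading order


class StatefulServer:
    """Stateful: the server remembers which article each user is reading."""

    def __init__(self) -> None:
        self._current = {}  # user -> index of the article being read

    def open(self, user: str, article: str) -> str:
        self._current[user] = ARTICLES.index(article)
        return article

    def next(self, user: str) -> str:
        # "Next" is resolved from context stored on the server.
        self._current[user] += 1
        return ARTICLES[self._current[user]]


def stateless_next(current_article: str) -> str:
    """Stateless: the client sends the context; the server stores nothing."""
    return ARTICLES[ARTICLES.index(current_article) + 1]


server = StatefulServer()
server.open("alice", "intro")
stateful_result = server.next("alice")      # server knows alice is on "intro"
stateless_result = stateless_next("intro")  # client says it is on "intro"
```

Both calls yield the same article; the difference is only where the "currently reading" context lives.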

Transfer

When the server, by some means, turns "the article the user is currently reading" into "the next article", that is called "representational state transfer".

  1. Uniform Interface:

The "uniform interface" consists of seven basic operations: GET, HEAD, POST, PUT, DELETE, TRACE, and OPTIONS.

Any server that supports the HTTP protocol abides by this set of conventions: when one of these operations is applied to a specific URI, the server triggers the corresponding representational state transfer.

  2. Hypertext Driven:

Because the browser is a general-purpose client, no website's navigation (state transfer) behavior can be preset in the browser code; it is driven by the response information (hypertext) sent back by the server. This is very different from other client software, where business logic is usually built into the program code and a dedicated page controller (in either the server or the client) drives the page's state transitions.

  3. Self-Descriptive Messages:

A widely used way of making a message self-describing is to declare its Internet media type (MIME type) in the HTTP header named "Content-Type", for example "Content-Type: application/json; charset=utf-8".
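
A receiver can use that header alone to decide how to decode the body. The sketch below, using only the Python standard library, parses the header value quoted above; the body bytes and the variable names are assumptions for the illustration.

```python
import json
from email.message import Message

raw_header = "application/json; charset=utf-8"
raw_body = b'{"doctor": "mjones"}'  # hypothetical message body

# email.message.Message understands MIME-style headers with parameters.
msg = Message()
msg["Content-Type"] = raw_header

media_type = msg.get_content_type()   # media type without its parameters
charset = msg.get_content_charset()   # value of the charset parameter

# The message describes itself, so the receiver needs no out-of-band knowledge.
if media_type == "application/json":
    payload = json.loads(raw_body.decode(charset))
else:
    payload = raw_body
```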

Internet media type (MIME type):
https://zh.wikipedia.org/wiki/%E4%BA%92%E8%81%94%E7%BD%91%E5%AA%92%E4%BD%93 %E7%B1%BB%E5%9E%

System features of RESTful style

Fielding believes that an ideal system that fully meets REST should satisfy the following six characteristics:

  1. Separation of server and client (Client-Server)
  2. Stateless
  3. Cacheability
  4. Layered System
  5. Uniform Interface
  6. Code-On-Demand

The difference in thinking between REST and RPC is that REST's basic idea is to abstract problems in terms of resources, which makes the subject of its abstraction essentially different from that of the prevailing procedure-oriented programming.

RPC carries the idea of local method invocation over to remote method invocation: developers design the interaction between two systems around "remote methods", as in CORBA, RMI, DCOM, and so on. The drawback is not only that "how to represent a method across heterogeneous systems" and "how to obtain the list of methods an interface provides" become problems that a dedicated protocol has to solve (one of the three basic problems of RPC). More importantly, every method of a service is different, and service users must learn them one by one to use them correctly. Google wrote the following passage in the "Google API Design Guide":

Traditionally, people design RPC APIs in terms of API interfaces and methods, such as CORBA and Windows COM. As time goes by, more and more interfaces and methods are introduced. The end result can be an overwhelming number of interfaces and methods, each of them different from the others. Developers have to learn each one carefully in order to use it correctly, which can be both time consuming and error prone.
——Google API Design Guide, 2017

REST proposes a service design style centered on resources, which brings it a number of benefits.

  1. Reduce the learning cost of the service interface.
    The Uniform Interface is a hallmark of REST. It maps the standard operations on resources to the standard HTTP methods, which are the same for every service, so there is nothing service-specific to learn and no need for an Interface Description Language or similar protocol.
  2. Resources naturally have collections and hierarchies.
    In an interface abstracted around methods, the methods are verbs, so each interface is logically independent of the others. In an interface abstracted around resources, the resources are nouns and naturally form collections and hierarchies. As a concrete example, imagine the interface of a shopping site's user center: a user resource has several subordinate resources, such as a number of short message resources, one user profile resource, and one shopping cart resource; the shopping cart in turn has its own subordinate resources, such as several book resources. These collection and hierarchy relationships are easy to build into the program interface, and they match the intuition people have long had for managing data, whether on a single machine or over a network. Without reading any interface specification, you can easily infer that the REST interface for getting the second book in user icyfenix's shopping cart should be:
GET /users/icyfenix/cart/2
  3. REST is bound to the HTTP protocol.
    Resource-oriented programming does not have to be built on HTTP, but REST is, which is both a disadvantage and an advantage. Because HTTP was designed from the start as a network protocol for resources, the advantage of using plain HTTP (rather than building another protocol on top of it, as SOAP over HTTP does) is that there is no need to think about the wire protocol as in RPC; REST reuses the concepts and supporting infrastructure that HTTP already defines. HTTP has been operating effectively for thirty years, and the technical infrastructure around it is battle-tested and mature. The downside, of course, is that when you need features HTTP does not provide, you are left with no recourse.
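
The point about collections and hierarchies can be illustrated with a small path builder. This is a minimal sketch: `resource_path` is a hypothetical helper written for this example, and it only shows how nested nouns compose into a predictable URI.

```python
from urllib.parse import quote


def resource_path(*segments: object) -> str:
    """Join collection names and ids into a hierarchical resource path."""
    # Each segment is percent-encoded so that ids cannot break the hierarchy.
    return "/" + "/".join(quote(str(s), safe="") for s in segments)


# users -> one user -> that user's cart -> second item in the cart
book_path = resource_path("users", "icyfenix", "cart", 2)
# The user's profile is a sibling of the cart under the same user.
profile_path = resource_path("users", "icyfenix", "profile")
```

Because every level is a noun, a client that knows one path can guess its neighbours, which is the intuition the text appeals to.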

Richardson Maturity Model

Leonard Richardson, the author of "RESTful Web APIs" and "RESTful Web Services", proposed the Richardson Maturity Model (RMM) to measure "how RESTful a service is", so that systems not originally built on REST can adopt it step by step. Richardson ranks a service interface's "degree of REST" on four levels, from 0 (lowest) to 3 (highest):

  • 0. The Swamp of Plain Old XML: not REST at all. (As for the "Plain Old XML" label, SOAP says it feels offended.)
  • 1. Resources: the concept of resources is introduced.
  • 2. HTTP Verbs: the uniform interface is introduced and mapped to the methods of the HTTP protocol.
  • 3. Hypermedia Controls: called "hypertext-driven" in this article and "Hypermedia as the Engine of Application State" (HATEOAS) in Fielding's dissertation; both refer to the same thing.
    (Figure: REST Maturity Model)

The following uses the worked example from Martin Fowler's article on the Richardson Maturity Model (the original uses XML; it is simplified to JSON here) to show what the four levels of REST look like in an actual interface. Suppose you are a software engineer and receive the following user story (simplified from the original): develop a doctor appointment system in which patients can look up a given doctor's availability on a given date and book an appointment.

Level 0 Maturity: The Swamp of Plain Old XML

The hospital exposes a Web API at /appointmentService. The date and the doctor's name are passed in as parameters, and the doctor's free time slots for that date are returned. An HTTP call to the API looks like this:

POST /appointmentService?action=query HTTP/1.1
{
  "date": "2020-03-04",
  "doctor": "mjones"
}

The server returns the result:

HTTP/1.1 200 OK
[
  {
    "start": "14:00",
    "end": "14:50",
    "doctor": "mjones"
  },
  {
    "start": "16:00",
    "end": "16:50",
    "doctor": "mjones"
  }
]

Having seen the doctor's free slots, I decide that 14:00 suits me best, so I confirm the appointment and submit my basic information:

POST /appointmentService?action=confirm HTTP/1.1
{
  "appointment": {
    "date": "2020-03-04",
    "start": "14:00",
    "doctor": "mjones"
  },
  "patient": {
    "name": "icyfenix",
    "age": 30
  }
}

If the reservation is successful, then I can receive a reservation success response:

HTTP/1.1 200 OK
{
  "code": 0,
  "message": "Successful confirmation of appointment"
}

If something goes wrong, like someone has booked ahead of me, then I get some kind of error message in the response:

HTTP/1.1 200 OK
{
  "code": 1,
  "message": "doctor not available"
}

At this point, the entire reservation service is declared complete. It is straightforward. We have adopted a very intuitive RPC-based service design, which seems to solve all problems easily, but is it really the case?
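
To make the RPC flavour of level 0 concrete, here is a minimal server-side sketch of such an endpoint in Python. Everything in it is an assumption for illustration: the `OPEN_SLOTS` data, the `appointment_service` function, and the extra code 2 for an unknown action are not part of Fowler's example.

```python
# Hypothetical in-memory data: open start times per (doctor, date).
OPEN_SLOTS = {("mjones", "2020-03-04"): ["14:00", "16:00"]}


def appointment_service(action: str, body: dict) -> tuple:
    """Single endpoint: the 'action' parameter selects the procedure to run."""
    if action == "query":
        return 200, list(OPEN_SLOTS.get((body["doctor"], body["date"]), []))
    if action == "confirm":
        appt = body["appointment"]
        slots = OPEN_SLOTS.get((appt["doctor"], appt["date"]), [])
        if appt["start"] in slots:
            slots.remove(appt["start"])
            return 200, {"code": 0, "message": "Successful confirmation of appointment"}
        # Failure is still HTTP 200; the caller must inspect "code".
        return 200, {"code": 1, "message": "doctor not available"}
    return 200, {"code": 2, "message": "unknown action"}  # assumed extra code


request = {
    "appointment": {"date": "2020-03-04", "start": "14:00", "doctor": "mjones"},
    "patient": {"name": "icyfenix", "age": 30},
}
first = appointment_service("confirm", request)   # slot is free
second = appointment_service("confirm", request)  # slot already taken
```

Every new operation means another branch and another private error code, which is exactly the growth problem the next levels address.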

Level 1 Maturity: Resources

Level 0 is the RPC style, and it works perfectly well as long as the requirements never change or grow. However, if you would rather not write new methods for every operation beyond booking an appointment or every piece of information beyond free time, or keep changing the interfaces of existing methods, you should consider using REST to abstract resources.

The first step toward REST is to introduce the concept of resources. In an API, this means designing services around resources rather than procedures; put simply, the endpoint of a service should be a noun rather than a verb. In addition, each request should contain the resource ID, and all operations are performed through that ID. For example, to get a given doctor's free slots on a given date:

POST /doctors/mjones HTTP/1.1
{
  "date": "2020-03-04"
}

Then the server returns a set of schedule lists containing ID information. Note that ID is the unique number of the resource, and having an ID means that "doctor's schedule" is regarded as a resource:

HTTP/1.1 200 OK
[
  {
    "id": 1234,
    "start": "14:00",
    "end": "14:50",
    "doctor": "mjones"
  },
  {
    "id": 5678,
    "start": "16:00",
    "end": "16:50",
    "doctor": "mjones"
  }
]

I still think 14:00 suits me best, so I confirm the appointment and submit my basic information:

POST /schedules/1234 HTTP/1.1
{
  "name": "icyfenix",
  "age": 30,
  "reservation": 1234
}

The subsequent success or failure responses are the same at this level as before, so they are not repeated here. Compared with level 0, level 1 introduces resources and interacts with the service using resource IDs as the main thread. However, at least three problems remain unsolved at level 1. First, it only handles querying and booking; if I want to reschedule the appointment, or I recover unexpectedly and want to cancel it, each of these needs a new service interface. Second, when processing the response, the client can only branch on the code and message fields in the result. Every service has to design its own set of error codes, which is hard to do comprehensively and does not lend itself to generic handling. Third, security aspects such as authentication and authorization are not considered: for example, only logged-in users may query doctors' schedules, some doctors may be open only to VIPs, and only patients above a certain level may book them.

Level 2 Maturity: HTTP Verbs

All three problems left over from level 1 can be solved by introducing the uniform interface. The seven standard methods of HTTP were carefully designed; as long as the architect's abstraction skills are up to it, they can cover almost every operation a resource may need. REST solves the first problem by abstracting different business requirements into operations such as creating, modifying, and deleting resources. It solves the second by using HTTP status codes, which cover most of the exceptional situations that can arise when operating on resources and can be extended with custom codes. It solves the third by relying on the authentication and authorization information carried in HTTP headers; this is not shown in the example here, see the "credentials" material in the security architecture section.

According to this idea, the GET operation with query semantics should be used to obtain the doctor's schedule:

GET /doctors/mjones/schedule?date=2020-03-04&status=open HTTP/1.1

The server then sends back a response containing the requested information:

HTTP/1.1 200 OK
[
  {
    "id": 1234,
    "start": "14:00",
    "end": "14:50",
    "doctor": "mjones"
  },
  {
    "id": 5678,
    "start": "16:00",
    "end": "16:50",
    "doctor": "mjones"
  }
]

I still think 14:00 suits me best, so I confirm the appointment and submit my basic information to create a reservation, which matches the semantics of POST:

POST /schedules/1234 HTTP/1.1
{
  "name": "icyfenix",
  "age": 30,
  "reservation": 1234
}

If the reservation is successful, then I can receive a reservation success response:

HTTP/1.1 201 Created

Successful confirmation of appointment

If something goes wrong, like someone has booked ahead of me, then I get some kind of error message in the response:

HTTP/1.1 409 Conflict

doctor not available
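
On the server side, the level 2 exchange above might be handled roughly as follows. This is a sketch under assumptions: the `BOOKED` set and the `reserve` function are hypothetical, and only the two status codes shown in the text are modelled, using Python's standard `http.HTTPStatus` enum.

```python
from http import HTTPStatus

BOOKED = set()  # ids of schedule slots that already have a reservation


def reserve(schedule_id: int, patient: dict) -> tuple:
    """Handle POST /schedules/{id}: create a reservation for the slot."""
    if schedule_id in BOOKED:
        # The outcome is expressed by the status code, not a private "code" field.
        return HTTPStatus.CONFLICT, "doctor not available"
    BOOKED.add(schedule_id)
    return HTTPStatus.CREATED, "Successful confirmation of appointment"


status1, message1 = reserve(1234, {"name": "icyfenix", "age": 30})
status2, message2 = reserve(1234, {"name": "someone-else", "age": 41})
```

A generic client can branch on `status2` (409 Conflict) without knowing anything about this particular service's error codes.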

Level 3 Maturity: Hypermedia Controls

Level 2 is the level most systems reach today, but it is still not perfect. At least one problem remains: how do you know that you need to call the /schedules/1234 endpoint to book an appointment with Dr. mjones? You may not even see at first why this is a question; of course it is written in the program code! But REST does not subscribe to this idea, deeply ingrained in programmers' minds as it is. Hypermedia Controls in the RMM, HATEOAS in Fielding's dissertation, and the now more common term "hypertext-driven" all express the same hope: apart from the first request, which is driven by what you type into the browser's address bar, every other request should be driven by the hypertext itself, which clearly describes the state transitions that may follow. So, after you send the query:

GET /doctors/mjones/schedule?date=2020-03-04&status=open HTTP/1.1

The response information returned by the server should include possible follow-up operations such as how to make an appointment and how to know the doctor's information:

HTTP/1.1 200 OK
{
  "schedules": [
    {
      "id": 1234,
      "start": "14:00",
      "end": "14:50",
      "doctor": "mjones",
      "links": [
        {
          "rel": "confirm schedule",
          "href": "/schedules/1234"
        }
      ]
    },
    {
      "id": 5678,
      "start": "16:00",
      "end": "16:50",
      "doctor": "mjones",
      "links": [
        {
          "rel": "confirm schedule",
          "href": "/schedules/5678"
        }
      ]
    }
  ],
  "links": [
    {
      "rel": "doctor info",
      "href": "/doctors/mjones/info"
    }
  ]
}

If level 3 REST is achieved, the server-side API and the client are fully decoupled. Adjusting the number of services, or upgrading the API of an existing service, becomes very easy.
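
A hypertext-driven client looks up the next URL by link relation instead of hard-coding it. The Python sketch below works on an abridged, hand-written copy of a level 3 response; the `follow` helper and the relation names used as lookup keys are assumptions for the illustration.

```python
import json

# Abridged, hand-written copy of a level 3 response (one slot only).
RESPONSE = json.loads("""
{
  "schedules": [
    {
      "id": 1234,
      "start": "14:00",
      "links": [{"rel": "confirm schedule", "href": "/schedules/1234"}]
    }
  ],
  "links": [{"rel": "doctor info", "href": "/doctors/mjones/info"}]
}
""")


def follow(resource: dict, rel: str) -> str:
    """Return the href of the first link with the given relation name."""
    for link in resource.get("links", []):
        if link["rel"] == rel:
            return link["href"]
    raise LookupError(f"no link with rel={rel!r}")


# The client knows only relation names; the URLs come from the server.
slot = next(s for s in RESPONSE["schedules"] if s["start"] == "14:00")
confirm_url = follow(slot, "confirm schedule")
doctor_url = follow(RESPONSE, "doctor info")
```

If the server later moves the booking endpoint, only the `href` values change and this client keeps working.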

Shortcomings and Controversies

Resource-oriented programming is only suitable for CRUD; procedure-oriented and object-oriented programming are what can handle truly complex business logic

This is the most frequently raised objection. The four most basic HTTP methods, POST, GET, PUT, and DELETE, readily bring CRUD operations to mind, and people come to assume a direct correspondence. REST can of course cover far more than that. Still, there is nothing wrong with saying that POST, GET, PUT, and DELETE correspond to CRUD, as long as CRUD is understood in a generalized sense: these methods cover the main ways in which information flows between client and server. All network-based operation logic can be understood in terms of how information flows between the server and the client; some scenarios map intuitively, while others are more abstract.
For the more abstract scenarios, if it is genuinely hard to map the required operations on a resource to HTTP methods, REST is not a rigid dogma, and custom methods can be used. According to the REST API style recommended by Google, a custom method should be placed at the end of the resource path, appended as a colon followed by the custom verb. For example, I can map deletion to the standard DELETE method, and if I also provide an undelete API, it might be designed as:

POST /user/user_id/cart/book_id:undelete
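
A router that supports this convention has to separate the resource path from the trailing custom verb. The following is a minimal sketch of that split; `split_custom_method` is a hypothetical helper, and the concrete ids in the usage lines are made up.

```python
def split_custom_method(path: str) -> tuple:
    """Split '/a/b:verb' into ('/a/b', 'verb'); the verb is None if absent."""
    # Only a colon in the last path segment marks a custom method.
    last_slash = path.rfind("/")
    colon = path.find(":", last_slash + 1)
    if colon == -1:
        return path, None
    return path[:colon], path[colon + 1:]


resource, verb = split_custom_method("/user/42/cart/7:undelete")
plain_resource, no_verb = split_custom_method("/user/42/cart/7")
```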

If you would rather not use a custom method, you can design a recycle bin resource that holds the goods that can still be restored, and treat restoring and deleting as modifications of a state value on the resource, mapped to the PUT or PATCH method. That is also a perfectly workable design.
Resource-oriented programming differs from the other two mainstream programming ideas only in how it abstracts a problem. It is a matter of choice, not of one being better than another:

  • Why does procedure-oriented programming take algorithms and processing steps as the subject of abstraction, with data as input and results as output? To match the mainstream mode of interaction in the computer world.
  • Why does object-oriented programming unify data and behavior and encapsulate them into objects? To match the mainstream mode of interaction in the real world.
  • Why does resource-oriented programming take data (resources) as the subject of abstraction and treat behavior as a uniform interface? To match the mainstream mode of interaction in the networked world.

REST is fully bound to HTTP and is not suitable for scenarios requiring high-performance transmission

The author largely agrees with this view but does not consider it a defect of REST: that a hammer cannot be used as a wrench says nothing about the quality of the hammer. Resource-oriented programming is independent of any protocol, but REST (specifically the REST defined in Fielding's dissertation, not resource-oriented thinking in general) does depend on HTTP's standard methods, status codes, headers, and so on. HTTP is an application-layer protocol, not a transport-layer protocol, and it is inappropriate to treat HTTP merely as a transport (SOAP: I feel offended again). For scenarios that need direct control over transmission details, such as binary layout, encoding, message format, or connection handling, REST really is unsuitable. These scenarios mostly arise between the internal nodes of a service cluster. As mentioned earlier, the application scenarios of REST and RPC do overlap, but how much they overlap is a matter of opinion.

REST offers poor support for transactions

That depends first on what you mean by "transaction". If it means rigid ACID transactions in the narrow database sense, then unless no state is held at all, distributed systems themselves are at odds with this (CAP rules out having everything at once); that is a problem of distribution, not of REST. If it means a unified coordination capability (2PC/3PC) for committing multiple pieces of data together across distributed services by means of a service protocol or architecture, as provided by functional protocols such as WS-AtomicTransaction and WS-Coordination, then REST indeed does not support it. If you understand the cost of doing this and still decide to, Web Services are the better choice. If it only means ensuring the eventual consistency of data, then you have already given up rigid transactions. This is the normal mode of interaction in distributed systems; REST does not get in the way, so it cannot be called "unfavorable". Of course, REST does not help either; everything depends on your system's transaction design, which is discussed in detail in the transaction processing section.

REST has no transport reliability support

Yes and no. In HTTP, when you send a request you normally receive a response, such as HTTP/1.1 200 OK or HTTP/1.1 404 Not Found. But if you receive no response at all, you cannot tell whether the request never reached the server or the response was lost on the way back. The key difference is whether the server has already been triggered to do some processing. The simplest and crudest way to deal with transport reliability is to send the message again. This only works if the service is idempotent, that is, executing it repeatedly has the same effect as executing it once. The HTTP specification requires GET, PUT, and DELETE to be idempotent, and when we map REST services onto these methods we should preserve that idempotency. For POST, there have been dedicated proposals (such as POE, POST Once Exactly), but none has been adopted by the IETF. For repeated POST submissions, browsers show a warning, such as the "Confirm Form Resubmission" prompt in Chrome. The server should also check in advance, and if it finds a possible duplicate it returns HTTP/1.1 425 Too Early. In addition, Web Services have the WS-ReliableMessaging functional protocol to support reliable message delivery. Likewise, since REST uses no additional wire protocol, many other features that REST lacks, besides transactions and reliable transport, can be found in the WS-* protocols.
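
One common way to make a retried POST safe, different from the 425 pre-check just described, is to have the client attach a unique key to each logical request and have the server replay the stored response when the key repeats. The sketch below illustrates the idea only; the `ReservationService` class, its `post` method, and the key handling are assumptions and do not follow any particular standard.

```python
import uuid


class ReservationService:
    """Toy server that deduplicates POSTs by a client-supplied key."""

    def __init__(self) -> None:
        self._responses = {}  # idempotency key -> response already sent
        self.created = 0      # number of reservations actually created

    def post(self, key: str, body: dict) -> tuple:
        if key in self._responses:
            # Retry of a request we already processed: no second side effect.
            return self._responses[key]
        self.created += 1
        response = (201, {"reservation": self.created, **body})
        self._responses[key] = response
        return response


service = ReservationService()
key = str(uuid.uuid4())                         # one key per logical request
first = service.post(key, {"schedule": 1234})
retry = service.post(key, {"schedule": 1234})   # e.g. after a lost response
```

With this in place, resending after a timeout has the same effect as sending once, which is the idempotency property the paragraph above relies on.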

REST lacks "partial" and "batch" capabilities for resources

The author agrees with this view, and it is likely to be the direction in which resource-oriented thinking and API design styles develop. REST pioneered the resource-oriented service style, but it is certainly not perfect. Building on HTTP brings REST great convenience (no extra protocol, no need to solve a pile of basic networking problems again), but HTTP itself has also become an invisible cage that constrains REST. A concrete example shows the limitation. Suppose you only want a user's name. In the RPC style you can design a "getUsernameById" service that returns a string; its lack of generality hardly deserves to be called "design", but it works. In the REST style you request the whole user object from the server and then discard every attribute in the result except the username. This is "overfetching". REST's answer is to mitigate the problem through caching at intermediate nodes or on the client, but the root of the defect is that HTTP has no ability to describe the structure of a requested resource (it can only fetch unstructured partial content, through the Range header that is mostly used for resumable downloads today). Which parts of a resource to return, in which data types, and so on cannot be expressed at the protocol level; you can only design your own parameters on the GET endpoint to achieve it. The opposite defect is the lack of support for batch operations on resources, for which we sometimes have to design abstract resources.
For example, to add a "VIP" prefix to one user's name, you simply submit a PUT request that modifies the name. But to add it for 1000 users, if you really issue 1000 PUT requests, you will get back HTTP/1.1 429 Too Many Requests, and your boss will beat you up. Instead you have to create a task resource (such as "VIP-Modify-Task"), hand the IDs of the 1000 users to the task, and then drive the task into the executing state. As another example, when you buy something in an online store, the sequence of placing the order, freezing inventory, paying, adding points, and deducting inventory changes multiple resources. You may have to create an abstract "transaction" resource, or use one specific resource (such as "settlement") to tie the whole process together, carrying the transaction or settlement ID each time you operate on the other resources. Because HTTP is stateless, it is relatively ill-suited (though not unable) to handle such business scenarios.
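
The task-resource workaround can be sketched as follows. All names here (`USERS`, `TASKS`, `create_task`, `run_task`, the "VIP-" prefix string, and the state values) are assumptions chosen for the illustration; in a real API the two functions would sit behind something like POST /tasks and a state update on /tasks/{id}.

```python
import itertools

# Hypothetical data store: 1000 users keyed by id.
USERS = {i: f"user{i}" for i in range(1, 1001)}
TASKS = {}
_task_ids = itertools.count(1)


def create_task(user_ids, prefix: str) -> int:
    """Create a task resource in the 'pending' state (one request, many ids)."""
    task_id = next(_task_ids)
    TASKS[task_id] = {"user_ids": list(user_ids), "prefix": prefix, "state": "pending"}
    return task_id


def run_task(task_id: int) -> str:
    """Drive the task into execution: apply the change to every listed user."""
    task = TASKS[task_id]
    task["state"] = "running"
    for uid in task["user_ids"]:
        USERS[uid] = task["prefix"] + USERS[uid]
    task["state"] = "done"
    return task["state"]


task_id = create_task(range(1, 1001), "VIP-")  # instead of 1000 PUT requests
final_state = run_task(task_id)
```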
At present, a theoretically better solution to these kinds of problems is GraphQL, a data query language for resource APIs proposed and open-sourced by Facebook. Like SQL, it is called a "query language" but in fact covers CRUD as a whole. Compared with REST, which relies on HTTP and has no protocol of its own, GraphQL can be described as a more thoroughgoing resource-oriented service style that does come with a "protocol". Everything has two sides, however: by not leaning on HTTP, it faces the problem that almost every RPC framework runs into, namely how to get its interaction interface widely adopted.
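
Short of adopting GraphQL, the overfetching workaround mentioned earlier (designing your own parameters on the GET endpoint) might look like the sketch below. The `fields` query parameter, the `USER` record, and the `get_user` function are assumptions for this example; HTTP itself defines no such parameter.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical user resource.
USER = {"id": 1, "name": "icyfenix", "age": 30, "email": "icyfenix@example.com"}


def get_user(url: str) -> dict:
    """Handle GET /users/1?fields=a,b by returning only the requested fields."""
    query = parse_qs(urlsplit(url).query)
    if "fields" not in query:
        return dict(USER)  # no selection given: return the whole resource
    wanted = query["fields"][0].split(",")
    return {name: USER[name] for name in wanted if name in USER}


only_name = get_user("/users/1?fields=name")
whole_user = get_user("/users/1")
```

The selection logic lives entirely in the service, so every API has to reinvent and document it, which is the gap the paragraph above attributes to HTTP.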


Origin blog.csdn.net/zkkzpp258/article/details/131157241