Should I use RESTful API or all POST for back-end development?

This is a very controversial topic, so let me state my position first: I strongly recommend using a RESTful API. As for the reasons, Uncle Xiao has already made them very clear, and you can read his article below.

I am writing this article mainly because of a post on V2EX, which says:

"I am integrating an interface with a colleague, and every endpoint he defined is a POST request. His reason is that POST is more secure over HTTPS. I have always used RESTful APIs. If only POST requests are secure over HTTPS, why do we need GET, PUT and DELETE at all? How should I refute him?"

Many replies in that thread made arguments such as: 1) all-POST is great and exactly how it should be done, because it needs less communication; 2) just do everything in one shot, finish early and go home early; 3) so what if you win the argument? It's just a job, and you can't eat elegance. The comments were not entirely one-sided, but all-POST had plenty of support. I then joked on Twitter that using POST for everything is like a renovation worker coming to your house and saying, "I nail everything. Screws, bolts, buckles, latches... I don't use any of them. One nail gun for everything: convenient, fast, safe, and I get to go home early." Even so, some readers still insisted that all-POST is good and saves time. As it happens, Principle 5 of my article "My Principles for System Architecture" already argues against APIs that return 200 whether the call succeeded or failed, so I wrote this article specifically to set things straight.

This article is mainly divided into the following parts:

  1. Why use different HTTP verbs?
  2. RESTful complex queries
  3. Responses to several major questions
    • Is POST more secure?
    • Can using POST save time and reduce communication?
    • The right way to go home early
    • It’s just work, elegance cannot be eaten as food

Why use different HTTP verbs

Generally speaking, there are two types of logic in the programming world: "business logic" and "control logic".

  • Business logic. This is the code that implements business requirements and is tightly bound to what users need. For example: saving data submitted by a user, querying a user's data, completing an order transaction, issuing a refund, and so on.
  • Control logic. This is the non-functional code we use to control how the program runs. For example: the variables and conditions that drive a loop, the use of multi-threading or distributed techniques, the choice of HTTP/TCP, which database or middleware to use, and so on. None of this has anything to do with user needs.

The same is true for network protocols. Generally speaking, almost all mainstream network protocols have two parts, one is the protocol header and the other is the protocol body. The protocol header is the data used by the protocol itself, and the protocol body is the user's data. Therefore, the protocol header is mainly used for the control logic of the protocol, while the protocol body is the business logic.

The verb (or Method) of HTTP is in the protocol header, so it is mainly used for control logic.

The table below lists the HTTP verbs. Generally speaking, a REST API requires developers to follow this standard strictly (see RFC 7231, Section 4.2.2, Idempotent Methods).

Method  | Description                                                                 | Idempotent
GET     | Query operations; corresponds to a database select                          | ✔︎
PUT     | Full update of a resource; corresponds to a database update                 | ✔︎
DELETE  | Delete operations; corresponds to a database delete                         | ✔︎
POST    | Create operations; corresponds to a database insert                         | ✘
HEAD    | Returns the "metadata" of a resource, or checks whether the API is healthy  | ✔︎
PATCH   | Partial update of a resource; corresponds to a database update              | ✘
OPTIONS | Returns information about the API                                           | ✔︎

Both PUT and PATCH update a business resource, and either may create the resource if it does not exist. The difference is that PUT updates the complete information of a business object, much like submitting every field of a resource through a form, while PATCH is a more API-oriented update that only needs to carry the fields being changed (see RFC 5789).
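
To make the difference concrete, here is a minimal sketch. It assumes a resource is a plain dictionary and treats a PATCH body as a shallow merge, which is only one common way to interpret PATCH. The function names and fields are hypothetical.

```python
from typing import Any, Dict

# Hypothetical in-memory resource; the field names are illustrative only.
Resource = Dict[str, Any]


def apply_put(current: Resource, body: Resource) -> Resource:
    """PUT: the request body is the complete new representation."""
    return dict(body)


def apply_patch(current: Resource, body: Resource) -> Resource:
    """PATCH (as a shallow merge): only the fields in the body change."""
    updated = dict(current)
    updated.update(body)
    return updated


user = {"name": "Alice", "email": "alice@example.com", "city": "Beijing"}

# A PUT that leaves out a field drops it from the stored resource.
print(apply_put(user, {"name": "Alice", "city": "Shanghai"}))
# A PATCH with the same intent keeps the fields it does not mention.
print(apply_patch(user, {"city": "Shanghai"}))
```

With PUT the caller is responsible for sending every field, while with PATCH the caller only sends what changed.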

Of course, in the real world an API cannot always be mapped strictly onto database CRUD. Take a login API, /login. Should it be GET, POST, PUT or PATCH? To log in, the user enters a username and password, the server compares them with what is stored in the database (a select operation) and returns a session token, which then serves as the user's login state. According to the table above, a select suggests GET. Semantically, however, logging in is not a query. It updates the user's state or creates something new (a session), so POST is still the right choice, and /logout can use DELETE. The point is: do not map database CRUD onto these verbs mechanically. In many cases you still need to analyze the business semantics.

In addition, notice that the last column of the table states whether each verb is idempotent. API idempotence matters a great deal for control logic. Idempotence means that executing the API multiple times produces exactly the same result as executing it once, with no additional side effects.

  • POST adds new data, for example a new trading order. It is not idempotent.
  • DELETE deletes data. Deleting the same record multiple times has the same result as deleting it once, so it is idempotent.
  • PUT replaces all the data of a resource, so it is idempotent.
  • PATCH performs partial updates. For example, updating a field with cnt = cnt+1 is obviously not idempotent.
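
The four cases above can be sketched with a toy in-memory store. OrderStore and its method names are hypothetical and only illustrate what happens when each operation is repeated.

```python
class OrderStore:
    """Hypothetical in-memory store standing in for a database table."""

    def __init__(self) -> None:
        self.orders = {}
        self.cnt = 0
        self._next_id = 1

    def post(self, item: str) -> int:
        """POST: every call creates a new order, so a repeat changes the state."""
        order_id = self._next_id
        self._next_id += 1
        self.orders[order_id] = {"item": item}
        return order_id

    def put(self, order_id: int, item: str) -> None:
        """PUT: sets the full state, so a repeat leaves the same state."""
        self.orders[order_id] = {"item": item}

    def delete(self, order_id: int) -> None:
        """DELETE: removing an already removed order changes nothing."""
        self.orders.pop(order_id, None)

    def patch_increment(self) -> None:
        """PATCH written as cnt = cnt + 1: every repeat changes the state again."""
        self.cnt += 1


store = OrderStore()
store.post("book")
store.post("book")         # repeated POST: a second order now exists
store.put(1, "pen")
store.put(1, "pen")        # repeated PUT: order 1 is unchanged by the repeat
store.delete(2)
store.delete(2)            # repeated DELETE: order 2 is simply still gone
store.patch_increment()
store.patch_increment()    # repeated PATCH: the counter moved twice
print(store.orders, store.cnt)
```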

Idempotence is especially important for remote calls. Remote calls often time out because of network problems, and when a request times out we cannot know whether the server received and executed it. At that point we cannot blindly retry, because retrying a non-idempotent call can be disastrous. Take a money transfer: transferring once and transferring several times have different results, and a retry may transfer the money one extra time. If your API follows the HTTP verb specification, you know when writing the client which verbs can be retried and which cannot. If you express every API as POST, this is completely out of your control.
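
A client-side retry rule based on the verb might look like the sketch below. The helper call_with_retry and the FlakySender test double are hypothetical. The set of idempotent methods follows RFC 7231, and the timeout is simulated rather than coming from a real network call.

```python
import time
from typing import Callable

# Methods that RFC 7231 defines as idempotent, so a retry after a timeout is safe.
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"}


def call_with_retry(method: str, send: Callable[[], str],
                    max_attempts: int = 3, backoff_seconds: float = 0.0) -> str:
    """Call send(); retry on TimeoutError only when the method is idempotent."""
    attempts = max(1, max_attempts) if method.upper() in IDEMPOTENT_METHODS else 1
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts, or the method must not be retried
            time.sleep(backoff_seconds * attempt)
    raise RuntimeError("unreachable")


class FlakySender:
    """Hypothetical stand-in for a network call that times out a few times."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    def __call__(self) -> str:
        self.calls += 1
        if self.calls <= self.failures:
            raise TimeoutError("simulated timeout")
        return "ok"


get_sender = FlakySender(failures=2)
print(call_with_retry("GET", get_sender), get_sender.calls)

post_sender = FlakySender(failures=2)
try:
    call_with_retry("POST", post_sender)
except TimeoutError:
    print("POST timed out and was not retried; calls:", post_sender.calls)
```

With an all-POST API the client has no such rule to apply, so every timeout needs a case-by-case decision.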

Besides idempotence, you may have other control-logic requirements:

  • Caching. Cache API responses at the CDN or gateway. Obviously, caching is recommended for GET query operations.
  • Flow control. HTTP verbs allow finer-grained flow control, such as rate limits that differ between read and write operations.
  • Routing. For example, write requests are routed to the write service and read requests to the read service.
  • Permissions. Finer-grained permission control and auditing become possible.
  • Monitoring. Because different methods have different performance characteristics, they can be analyzed separately.
  • Stress testing. If verbs are not distinguished, stress testing your API will be hard.
  • ... and so on.
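
As a rough sketch of how a gateway could derive several of these decisions from the verb alone, consider the following. The Policy fields, the upstream names and the rate limits are all made-up values for illustration and do not come from any particular gateway.

```python
from dataclasses import dataclass

# Verbs treated as reads in this sketch.
READ_METHODS = {"GET", "HEAD", "OPTIONS"}


@dataclass(frozen=True)
class Policy:
    upstream: str            # which backend pool receives the request
    cacheable: bool          # whether a CDN or gateway may cache the response
    rate_limit_per_min: int  # request budget per client per minute


def policy_for(method: str) -> Policy:
    """Choose control-logic settings from the verb, without parsing the body."""
    verb = method.upper()
    if verb in READ_METHODS:
        return Policy(upstream="read-service", cacheable=(verb == "GET"),
                      rate_limit_per_min=600)
    return Policy(upstream="write-service", cacheable=False,
                  rate_limit_per_min=60)


for verb in ("GET", "POST", "DELETE"):
    print(verb, policy_for(verb))
```

If every endpoint is POST, the same decisions require inspecting the path or the body of each request.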

You may say: my business is too simple to need all this. Fine, but I think that at the very least you need "read/write separation", that is, at least two verbs: GET for read operations and POST for write operations.

RESTful complex queries

Generally speaking, a query API has four main operations to support: sorting, filtering, searching and paging. Below are some related conventions, drawn from what I consider the two best-written RESTful API specifications: the Microsoft REST API Guidelines and the Paypal API Design Guidelines.

  • Sorting. To sort a result set, use the sort keyword with the syntax {field_name}|{asc|desc},{field_name}|{asc|desc}. For example, an API that returns a list of companies sorted by certain fields: GET /admin/companies?sort=rank|asc or GET /admin/companies?sort=rank|asc,zip_code|desc
  • Filtering. To filter a result set, use the filter keyword with the syntax {field_name} op {value}. For example: GET /companies?category=banking&location=china. Sometimes we need more flexible expressions built into the URL. For that you need to define six comparison operations (=, !=, <, >, <=, >=) and three logical operations (and, or, not). Special characters in the expression need to be escaped, for example >= becomes ge. This gives query expressions such as GET /products?$filter=name eq 'Milk' and price lt 2.55, which finds all milk priced below 2.55.
  • Searching. For searches, use the search keyword together with the search terms. For example: GET /books/search?description=algorithm, or a direct full-text search: GET /books/search?key=algorithm.
  • Paging. Paging of result sets must be the default behavior, so that a single request never returns a huge amount of data.
    • Use page and per_page for the page number and the number of items per page, for example: GET /books?page=3&per_page=20.
    • Optional. The page approach above fetches data by relative position, which can cause two problems: performance (with large data volumes) and data drift (with high-frequency updates). In that case, fetch by absolute position instead: record the ID, timestamp or similar attribute of the last item already fetched, then request "data before this ID" or "data before this time". For example: GET /news?max_id=23454345&per_page=20 or GET /news?published_before=2011-01-01T00:00:00Z&per_page=20.
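
The sorting and paging conventions above can be handled on the server with a few small helpers. The sketch below uses hypothetical function names. Defaulting a missing direction to asc and capping per_page at 100 are assumptions of this sketch and are not taken from the guidelines.

```python
from typing import List, Tuple
from urllib.parse import parse_qs, urlencode, urlsplit


def parse_sort(spec: str) -> List[Tuple[str, str]]:
    """Parse 'rank|asc,zip_code|desc' into (field, direction) pairs."""
    result = []
    for part in spec.split(","):
        field, _, direction = part.partition("|")
        direction = direction or "asc"  # assumption: direction defaults to asc
        if not field or direction not in ("asc", "desc"):
            raise ValueError(f"bad sort term: {part!r}")
        result.append((field, direction))
    return result


def parse_paging(url: str, default_per_page: int = 20,
                 max_per_page: int = 100) -> Tuple[int, int]:
    """Read page and per_page from a URL; defaults keep paging always on."""
    query = parse_qs(urlsplit(url).query)
    page = max(1, int(query.get("page", ["1"])[0]))
    per_page = int(query.get("per_page", [str(default_per_page)])[0])
    return page, min(max_per_page, max(1, per_page))


def next_keyset_url(path: str, last_id: int, per_page: int) -> str:
    """Build the next 'absolute position' request from the last ID seen."""
    return f"{path}?{urlencode({'max_id': last_id, 'per_page': per_page})}"


print(parse_sort("rank|asc,zip_code|desc"))
print(parse_paging("/books?page=3&per_page=20"))
print(parse_paging("/books"))
print(next_keyset_url("/news", 23454345, 20))
```

Applying a default and an upper bound in parse_paging is what makes paging a default behavior rather than something the caller has to ask for.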

Note: in theory GET can carry a body, but many libraries and middleware do not support GET with a body, so sometimes you can only pass the parameters with POST. The principles here are:

  1. For simple queries, many parameters are already designed into the RESTful API path, and filter/sort/pagination do not add much complexity, so GET should be used.
  2. For complex queries, the parameters can be very complex, such as the DSL used by index/_search in ElasticSearch. You should still use GET rather than POST, unless your environment objectively does not support it. The official ElasticSearch documentation says the same:

"The authors of Elasticsearch prefer using GET for a search request because they feel that it describes the action (retrieving information) better than the POST verb. However, because GET with a request body is not universally supported, the search API also accepts POST requests." (In other words, if your library or server does not support GET with a request body, you can use POST; both are supported.)

Chen Hao's note: after ElasticSearch 7.11, however, GET no longer supports a body, so ElasticSearch's implementation no longer matches its stated design.

In addition, for more complex operations it is recommended to call several APIs separately. Although this increases the number of network requests, it reduces coupling between back-end programs and data, making it easier to evolve toward a microservice architecture.

Finally, if you want a GraphQL-like query language in REST, consider a solution such as OData. OData stands for Open Data Protocol and was originally developed by Microsoft in 2007. It is an open protocol that lets you create and consume queryable, interoperable RESTful APIs in a simple, standard way.

Responses to several major questions

Below are direct responses to several questions. If there are more questions you would like me to answer, leave a comment and I will add them, along with my responses, below.

1) Why does the API need to be Restful and comply with specifications?

A RESTful API can be regarded as the norm and standard for HTTP APIs. You could call it a best practice. In short, it is the world's consensus on HTTP APIs. By building on this consensus you get many technical dividends for free, such as CDNs, API gateways, service governance, monitoring, and so on. These greatly reduce R&D cost and help you avoid pitfalls.

2) Why doesn't "premature optimization" apply to API design?

Because an API is a contract: once it is in use, it is hard to change. Even if you release a new version of the API, you still have to push every caller to upgrade. Interface design is therefore like database schema design: once settled, it is much harder to change later, so it is worth designing well up front. The documents mentioned above (Microsoft REST API Guidelines, Paypal API Design Guidelines) and the Google API Design Guide are all good guides for designing your API well.

3) Is POST more secure?

No.

Many people think that because GET puts request data in the URL and POST does not, POST is more secure. That is not the case. The URL path of a request is carried in the HTTP protocol header, and as long as you use HTTPS the whole request is encrypted in transit. Admittedly, gateways such as nginx may log the URL, and the URL may end up in the browser history, which is why some people call GET requests unsafe. But POST is not much better: CSRF, the most common security problem, is aimed squarely at POST. Security is a complex matter, and no method or verb makes you safer by itself.

In addition:

  • If you want to keep sensitive information in a GET request from being exposed, encrypt it. The same goes for POST.
  • If you want to prevent a GET request from being modified by a man-in-the-middle, sign the URL. (Generally speaking, we all remember to sign GET and forget to sign POST.)
  • If you want to stop someone from sending malicious links to attack your users (the proverbial "GET is not as safe as POST"), use an authentication technique such as HMAC (see HTTP API authentication and authorization technology).
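
URL signing with HMAC can be sketched with the Python standard library as follows. The secret, the sig parameter name and the canonical message layout (method, path, sorted query) are assumptions of this sketch and not a standard scheme. A production scheme would usually also sign a timestamp or nonce to prevent replay.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical shared secret; in practice it would come from secure configuration.
SECRET = b"demo-secret-do-not-use"


def _digest(method: str, path: str, params: list, secret: bytes) -> str:
    """HMAC-SHA256 over the method, the path and the sorted query parameters."""
    message = f"{method.upper()}\n{path}\n{urlencode(sorted(params))}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(method: str, url: str, secret: bytes = SECRET) -> str:
    """Return the path and sorted query with a 'sig' parameter appended."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    sig = _digest(method, parts.path, params, secret)
    return f"{parts.path}?{urlencode(params + [('sig', sig)])}"


def verify_url(method: str, signed_url: str, secret: bytes = SECRET) -> bool:
    """Recompute the signature and compare it in constant time."""
    parts = urlsplit(signed_url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    supplied = [value for key, value in params if key == "sig"]
    unsigned = [(key, value) for key, value in params if key != "sig"]
    if len(supplied) != 1:
        return False
    expected = _digest(method, parts.path, unsigned, secret)
    return hmac.compare_digest(expected, supplied[0])


signed = sign_url("GET", "/orders?user=42&page=1")
print(signed)
print(verify_url("GET", signed))                                  # untouched URL
print(verify_url("GET", signed.replace("user=42", "user=43")))    # tampered query
```

Because the method is part of the signed message, the same approach covers POST, PUT and DELETE requests as well as GET.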

In short, understand that GET and POST have the same security problems. Do not assume one is more secure than the other and then let your guard down. Security must always be taken seriously.

4) Can using POST save time and reduce communication?

Not only does it not, it makes things worse.

I feel that people who say this have not really thought about it.

  • First, assigning different verbs to API endpoints takes almost no time. Putting each CRUD operation in its own function is also good programming style. Besides, almost every development framework now supports very fast CRUD development. With Spring Boot, for example, writing CRUD against a database requires essentially no hand-written SQL query code, which is very convenient.
  • Second, following a standard lowers the learning cost for new team members and greatly reduces cross-team communication cost. Norms and standards actually save the team time and improve overall efficiency. They are the foundation of collaboration, which is why the world has so many standards: as long as you follow them, the parts you produce fit other manufacturers' products without the two sides needing to talk to each other.
  • Third, an all-POST interface is nonstandard. People who use your homegrown API will have to keep asking you questions, which actually increases communication. Moreover, you may ship business features quickly, but you will have to rework them when it is time to add control logic. In the long run you take on technical debt, which leads to higher cost.

5) The right way to go home early

Do not assume that going home early means you are done. If your code has problems like these, if other people cannot understand it, or if misuse of your code causes failures, what is the point of going home early? You will still be interrupted, or even called back to the office to deal with the problem. What you should aim for is "going home early in the long term", not "going home early in the short term". Going home early in the long term generally looks like this:

  • Organize and design your code well so that it is more extensible. Then, when new requirements arrive, you change little or no code, and that is how you get to go home early. Otherwise you have to rewrite things every time a requirement comes in. How could you go home early then?
  • Keep your code quality high, with good documentation and comments. Then others will not keep coming to you with questions or ask you to handle problems after hours. Anyone can even take over your code easily, so you can truly be left undisturbed.

6) It’s just work, elegance cannot be eaten as food

Two points in response:

First, this is just following a standard. Calling something "normal" "elegant" shows how low the bar is. With a bar that low, all you can do is "scrape by for food".

Second, as a "professional programmer" you must learn to love and respect your profession. The most important part of loving your profession is not letting outsiders look down on it. If you do not respect the profession yourself, how can you expect others to? Respecting your profession is not only about earning enviable rewards, but also about making the profession more valuable.

I hope everyone can respect their profession and become a real professional programmer, not a code farmer!

Your work gives you power, but only your actions will give you respect

(Full text ends)

Origin blog.csdn.net/yilovexing/article/details/132587175