One article to understand the HTTP protocol


Foreword

Before reading on, you may want to look at my previous article, which is also about protocols.

When we open a web page, we usually notice the same prefix, http:// (or https://), at the front of the URL. It means that this visit uses the HTTP protocol for communication.

So here comes the question: Why do we use this protocol when communicating?

Put simply, a protocol is a norm, a standard that everyone abides by. With a shared specification, the two communicating parties can structure their information so that every piece goes where it belongs, in the spirit of "render unto Caesar the things that are Caesar's, and unto God the things that are God's". This greatly reduces the cost of transmitting information.

What would happen if we did not follow any protocol when transmitting information over the Internet? The most direct consequence is that the two sides would talk past each other, like a chicken talking to a duck.

Concept

HTTP, short for "HyperText Transfer Protocol", is the foundation of the World Wide Web as we know it today, and it is also the protocol we encounter most often when we use the Internet.

Protocol

In everyday life we see "agreements" everywhere too. For example, when renting a house we sign a "rental agreement".

An agreement in everyday life is essentially the same thing as a protocol in computing. A protocol has two characteristics:

  • The character "协" (the first half of 协议, the Chinese word for "protocol") means that there must be two or more participants. A rental agreement has two parties: you and the landlord.
  • The character "议" (the second half) means a set of behavioral conventions and norms for the participants. A rental agreement stipulates the lease term, the monthly rent, and how a breach of contract is handled.

We can understand the HTTP protocol in the same way.

HTTP is a protocol used in the computer world. It uses a language that computers can understand to establish a specification for communication between computers (two or more participants), together with the related control and error-handling methods (behavioral conventions and norms).

Transmission

"Transfer" is easy to understand: it means moving a bunch of things from point A to point B, or from point B to point A.

Do not underestimate this simple action; it carries at least two important pieces of information.

The HTTP protocol is a two-way protocol.

When we browse the web, the browser is the requester A and a website such as Baidu is the responder B. The two parties agree to communicate over HTTP: the browser sends request data to the website, the website returns data to the browser, and the browser finally renders it on the screen so that you can see pictures and videos.

Although the data travels between A and B, it is allowed to pass through relays along the way.

It is like a student in the first row passing a note to a student in the last row: the note has to go through many other students (intermediaries) on the way. The transfer changes from "A <-> B" to "A <-> N <-> M <-> B".

In HTTP, however, every intermediary must also follow the HTTP protocol. It may add extra functionality of its own as long as it does not disturb the basic data transfer.
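As a rough illustration of an intermediary that adds something extra without disturbing the payload, the sketch below models a relay that forwards a message unchanged except for appending itself to the Via header, the standard HTTP header in which intermediaries record themselves. The function name `relay`, the dictionary representation of a message, and the host `example.com` are assumptions made for this example, not part of any library.

```python
def relay(message: dict, node_name: str) -> dict:
    """Forward a message through one intermediary.

    The body and the existing headers are left untouched; the only change is
    that the intermediary appends itself to the Via header.
    """
    headers = dict(message["headers"])  # copy so the caller's message is not mutated
    hop = f"1.1 {node_name}"            # "1.1" is the HTTP version used on this hop
    headers["Via"] = f"{headers['Via']}, {hop}" if "Via" in headers else hop
    return {"headers": headers, "body": message["body"]}


# A <-> N <-> M <-> B: the note passes through two hypothetical intermediaries.
original = {"headers": {"Host": "example.com"}, "body": b"hello"}
delivered = relay(relay(original, "N"), "M")

print(delivered["headers"]["Via"])            # 1.1 N, 1.1 M
print(delivered["body"] == original["body"])  # True
```

The body arrives byte for byte as it was sent, while the Via header tells B which nodes the message passed through.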

With transfer in mind, we can refine our understanding of HTTP.

HTTP is a convention and specification for transferring data between two points in the computer world.

Hypertext

  • Text: In the early days of the Internet, this meant simple character text only. Today the meaning of "text" has expanded to include pictures, videos, compressed files and so on, all of which count as "text" in the eyes of HTTP.

  • Hypertext: Text that goes beyond ordinary text. It is a mixture of text, pictures, videos and so on. Its most important feature is the hyperlink, which lets the reader jump from one hypertext to another.

HTML is the most common hypertext. It is itself just a plain text file, but inside it uses many tags to define links to pictures, videos and other resources. After the browser interprets it, what we see is a web page with text and pictures.
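To make the idea of "plain text with tags that define links" concrete, here is a minimal sketch that uses Python's standard `html.parser` module to pull the hyperlink and image targets out of a small HTML string. The page content and the class name `LinkCollector` are invented for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the targets of hyperlinks (<a href>) and images (<img src>)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "img" and attributes.get("src"):
            self.images.append(attributes["src"])


# A made-up page: plain text as stored, hypertext once the tags are interpreted.
PAGE = """
<html><body>
  <p>Read the <a href="/next-article.html">next article</a>.</p>
  <img src="/images/diagram.png" alt="diagram">
</body></html>
"""

collector = LinkCollector()
collector.feed(PAGE)
print(collector.links)   # ['/next-article.html']
print(collector.images)  # ['/images/diagram.png']
```

A browser does much more than this, but the first step is the same: read the text, recognize the tags, and follow the references they contain.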

HTTP is a "convention and specification" for "transmitting" text, pictures, audio, video and other "hypertext" data between "two points" in the computer world.

Format of the HTTP protocol

HTTP messages fall into two categories: requests and responses.

HTTP request

An HTTP request message is divided into four parts: start line, message headers, blank line, and message body, as shown in the figure.

Figure: structure of an HTTP request message

The start line contains three pieces of information: method, URI, and HTTP protocol version.

Method: the operation this request wants to perform, sometimes also called the "HTTP verb". The two most common methods are GET and POST: GET means the client wants to obtain a resource from the server, and POST means the client wants to submit some data, such as form data, to the server.

URI: usually an absolute path, optionally followed by a question mark "?" and a query string; when the request goes through a proxy, it is a complete URL.

HTTP protocol version: as the name says, it tells the other party which version of the HTTP protocol is being used, so as to avoid confusion.

The most common form of a start line looks like this:

GET /user/3940246036953293 HTTP/1.1

When using a proxy it becomes (the URL is fictitious):

GET https://juejin.cn/user/3940246036953293 HTTP/1.1
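The two start lines above can be taken apart with a few lines of Python. This is only a sketch: the function name `parse_start_line` is made up for this article, the third sample line (`/search?q=http&page=2`) is a hypothetical target added to show a query string, and the URL handling relies on the standard `urllib.parse` module.

```python
from urllib.parse import parse_qs, urlsplit


def parse_start_line(line: str) -> dict:
    """Split a request start line into method, target and version."""
    method, target, version = line.split(" ")
    parts = urlsplit(target)
    return {
        "method": method,
        "version": version,
        "host": parts.netloc,            # empty unless the target is a complete URL
        "path": parts.path,
        "query": parse_qs(parts.query),  # {} when there is no query string
    }


direct = parse_start_line("GET /user/3940246036953293 HTTP/1.1")
via_proxy = parse_start_line("GET https://juejin.cn/user/3940246036953293 HTTP/1.1")
with_query = parse_start_line("GET /search?q=http&page=2 HTTP/1.1")

print(direct["path"], repr(direct["host"]))  # /user/3940246036953293 ''
print(via_proxy["host"])                     # juejin.cn
print(with_query["query"])                   # {'q': ['http'], 'page': ['2']}
```

The only difference between the direct form and the proxy form is whether the target carries the host; the method and the version are parsed the same way in both cases.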

Message headers: descriptive information about the message, in the format <field>:<value>. Headers fall into four categories: general headers, request headers, response headers (used in response messages) and entity headers.

Blank line: marks the end of the message headers and the beginning of the message body. Nothing more needs to be said about it.

Message body: the actual content that the HTTP message carries. Somewhat awkwardly, some methods (such as GET) need nothing beyond the start line and the message headers, so this part is not necessarily the longest; it may even be empty.
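The four parts can be assembled and taken apart by hand to see how they fit together. The sketch below is a simplification under stated assumptions: the helper names `build_request` and `split_request`, the host `example.com`, the path `/login` and the form fields are all invented for illustration. It does rely on one fact of HTTP/1.1 that the text above does not spell out, namely that each line ends with CRLF (`\r\n`), so the blank line shows up as two CRLFs in a row.

```python
CRLF = "\r\n"


def build_request(method, target, headers, body=b""):
    """Assemble the four parts: start line, headers, blank line, body."""
    start_line = f"{method} {target} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # The final extra CRLF is the blank line that ends the headers.
    head = CRLF.join([start_line, *header_lines]) + CRLF + CRLF
    return head.encode("ascii") + body


def split_request(raw):
    """Split raw bytes back into (start_line, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the blank line separates head from body
    start_line, *header_lines = head.decode("ascii").split(CRLF)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return start_line, headers, body


form = b"name=alice&city=paris"
post = build_request(
    "POST", "/login",
    {"Host": "example.com",
     "Content-Type": "application/x-www-form-urlencoded",
     "Content-Length": str(len(form))},
    form,
)
get = build_request("GET", "/user/3940246036953293", {"Host": "example.com"})

print(post.decode("ascii"))
print(split_request(get))
```

The GET request ends right after the blank line, which is the empty-body case described above, while the POST request carries its form data after it.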

HTTP response

Similar to the request message, the HTTP response message is also divided into four parts: status line, message headers, blank line, and message body, as shown in the figure:

Figure: structure of an HTTP response message

The last three parts are essentially the same as in the HTTP request message, so we focus on the status line.

The status line also consists of three parts: HTTP protocol version, status code, and status text. The HTTP protocol version needs no further explanation.

Status code: something we are actually very familiar with. The most typical case is that when the URL we visit does not exist, we get a 404 status code. The status code is a number that indicates whether the request succeeded. Besides 404, typical status codes include 200 (request successful), 301 (resource moved permanently) and 302 (resource moved temporarily).

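By their first digit, status codes fall roughly into five classes:

  • 1xx: informational; the request was received and processing continues.
  • 2xx: success; the request was received, understood and accepted.
  • 3xx: redirection; further action is needed to complete the request.
  • 4xx: client error; the request is malformed or cannot be fulfilled.
  • 5xx: server error; the server failed to fulfil an apparently valid request.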
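The grouping by first digit can be expressed in a few lines of code. In this sketch the function name `status_class` and the short class labels are choices made for this example; the five classes themselves (1xx to 5xx) are the standard ones.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the class of a status code based on its first digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")


for code in (200, 301, 302, 404, 500):
    print(code, status_class(code))
```

Integer division by 100 isolates the first digit of a three-digit code, so 301 and 302 both land in the redirection class and 404 in the client error class.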

Status text: a short plain-text message that describes what the status code means, for the convenience of human readers.

So a typical status line might look like this:

HTTP/1.1 404 Not Found
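A status line like this one splits cleanly into its three parts. The helper `parse_status_line` below is a hypothetical name used only for this sketch; `http.HTTPStatus` is part of Python's standard library and holds the conventional status text for each registered code.

```python
from http import HTTPStatus


def parse_status_line(line: str):
    """Split a status line into (version, status code, status text)."""
    # Split at most twice: the status text may itself contain spaces.
    version, code, text = line.split(" ", 2)
    return version, int(code), text


version, code, text = parse_status_line("HTTP/1.1 404 Not Found")
print(version, code, text)       # HTTP/1.1 404 Not Found

# The standard library knows the conventional text for this code.
print(HTTPStatus(code).phrase)   # Not Found
```

Programs normally act on the numeric code only; the status text is there for people reading the raw message.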

When the client uses the GET method to request a web page from the server and the request succeeds, the message body of the server's HTTP response contains the HTML text of the page.
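The whole round trip can be tried on a single machine with Python's standard library. The sketch below is only an illustration, not how a production server is written: the page content and the handler name `PageHandler` are invented, and the server listens on the loopback address on whatever free port the operating system hands out.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Hello, HTTP</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Answer every GET with the same small HTML page."""

    def do_GET(self):
        self.send_response(200)                                    # status line
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))         # message headers
        self.end_headers()                                         # blank line
        self.wfile.write(PAGE)                                     # message body

    def log_message(self, format, *args):
        pass  # keep the example output quiet


# Port 0 asks the operating system for any free port on this machine.
server = HTTPServer(("127.0.0.1", 0), PageHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
connection.request("GET", "/")
response = connection.getresponse()
status, reason, body = response.status, response.reason, response.read()
connection.close()
server.shutdown()
server.server_close()

print(status, reason)        # 200 OK
print(body.decode("utf-8"))  # the HTML text of the page
```

The client sees exactly the pieces described above: a status code and status text from the status line, and the HTML of the page in the message body.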

Summary

This article only briefly introduces the HTTP protocol and gives a first look at some related concepts. Later articles will explore HTTP in more depth.

Origin blog.csdn.net/jiang_wang01/article/details/131394868