Foreword
Before reading on, you may want to read my previous article, which also covers this protocol.
When we open a web page, we usually see the same prefix, http:// (or https://), at the front of the URL. It means that this visit uses the HTTP protocol for communication.
So here comes the question: Why do we use this protocol when communicating?
Put simply, a protocol is a norm, a standard that everyone abides by. With a shared specification, the two communicating parties can structure information so that every piece has its designated place; as the saying goes, "render unto Caesar the things that are Caesar's, and unto God the things that are God's". This greatly reduces the cost of transmitting information.
What would happen if we followed no protocol when transmitting information over the Internet? The most direct consequence is that the two sides would talk past each other, like a chicken talking to a duck.
Concept
HTTP, short for "HyperText Transfer Protocol", is the foundation of the World Wide Web as we know it today, and it is the protocol we encounter most often when using the Internet.
Protocol
In daily life we also see "agreements" everywhere. For example, when renting a house, we sign a "rental agreement".
An agreement in daily life is essentially the same thing as a protocol in computing (in Chinese, both are the same word, 协议). Its characteristics are:
- The character "协" ("joint") means that there must be two or more participants. A rental agreement has two parties: you and the landlord.
- The character "议" ("to agree on") means a set of behavioral conventions and norms for the participants. The rental agreement stipulates the lease term, the monthly rent, and how a breach of contract is handled.
We can understand the HTTP protocol in the same way.
HTTP is a protocol used in the computer world. In a language that computers can understand, it establishes a specification for communication between computers (two or more participants), together with the related control and error-handling methods (behavioral conventions and norms).
Transmission
The so-called "transfer" is easy to understand: it means moving a bunch of things from point A to point B, or from point B to point A.
Don't underestimate this simple action; it carries at least two important pieces of information.
First, the HTTP protocol is a two-way protocol.
When we surf the Internet, the browser is the requester A, and a website such as Baidu is the responder B. The two parties agree to communicate over HTTP: the browser sends request data to the website, the website returns data to the browser, and the browser renders it on the screen so that you can see pictures and videos.
Second, although the data is transmitted between A and B, transit or relay points are allowed in between.
It is like a student in the first row passing a note to a student in the last row: the note has to go through many students (intermediaries) on the way. The transmission changes from "A <-> B" to "A <-> N <-> M <-> B".
In HTTP, each intermediary must also comply with the HTTP protocol, and it may add extra functions of its own as long as they do not disturb the basic data transmission.
With transmission in mind, we can refine our understanding of HTTP.
HTTP is a convention and specification for transferring data between two points in the computer world .
Hypertext
- Text: In the early days of the Internet, it was just simple character text, but now the meaning of "text" has expanded to include pictures, videos, compressed files, and so on, which all count as "text" in the eyes of HTTP.
- Hypertext: Text that goes beyond ordinary text. It is a mixture of text, pictures, videos, and so on. Its most important feature is the hyperlink, which lets the reader jump from one hypertext to another.
HTML is the most common hypertext. An HTML document is itself just a plain text file, but it uses many tags to define links to pictures, videos, and other resources. Once the browser interprets it, what we see is a web page with text and pictures.
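To make the "plain text with tags" idea concrete, here is a minimal sketch that reads a small HTML string and collects the hyperlinks and media references it defines. It uses the standard library's `html.parser`; the sample page, the `LinkCollector` class, and all paths in it are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect hyperlink targets and embedded media sources from HTML text."""

    def __init__(self):
        super().__init__()
        self.links = []  # href values found on <a> tags
        self.media = []  # src values found on <img> and <video> tags

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "href" in attributes:
            self.links.append(attributes["href"])
        elif tag in ("img", "video") and "src" in attributes:
            self.media.append(attributes["src"])


# A hypothetical page: just plain text until something interprets the tags.
SAMPLE_HTML = """
<html>
  <body>
    <p>Hello, <a href="/next-page.html">follow this link</a>.</p>
    <img src="/images/cat.png">
    <video src="/videos/demo.mp4"></video>
  </body>
</html>
"""

collector = LinkCollector()
collector.feed(SAMPLE_HTML)
print(collector.links)  # ['/next-page.html']
print(collector.media)  # ['/images/cat.png', '/videos/demo.mp4']
```

A browser does far more than this, of course, but the principle is the same: the tags in the text tell it which other resources to fetch and where a click should lead.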
HTTP is a "convention and specification" for "transmitting" text, pictures, audio, video and other "hypertext" data between "two points" in the computer world.
Format of the HTTP protocol
We divide HTTP messages into two categories: requests and responses.
HTTP request
An HTTP request message generally consists of four parts: the start line, the message headers, a blank line, and the message body, as shown in the figure.
The start line contains three pieces of information: the method, the URI, and the HTTP protocol version.
Method: The operation that this request asks for, sometimes also called the "HTTP verb". The two most common methods are GET and POST: GET means that the client wants to obtain a resource from the server, and POST means that the client wants to submit data, such as form data, to the server.
URI: Usually an absolute path, optionally followed by a question mark "?" and a query string; when a proxy is used, it is a complete URL.
HTTP protocol version: As the name says, it tells the other party which version of the HTTP protocol is being used, so as to avoid confusion.
The most common form of a start line looks like this:
GET /user/3940246036953293 HTTP/1.1
When using a proxy it becomes (the URL is fictitious):
GET https://juejin.cn/user/3940246036953293 HTTP/1.1
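As a rough sketch of how the three parts of a start line can be pulled apart, the snippet below splits the line on spaces and then breaks the request target into host, path, and query parameters with `urllib.parse`. The function names and the `?tab=posts` query string are hypothetical, added only for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def parse_request_line(line):
    """Split a request start line into (method, target, version)."""
    method, target, version = line.strip().split(" ")
    return method, target, version


def describe_target(target):
    """Break a request target into host (if present), path, and query parameters."""
    parts = urlsplit(target)
    return {
        "host": parts.netloc or None,   # only present for a complete URL
        "path": parts.path,
        "query": parse_qs(parts.query),  # {} when there is no query string
    }


method, target, version = parse_request_line("GET /user/3940246036953293 HTTP/1.1")
print(method, target, version)

# An absolute path with a hypothetical query string appended.
print(describe_target("/user/3940246036953293?tab=posts"))

# The complete-URL form used when talking to a proxy.
print(describe_target("https://juejin.cn/user/3940246036953293"))
```

Splitting on single spaces works for well-formed start lines like the ones above; a production parser would also have to reject malformed input.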
Message header: Contains descriptive information about the message, in the format <field>:<value>. The message headers fall into four categories: general headers, request headers, response headers (for response messages), and entity headers.
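The <field>:<value> format is simple enough to parse by hand. The sketch below is one possible way to do it, assuming each header sits on its own line; the `parse_headers` helper and the sample header values are illustrative only. Field names are lowercased because HTTP treats them as case-insensitive.

```python
def parse_headers(lines):
    """Turn '<field>:<value>' lines into a dict keyed by lowercase field name."""
    headers = {}
    for line in lines:
        # Split on the first colon only, since values may contain colons too.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


sample_lines = [
    "Host: localhost:8080",
    "Accept: text/html",
    "Content-Length: 0",
]
headers = parse_headers(sample_lines)
print(headers["host"])            # localhost:8080
print(headers["content-length"])  # 0
```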
Blank line: Marks the end of the message headers and the beginning of the message body. Nothing more needs to be said about it.
Message body: The content that the HTTP message actually carries. Somewhat awkwardly, some methods (GET, for example) need nothing beyond the start line and the message headers, so this part is not necessarily the longest and may even be empty.
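Putting the four parts together, the following sketch assembles a request message as bytes and splits it back apart at the blank line. It assumes the HTTP/1.1 convention that lines end with CRLF (`\r\n`); the helper names, the `/login` path, and the form data are made up for illustration.

```python
CRLF = "\r\n"


def build_request(method, target, headers, body=b""):
    """Assemble start line + headers + blank line + body into one message."""
    start_line = f"{method} {target} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # The extra CRLF after the last header is the blank line.
    head = CRLF.join([start_line, *header_lines]) + CRLF + CRLF
    return head.encode("ascii") + body


def split_message(raw):
    """Split a raw message into (start line, header lines, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("ascii").split(CRLF)
    return start_line, header_lines, body


# A GET request: the body is empty.
get_request = build_request("GET", "/user/3940246036953293", {"Host": "juejin.cn"})
print(split_message(get_request))

# A POST request carrying hypothetical form data in its body.
form = b"name=alice"
post_request = build_request(
    "POST",
    "/login",
    {
        "Host": "juejin.cn",
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(form)),
    },
    form,
)
print(split_message(post_request)[2])  # b'name=alice'
```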
HTTP response
Like the request message, the HTTP response message has four parts: the status line, the message headers, a blank line, and the message body, as shown in the figure:
The last three parts are essentially the same as in the request message, so we focus on the status line.
The status line also has three parts: the HTTP protocol version, the status code, and the status text. The HTTP protocol version needs no further explanation.
Status code: We are actually quite familiar with it. The most typical case is the 404 status code we get whenever the URL we visit does not exist. The status code is a number that indicates the outcome of the request. Besides 404, typical status codes include 200 (request successful), 301 (resource permanently moved), and 302 (resource temporarily moved).
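By their first digit, status codes fall roughly into five classes:
- 1xx: informational; the request was received and processing continues.
- 2xx: success; the request was received and handled successfully.
- 3xx: redirection; further action is needed to complete the request.
- 4xx: client error; the request is invalid or cannot be fulfilled.
- 5xx: server error; the server failed to handle a valid request.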
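Because the class depends only on the first digit, it can be computed with integer division. The sketch below shows one way to do this; the `status_class` helper and the short class labels are illustrative names, not part of any library.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Return the class of a three-digit status code based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


for code in (200, 301, 302, 404):
    print(code, status_class(code))
```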
Status text: A short plain-text message that describes the status the code represents, meant for human readers.
So a typical status line might look like this:
HTTP/1.1 404 Not Found
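A status line can be taken apart in much the same way as a request start line. The sketch below splits it into its three parts and, in the other direction, looks up the standard status text for a code with the standard library's `http.HTTPStatus`. The two helper functions are hypothetical names chosen for this example.

```python
from http import HTTPStatus


def parse_status_line(line):
    """Split a status line into (version, status code, status text)."""
    # Split at most twice, because the status text may itself contain spaces.
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


def build_status_line(code, version="HTTP/1.1"):
    """Build a status line using the standard status text for the code."""
    status = HTTPStatus(code)
    return f"{version} {status.value} {status.phrase}"


print(parse_status_line("HTTP/1.1 404 Not Found"))  # ('HTTP/1.1', 404, 'Not Found')
print(build_status_line(200))                       # HTTP/1.1 200 OK
```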
When the client uses the GET method to request a web page from the server and the request succeeds, the message body of the server's HTTP response contains the HTML text of the page.
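The whole round trip can be tried out locally without touching a real website. The following is a minimal sketch, assuming a throwaway server on 127.0.0.1 built with the standard library's `http.server`, and a client built with `http.client`; the page content, `PageHandler`, and `fetch_once` are made up for this example.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# A hypothetical web page that the server returns for every GET request.
PAGE = b"<html><body><h1>Hello, HTTP</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)  # writes the status line
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()       # writes the blank line
        self.wfile.write(PAGE)   # writes the message body

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Start a local server, send one GET request, and return the response parts."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: pick any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        connection = HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        result = (response.status, response.reason, response.read())
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, reason, body = fetch_once()
print(status, reason)
print(body.decode("utf-8"))
```

The status code and status text come from the status line, and the body returned by `response.read()` is exactly the HTML text the handler wrote after the blank line.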
Summary
This article only gives a brief introduction to the HTTP protocol and a first look at some related concepts. The following articles will explore HTTP in more depth.