About the HTTP request process, the TCP three-way handshake, and the four waves that close the connection



Introduction to HTTP

HTTP is the HyperText Transfer Protocol. It transfers data (text, pictures, query results, etc.) on top of the TCP/IP protocol suite.
When the client requests a static resource, the server returns a static page as is. When the client requests a dynamic resource, the server generates the response on access (this is a somewhat simplified description). HTTP is located in the application layer.


Three characteristics of HTTP

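Connectionless:
Each connection handles a single request. For example, when downloading data, the connection is closed once the download finishes. (HTTP/1.1 can keep a connection open for several requests, but each request is still handled on its own.)
Media independent:
Any type of data, such as video, audio, and pictures, can be transmitted, as long as the content is labelled with its MIME type.
Stateless:
The protocol has no memory of previous requests.
For example, if a later request needs information from an earlier one, that information must be sent again. This is unlike TCP, which keeps state for each connection.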


HTTP message structure

HTTP is based on the client/server (C/S) architecture model, which exchanges information through a reliable link, and is a stateless request/response protocol.
An HTTP "client" is an application (Web browser or any other client) that connects to the server to send one or more HTTP requests to the server.
An HTTP "server" is also an application (usually a web service, such as Nginx, Apache web server or IIS server, etc.), which receives requests from the client and sends HTTP response data to the client.
HTTP uses Uniform Resource Identifiers (URIs) to identify the resources that are requested and transferred.
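
To make the client/server roles concrete, here is a minimal sketch using only the Python standard library. The handler class `HelloHandler`, the helper `fetch_once`, and the response text are made up for illustration. The server listens on a free loopback port, and the client sends a single GET request to it.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello from the server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once(path="/"):
    """Start a server on a free loopback port, send one GET, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: let the OS pick
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)          # the client sends the request
        response = conn.getresponse()      # the server answers with a response
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_once()` plays both roles in one process. A real deployment would of course use a separate server such as Nginx or Apache.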


HTTP request flow


1. DNS resolution
The browser first checks its own DNS cache. If the name is not cached, it reads the local hosts file. If there is still no match, it makes a DNS system call, which passes the query to the operator's (ISP's) DNS server. That server checks its own cache and, on a miss, performs an iterative DNS resolution (root DNS server -> com domain DNS server, and so on down to the name server for the domain). The result is returned to the operating system kernel and cached for the next use. This is a simplified view of DNS resolution, whose purpose is to get the IP address corresponding to the server's domain name.
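
The "check the cache first, then ask the resolver" pattern described above can be sketched as follows. This is only an illustration: `CachingResolver` is a hypothetical class, and it delegates the real lookup to the operating system through `socket.gethostbyname` rather than walking the DNS hierarchy itself.

```python
import socket


class CachingResolver:
    """Look up a host name, consulting a local cache before the system resolver."""

    def __init__(self, lookup=socket.gethostbyname):
        self._lookup = lookup      # function that performs the real query
        self._cache = {}           # host name -> IP address

    def resolve(self, host):
        if host in self._cache:            # cache hit: no query is made
            return self._cache[host]
        ip = self._lookup(host)            # cache miss: ask the system resolver
        self._cache[host] = ip             # remember the answer for next time
        return ip
```

With the default lookup, `CachingResolver().resolve("localhost")` asks the operating system once and serves later calls for the same name from the cache. A real cache would also honor each record's time to live, which this sketch ignores.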

2. Three-way handshake
HTTP is based on the TCP protocol, so a TCP connection is established first.

In the TCP/IP protocol suite, TCP provides a reliable connection service and uses a three-way handshake to establish a connection.
First handshake: to establish a connection, the client sends a SYN packet (syn=j) to the server and enters the SYN_SENT state, waiting for the server to confirm.
Second handshake: the server receives the SYN packet and must acknowledge the client's SYN (ack=j+1); at the same time it sends its own SYN (syn=k). This combined packet is the SYN+ACK packet, and the server enters the SYN_RECV state.
Third handshake: the client receives the SYN+ACK packet from the server and sends an acknowledgement packet ACK (ack=k+1) to the server. Once this packet has been sent and received, the client and server are in the ESTABLISHED state and the three-way handshake is complete.
After completing the three-way handshake, the client and server begin to transmit data.
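
The sequence and acknowledgement numbers in the three steps above can be checked with a small simulation. It is a sketch of the bookkeeping only, not a network implementation: the `Segment` type and the function name are made up, and `j` and `k` stand for the initial sequence numbers chosen by the client and the server.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    sender: str
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(j: int, k: int) -> List[Segment]:
    """Return the three segments exchanged for initial sequence numbers j and k."""
    # 1st handshake: client sends SYN (seq=j) and enters SYN_SENT
    syn = Segment("client", "SYN", seq=j)
    # 2nd handshake: server acknowledges j+1 and sends its own SYN (seq=k)
    syn_ack = Segment("server", "SYN+ACK", seq=k, ack=syn.seq + 1)
    # 3rd handshake: client acknowledges k+1; the SYN used up sequence number j
    ack = Segment("client", "ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]
```

For example, `three_way_handshake(100, 300)` yields acknowledgement numbers 101 and 301 in the second and third segments.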

3. The client sends a resource request to the server
The HTTP protocol defines the format of the request message and the response message. Although HTTP messages are transmitted through a TCP-based data channel, HTTP itself is not concerned with how the messages are transported; it only standardizes the content of the request and the response.
The request message that the client sends to the server consists of four parts: request line, request headers, a blank line, and request data. The following figure shows the general format of the request message.
[Figure: general format of the request message]
For example, a request to Baidu:
[Figure: example request to Baidu]
Request line:
① HTTP request method:
According to the HTTP standard, HTTP requests can use multiple request methods.
HTTP/1.0 defines three request methods: GET, POST and HEAD.
HTTP/1.1 adds five new request methods: OPTIONS, PUT, DELETE, TRACE and CONNECT.
Key methods:
GET: simply get data (for example, get an index.html page)
POST: submit data to the server, for example to upload or create a file (new data is usually generated)
PUT: save data at the given URL (overwrite/update files, pictures, etc. that already exist there)
DELETE: delete the specified resource
② URL:
URL: Uniform Resource Locator, an abstract and unique way of identifying where a resource is located.
Composition: <protocol>://<host>:<port>/<path>
The port and path can be omitted (the default HTTP port number is 80)
③ Protocol version:
The format of the protocol version is HTTP/<major version number>.<minor version number>. HTTP/1.0 and HTTP/1.1 are the most commonly used.
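
The URL composition described above, including the omitted port and path, can be illustrated with `urllib.parse.urlsplit` from the Python standard library. The helper `split_url` and the `DEFAULT_PORTS` table are assumptions for this sketch; the https entry (443) is added for completeness and is not mentioned in the text.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def split_url(url):
    """Split a URL into (protocol, host, port, path), filling in omitted parts."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)   # omitted port -> default
    path = parts.path or "/"                               # omitted path -> root
    return parts.scheme, parts.hostname, port, path
```

For instance, `split_url("http://www.example.com")` fills in port 80 and path `/`, while an explicit port such as `:8080` is returned unchanged.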
Request headers
These are unique to the request message. They give the server additional information, such as what type of data the client wants to receive (the Accept header).
Request data
The body of the request, for example the form fields submitted with a POST. A GET request usually has no request data.
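
Putting the four parts together, a request message can be assembled by hand as in the sketch below. The function name `build_request` and the host used in the example are illustrative assumptions. The sketch fixes the version at HTTP/1.1 and does no validation.

```python
def build_request(method, path, headers, body=b""):
    """Assemble an HTTP/1.1 request message as bytes."""
    lines = [f"{method} {path} HTTP/1.1"]                              # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]  # request headers
    head = "\r\n".join(lines) + "\r\n\r\n"                             # blank line ends the headers
    return head.encode("ascii") + body                                # request data (may be empty)
```

`build_request("GET", "/index.html", {"Host": "www.example.com", "Accept": "text/html"})` produces a request line, two header lines, and the terminating blank line, each ending in CRLF.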

4. Server response data
An HTTP response also consists of four parts: status line, message headers, a blank line, and the response body.
Status line
When a visitor opens a web page, the browser sends a request to the server where the page is located. Before the browser receives and displays the page, the server returns a response whose status line contains the HTTP status code.
The following are common HTTP status codes:

  • 200-request successful
  • 301-Resources (webpages, etc.) are permanently transferred to other URLs
  • 404-The requested resource (webpage, etc.) does not exist
  • 403-The server understood the request of the requesting client, but refused to execute the request
  • 500-Internal server error
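
The four parts of a response can be separated with a few lines of Python. This is a minimal sketch under simplifying assumptions: the function name `parse_response` is made up, the status line is assumed to contain a reason phrase, and repeated or folded headers are not handled.

```python
def parse_response(raw):
    """Split a raw HTTP response (bytes) into its parts."""
    head, _, body = raw.partition(b"\r\n\r\n")      # the blank line separates head and body
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = lines[0].split(" ", 2)  # status line
    headers = {}
    for line in lines[1:]:                          # message headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body
```

Given a response that starts with `HTTP/1.1 404 Not Found`, the function returns the status code as the integer 404 together with the lower-cased header names and the untouched body bytes.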

Response header
Response headers: give the client additional information, for example what type of server the client is interacting with (the Server header).
Entity headers: headers that describe the entity body. For example, the Content-Type header describes the data type of the entity body.

Response header: Description
Allow: Which request methods the server supports (such as GET, POST, etc.).
Content-Encoding: The encoding method of the document. The content type specified by the Content-Type header is obtained only after decoding. Compressing documents with gzip can significantly reduce the download time of HTML documents. Java's GZIPOutputStream makes gzip compression easy, but older browsers did not all support it. Therefore, the servlet should check whether the browser supports gzip by looking at the Accept-Encoding header (i.e. request.getHeader("Accept-Encoding")), return gzip-compressed HTML pages to browsers that support gzip, and return ordinary pages to other browsers.
Content-Length: The length of the content. It matters most when the browser uses a persistent HTTP connection. To take advantage of a persistent connection, you can write the output document to a ByteArrayOutputStream, check its size when finished, put that value in the Content-Length header, and finally send the content with byteArrayStream.writeTo(response.getOutputStream()).
Content-Type: The MIME type of the document that follows. Servlets default to text/plain, but it usually needs to be set explicitly to text/html. Since Content-Type is set so often, HttpServletResponse provides a dedicated method, setContentType.
Date: The current GMT time. You can use setDateHeader to set this header and avoid the trouble of converting the time format.
Expires: The time after which the document should be considered expired and no longer cached.
Last-Modified: The time when the document was last changed. The client can supply a date in the If-Modified-Since request header; the request is then treated as a conditional GET, and the document is returned only if it was modified after the specified time. Otherwise a 304 (Not Modified) status is returned. Last-Modified can also be set with the setDateHeader method.
Location: Where the client should go to retrieve the document. Location is usually not set directly, but through the sendRedirect method of HttpServletResponse, which also sets the status code to 302.
Refresh: The time, in seconds, after which the browser should refresh the document. In addition to refreshing the current document, you can use setHeader("Refresh", "5; URL=http://host/path") to make the browser load the specified page. This function is usually achieved by putting <META HTTP-EQUIV="Refresh" CONTENT="5;URL=http://host/path"> in the HEAD area of the HTML page, because automatic refresh or redirection is very important to HTML authors who cannot use CGI or servlets. For a servlet, however, setting the Refresh header directly is more convenient. Note that Refresh means "refresh this page or visit the specified page after N seconds", not "refresh this page or visit the specified page every N seconds". Continuous refresh therefore requires sending a Refresh header every time, and sending a 204 status code stops the browser from refreshing further, whether it is using the Refresh header or <META HTTP-EQUIV="Refresh" …>. The Refresh header is not part of the official HTTP 1.1 specification but an extension, although both Netscape and IE support it.
Server: The server name. Servlets generally do not set this value; the Web server sets it itself.
Set-Cookie: Sets cookies associated with the page. A servlet should not use response.setHeader("Set-Cookie", …), but should use the dedicated addCookie method provided by HttpServletResponse.
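
The Content-Encoding entry above describes the negotiation in terms of Java servlets. The same idea can be sketched in Python with the standard `gzip` module. The function name `encode_body` is made up, and the parsing of Accept-Encoding is deliberately simple: it ignores quality values such as `gzip;q=0`.

```python
import gzip


def encode_body(body, accept_encoding):
    """Return (body, extra_headers), compressing only if the client accepts gzip."""
    accepted = [token.split(";")[0].strip().lower()
                for token in accept_encoding.split(",")]
    if "gzip" in accepted:
        compressed = gzip.compress(body)
        return compressed, {"Content-Encoding": "gzip",
                            "Content-Length": str(len(compressed))}
    # Client did not list gzip: send the ordinary, uncompressed page
    return body, {"Content-Length": str(len(body))}
```

In both branches Content-Length is computed from the bytes that are actually sent, which is what a persistent connection needs to find the end of the response.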

Response data
The data is returned according to the type declared in the headers.

5. Page rendering
Modern browsers render a page roughly as follows: parse the HTML to build a DOM tree -> build the render tree -> lay out the render tree -> paint the render tree.
The DOM tree is built from the arrangement of tags in the HTML file.
The render tree is formed by applying the styles from CSS or from the HTML itself to the DOM tree. The render tree contains only the DOM elements that need to be displayed on the page; for example, the <head> element and elements whose display property is none are not in the render tree. The browser starts to render the page before it has received the complete HTML file.
When it encounters externally linked script tags, style tags, and pictures, it sends further HTTP requests and repeats the steps above. After receiving a CSS file, it re-renders the parts of the page already rendered and applies their proper styles. After an image file has loaded, it is displayed in the corresponding position immediately. During this process, a repaint or reflow of the page may be triggered.
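
The first two steps (DOM tree, then render tree) can be illustrated with a toy model built on Python's `html.parser`. This is only a sketch of the idea, not how a browser engine works. The names `Node`, `DomBuilder`, `build_dom`, `is_hidden` and `render_tree` are made up, text nodes are ignored, and only an inline `style="display:none"` is recognized, whereas a real browser evaluates all CSS rules.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "meta", "link", "input", "hr"}


class Node:
    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []


class DomBuilder(HTMLParser):
    """Build a simple tree of Node objects from the tags in an HTML string."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self._stack[-1].children.append(node)
        if tag not in VOID_TAGS:          # void elements have no closing tag
            self._stack.append(node)

    def handle_endtag(self, tag):
        if len(self._stack) > 1 and self._stack[-1].tag == tag:
            self._stack.pop()


def build_dom(html):
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def is_hidden(node):
    style = (node.attrs.get("style") or "").replace(" ", "").lower()
    return node.tag == "head" or "display:none" in style


def render_tree(node):
    """Return nested (tag, children) tuples, skipping nodes that are not displayed."""
    kids = [render_tree(child) for child in node.children if not is_hidden(child)]
    return (node.tag, kids)
```

Running `render_tree(build_dom(html))` on a page keeps the `body` subtree but drops `head` and any element hidden with an inline `display:none`, along with everything inside it.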

6. The connection is closed (four waves)
Since a TCP connection is full-duplex, each direction must be closed separately. The principle is that when one side has finished sending its data, it can send a FIN to terminate the connection in that direction. Receiving a FIN only means that no more data will flow in that direction; the side that receives a FIN can still send data. The side that closes first performs the active close, and the other side performs the passive close.

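First wave: the TCP client sends a FIN to close data transfer from the client to the server.
(First wave: initiated by the browser and sent to the server: "I have finished sending my request message, so get ready to close.")
Second wave: the server receives this FIN and sends back an ACK whose acknowledgement number is the received sequence number plus 1. Like a SYN, a FIN consumes one sequence number.
(Second wave: initiated by the server, telling the browser: "I have finished receiving your request message and am preparing to close; you get ready too.")
Third wave: the server closes its side of the connection and sends a FIN to the client.
(Third wave: initiated by the server, telling the browser: "I have finished sending my response message, so get ready to close.")
Fourth wave: the client sends back an ACK to confirm, with the acknowledgement number set to the received sequence number plus 1.
(Fourth wave: initiated by the browser, telling the server: "I have finished receiving your response message and am preparing to close; you get ready too.")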
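
The acknowledgement numbers in the four waves can be traced with a small simulation. It is a sketch under stated assumptions: the function name is made up, `u` and `v` stand for the current sequence numbers of the client and the server, the server sends no further data between the second and third wave, and each segment is shown with a single flag for clarity.

```python
def four_way_wave(u, v):
    """Return the four segments of a close started by the client.

    Each segment is a tuple (sender, flag, seq, ack).
    """
    fin1 = ("client", "FIN", u, None)        # 1st wave: client stops sending
    ack1 = ("server", "ACK", v, u + 1)       # 2nd wave: ack = received seq + 1
    fin2 = ("server", "FIN", v, u + 1)       # 3rd wave: server stops sending
    ack2 = ("client", "ACK", u + 1, v + 1)   # 4th wave: ack = received seq + 1
    return [fin1, ack1, fin2, ack2]
```

With `four_way_wave(1000, 5000)` the server acknowledges 1001 and the client acknowledges 5001, because each FIN consumes one sequence number.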




Origin blog.csdn.net/qq_26129413/article/details/112403920