In-depth analysis of the secrets behind an HTTP request (Part 2)

Preface

This article follows on from the previous one, the first part of the in-depth analysis of the secrets behind an HTTP request. It mainly explains how data is actually sent during an HTTP request once the client has established a connection with the server. There are two main steps: sending the request and receiving the response, and disconnecting.

Space is limited, so I will focus on the parts we usually come into contact with and only briefly explain the parts that do not need to be understood in depth. If this is useful to you, remember the one-click triple~

1. Request sending and response

Once the HTTP connection has been created, the next step is to actually send our request data. After the application hands the data to the protocol stack, the protocol stack does not necessarily send it right away. It first places the data in a send buffer and waits for the application's next segment of data, then sends everything at once when a certain amount has accumulated, so as to make full use of the capacity of each network packet. However, this lengthens the wait before each transmission and ultimately delays the request, so a trade-off is needed. Different operating systems handle this differently, mainly by balancing two factors: the maximum length a network packet can hold (the MTU) and the waiting time. Browsers are latency-sensitive, so their data generally needs to be sent immediately, even when the buffer is not full. The general structure of the whole packet that is sent is as follows.
[Figure: general structure of the packet that is sent]
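The trade-off between filling a packet and waiting can be sketched in a few lines of Python. This is only an illustration under stated assumptions: should_flush and max_wait_ms are hypothetical names, the 1500-byte MTU is the usual Ethernet value, and the headers are assumed to be 20 bytes each with no options. Real protocol stacks make this decision in their own, more elaborate ways.

```python
# Hypothetical sketch of a send-buffer flush decision; not real OS code.
MTU = 1500          # assumed: typical Ethernet MTU in bytes
IP_HEADER = 20      # assumed: IPv4 header without options
TCP_HEADER = 20     # assumed: TCP header without options
MSS = MTU - IP_HEADER - TCP_HEADER  # application data that fits in one packet


def should_flush(buffered_bytes: int, waited_ms: float, max_wait_ms: float = 0.0) -> bool:
    """Return True when the data in the send buffer should be sent now."""
    if buffered_bytes <= 0:
        return False                  # nothing to send
    if buffered_bytes >= MSS:
        return True                   # a full packet is ready, no reason to wait
    return waited_ms >= max_wait_ms   # otherwise send once the wait budget is used up


print(MSS)                            # 1460
print(should_flush(200, 0.0, 200.0))  # sender willing to wait for more data: False
print(should_flush(200, 0.0, 0.0))    # latency-sensitive sender: True
```

With max_wait_ms set to 0 the function behaves like the browser case described above: any buffered data is sent at once, at the cost of packets that are not full.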
Sending the data does not mean the transmission succeeded. The server needs to return an acknowledgment so that the client knows the data has arrived. If no acknowledgment comes back, TCP's retransmission mechanism is triggered. How is successful delivery confirmed? This relies on the TCP header shown in the screenshot from the previous article, as follows:
[Figure: TCP header screenshot from the previous article]
The previous article introduced the 32-bit sequence number and acknowledgment number. The sequence number tells the other party where the data sent this time starts, counted in bytes from the beginning of the whole data stream. The length of the data is something the server can work out itself by subtracting the length of the packet headers from the length of the whole packet. After the server receives the data, it returns an acknowledgment in which the 32-bit acknowledgment number (ACK number) records how many bytes it has received so far. From the ACK number the client can confirm whether the other party has received all the data, and so decide whether to resend or to send the next packet. The specific process is roughly as follows:
[Figure: exchange of sequence numbers and ACK numbers]
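The arithmetic behind this exchange is small enough to write out. The sketch below is illustrative only: the function names are hypothetical, and the 40-byte header length assumes IPv4 and TCP headers without options. It shows the receiver deriving the data length from the packet length, the ACK number it would return, and the sender's check on that ACK.

```python
SEQ_SPACE = 2 ** 32  # sequence and ACK numbers are 32-bit values, so they wrap around


def payload_length(packet_len: int, header_len: int) -> int:
    """Data length = length of the whole packet minus the length of its headers."""
    return packet_len - header_len


def ack_for(seq: int, packet_len: int, header_len: int) -> int:
    """ACK number the receiver returns after getting a packet that starts at seq."""
    return (seq + payload_length(packet_len, header_len)) % SEQ_SPACE


def fully_acked(seq: int, sent_payload: int, ack: int) -> bool:
    """Sender side: does the ACK number cover everything that was sent?"""
    return ack == (seq + sent_payload) % SEQ_SPACE


ack = ack_for(seq=1000, packet_len=1500, header_len=40)
print(ack)                             # 2460
print(fully_acked(1000, 1460, ack))    # True: move on to the next packet
print(fully_acked(1000, 1460, 1000))   # False: nothing confirmed, consider resending
```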

Note: The sequence number does not start from a fixed position such as 0 or 1, but from a random value. This makes sequence numbers hard to predict, which helps protect the connection against network attacks. When a connection is created, the sequence number field is set during the first handshake, and this value is the initial value of the sequence number.

Note also that when the server or client receives a packet from the other party, the protocol stack temporarily stores the data in the receive buffer, and the corresponding application then reads the data from this buffer.
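As a rough illustration of a random starting point, the following sketch picks a 32-bit initial value and converts an absolute sequence number back into an offset from that start. The function names are hypothetical, and secrets.randbits is only a stand-in here; real protocol stacks generate their initial sequence numbers with their own schemes.

```python
import secrets

SEQ_SPACE = 2 ** 32  # the sequence number field is 32 bits wide


def new_initial_sequence_number() -> int:
    """Pick an unpredictable 32-bit starting value (stand-in for the stack's own scheme)."""
    return secrets.randbits(32)


def relative_offset(seq: int, isn: int) -> int:
    """How far seq lies past the initial value, allowing for 32-bit wraparound."""
    return (seq - isn) % SEQ_SPACE


isn = new_initial_sequence_number()
seq_after_500_bytes = (isn + 500) % SEQ_SPACE
print(relative_offset(seq_after_500_bytes, isn))  # 500
```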

In actual interaction, to reduce waiting time and improve transmission efficiency, the client does not have to wait for the server's ACK number before sending again. It can continue to send the next packets while earlier ones are still unacknowledged. However, this may overflow the server's receive buffer. The client therefore needs to learn the size of the server's receive buffer at the start, and each time it sends request data it calculates how much space is left in that buffer so that it never sends more than fits. The free space also changes on the server side: when the server application reads data from the receive buffer, the data that has been read is removed and space is freed. So whenever the application reads from the buffer, the server also needs to notify the client of the change in available space.
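The bookkeeping the client does here can be modelled with a small class. This is a simplified sketch, not how any particular protocol stack is written: SendWindow and its method names are hypothetical, and the byte counts are made-up example values.

```python
class SendWindow:
    """Sender-side view of the receiver's buffer (hypothetical, simplified model)."""

    def __init__(self, advertised_window: int) -> None:
        self.window = advertised_window  # free space last reported by the receiver
        self.in_flight = 0               # bytes sent but not yet acknowledged

    def sendable(self) -> int:
        """Bytes that can still be sent without overflowing the receive buffer."""
        return max(0, self.window - self.in_flight)

    def on_send(self, nbytes: int) -> None:
        if nbytes > self.sendable():
            raise ValueError("would overflow the receiver's buffer")
        self.in_flight += nbytes

    def on_ack(self, acked_bytes: int, new_window: int) -> None:
        """Handle an ACK or a window update; acked_bytes is 0 for a pure update."""
        self.in_flight -= acked_bytes
        self.window = new_window


w = SendWindow(advertised_window=4000)
w.on_send(1460)
w.on_send(1460)          # second packet goes out before any ACK has arrived
print(w.sendable())      # 1080
w.on_ack(2920, 1080)     # data acknowledged but still unread in the receive buffer
print(w.sendable())      # 1080
w.on_ack(0, 4000)        # application read the data, so the space is free again
print(w.sendable())      # 4000
```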
In addition, when the server receives several packets from the client in a row, it does not have to return an ACK for each one. It can return a single ACK that indicates the total length received so far. If the application happens to read data from the buffer at that moment, the notification of the change in available space can even be combined with the ACK into a single packet, which improves transmission efficiency. The general process is as follows:
[Figure: ACK numbers and buffer-size notifications combined]
The above roughly covers how the browser sends data to the server. After the browser has sent the request, it calls read in the Socket library. Control passes through read to the protocol stack, which tries to read data from the receive buffer. If the response has not yet arrived in the receive buffer, the application is temporarily suspended, and receiving resumes once the response message returned by the server arrives. As for how the server sends the response data, the process and principle are basically the same as for the client sending the request data, so I won't repeat them.
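The suspension of the application while it waits for the response can be observed with a blocking read. The sketch below makes several assumptions: it uses Python's socket.socketpair, which gives a connected local pair of sockets rather than a real TCP connection to a web server, a background thread plays the role of the server, and the 0.2 second delay and the response bytes are made up for the demonstration.

```python
import socket
import threading
import time


def delayed_reply(sock: socket.socket, delay_s: float, payload: bytes) -> None:
    """Stand-in for the server: send the response only after a delay."""
    time.sleep(delay_s)
    sock.sendall(payload)


def read_response(delay_s: float = 0.2):
    """Return (data, seconds spent waiting) for one blocking read."""
    client, server = socket.socketpair()  # assumed stand-in for a real connection
    start = time.monotonic()
    worker = threading.Thread(
        target=delayed_reply, args=(server, delay_s, b"HTTP/1.1 200 OK\r\n\r\n")
    )
    worker.start()
    try:
        # The receive buffer is empty, so this call suspends the caller
        # until the response arrives.
        data = client.recv(4096)
        elapsed = time.monotonic() - start
    finally:
        worker.join()
        client.close()
        server.close()
    return data, elapsed


data, elapsed = read_response()
print(data)
print(f"blocked for about {elapsed:.2f} s")
```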

2. Disconnecting

When all the data has been transferred, the connection needs to be closed. Either party may initiate the disconnect; the protocol stack allows both. The following uses a disconnect initiated by the server as an example.

  1. The server application calls close in the Socket library. The server's protocol stack then generates the TCP header of the disconnect message, in which the FIN bit among the control bits is set to 1, and sends it. The server also records in its socket that a disconnect is in progress.
  2. When the client receives the disconnect request, it likewise marks its socket as being in the disconnecting state. It then sends a packet containing the ACK number to tell the server that the disconnect request has been received, and informs the application (the browser) that all the response data has been received.
  3. After the client application has received all the data, it also calls close to end its sending and receiving. A TCP packet with FIN set to 1 is generated and sent to the server to tell it that the client has finished sending and receiving data.
  4. The server returns a packet containing the ACK number to tell the client that its disconnect request has been received.
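The four steps above can be traced with a tiny state machine. This is a sketch under assumptions: the article does not name the socket states, so the names below are borrowed from the standard TCP state machine, and the event names and the server_initiated_close function are hypothetical.

```python
# (current state, event) -> (next state, packet sent in response or None)
TRANSITIONS = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "FIN"),
    ("ESTABLISHED", "recv_FIN"): ("CLOSE_WAIT", "ACK"),
    ("FIN_WAIT_1", "recv_ACK"): ("FIN_WAIT_2", None),
    ("CLOSE_WAIT", "app_close"): ("LAST_ACK", "FIN"),
    ("FIN_WAIT_2", "recv_FIN"): ("TIME_WAIT", "ACK"),
    ("LAST_ACK", "recv_ACK"): ("CLOSED", None),
}


def step(state: str, event: str):
    """Apply one event to one side's socket state."""
    return TRANSITIONS[(state, event)]


def server_initiated_close():
    """Replay the four steps and return (packets sent, server state, client state)."""
    server = client = "ESTABLISHED"
    sent = []
    server, pkt = step(server, "app_close")   # step 1: server sends FIN
    sent.append(("server", pkt))
    client, pkt = step(client, "recv_FIN")    # step 2: client answers with ACK
    sent.append(("client", pkt))
    server, _ = step(server, "recv_ACK")
    client, pkt = step(client, "app_close")   # step 3: client sends its own FIN
    sent.append(("client", pkt))
    server, pkt = step(server, "recv_FIN")    # step 4: server answers with ACK
    sent.append(("server", pkt))
    client, _ = step(client, "recv_ACK")
    return sent, server, client


packets, server_state, client_state = server_initiated_close()
print(packets)
print(server_state, client_state)  # TIME_WAIT CLOSED
```

The side that started the disconnect ends in a waiting state rather than being closed outright, while the other side is closed as soon as the final ACK arrives.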

After the four steps above (the "four waves") are complete, the sockets used for the communication are deleted. However, a socket is generally not deleted immediately. It waits for a period of time first, to prevent misoperation. There are many ways such misoperation can arise. For example, suppose the client had initiated the disconnect instead: if the client's final ACK is lost, the server does not receive it and resends its packet with FIN set to 1. If the client's socket had already been deleted, its port number would also have been released. Another application could then create a socket on that same port, receive the server's FIN, and be pushed into the disconnect procedure by mistake. Therefore, after communication is complete, a socket is generally retained for a period of time before being deleted.
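The misoperation described above, and how a waiting period avoids it, can be modelled with a small port table. Everything in this sketch is hypothetical: the PortTable class, the port number 50000, and the 60-second waiting period are made-up illustration values, not figures from the article or from any specific operating system.

```python
class PortTable:
    """Toy model of one host's ports with closed sockets kept around for a while."""

    def __init__(self, linger_s: float) -> None:
        self.linger_s = linger_s
        self.lingering = {}   # port -> time at which the closed socket is deleted
        self.active = set()   # ports with a live socket

    def _expire(self, now: float) -> None:
        for port in [p for p, t in self.lingering.items() if t <= now]:
            del self.lingering[port]

    def open(self, port: int, now: float) -> None:
        self._expire(now)
        if port in self.lingering or port in self.active:
            raise OSError("port still in use")
        self.active.add(port)

    def close(self, port: int, now: float) -> None:
        self.active.discard(port)
        self.lingering[port] = now + self.linger_s

    def on_fin(self, port: int, now: float) -> str:
        """What happens when a resent FIN arrives for this port."""
        self._expire(now)
        if port in self.lingering:
            return "absorbed by the old socket"
        if port in self.active:
            return "delivered to a different application"
        return "no socket"


# No waiting period: the port is reused and the stray FIN hits the new socket.
t = PortTable(linger_s=0)
t.open(50000, now=0)
t.close(50000, now=1)
t.open(50000, now=2)              # another application takes the port
print(t.on_fin(50000, now=3))     # delivered to a different application

# With a waiting period the old socket is still there to take the FIN.
t = PortTable(linger_s=60)
t.open(50000, now=0)
t.close(50000, now=1)
print(t.on_fin(50000, now=3))     # absorbed by the old socket
```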

The overall process is roughly as follows. There are of course many more points worth exploring, but I won't go into detail here.
[Figures: overall process]
Original writing takes effort, so a one-click triple is welcome~

Reference materials:

"How the Internet is Connected"

TCP header format

Origin blog.csdn.net/dypnlw/article/details/114121074