What happens when you enter a URL in the browser's address bar?

1. HTTP request process

HTTP (Hypertext Transfer Protocol) is an application-layer protocol built on top of TCP.

1.1 The domain name in the URL is resolved to an IP address (DNS resolution)

Browsers locate and access a server by its IP address, not by its domain name, so the IP address that corresponds to the domain name has to be found first.

So how is the IP address found?

The lookup proceeds layer by layer in the sequence below. As soon as one layer finds the IP address, resolution stops and the process moves on to the next part, the TCP connection.

Browser cache → system cache → router → Internet service provider → recursive query → client obtains the IP address

The lookup steps in detail:

  1. (Browser cache) Check whether the browser cache holds an unexpired IP address for the domain name that was entered. If so, resolution ends; otherwise, go to step 2.
  2. (System cache) Check whether the operating system's DNS cache holds an unexpired entry for the domain name. If so, resolution ends; otherwise, go to step 3.
  3. (Router) Send a request to the router of the local network. If the router has the IP address, resolution ends; otherwise, go to step 4.
  4. (Internet service provider) Send a request to the DNS server of the Internet service provider. If it has the IP address, resolution ends; otherwise, go to step 5.
  5. (Recursive query) Perform a recursive query: from the root name server to the top-level domain name server, and from the top-level domain name server to the name server responsible for the queried domain, until the IP address is found and resolution ends.
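
The layered lookup above can be sketched as a chain of caches that is queried in order. This is only a minimal simulation, not how a real resolver is implemented: the `resolve` function, the cache dictionaries, and the example domains and addresses are all hypothetical, and the last dictionary merely stands in for the root → top-level domain → domain name server chain.

```python
from typing import Callable, Optional

# Each layer is a (name, lookup) pair; lookup returns an IP string or None.
Layer = tuple[str, Callable[[str], Optional[str]]]


def resolve(domain: str, layers: list[Layer]) -> Optional[tuple[str, str]]:
    """Query each layer in order and stop at the first one that knows the IP."""
    for name, lookup in layers:
        ip = lookup(domain)
        if ip is not None:
            return name, ip  # found: resolution ends here
    return None  # no layer could resolve the domain


# Hypothetical cache contents (addresses from the documentation range 192.0.2.0/24).
browser_cache = {"example.com": "192.0.2.10"}
system_cache = {"example.org": "192.0.2.20"}
router_cache: dict[str, str] = {}
isp_cache = {"example.net": "192.0.2.30"}
name_servers = {"example.edu": "192.0.2.40"}  # stands in for the recursive query

layers: list[Layer] = [
    ("browser cache", browser_cache.get),
    ("system cache", system_cache.get),
    ("router", router_cache.get),
    ("Internet service provider", isp_cache.get),
    ("recursive query", name_servers.get),
]

print(resolve("example.com", layers))  # answered by the first layer
print(resolve("example.edu", layers))  # falls through to the last layer
```

A domain that no layer knows makes `resolve` return `None`, which corresponds to a failed resolution.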

1.2 Establish a TCP connection with a three-way handshake

TCP is connection-oriented and provides reliable transmission between the server and the client. Once a TCP connection has been established, both parties have confirmed that they can send and receive data normally.

The message format used for normal Internet transmission:
[Figure: message format]
Both parties confirm each other's sending and receiving by exchanging three packets, commonly known as TCP's three-way handshake.

The flow of TCP's three-way handshake:
[Figure: flow chart of the TCP three-way handshake]
The process can be divided into the following three steps:

  1. The client sends a SYN message to the server (SYN=1, seq=X). When it arrives, the server knows that the client can send normally.
  2. The server receives the message from step 1 and sends a SYN/ACK message to the client (SYN=1, ACK=1, ack=X+1, seq=Y). When it arrives, the client knows that the server can both send and receive normally.
  3. The client receives the message from step 2 and sends an ACK message to the server (ACK=1, ack=Y+1, seq=X+1). At this point the client has confirmed that the server can send and receive normally, and once the message arrives the server has confirmed the same about the client.
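
To make the numbering in the three steps concrete, here is a small simulation that only builds the three messages as dictionaries. It does not open a real connection; the function name `three_way_handshake` and the initial sequence numbers 100 and 300 are made up for illustration.

```python
def three_way_handshake(x: int, y: int) -> list[dict]:
    """Build the three handshake messages for initial sequence numbers x (client) and y (server)."""
    # Step 1: client -> server, request to establish a connection.
    syn = {"from": "client", "SYN": 1, "seq": x}
    # Step 2: server -> client, acknowledges x and announces its own sequence number.
    syn_ack = {"from": "server", "SYN": 1, "ACK": 1, "seq": y, "ack": syn["seq"] + 1}
    # Step 3: client -> server, acknowledges y.
    ack = {"from": "client", "ACK": 1, "seq": x + 1, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]


for message in three_way_handshake(x=100, y=300):
    print(message)
```

Each acknowledgment number is the other side's sequence number plus one, which is how each party proves that it actually received the previous message.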

1.2.1 SYN, seq, ack, ACK in the message

The following fields are used in the TCP three-way handshake:
① SYN (synchronize): when it is 1, the message is a request to establish a connection.
② seq (sequence number): identifies the data sent by the sender itself.
③ ack (acknowledgment number): identifies the data received from the other party; it is the sequence number the sender expects to receive next.
④ ACK (acknowledgment flag): when it is 1, the message confirms that the request has been received.

1.2.2 Why do we need to use the three-way handshake?

The three-way handshake ensures that the connection between the server and the client works in both directions, and prevents a connection from being set up on the basis of wrong messages.

With only one or two handshakes, the following would happen instead:

  • One handshake: the server only knows that the client can send.
  • Two handshakes: the server only knows that the client can send, while the client knows that the server can both send and receive. The server still cannot tell whether its own messages reach the client.

So a three-way handshake is required. In everyday terms, the exchange is:
Client: Server, are you there? I want to connect.
Server: I am here, and I will establish a connection with you.
Client: OK, I have received your reply, and we can start communicating.

1.3 Send the HTTP request

When a URL is entered in the browser's address bar, a GET request is generally issued: opening a web page simply fetches the page data from the server's backend so that the browser can parse it. Once the TCP connection between the server and the client has been established, the corresponding HTTP request is sent.
Since we are on the subject of HTTP requests, let's look at how interface requests work in a bit more depth.
Request methods include GET, POST, PUT, DELETE, OPTIONS, HEAD, and others. Most projects mainly use GET and POST.
A project's request address has the form <protocol (http or https)>://<host>:<port>/<path (the path of the page or interface)>?<parameters (the query parameters carried by a GET request)>.
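
As a quick illustration of that structure, Python's standard `urllib.parse` module can split an address into the same parts. The URL below is a made-up example, not one taken from a real project.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical request address, used only for illustration.
url = "https://www.example.com:8080/user/list?page=2&size=10"

parts = urlsplit(url)
print(parts.scheme)    # protocol: https
print(parts.hostname)  # host: www.example.com
print(parts.port)      # port: 8080
print(parts.path)      # path: /user/list
print(parse_qs(parts.query))  # parameters: {'page': ['2'], 'size': ['10']}
```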

For example, when we open Baidu and look at the console, we can see the request address and request method for the Baidu homepage:
[Figure: request address and request method of the Baidu homepage in the console]

1.3.1 Differences between GET and POST requests in HTTP

GET:
1. It is generally used to retrieve data, and the response can be cached when the requested data is a static resource.
2. The request parameters are placed in the URL, where the user can see them, so it is not very secure.
3. The length of the request data is limited in practice, because browsers and servers cap the length of a URL (HTTP itself does not define a limit).
4. It can be saved as a browser bookmark, because the request address and parameters are both in the URL, which is also kept in the browsing history.
5. Going back or refreshing in the browser has no effect on the data.
6. The parameters are URL-encoded (application/x-www-form-urlencoded).
7. Only ASCII characters can be transmitted in the parameters; other characters have to be encoded first.
8. When the request is sent, the HTTP header and data are generally sent together (one TCP packet).

POST:
1. It is generally used for operations such as adding, modifying, and deleting data.
2. The request parameters are sent in the body of the HTTP message rather than written in the URL, so it is relatively more secure.
3. HTTP places no limit on the length of the request data (the practical limit is determined by the browser and the server).
4. It cannot be saved as a browser bookmark, because the request parameters are not part of the request address, and the page will report an error if the bookmarked address is visited.
5. When the browser goes back or refreshes, the request is sent to the backend again, which modifies the data in the database a second time.
6. The encoding formats include not only application/x-www-form-urlencoded but also multipart/form-data.
7. There is no restriction on the type of data in the parameters.
8. When the request is sent, some clients send the HTTP header first and only send the data after the backend returns status code 100, indicating that it may continue (two TCP packets). This depends on the client and is not required by HTTP.
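
The difference in where the parameters travel can be seen by building both kinds of request with Python's standard `urllib`, without actually sending anything. This is only a sketch: the address `http://example.com/search` and the parameter names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"q": "café", "page": 1}
# Non-ASCII characters are percent-encoded, so only ASCII ends up on the wire.
encoded = urlencode(params)
print(encoded)  # q=caf%C3%A9&page=1

# GET: the parameters are appended to the URL, and there is no body.
get_req = Request("http://example.com/search?" + encoded)

# POST: the same parameters are placed in the message body instead.
post_req = Request(
    "http://example.com/search",
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_req.get_method(), get_req.full_url, get_req.data)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

Neither request object is sent here; passing one to `urllib.request.urlopen` would perform the actual request.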

1.4 The server processes the request and responds accordingly

Generally, once the request has been processed, and provided there is no cross-origin problem, missing interface, or request timeout, the backend returns a status code, a msg (operation message), and data.
The status codes fall into the following classes:

  • 1XX: (Informational) The request has been received and processing continues.
  • 2XX: (Success) The request succeeded and the corresponding data has been returned.
  • 3XX: (Redirect) Further action is required to complete the request.
  • 4XX: (Client error) The request address is written incorrectly, or a parameter is wrong (value, type, or format), so the request cannot be carried out.
  • 5XX: (Server error) The server failed to fulfill a legitimate request.
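
The class of a status code is determined by its first digit, which a small helper can make explicit. The function name `status_class` and the label strings are chosen here for illustration only.

```python
def status_class(code: int) -> str:
    """Map an HTTP status code to its class, based on the first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return classes[code // 100]


for code in (100, 200, 302, 404, 500):
    print(code, status_class(code))
```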

For example, when we open Baidu and look at the console, we can see the status code of the request for the Baidu homepage:
[Figure: status code of the Baidu homepage request in the console]

Personal experience sharing:
When the front end calls an interface and the status code returned is not 2XX, the following checks help narrow the problem down step by step.
1. First check whether the backend's response indicates which lines of PHP raised an error; if so, talk to the backend developer.
2. Whether the interface request address is written correctly.
3. Whether the request method should be GET or POST.
4. Whether the parameters are placed correctly: in the body of the HTTP message, or appended to the URL.
5. If it is a GET request, whether the transmitted data has been converted to ASCII.
6. Whether the corresponding encoding format is correct.
7. Whether the transmitted parameters, values, types, and formats are consistent with the interface documentation written by the backend.
8. If all of the above check out, talk to the backend developer to see whether there is a problem in the business logic that processes the data.

1.5 The browser parses the data and renders the page

Here is a general overview of the process:
[Figure: flow chart of browser parsing and rendering]
The process is divided into the following parts:

  1. Parse the HTML data and generate the corresponding DOM tree.
  2. Parse the CSS data and generate the corresponding CSSOM tree.
  3. Traverse the nodes at every level of the DOM tree, compute all the styles that apply to each node, and generate a DOM tree in which every node carries its final style. (The DOM tree and the CSSOM tree are combined.)
  4. For the styled DOM tree, compute each node's coordinates on the page and its front-to-back position, and generate the corresponding Layout tree.
  5. Finally, generate the drawing instructions for the page, divide the page area into blocks, rasterize the blocks (with GPU acceleration, determining the RGB value of every pixel), and draw the page.
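
Step 3, combining the DOM tree with the style rules, can be sketched with a toy tree walk. This is a heavy simplification under stated assumptions, not how a browser engine works: the `Node` class and `compute_styles` function are hypothetical, rules are matched by tag name only, and every property is treated as inherited from the parent.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    tag: str
    children: list["Node"] = field(default_factory=list)
    style: dict[str, str] = field(default_factory=dict)  # filled in by compute_styles


def compute_styles(
    node: Node,
    rules: dict[str, dict[str, str]],
    inherited: Optional[dict[str, str]] = None,
) -> None:
    """Walk the DOM tree and attach a final style to every node."""
    # Start from the parent's style, then let the node's own rules override it.
    node.style = {**(inherited or {}), **rules.get(node.tag, {})}
    for child in node.children:
        compute_styles(child, rules, node.style)


# Toy DOM tree: <body><p><span></span></p><h1></h1></body>
span = Node("span")
paragraph = Node("p", [span])
heading = Node("h1")
body = Node("body", [paragraph, heading])

# Toy style rules keyed by tag name.
rules = {
    "body": {"color": "black", "font-size": "16px"},
    "p": {"color": "gray"},
    "h1": {"font-size": "32px"},
}

compute_styles(body, rules)
print(span.style)     # color from <p>, font-size from <body>
print(heading.style)  # color from <body>, its own font-size
```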

The underlying process from parsing the data to rendering the page is fairly involved, and I will describe these steps in detail later.

1.6 Four waves to close the TCP connection

Closing the TCP connection takes four waves. The flow of TCP's four-way wave:
[Figure: flow chart of the TCP four-way wave]
The process can be divided into the following four steps:

  1. The client sends a FIN message to the server (FIN=1, seq=x). The client thereby tells the server that it wants to disconnect, and once the message arrives the server knows it.
  2. The server receives the message from step 1 and sends an ACK message (ACK=1, seq=y, ack=x+1) to the client. The client now knows that the server has received the request to close the connection, and waits for the server to finish transmitting its last data.
  3. After a while, the server sends an ACK/FIN message (ACK=1, FIN=1, seq=z, ack=x+1) to the client. The client now knows that the server has finished transmitting its last data, and the server waits for the client to confirm this.
  4. After receiving the message from step 3, the client sends an ACK message (ACK=1, ack=z+1, seq=x+1) to the server. The client then waits for a period of time before switching to the closed state, while the server switches to the closed state as soon as it receives this confirmation (the message from step 4).
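
The numbering in the four steps can be checked with a small simulation that only builds the four messages; no real connection is closed. The function name `four_way_wave` and the sample sequence numbers are made up, and the state lists use the standard TCP state names for each side.

```python
def four_way_wave(x: int, y: int, z: int) -> list[dict]:
    """Build the four closing messages for sequence numbers x (client), y and z (server)."""
    # Step 1: client -> server, request to disconnect.
    fin = {"from": "client", "FIN": 1, "seq": x}
    # Step 2: server -> client, confirms the request; the server may still send data.
    ack = {"from": "server", "ACK": 1, "seq": y, "ack": x + 1}
    # Step 3: server -> client, the server has finished sending.
    fin_ack = {"from": "server", "FIN": 1, "ACK": 1, "seq": z, "ack": x + 1}
    # Step 4: client -> server, confirms the server's FIN.
    last_ack = {"from": "client", "ACK": 1, "seq": x + 1, "ack": z + 1}
    return [fin, ack, fin_ack, last_ack]


# States each side passes through while closing.
CLIENT_STATES = ["ESTABLISHED", "FIN-WAIT-1", "FIN-WAIT-2", "TIME-WAIT", "CLOSED"]
SERVER_STATES = ["ESTABLISHED", "CLOSE-WAIT", "LAST-ACK", "CLOSED"]

for message in four_way_wave(x=500, y=700, z=750):
    print(message)
```

The gap between `y` and `z` reflects any data the server still sends between its second and third messages.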

1.6.1 FIN in the message

seq, ack, ACK, and FIN are used in TCP's four-way wave (seq, ack, and ACK are described in section 1.2.1 on the three-way handshake, so only FIN is explained here):
① FIN (finish): when it is 1, the sender is requesting to close the connection.

1.6.2 Why do you need to use four waves?

The four waves prevent the connection from being closed before data transmission is complete, avoid wasting resources, and make sure that both the client and the server can shut down normally.

With only one, two, or three waves, the following would happen instead:

  • One wave: the client cannot confirm whether the server has received the disconnection request, or whether the server has finished transmitting all of its data.
  • Two waves: the client knows that the server has received its first wave, but the connection cannot be closed yet, because the server may not have finished transmitting all of its data.
  • Three waves: the client knows that the server is aware of the disconnection request and has received all of the server's data, but it has not told the server that the last batch of data arrived.

So four waves are needed. In everyday terms, the exchange is:
Client: Are you there? I need to disconnect.
Server: I know you need to disconnect. The data this time is XXX.
Server: The data this time is YYY. I have sent you all the data, so you can get ready to disconnect.
Client: OK, I have received the last data YYY. I will wait a while in case you did not hear me, and if there is no problem I will close the connection. You can close your side as soon as you get this message, without waiting for anything further from me.

1.6.3 Why does the server enter CLOSE-WAIT when it receives the first wave?

At this point the server cannot be sure that all data between the client and the server has been transmitted. So instead of immediately sending a FIN=1 message and closing the connection, it enters the CLOSE-WAIT state.

1.6.4 Why does the client enter TIME-WAIT after sending the fourth wave, instead of closing immediately?

The client waits in case the packet for the fourth wave is lost, in which case the server resends its ACK/FIN packet for the third wave.
If the client closed immediately, it would not receive the resent message, and the server would keep waiting for the client's fourth wave. The server's side of the connection would then stay open and waste resources.


Origin blog.csdn.net/Ak47a7/article/details/130307948