Application of the HTTP protocol in front-end development

 

1. Chrome Developer Network Tab

As the most commonly used development and debugging tool for front-end developers, Chrome DevTools offers powerful features for almost every part of front-end work, which makes development and troubleshooting far easier. The Network tab is one of its most frequently used panels: through its XHR, JS, CSS, Img and other sub-tabs, we can capture every network request the page makes over the application-layer HTTP/HTTPS protocols, and inspect all the header information and content of each request and response.

The Network tab shows all the properties of each HTTP request, including:

 

The Connection ID is the ID of the underlying transport-layer TCP connection; it will come up again in the next chapter.

 

Headers mainly shows the request's status, plus the header information of both the request and the response. These headers are the basis on which the two parties in an HTTP exchange operate:

Most of the keys in the headers will be familiar to experienced developers and need no introduction here, but two of them deserve a mention:

content-type matters a great deal: it is the key that describes the MIME format of the data being exchanged. In real development, if we send a request and the server appears to receive nothing, while everything else about the request looks fine, a mismatch between the front-end's and back-end's content-type is the most likely cause.
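As a sketch of the fix (the endpoint is the one used later in this article; the payload is illustrative), the point is to set Content-Type explicitly so the server-side body parser recognizes the payload:

```javascript
// Hypothetical payload; the point is the explicit Content-Type header.
// Without it, many server-side body parsers (e.g. Express's express.json())
// skip the body entirely and the handler sees an empty object.
const payload = { name: "demo", count: 3 };

const request = new Request("https://localhost:3000/api/syncsystemstatus", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(payload),
});

// The header travels with the request and tells the server how to parse the body.
console.log(request.headers.get("content-type")); // "application/json"
```

The same headers option works identically for fetch(); with XMLHttpRequest the equivalent is xhr.setRequestHeader("Content-Type", "application/json").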

Referer is also useful: it is the key that describes the domain from which the request originates.

1. It lets a site collect statistics on where its visitors come from;

2. It makes it possible to restrict access to a resource (anti-hotlinking). For example: I once took the URL of a picture from Qzone and put it in an <img /> on a page served by my own HTTP server. I did not get that picture; it was replaced with a placeholder image saying access was restricted. In other words, when Qzone's server receives a resource request it inspects the Referer, and a request initiated from a page outside Qzone cannot obtain the target image.

Incidentally, "Referer" is itself a misspelling: the correct word is "referrer", i.e. the introducer, describing the page under which a resource request or a jump to a URL was performed. The misspelling was never corrected, in order to stay backward compatible with the existing HTTP protocol.

Note that when we access a resource directly from the browser's address bar, the Referer is empty: there is no real referrer in that case, because the request was conjured out of thin air rather than linked from anywhere else.
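A hedged sketch of such a server-side check (the domain whitelist and the empty-Referer policy are illustrative assumptions, not Qzone's actual logic):

```javascript
// Anti-hotlinking sketch: allow only requests whose Referer belongs to a
// whitelisted site. Policy decision: requests with no Referer at all
// (e.g. typed directly into the address bar) are allowed through here.
const ALLOWED_HOSTS = ["qzone.qq.com", "qq.com"];

function allowImageRequest(referer) {
  if (!referer) return true; // empty Referer: direct access, let it through
  try {
    const host = new URL(referer).hostname;
    return ALLOWED_HOSTS.some((h) => host === h || host.endsWith("." + h));
  } catch {
    return false; // malformed Referer header
  }
}
```

A server would run this on each image request and serve the "access restricted" placeholder whenever it returns false.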


Response shows the raw content of the server's response, while Preview shows that content formatted according to the MIME type agreed in the Content-Type header, which is convenient for developers to browse:

Cookies shows the cookies the browser sent in this request's headers, as well as any cookies the HTTP server set on the browser side:

 

Timing shows the life cycle of the entire request, from preparation to completion:

Experienced developers can get quite a lot of useful information from Headers, Preview, Response and Cookies. The Timing tab sits closer to the bottom of the stack, showing the whole process of issuing an HTTP request from the browser side. According to Chrome's official documentation, the stages in Timing are as follows:

1. Queuing

If a request is queued, it means one of the following:

1) The request was deferred by the rendering engine because it is considered lower priority than critical resources (such as scripts and styles); this often happens with images.
2) The request was put on hold while waiting for a TCP socket that is about to be released.
3) The request was put on hold because of a browser limitation: under HTTP/1.1 there can be at most six concurrent TCP connections per origin (this issue is addressed in the next chapter).
4) Time was spent generating disk cache entries (usually very fast).

2. Stalled/Blocking
The time spent waiting before the request could be sent, for any of the reasons it entered the queue. This time also includes time spent on proxy negotiation.

3. Proxy Negotiation
The time spent negotiating a connection with the proxy server.

4. DNS Lookup

The time taken to perform the DNS lookup. Every new domain on a page requires a full round trip for DNS resolution. With no local DNS cache this can take a while; but once the hostname is mapped in the hosts file, or on a second visit while the browser's DNS cache is still warm, this time drops to 0.

5. Initial Connection / Connecting
The time required to establish the connection, including TCP handshakes/retries and SSL negotiation.

6. SSL
The time taken to complete the SSL handshake, if the request is over HTTPS.

7. Request Sent / Sending
The time spent transmitting the request over the network. Usually a fraction of a millisecond.

8. Waiting (TTFB)
The time spent waiting for the initial response, also known as Time To First Byte: the time until the first byte of the response is received. Besides the wait for the server to produce a response, it also captures the network delay in getting that data back from the server. A high TTFB usually means either 1) poor network conditions between client and server, or 2) a slow server-side program.

9. Content Download / Downloading
The time taken to receive the response data, starting when the first byte arrives and ending when the last byte has been downloaded.

By understanding each stage of issuing a request and receiving its response, we can pinpoint where a given HTTP request is spending its time and fix the corresponding problem.
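These stages map closely onto the W3C Resource Timing fields that the browser also exposes programmatically. A small sketch that derives the main durations from a PerformanceResourceTiming-like entry (the field names are standard; the sample numbers are made up):

```javascript
// Derive the Timing-tab stages from standard Resource Timing fields.
function timingBreakdown(e) {
  return {
    dnsLookup: e.domainLookupEnd - e.domainLookupStart,
    initialConnection: e.connectEnd - e.connectStart,
    // secureConnectionStart is 0 when the request was not over TLS
    ssl: e.secureConnectionStart > 0 ? e.connectEnd - e.secureConnectionStart : 0,
    ttfb: e.responseStart - e.requestStart,
    contentDownload: e.responseEnd - e.responseStart,
  };
}

// Illustrative entry; in the browser you would read real entries from
// performance.getEntriesByType("resource").
const sample = {
  domainLookupStart: 10, domainLookupEnd: 30,
  connectStart: 30, secureConnectionStart: 45, connectEnd: 80,
  requestStart: 80, responseStart: 300, responseEnd: 340,
};
console.log(timingBreakdown(sample));
```

This is handy for collecting the same breakdown from real users' browsers that the Timing tab shows during development.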


2. The interaction between client and server over the HTTP protocol

In RFC 2616, the specification of the HTTP protocol, HTTP is an application-layer protocol, with TCP/IP recommended and used by default as the transport layer; however, any other reliable transport-layer protocol may also carry HTTP. That is to say, if UDP were made "reliable", HTTP could run over UDP as well. In practice, the popular browsers on the market follow the recommendation and use TCP/IP as the transport layer.
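Concretely, "HTTP over TCP" means the request is just structured text written to a TCP socket. A sketch of the text the browser transmits for the GET examined in this chapter (headers abbreviated; the exact set a real browser sends will differ):

```javascript
// Raw HTTP/1.1 request text: a request line, header lines, then a blank
// line terminating the header block. This is what travels over the TCP
// connection, byte for byte.
function buildRawRequest(method, path, host, headers = {}) {
  const lines = [
    `${method} ${path} HTTP/1.1`,
    `Host: ${host}`,
    ...Object.entries(headers).map(([k, v]) => `${k}: ${v}`),
    "", // blank line ends the headers
    "", // (no body for a GET)
  ];
  return lines.join("\r\n");
}

const raw = buildRawRequest("GET", "/api/syncsystemstatus", "localhost:3000", {
  Accept: "application/json",
});
```

Writing this string to a connected TCP socket (e.g. Node's net module) and reading the reply is, at bottom, all an HTTP/1.1 client does.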

Below is a captured HTTPS GET request to https://localhost:3000/api/syncsystemstatus issued via XMLHttpRequest:

 

As mentioned in the previous chapter, the Connection ID is the ID of the TCP connection, indicating which TCP connection the resource request was completed over.

Usually we use Fiddler, Charles or Chrome DevTools to capture traffic, but only at the HTTP/HTTPS level. Here we use Wireshark to capture the lower-level protocol traffic and analyze the connection through its whole life, from establishment to teardown. Wireshark's capture looks like this:

 

Note: the author uses webpack's dev-server as a forward proxy in front of localhost:3000 with HTTPS turned on. Since the backend server does not use HTTPS, the dev-server-to-server leg is plain HTTP/1.1 rather than HTTPS. 192.168.11.94 is the IP of the dev-server, which can be treated as localhost:3000, i.e. the client browser; 192.168.100.101 is the destination the dev-server forwards to, i.e. the HTTP server receiving the request. Simply put, this example is an HTTP/1.1 request from the browser (192.168.11.94) to the server (192.168.100.101) through an XMLHttpRequest object.

The client-server interaction process is as follows:

The No.x numbers refer to the leftmost column of Wireshark's packet list, which numbers each packet in this capture in increasing order.

No.1: The browser (192.168.11.94) sends a connection request to the server (192.168.100.101): it sends a SYN packet, enters the SYN_SENT state, and waits for the server's confirmation. This is the first step of TCP's three-way handshake.

No.2: The server (192.168.100.101) responds to the browser (192.168.11.94): it acknowledges the browser's SYN (ACK=J+1) and sends a SYN of its own, i.e. a SYN+ACK packet, asking the browser to confirm in turn; the server enters the SYN_RECV state. This is the second step of the three-way handshake.

No.3: The browser (192.168.11.94) responds to the server's (192.168.100.101) SYN+ACK by sending the confirmation packet ACK (ACK=K+1). Once this packet is sent, both browser and server enter the ESTABLISHED state. This is the third step of the three-way handshake; the handshake is complete and the TCP connection is successfully established.

 

No.4 : The browser (192.168.11.94) makes an HTTP request to the server (192.168.100.101).

No.5 : The server (192.168.100.101) receives the request from the browser (192.168.11.94), confirms it, and then starts sending data.

No.6: The server (192.168.100.101) sends the browser (192.168.11.94) a 200 status response, indicating that the data transfer completed successfully, with a content-type header saying the response body should be parsed as JSON. At this point our code can detect that the XHR's readyState is 4 and its status is 200, obtain the complete data returned by the server, and apply it to front-end logic or page display.

This corresponds to the request sequence diagram from the Chrome Developer Network tab mentioned in the first chapter:

1. Initiating the first request and establishing the connection: No.1 to No.4 correspond to steps 5 to 7 in the sequence diagram. The XHR's readyState goes through 0-2: the request is initialized, sent, and the connection established.

2. With the TCP connection established, data is transferred over HTTP: No.5 corresponds to steps 8 to 9 in the sequence diagram; the XHR's readyState is 3 while data is being received. Once the transfer is complete, readyState becomes 4 and status is 200.

The same applies to requests initiated with Fetch, except that Fetch is a Promise-based wrapper: readyState and status can be thought of as internal state used to decide whether to resolve or reject. The author's project actually uses Fetch, but the XMLHttpRequest object, i.e. Ajax, is used here because it is easier to follow.
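A minimal XHR sketch of the readyState/status check described above, against the URL used in this chapter (the completion test is factored into a helper):

```javascript
// A response is usable once the request is DONE (readyState 4)
// and the server reported success (status 200).
function isRequestComplete(readyState, status) {
  return readyState === 4 && status === 200;
}

// Browser-only usage sketch; guarded so it is a no-op outside the browser.
if (typeof XMLHttpRequest !== "undefined") {
  const xhr = new XMLHttpRequest();
  xhr.open("GET", "https://localhost:3000/api/syncsystemstatus");
  xhr.onreadystatechange = () => {
    if (isRequestComplete(xhr.readyState, xhr.status)) {
      const data = JSON.parse(xhr.responseText); // Content-Type says JSON
      console.log(data);
    }
  };
  xhr.send();
}
```

With Fetch the same logic collapses into the promise chain: a non-2xx status leaves response.ok false, and response.json() plays the role of the JSON.parse above.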

A schematic of the TCP three-way handshake for No.1 to No.3:
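In text form, the three packets exchange sequence numbers like this (J and K are the initial sequence numbers chosen by each side):

```text
Browser (192.168.11.94)                  Server (192.168.100.101)
        |                                          |
        |  ---- SYN, seq=J ----------------------> |   No.1  (SYN_SENT)
        |                                          |         (SYN_RECV)
        |  <--- SYN, seq=K, ACK=J+1 -------------  |   No.2
        |                                          |
        |  ---- ACK=K+1 -------------------------> |   No.3  (ESTABLISHED)
        |                                          |
```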

SYN: Synchronize Sequence Numbers.

SYN_SENT: requesting a connection. To use a service on another machine, you must first send a synchronization signal (SYN) to the target port; the state at that point is SYN_SENT, and it becomes ESTABLISHED if the connection succeeds.

ACK: Acknowledgement character. In data communication, a transmission control character sent by the receiving station to the sending station, indicating that the transmitted data has been received without error. In TCP/IP, when the receiver successfully receives data it replies with an ACK; the ACK has a fixed format and length and is sent by the receiver back to the sender.

No.4 is the first HTTP packet, showing that the HTTP exchange is carried on top of the established TCP connection.


3. The problem of subsequent requests being unable to start because earlier requests block them

In the project the author is currently developing, data flows from an underlying Go interface up to a Node.js layer, and from Node.js to the front-end. Before the underlying interface was optimized, a single request took at least 500 ms, around 800 ms on average, and sometimes more than 1 s. The front-end is a React.js-based SPA. To keep data accurate, every view requests its data immediately on entry, and the background also maintains a CronJob that refreshes data every 15 s. Rapidly switching routes between views can therefore create a large number of HTTP requests in a short time, and under HTTP/1.1 performance is extremely bad: blocking is severe.

By default HTTP/1.1 allows only six concurrent TCP connections per origin, and a connection can only be reused by a request for another resource after the current one finishes. This is a big improvement over HTTP/1.0, but not a fundamental change. Taking this project as an example: if 10 requests are fired at once, only the first 6 are assigned TCP connections; the remaining 4 must wait for one of those 6 to release its connection before they can proceed. In other words, if the fastest of the first 6 requests takes 1 s, the remaining 4 are stuck pending for at least 1 s. With an impatient user hammering the UI, this becomes a nightmare:
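The queuing arithmetic above can be sketched as a small simulation (the six-connection pool and the 1 s durations are the numbers from this example, not measurements):

```javascript
// FIFO model of HTTP/1.1 connection queuing: at most `poolSize` requests
// in flight; a later request starts only when the earliest in-flight
// request releases its connection.
function simulateQueue(durationsMs, poolSize = 6) {
  const finishTimes = []; // finish time of each in-flight request
  return durationsMs.map((d) => {
    let start = 0;
    if (finishTimes.length >= poolSize) {
      start = Math.min(...finishTimes); // wait for a connection to free up
      finishTimes.splice(finishTimes.indexOf(start), 1);
    }
    finishTimes.push(start + d);
    return start;
  });
}

// 10 requests fired at once, each taking 1 s:
const starts = simulateQueue(Array(10).fill(1000));
console.log(starts); // → [0, 0, 0, 0, 0, 0, 1000, 1000, 1000, 1000]
```

The first six start immediately; the other four spend a full second in Pending before they even begin, exactly the pattern visible in the waterfall.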

 

The blocking is clearly quite serious. In addition, each HTTP request here gets its own TCP connection, which is closed once the request completes, and a new connection is created for the next request; the resource overhead is huge.

Taking the getsnapshot interface as an example, the request completes in about 84 ms when nothing is blocking:

But once blocking sets in:

 

uh... so scary. 

 

After switching to HTTPS, the browser and the dev server negotiate the HTTP/2 protocol by default:
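In the author's setup HTTP/2 comes from webpack's dev-server; as a general illustration, on an nginx server the protocol is enabled on the TLS listener like this (server name and certificate paths are placeholders):

```nginx
server {
    # HTTP/2 is negotiated via ALPN during the TLS handshake,
    # which is why switching to HTTPS is the prerequisite.
    listen 443 ssl http2;
    server_name example.com;

    ssl_certificate     /path/to/fullchain.pem;
    ssl_certificate_key /path/to/privkey.pem;
}
```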

 

Under the same rapid-fire operation, the browser fires a large number of requests in a short time, yet the TCP connection with ID 2693483 handles all the resource requests concurrently and stays open throughout. Under HTTP/2, the number of concurrently handled requests and the throughput improve to a completely different order of magnitude compared with HTTP/1.x: the cost of repeatedly creating TCP connections is largely eliminated and network bandwidth is used far more efficiently. With HTTP/2 multiplexing, the browser handles heavy bursts of concurrent requests much more smoothly.

 

Okay, let's end this.

There are many more differences between HTTP/1.0, HTTP/1.1 and HTTP/2, and the changes between versions are substantial, including header compression, keep-alive optimizations, and the binary framing layer. Interested readers can search for relevant material for a deeper study. This article is only an introduction through cases encountered in real projects, not an explanation of the HTTP protocol itself.

 
