HTTP request merging vs. parallel HTTP requests: which is faster?

Disclaimer: this article is shared for learning purposes only. Source: http://note.youdao.com/noteshare?id=4cb871cd644103fa5e722bfdc8b9a7b5&sub=F558147538F84F7995CB6A8CEE0007D5

 

In interviews, candidates are often asked: how do you improve web page performance?

Candidates with some grounding will usually mention this one: reduce/merge HTTP requests.

Follow-up question: can't the browser download resources in parallel? Is merging multiple resources into one and downloading it with a single HTTP request really faster than downloading the unmerged resources in parallel with multiple HTTP requests?

Candidate: ... (so far, I have yet to hear a satisfying answer)

Reducing the number of HTTP requests is the first of Yahoo's 35 rules for front-end performance optimization. Yahoo published these rules in 2006, and they have influenced generation after generation of front-end developers; twelve years later, their influence is still undiminished.

However, another of Yahoo's rules says: split resources to maximize the browser's ability to download in parallel. Here is the problem: reducing HTTP requests cannot mean reducing the resources a page needs (otherwise it would no longer be the same page), so in practice it is achieved mainly by merging resources. One rule recommends merging resources and another recommends splitting them, which is an obvious conflict. So what should we do? Articles on the Internet have discussed this question, but most of them stop at plausible-sounding theoretical analysis and ignore the impact of the TCP transport mechanism. Today I will take you through this question carefully with experiments plus theory.

HTTP request process

The main process of an HTTP request is:

DNS resolution (T1) -> establish a TCP connection (T2) -> send a request (T3) -> wait for the server to return the first byte (TTFB) (T4) -> receive data (T5).

The figure below shows an HTTP request in Chrome DevTools, broken down into these main stages. Note that the Queueing stage is the time the request spends waiting in the browser's queue, which is not counted as part of the HTTP request time.

[Figure 1: Stages of an HTTP request as shown in the Chrome DevTools Network panel]
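For reference, these stages correspond roughly to fields of the browser's Resource Timing API. Below is a minimal sketch (my own helper, not part of the article's experiment code) that splits one resource's timing into approximately the T1-T5 buckets above:

// Split a resource's timing into the T1-T5 stages described above,
// using the standard PerformanceResourceTiming fields.
function stageBreakdown(entry: PerformanceResourceTiming) {
  return {
    dns: entry.domainLookupEnd - entry.domainLookupStart,   // T1: DNS resolution
    tcp: entry.connectEnd - entry.connectStart,             // T2: TCP (and TLS) connection setup
    ttfb: entry.responseStart - entry.requestStart,         // T3 + T4: send request + wait for first byte
    download: entry.responseEnd - entry.responseStart,      // T5: receive data
    total: entry.duration,                                  // full entry duration (responseEnd - startTime)
  };
}

// Usage: log the breakdown for every CSS file loaded by the page.
performance
  .getEntriesByType("resource")
  .filter((e): e is PerformanceResourceTiming => e.name.endsWith(".css"))
  .forEach((e) => console.log(e.name, stageBreakdown(e)));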

From this process, it would appear that merging N HTTP requests into one saves (N - 1) * (T1 + T2 + T3 + T4) of time.

But real scenarios are not so ideal. The above analysis has several holes:

  1. Browsers cache DNS information, so DNS resolution is not required for every request.
  2. The HTTP 1.1 keep-alive feature enables HTTP requests to reuse existing TCP connections, so not every HTTP request needs to establish a new TCP connection.
  3. The browser can send multiple HTTP requests in parallel, which also affects resource download time; the analysis above clearly assumes only one HTTP request at a time.

Experimental proof

Let's run 4 sets of experiments to compare the time it takes one HTTP request to load a merged resource against the time it takes multiple parallel HTTP requests to load the split resources. The resource sizes differ significantly from one set of experiments to the next.

Lab environment:

Server: Alibaba Cloud ECS, 1 core, 2 GB memory, 1 Mbps bandwidth

Web server: Nginx (gzip not enabled)

Browser: Chrome v66, incognito mode with cache disabled

Client network: Wi-Fi, 20 Mbps bandwidth

Experimental code address: https://github.com/xuchaobei/…

Experiment 1

Test files: large1.css, large2.css ... large6.css, 141 KB each; large-6in1.css, merged from those 6 CSS files, 846 KB. parallel-large.html references large1.css through large6.css, and combined-large.html references large-6in1.css; the code is as follows:

<!-- parallel-large.html -->
<!DOCTYPE html>
<html>

  <head>
    <meta charset="utf-8" />
    <title>Parallel Large</title>
    <link rel="stylesheet" type="text/css" media="screen" href="large1.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large2.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large3.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large4.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large5.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large6.css" />
  </head>

  <body>
    Hello, world!
  </body>

</html>
<!-- combined-large.html -->
<!DOCTYPE html>
<html>

  <head>
    <meta charset="utf-8" />
    <title>Combined Large</title>
    <link rel="stylesheet" type="text/css" media="screen" href="large-6in1.css" />
  </head>

  <body>
    Hello, world!
  </body>

</html>
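For reproducibility, the filler CSS files can be generated with a short script. The following is a sketch of my own (a Node script, not code taken from the linked repository), sized for Experiment 1:

// generate-large-css.ts - produce six ~141 KB filler CSS files plus their
// concatenation (~846 KB), matching the sizes used in Experiment 1.
import { writeFileSync } from "fs";

function makeCss(targetBytes: number, prefix: string): string {
  let css = "";
  let i = 0;
  while (css.length < targetBytes) {
    css += `.${prefix}-rule-${i++} { color: #333; padding: 1px; margin: 1px; }\n`;
  }
  return css;
}

const parts: string[] = [];
for (let n = 1; n <= 6; n++) {
  const css = makeCss(141 * 1024, `large${n}`);
  parts.push(css);
  writeFileSync(`large${n}.css`, css);
}
writeFileSync("large-6in1.css", parts.join(""));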

Refresh each of the two pages 10 times, and use the DevTools Network panel to calculate the average loading time of the CSS resources.

Precautions:

  1. The loading time of large1.css, large2.css ... large6.css is measured from the moment the HTTP request for the first resource is sent until all 6 files have finished downloading, as shown in the red box in Figure 2 (a scripted alternative is sketched after this list).
  2. The two HTML pages must not be loaded at the same time, otherwise they would share the bandwidth and skew the results. Wait for one page to finish loading completely, then manually refresh and load the other.
  3. Leave more than 1 minute between two page refreshes, so that HTTP 1.1 connection reuse does not affect the experiment.
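As an alternative to reading the span off the Network panel by hand, the same measurement (from the first CSS request being sent until the last CSS file finishes downloading) could be scripted with the Resource Timing API. A rough sketch, assuming the page references only the CSS files under test:

// Span from the earliest CSS request to the last CSS response, in milliseconds -
// the same interval highlighted by the red box in Figure 2.
function cssLoadSpan(): number {
  const cssEntries = performance
    .getEntriesByType("resource")
    .filter((e): e is PerformanceResourceTiming => e.name.endsWith(".css"));
  if (cssEntries.length === 0) return 0;
  // requestStart can be 0 for cross-origin resources without Timing-Allow-Origin;
  // fall back to startTime in that case.
  const firstRequest = Math.min(...cssEntries.map((e) => e.requestStart || e.startTime));
  const lastResponse = Math.max(...cssEntries.map((e) => e.responseEnd));
  return lastResponse - firstRequest;
}

window.addEventListener("load", () => {
  console.log(`CSS load span: ${cssLoadSpan().toFixed(1)} ms`);
});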

[Figure 2: DevTools Network waterfall for the split resources; the red box marks the measured span from the first request to the last download]

The experimental results are as follows:

                    large-6in1.css    large1.css … large6.css
  average time (s)  5.52              5.30

Next, let's combine large1.css ... large6.css into 3 resources, large-2in1a.css, large-2in1b.css and large-2in1c.css, 282 KB each. combined-large-1.html references these 3 resources:

<!-- combined-large-1.html -->
<!DOCTYPE html>
<html>

  <head>
    <meta charset="utf-8" />
    <title>Parallel Large 1</title>
    <link rel="stylesheet" type="text/css" media="screen" href="large-2in1a.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large-2in1b.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large-2in1c.css" />
  </head>

  <body>
    Hello, world!
  </body>

</html>

After 10 runs, the average loading time is 5.20 s.

The summarized experimental results are as follows:

                    large-6in1.css    large1.css … large6.css    large-2in1a.css … large-2in1c.css
  average time (s)  5.52              5.30                       5.20

From the results of Experiment 1, merging vs. splitting has no significant impact on the total resource loading time. The split-into-three case was the fastest (5.20 s) and the merged-into-one case the slowest (5.52 s), but the gap is only 6%. Given the randomness of the environment and the fact that each case was run only 10 times, this difference does not demonstrate a meaningful gap between the three scenarios.

Experiment 2

Now increase the CSS file sizes.

Test files: xlarge1.css, xlarge2.css, xlarge3.css, 1.7 MB each; xlarge-3in1.css, merged from those 3 CSS files, 5.1 MB. parallel-xlarge.html references xlarge1.css, xlarge2.css and xlarge3.css, and combined-xlarge.html references xlarge-3in1.css.

The test process is the same as above, and the experimental results are as follows:

                    xlarge-3in1.css    xlarge1.css, xlarge2.css, xlarge3.css
  average time (s)  37.72              36.88

The difference in this group is only 2%, even smaller than before, so it certainly does not show any meaningful gap between merging and splitting.

In fact, ideally, as resources get larger, the time required by the two loading strategies converges.

The theory: HTTP runs on top of a TCP connection, and TCP has a slow-start phase, so the available bandwidth is not fully used at first; only after slow start does the transfer gradually occupy the full bandwidth. For large resources the bandwidth ends up fully utilized for most of the transfer, so bandwidth is the bottleneck, and opening more TCP connections cannot make it any faster. The larger the resource, the smaller the share of the total download time taken up by slow start. Most of the time the link is saturated, the total amount of data is the same (the extra headers caused by splitting are completely negligible here), and the bandwidth is the same, so of course the transfer time is roughly the same too.
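To make this concrete, here is a toy calculation (my own simplification, not a real TCP model): assume the amount sent per round trip doubles from an initial flight of about 14.6 KB (the 10-segment initial window discussed later in this article) until it hits a per-round-trip ceiling set by the bottleneck bandwidth, taken here arbitrarily as 100 KB.

// Toy model of a download: the amount sent per round trip doubles (slow start)
// until it reaches a bandwidth-imposed ceiling, then stays there.
function transferRounds(sizeKB: number, initKB = 14.6, ceilingKB = 100) {
  let sent = 0;
  let perRound = initKB;
  let total = 0;
  let slowStart = 0;
  while (sent < sizeKB) {
    sent += perRound;
    total++;
    if (perRound < ceilingKB) slowStart++;        // round spent ramping up
    perRound = Math.min(perRound * 2, ceilingKB); // double until the ceiling is reached
  }
  return { total, slowStart };
}

console.log(transferRounds(5100)); // ~5.1 MB merged file: { total: 53, slowStart: 3 }
console.log(transferRounds(1700)); // one ~1.7 MB split file: { total: 19, slowStart: 3 }

In both cases only 3 of the rounds are spent ramping up; the rest are limited purely by bandwidth, which is why splitting a large download across several connections that share the same bottleneck changes little.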

Experiment 3

Now reduce the CSS file sizes.

Test files: medium1.css, medium2.css ... medium6.css, 9.4 KB each; medium-6in1.css, merged from those 6 CSS files, 56.4 KB. parallel-medium.html references medium1.css through medium6.css, and combined-medium.html references medium-6in1.css.

The experimental results are as follows:

                     medium-6in1.css    medium1.css … medium6.css
  average time (ms)  34.87              46.24

Note that the unit is now milliseconds.

The difference in Experiment 3 is 33%, although in absolute terms it is only about 12 ms. Before analysing further, let's move on to Experiment 4.

Experiment 4

Reduce the CSS files further, down to tens of bytes.

Test files: small1.css, small2.css ... small6.css, 28 B each; small-6in1.css, merged from those 6 CSS files, 173 B. parallel-small.html references small1.css through small6.css, and combined-small.html references small-6in1.css.

The experimental results are as follows:

                     small-6in1.css    small1.css … small6.css
  average time (ms)  20.33             35.00

The difference in Experiment 4 is 72%.

Experiments 3 and 4 show that when resources are small, there is a significant difference between loading a merged resource and loading split resources. Figures 3 and 4 are screenshots of one test run from Experiment 4. When resources are small, the data download time (the blue portion of the horizontal bars in the figures) is a tiny fraction of the total, which means the key factors in resource loading time are DNS resolution (T1), TCP connection establishment (T2), sending the request (T3) and waiting for the first byte from the server (TTFB, T4). Opening multiple HTTP connections at the same time also carries extra overhead, and the DNS lookup and TCP connection times of each request have a certain randomness, so when resources are requested concurrently, the chance that some request takes noticeably longer goes up. In Figure 3, small1.css loads fastest (16 ms) and small5.css slowest (32 ms), twice as long, yet the measured time is based on when all resources have finished loading. Under these conditions, firing many HTTP requests at once leads to more variance and uncertainty, and in practice it is often slower than loading the merged resource with a single HTTP request. A toy simulation of this straggler effect follows the figures below.

[Figure 3: DevTools Network waterfall from an Experiment 4 test run; small1.css loads in 16 ms, small5.css in 32 ms]

[Figure 4: DevTools Network waterfall from an Experiment 4 test run]
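The straggler effect can be illustrated with a tiny simulation of my own (the latency numbers are made up, roughly matching the 16-32 ms spread seen in Figure 3): when completion is defined by the slowest of N parallel requests, the expected finish time grows with N.

// Each request takes a base time plus random jitter; the page is only "done"
// when the slowest request has finished.
function slowestOf(parallel: number, baseMs = 16, jitterMs = 16): number {
  let worst = 0;
  for (let i = 0; i < parallel; i++) {
    worst = Math.max(worst, baseMs + Math.random() * jitterMs);
  }
  return worst;
}

function averageFinish(parallel: number, runs = 10_000): number {
  let sum = 0;
  for (let i = 0; i < runs; i++) sum += slowestOf(parallel);
  return sum / runs;
}

console.log(averageFinish(1).toFixed(1)); // one merged request: ~24 ms on average
console.log(averageFinish(6).toFixed(1)); // six parallel requests: ~30 ms, the slowest one dominates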

A more complex situation

For small files, is merging resources always faster?

In fact, not necessarily. In some cases, merging small files may significantly increase resource loading time.

A bit of theory first. To improve transmission efficiency, the sender on a TCP connection does not wait for the receiver's acknowledgement (ACK) of every packet before sending the next one. TCP introduces the concept of a window: the window size is the maximum amount of data that can be sent without waiting for an acknowledgement. For example, a window size of 4 MSS (Maximum Segment Size, the largest data segment a TCP packet can carry) means 4 segments can be sent back to back without waiting for the receiver's acknowledgement; in other words, those 4 segments are delivered within one network round trip. In the figure below (MSS of 1000 bytes, window size of 4 segments), bytes 1-4000 are sent continuously without waiting for an acknowledgement, and likewise bytes 4001-8000. Note that this is only an idealized illustration; reality is more complicated.

[Figure 5: TCP sliding window, MSS of 1000 bytes and a window of 4 segments]

During slow start, TCP maintains a congestion window variable, and in this phase the send window equals the congestion window. With every network round trip, the congestion window doubles. For example, if the initial congestion window is 1, it grows as 1, 2, 4, 8 ..., as shown below.

[Figure 6: congestion window doubling with each round trip during slow start]

In real networks, the initial congestion window is typically 10 segments, so the congestion window grows as 10, 20, 40 .... The MSS depends on the network topology and hardware; on Ethernet it is generally 1460 bytes, and each segment carries up to MSS bytes of data (it can carry less in practice). So after the first network round trip, at most 10 × 1.46 KB = 14.6 KB has been transferred; after the second, (10 + 20) × 1.46 KB = 43.8 KB; after the third, (10 + 20 + 40) × 1.46 KB = 102.2 KB.
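Under the same assumptions (10-segment initial window, 1460-byte MSS, window doubling every round trip during slow start, no losses or delayed ACKs), the number of round trips needed for a response of a given size can be estimated in a few lines; the function is my own illustration, not code from the article:

const MSS = 1460;         // bytes per segment (typical for Ethernet)
const INITIAL_CWND = 10;  // segments, a common modern default

// How many network round trips does slow start need to deliver `bytes`?
function roundTripsToDeliver(bytes: number): number {
  let cwnd = INITIAL_CWND;
  let delivered = 0;
  let rounds = 0;
  while (delivered < bytes) {
    delivered += cwnd * MSS; // cumulative: 14.6 KB after round 1, 43.8 KB after round 2, 102.2 KB after round 3
    cwnd *= 2;               // slow start: the congestion window doubles each round trip
    rounds++;
  }
  return rounds;
}

console.log(roundTripsToDeliver(9.4 * 1024));   // 1 - each split file in Experiment 3
console.log(roundTripsToDeliver(56.4 * 1024));  // 3 - the merged file in Experiment 3
console.log(roundTripsToDeliver(173));          // 1 - the merged file in Experiment 4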

By this reasoning, in Experiment 4 the transfer completes within one network round trip whether the resources are merged or split. In Experiment 3, however, each split resource is 9.4 KB and fits in one round trip, while the merged resource is 56.4 KB and needs three round trips. If the network latency is high (say 1 s) and bandwidth is not the bottleneck, those two extra round trips add 1 s, and merging resources can end up doing more harm than good. The reason Experiment 3 did not show this is that the network latency in the experiment was only about 10 ms, too small to have a visible impact on the results.

Summary

For large resources, merging or not has no obvious impact on loading time, but splitting makes better use of the browser cache: updating one resource does not invalidate the caches of all the others, whereas after merging, updating any part invalidates the cache of the whole combined resource. In addition, domain sharding can be used to deploy split resources across different domains, which both spreads the load on the servers and reduces the impact of network jitter.

For small resources, merging usually loads faster, but when network bandwidth is good the gain is measured in milliseconds and is negligible. When network latency is high and the server responds slowly, merging can bring real benefit; however, in high-latency scenarios you also need to watch whether the merged resource requires more network round trips, which in turn lengthens the loading time.

Having read this far, whether to merge or to split is no longer the important question. What matters is understanding the principles behind merging and splitting, and the business scenario you are applying them to.


Source: blog.csdn.net/wdm891026/article/details/100133889