A taste of HTTP's past and present

In every era, those who keep learning will never be treated unfairly.

Hello everyone, I am yes.

The HTTP protocol is everywhere on today's Internet, silently supporting the operation of the online world behind the scenes, and for us programmers it is an old acquaintance.

We often say that architectures keep evolving and that demand drives the iteration and progress of technology. The same is true of the HTTP protocol.

Have you ever wondered how the HTTP protocol was born, what it looked like at the beginning, and how it developed, step by step, into today's HTTP/3?

What has it been through along the way?

Today I want to walk through the evolution of HTTP with you and see how it grew from an infant into the force that now rules the Internet.

But before that, let's take a brief look at the history of the ancestor of the Internet, ARPANET, which is quite interesting in its own right.

The ancestor of the Internet: ARPANET

In the 1950s, communication researchers recognized the need for communication between different computer users and networks, which prompted research on distributed networks, queuing theory, and packet interaction.

On February 7, 1958, US Secretary of Defense Neil McElroy issued the Department of Defense Order No. 5105.15, establishing the Advanced Research Projects Agency (ARPA).

A study sponsored by the IPTO (Information Processing Techniques Office), one of ARPA's core institutions, led to the development of ARPANET.

Let's take a look at this history.

In 1962, the director of ARPA hired J. C. R. Licklider as the first director of IPTO. Licklider was one of the first to foresee modern interactive computing and its many applications.

IPTO funded research into advanced computer and network technology, commissioning thirteen research groups to work on human-computer interaction and distributed systems. Each group's budget was thirty to forty times a normal research grant.

With money like that, the researchers must have been brimming with energy!

In 1963, Licklider funded a research project called Project MAC, which aimed to explore the possibility of building communities on time-sharing computers.

The project had a lasting impact on IPTO and the wider research community, becoming a prototype for wide-area networking.

Licklider's vision of a global network also greatly influenced his successors at IPTO.

In 1964, Licklider moved to IBM, and Ivan Sutherland became the second director. Sutherland was the creator of the revolutionary Sketchpad program, a pioneer of interactive computer graphics. In 1965 he signed an IPTO contract with Lawrence Roberts of MIT to further develop computer network technology.

Subsequently, Roberts and Thomas Merrill linked the TX-2 computer at MIT to the Q-32 computer in California over a dial-up telephone connection, realizing the first exchange of data packets between distant computers.

In 1966, Bob Taylor became the third director. He was deeply influenced by Licklider; as it happened, Taylor was also a psychoacoustician, just like Licklider.

Taylor's IPTO office had three different terminals connected to three different research sites, and he realized that this architecture would severely limit his ability to expand access to more sites.

So he began thinking about a single terminal connected to a network that could reach multiple sites, and from his position in the Pentagon he had the means to realize that vision.

ARPA director Charles Herzfeld promised Taylor that if IPTO could get the work organized, he would provide one million dollars to build a distributed communications network.

Taylor was delighted. Greatly impressed by Roberts' work, he invited him to join and lead the effort, but Roberts wasn't interested.

Displeased, Taylor asked Herzfeld to have the director of Lincoln Laboratory pressure Roberts into reconsidering, which eventually softened Roberts' attitude; he joined IPTO as chief scientist in December 1966.

On June 3, 1968, Roberts submitted to Taylor a plan describing how to build ARPANET. Eighteen days later, on June 21, Taylor approved the plan, and fourteen months after that ARPANET was up and running.

Once ARPANET was developing smoothly, Taylor handed over the management of IPTO to Roberts in September 1969.

Roberts later left ARPA to become the CEO of Telenet, and Licklider returned to IPTO once more as director, closing out the organization's life cycle.

And that brings this chapter of history to a close. As you can see, Roberts, the father of ARPANET, was pressured into accepting the task, and ended up creating ARPANET, the ancestor of the Internet.

Thanks to Licklider's vision and the money that funded the technology, ARPA became not only the birthplace of the Internet but also of major achievements in computer graphics, parallel processing, and computer flight simulation.

History is full of such coincidences, and that is what makes it interesting.

The history of the Internet

In 1973, ARPANET expanded beyond the United States; the first foreign computers to connect were in Britain and Norway, and the network gradually became the backbone of global interconnection.

In 1974, Robert Kahn of ARPA and Vinton Cerf of Stanford proposed the TCP/IP protocol.

In 1986, the National Science Foundation (NSF) established the backbone network NSFNET between universities. This was an important step in the history of the Internet. NSFNET became the new backbone, and ARPANET was retired in 1990.

In 1990, Tim Berners-Lee (let's just call him Mr. Li) created all the tools needed to run the World Wide Web: the Hypertext Transfer Protocol (HTTP), the Hypertext Markup Language (HTML), the first web browser, the first web server, and the first website.

From then on, the Internet was on the road to rapid development, and HTTP began its great journey.

There is plenty more interesting history, such as the first browser war, which I may talk about another time. Today our protagonist is HTTP.

Next, let's take a look at the evolution of the major versions of HTTP and see how it has grown to where it is today.

The HTTP/0.9 era

In 1989, Mr. Li published a proposal in which he put forward three concepts that now seem entirely ordinary.

  • URI, the Uniform Resource Identifier, serving as a unique identifier for resources on the Internet.
  • HTML, the Hypertext Markup Language, for describing hypertext.
  • HTTP, the Hypertext Transfer Protocol, for transmitting hypertext.

Mr. Li then put his ideas into practice, built all of these, and called the result the World Wide Web.

It was the early days of the Internet, and both the processing power of computers and network speeds were very limited, so HTTP could not escape the constraints of its era: the design was extremely simple, and it used a plain-text format.

Mr. Li's idea at the time was that documents were stored on the server and we only needed to fetch them from it. Hence there was only "GET", no request headers were needed, and once the request had been answered, the deal was done, so the connection was closed after each response.

This is why HTTP was designed as a text protocol with only "GET" at the start, closing the connection after every response.
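To make this concrete, here is a minimal sketch of an HTTP/0.9 exchange (illustrative Python; the function names are mine, not part of any spec):

```python
def build_http09_request(path):
    # An HTTP/0.9 request is a single line: "GET", the path, and CRLF.
    # No version number, no headers, no body.
    return f"GET {path}\r\n".encode("ascii")

def parse_http09_response(raw):
    # An HTTP/0.9 response is just the document bytes: no status line,
    # no headers. The end of the document is signaled by the server
    # closing the TCP connection.
    return raw.decode("utf-8")
```

Because the server closes the connection after each response, every document fetch pays the full cost of a new TCP connection, which is exactly the limitation later versions set out to fix.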

To us this protocol looks hopelessly rudimentary, but at the time it was a big step in the development of the Internet. Going from nothing to something is always the hardest part.

At this point HTTP had no version number. The name HTTP/0.9 was applied retroactively, to distinguish it from the versions that came later.

The HTTP/1.0 era

People's needs are endless. As images and audio arrived, browsers kept improving to support them.

In 1995, Apache was developed, simplifying the task of setting up an HTTP server. More and more people were getting online, which in turn pushed the HTTP protocol to change.

Demand prompted the addition of various features to meet users' needs, and after a series of drafts, HTTP/1.0 was officially released in 1996.

Dave Raggett led the HTTP working group starting in 1995. He hoped to extend the protocol with extended operations, extended negotiation, richer meta-information, and a security protocol, made more efficient by the addition of extra methods and header fields.

Therefore, the following points are mainly added in the HTTP/1.0 version:

  • New methods such as HEAD and POST were added.
  • Response status codes were added.
  • Headers were introduced, namely request headers and response headers.
  • The HTTP version number was added to the request.
  • Content-Type was introduced, so the transmitted data was no longer limited to text.

As you can see, the new methods filled in the semantics of different operations. For example, HEAD fetches only the meta-information without transferring the whole content, which improves efficiency in some scenarios.

The response status code lets the requester know the server's state and distinguish the cause of a failed request instead of being left guessing.

The introduction of headers made requests and responses more flexible; separating control data from the business payload is also a form of decoupling.

Carrying the version number is a mark of engineering discipline; it shows the protocol was on the right track, because without version numbers it could not be managed as it evolved.

The introduction of Content-Type supports transmitting different types of data, enriching the protocol's payloads and giving users far more to look at.
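Together these additions look like this on the wire. A small sketch (the helper names are mine, purely illustrative):

```python
def build_http10_request(method, path, headers):
    # HTTP/1.0 added the version number, new methods, and request headers.
    lines = [f"{method} {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

def parse_status_line(response):
    # HTTP/1.0 responses open with a status line:
    # version, numeric status code, and a reason phrase.
    version, code, reason = response.split(b"\r\n", 1)[0].split(b" ", 2)
    return version.decode(), int(code), reason.decode()
```

A HEAD request built this way carries headers and a version number, and the status line of the response tells the client exactly what happened — two of the gaps HTTP/0.9 left open.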

But HTTP/1.0 was not yet a standard, and it had no real binding force; the various players simply didn't buy in. In plain words: and who made you the boss?

The HTTP/1.1 era

The HTTP/1.1 version was first recorded in RFC 2068 in 1997. The first browser war from 1995 to 1999 greatly promoted the development of the Web.

As HTTP/1.0 was used, it evolved into HTTP/1.1, and in 1999 the earlier RFC 2068 was obsoleted by the newly released RFC 2616.

As the version number suggests, this was a minor update. It was driven mainly by HTTP/1.0's big performance problem: every resource request required a new TCP connection, and requests could only be made serially.

Therefore, the following points are mainly added in the HTTP/1.1 version:

  • Added connection management: keep-alive, which allows connections to be reused (persistent connections).
  • Added support for pipelining: a second request can be sent without waiting for the response to the first.
  • Allowed response data to be chunked: if Content-Length is not given in the response, the client keeps reading until the server signals the end of the body, which helps with transferring large files.
  • Added cache control and management.
  • Added the Host header, for when multiple hosts are deployed on one machine and several domain names resolve to the same IP; the Host header identifies which host the request is meant for.
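The chunked format in the third point is simple enough to sketch: each chunk is a hex length line followed by that many bytes, and a zero-length chunk terminates the body. A minimal decoder (ignoring optional trailers):

```python
def decode_chunked(body):
    # Each chunk is: <hex length>\r\n<data>\r\n; a zero-length chunk ends the body.
    out = bytearray()
    while True:
        size_line, body = body.split(b"\r\n", 1)
        size = int(size_line.split(b";")[0], 16)  # ignore optional chunk extensions
        if size == 0:
            break
        out += body[:size]
        body = body[size + 2:]  # skip the chunk data and its trailing CRLF
    return bytes(out)
```

Because each chunk announces its own length, the server can start streaming a response before it knows the total size — which is precisely what makes this useful for large files.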

As you can see, the browser war pushed the Web forward and also exposed HTTP/1.0's shortcomings. Network bandwidth and hardware kept improving, and the protocol could not be allowed to hold the hardware back.

Therefore, HTTP/1.1 was proposed, mainly to solve performance problems, including support for persistent connections, pipelines, cache management, etc., and also added some features.

Later, in 2014, HTTP/1.1 was revised again. The specification had grown too large and complicated, so it was split into six smaller documents, RFC 7230 through RFC 7235.

By this time HTTP/1.1 had become a standard. Standards are usually established once the major players have stabilized, because a standard means uniformity, and uniformity spares everyone the effort of staying compatible with all sorts of oddities.

Only powerful forces can set standards, and when you are strong enough, you can also set standards to challenge the old standards.

The HTTP/2 era

With the release of HTTP/1.1, the Internet began to explode, and that growth exposed HTTP's shortcomings, chiefly its performance problems, while HTTP/1.1 remained indifferent to them.

This is human inertia, and it matches how our own products usually evolve: when you are strong and comfortable enough, you can't be bothered to change.

Take it or leave it.

At this point Google could no longer stand it: you won't do it? Then I'll do it myself and play my own game. I have a huge user base, I have Chrome, and I have plenty of services of my own.

Google launched the SPDY protocol, and with the confidence that comes from a global browser share of more than 60%, the team developing SPDY publicly stated in July 2012 that it was working toward standardization.

The HTTP side couldn't sit still either, and the Internet standards body began developing a new version of HTTP based on SPDY, finally releasing HTTP/2 in 2015.

The HTTP/2 version mainly adds the following points:

  • It is a binary protocol, no longer plain text.
  • One TCP connection can carry multiple requests, replacing the pipeline.
  • HPACK is used to compress headers, reducing the amount of data transmitted.
  • The server is allowed to push data proactively.

Going from text to binary actually simplifies things: the cost of parsing is lower, the data is more compact, less data crosses the network, and overall throughput goes up.

One TCP connection carrying multiple in-flight requests means multiplexing is supported. Contrast this with HTTP/1.1's pipeline, which still blocks: an earlier response must come back before a later one can.

Multiplexing is fully asynchronous. It reduces the overall round trips (RTT), solves the HTTP-level head-of-line blocking problem, and avoids paying TCP slow start over and over.

HPACK compresses the headers using a static table, a dynamic table, and Huffman coding. Client and server each maintain a table of header fields, so only incremental, compressed header information needs to be sent; the receiver reassembles the complete headers from its tables.
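The table-based idea can be sketched with a toy codec. This is not the real HPACK wire format (real HPACK also has a fixed 61-entry static table and Huffman coding); it only illustrates how a header sent once shrinks to a small index afterwards:

```python
class ToyHeaderCodec:
    """Toy illustration of HPACK's indexing idea: a header sent once is
    later referenced by a small integer instead of retransmitted in full."""

    def __init__(self):
        self.table = []  # dynamic table, mirrored by both endpoints

    def encode(self, headers):
        out = []
        for header in headers:
            if header in self.table:
                out.append(self.table.index(header))  # tiny integer reference
            else:
                self.table.append(header)  # remember it for next time
                out.append(header)         # send the literal once
        return out

    def decode(self, encoded):
        out = []
        for item in encoded:
            if isinstance(item, int):
                out.append(self.table[item])  # resolve index against our copy
            else:
                self.table.append(item)       # learn the new entry
                out.append(item)
        return out
```

After the first request, repeated headers such as a long `user-agent` shrink to single integers, which is where the big savings come from on header-heavy traffic.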

(The original post illustrates HPACK's table-based compression with two figures here, omitted in this version.)

Server push reduces the number of requests. For example, when the client requests 1.html, the server also sends along the JS and CSS that 1.html needs, saving the client from having to come back to ask for the JS and then the CSS.

As you can see, HTTP/2's evolution was driven by performance optimization, because performance was the pain point of the day, and everything evolves where it hurts.

Of course there are exceptions, such as happy accidents, or things built purely out of boredom.

This time the push came from a user of the protocol: if you won't upgrade it, I'll do it myself. I have the capital, so weigh that as you will.

The end result was good: Google later abandoned SPDY and embraced the standard. But HTTP/1.1's legacy weighs heavily, so HTTP/2 is still used by only roughly half of all websites.

The HTTP/3 era

HTTP/2's seat had barely warmed up. Why did HTTP/3 arrive so soon?

It was Google again, surpassing its own work, and again driven by pain points. This time the pain came from TCP, the transport HTTP relies on.

TCP is a reliable, ordered transport protocol, so it has retransmission and ordering mechanisms. In HTTP/2 all streams share a single TCP connection, so there is TCP-level head-of-line blocking: when a retransmission happens, it stalls multiple requests and responses at once.

TCP also identifies a connection by the four-tuple (source IP, source port, destination IP, destination port). On mobile networks the IP address changes frequently, which forces connections to be re-established again and again.

On top of that, the TCP and TLS handshakes stack on each other, adding latency.

The problem was TCP, so Google set its sights on UDP.

We know UDP is connectionless: it doesn't care about ordering and it doesn't care about packet loss. As for TCP, I covered it in an earlier article analyzing TCP's chronic ailments; readers who are fuzzy on TCP can check that out.

Simply put, TCP is too altruistic, or rather too conservative, and now a more aggressive approach was needed.

So what to do? If TCP won't change, route around it: implement TCP's reliability and ordering at the application layer. And so Google developed the QUIC protocol.

The QUIC layer implements its own packet-loss retransmission and congestion control. Moreover, since for security we all use HTTPS anyway, several handshakes are normally required on top of each other.

I mentioned the four-tuple problem above; in the mobile-Internet era the cost of those handshakes is magnified further. So QUIC introduced a Connection ID to identify a connection, which means the connection can be reused after switching networks, achieving 0 RTT: transmission can resume immediately.
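The contrast between four-tuple lookup and Connection ID lookup can be sketched with a toy session store (purely illustrative: real QUIC connection IDs are variable-length byte strings chosen by the endpoints):

```python
# TCP-style: sessions keyed by (src ip, src port, dst ip, dst port).
sessions_by_tuple = {}
# QUIC-style: sessions keyed by Connection ID.
sessions_by_connid = {}

def tcp_lookup(src_ip, src_port, dst_ip, dst_port):
    # A changed client IP means the key no longer matches:
    # the connection must be re-established from scratch.
    return sessions_by_tuple.get((src_ip, src_port, dst_ip, dst_port))

def quic_lookup(conn_id):
    # The Connection ID is independent of the addresses,
    # so it survives a network switch.
    return sessions_by_connid.get(conn_id)

session = {"user": "alice"}
wifi = ("192.0.2.10", 40000, "203.0.113.5", 443)
sessions_by_tuple[wifi] = session
sessions_by_connid["conn-1"] = session
```

If the client now moves from Wi-Fi to cellular and its source IP changes, `tcp_lookup` with the new address finds nothing, while `quic_lookup("conn-1")` still returns the same session.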

Note that this applies only after a handshake with the server has already taken place: the Connection ID was generated earlier, and the 0 RTT arises when the network switches and the existing connection is resumed.

If it is the very first connection, the full set of handshakes is still required. In other words, the so-called 0 RTT applies only when a connection has been established before.

And remember HPACK from HTTP/2: it relies on TCP's reliable, ordered delivery, so QUIC had to build its own variant, QPACK, which likewise uses a static table, a dynamic table, and Huffman coding.

QPACK enriches HTTP/2's static table, growing it from 61 entries to 98.

The dynamic table mentioned above stores header entries that are not in the static table. If a dynamic-table update has not yet been received, decoding headers that reference it would inevitably block.

So QPACK takes a different path: it transmits dynamic-table updates on a dedicated unidirectional stream, and decoding only starts once the update has been fully received. In other words, you leave it alone until it is ready, which prevents getting stuck halfway.

Then there is the TCP head-of-line blocking mentioned earlier. How does QUIC solve it, given that transmission must still be ordered and reliable?

TCP doesn't know which request each flow of bytes belongs to, so it can only block everything. QUIC does know: if a packet of request A is lost, only A is held up; request B's data can be delivered immediately, completely unaffected.
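A toy model of per-stream reassembly shows why. Each stream keeps its own buffer and delivery cursor, so a gap in stream A never stalls stream B (the `(offset, data)` packet shape is my simplification; real QUIC frames and flow control are more involved):

```python
class StreamBuffer:
    """Per-stream in-order reassembly, independent of other streams."""

    def __init__(self):
        self.received = {}   # offset -> bytes, possibly out of order
        self.cursor = 0      # next in-order byte offset to deliver
        self.delivered = b""

    def on_packet(self, offset, data):
        self.received[offset] = data
        # Deliver as far as the contiguous in-order prefix allows.
        while self.cursor in self.received:
            chunk = self.received.pop(self.cursor)
            self.delivered += chunk
            self.cursor += len(chunk)

streams = {"A": StreamBuffer(), "B": StreamBuffer()}
streams["A"].on_packet(5, b"world")  # out of order: bytes 0..4 were lost
streams["B"].on_packet(0, b"hi B")   # stream B delivers immediately
```

Stream A holds its data until the missing bytes are retransmitted, while stream B's delivery is untouched; with a single TCP connection, B would have had to wait behind A's gap.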

As you can see, UDP-based QUIC is powerful and widely used. In 2018, the Internet standards body, the IETF, proposed renaming HTTP over QUIC to HTTP/3, and the proposal was approved.

Demand has again driven technology forward. Constrained by TCP's own mechanisms, our eyes turned to UDP. Will TCP become history?

We will wait and see.

In closing

Today, we have roughly gone through the history of HTTP development and its evolution, and we can see that technology is derived from demand, and demand drives the development of technology.

In essence, it is human inertia, and it grows only when it hurts .

Standards, meanwhile, are pushed by the giants for their own benefit, but standards really do reduce integration costs: unified and convenient.

Of course, as far as HTTP is concerned there are still many, many details and algorithms. For example, with Connection ID: when the four-tuple changes, how do you ensure the request is still forwarded to the same server as before?

So today I've only walked through the broad evolution in simple terms. The concrete implementations are left for your own exploration, or perhaps I'll write more about them when I get the chance.

But compared with those implementation details, I'm more interested in the evolution of the history itself. It lets me learn, from the background and constraints of each era, why a thing was designed the way it was at the start, and so understand it more deeply.

And history is still very interesting, isn't it?


I am yes, from a little bit to a billion bits. See you in the next article.

Shoulders of giants

https://www.livinginternet.com/i/ii_ipto.htm

https://jacobianengineering.com/blog/2016/11/1543/

https://w3techs.com/technologies/details/ce-http2

https://www.verizondigitalmedia.com/blog/how-quic-speeds-up-all-web-applications/

https://www.oreilly.com/content/http2-a-new-excerpt/

https://www.darpa.mil/about-us/timeline/dod-establishes-arpa

https://en.wikipedia.org/wiki/ARPANET

https://en.wikipedia.org/wiki/Internet

In-depth analysis of HTTP/3 protocol, Tao Hui

Perspective of HTTP protocol, Luo Jianfeng

Origin: blog.csdn.net/yessimida/article/details/108810863