URL to Page: Exploring the Mysterious Process of Loading Web Pages

When we enter a URL in the browser's address bar and press Enter, the requested page eventually appears on screen.

What happens in between? Let's analyze it step by step.

The main process is as follows:

  1. Enter the URL
  2. DNS resolution
  3. Establish a TCP connection
  4. The client sends an HTTP request
  5. The server processes the request, computes the response, and returns it
  6. The browser renders the page
  7. Close the connection

This article is only an overview of the whole process. It explains what happens between entering the URL and displaying the page, but does not cover each sub-process in detail. Those are summarized in my previous blog posts, listed below; refer to them if anything is unclear.

First understanding of network planning (understanding the basic process of network transmission)

Network transport layer protocol: UDP and TCP

IP protocol and Ethernet, DNS

HTTP Protocol of Measuring Network

1. Enter URL

When you enter text in the address bar and press Enter, the browser makes the following checks on the input:

  1. Check whether the input is a valid URL or a keyword to be searched.
  2. If it is a valid URL, check whether it is complete. If not, the browser guesses the missing parts (such as the scheme) and completes the URL.
  3. If it is a keyword to be searched, the browser builds a search URL using the default search engine set by the user.
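The three checks above can be sketched in Python. This is only a rough illustration, not how any real browser decides: the function name `resolve_address_bar_input`, the "contains a dot and no spaces" heuristic, and the `SEARCH_TEMPLATE` value are all assumptions made for the example.

```python
from urllib.parse import quote_plus, urlsplit

# Assumed default search engine; a real browser reads this from user settings.
SEARCH_TEMPLATE = "https://www.baidu.com/s?wd={query}"


def resolve_address_bar_input(text: str) -> str:
    """Return the URL a browser might navigate to for the given input."""
    text = text.strip()
    # Case 1: already a complete URL with an explicit http(s) scheme.
    if urlsplit(text).scheme in ("http", "https"):
        return text
    # Case 2: looks like a bare host name (no spaces, contains a dot),
    # so complete it by guessing the scheme.
    if " " not in text and "." in text:
        return "http://" + text
    # Case 3: treat everything else as search keywords.
    return SEARCH_TEMPLATE.format(query=quote_plus(text))
```

With this sketch, `www.baidu.com` is completed to `http://www.baidu.com`, while a phrase containing spaces is turned into a search URL.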

2. Perform DNS domain name resolution

DNS consists of resolvers and domain name servers, and its job is to translate domain names into IP addresses.

The URL we enter in the browser usually contains a domain name. Before the browser can send its GET request, the domain name must be resolved to an IP address, so DNS resolution is the first thing that happens after we press Enter.

In layman's terms, people find it easier to remember a website's name, such as www.baidu.com, than its IP address, such as 167.23.10.2, while computers work with IP addresses rather than names. By analogy, DNS is like a phone book: to reach the domain name www.baidu.com, you look it up in the phone book and learn that its phone number (IP address) is 167.23.10.2.
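As a small illustration of the "phone book" lookup, the sketch below asks the operating system's resolver for the addresses of a host name using Python's standard `socket.getaddrinfo`. The helper name `resolve` is made up for this example, and the browser's own resolver and caches are not modeled.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Ask the operating system's resolver for the IP addresses of a host."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses: list[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:  # keep the order, drop duplicates
            addresses.append(ip)
    return addresses
```

Calling `resolve("www.baidu.com")` requires network access, and the addresses returned depend on your resolver and location, so no fixed output is shown here.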

3. Encapsulation

After the browser gets the IP address corresponding to the domain name, it can construct the HTTP request and hand it over to the transport layer. There, the client picks a random source port (1024~65535) and initiates a TCP connection request (three-way handshake) to port 80 of the server's web program (the default port on which the server listens for HTTP requests). The data is then passed to the network layer, where the IP protocol encapsulates it into an IP datagram, and on to the data link layer, which frames it and hands it to the network card as a stream of bits. Finally, the network card converts the bits into electrical, optical, or microwave signals and sends them onto the network.
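To make the transport-layer step concrete, here is a minimal sketch that builds a raw HTTP GET request and sends it over a TCP connection. To stay self-contained it talks to a throwaway listener on the loopback interface instead of a real web server on port 80; the function names `build_http_request`, `send_over_tcp`, and `demo` are assumptions for this example. The lower layers (IP, Ethernet, physical signals) are handled by the operating system and the network card and are not visible here.

```python
import socket


def build_http_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: keep-alive",
        "",  # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def send_over_tcp(server_addr: tuple[str, int], payload: bytes) -> int:
    """Connect, send the payload, and return the client's source port."""
    # create_connection() triggers the TCP three-way handshake.
    with socket.create_connection(server_addr, timeout=5) as conn:
        source_port = conn.getsockname()[1]  # chosen by the operating system
        conn.sendall(payload)
    return source_port


def demo() -> tuple[int, bytes]:
    """Run the exchange against a throwaway listener on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        request = build_http_request("www.baidu.com")
        source_port = send_over_tcp(listener.getsockname(), request)
        peer, _addr = listener.accept()
        with peer:
            received = peer.recv(4096)
    return source_port, received
```

`demo()` returns the source port the operating system chose for the client together with the bytes the listener received, which should match the request that was built.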

4. Transmit the data

During transmission, the data passes through network devices such as switches and routers:

  • A switch unpacks the data up to the data link layer, re-encapsulates it, and continues to forward it.
  • A router unpacks the data up to the network layer, looks up the destination IP of the datagram in its routing table to find a suitable next hop, re-encapsulates the data, and sends it on. The TTL is reduced by 1 each time the datagram is forwarded (TTL is a field in the IP header that limits how long a packet may stay in the network; when it reaches 0, the packet is dropped).
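The router's behavior can be sketched as a small function. The routing table `ROUTES`, the next-hop labels, and the function name `forward` are invented for illustration; the lookup uses longest-prefix matching, which is the usual rule for choosing among overlapping routes, although the text above does not name it.

```python
import ipaddress
from typing import Optional

# Hypothetical routing table: (destination network, next hop label).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "default-gateway"),
    (ipaddress.ip_network("167.23.0.0/16"), "router-B"),
    (ipaddress.ip_network("167.23.10.0/24"), "router-C"),
]


def forward(dest_ip: str, ttl: int) -> Optional[tuple[str, int]]:
    """Return (next hop, new TTL), or None if the packet must be dropped."""
    ttl -= 1  # every router decrements the TTL by 1
    if ttl <= 0:
        return None  # TTL expired: drop the packet
    dest = ipaddress.ip_address(dest_ip)
    matches = [(net, hop) for net, hop in ROUTES if dest in net]
    # Longest-prefix match: prefer the most specific matching network.
    _net, hop = max(matches, key=lambda item: item[0].prefixlen)
    return hop, ttl
```

For example, `forward("167.23.10.2", 64)` selects the most specific route and returns a TTL of 63, while a packet arriving with a TTL of 1 is dropped.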

5. The server receives the request, calculates the response according to the request, repackages, and returns the response

The server obtains the client's HTTP request by listening on a port. After the TCP connection with the client is established, the server starts receiving the data sent by the client. The data first enters the network card and then the TCP/IP protocol stack of the kernel, where it is demultiplexed (the stack identifies the connection and strips the headers layer by layer). Finally, the server parses the HTTP request from the received data, finds the requested resource, builds an HTTP response from it, and sends the response back to the client browser, encapsulating it layer by layer on the way down.
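The application-level part of this step (parse the request, find the resource, build the response) might look like the sketch below. The in-memory `RESOURCES` table and the function name `handle_request` are assumptions for the example; a real web server also handles headers, errors, and many more methods.

```python
# Hypothetical in-memory "resources" standing in for files on the server.
RESOURCES = {"/": b"<html><body>Hello</body></html>"}


def handle_request(raw_request: bytes) -> bytes:
    """Parse the request line, look up the resource, and build a response."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    method, path, _version = request_line.split(" ")
    if method != "GET":
        status, body = "405 Method Not Allowed", b"Method Not Allowed"
    elif path in RESOURCES:
        status, body = "200 OK", RESOURCES[path]
    else:
        status, body = "404 Not Found", b"Not Found"
    headers = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return headers.encode("ascii") + body
```

The returned bytes are what the server hands back to its TCP socket; the kernel then encapsulates them layer by layer, just as on the client side.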

This completes one HTTP exchange. The server then decides whether to close the TCP connection according to the Connection field in the HTTP request. When the value of the Connection field is keep-alive, the server does not close the connection immediately, so that subsequent requests can reuse it.
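The decision can be sketched as a small helper. The function name `should_keep_alive` is made up, and the sketch assumes the Connection field carries a single value; it also applies the HTTP defaults when the field is absent (persistent for HTTP/1.1, not persistent for HTTP/1.0).

```python
def should_keep_alive(headers: dict[str, str], http_version: str = "HTTP/1.1") -> bool:
    """Decide whether to keep the TCP connection open after responding."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}
    connection = normalized.get("connection", "")
    if connection == "close":
        return False
    if connection == "keep-alive":
        return True
    # No explicit value: HTTP/1.1 connections are persistent by default,
    # HTTP/1.0 connections are not.
    return http_version == "HTTP/1.1"
```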

6. The browser receives the response, renders the page, and finally presents a complete page

After the browser receives the response, it again demultiplexes the data layer by layer. The browser then parses the HTML file to build the page skeleton (the DOM tree), while JavaScript is parsed and executed by the browser's JavaScript engine, such as V8 in Google Chrome.

If the parser encounters references to external resources during parsing, such as images or externally linked static resources like CSS, the browser sends further requests to the server. These requests fetch the resources the page needs, such as images, videos, and JS files.

These requests are asynchronous and generally do not block the loading of the HTML page.
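How a parser discovers the extra resources to request can be illustrated with Python's standard `html.parser` module. This is only a sketch: the class name `ResourceCollector`, the helper `find_resources`, and the small tag-to-attribute table are assumptions, and a real browser engine recognizes many more kinds of references.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collect URLs of external resources referenced by an HTML document."""

    # Tag name -> attribute that holds the resource URL (simplified).
    URL_ATTRS = {"img": "src", "script": "src", "link": "href", "video": "src"}

    def __init__(self) -> None:
        super().__init__()
        self.resources: list[str] = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.resources.append(value)


def find_resources(html: str) -> list[str]:
    """Return the resource URLs in document order."""
    parser = ResourceCollector()
    parser.feed(html)
    parser.close()
    return parser.resources
```

Each URL collected this way would correspond to one additional request from the browser to a server.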

7. Disconnect

At this point, the whole process is over. Finally, the TCP connection between the browser and the server is closed through the four-way handshake (four waves).

Origin blog.csdn.net/Trong_/article/details/131090473