The entire process from entering a URL to the page being loaded

 

In brief:

1. DNS name resolution;
2. A TCP connection is established;
3. The browser sends the HTTP request;
4. The server returns the response;
5. The TCP connection is closed;
6. The browser parses the HTML;
7. The browser lays out and renders the page.


Most of us know these steps, but many people are still unclear on the details inside them, so we will go through each one:

Why is DNS name resolution needed?
Most network communication is based on TCP/IP, and TCP/IP is based on IP addresses. Computers communicating on the network can only identify each other by addresses such as "202.96.134.133"; they do not recognize domain names. We cannot remember the IP addresses of more than ten or so sites, so when we visit a site we usually type its domain name into the browser address bar and still get the page we want. This works because a computer called a "DNS server" automatically "translates" the domain name into the corresponding IP address, and the browser then retrieves the page from that IP address.

The Internet Protocol Suite is a network communication model and a whole family of network transport protocols; it is the basic communications architecture of the Internet. It is commonly called the TCP/IP protocol suite (TCP/IP Protocol Suite, or TCP/IP Protocols), or TCP/IP for short, because its two core protocols, TCP (Transmission Control Protocol) and IP (Internet Protocol), were the first standards adopted in the family.


One, DNS name resolution
When we enter a URL in the browser, we are really asking a server for the page content we want, so the browser must first determine which server the domain name corresponds to. Finding the IP address that corresponds to the domain name is the job of the DNS servers.
After the client receives the domain name you entered, it first checks the local hosts file for an entry mapping that domain name to an IP address. If one exists, the request is sent to that IP address; if not, the client asks a DNS server. Ordinary users rarely edit the hosts file.
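
The hosts-first lookup described above can be sketched roughly as follows. This is a minimal illustration, not how any particular operating system implements it: the function names `parse_hosts` and `resolve`, the sample hosts text, and the `intranet.example` entry are all assumptions made up for the example.

```python
import socket

def parse_hosts(text):
    """Parse hosts-file text into a {hostname: ip} mapping."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)  # the first entry for a name wins
    return table

def resolve(hostname, hosts_table):
    """Check the hosts table first; otherwise ask the system resolver (DNS)."""
    ip = hosts_table.get(hostname.lower())
    if ip is not None:
        return ip
    return socket.gethostbyname(hostname)  # falls through to a DNS lookup

# Hypothetical hosts-file content, for illustration only.
SAMPLE_HOSTS = """
127.0.0.1   localhost
10.0.0.5    intranet.example  # made-up entry
"""

table = parse_hosts(SAMPLE_HOSTS)
print(resolve("intranet.example", table))  # answered from the hosts table
```

A name that is not in the table would be passed on to `socket.gethostbyname`, which is where the DNS servers discussed next come in.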

DNS server hierarchy

The browser client sends the local DNS server a DNS query message containing the domain name www.cnblogs.com. The local DNS server forwards the query to a root DNS server. The root DNS server notices the com suffix and returns the IP addresses of the com DNS servers to the local DNS server. The local DNS server then sends the query to a com DNS server, which notices the cnblogs.com suffix and responds with the IP address of the authoritative DNS server responsible for that domain name. The local DNS server queries the authoritative DNS server, and finally sends a response message containing the IP address of www.cnblogs.com to the client.
The query from the client to the local DNS server is a recursive query, while the interaction between the DNS servers is iterative.
Under normal circumstances the local DNS server has already cached the addresses of the com DNS servers, so the request to the root name server is often unnecessary.
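
To make the iterative part concrete, here is a toy simulation of the local DNS server walking from the root down to the authoritative server. Everything in it is an assumption for illustration: the server names, the `SERVERS` table, and the address 203.0.113.10 (taken from a range reserved for documentation) do not reflect real DNS data, and no network traffic is involved.

```python
# Toy zone data; every name and address below is made up for illustration.
SERVERS = {
    "root": {"referrals": {"com": "com-tld"}},
    "com-tld": {"referrals": {"cnblogs.com": "auth-cnblogs"}},
    "auth-cnblogs": {"answers": {"www.cnblogs.com": "203.0.113.10"}},
}

def query(server, name):
    """Ask one server; return ("answer", ip) or ("referral", next_server)."""
    data = SERVERS[server]
    if name in data.get("answers", {}):
        return "answer", data["answers"][name]
    for suffix, next_server in data.get("referrals", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(name)

def local_dns_resolve(name):
    """The local DNS server iterates: it follows referrals until it gets an answer."""
    server, path = "root", []
    while True:
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        server = value  # follow the referral to the next-level server

ip, path = local_dns_resolve("www.cnblogs.com")
print(ip, path)
```

The client only ever calls `local_dns_resolve` once and waits for the result, which corresponds to the recursive side; the loop inside it is the iterative side.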

DNS cache:

Ordered from nearest to farthest from the browser, the caches are: the browser cache, the system cache (including the hosts file), the router cache, the ISP (Internet Service Provider) DNS server cache, the root name server cache, the top-level domain server cache, and the primary (authoritative) name server cache.
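
The "nearest cache first" idea can be sketched as a chain of small caches that are consulted in order. This is only a rough model under stated assumptions: the `TTLCache` class, the `lookup` helper, the 60-second TTL, and the address are hypothetical, and real browser and system caches are considerably more involved.

```python
import time

class TTLCache:
    """A tiny cache whose entries expire after a time-to-live (in seconds)."""

    def __init__(self, name):
        self.name = name
        self._entries = {}  # hostname -> (ip, expiry timestamp)

    def get(self, host, now=None):
        now = time.monotonic() if now is None else now
        entry = self._entries.get(host)
        if entry and entry[1] > now:
            return entry[0]
        self._entries.pop(host, None)  # expired or missing
        return None

    def put(self, host, ip, ttl, now=None):
        now = time.monotonic() if now is None else now
        self._entries[host] = (ip, now + ttl)

def lookup(host, layers, now=None):
    """Return (ip, layer name) from the nearest layer holding a live entry."""
    for layer in layers:
        ip = layer.get(host, now)
        if ip is not None:
            return ip, layer.name
    return None, None  # every layer missed; a real resolver would query DNS here

browser, system = TTLCache("browser"), TTLCache("system")
system.put("www.cnblogs.com", "203.0.113.10", ttl=60, now=0)  # made-up address
print(lookup("www.cnblogs.com", [browser, system], now=10))
```

Passing `now` explicitly is just a convenience so that expiry can be demonstrated without waiting.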

  

Two, TCP connection
After all that effort we finally have the server's IP address; the natural next step is to connect to the server. Any discussion of a TCP connection between client and server has to mention the "three-way handshake."

  • The first handshake: the client requests a connection

  The client sends a connection request segment with the SYN (Synchronize Sequence Numbers) flag set to 1 and a Sequence Number of x. The client enters the SYN_SENT state and waits for the server's confirmation.

  • The second handshake: the server receives the SYN segment

  When the server receives the client's SYN segment it must acknowledge it, so it sets the Acknowledgment Number to x + 1 (Sequence Number + 1). At the same time it sends its own SYN request, with the SYN flag set to 1 and the Sequence Number set to y. The server puts all of this into a single segment (the SYN + ACK segment), sends it to the client, and enters the SYN_RECV state.

  • The third handshake: the client receives the SYN + ACK segment

  After the client receives the server's SYN + ACK segment, it sets the Acknowledgment Number to y + 1 and sends an ACK segment to the server. Once this segment has been sent and received, both client and server are in the ESTABLISHED state and the TCP three-way handshake is complete.
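
The sequence and acknowledgment numbers in the three steps above can be traced with a small simulation. It is a sketch only: the function name `three_way_handshake` and the dictionary layout are assumptions for illustration, and no real packets are sent.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

def three_way_handshake(x, y):
    """Return the three segments and the final states.

    x is the client's initial sequence number, y is the server's.
    """
    segments = []

    # 1. Client -> server: SYN=1, seq=x; the client enters SYN_SENT.
    segments.append({"from": "client", "SYN": 1, "ACK": 0, "seq": x, "ack": None})

    # 2. Server -> client: SYN=1, ACK=1, seq=y, ack=x+1; the server enters SYN_RECV.
    segments.append({"from": "server", "SYN": 1, "ACK": 1,
                     "seq": y, "ack": (x + 1) % MOD})

    # 3. Client -> server: ACK=1, seq=x+1, ack=y+1; both sides end up ESTABLISHED.
    segments.append({"from": "client", "SYN": 0, "ACK": 1,
                     "seq": (x + 1) % MOD, "ack": (y + 1) % MOD})

    return segments, "ESTABLISHED", "ESTABLISHED"

segments, client_state, server_state = three_way_handshake(100, 300)
for segment in segments:
    print(segment)
```

With x = 100 and y = 300, the server acknowledges 101 and the client acknowledges 301, matching the x + 1 and y + 1 rules in the text.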


Once the three-way handshake is complete, the client and server begin transmitting data. A few concepts are important in this process:

  • Half-open connection queue: during the three-way handshake the server maintains a queue of unfinished connections. An entry is created for each client SYN packet (syn = j), indicating that the server has received the SYN, has sent its confirmation to the client, and is waiting for the client's confirmation packet. Connections identified by these entries are in the SYN_RECV state on the server. When the server receives the client's confirmation packet, it deletes the entry and the connection enters the ESTABLISHED state. The backlog parameter is the maximum number of entries this queue can hold.
  • SYN-ACK retransmission count: after the server sends the SYN-ACK packet, if it does not receive the client's confirmation packet it retransmits once; if it waits a while and still has not received the confirmation, it retransmits a second time. Once the number of retransmissions exceeds the system's configured maximum, the system deletes the connection's entry from the half-open connection queue. Note that the waiting time before each retransmission is not necessarily the same.
  • Half-open connection lifetime: the maximum time an entry can stay in the half-open connection queue, that is, the longest time from when the server receives the SYN packet until it declares the connection invalid. This value is the sum of the maximum waiting times of all the retransmitted packets. It is sometimes also called the Timeout time or the SYN_RECV lifetime.
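
As a worked example of "the lifetime is the sum of the waiting times", the sketch below adds up the waits for a SYN-ACK that is never confirmed. The numbers are assumptions chosen for illustration (a 1-second initial wait that doubles each time, and 5 retransmissions); actual values depend on the operating system and its configuration.

```python
def synack_wait_schedule(retries, initial_timeout=1.0, backoff=2.0):
    """Waiting time after the first SYN-ACK and after each retransmission.

    The defaults (1 s initial wait, doubling each time) are illustrative
    assumptions, not values taken from any specific system.
    """
    return [initial_timeout * backoff ** i for i in range(retries + 1)]

waits = synack_wait_schedule(retries=5)
print(waits)       # one wait per transmission of the SYN-ACK
print(sum(waits))  # total time the entry can stay in the half-open queue
```

Under these assumptions the waits are 1, 2, 4, 8, 16 and 32 seconds, so the entry would be dropped after 63 seconds.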

Why a three-way handshake?
In the fourth edition of "Computer Networks", Xie Xiren explains that the purpose of the "three-way handshake" is to prevent a connection request segment that is no longer valid from suddenly reaching the server and causing an error.
"An invalid connection request segment arises like this: the first connection request segment sent by the client is not lost but is held up at some network node for a long time, so that it reaches the server only after the connection has been released. By then it is an invalid segment. But when the server receives this invalid connection request, it mistakes it for a new connection request from the client, so it sends a confirmation segment to the client, agreeing to establish a connection. Suppose the "three-way handshake" were not used: as soon as the server sent its confirmation, a new connection would be established. Since the client has not actually requested a connection, it ignores the server's confirmation and sends no data. The server, however, believes a new transport connection has been established and keeps waiting for the client's data, so a lot of server resources are wasted. The "three-way handshake" prevents this. In the situation just described, the client does not confirm the server's confirmation, and since the server receives no confirmation, it knows that the client did not ask to establish a connection."


Three, sending the HTTP request

HTTP (HyperText Transfer Protocol) is a stateless protocol built on top of a TCP connection. The basic workflow is: the client sends an HTTP request stating the resource it wants to access and the operation it requests; the server receives and processes the request, takes the appropriate action on the resource, and finally returns the result to the client in an HTTP response. One request and its response, from start to finish, is called a transaction; when a transaction ends, a log entry is added on the server.
An HTTP request is sent as a request message with four parts: the request line, the request headers, a blank line, and the request data.

1. The request line
The request line consists of three fields: the request method, the URL, and the HTTP protocol version, separated by spaces. For example:
GET /index.html HTTP/1.1
POST /clues/get_clues_detail HTTP/1.1
The HTTP request methods are GET, POST, HEAD, PUT, DELETE, OPTIONS, TRACE, and CONNECT.

The common ones are:

GET: request a complete resource (common)
HEAD: request only the response headers
POST: submit a form (common)
PUT: (WebDAV) upload a file (HTML forms do not support this method)
DELETE: (WebDAV) delete a resource
OPTIONS: return the methods supported by the requested resource
TRACE: trace the intermediate proxies a request passes through (cannot be issued by the browser)

2. Request headers

The request headers consist of key/value pairs, one per line, with key and value separated by a colon ":". They tell the server about the client and its request. Typical request headers are:

Accept: the list of content types the client accepts.
Accept-Encoding: the content encodings (such as compression) the browser supports.
Cache-Control: the caching directives that requests and responses should follow.
Connection: whether the network connection is closed after the current transaction completes. If the value is "keep-alive", the connection is persistent and stays open, so later requests to the same server can reuse it.
Host: the host name being requested, which allows several domain names to share one IP address (virtual hosting).
Referer: tells the server which page the request was linked from, so the server can use that information in its processing.
Upgrade-Insecure-Requests: tells the server that the client prefers HTTPS, so that an HTTP page containing many HTTP resources can have them upgraded to HTTPS without errors.
User-Agent: the type of browser that generated the request.

3. Blank line

After the last request header comes a blank line: a carriage return and line feed are sent to tell the server that no more request headers follow.

4. The request data
Request data is not used with the GET method but is used with POST. The POST method suits situations where the user fills out a form. The request headers most commonly associated with request data are Content-Type and Content-Length.
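
Putting the four parts together, a request message can be assembled by hand as in the sketch below. The helper name `build_request`, the host `example.com`, and the form body `id=42` are assumptions for illustration; a real client would normally use an HTTP library rather than build the bytes itself.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble request line + headers + blank line + request data."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    headers = dict(headers or {})
    if body:
        # Content-Length tells the server how many bytes of data follow.
        headers.setdefault("Content-Length", str(len(body)))
    lines += [f"{key}: {value}" for key, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"  # the blank line ends the headers
    return head.encode("ascii") + body

# Hypothetical form submission, for illustration only.
raw = build_request(
    "POST",
    "/clues/get_clues_detail",
    "example.com",
    {"Content-Type": "application/x-www-form-urlencoded"},
    b"id=42",
)
print(raw.decode("ascii"))
```

Every line ends with a carriage return and line feed, and the empty line between the headers and `id=42` is what tells the server that the headers are finished.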

 

Four, the server returns the response

An HTTP response likewise consists of three parts: a status line, response headers, and the response body.
1. The status line
HTTP-Version Status-Code Reason-Phrase CRLF
For example: HTTP/1.1 200 OK

Here HTTP-Version is the HTTP protocol version used by the server, Status-Code is the response status code the server sends back, and Reason-Phrase is a text description of the status code. The status code has three digits; the first digit defines the category of the response and has five possible values.

1xx: Informational - the request has been received and processing continues.
2xx: Success - the request has been successfully received, understood, and accepted.
3xx: Redirection - further action is needed to complete the request.
4xx: Client Error - the request contains a syntax error or cannot be fulfilled.
5xx: Server Error - the server failed to fulfill a valid request.
Common status codes and their descriptions are as follows.

200 OK: the client's request succeeded.
301 Moved Permanently: permanent redirect; the Location response header gives the new URL, and clients may remember the redirect for future requests.
302 Found: temporary redirect; the Location response header gives the new URL, but future requests should still use the original URL.
304 Not Modified: for example, when the client's cached copy is compared with the server's resource and no change is found, the server returns 304 to tell the browser that it need not download the resource again and can use its local copy directly.
400 Bad Request: the client's request has a syntax error and cannot be understood by the server.
401 Unauthorized: the request is not authorized; this status code must be used together with the WWW-Authenticate header field.
403 Forbidden: the server received the request but refuses to serve it.
404 Not Found: the requested resource does not exist, for example because the wrong URL was entered.
500 Internal Server Error: an unexpected error occurred on the server.
502 Bad Gateway: a proxy server in front cannot reach the back-end server.
503 Service Unavailable: the server cannot currently handle the client's request, but may return to normal after a while.
504 Gateway Timeout: the proxy can reach the back-end server, but the back-end server does not respond to the proxy within the allotted time.
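
The status line format and the five categories can be illustrated with a short parser. This is a minimal sketch: the function name `parse_status_line` and the `STATUS_CLASSES` table are assumptions for the example, and it does no validation beyond what is shown.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def parse_status_line(line):
    """Split 'HTTP-Version Status-Code Reason-Phrase' and classify the code."""
    version, code, *reason = line.rstrip("\r\n").split(" ", 2)
    code = int(code)
    category = STATUS_CLASSES[code // 100]  # the first digit defines the category
    return version, code, (reason[0] if reason else ""), category

print(parse_status_line("HTTP/1.1 200 OK\r\n"))
print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```

Splitting at most twice keeps a multi-word reason phrase such as "Not Found" in one piece.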

2. The response headers
The following response headers are as shown in the Chrome browser:

Connection: uses the keep-alive feature
Content-Length: the web server tells the browser the length or size of the response body, for example: Content-Length: 2284
Content-Encoding: the resource is compressed with gzip
Content-Type: the MIME type is html and the character set is UTF-8
Date: the date of the response
Server: the web server software in use
Transfer-Encoding: chunked. Chunked transfer encoding is a data transfer mechanism in HTTP that lets the web server split the data it sends to the client application (typically a web browser) into several parts. Chunked transfer encoding is available only in HTTP protocol version 1.1 (HTTP/1.1).
A table of HTTP response and request headers: http://tools.jb51.net/table/http_header
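
To show what "divided into several parts" looks like on the wire, here is a small decoder for a chunked body. It is a sketch under simplifying assumptions: the function name `decode_chunked` is made up, the whole body is assumed to be already in memory, and trailers after the last chunk are ignored.

```python
def decode_chunked(data: bytes) -> bytes:
    """Reassemble a body sent with Transfer-Encoding: chunked.

    Each chunk is: <size in hex> CRLF <size bytes of data> CRLF.
    A chunk of size 0 marks the end of the body.
    """
    body, pos = b"", 0
    while True:
        eol = data.index(b"\r\n", pos)
        size_field = data[pos:eol].split(b";", 1)[0]  # ignore chunk extensions
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            return body  # last chunk; any trailers are ignored in this sketch
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF

wire = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(decode_chunked(wire))
```

Because each chunk carries its own length, the server can start sending before it knows the total size, which is why Content-Length is not needed in this mode.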
Persistent connections
After an HTTP request completes, the server does not immediately close the connection to the client. In HTTP/1.1, Connection: keep-alive is enabled by default. It denotes a persistent connection, so that a new request arriving soon afterwards can be handled without re-establishing the connection; this avoids the cost of connection setup and slow start and improves network throughput. In the Nginx reverse proxy software the default keep-alive timeout is 75 seconds: if no new request arrives within 75 seconds, Nginx closes the connection to the client. Meanwhile, the browser sends TCP keep-alive probes to the server every 45 seconds to check the state of the TCP connection; if no ACK is received in response, it actively closes the connection to the server. Note that although HTTP keep-alive and TCP keep-alive are both keep-alive mechanisms, they are completely different: one works at the application layer and the other at the transport layer.
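
The difference between the two keep-alive mechanisms can be seen in where each one is controlled. In the sketch below, the helper names are assumptions for illustration; the first function only interprets HTTP headers (application layer), while the second sets a socket option that asks the operating system to probe an idle TCP connection (transport layer).

```python
import socket

def http_connection_persists(version, headers):
    """Application layer: does HTTP keep the connection open after a response?

    Assumes the header name is spelled exactly 'Connection' in the dict.
    """
    value = headers.get("Connection", "").lower()
    if version == "HTTP/1.1":
        return value != "close"    # persistent by default in HTTP/1.1
    return value == "keep-alive"   # HTTP/1.0 has to opt in

def enable_tcp_keepalive(sock):
    """Transport layer: ask the OS to send keep-alive probes on this socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0

print(http_connection_persists("HTTP/1.1", {}))
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    print(enable_tcp_keepalive(sock))
```

The probe interval itself (such as the 45 seconds mentioned above) is configured through further, platform-specific socket options that are not shown here.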

 

Five, closing the TCP connection

  • The first wave: the client wants to break up

  Suppose the client wants to close the connection. It sends a packet with the FIN flag set to 1 (FIN = 1, seq = x), meaning it has no more data to send but can still receive data.

  After sending it, the client enters the FIN_WAIT_1 state.

  • The second wave: the server acknowledges

  The server acknowledges the client's FIN packet by sending an acknowledgment packet (ACK = 1, ACKnum = x + 1), indicating that it has received the client's request to close the connection but is not yet ready to close it.

  After sending it, the server enters the CLOSE_WAIT state. When the client receives the acknowledgment packet it enters the FIN_WAIT_2 state and waits for the server to close the connection.

  • The third wave: the server is ready to break up

  When the server is ready to close the connection, it sends the client a request to end the connection, with FIN set to 1 (FIN = 1, seq = y).

  After sending it, the server enters the LAST_ACK state and waits for the final ACK from the client.

  • The fourth wave: the break-up

  The client receives the server's close request, sends an acknowledgment packet (ACK = 1, ACKnum = y + 1), and enters the TIME_WAIT state, in case the ACK packet has to be retransmitted.
  After the server receives the acknowledgment packet, it closes the connection and enters the CLOSED state.
  After waiting 2MSL (twice the Maximum Segment Lifetime) without receiving anything further, the client is confident that the server has closed, so it closes the connection and enters the CLOSED state.
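
The state changes across the four waves can be laid out as a simple trace. This is an illustrative sketch only: the function name `four_way_close` and the tuple layout are assumptions, and it models a close initiated by the client exactly as described above.

```python
def four_way_close(x, y):
    """Trace (sender, segment, client_state, server_state) for each step.

    x and y are the sequence numbers of the client's and the server's FIN.
    Each tuple records the states the text above associates with that step.
    """
    trace = []
    # 1st wave: the client sends FIN and enters FIN_WAIT_1.
    trace.append(("client", f"FIN=1 seq={x}", "FIN_WAIT_1", "ESTABLISHED"))
    # 2nd wave: the server ACKs and enters CLOSE_WAIT; the client moves to FIN_WAIT_2.
    trace.append(("server", f"ACK=1 ack={x + 1}", "FIN_WAIT_2", "CLOSE_WAIT"))
    # 3rd wave: the server sends its own FIN and enters LAST_ACK.
    trace.append(("server", f"FIN=1 seq={y}", "FIN_WAIT_2", "LAST_ACK"))
    # 4th wave: the client ACKs and enters TIME_WAIT; the server closes on receipt.
    trace.append(("client", f"ACK=1 ack={y + 1}", "TIME_WAIT", "CLOSED"))
    # After 2MSL with nothing further arriving, the client closes as well.
    trace.append(("timer", "2MSL elapsed", "CLOSED", "CLOSED"))
    return trace

for step in four_way_close(100, 300):
    print(step)
```

Note that the server reaches CLOSED before the client does; the client's extra TIME_WAIT period exists so that it can resend the final ACK if the server's FIN arrives again.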

 

Six, the browser parses the HTML

After the browser receives the HTTP response sent by the server, it processes the HTML text in the entity body of the response. The parsing steps are as follows:

Document Object Model (DOM)
The page markup is parsed to generate the DOM tree.
Steps for generating the DOM tree:
  Bytes → characters → tokens → nodes → object model (DOM tree)
  Rule for generating the DOM tree:
  Building the DOM tree is a depth-first traversal: the next sibling of the current node is built only after all of the current node's children have been built.
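
The depth-first construction can be imitated with Python's standard `html.parser` module, which turns markup into start-tag, end-tag and text events. This is a rough sketch and not how a browser engine works internally: the `Node`, `TreeBuilder` and `outline` names are assumptions, and the builder assumes every element is explicitly closed (void elements such as a bare img tag would need extra handling).

```python
from html.parser import HTMLParser

class Node:
    def __init__(self, tag, parent=None):
        self.tag, self.parent, self.children = tag, parent, []

class TreeBuilder(HTMLParser):
    """Builds a simple tree from tag events (assumes well-formed, closed tags)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node  # descend: this node's children are built first

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent  # done; the next tag is a sibling

    def handle_data(self, data):
        if data.strip():
            self.current.children.append(Node(f"#text:{data.strip()}", self.current))

def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines += outline(child, depth + 1)
    return lines

builder = TreeBuilder()
builder.feed("<html><body><p>Hello <span>web</span></p><div></div></body></html>")
builder.close()
print("\n".join(outline(builder.root)))
```

In the printed outline, everything under p (including the span and its text) appears before the sibling div, which is the ordering the depth-first rule describes.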

CSS Object Model (CSSOM)
While building the DOM for this simple page, the browser encounters a link tag in the head of the document that references an external CSS style sheet: style.css. Anticipating that it will need this resource to render the page, it immediately requests it, and the following is returned:

body { font-size: 16px }
p { font-weight: bold }
span { color: red }
p span { display: none }
img { float: right }


We could declare styles directly in the HTML markup (inline), but keeping CSS independent of HTML lets us treat content and design as separate concerns: designers work on the CSS, developers focus on the HTML, and so on.
As with HTML, we need to convert the received CSS rules into something the browser can understand and work with. So we repeat the HTML process, this time for CSS rather than HTML.

The CSS bytes are converted into characters, then into tokens and nodes, and finally linked into a tree structure called the "CSS Object Model" (CSSOM).

The browser parses the HTML into a tree data structure, the DOM Tree, and parses the CSS into a tree data structure, the CSSOM, also called the CSS Rule Tree.

The DOM Tree and the CSS Rule Tree are combined to generate the Render Tree.

Nodes with display: none are not added to the Render Tree, while nodes with visibility: hidden are.
• display: none hides the element, and the element does not occupy its original space.
• visibility: hidden hides the element, but the element still occupies its original space.
So if a node is not shown at first, setting it to display: none is better.
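
The rule that display: none nodes are left out of the Render Tree while visibility: hidden nodes stay in can be sketched with plain dictionaries. The node layout, the `build_render_tree` name, and the sample page are assumptions for illustration; the styles are assumed to be already computed for each node.

```python
def build_render_tree(node):
    """Return a pruned copy of the tree, or None if the node is not rendered.

    node is a dict: {"tag": str, "style": dict, "children": list}.
    """
    style = node.get("style", {})
    if style.get("display") == "none":
        return None  # the node and its whole subtree are left out
    children = [
        rendered
        for rendered in map(build_render_tree, node.get("children", []))
        if rendered is not None
    ]
    return {"tag": node["tag"], "style": style, "children": children}

# Hypothetical page, loosely following the style rules shown earlier.
page = {"tag": "body", "children": [
    {"tag": "p", "children": [
        {"tag": "span", "style": {"display": "none"}},
    ]},
    {"tag": "div", "style": {"visibility": "hidden"}},
    {"tag": "img", "style": {"float": "right"}},
]}

tree = build_render_tree(page)
print([child["tag"] for child in tree["children"]])
```

The span disappears from the result, while the hidden div is still present and would therefore still take up space during layout.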

 

Seven, browser layout and rendering

Parse the HTML to construct the DOM tree -> construct the render tree -> lay out the render tree -> paint the render tree

A few concepts are explained here to aid understanding:

  • DOM Tree: the browser parses the HTML into a tree data structure.
  • CSS Rule Tree: the browser parses the CSS into a tree data structure.
  • Render Tree: generated by combining the DOM and the CSSOM.
  • layout: with the Render Tree, the browser knows which nodes are on the page, the CSS definitions of each node, and how the nodes are related, and from this it calculates the position of each node on the screen.
  • painting: following the calculated rules, the content is drawn on the screen via the graphics card.
  • reflow: when the browser finds that a change in some part affects the layout, it has to go back and render again; this going back is called reflow. Reflow starts at the root frame and recurses downward, calculating the geometry and position of every node in turn. Reflow is almost unavoidable. Popular interface effects such as collapsing and expanding tree views (essentially showing and hiding elements) all cause the browser to reflow. Mouse-over, click, and so on: whenever these actions change the space an element occupies, its positioning, its margins, or similar properties, they cause re-rendering inside the element, around it, or even across the whole page. We usually cannot predict exactly which parts the browser will reflow, because they all affect one another.
  • repaint: when an element's background color, text color, border color, or similar is changed without affecting the layout around or inside it, part of the screen is redrawn, but the geometry of the element does not change.

Note:

  1. Nodes with display: none are not added to the Render Tree, while nodes with visibility: hidden are. So if a node is not shown at first, setting it to display: none is better.
  2. Changing display: none triggers a reflow, whereas changing visibility: hidden triggers only a repaint, because no position changes.
  3. In some cases, such as modifying an element's style, the browser does not reflow or repaint immediately; it accumulates a batch of such operations and then performs a single reflow. This is called asynchronous reflow or incremental asynchronous reflow. In other cases, such as resizing the window or changing the page's default font, the browser reflows immediately.
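
The batching behaviour in point 3 can be modelled with a toy engine that queues style changes and counts reflows. This is purely a simulation under stated assumptions: `ToyLayoutEngine` and its methods are invented for the example and are not a browser API, and the rule that reading a layout value forces the queued changes to be applied is an assumption of the model.

```python
class ToyLayoutEngine:
    """Queues style changes and performs one reflow per flushed batch."""

    def __init__(self):
        self.styles = {}
        self.pending = []      # style changes that have not been laid out yet
        self.reflow_count = 0

    def set_style(self, element, prop, value):
        self.pending.append((element, prop, value))  # queued; no reflow yet

    def flush(self):
        if not self.pending:
            return
        for element, prop, value in self.pending:
            self.styles.setdefault(element, {})[prop] = value
        self.pending.clear()
        self.reflow_count += 1  # one reflow covers the whole batch

    def read_layout(self, element, prop):
        self.flush()  # assumption: reading layout forces pending changes to apply
        return self.styles.get(element, {}).get(prop)

changes = [("width", "100px"), ("height", "50px"), ("margin", "4px")]

batched = ToyLayoutEngine()
for prop, value in changes:
    batched.set_style("box", prop, value)
batched.flush()

interleaved = ToyLayoutEngine()
for prop, value in changes:
    interleaved.set_style("box", prop, value)
    interleaved.read_layout("box", prop)  # a read after every write

print(batched.reflow_count, interleaved.reflow_count)
```

In this model the batched version reflows once while the interleaved version reflows three times, which is the intuition behind grouping style writes together.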

 

  To summarize:

  1. The browser parses the HTML into a DOM tree. Building the DOM tree is a depth-first traversal: the next sibling of the current node is built only after all of the current node's children have been built.
  2. The CSS is parsed into the CSS Rule Tree.
  3. The Rendering Tree is constructed from the DOM tree and the CSSOM. Note: the Rendering Tree is not the same as the DOM tree, because things such as the head or elements with display: none do not need to be put in the render tree.
  4. With the Render Tree, the browser knows which nodes are on the page, the CSS definitions of each node, and how the nodes are related. The next step is called layout: as the name suggests, it calculates the position of each node on the screen.
  5. The next step is painting: the render tree is traversed and each node is drawn using the UI backend layer.

 Note:

  The process above is carried out gradually. For a better user experience, the rendering engine puts content on the screen as early as possible; it does not wait until all of the HTML has been parsed before building and laying out the render tree. It parses part of the content and displays that part, while the rest of the content may still be downloading over the network.

 

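Performance optimization for repaint and reflow:

(1) Reflow: when a change in some part is found to affect the layout, the render tree has to be recalculated.

(2) Repaint: changing an element's background color, text color, or other properties that do not affect the layout around or inside the element causes only a repaint: the browser redraws the element according to its new properties to give it its new appearance. A repaint does not cause a new layout and is not necessarily accompanied by a reflow.
Reflow takes more time than repaint and therefore affects performance more, so when writing code, avoid excessive reflows as much as possible.

Causes of reflow:

(1) Page initialization;

(2) Manipulating the DOM;

(3) The size of some elements changes;
(4) CSS properties change.

Reducing reflow/repaint

(1) Do not modify DOM styles one at a time. Instead, define a CSS class in advance and then change the DOM element's className.

(2) Do not read a DOM node's property values inside a loop and use them as loop variables.

(3) Use position: fixed or absolute for animated HTML elements; these elements are taken out of the normal flow, so changing their CSS does not reflow the rest of the page.
(4) Avoid table layouts where possible, because even a very small change can cause the whole table to be laid out again.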

 

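Other

One. According to the HTTP standard, an HTTP request can use a variety of request methods.
HTTP/1.0 defined three request methods: GET, POST, and HEAD.
HTTP/1.1 added five request methods: OPTIONS, PUT, DELETE, TRACE, and CONNECT.
1. GET
Requests the specified page information and returns the entity body.
2. HEAD
Like a GET request, except that the response has no body; used to obtain the headers.
3. POST
Submits data to the specified resource for processing (for example, submitting a form or uploading a file). The data is contained in the request body. A POST request may create new resources and/or modify existing ones.
4. PUT
Replaces the content of the specified document with data sent from the client to the server.
5. DELETE
Asks the server to delete the specified page.
6. CONNECT
Reserved in HTTP/1.1 for proxy servers that can switch the connection to a tunnel.
7. OPTIONS
Allows the client to see the server's capabilities (what it supports).
8. TRACE
Echoes back the request received by the server; mainly used for testing or diagnostics.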

Two. DNS cache invalidation:

https://blog.csdn.net/lock_xuanqing/article/details/80334579

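Three. DNS recursive and iterative queries:
1. Recursive query:
Queries between a client and a server are generally recursive: after the client sends a request to a DNS server, if that DNS server cannot resolve the name itself, it queries other DNS servers and passes the result back to the client.
2. Iterative query (repeated query):
Queries between DNS servers are generally iterative: for example, if DNS2 cannot answer DNS1's request, it gives DNS1 the IP address of DNS3 so that DNS1 can send its request to DNS3.

Example: a student asks Teacher Wang a question and Teacher Wang gives him the answer; this exchange is a recursive query. Teacher Wang may not know the answer either, in which case he asks Teacher Zhang; the queries between the teachers are iterative.

In a recursive query, the user sends a request only to the local DNS server and waits for a positive or negative answer. In an iterative query, the local DNS server sends a request to the root DNS server, which only gives the address of the next-level DNS server; the local DNS server then queries that next-level server, and so on until it obtains the final answer.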

Origin www.cnblogs.com/qiujianmei/p/11654802.html