What happens after you enter a URL in the browser and press Enter? (Ultra-detailed version)

Foreword

  This question has become a cliché, but it still often appears as the final question in interviews. There are many articles about it online, but I had some free time recently, so I wrote up my own notes, and I feel I understand it more thoroughly than before.

  These notes summarize dozens of articles I read over two days, so they are fairly complete. Because I work on the front end, they focus on how the browser renders the page. For the other parts I list keywords that interested readers can look up themselves.

 

Note: The steps in this article assume a plain HTTP request (no HTTPS, no HTTP2), the simplest DNS setup, no proxy, and a server without any problems, although this is unrealistic.

 

The general process

  1. URL parsing

  2. DNS lookup

  3. TCP connection

  4. Processing the request

  5. Receiving the response

  6. Rendering the page

 

One, URL parsing

  Address resolution:

    First, the browser determines whether your input is a valid URL or a keyword to be searched, and performs auto-completion, character encoding, and other operations on what you typed.
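
    As a rough illustration of this step, the sketch below classifies address-bar input. The function name resolveInput, the matching rules, and the search endpoint are all assumptions made for this example; real browsers use far more elaborate heuristics.

```javascript
// Hypothetical search endpoint, a placeholder rather than a real browser setting.
const SEARCH_ENDPOINT = 'http://search.example/?q=';

// Hypothetical helper: decide whether the input is a URL or a search keyword.
function resolveInput(input) {
  const text = input.trim();
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(text);
  // No spaces plus a dot (or "localhost") is treated as a host name here.
  const looksLikeHost =
    /^[^\s\/]+\.[^\s\/]+(\/\S*)?$/.test(text) ||
    /^localhost(:\d+)?(\/\S*)?$/.test(text);
  if (hasScheme || looksLikeHost) {
    // Auto-complete a missing scheme; the URL constructor percent-encodes the path.
    const url = new URL(hasScheme ? text : 'http://' + text);
    return { type: 'url', value: url.href };
  }
  return { type: 'search', value: SEARCH_ENDPOINT + encodeURIComponent(text) };
}

console.log(resolveInput('example.com'));
console.log(resolveInput('how dns works'));
```

    Relying on the built-in URL constructor keeps the character-encoding part out of the example.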

  HSTS

    Because of security risks, the browser applies HSTS, which forces the client to use HTTPS to access the page. See: You do not know HSTS [1].

  Other operations

    The browser also performs some additional operations, such as security checks and access restrictions (domestic browsers previously restricted access to 996.icu).

  Check the cache


Two, DNS queries

  The basic steps

    

 

  1. The browser cache

    The browser first checks its own cache. If there is no hit, it calls a system library function to perform the lookup.

  2. The operating system cache

    The operating system also has its own DNS cache, but before consulting it, the system checks whether the domain name appears in the local Hosts file. If it does, no query is sent to the DNS server.

  3. Router cache

    Routers also have their own cache.

  4. ISP DNS cache

    The ISP DNS is the preferred DNS server configured on the client computer, and in most cases it has the result cached.

  Root name server queries

    If none of the preceding steps has a cached result, the local DNS server forwards the request to the root name servers on the Internet. The following figure illustrates the whole process well:
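
    The lookup order described above can be sketched as a walk through a list of caches. Everything here is illustrative: the Map objects stand in for real caches, the host names are made up, and the IP addresses come from the documentation range 192.0.2.0/24.

```javascript
// Hypothetical caches, checked in the order listed in the text.
const layers = [
  { name: 'browser cache', records: new Map() },
  { name: 'hosts file', records: new Map([['dev.local', '127.0.0.1']]) },
  { name: 'OS cache', records: new Map() },
  { name: 'router cache', records: new Map() },
  { name: 'ISP DNS cache', records: new Map([['example.com', '192.0.2.10']]) },
];

// Walk the layers in order and stop at the first hit.
function lookup(hostname, queryRoot) {
  for (const layer of layers) {
    const ip = layer.records.get(hostname);
    if (ip !== undefined) return { ip, source: layer.name };
  }
  // Nothing cached anywhere: fall back to the root name server query.
  return { ip: queryRoot(hostname), source: 'root name server query' };
}

console.log(lookup('example.com', () => null).source); // ISP DNS cache
```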

 

Points to note

  1. Recursive query: the lookup goes all the way down without returning intermediate answers, and only the final result is returned (this is how the browser queries the local DNS server)

  2. Iterative query: this is how the local DNS server queries the root name servers.

  3. What DNS hijacking is

  4. Front-end optimization: dns-prefetch
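
  To make the iterative query more concrete, here is a minimal simulation in which the local DNS server follows each referral itself. The server names, zones, and address are invented for the example and do not reflect real DNS data.

```javascript
// Hypothetical name servers: each one either answers or refers to another server.
const servers = {
  root: { referrals: { com: 'tld-com' } },
  'tld-com': { referrals: { 'example.com': 'ns-example' } },
  'ns-example': { answers: { 'www.example.com': '192.0.2.10' } },
};

// Iterative query: the resolver asks one server after another.
function resolveIteratively(hostname) {
  let current = 'root';
  const asked = [];
  while (current) {
    asked.push(current);
    const server = servers[current];
    if (server.answers && hostname in server.answers) {
      return { ip: server.answers[hostname], asked };
    }
    // Follow a referral whose zone is a suffix of the host name.
    const zone = Object.keys(server.referrals || {}).find(
      (z) => hostname === z || hostname.endsWith('.' + z)
    );
    current = zone ? server.referrals[zone] : null;
  }
  return { ip: null, asked };
}

console.log(resolveIteratively('www.example.com'));
```

  In a recursive query the client would see only the final answer, whereas here the asked list shows every server the resolver had to contact.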

 

 

Three, TCP connection

  TCP/IP is divided into four layers. When data is sent, each layer wraps the data in turn:

    

 

1. Application layer: sending the HTTP request

  In the previous step we obtained the server's IP address. The browser now starts constructing an HTTP message, which includes:

    • Request header (Request Header): the request method, target address, protocol version, and so on

    • Request body (additional parameters)

  Points to note:

    • Without scripting (links, the address bar, and HTML forms), the browser can only send GET and POST requests, and opening a page uses the GET method
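
  As an illustration of what such a message looks like on the wire, the sketch below assembles the text of an HTTP/1.1 request. The helper name buildRequest and the host and path values are assumptions for this example.

```javascript
// Build the raw text of an HTTP/1.1 request (hypothetical helper).
function buildRequest({ method, path, host, headers = {}, body = '' }) {
  const lines = [`${method} ${path} HTTP/1.1`, `Host: ${host}`];
  for (const [name, value] of Object.entries(headers)) {
    lines.push(`${name}: ${value}`);
  }
  if (body) {
    // Content-Length counts bytes, not characters.
    lines.push(`Content-Length: ${new TextEncoder().encode(body).length}`);
  }
  // A blank line separates the header section from the body.
  return lines.join('\r\n') + '\r\n\r\n' + body;
}

console.log(buildRequest({ method: 'GET', path: '/index.html', host: 'example.com' }));
```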

2. Transport layer: sending TCP segments

  The transport layer initiates a TCP connection to the server. To make the transfer easier, the data is split into segments, and each segment is numbered so that the server can accurately reassemble the message when it receives them.

  Before the connection is established, a TCP three-way handshake takes place.

There are many vivid write-ups and pictures of the TCP three-way handshake online.

Knowledge Point:

  1. SYN flood attack
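
  The three handshake messages mentioned above can be written out as plain objects. This is only a sketch: the initial sequence numbers are arbitrary example values, and real TCP stacks choose them unpredictably.

```javascript
// Simulate the three messages of the TCP handshake.
function threeWayHandshake(clientIsn, serverIsn) {
  const syn = { from: 'client', flags: ['SYN'], seq: clientIsn };
  // The server acknowledges the client's SYN with ack = seq + 1.
  const synAck = { from: 'server', flags: ['SYN', 'ACK'], seq: serverIsn, ack: syn.seq + 1 };
  // The client acknowledges the server's SYN the same way.
  const ack = { from: 'client', flags: ['ACK'], seq: synAck.ack, ack: synAck.seq + 1 };
  return [syn, synAck, ack];
}

console.log(threeWayHandshake(100, 300));
```

  A SYN flood exploits the second step: the attacker sends many SYN messages and never sends the final ACK, leaving the server with half-open connections.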

3. Network layer: IP protocol and MAC address lookup

  The network layer wraps the segment into a packet, adds the source and destination IP addresses, and is responsible for finding the transmission route.

  It determines whether the destination address is on the same network as the current address. If it is, the data is sent directly using the destination's MAC address. Otherwise the routing table is used to find the next-hop address, and the ARP protocol is used to look up its MAC address.

Note: In the OSI reference model ARP is placed in the link layer, but in TCP/IP it is often placed in the network layer.
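
  The same-network check can be sketched with a subnet mask. The function names and all addresses below are assumptions for illustration only.

```javascript
// Convert dotted IPv4 text to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split('.').reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Two addresses are on the same network when their masked prefixes match.
function sameNetwork(a, b, mask) {
  const m = ipToInt(mask);
  return ((ipToInt(a) & m) >>> 0) === ((ipToInt(b) & m) >>> 0);
}

// Pick the address whose MAC must be found via ARP (the gateway is a placeholder).
function nextHop(src, dst, mask, gateway) {
  return sameNetwork(src, dst, mask) ? dst : gateway;
}

console.log(nextHop('192.168.1.10', '192.168.1.200', '255.255.255.0', '192.168.1.1'));
console.log(nextHop('192.168.1.10', '10.0.0.5', '255.255.255.0', '192.168.1.1'));
```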

4. Link layer: Ethernet protocol

  Ethernet protocol

    Under the Ethernet protocol, data is grouped into units called "frames". Each frame has two parts:

      • Header: the sender and receiver of the packet, and the data type

      • Data: the actual content of the packet

  MAC address

    Ethernet requires every device connected to the network to have a network interface card. Packets travel from one card to another, and the card's address is the MAC address. Every MAC address is unique, so it identifies exactly one card.

  Broadcast

    The way data is sent is quite primitive: the packet is sent directly to every machine on the local network (the receiver's MAC address having been found through the ARP protocol). Each machine compares the receiver address in the header with its own MAC address, accepts the packet if they match, and discards it otherwise.

  Note: The receiver's reply is unicast.
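
  The frame header and the accept-or-discard rule described above can be sketched as follows, assuming the common Ethernet II layout (6-byte destination, 6-byte source, 2-byte type). The byte values and MAC addresses are invented for the example.

```javascript
// Format 6 bytes as a colon-separated MAC address.
function formatMac(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join(':');
}

// Split an Ethernet II frame (a Uint8Array) into header fields and payload.
function parseFrame(frame) {
  return {
    destination: formatMac(frame.slice(0, 6)),
    source: formatMac(frame.slice(6, 12)),
    etherType: (frame[12] << 8) | frame[13], // 0x0800 is IPv4, 0x0806 is ARP
    payload: frame.slice(14),
  };
}

// A card accepts a frame addressed to itself or to the broadcast address.
function accepts(ownMac, frame) {
  const { destination } = parseFrame(frame);
  return destination === ownMac || destination === 'ff:ff:ff:ff:ff:ff';
}

const broadcast = Uint8Array.from([
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, // destination: broadcast
  0x00, 0x1a, 0x2b, 0x3c, 0x4d, 0x5e, // source
  0x08, 0x06,                         // type: ARP
  1, 2, 3,                            // payload (placeholder bytes)
]);
console.log(parseFrame(broadcast));
```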

Knowledge Point:

  1. ARP attack

  The server receives the request

    Receiving is the reverse of the steps above; see the figure.

 

Four, the server processes the request

  The general process

    

 

 

 

  HTTPD

    The most common HTTPDs are Apache and Nginx on Linux, and IIS on Windows.

    It listens for requests, and when one arrives it starts a child process to handle it.

  Processing the request

    After the TCP connection is accepted, the server handles the connection, parses the HTTP protocol (request method, domain name, path, and so on), and performs some checks:

      • Verify that a virtual host is configured for the domain

      • Verify that the virtual host accepts this method

      • Verify that the user is allowed to use this method (based on IP address, identity information, and so on)
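
    A minimal sketch of the parsing and the first two checks is shown below. The virtual-host table and the returned labels are hypothetical; real servers such as Apache or Nginx are configured quite differently.

```javascript
// Parse the raw text of an HTTP request into its parts.
function parseRequest(raw) {
  const end = raw.indexOf('\r\n\r\n');
  const [requestLine, ...headerLines] = raw.slice(0, end).split('\r\n');
  const [method, path, version] = requestLine.split(' ');
  const headers = {};
  for (const line of headerLines) {
    const i = line.indexOf(':');
    headers[line.slice(0, i).trim().toLowerCase()] = line.slice(i + 1).trim();
  }
  return { method, path, version, headers, body: raw.slice(end + 4) };
}

// Hypothetical virtual-host table: which methods each host accepts.
const virtualHosts = { 'example.com': ['GET', 'POST'] };

function check(request) {
  const allowed = virtualHosts[request.headers.host];
  if (!allowed) return 'no virtual host';
  if (!allowed.includes(request.method)) return 'method not accepted';
  return 'ok';
}

const request = parseRequest('GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n');
console.log(request.method, request.path, check(request));
```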

  Redirect

    If the server is configured with an HTTP redirect, it returns a 301 permanent redirect response. Based on that response, the browser sends a new HTTP request to the redirect target (repeating the process above).

For more, see this article [2].

  URL Rewrite

    Next the server looks at the URL rewrite rules. If the requested file actually exists, such as an image or an html, css, or js file, the file is returned directly.

    Otherwise, the server rewrites the request according to its rules into a REST-style URL.

    Then, based on the type of dynamic scripting language, it decides which interpreter to call to process the request.

    Taking an MVC framework in PHP as an example, it first initializes some environment parameters, then matches the URL against the routes from top to bottom, and lets the handler defined by the matched route process the request.
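
    The top-to-bottom route matching can be illustrated with a small table. The example is written in JavaScript rather than PHP, and the routes and handlers are made up.

```javascript
// Hypothetical route table, matched from top to bottom.
const routes = [
  { pattern: /^\/users\/(\d+)$/, handler: (id) => `user ${id}` },
  { pattern: /^\/posts\/([\w-]+)$/, handler: (slug) => `post ${slug}` },
  { pattern: /^\/.*$/, handler: () => 'not found' }, // catch-all, must come last
];

// The first pattern that matches decides which handler runs.
function dispatch(path) {
  for (const { pattern, handler } of routes) {
    const match = pattern.exec(path);
    if (match) return handler(...match.slice(1));
  }
}

console.log(dispatch('/users/42')); // user 42
```

    Because matching stops at the first hit, the order of the table matters: moving the catch-all route to the top would shadow every other route.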

 

Five, the browser receives the response

  After the browser receives the resource returned by the server, it analyzes it.

  It first checks the response header and acts according to the status code (for example, the redirect mentioned earlier).

  If the resource is compressed (for example with gzip), it must also be decompressed.

  Then the resource is cached.

  Next, the response content is parsed according to its MIME [3] type (for example, HTML and images are parsed differently).
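
  The order of these decisions can be sketched as a function that returns a list of steps. The step labels are purely illustrative, and the header object is assumed to use lowercase names.

```javascript
// Decide what to do with a response (hypothetical helper).
function planResponse({ status, headers }) {
  if (status === 301 || status === 302) {
    return ['request ' + headers.location + ' again'];
  }
  const steps = [];
  if (headers['content-encoding'] === 'gzip') steps.push('decompress gzip');
  steps.push('store in cache');
  // Strip parameters such as "; charset=utf-8" to get the bare MIME type.
  const mime = (headers['content-type'] || '').split(';')[0].trim().toLowerCase();
  if (mime === 'text/html') steps.push('parse as HTML');
  else if (mime.startsWith('image/')) steps.push('decode as image');
  else steps.push('handle as ' + (mime || 'unknown type'));
  return steps;
}

console.log(planResponse({
  status: 200,
  headers: { 'content-encoding': 'gzip', 'content-type': 'text/html; charset=utf-8' },
}));
```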

Six, rendering the page

  Browser kernel

 

    

 

  Different browser kernels do not render in exactly the same way, but the general process is similar.

The basic flow


6.1. HTML parsing

    First, you need to know that the browser parses the document from top to bottom, line by line.

    The parsing process can be divided into four steps:

  ① Decoding (encoding)

    What comes back over the network is really just binary byte data. The browser needs to convert it into a string according to the specified encoding (for example, UTF-8). That string is the HTML code.
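
    As a small illustration, the standard TextDecoder API performs this conversion. The byte values below are a hand-made example, and the chunk boundary is chosen on purpose to fall inside a multi-byte character.

```javascript
// Bytes as they might arrive from the network: "<p>é</p>" encoded as UTF-8.
const bytes = new Uint8Array([0x3c, 0x70, 0x3e, 0xc3, 0xa9, 0x3c, 0x2f, 0x70, 0x3e]);

// Decode everything at once.
const html = new TextDecoder('utf-8').decode(bytes);

// Decode in two chunks; { stream: true } keeps the incomplete character
// at the end of the first chunk until the next chunk arrives.
const decoder = new TextDecoder('utf-8');
const firstPart = decoder.decode(bytes.slice(0, 4), { stream: true });
const secondPart = decoder.decode(bytes.slice(4));

console.log(html, firstPart + secondPart === html);
```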

  ② Pre-parsing (pre-parsing)

    Pre-parsing loads resources ahead of time to reduce processing time. It recognizes attributes that request resources, such as the src attribute of an img tag, and adds those requests to the request queue.

  ③ Tokenization (Tokenization)

    Tokenization is the lexical analysis step: the input is parsed into tokens. HTML tokens include start tags, end tags, attribute names, and attribute values.

    It recognizes tokens with a state machine; for example, the state changes when it encounters < or >.
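
    A heavily simplified version of such a state machine is sketched below. It has only two states and ignores attributes, comments, and entities, so it is far from what the HTML specification actually requires.

```javascript
// Minimal tokenizer sketch with two states: "data" (text) and "tag".
function tokenize(html) {
  const tokens = [];
  let state = 'data';
  let buffer = '';
  for (const ch of html) {
    if (state === 'data' && ch === '<') {
      // "<" ends a run of text and switches to the tag state.
      if (buffer.trim()) tokens.push({ type: 'text', value: buffer.trim() });
      buffer = '';
      state = 'tag';
    } else if (state === 'tag' && ch === '>') {
      // ">" ends the tag and switches back to the data state.
      const isEnd = buffer.startsWith('/');
      tokens.push({ type: isEnd ? 'endTag' : 'startTag', name: isEnd ? buffer.slice(1) : buffer });
      buffer = '';
      state = 'data';
    } else {
      buffer += ch;
    }
  }
  if (state === 'data' && buffer.trim()) tokens.push({ type: 'text', value: buffer.trim() });
  return tokens;
}

console.log(tokenize('<h1>Web page parsing</h1>'));
```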

  ④ Tree construction (tree construction)

Note: Tokenization and tree construction run in parallel; that is, as soon as a start tag is parsed, a DOM node is created.

  The parser takes the tokens produced in the previous step, creates the appropriate DOM objects, and inserts them into the DOM tree.

<html>
<head>
    <title>Web page parsing</title>
</head>
<body>
    <div>
        <h1>Web page parsing</h1>
        <p>This is an example Web page.</p>
    </div>
</body>
</html>
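
    For a fragment of the page above, tree construction can be sketched with a stack of open elements. The token list is written by hand here, and the node shape (name, children, value) is an assumption for this example rather than the real DOM interface.

```javascript
// Hand-written tokens for the <div> part of the example page.
const tokens = [
  { type: 'startTag', name: 'div' },
  { type: 'startTag', name: 'h1' },
  { type: 'text', value: 'Web page parsing' },
  { type: 'endTag', name: 'h1' },
  { type: 'startTag', name: 'p' },
  { type: 'text', value: 'This is an example Web page.' },
  { type: 'endTag', name: 'p' },
  { type: 'endTag', name: 'div' },
];

function buildTree(tokenList) {
  const root = { name: '#document', children: [] };
  const stack = [root]; // stack of currently open elements
  for (const token of tokenList) {
    const parent = stack[stack.length - 1];
    if (token.type === 'startTag') {
      // A node is created as soon as the start tag is seen.
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (token.type === 'endTag') {
      stack.pop();
    } else {
      parent.children.push({ name: '#text', value: token.value });
    }
  }
  return root;
}

console.log(JSON.stringify(buildTree(tokens), null, 2));
```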

 

 

 

  Browser error-tolerance mechanism

    You never see an "Invalid syntax" error for HTML in the browser, because the browser corrects the syntax errors and carries on.

  Events

    When the whole parsing process is complete, the browser fires the DOMContentLoaded event to signal that the DOM has been parsed.

  6.2. CSS parsing

    Once the browser has downloaded the CSS, the CSS parser processes any CSS it encounters, parsing and tokenizing it according to the syntax specification [4], which yields a table of rules.

  CSS rule matching

    Matching CSS rules to nodes goes from right to left. For example, for div p { font-size: 14px }, the browser first finds all p tags and then checks whether each one has a div ancestor.

    So when writing CSS, prefer id and class selectors and avoid deeply nested selectors.
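
    Right-to-left matching can be sketched for the simplest case, a selector made only of tag names separated by spaces. The node objects with tag and parent fields are stand-ins for DOM nodes, not the real API.

```javascript
// Match a descendant selector such as "div p" against a node, right to left.
function matchesDescendant(node, selector) {
  const parts = selector.trim().split(/\s+/); // e.g. ['div', 'p']
  // Check the rightmost part against the node itself first.
  if (node.tag !== parts[parts.length - 1]) return false;
  let i = parts.length - 2;
  // Then walk up the ancestors, consuming the remaining parts right to left.
  for (let a = node.parent; a && i >= 0; a = a.parent) {
    if (a.tag === parts[i]) i--;
  }
  return i < 0;
}

const body = { tag: 'body', parent: null };
const div = { tag: 'div', parent: body };
const section = { tag: 'section', parent: div };
const p = { tag: 'p', parent: section };

console.log(matchesDescendant(p, 'div p')); // true
console.log(matchesDescendant(p, 'ul p'));  // false
```

    Starting from the rightmost part means most nodes are rejected after a single comparison, which is why deep selector chains cost more than a single class or id.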

  6.3. Render tree

    This is in fact the process of merging the DOM tree and the CSS rule tree.

Note: The render tree omits nodes that do not need to be rendered, such as nodes with display: none.
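
    A minimal sketch of that filtering step follows, assuming each node already carries its resolved style as a plain object (the tag, style, and children fields are invented for this example).

```javascript
// Copy the tree, leaving out nodes (and their subtrees) with display: none.
function buildRenderTree(node) {
  if (node.style && node.style.display === 'none') return null;
  const children = (node.children || []).map(buildRenderTree).filter(Boolean);
  return { tag: node.tag, style: node.style || {}, children };
}

const dom = {
  tag: 'body',
  children: [
    { tag: 'p', style: { color: 'red' } },
    { tag: 'div', style: { display: 'none' }, children: [{ tag: 'span' }] },
  ],
};

console.log(JSON.stringify(buildRenderTree(dom)));
```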

  Compute

    Any size value is reduced to one of three possible forms: auto, a percentage, or px. For example, rem is converted into px.
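
    The conversion can be sketched as below. The function name and the default font sizes of 16px are assumptions for this example, and only the rem, em, and px units are handled.

```javascript
// Reduce a length value to auto, a percentage, or px (hypothetical helper).
function computeLength(value, { rootFontSize = 16, fontSize = 16 } = {}) {
  // auto and percentages are left for the layout step to resolve.
  if (value === 'auto' || value.endsWith('%')) return value;
  const number = parseFloat(value);
  // Check rem before em, because "rem" also ends with "em".
  if (value.endsWith('rem')) return number * rootFontSize + 'px';
  if (value.endsWith('em')) return number * fontSize + 'px';
  return number + 'px';
}

console.log(computeLength('1.5rem')); // 24px
```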

  Cascade

    The browser needs a way to decide which styles actually apply to an element, so it uses a formula called specificity, which takes into account:

    1. Tag name, class, id

    2. Whether the style is inline

    3. !important

  It then arrives at a weight, and the declaration with the highest weight wins.
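
  For the selector part of the formula (item 1), the weight can be sketched as a triple of counts. This simplified version only understands ids, classes, and tag names, and it leaves out inline styles, !important, attribute selectors, and pseudo-classes.

```javascript
// Count ids, classes, and tag names in a simple selector.
function specificity(selector) {
  const ids = (selector.match(/#[\w-]+/g) || []).length;
  const classes = (selector.match(/\.[\w-]+/g) || []).length;
  // A tag name starts the selector or follows a combinator.
  const tags = (selector.match(/(^|[\s>+~])[a-zA-Z][\w-]*/g) || []).length;
  return [ids, classes, tags];
}

// Compare two triples from left to right; a positive result means a wins.
function compareSpecificity(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

console.log(specificity('#nav .item a')); // [ 1, 1, 1 ]
console.log(compareSpecificity(specificity('#a'), specificity('.b.c.d')) > 0); // true
```

  Comparing the counts column by column, instead of adding them up, reflects that no number of classes outweighs a single id.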

  Render blocking

    When the parser meets a script tag, DOM construction is paused until the script has finished executing, and then DOM construction continues.

    But if the JS depends on CSS styles that have not yet been downloaded and built, the browser delays script execution until the CSS rules have been built.

  From this we know:

    • CSS blocks the execution of JS

    • JS blocks the DOM parsing that follows it

  To avoid this, follow these principles:

    • Put CSS resources ahead of JavaScript resources

    • Put JS at the bottom of the HTML, that is, just before </body>

  In addition, to change the blocking behavior you can use defer and async; see this article [5].

  6.4. Layout and painting

    Determine the geometric properties of every node in the render tree, such as position and size. The final output is a box model that precisely captures the exact position and size of each element on the screen.

    Then traverse the render tree and call each renderer's paint() method to display its content on the screen.
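
    As a toy version of the layout step, the sketch below stacks block boxes vertically and derives each container's height from its children. The box shape (children, height) and the 800px width are assumptions; real layout also deals with margins, padding, inline content, floats, and much more.

```javascript
// Assign x, y, and width to every box; leaves must provide their own height.
function layout(box, x, y, width) {
  box.x = x;
  box.y = y;
  box.width = width;
  if (box.children && box.children.length) {
    let cursor = y;
    for (const child of box.children) {
      layout(child, x, cursor, width);
      cursor += child.height; // the next block starts below this one
    }
    box.height = cursor - y; // a container is as tall as its content
  }
  return box;
}

const page = layout(
  { children: [{ height: 50 }, { children: [{ height: 20 }, { height: 30 }] }, { height: 10 }] },
  0, 0, 800
);
console.log(page.height); // 110
```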

  6.5. Compositing render layers

    Merge all the painted layers above and output the final image.

  6.6. Reflow and repaint

  Reflow (reflow)

    When the browser detects that a change in some part affects the layout, it has to go back and render again: starting from the html tag it recurses downward and recomputes positions and sizes.

    Reflow is basically unavoidable, because the page changes whenever you scroll with the mouse or resize the window.

  Repaint (repaint)

    When a change to an element, such as its background color or text color, does not affect its position or the surrounding elements, a repaint occurs.

    After each repaint, the browser has to composite the render layers again and output them to the screen.

    Reflow costs much more than repaint, so we should try to avoid reflow.

  For example:

    • display: none triggers reflow, while visibility: hidden triggers only a repaint.

  6.7. JavaScript compilation and execution

  The general process

    


  It can be divided into three stages:

  1. Lexical analysis

    After a JS script is loaded, it first enters the syntax analysis phase, which checks whether the syntax of the code block is correct. If it is not, a "syntax error" is thrown and execution stops.

   Several steps:

    • Tokenizing: for example, var a = 2 is split into lexical units such as var, a, =, and 2.

    • Parsing: the lexical units are converted into an abstract syntax tree (AST).

    • Code generation: the abstract syntax tree is converted into machine instructions.
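
    The first two steps can be sketched for the single statement used above. The function names and the AST node shape are assumptions for this example; a real engine handles the full grammar.

```javascript
const KEYWORDS = new Set(['var', 'let', 'const']);

// Tokenizing: split the source into lexical units.
function tokenizeJs(source) {
  const words = source.match(/[A-Za-z_$][\w$]*|\d+|[=;]/g) || [];
  return words.map((value) => {
    if (KEYWORDS.has(value)) return { type: 'keyword', value };
    if (/^\d/.test(value)) return { type: 'number', value };
    if (/^[=;]$/.test(value)) return { type: 'punctuator', value };
    return { type: 'identifier', value };
  });
}

// Parsing: turn "<keyword> <identifier> = <number>" into a small AST node.
function parseDeclaration(tokens) {
  const [kind, id, eq, init] = tokens;
  const valid =
    kind && kind.type === 'keyword' &&
    id && id.type === 'identifier' &&
    eq && eq.value === '=' &&
    init && init.type === 'number';
  if (!valid) throw new SyntaxError('Unexpected token');
  return { type: 'VariableDeclaration', kind: kind.value, id: id.value, init: Number(init.value) };
}

console.log(parseDeclaration(tokenizeJs('var a = 2')));
```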

  2. Precompilation

    JS has three kinds of execution environment:

    • Global environment

    • Function environment

    • eval

    Each time execution enters a different environment, a corresponding execution context is created. Depending on the context, these form a function call stack: the bottom of the stack is always the global execution context, and the top of the stack is always the current execution context.

  Creating the execution context

  Creating an execution context mainly involves three things:

  • Creating the variable object

    • Parameters, functions, variables

  • Establishing the scope chain

    • Determines whether the current execution environment can access a variable

  • Determining what this points to
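
  One visible effect of creating the variable object before execution is hoisting. The small demonstration below uses made-up names (demo, later, inner) and records what is already known at the very start of the function body.

```javascript
function demo(param) {
  const seen = {
    param,               // parameters are bound first
    inner: typeof inner, // function declarations are usable before their line runs
    later: later,        // var is declared up front but still undefined here
  };
  var later = 'assigned';
  function inner() {}
  seen.laterAfter = later;
  return seen;
}

console.log(demo(1));
```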

  3. Execution

    JS threads

 

 

 

  Although JS is single-threaded, there are actually four threads involved in the work:

Three of them only assist; only the JS engine thread actually executes code

  • JS engine thread: also called the JS kernel, the main thread responsible for parsing and executing JS scripts, for example the V8 engine

  • Event trigger thread: belongs to the browser kernel and is mainly used to control events such as mouse and keyboard events. When an event is triggered, its handler is pushed into the event queue to wait for the JS engine thread to execute it

  • Timer trigger thread: mainly controls setInterval and setTimeout. It does the timing, and when the time is up it pushes the timer's handler into the event queue to wait for the JS engine thread

  • Asynchronous HTTP request thread: after a connection is made through XMLHttpRequest, the browser opens a new thread to monitor changes of readyState. If a callback is set for the state change, the handler is pushed into the event queue to wait for the JS engine thread to execute it

  Note: The browser limits the number of concurrent connections to the same domain name, usually to six.

  Macro tasks

    Divided into:

    • Synchronous tasks: executed in order; a task runs only after the previous one has finished

    • Asynchronous tasks: not executed directly. Only when the trigger condition is met does the relevant thread push the asynchronous task into the task queue, where it waits until the tasks on the JS engine's main thread have finished before it is executed, for example asynchronous Ajax, DOM events, and setTimeout.

  Microtasks

    Microtasks mainly come from APIs in ES6 and the Node environment: Promise and process.nextTick.

    Microtasks run after the synchronous tasks of the current macro task and before the asynchronous tasks.

    


  Code examples

 
console.log('1'); // macro task, synchronous

setTimeout(function () {
    console.log('2'); // macro task, asynchronous
});

new Promise(function (resolve) {
    console.log('3'); // macro task, synchronous
    resolve();
}).then(function () {
    console.log('4'); // microtask
});

console.log('5'); // macro task, synchronous
 

  The code above outputs in this order: 1, 3, 5, 4, 2

 

Reference Documents

[1] You do not know HSTS: http://t.cn/AiR8pTqx
[2] See this article: http://t.cn/AiR8pnEC
[3] MIME: http://t.cn/AiR8prtm
[4] Syntax specification: http://t.cn/AiR80GdO
[5] This article: http://t.cn/AiR80c1k
[6] what-happens-when-zh_CN: http://t.cn/AiR80xb5
[7] Tags to DOM: http://t.cn/AiR80djX
[8] A thorough understanding of the browser caching mechanism: http://t.cn/AiR8Ovob
[9] How browsers work: Behind the scenes of modern web browsers: http://t.cn/AiR8Oz06
[10] Browser rendering principles in plain language: http://t.cn/AiR8O4fO


Original address: https://4ark.me/post/b6c7c0a2.html
 
 

 


Origin www.cnblogs.com/jin-zhe/p/11586327.html