Foreword
This question has become a cliche, but it still often shows up as the closing question in interviews. There are plenty of articles about it online, but I had some idle time recently, so I wrote my own notes on it, and I feel I understand it more thoroughly than before.
These notes summarize the dozens of articles I read over two days, so they are fairly complete. Because I work on the front end, I focus on how the browser renders the page; for the other parts I list keywords that interested readers can look up themselves.
Note: The steps in this article assume a simple HTTP request (no HTTPS, no HTTP2), the simplest DNS setup, no proxy, and a server without any problems, even though this is unrealistic.
The general process
- URL parsing
- DNS lookup
- TCP connection
- Processing the request
- Receiving the response
- Rendering the page
One, URL parsing
Address Resolution:
First, the browser determines whether your input is a valid URL or a keyword to be searched, and then performs autocompletion, character encoding, and other operations based on what you typed.
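As a rough sketch of that decision, the snippet below uses the standard URL constructor to tell a URL from a search keyword. The helper name classifyInput and its rules (an assumed "http://" prefix, a dot in the host) are my own assumptions for illustration; real browsers use much more elaborate heuristics.

```javascript
// Hypothetical helper: decide whether address-bar input is a URL or a search.
function classifyInput(input) {
  const text = input.trim();
  // Try the text as-is, then with an assumed "http://" prefix.
  for (const candidate of [text, "http://" + text]) {
    try {
      const url = new URL(candidate);
      const isWeb = url.protocol === "http:" || url.protocol === "https:";
      // Require a dot in the host (or "localhost") so plain words become searches.
      const hostOk = url.hostname.includes(".") || url.hostname === "localhost";
      if (isWeb && hostOk && !/\s/.test(text)) {
        return { kind: "url", href: url.href };
      }
    } catch (err) {
      // Not parseable in this form; try the next candidate.
    }
  }
  // Anything else is treated as a keyword and percent-encoded for a query string.
  return { kind: "search", query: encodeURIComponent(text) };
}

console.log(classifyInput("example.com"));
console.log(classifyInput("what happens when"));
```

The second call falls through to the search branch because the text contains spaces and cannot be a host name.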
HSTS
Because plain HTTP carries security risks, HSTS is used to force clients to access the page over HTTPS. See: You do not know HSTS [1].
Other operations
The browser also performs some additional operations, such as security checks and access restrictions (domestic browsers previously restricted access to 996.icu).
Check the cache
Two, DNS queries
The basic steps
1. The browser cache
The browser first checks its own cache; if there is no entry, it calls a system library function to perform the query.
2. The operating system cache
The operating system also has its own DNS cache, but before that it checks whether the domain name exists in the local Hosts file; if not, it sends a query to the DNS server.
3. Router Cache
Routers also have their own cache.
4. ISP DNS cache
The ISP DNS server is the preferred DNS server configured on the client computer, and in most cases it will have the answer cached.
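The lookup order above can be sketched as a walk over a list of caches that stops at the first hit. Everything here is made up for illustration: the layers are plain in-memory maps, and the host names and addresses are placeholders.

```javascript
// Hypothetical in-memory caches standing in for each layer, in lookup order.
const layers = [
  { name: "browser cache", entries: new Map() },
  { name: "hosts file", entries: new Map([["dev.local", "127.0.0.1"]]) },
  { name: "OS cache", entries: new Map() },
  { name: "router cache", entries: new Map() },
  { name: "ISP DNS cache", entries: new Map([["example.com", "203.0.113.10"]]) },
];

// Walk the layers in order and stop at the first one that knows the name.
function lookup(hostname) {
  for (const layer of layers) {
    const ip = layer.entries.get(hostname);
    if (ip !== undefined) {
      return { ip, source: layer.name };
    }
  }
  // No layer had an answer: the local DNS server would now ask the root.
  return { ip: null, source: "root name server query needed" };
}

console.log(lookup("dev.local"));
console.log(lookup("example.com"));
console.log(lookup("unknown.test"));
```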
Root name server queries
If none of the preceding steps finds a cached entry, the local DNS server forwards the request to the root name servers on the Internet. The figure below illustrates the whole process well:
Points to note
- Recursive query: the lookup goes all the way down without returning anything in between, and only the final result is returned (this is how the browser queries the local DNS server).
- Iterative query: this is how the local DNS server queries the root name servers.
- What is DNS hijacking
- Front-end optimization: dns-prefetch
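To make the iterative way concrete, here is a small simulation in which the local DNS server follows each referral itself. The server names, zones, and the address are invented for this sketch, and nothing talks to a real DNS server.

```javascript
// Made-up name-server data: each server either refers us to another server
// or gives the final answer.
const servers = {
  "root": { "com": { referral: "tld-com" } },
  "tld-com": { "example.com": { referral: "ns-example" } },
  "ns-example": { "www.example.com": { answer: "203.0.113.10" } },
};

// Return what the given server knows about the longest matching suffix of `name`.
function ask(serverName, name) {
  const zone = servers[serverName];
  const labels = name.split(".");
  for (let i = 0; i < labels.length; i++) {
    const suffix = labels.slice(i).join(".");
    if (zone[suffix]) return zone[suffix];
  }
  return null;
}

// Iterative query: the local DNS server itself follows each referral.
function resolveIteratively(name) {
  const path = [];
  let server = "root";
  while (true) {
    path.push(server);
    const reply = ask(server, name);
    if (!reply) return { ip: null, path };
    if (reply.answer) return { ip: reply.answer, path };
    server = reply.referral;
  }
}

console.log(resolveIteratively("www.example.com"));
```

The returned path lists the servers in the order they were asked, which is the chain of referrals from the root downward.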
Three, TCP connection
TCP/IP is divided into four layers, and when data is sent, each layer wraps the data in its own packet:
1. Application Layer: HTTP request
In the previous step we obtained the server's IP address, so the browser starts constructing an HTTP message, which includes:
- Request header (Request Header): request method, destination address, protocol to follow, and so on
- Request body (additional parameters)
Points to note:
- With plain HTML (links and forms) the browser can only send GET and POST requests, and opening a page uses the GET method; other methods require script, for example XMLHttpRequest
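For reference, this is roughly what such a message looks like as text. buildRequest is a hypothetical helper written for this note, and the host and path are placeholders; real browsers add many more headers.

```javascript
// Build the text of a minimal HTTP/1.1 request.
function buildRequest({ method = "GET", host, path = "/", headers = {}, body = "" }) {
  // Request line, then the mandatory Host header.
  const lines = [`${method} ${path} HTTP/1.1`, `Host: ${host}`];
  for (const [name, value] of Object.entries(headers)) {
    lines.push(`${name}: ${value}`);
  }
  if (body) {
    // Content-Length counts bytes, not characters.
    lines.push(`Content-Length: ${new TextEncoder().encode(body).length}`);
  }
  // A blank line separates the header section from the body.
  return lines.join("\r\n") + "\r\n\r\n" + body;
}

const message = buildRequest({ host: "example.com", path: "/index.html" });
console.log(message);
```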
2. Transport Layer: TCP packet transmission
The transport layer initiates a TCP connection to the server. To make transmission easier, the data is split into segments and each segment is numbered, so that the server can reassemble the message accurately when it receives it.
Before the connection is established, a TCP three-way handshake takes place.
“About the TCP/IP three-way handshake: there are already plenty of jokes and pictures online that describe it vividly.”
Knowledge point:
- SYN flood attack
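The numbers exchanged in the handshake can be sketched as below. This is only a simulation of the three messages: the initial sequence numbers are passed in so the run is deterministic, whereas a real TCP stack chooses them itself.

```javascript
// Simulate the sequence and acknowledgement numbers of a three-way handshake.
function threeWayHandshake(clientIsn, serverIsn) {
  const log = [];
  // 1. Client -> server: SYN carrying the client's initial sequence number.
  log.push({ from: "client", flags: "SYN", seq: clientIsn });
  // 2. Server -> client: SYN+ACK, acknowledging clientIsn + 1.
  log.push({ from: "server", flags: "SYN,ACK", seq: serverIsn, ack: clientIsn + 1 });
  // 3. Client -> server: ACK, acknowledging serverIsn + 1.
  log.push({ from: "client", flags: "ACK", seq: clientIsn + 1, ack: serverIsn + 1 });
  return log;
}

console.log(threeWayHandshake(100, 500));
```

A SYN flood abuses step 2: the attacker sends many SYNs and never completes step 3, leaving the server holding half-open connections.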
3. Network layer: IP protocol and MAC address lookup
The network layer packs the data segments into packets, adds the source and destination IP addresses, and is responsible for finding a transmission route.
It determines whether the destination address is on the same network as the current address. If so, the packet is sent directly using the destination's MAC address; otherwise the routing table is used to find the next-hop address, and the ARP protocol is used to query that hop's MAC address.
“Note: In the OSI reference model ARP belongs to the link layer, but in TCP/IP it belongs to the network layer.”
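The "same network" test is a comparison of the two addresses after applying the subnet mask. A minimal IPv4-only sketch, with made-up addresses and a hypothetical nextHop helper standing in for the routing table:

```javascript
// Convert dotted-quad IPv4 text to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split(".").reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Two addresses are on the same network if their masked values are equal.
function sameNetwork(a, b, mask) {
  const m = ipToInt(mask);
  return ((ipToInt(a) & m) >>> 0) === ((ipToInt(b) & m) >>> 0);
}

// Deliver directly when on the same network, otherwise hand off to the gateway.
function nextHop(src, dst, mask, gateway) {
  return sameNetwork(src, dst, mask) ? dst : gateway;
}

console.log(nextHop("192.168.1.10", "192.168.1.20", "255.255.255.0", "192.168.1.1"));
console.log(nextHop("192.168.1.10", "10.0.0.5", "255.255.255.0", "192.168.1.1"));
```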
4. Link Layer: Ethernet protocol
Ethernet protocol
According to the Ethernet protocol, data is grouped into packets called "frames", and each frame is divided into two parts:
- Header: the sender and receiver of the packet, and the data type
- Data: the actual content of the packet
MAC address
Ethernet requires every device connected to the network to have a network interface card. Data packets travel from one network card to another, and the address of a network card is its MAC address. Every MAC address is unique, which makes one-to-one delivery possible.
Broadcast
The way data is sent is quite primitive: the data is simply sent, via the ARP protocol, to every machine on the local network. Each receiver compares the header information with its own MAC address, accepts the packet if they match, and discards it otherwise.
Note: The receiver's reply is unicast.
Knowledge point:
- ARP attack
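A toy simulation of the broadcast and the unicast reply, under the assumption that the network is just an array of hosts. The IP and MAC addresses are invented for this sketch.

```javascript
// Made-up hosts on one local network.
const hosts = [
  { ip: "192.168.1.10", mac: "aa:aa:aa:aa:aa:10" },
  { ip: "192.168.1.20", mac: "aa:aa:aa:aa:aa:20" },
  { ip: "192.168.1.30", mac: "aa:aa:aa:aa:aa:30" },
];

// Broadcast "who has targetIp?" to every host; only the owner replies,
// and the reply is addressed to the asker alone (unicast).
function arpRequest(senderIp, targetIp) {
  const replies = [];
  for (const host of hosts) {
    if (host.ip === senderIp) continue; // the sender does not ask itself
    if (host.ip === targetIp) {
      replies.push({ to: senderIp, ip: host.ip, mac: host.mac });
    }
    // Every other host silently discards the request.
  }
  return replies;
}

console.log(arpRequest("192.168.1.10", "192.168.1.30"));
```

An ARP attack works by sending forged replies of this kind, so that the asker associates an IP address with the attacker's MAC address.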
The server receives the request
The receiving process is the reverse of the steps above; see the picture.
Four, the server processes the request
The general process
HTTPD
The most common HTTPDs are Apache and Nginx on Linux, and IIS on Windows.
It listens for incoming requests, and then starts a child process to handle each request.
Processing the request
After receiving the TCP packets, the server processes the connection, parses the HTTP protocol (request method, domain name, path, and so on), and performs some verification:
- Verify that a virtual host is configured
- Verify that the virtual host accepts this method
- Verify that the user is allowed to use this method (based on the IP address, identity information, and so on)
Redirect
If the server is configured with an HTTP redirect, it returns a 301 permanent redirect response, and the browser resends the HTTP request based on that response (repeating the process above).
“For more details, see this article [2].”
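The browser side of a redirect can be sketched as a loop that re-requests the new address. The fakeServer table below is invented so the example runs without a network, and the hop limit is my own assumption to stop redirect loops.

```javascript
// Made-up responses keyed by URL; no network access is involved.
const fakeServer = {
  "http://example.com/old": { status: 301, location: "http://example.com/new" },
  "http://example.com/new": { status: 200, body: "<h1>Hello</h1>" },
};

// Follow 301 responses by requesting the Location URL again.
function request(url, maxRedirects = 5) {
  const visited = [];
  let current = url;
  for (let hop = 0; hop <= maxRedirects; hop++) {
    visited.push(current);
    const response = fakeServer[current] || { status: 404, body: "" };
    if (response.status !== 301) {
      return { ...response, visited };
    }
    current = response.location; // start the whole request over with the new URL
  }
  throw new Error("Too many redirects");
}

console.log(request("http://example.com/old"));
```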
URL Rewrite
The server then checks the URL rewrite rules. If the requested file actually exists, such as an image, HTML, CSS, or JS file, the file is returned directly.
Otherwise the server rewrites the request into a REST-style URL according to the rules.
Then, depending on the type of dynamic language script, it decides which dynamic file interpreter to call to process the request.
Taking an MVC framework in PHP as an example, it first initializes some environment parameters, then matches the URL against the routes from top to bottom, and finally lets the method defined by the matching route handle the request.
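Top-to-bottom route matching can be sketched as follows. The article's example is PHP; this sketch uses JavaScript like the rest of my examples, and the routes and handlers are hypothetical.

```javascript
// Hypothetical route table, checked from top to bottom.
const routes = [
  { pattern: /^\/users\/(\d+)$/, handler: (id) => `user ${id}` },
  { pattern: /^\/users$/, handler: () => "user list" },
  { pattern: /^\/.*$/, handler: () => "not found" }, // catch-all, must come last
];

// The first route whose pattern matches handles the request.
function dispatch(path) {
  for (const route of routes) {
    const match = route.pattern.exec(path);
    if (match) {
      // Captured groups become the handler's arguments.
      return route.handler(...match.slice(1));
    }
  }
  return null;
}

console.log(dispatch("/users/42"));
console.log(dispatch("/users"));
console.log(dispatch("/about"));
```

Because matching stops at the first hit, the order of the table matters: moving the catch-all to the top would hide every other route.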
Five, the browser receives the response
After the browser receives the response resource from the server, it analyzes the resource.
It first checks the response header and does different things depending on the status code (such as the redirect mentioned above).
If the response resource is compressed (for example with gzip), it also needs to be decompressed.
Then the response resource is cached.
Next, the response content is parsed according to the MIME [3] type of the response resource (for example, HTML and images are parsed in different ways).
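A simplified sketch of that dispatch on status code and MIME type. The response here is a plain object rather than a real browser response, and the action names are invented for this note; decompression and caching are left out.

```javascript
// Decide what to do with a response, based on its status code and MIME type.
function handleResponse(response) {
  const { status, headers } = response;
  if (status === 301 || status === 302) {
    return { action: "redirect", to: headers["location"] };
  }
  if (status !== 200) {
    return { action: "error", status };
  }
  // Drop parameters such as "; charset=utf-8" before comparing.
  const mime = (headers["content-type"] || "").split(";")[0].trim().toLowerCase();
  if (mime === "text/html") return { action: "parse-html" };
  if (mime === "text/css") return { action: "parse-css" };
  if (mime.startsWith("image/")) return { action: "decode-image" };
  return { action: "download" };
}

console.log(handleResponse({ status: 200, headers: { "content-type": "text/html; charset=utf-8" } }));
```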
Six, rendering the page
Browser kernel
Different browser kernels do not render in exactly the same way, but the general process is similar.
The basic flow
6.1. HTML parsing
The first thing to know is that the browser parses line by line from top to bottom.
The parsing process can be divided into four steps:
① Decoding (encoding)
What comes back over the network is actually binary byte data. The browser needs to convert it into a string according to the file's specified encoding (for example, UTF-8), and that string is the HTML code.
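The decoding step can be tried out with the standard TextDecoder API. The byte values below are a hand-written UTF-8 example, not data from a real response.

```javascript
// These bytes are the UTF-8 encoding of "<p>é</p>"; the two bytes 0xc3 0xa9
// together encode the single character "é".
const bytes = new Uint8Array([0x3c, 0x70, 0x3e, 0xc3, 0xa9, 0x3c, 0x2f, 0x70, 0x3e]);

// Decoding with the declared encoding gives back the HTML text.
const html = new TextDecoder("utf-8").decode(bytes);
console.log(html);
```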
② Pre-parsing (pre-parsing)
Pre-parsing loads resources in advance to reduce processing time. It identifies attributes that request resources, such as the src attribute of an img tag, and adds those requests to the request queue.
③ Tokenization (Tokenization)
Tokenization is the lexical analysis step, which parses the input into tokens. HTML tokens include start tags, end tags, attribute names, and attribute values.
It recognizes tokens with a state machine; for example, the state changes when it encounters < or >.
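A deliberately tiny state machine in that spirit is sketched below. It knows only two states and ignores attributes, comments, and entities, so it is an illustration of the idea and nowhere near the tokenizer a browser actually implements.

```javascript
// Two states only: "data" (between tags) and "tag" (inside < ... >).
function tokenize(html) {
  const tokens = [];
  let state = "data";
  let buffer = "";
  for (const ch of html) {
    if (state === "data") {
      if (ch === "<") {
        // Emit any text collected so far, then switch to the tag state.
        if (buffer.trim()) tokens.push({ type: "text", value: buffer.trim() });
        buffer = "";
        state = "tag";
      } else {
        buffer += ch;
      }
    } else if (ch === ">") {
      // ">" ends the tag and switches back to the data state.
      if (buffer.startsWith("/")) {
        tokens.push({ type: "endTag", name: buffer.slice(1) });
      } else {
        tokens.push({ type: "startTag", name: buffer });
      }
      buffer = "";
      state = "data";
    } else {
      buffer += ch;
    }
  }
  if (state === "data" && buffer.trim()) tokens.push({ type: "text", value: buffer.trim() });
  return tokens;
}

console.log(tokenize("<p>Hi</p>"));
```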
④ Tree construction (tree construction)
“Note: Tokenization and tree construction run in parallel; that is, as soon as a start tag is parsed, a DOM node is created.”
The parser takes the tokens produced by the previous step, creates the corresponding DOM objects in an appropriate way, and inserts them into the DOM tree.
<html>
  <head>
    <title>Web page parsing</title>
  </head>
  <body>
    <div>
      <h1>Web page parsing</h1>
      <p>This is an example Web page.</p>
    </div>
  </body>
</html>
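Tree construction for markup like this can be sketched with a stack of open elements. The buildTree function below is my own simplified illustration: it handles only bare tags and text, with no attributes and no error recovery, and the tree is made of plain objects rather than real DOM nodes.

```javascript
// Build a nested tree from markup using a stack of currently open elements.
function buildTree(html) {
  const root = { tag: "#document", children: [] };
  const stack = [root];
  const pattern = /<(\/?)([a-zA-Z0-9]+)>|([^<]+)/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    const [, slash, tag, text] = match;
    const parent = stack[stack.length - 1];
    if (text !== undefined) {
      if (text.trim()) parent.children.push({ text: text.trim() });
    } else if (slash) {
      // End tag: close the current element (never pop the document itself).
      if (stack.length > 1) stack.pop();
    } else {
      // Start tag: the node is created and attached as soon as it is seen.
      const node = { tag: tag.toLowerCase(), children: [] };
      parent.children.push(node);
      stack.push(node);
    }
  }
  return root;
}

const tree = buildTree("<div><h1>Web page parsing</h1><p>This is an example Web page.</p></div>");
console.log(JSON.stringify(tree, null, 2));
```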
Browser fault tolerance mechanism
You have never seen an error like "Invalid syntax" in the browser, because the browser corrects syntax errors and keeps working.
Events
When the whole parsing process is complete, the browser fires the DOMContentLoaded event to signal that the DOM has been parsed.
6.2. CSS parsing
Once the browser has downloaded the CSS, the CSS parser processes any CSS it encounters, parses and tokenizes all of it according to the syntax specification [4], and produces a rule table.
CSS rule matching
When matching CSS rules against a node, the browser works from right to left. For example: div p { font-size: 14px }
It first finds all p tags, and then checks whether one of each tag's ancestors is a div.
So when writing CSS, prefer id and class selectors as far as possible, and avoid deeply nested selectors.
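Right-to-left matching of a descendant selector can be sketched like this. The nodes are plain objects with a tag and a parent, and the matcher only understands tag names separated by spaces; both are assumptions made for this example.

```javascript
// Match a selector such as "div p" against a node, starting from the right.
function matches(node, selector) {
  const parts = selector.trim().split(/\s+/);
  // The rightmost part must match the node itself.
  if (node.tag !== parts.pop()) return false;
  let ancestor = node.parent;
  // Walk up the tree, consuming the remaining parts from right to left.
  while (parts.length > 0 && ancestor) {
    if (ancestor.tag === parts[parts.length - 1]) parts.pop();
    ancestor = ancestor.parent;
  }
  return parts.length === 0;
}

const div = { tag: "div", parent: null };
const section = { tag: "section", parent: div };
const p = { tag: "p", parent: section };
const span = { tag: "span", parent: div };

console.log(matches(p, "div p"));
console.log(matches(span, "div p"));
```

Most nodes fail on the very first comparison, which is the usual argument for matching in this direction.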
6.3. Rendering tree
This is in fact the process of merging the DOM tree and the CSS rule tree.
“Note: The render tree ignores nodes that do not need rendering, such as nodes set to display:none.”
Computation
Computation reduces every size value to one of three possible forms: auto, a percentage, or px. For example, rem is converted into px.
Cascade
The browser needs a way to determine which styles really apply to a given element, so it uses a formula called specificity, which takes into account:
- Tag name, class, id
- Whether the style is inline
- !important
It then derives a weight value, and the style with the highest weight wins.
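A simplified version of that weighting is sketched below. It follows the three factors listed above and nothing more: attribute selectors, pseudo-classes, and source order are not handled, and the declaration objects are a format I made up for this example.

```javascript
// Count ids, classes, and tag names in a simple selector such as
// "div#main .item p".
function specificity(selector) {
  const ids = (selector.match(/#[\w-]+/g) || []).length;
  const classes = (selector.match(/\.[\w-]+/g) || []).length;
  const tags = (selector.match(/(^|\s)[a-zA-Z][\w-]*/g) || []).length;
  return [ids, classes, tags];
}

// Weight of a declaration: !important first, then inline, then specificity.
function weight(decl) {
  return [decl.important ? 1 : 0, decl.inline ? 1 : 0, ...specificity(decl.selector || "")];
}

// True when declaration `a` outweighs declaration `b`.
function wins(a, b) {
  const wa = weight(a);
  const wb = weight(b);
  for (let i = 0; i < wa.length; i++) {
    if (wa[i] !== wb[i]) return wa[i] > wb[i];
  }
  return false; // equal weight: source order would decide, which is not modelled here
}

console.log(specificity("div#main .item p"));
console.log(wins({ selector: "#a" }, { selector: ".b .c p" }));
```

Comparing the weights position by position, instead of adding them into one number, is what lets a single id beat any number of classes.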
Render blocking
When a script tag is encountered, DOM construction pauses until the script has finished executing, and then continues.
But if the JS depends on CSS styles that have not yet been downloaded and built, the browser delays script execution until the CSS rules have been built.
From this we know:
- CSS blocks JS execution
- JS blocks parsing of the DOM that follows it
To avoid this, follow these principles:
- Put CSS resources ahead of JavaScript resources
- Put JS at the bottom of the HTML, that is, just before </body>
In addition, if you want to change the blocking behavior, you can use defer and async; see this article [5].
6.4. Layout and painting
Layout determines the geometric properties of all nodes in the render tree, such as position and size, and finally feeds them into a box model that accurately captures the exact position and size of each element on the screen.
The render tree is then traversed, and each renderer's paint() method is called to display its content on the screen.
6.5. Compositing render layers
All the images painted above are merged, and the final image is output.
6.6. Reflow and repaint
Reflow (reflow)
When the browser detects a change that affects the layout, it has to go back and render again: starting from the html tag it recurses downward and recalculates positions and sizes.
Reflow is basically unavoidable, because the page changes when you scroll with the mouse or resize the window.
Repaint (repaint)
Repaint occurs when an element's background color, text color, or similar property changes in a way that does not affect the position of surrounding elements.
After each repaint, the browser has to merge the render layers again and output them to the screen.
Reflow costs much more than repaint, so we should try to avoid reflow.
For example:
- display:none triggers reflow, while visibility:hidden only triggers repaint.
6.7. JavaScript compilation and execution
The general process
It can be divided into three stages:
1. Lexical analysis
After a JS script is loaded, it first enters the syntax analysis phase, which checks whether the syntax of the code block is correct; if it is not, a "syntax error" is thrown and execution stops.
The steps are:
- Tokenizing: for example, var a = 2 is split into lexical units such as var, a, =, and 2.
- Parsing: the lexical units are converted into an abstract syntax tree (AST).
- Code generation: the abstract syntax tree is converted into machine instructions.
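The tokenizing step for var a = 2 can be reproduced with a few lines. The lex function below covers only a tiny, arbitrary subset of JavaScript (a handful of keywords, identifiers, numbers, and a few punctuators) and is meant as an illustration, not as how an engine such as V8 does it.

```javascript
// Split a tiny subset of JavaScript into lexical units.
function lex(source) {
  // Sticky regex: optional whitespace, then a word, a number, or a punctuator.
  const pattern = /\s*(?:([A-Za-z_$][\w$]*)|(\d+(?:\.\d+)?)|([=+\-*\/;()]))/y;
  const keywords = new Set(["var", "let", "const", "function", "return"]);
  const tokens = [];
  let pos = 0;
  while (pos < source.length) {
    pattern.lastIndex = pos;
    const match = pattern.exec(source);
    if (!match) {
      if (/^\s*$/.test(source.slice(pos))) break; // only trailing whitespace left
      throw new SyntaxError(`Unexpected character at position ${pos}`);
    }
    const [, word, number, punct] = match;
    if (word !== undefined) {
      tokens.push({ type: keywords.has(word) ? "keyword" : "identifier", value: word });
    } else if (number !== undefined) {
      tokens.push({ type: "number", value: number });
    } else {
      tokens.push({ type: "punctuator", value: punct });
    }
    pos = pattern.lastIndex;
  }
  return tokens;
}

console.log(lex("var a = 2"));
```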
2. Precompilation
JS has three kinds of execution environment:
- Global environment
- Function environment
- eval
Each time execution enters a different environment, a corresponding execution context is created. Together these contexts form a function call stack: the bottom of the stack is always the global execution context, and the top of the stack is always the current execution context.
Creating execution context
Creating an execution context mainly involves the following three things:
- Create the variable object
  - Parameters, functions, variables
- Establish the scope chain
  - Confirm whether the current execution environment can access a variable
- Determine what this points to
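One visible effect of creating the variable object before execution is hoisting, which the small demo below makes observable. The function and variable names are arbitrary.

```javascript
function demo() {
  const seen = [];
  // `value` is already registered in the variable object, but still undefined.
  seen.push(typeof value);
  // Function declarations are fully available before their line is reached.
  seen.push(typeof helper);
  var value = 1;
  function helper() {}
  // After the assignment has run, `value` holds a number.
  seen.push(typeof value);
  return seen;
}

console.log(demo());
```

The result is the list undefined, function, number: the var exists from the start without a value, while the function declaration is usable immediately.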
3. Execution
JS threads
Although JS is single-threaded, there are actually four threads involved in the work:
“Three of them only assist; only the JS engine thread actually executes code.”
- JS engine thread: also called the JS kernel; the main thread responsible for parsing and executing JS scripts, for example the V8 engine
- Event trigger thread: belongs to the browser kernel and mainly controls events such as mouse and keyboard events. When an event fires, its handler is pushed onto the event queue to wait for the JS engine thread to execute it
- Timer trigger thread: mainly controls setInterval and setTimeout. It does the timing, and when the time is up it pushes the timer's handler onto the event queue to wait for the JS engine thread
- Asynchronous HTTP request thread: after a connection is made through XMLHttpRequest, the browser opens a new thread to monitor readyState changes. If a callback is set for the state change, the handler is pushed onto the event queue to wait for the JS engine thread to execute it
Note: Browsers limit the number of concurrent connections to the same domain name, usually to six.
Macrotasks
They are divided into:
- Synchronous tasks: executed in order; a task runs only after the previous one has finished
- Asynchronous tasks: not executed directly. Only when the trigger condition is met does the relevant thread push the asynchronous task onto the task queue, where it waits until the tasks on the JS engine's main thread have finished, for example asynchronous Ajax, DOM events, and setTimeout
Microtasks
Microtasks mainly come from APIs in ES6 and the Node environment, such as Promise and process.nextTick.
Microtasks run after the synchronous tasks of the current macrotask and before the asynchronous tasks.
Code examples
console.log('1'); // macrotask, synchronous
setTimeout(function () {
  console.log('2'); // macrotask, asynchronous
});
new Promise(function (resolve) {
  console.log('3'); // macrotask, synchronous (the executor runs immediately)
  resolve();
}).then(function () {
  console.log('4'); // microtask
});
console.log('5'); // macrotask, synchronous
The output order of the code above is: 1, 3, 5, 4, 2
Reference Documents
[1] You do not know HSTS: http://t.cn/AiR8pTqx
[2] See this article: http://t.cn/AiR8pnEC
[3] MIME: http://t.cn/AiR8prtm
[4] Syntax Specification: http://t.cn/AiR80GdO
[5] This article: http://t.cn/AiR80c1k
[6] what-happens-when-zh_CN: http://t.cn/AiR80xb5
[7] Tags to DOM: http://t.cn/AiR80djX
[8] A thorough understanding of the browser caching mechanism: http://t.cn/AiR8Ovob
[9] How browsers work: Behind the scenes of modern web browsers: http://t.cn/AiR8Oz06
[10] Browser rendering principles in plain terms: http://t.cn/AiR8O4fO
Original address: https://4ark.me/post/b6c7c0a2.html