A Detailed Summary of Browser Principles (Interview Questions)

1. How many processes does Chrome need to start to open a page? What are they?

Starting from a closed browser, opening one page requires at least four processes: one network process, one browser process, one GPU process, and one renderer process. When further tabs are opened, the browser, network, and GPU processes are shared and are not started again. If two pages belong to the same site and page B was opened from page A, they share one renderer process; otherwise a new renderer process is started.
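
The sharing rule above can be sketched as a small Python model. This is only an illustration under simplifying assumptions: `site_of` treats scheme plus hostname as the "site", which is cruder than Chrome's real definition, and the `Tab` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlsplit


@dataclass
class Tab:
    url: str
    opener: Optional["Tab"] = None  # the tab this one was opened from, if any


def site_of(url: str) -> str:
    # Simplification: scheme + hostname stands in for "same site".
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.hostname}"


def count_processes(tabs: List[Tab]) -> int:
    shared = 3  # browser, network and GPU processes are started only once
    renderers = 0
    for tab in tabs:
        shares_renderer = (
            tab.opener is not None
            and site_of(tab.opener.url) == site_of(tab.url)
        )
        if not shares_renderer:
            renderers += 1  # a new renderer process is needed
    return shared + renderers


a = Tab("https://example.com/a")
b = Tab("https://example.com/b", opener=a)  # same site, opened from a
c = Tab("https://other.example.org/")       # unrelated tab

print(count_processes([a]))        # a single page
print(count_processes([a, b, c]))  # b reuses a's renderer, c gets its own
```

Plug-in processes are left out of the count, since they only exist when a page actually uses a plug-in.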

The latest Chrome includes one main browser process, one GPU process, one network process, multiple renderer processes, and multiple plug-in processes.

  • Browser process: Mainly responsible for displaying the UI, handling user interaction, and managing child processes; it also provides storage and other functions.
  • Renderer process: Its core task is to turn HTML, CSS, and JavaScript into a page the user can interact with. The layout engine Blink and the JavaScript engine V8 both run in this process. By default, Chrome creates one renderer process per tab. For security reasons, renderer processes run in a sandbox.
  • GPU process: Chrome had no GPU process when it was first released. The GPU was originally used only to implement 3D CSS effects, but later both web pages and Chrome's own UI were drawn with the GPU, which made it a general requirement for the browser. In the end, Chrome added a GPU process to its multi-process architecture.
  • Network process: Mainly responsible for loading a page's network resources. It used to run as a module inside the browser process and only recently became a separate process.
  • Plug-in process: Mainly responsible for running plug-ins. Because plug-ins crash easily, they are isolated in their own process so that a plug-in crash does not affect the browser or the page.

2. How can we ensure that page files are delivered to the browser completely?

Data on the Internet is transmitted in packets. To travel across the Internet, a packet must follow the Internet Protocol (IP). Every device online has a unique address. The address is just a number, and knowing that specific address is enough to send information to it.

When a packet is sent from host A to host B, host B's IP address is attached to the packet before transmission so that it can be routed correctly. Host A's own IP address is attached as well, so that host B can reply to host A. This extra information is placed in a data structure called the IP header. The IP header sits at the start of the IP packet and contains the IP version, source IP address, destination IP address, time to live (TTL), and other information.

IP is a low-level protocol that is only responsible for delivering packets to the other computer. The other computer does not know which program should receive the packet: the browser, or a game such as Honor of Kings? Protocols that can address individual applications are therefore built on top of IP. The most common are the User Datagram Protocol (UDP) and the Transmission Control Protocol (TCP).

The basic transmission process (using UDP as the example) is:

  1. The upper layer hands the data packet to the transport layer.
  2. The transport layer prepends a UDP header to form a new UDP packet and hands it to the network layer.
  3. The network layer prepends an IP header to form a new IP packet and hands it to the layer below.
  4. The packet is transmitted to the network layer of host B, where host B strips the IP header and hands the data portion to the transport layer.
  5. At the transport layer, the UDP header is stripped, and the data portion is handed to the upper-layer application according to the port number in the UDP header.
  6. Finally, the data arrives at the upper-layer application on host B.
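
The wrapping and unwrapping in these steps can be sketched in Python. The UDP header below uses the real four 16-bit fields, but the IP header is a deliberately simplified stand-in (version, TTL, and the two addresses only), and the function names, ports, and addresses are made up for the example.

```python
import socket
import struct

UDP_HEADER = struct.Struct("!HHHH")   # source port, destination port, length, checksum
IP_HEADER = struct.Struct("!BB4s4s")  # simplified: version, TTL, source IP, destination IP


def udp_wrap(data: bytes, src_port: int, dst_port: int) -> bytes:
    length = UDP_HEADER.size + len(data)
    # The checksum is left at 0 ("not computed") to keep the sketch short.
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + data


def ip_wrap(segment: bytes, src_ip: str, dst_ip: str, ttl: int = 64) -> bytes:
    header = IP_HEADER.pack(4, ttl, socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    return header + segment


def ip_unwrap(packet: bytes):
    _version, _ttl, src, dst = IP_HEADER.unpack(packet[:IP_HEADER.size])
    return socket.inet_ntoa(src), socket.inet_ntoa(dst), packet[IP_HEADER.size:]


def udp_unwrap(segment: bytes):
    _src_port, dst_port, length, _checksum = UDP_HEADER.unpack(segment[:UDP_HEADER.size])
    return dst_port, segment[UDP_HEADER.size:length]


# Host A, steps 1-3: application data -> UDP packet -> IP packet
packet = ip_wrap(udp_wrap(b"hello", 50000, 8080), "192.0.2.1", "192.0.2.2")

# Host B, steps 4-6: strip the IP header, then the UDP header, then dispatch by port
src_ip, dst_ip, segment = ip_unwrap(packet)
port, data = udp_unwrap(segment)
print(src_ip, dst_ip, port, data)
```

The destination port recovered in `udp_unwrap` is what lets host B decide which application receives the data.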

3. What is the difference between UDP and TCP?

  • TCP numbers each data segment it transmits; UDP does not.
  • TCP is reliable; UDP is not.
  • TCP is connection-oriented; UDP is connectionless.
  • TCP carries higher overhead and uses a virtual circuit; UDP is connectionless.
  • A TCP sender confirms that the receiver has received each data segment through acknowledgments (the connection itself is set up with a three-way handshake).
  • TCP uses windowing and flow control.

| Characteristic | TCP | UDP |
| --- | --- | --- |
| Connection | Connection-oriented | Connectionless |
| Transmission reliability | Reliable | Unreliable |
| Typical use | Transferring large amounts of data | Transferring small amounts of data |
| Speed | Slow | Fast |

4. What does the TCP transmission process look like in detail?

Establishing the TCP connection with a three-way handshake
  • First handshake: the client sends a connection request segment with SYN set to 1 and Sequence Number x. The client then enters the SYN_SENT state and waits for the server's acknowledgment.
  • Second handshake: the server receives the client's SYN segment and must acknowledge it, so it sets Acknowledgment Number to x + 1 (Sequence Number + 1). At the same time it sends its own SYN request, with SYN set to 1 and Sequence Number y. The server puts all of this into a single segment (the SYN+ACK segment) and sends it to the client; the server then enters the SYN_RECV state.
  • Third handshake: the client receives the server's SYN+ACK segment, sets Acknowledgment Number to y + 1, and sends an ACK segment to the server. Once this segment has been sent, both client and server enter the ESTABLISHED state and the three-way handshake is complete.
    After the three-way handshake is complete, the client and the server can begin transmitting data.
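
The sequence and acknowledgment numbers exchanged in the three steps can be traced with a short simulation. It models only the fields mentioned above; the `Segment` class and the function name are hypothetical, and no real network traffic is involved.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    syn: int = 0      # SYN flag
    ack: int = 0      # ACK flag
    seq: int = 0      # Sequence Number
    ack_num: int = 0  # Acknowledgment Number


def three_way_handshake(x: int, y: int):
    states = {"client": "CLOSED", "server": "LISTEN"}
    trace = []

    # First handshake: client -> server, SYN=1, seq=x
    syn = Segment(syn=1, seq=x)
    states["client"] = "SYN_SENT"
    trace.append(syn)

    # Second handshake: server -> client, SYN=1, ACK=1, seq=y, ack=x+1
    syn_ack = Segment(syn=1, ack=1, seq=y, ack_num=syn.seq + 1)
    states["server"] = "SYN_RECV"
    trace.append(syn_ack)

    # Third handshake: client -> server, ACK=1, ack=y+1
    ack = Segment(ack=1, seq=syn_ack.ack_num, ack_num=syn_ack.seq + 1)
    states["client"] = states["server"] = "ESTABLISHED"
    trace.append(ack)

    return trace, states


trace, states = three_way_handshake(x=100, y=300)
for segment in trace:
    print(segment)
print(states)
```

In a real connection x and y are initial sequence numbers chosen by each side; fixed values are used here only to make the +1 arithmetic easy to follow.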

ACK: this flag indicates that the acknowledgment field is valid, that is, that the TCP acknowledgment number mentioned above is carried in the segment. It has two values, 0 and 1: 1 means the acknowledgment field is valid, 0 means it is not.
TCP specifies that the acknowledgment number is valid only when ACK = 1, and that every segment sent after the connection is established must have ACK set to 1.
SYN (SYNchronization): used to synchronize sequence numbers when a connection is established. SYN = 1 with ACK = 0 indicates a connection request segment. If the other side agrees to establish the connection, it sets SYN = 1 and ACK = 1 in its response segment. SYN = 1 therefore indicates either a connection request or a connection acceptance.
FIN (finis): means "finished" and is used to release a connection. FIN = 1 indicates that the sender of the segment has finished transmitting its data and asks for the connection to be released.
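
As a rough illustration, the flag combinations described above can be decoded from the flags field of a TCP header. The bit values for FIN, SYN, and ACK are the standard ones; the `describe` function and its wording are made up for this sketch.

```python
FIN, SYN, ACK = 0x01, 0x02, 0x10  # bit masks within the TCP header's flags field


def describe(flags: int) -> str:
    syn = bool(flags & SYN)
    ack = bool(flags & ACK)
    fin = bool(flags & FIN)
    if syn and not ack:
        return "connection request"
    if syn and ack:
        return "connection accepted"
    if fin:
        return "sender finished, asks to release the connection"
    if ack:
        return "acknowledgment"
    return "other"


print(describe(SYN))        # first handshake
print(describe(SYN | ACK))  # second handshake
print(describe(ACK))        # third handshake and ordinary data segments
print(describe(FIN | ACK))  # start of connection release
```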

Sending the HTTP request; the server processes it and returns the response

After the TCP connection is established, the browser can send a request to the server over HTTP/HTTPS. When the server receives the request, it parses the request headers. If the headers carry cache validation information such as If-None-Match or If-Modified-Since, the server checks whether the cached copy is still valid: if it is, the server returns status code 304; if not, it returns the resource with status code 200.

Closing the TCP connection
  • First wave: host 1 (which may be either the client or the server) sets the Sequence Number and Acknowledgment Number and sends a FIN segment to host 2. Host 1 then enters the FIN_WAIT_1 state, which means host 1 has no more data to send to host 2.
  • Second wave: host 2 receives the FIN segment from host 1 and replies with an ACK segment whose Acknowledgment Number is the received Sequence Number plus 1. Host 1 enters the FIN_WAIT_2 state. Host 2 is telling host 1, "I agree to your close request."
  • Third wave: host 2 sends a FIN segment to host 1 to request closing the connection, and host 2 enters the LAST_ACK state.
  • Fourth wave: host 1 receives the FIN segment from host 2 and sends an ACK segment back, then enters the TIME_WAIT state. Host 2 closes the connection once it receives host 1's ACK. If host 1 has still received nothing after waiting 2MSL, host 2 has closed normally, and host 1 can close the connection as well.
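
The state changes on both hosts can be followed with a tiny state machine. This is a sketch, not a full TCP implementation: it covers only the path described above, the event names are invented, and CLOSE_WAIT (the standard state host 2 is in between the second and third wave) is added even though the text does not name it.

```python
# (current state, event) -> next state
TRANSITIONS = {
    # Host 1: the side that starts closing
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",  # host 1 replies with the final ACK
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    # Host 2: the side that is asked to close
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # host 2 replies with an ACK
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    history = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        history.append(state)
    return history


host1 = run(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"])
host2 = run(["recv FIN", "send FIN", "recv ACK"])
print(" -> ".join(host1))
print(" -> ".join(host2))
```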

5. Why do many sites open so quickly the second time?

The main reason is that some time-consuming data is cached during the first page load.
So which data is cached?

DNS cache

The browser keeps a local mapping between domain names and their IP addresses, so DNS resolution is fast on later visits.
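
A minimal sketch of such a mapping is shown below. The `DnsCache` class, the 60-second lifetime, and the injected `resolver` function are assumptions made for the example; real browsers follow the lifetime supplied with the DNS record and have their own internal structures.

```python
import time


class DnsCache:
    def __init__(self, resolver, ttl_seconds=60.0, clock=time.monotonic):
        self._resolver = resolver  # function that performs the real (slow) lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # hostname -> (ip, expiry time)

    def lookup(self, hostname):
        now = self._clock()
        entry = self._entries.get(hostname)
        if entry is not None and entry[1] > now:
            return entry[0]                  # cache hit: no resolution needed
        ip = self._resolver(hostname)        # cache miss: resolve and remember
        self._entries[hostname] = (ip, now + self._ttl)
        return ip


calls = []


def fake_resolver(hostname):
    # Stand-in for a real DNS query so the example runs offline.
    calls.append(hostname)
    return "192.0.2.10"


cache = DnsCache(fake_resolver)
cache.lookup("example.com")  # first visit: resolver is called
cache.lookup("example.com")  # second visit: answered from the cache
print(len(calls))
```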

Memory Cache

This is the cache held in memory. In terms of priority, it is the first cache the browser tries. In terms of efficiency, it is the fastest cache to respond.
The memory cache is fast but also short-lived. It lives and dies with the renderer process: once the process ends, that is, once the tab is closed, the data in memory no longer exists.

Browser cache

[Figure: flow chart of the browser cache decision process]

The browser cache, also known as the HTTP cache, is divided into the strong cache and the negotiation cache. The strong cache has higher priority; the browser falls back to the negotiation cache only when the strong cache misses.

Strong Cache

The strong cache is controlled by two HTTP header fields, Expires and Cache-Control. When a request is made, the browser uses Expires and Cache-Control to decide whether the target resource hits the strong cache. On a hit, it takes the resource directly from the cache and does not communicate with the server again.

In the past, the strong cache was implemented with Expires. When the server returns a response, it writes the expiration time into the Expires field of the response headers, like this:

expires: Wed, 12 Sep 2019 06:12:18 GMT

As you can see, Expires is a timestamp. When we request the resource again, the browser first compares the local time with the Expires timestamp; if the local time is earlier than the expiration time, it takes the resource directly from the cache.

From this description it is not hard to spot the problem with Expires: its biggest weakness is its reliance on local time. If the server and client clocks are set differently, or if I simply change the client's clock by hand, Expires will not behave as expected.

Given the limitations of Expires, HTTP/1.1 added the Cache-Control field to take over its job. Whatever Expires can do, Cache-Control can do; what Expires cannot do, Cache-Control can do as well. Cache-Control can therefore be regarded as a complete replacement for Expires. In current front-end practice, the only reason to keep using Expires is backward compatibility.

cache-control: max-age=31536000

In Cache-Control, we control how long a resource stays valid through max-age. max-age is not a timestamp but a duration. In this example max-age is 31,536,000 seconds, meaning the resource is valid for 31,536,000 seconds (365 days), which neatly avoids the problems a timestamp brings.

Cache-Control is more accurate than Expires and also has higher priority. When Cache-Control and Expires both appear, Cache-Control takes precedence.
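
The freshness check and the precedence rule can be sketched as follows. This is a simplified illustration, not how any particular browser implements it: `is_fresh` is a hypothetical helper, header names are assumed to be lower-cased already, and only the max-age directive is considered.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers: dict, stored_at: datetime, now: datetime) -> bool:
    """Return True if the cached response may be used without contacting the server."""
    # Cache-Control: max-age is a duration counted from when the response was stored.
    match = re.search(r"max-age=(\d+)", headers.get("cache-control", ""))
    if match:
        return now - stored_at < timedelta(seconds=int(match.group(1)))
    # Expires is consulted only when max-age is absent; it is an absolute timestamp.
    expires = headers.get("expires")
    if expires:
        return now < parsedate_to_datetime(expires)
    return False


stored_at = datetime(2019, 9, 1, tzinfo=timezone.utc)
now = stored_at + timedelta(hours=1)

# Both present: max-age wins even though Expires is already in the past.
both = {"cache-control": "max-age=31536000", "expires": "Thu, 01 Jan 2015 00:00:00 GMT"}
print(is_fresh(both, stored_at, now))

# Expires only: fresh until the timestamp passes.
only_expires = {"expires": "Thu, 12 Sep 2019 06:12:18 GMT"}
print(is_fresh(only_expires, stored_at, now))
print(is_fresh(only_expires, stored_at, now + timedelta(days=30)))
```

Note that `now` in the Expires branch is the client's clock, which is exactly the dependence on local time described above.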

Negotiation Cache

The negotiation cache relies on communication between the server and the browser. Under the negotiation cache, the browser asks the server about the cached resource and then decides whether to issue a fresh request and download the full response, or to take the resource from the local cache. If the server indicates that the cached resource has not changed (Not Modified), the request is redirected to the browser cache; in this case the status code of the network request is 304.

The negotiation cache was implemented first with Last-Modified and later with ETag. Last-Modified is a timestamp; if the negotiation cache is enabled, it is returned in the response headers of the first request:

Last-Modified: Fri, 27 Oct 2017 06:35:57 GMT

On every subsequent request, we send a timestamp field called If-Modified-Since, whose value is exactly the Last-Modified value from the previous response:

If-Modified-Since: Fri, 27 Oct 2017 06:35:57 GMT

When the server receives this timestamp, it compares it with the resource's last modification time on the server to determine whether the resource has changed. If it has changed, the server returns a full response and adds the new Last-Modified value to the response headers; otherwise it returns a 304 response, and the response headers do not include a Last-Modified field.
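
The server-side comparison might look roughly like this. The `respond` function and its return shape are hypothetical; the sketch also truncates the modification time to whole seconds, because the HTTP date format carries no finer resolution.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(request_headers: dict, body: bytes, modified_at: datetime):
    """Return (status, response headers, body) for a resource last changed at modified_at (UTC)."""
    modified_at = modified_at.replace(microsecond=0)  # HTTP dates have one-second resolution
    since = request_headers.get("If-Modified-Since")
    if since is not None and modified_at <= parsedate_to_datetime(since):
        return 304, {}, b""  # not modified: no body, no Last-Modified
    last_modified = format_datetime(modified_at, usegmt=True)
    return 200, {"Last-Modified": last_modified}, body


modified_at = datetime(2017, 10, 27, 6, 35, 57, tzinfo=timezone.utc)

# First request: full response carrying Last-Modified.
status, headers, _ = respond({}, b"<html>...</html>", modified_at)
print(status, headers)

# Later request: the browser echoes the value back in If-Modified-Since.
conditional = {"If-Modified-Since": headers["Last-Modified"]}
print(respond(conditional, b"<html>...</html>", modified_at)[0])

# The file changes five seconds later: the server sends a full response again.
print(respond(conditional, b"<html>new</html>", modified_at + timedelta(seconds=5))[0])
```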

Last-Modified has some drawbacks. The two most common scenarios are:

  • We edit the file, but its content does not change. The server cannot tell whether we really changed the file; it still judges by the last edit time. So when the resource is requested again, it is treated as a new resource and triggers a full response: the resource is downloaded again when it should not be.
  • We modify the file too quickly (for example, the change takes 100 ms). Because If-Modified-Since can only detect time differences at one-second granularity, it does not notice the change: the resource should be downloaded again but is not.

Both scenarios point to the same bug: the server does not correctly detect changes to the file. To solve this problem, ETag appeared as a complement to Last-Modified.

An ETag is a unique identifier string that the server generates for each resource. The string is derived from the file content: as long as the content differs, the ETag differs, and vice versa. ETag can therefore detect file changes precisely.

Generating an ETag costs the server extra overhead and affects its performance; that is its drawback. Whether to enable ETag therefore has to be weighed case by case. As just mentioned, ETag is not a replacement for Last-Modified; it exists only as a complement and reinforcement.

ETag is more accurate than Last-Modified at detecting file changes and also has higher priority. When ETag and Last-Modified are both present, ETag takes precedence.
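
A content-based ETag and the precedence rule can be sketched like this. Hashing the body with SHA-256 is just one possible way to derive the identifier (servers use various schemes), the function names are hypothetical, and comparing Last-Modified values as plain strings is a simplification.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Assumption: the ETag is a truncated SHA-256 digest of the content, in quotes.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def validate(request_headers: dict, etag: str, last_modified: str) -> int:
    """Return 304 if the browser's copy is still valid, otherwise 200."""
    if "If-None-Match" in request_headers:
        # ETag takes precedence: If-Modified-Since is ignored when it is present.
        return 304 if request_headers["If-None-Match"] == etag else 200
    if "If-Modified-Since" in request_headers:
        return 304 if request_headers["If-Modified-Since"] == last_modified else 200
    return 200


old_body = b"body { color: red; }"
new_body = b"body { color: blue; }"
timestamp = "Fri, 27 Oct 2017 06:35:57 GMT"  # unchanged: both edits fell in the same second

request = {"If-None-Match": make_etag(old_body), "If-Modified-Since": timestamp}

print(validate(request, make_etag(old_body), timestamp))  # content unchanged
print(validate(request, make_etag(new_body), timestamp))  # content changed within the same second
```

The second call shows the case Last-Modified misses: the timestamp still matches, but the differing ETag forces a full response.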

Service Worker Cache

A Service Worker is a JavaScript thread that is independent of the main thread. It is detached from the browser window and therefore cannot access the DOM directly. Because of this independence, what a Service Worker does cannot interfere with the page's performance; behind the scenes it can help us implement offline caching, push messaging, network proxying, and other functions. The offline cache implemented with a Service Worker is called the Service Worker Cache.

The Service Worker life cycle has three stages: install, active, and working. Once a Service Worker is installed, it persists and only switches between active and working, unless we terminate it ourselves. This is an important prerequisite for using it for offline storage.

Push Cache

Push Cache refers to the cache that exists during the HTTP/2 server push stage. This is relatively new knowledge and its use is still in its infancy, but limited use does not make it unimportant: HTTP/2 is the trend. Although it is not yet widespread, I still hope you understand the key features of Push Cache:

  • Push Cache is the last line of defense. The browser consults Push Cache only when Memory Cache, HTTP Cache, and Service Worker Cache all miss.
  • Push Cache exists only for the duration of a session; when the session ends, the cache is released.
  • Different pages can share the same Push Cache as long as they share the same HTTP/2 connection.
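
The "last line of defense" behavior can be pictured as a lookup chain. This is a toy model with plain dictionaries standing in for the caches; the text only fixes Memory Cache as the first stop and Push Cache as the last, so the order of the two in between is an assumption, as are the `fetch` function and the sample URLs.

```python
def fetch(url, caches, network):
    """Try each cache in order; fall back to the network if every cache misses."""
    for name, cache in caches:
        if url in cache:
            return name, cache[url]
    return "network", network(url)


caches = [
    ("memory cache", {"/logo.png": b"logo"}),
    ("HTTP cache", {"/app.js": b"script"}),
    ("service worker cache", {}),
    ("push cache", {"/style.css": b"styles"}),  # consulted only after the others miss
]


def network(url):
    # Stand-in for a real request so the example runs offline.
    return b"downloaded " + url.encode()


print(fetch("/logo.png", caches, network))
print(fetch("/style.css", caches, network))
print(fetch("/new.html", caches, network))
```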



Origin www.cnblogs.com/LuckyWinty/p/11685723.html