What happens when you enter an address in the browser

1. You enter a URL into the browser

It all starts here:

2. The browser looks up the IP address for the domain name

The first step in the navigation is to figure out the IP address of the domain being visited. The DNS lookup proceeds as follows:

  • Browser cache - the browser caches DNS records for some time. Interestingly, the operating system does not tell the browser the time-to-live of each DNS record, so the browser caches records for a fixed duration (it varies between browsers, from 2 to 30 minutes).
  • Operating system cache - if the browser cache does not contain the desired record, the browser makes a system call (gethostbyname on Windows). The operating system has its own cache.
  • Router cache - the request continues on to your router, which typically has its own DNS cache.
  • ISP DNS cache - the next place checked is the cache of the ISP's DNS server. It has a cache too, naturally.
  • Recursive search - your ISP's DNS server begins a recursive search, from the root name server, through the .com top-level name server, to Facebook's name server. Normally, the DNS server will already have the names of the .com name servers in its cache, so a hit to the root name server is not necessary.
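The lookup order above can be sketched as a chain of caches with a fallback. This is only an illustration under stated assumptions: the `TtlCache` class, the `resolve` function, and the idea of passing the recursive search in as a plain callable are all hypothetical names chosen for this sketch, not how any real browser or resolver is written.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TtlCache:
    """One cache layer that remembers answers for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries: Dict[str, Tuple[str, float]] = {}  # name -> (ip, expiry)

    def get(self, name: str) -> Optional[str]:
        entry = self.entries.get(name)
        if entry is None:
            return None
        ip, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[name]  # stale record: forget it
            return None
        return ip

    def put(self, name: str, ip: str) -> None:
        self.entries[name] = (ip, self.clock() + self.ttl)


def resolve(name: str, layers: List[TtlCache],
            recursive_lookup: Callable[[str], str]) -> str:
    """Ask each cache in order; on a full miss, run the recursive lookup."""
    for i, layer in enumerate(layers):
        ip = layer.get(name)
        if ip is not None:
            # Refill the layers that were checked earlier and missed.
            for earlier in layers[:i]:
                earlier.put(name, ip)
            return ip
    ip = recursive_lookup(name)
    for layer in layers:
        layer.put(name, ip)
    return ip
```

A caller would build one `TtlCache` per layer (browser, operating system, router, ISP) and pass them to `resolve` in that order, so the expensive recursive search runs only when every layer misses.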

 

Here is a diagram of what a recursive DNS search looks like:

[Figure: An example of theoretical DNS recursion]

 

3. The browser sends an HTTP request to the Web server


You can be pretty sure that Facebook's home page will not be served from the browser cache, because dynamic pages expire either very quickly or immediately (the expiry date is set in the past).

Therefore, the browser sends the request to the Facebook server:

GET http://facebook.com/ HTTP/1.1
Accept: application/x-ms-application, image/jpeg, application/xaml+xml, [...]
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; [...]
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Host: facebook.com
Cookie: datr=1265876274-[...]; locale=en_US; lsd=WW[...]; c_user=2101[...]

The GET request names the URL to fetch: "http://facebook.com/". The browser identifies itself (the User-Agent header) and states what types of responses it will accept (the Accept and Accept-Encoding headers). The Connection header asks the server to keep the TCP connection open for further requests.

The request also contains the cookies that the browser holds for this domain. As you probably already know, cookies are key-value pairs that track the state of a website between different page requests. Cookies store things such as the name of the logged-in user, a secret value assigned to the user by the server, and some of the user's settings. Cookies are stored in a text file on the client and sent to the server with every request.
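To make the structure of such a request concrete, here is a minimal sketch that splits a request head into its request line, headers, and cookies using Python's standard `http.cookies` module. The `parse_request` helper and the sample request (with its `ExampleBrowser` agent and `theme` cookie) are made up for illustration; it ignores folded headers and request bodies.

```python
from http.cookies import SimpleCookie


def parse_request(raw: str):
    """Split a raw HTTP request head into (method, target, headers, cookies)."""
    lines = raw.strip().splitlines()
    method, target, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # The Cookie header is a "; "-separated list of name=value pairs.
    jar = SimpleCookie()
    jar.load(headers.get("cookie", ""))
    cookies = {key: morsel.value for key, morsel in jar.items()}
    return method, target, headers, cookies


# Hypothetical request, shaped like the one shown above.
sample_request = (
    "GET http://example.com/ HTTP/1.1\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "Connection: Keep-Alive\r\n"
    "Host: example.com\r\n"
    "Cookie: locale=en_US; theme=dark\r\n"
)

method, target, headers, cookies = parse_request(sample_request)
print(method, target)
print(cookies)
```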

4. The Facebook server responds with a permanent redirect

This is the response that the Facebook server sent back to the browser:

HTTP/1.1 301 Moved Permanently
Cache-Control: private, no-store, no-cache, must-revalidate, post-check=0,
      pre-check=0
Expires: Sat, 01 Jan 2000 00:00:00 GMT
Location: http://www.facebook.com/
P3P: CP="DSP LAW"
Pragma: no-cache
Set-Cookie: made_write_conn=deleted; expires=Thu, 12 Feb 2009 05:09:50 GMT;
      path=/; domain=.facebook.com; httponly
Content-Type: text/html; charset=utf-8
X-Cnection: close
Date: Fri, 12 Feb 2010 05:09:51 GMT
Content-Length: 0

The server responded with a 301 Moved Permanently response, telling the browser to go to "http://www.facebook.com/" instead of "http://facebook.com/".

There are interesting reasons why the server insists on the redirect instead of immediately responding with the page the user wants to see.

One reason has to do with search engine rankings. If there are two URLs for the same page, say http://www.igoro.com/ and http://igoro.com/, a search engine may consider them to be two different sites, each with fewer incoming links and therefore a lower ranking. Search engines understand permanent redirects (301) and will combine the incoming links from both sources into a single ranking.

Also, multiple URLs for the same content are not cache-friendly. When a piece of content has several names, it will potentially appear several times in caches.
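The decision a client makes on seeing such a response can be sketched as follows. This is a simplified illustration, not a browser's actual logic: `parse_response_head` and `redirect_target` are hypothetical helpers, folded header lines are not handled, and the set of redirect status codes is the commonly used one.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def parse_response_head(raw: str):
    """Return (status_code, headers) from a raw response head."""
    lines = raw.strip().splitlines()
    status = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # blank line: end of headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers


def redirect_target(current_url: str, status: int, headers: dict):
    """Return the URL to request next, or None if this is not a redirect."""
    if status in REDIRECT_CODES and "location" in headers:
        # Location may be relative, so resolve it against the current URL.
        return urljoin(current_url, headers["location"])
    return None


sample_301 = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: http://www.facebook.com/\r\n"
    "Content-Length: 0\r\n"
)
status, headers = parse_response_head(sample_301)
print(redirect_target("http://facebook.com/", status, headers))
```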

5. The browser follows the redirect

The browser now knows that "http://www.facebook.com/" is the correct URL to go to, so it sends out another GET request:

GET http://www.facebook.com/ HTTP/1.1
Accept: application/x-ms-application, image/jpeg, application/xaml+xml, [...]
Accept-Language: en-US
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; [...]
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Cookie: lsd=XW[...]; c_user=21[...]; x-referer=[...]
Host: www.facebook.com

The meaning of the headers is the same as for the first request.

6. The server "handles" the request

The server receives the GET request, processes it, and sends back a response.

This may seem like a straightforward task, but in fact a lot of interesting things happen here, even on a simple site like my blog, let alone on a massively scalable site like Facebook.

  • Web server software
    The web server software (such as IIS or Apache) receives the HTTP request and decides which request handler should be executed to handle it. A request handler is a program (in ASP.NET, PHP, Ruby, ...) that reads the request and generates the HTML for the response.

    In the simplest case, the request handlers are stored in a file hierarchy whose structure mirrors the URL structure, so that, for example, the URL http://example.com/folder1/page1.aspx maps to the file /httpdocs/folder1/page1.aspx. The web server software can also be configured so that URLs are mapped to request handlers manually, so the public URL of page1.aspx could be http://example.com/folder1/page1.

  • Request handler
    The request handler reads the request, its parameters, and its cookies. It may read and update some data stored on the server. Then the request handler generates an HTML response.
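The two mapping styles described above (mirroring the URL path in the file hierarchy, and manual routes) can be sketched in a few lines. The `/httpdocs` root and the `page1` route come from the example in the text; the `MANUAL_ROUTES` table and `handler_file_for` function are hypothetical names, and real web servers do considerably more than this.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

DOCUMENT_ROOT = PurePosixPath("/httpdocs")

# Manually configured routes: public path -> handler path.
MANUAL_ROUTES = {
    "/folder1/page1": "/folder1/page1.aspx",
}


def handler_file_for(url: str) -> PurePosixPath:
    """Map a request URL to the handler file that should process it."""
    path = urlsplit(url).path or "/"
    path = MANUAL_ROUTES.get(path, path)  # a manual route wins if present
    # Drop the root marker and any ".." so the result stays under the root.
    parts = [p for p in PurePosixPath(path).parts if p not in ("/", "..")]
    return DOCUMENT_ROOT.joinpath(*parts)


print(handler_file_for("http://example.com/folder1/page1.aspx"))
print(handler_file_for("http://example.com/folder1/page1"))
```

Both calls resolve to the same handler file, which is the point of the manual mapping: the public URL no longer has to expose the file name.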

One interesting difficulty that every dynamic website faces is how to store data. Smaller sites often have a single SQL database to store their data, but sites that store a large amount of data and/or have many visitors have to find a way to split the database across multiple machines. Solutions include sharding (splitting a table across multiple databases based on the primary key), replication, and the use of simplified databases with weakened consistency semantics.
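As a rough illustration of sharding by primary key, the sketch below hashes the key and uses the result to pick one of several databases. The shard names and the choice of CRC32 with a modulo are assumptions made for this example; production systems often prefer schemes such as consistent hashing so that adding a shard does not remap most keys.

```python
import zlib

# Hypothetical database names, one per machine.
SHARDS = ["db0", "db1", "db2", "db3"]


def shard_for(primary_key) -> str:
    """Pick the database that stores the row with this primary key."""
    # CRC32 is stable across runs and machines, unlike Python's built-in
    # hash() for strings, so every server agrees on the placement.
    digest = zlib.crc32(str(primary_key).encode("utf-8"))
    return SHARDS[digest % len(SHARDS)]


for user_id in (0, 1, 2101):
    print(user_id, "->", shard_for(user_id))
```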

One technique for keeping data updates cheap is to defer some of the work to a batch job. For example, Facebook has to update the news feed in a timely fashion, but the data backing the "People you may know" feature may only need to be updated nightly (this is my guess; I don't actually know how they implement the feature). Batch job updates leave some less important data stale, but they can make data updates much faster and simpler.

7. The server sends back an HTML response

Here is the response that the server generated and sent back:

HTTP/1.1 200 OK
Cache-Control: private, no-store, no-cache, must-revalidate, post-check=0,
    pre-check=0
Expires: Sat, 01 Jan 2000 00:00:00 GMT
P3P: CP="DSP LAW"
Pragma: no-cache
Content-Encoding: gzip
Content-Type: text/html; charset=utf-8
X-Cnection: close
Transfer-Encoding: chunked
Date: Fri, 12 Feb 2010 09:05:55 GMT

2B3 
Tn of @ [...]

The entire response is 36 kB, most of it in the byte blob at the end, which I trimmed.

The Content-Encoding header tells the browser that the response body is compressed with the gzip algorithm. After decompressing the blob, you see the HTML you would expect:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"
      lang="en" id="facebook" class="no_js">
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-language" content="en" />
...

In addition to compression, the headers specify whether and how to cache the page, any cookies to set (none in this response), privacy information, and so on.

Notice the header that sets Content-Type to text/html. This header instructs the browser to render the response content as HTML rather than, say, downloading it as a file. The browser uses the header to decide how to interpret the response, but it will also consider other factors, such as the extension of the URL.
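The Transfer-Encoding: chunked and Content-Encoding: gzip headers in the response above imply two decoding steps before the HTML is readable: reassemble the chunks (each is preceded by its size in hexadecimal), then decompress. The sketch below shows both under simplifying assumptions; `decode_chunked` and `decode_body` are hypothetical helpers that ignore chunk extensions and trailers and expect lower-cased header names.

```python
import gzip


def decode_chunked(body: bytes) -> bytes:
    """Reassemble a chunked body: <hex size>CRLF<data>CRLF ... 0CRLFCRLF."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size line is hexadecimal; drop any ";extension" suffix.
        size = int(body[pos:line_end].split(b";")[0], 16)
        if size == 0:
            break  # the zero-length chunk marks the end of the body
        start = line_end + 2
        out += body[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data
    return bytes(out)


def decode_body(body: bytes, headers: dict) -> bytes:
    """Undo the transfer encoding first, then the content encoding."""
    if headers.get("transfer-encoding", "").lower() == "chunked":
        body = decode_chunked(body)
    if headers.get("content-encoding", "").lower() == "gzip":
        body = gzip.decompress(body)
    return body
```

The order matters: chunking is applied to the already-compressed bytes on the wire, so it has to be removed before gzip decompression can succeed.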

8. The browser begins rendering the HTML

Even before the browser has received the entire HTML document, it begins rendering the website.

9. The browser sends requests for objects embedded in the HTML

As the browser renders the HTML, it notices tags that require fetching other URLs. The browser sends a GET request to retrieve each of these files.

Here are a few of the URLs that my visit to facebook.com retrieved:

  • Images
    http://static.ak.fbcdn.net/rsrc.php/z12E0/hash/8q2anwu7.gif
    http://static.ak.fbcdn.net/rsrc.php/zBS5C/hash/7hwy7at6.gif
    ...
  • CSS style sheets
    http://static.ak.fbcdn.net/rsrc.php/z448Z/hash/2plh8s4n.css
    http://static.ak.fbcdn.net/rsrc.php/zANE1/hash/cvtutcee.css
    ...
  • JavaScript files
    http://static.ak.fbcdn.net/rsrc.php/zEMOA/hash/c8yzb6ub.js
    http://static.ak.fbcdn.net/rsrc.php/z6R9L/hash/cq2lgbs8.js
    ...
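A browser finds URLs like these by scanning the HTML for tags that reference external files. The sketch below does a much-simplified version of that scan with Python's standard `html.parser`; the `ResourceCollector` class, the `embedded_urls` helper, and the sample page are invented for illustration, and only img, script, and stylesheet link tags are considered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect the URLs of images, scripts, and style sheets in a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.urls.append(urljoin(self.base_url, attrs["src"]))
        elif (tag == "link" and attrs.get("href")
              and (attrs.get("rel") or "").lower() == "stylesheet"):
            self.urls.append(urljoin(self.base_url, attrs["href"]))


def embedded_urls(html: str, base_url: str):
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return collector.urls


# Hypothetical page; ordinary <a href> links are not fetched automatically.
sample_page = (
    '<html><head><link rel="stylesheet" href="/css/site.css">'
    '<script src="http://static.example.net/app.js"></script></head>'
    '<body><img src="img/logo.gif" /><a href="/about">About</a></body></html>'
)
for url in embedded_urls(sample_page, "http://www.example.com/"):
    print(url)
```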

Each of these URLs goes through a process similar to the one the HTML page went through. So the browser looks up the domain name in DNS, sends a request to the URL, follows redirects, and so on.

However, static files, unlike dynamic pages, allow the browser to cache them. Some of the files may be served from the cache without contacting the server at all. The browser knows how long to cache a particular file because the response that returned the file contained an Expires header. Additionally, each response may also contain an ETag header that works like a version number: if the browser sees an ETag for a version of the file it already has, it can stop the transfer immediately.
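The caching decision can be sketched as two small checks. Note one deliberate substitution: rather than aborting a transfer when a known ETag shows up, the sketch uses the standard conditional-request approach, where the client sends the stored ETag in an If-None-Match header and the server can answer 304 Not Modified. The function names and the lower-cased header dictionary are assumptions of this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(cached_headers: dict, now: datetime) -> bool:
    """True if the cached copy may be used without contacting the server."""
    expires = cached_headers.get("expires")
    if not expires:
        return False
    try:
        return now < parsedate_to_datetime(expires)
    except (TypeError, ValueError):
        return False  # unparseable date: treat the copy as expired


def revalidation_headers(cached_headers: dict) -> dict:
    """Headers to add when asking the server whether the copy is current."""
    etag = cached_headers.get("etag")
    return {"If-None-Match": etag} if etag else {}


# Hypothetical cached response headers.
cached = {"expires": "Fri, 12 Feb 2010 09:05:55 GMT", "etag": '"abc123"'}
now = datetime(2010, 2, 12, 10, 0, 0, tzinfo=timezone.utc)
if not is_fresh(cached, now):
    print(revalidation_headers(cached))
```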

Can you guess what "fbcdn.net" in the URLs stands for? A safe bet is that it means "Facebook content delivery network". Facebook uses a content delivery network (CDN) to distribute static content: images, style sheets, and JavaScript files. So the files are copied to many machines across the globe.

Static content often represents the bulk of a site's bandwidth and can easily be replicated across a CDN. Often, a site will use a third-party CDN provider instead of operating a CDN itself. For example, Facebook's static files are hosted by Akamai, the largest CDN provider.

As a demonstration, when you try to ping static.ak.fbcdn.net, you get a response from an akamai.net server. Also, interestingly, if you ping the host a few times, you may get responses from different servers, which hints at the load balancing that happens behind the scenes.

10. The browser sends further asynchronous (AJAX) requests

In the spirit of Web 2.0, even after the page is rendered, the client continues to communicate with the server.

For example, Facebook chat continues to update the list of your logged-in friends as they come and go. To update that list, the JavaScript executing in your browser has to send an asynchronous request to the server. The asynchronous request is a programmatically constructed GET or POST request that goes to a special URL. In the Facebook example, the client sends a POST request to http://www.facebook.com/ajax/chat/buddy_list.php to fetch the list of your friends who are online.

This pattern is sometimes referred to as "AJAX", which stands for "Asynchronous JavaScript and XML", even though there is no particular reason why the server has to format the response as XML. For example, Facebook returns snippets of JavaScript code in response to asynchronous requests.

Among other things, an HTTP debugging tool lets you view the asynchronous requests sent by your browser. In fact, not only can you observe the requests passively, you can also modify and resend them. The fact that AJAX requests are this easy to "spoof" causes a lot of grief to developers of online games with scoreboards. (Obviously, please don't cheat that way.)

Facebook chat provides an example of an interesting problem with AJAX: pushing data from the server to the client. Since HTTP is a request-response protocol, the chat server cannot push new messages to the client. Instead, the client has to poll the server every few seconds to see whether any new messages have arrived.

Long polling is an interesting technique for decreasing the load on the server in these types of scenarios. If the server does not have any new messages when polled, it simply does not send a response back. If a message for this client arrives within the timeout period, the server finds the outstanding request and returns the message with the response.
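The server side of long polling can be modelled with a blocking queue: the handler for a poll request waits until a message arrives or the timeout passes. This is a toy model under stated assumptions, not Facebook's implementation; the `Mailbox` class is hypothetical, and a thread-per-request design like this would not scale to many simultaneous clients.

```python
import queue
import threading


class Mailbox:
    """Per-client message queue held by a hypothetical chat server."""

    def __init__(self):
        self._messages = queue.Queue()

    def deliver(self, message: str) -> None:
        """Called when another user sends this client a message."""
        self._messages.put(message)

    def long_poll(self, timeout: float):
        """Hold the request open until a message arrives or time runs out.

        Returns the message, or None to tell the client to poll again.
        """
        try:
            return self._messages.get(timeout=timeout)
        except queue.Empty:
            return None


box = Mailbox()
# Simulate a message arriving shortly after the poll request starts.
threading.Timer(0.1, box.deliver, args=["hello"]).start()
print(box.long_poll(timeout=5.0))
```

Compared with polling every few seconds, the client makes one request per message (or per timeout) instead of many empty round trips.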


Origin www.cnblogs.com/kezan/p/12445903.html