[Computer Network - Application Layer] World Wide Web

1 Uniform Resource Locator URL

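A Uniform Resource Locator (URL) states where a resource is located and which protocol is used to access it.

Format: <protocol>://<host>:<port>/<path> (the port and path can sometimes be omitted; the protocol and host name are case-insensitive, while the path may be case-sensitive on some servers)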

For example:

  • Access a web server using the HTTP protocol: http://www.abc.com:80/dir/file1.htm
  • Download and upload files using the FTP protocol: ftp://ftp.abc.com:21/dir/file1.htm
  • Read a file on the local computer: file://localhost/d:/mydir/file1.zip
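
As a rough illustration of the format above, the sketch below splits a URL into its four parts with the standard `URL` class available in browsers and Node.js. The helper name `splitUrl` and the port 8080 are assumptions made for this example, not something from the original text.

```javascript
// Split a URL into the <protocol>, <host>, <port> and <path> parts.
// splitUrl is a hypothetical helper; URL is the standard WHATWG URL class.
function splitUrl(text) {
    const u = new URL(text);
    return {
        protocol: u.protocol.slice(0, -1), // drop the trailing ":"
        host: u.hostname,                  // normalized to lowercase
        port: u.port,                      // "" when omitted or equal to the scheme's default
        path: u.pathname,                  // "/" when the path is omitted
    };
}

console.log(splitUrl("http://www.abc.com:8080/dir/file1.htm"));
console.log(splitUrl("HTTP://WWW.ABC.COM"));
```

Note that the class reports an empty port when the URL uses the scheme's default (for example 80 for http), which is why a non-default port is used here.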

When the file name is omitted:

  • http://www.abc.com/dir/: The server has a preset default file that it serves when the file name is omitted. This URL effectively accesses /dir/index.html or /dir/default.htm
  • http://www.abc.com/: This URL effectively accesses /index.html or /default.htm
  • http://www.abc.com: This URL effectively accesses /index.html or /default.htm
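
The mapping above can be sketched as a small function. This is only an illustration: the list of default file names is configurable on real servers, and `DEFAULT_FILES` and `candidatePaths` are hypothetical names chosen for this example.

```javascript
// Hypothetical list of default documents; real servers let the administrator configure it.
const DEFAULT_FILES = ["index.html", "default.htm"];

// Return the paths a server might try for the path part of a URL.
function candidatePaths(urlPath) {
    const path = urlPath === "" ? "/" : urlPath;   // http://www.abc.com is treated like "/"
    if (!path.endsWith("/")) {
        return [path];                             // a file name was given, use it as-is
    }
    return DEFAULT_FILES.map((name) => path + name);
}

console.log(candidatePaths("/dir/"));
console.log(candidatePaths(""));
```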

2 World Wide Web Documents

There are three types of World Wide Web documents:

  • HyperText Markup Language (HTML): uses a variety of "tags" to describe the structure and content of web pages
  • Cascading Style Sheets (CSS): describes the visual style of web pages
  • JavaScript: a scripting language that controls the behavior of web pages

2.1 Hypertext Markup Language HTML

demo.html:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="UTF-8">
        <title>The simplest web page</title>
    </head>
    <body>
        <p>Hello world</p>
    </body>
</html>

2.2 Cascading Style Sheets CSS

demo.html:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="UTF-8">
        <title>The simplest web page</title>
        <link rel="stylesheet" type="text/css" href="demo.css" />
    </head>
    <body>
        <p class="pink">Hello world</p>
    </body>
</html>

demo.css:

.pink {
    /* Applies to every element with class="pink" */
    color: deeppink;
    font-size: 36px;
}

2.3 JavaScript

demo.html:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="UTF-8">
        <title>The simplest web page</title>
        <link rel="stylesheet" type="text/css" href="demo.css" />
        <script type="text/javascript" src="demo.js"></script>
    </head>
    <body>
        <p class="pink" id="myId">Hello world</p>
        <button type="button" onclick="myFunction()">Give it a like</button>
    </body>
</html>

demo.js:

function myFunction() {
    // Replace the paragraph text when the button is clicked
    document.getElementById("myId").innerHTML = "Thank you for the like";
}

3 Hypertext Transfer Protocol HTTP

HyperText Transfer Protocol (HTTP): defines how the browser (that is, the World Wide Web client process) requests World Wide Web documents from a World Wide Web server, and how the server transmits those documents to the browser. HTTP runs over TCP, and the server's default port is 80.

3.1 HTTP message format

HTTP is text-oriented: each field in a message is an ASCII string, and the fields have no fixed length.

Uniform Resource Identifier (URI): the name of a file that stores web page data, or the name of a CGI program, such as /dir1/file1.htm and /dir1/program1.cgi.

3.1.1 HTTP request message

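  • Request message format:
<method><space><URI><space><HTTP version>    // Request line: gives a rough idea of what is being requested
<header field name>:<field value>            // Message headers: additional information about the request
...
...
<header field name>:<field value>
<blank line>                                 // Marks the end of the headers
<message body>                               // Message body: data sent from the client to the server; not every request message has one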
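
To make the layout concrete, here is a minimal sketch that assembles a request message as text. `buildRequest` is a hypothetical helper written for this example, and the host and headers passed to it are illustrative values. Lines end with CRLF, and an empty line separates the headers from the optional body.

```javascript
// Assemble an HTTP request message as a string (hypothetical helper).
function buildRequest(method, uri, headers, body = "") {
    const lines = [`${method} ${uri} HTTP/1.1`];          // request line
    for (const [name, value] of Object.entries(headers)) {
        lines.push(`${name}: ${value}`);                  // one header field per line
    }
    // An empty line ends the headers; the body (if any) follows it.
    return lines.join("\r\n") + "\r\n\r\n" + body;
}

const message = buildRequest("GET", "/dir1/file1.htm", {
    Host: "www.abc.com",
    Accept: "*/*",
});
console.log(message);
```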

[Note] Each request message can contain only one URI, so only one file can be fetched per request. To fetch several files, the client must send a separate request for each one. For example, if a web page contains 3 pictures, fetching the page and its pictures takes 4 requests in total.

  • The main HTTP methods:
Method     Meaning
GET        Requests the information identified by the URI. If the URI names a file, the content of the file is returned; if it names a CGI program, the output of the program is returned.
POST       Sends data from the client to the server. Typically used to submit the data filled in on a form.
HEAD       Returns only the headers of the HTTP message, not the data. Used to obtain attributes such as the last update time of a file.
CONNECT    Used with proxy servers.

3.1.2 HTTP response message

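  • Response message format:
<HTTP version><space><status code><space><reason phrase>    // Status line
<header field name>:<field value>                           // Message headers
...
...
<header field name>:<field value>
<blank line>                                                // Marks the end of the headers
<message body>                                              // Message body: data sent from the server to the client; not every response message has one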
  • HTTP status codes (33 in total):
Status code    Meaning
1xx            Informational: reports the processing progress and status of the request
2xx            Success: for example, the request was received, understood, or accepted
3xx            Redirection: further action is required to complete the request
4xx            Client error: for example, bad syntax in the request
5xx            Server error: for example, a server failure prevented the request from being completed
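
As a small sketch of how a client might use the status line and the table above, the code below splits a status line and maps the first digit of the status code to its class. The function names are hypothetical and the class labels are shortened versions of the meanings listed above.

```javascript
// Split a status line such as "HTTP/1.1 200 OK" into its three parts.
function parseStatusLine(line) {
    const [version, code, ...phrase] = line.split(" ");
    return { version, code: Number(code), phrase: phrase.join(" ") };
}

// Map a status code to its class using the first digit.
function statusClass(code) {
    const classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    };
    return classes[Math.floor(code / 100)] || "unknown";
}

const status = parseStatusLine("HTTP/1.1 404 Not Found");
console.log(status.code, statusClass(status.code));
```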

3.2 Working process of HTTP

3.2.1 HTTP/1.0

HTTP/1.0 uses non-persistent connections. In this mode, each time the browser requests a file it must establish a new TCP connection with the server, and the connection is closed immediately after the response is received.

[Figure: HTTP/1.0 non-persistent connection]

  • Each requested document costs two RTTs: one to establish the TCP connection and one for the request and its response. If a web page references many objects (such as pictures), each of them costs another 2 RTTs.
  • To reduce the delay, browsers usually open several TCP connections in parallel and request multiple objects at the same time. However, this consumes a lot of resources on the web server, which is a heavy burden because the server often serves a large number of clients simultaneously.
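
The cost described above can be worked out with a little arithmetic. The sketch below is an idealized model that ignores transmission time and assumes every object is fetched after the page itself arrives; the function names and the `parallel` parameter are assumptions made for this example.

```javascript
// RTTs needed to fetch a page plus its referenced objects over
// non-persistent connections, one connection at a time.
function nonPersistentRtts(objectCount) {
    const documents = 1 + objectCount;   // the page itself plus each object
    return 2 * documents;                // 1 RTT for TCP setup + 1 RTT for request/response
}

// Same model, but with up to `parallel` connections open at once for the objects.
function nonPersistentParallelRtts(objectCount, parallel) {
    const rounds = Math.ceil(objectCount / parallel);
    return 2 + 2 * rounds;               // 2 RTTs for the page, then 2 RTTs per round
}

console.log(nonPersistentRtts(3));             // 8
console.log(nonPersistentParallelRtts(3, 3));  // 4
```

In this model, a page with 3 pictures costs 8 RTTs when fetched sequentially and 4 RTTs with three parallel connections, at the price of the extra server load mentioned above.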

3.2.2 HTTP/1.1

HTTP/1.1 uses persistent connections. In this mode, the web server keeps the connection open after sending a response, so that the same client (browser) and server can continue to exchange subsequent HTTP request and response messages over it. This is not limited to objects referenced by the same page; it applies to any documents on the same server.

[Figure: HTTP/1.1 persistent connection, non-pipelined]

There are two types of persistent connections:

  • Non-pipelined (the case shown in the figure above): the client can send the next request only after it has received the previous response.
  • Pipelined: usually described as the default mode of HTTP/1.1, although in practice many browsers leave pipelining disabled. The client can send several requests in succession without waiting for the responses.

The following example illustrates how a persistent connection works (the picture is from "How the Network is Connected").

[Figure: example exchange over a persistent connection]

After the two parties establish a TCP connection, the following communication process is carried out:

  • The browser sends a request message to the web server to obtain the /sample1.htm file:
GET /sample1.htm HTTP/1.1   // Request line: states what is being requested from the server
Accept: */*
Accept-Language: zh
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; [rest omitted]
Host: www.lab.glasscom.com
Connection: Keep-Alive      // Persistent connection
  • The server returns the content of /sample1.htm to the client in a response message:
HTTP/1.1 200 OK         // Status line; 200 means the request completed successfully
Date: Wed, 21 Feb 2007 09:19:14 GMT
Server: Apache          // Type of server program
Last-Modified: Mon, 19 Feb 2007 12:24:51 GMT
ETag: "5a9da-279-3c726b61"
Accept-Ranges: bytes
Content-Length: 632     // Length of the data
Connection: close       // Close the connection
Content-Type: text/html     // Data format as a MIME type. text/html means an HTML document; for a JPEG image this would be image/jpeg

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>A Journey of Network Exploration</title>
</head>

<body>
<h1 align="center">A Journey of Network Exploration</h1>
<img border="1" src="picture.jpg" align="right" width="200" height="150">

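This web page explains how the WWW works. The text data of a web page and the image data
embedded in it are stored in separate files, and together they make up one web page. Because
files are read one at a time, the text and the images are independent files, which means that
the operations that read them are also performed independently.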

</body>
</html>
  • The browser sends a request message to the web server to obtain the /picture.jpg file:
GET /picture.jpg HTTP/1.1
Accept: */*
Referer: http://www.lab.glasscom.com/sample1.htm
Accept-Language: zh
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; [rest omitted]
Host: www.lab.glasscom.com
Connection: Keep-Alive      // Persistent connection
  • The server continues to return the content of /picture.jpg to the client, using the response message:
HTTP/1.1 200 OK
Date: Wed, 21 Feb 2007 09:19:14 GMT
Server: Apache
Last-Modified: Mon, 19 Feb 2007 13:50:32 GMT
ETag: "5a9d1-1913-3aefa236"
Accept-Ranges: bytes
Content-Length: 6419
Connection: close   // Close the connection
Content-Type: image/jpeg

[The image data follows; it is binary, so it is omitted here]
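
On the receiving side, a client has to separate the status line, the header fields and the body. The sketch below shows one simple way to do this for a text response. `parseResponse` is a hypothetical helper, and the sample message is a shortened stand-in for the first response above, with a made-up five-byte body.

```javascript
// Split a response message into status line, headers and body (hypothetical helper).
function parseResponse(text) {
    const sep = text.indexOf("\r\n\r\n");                 // empty line ends the headers
    const head = sep === -1 ? text : text.slice(0, sep);
    const body = sep === -1 ? "" : text.slice(sep + 4);
    const [statusLine, ...headerLines] = head.split("\r\n");
    const headers = {};
    for (const line of headerLines) {
        const colon = line.indexOf(":");
        // Header names are case-insensitive, so store them in lowercase.
        headers[line.slice(0, colon).toLowerCase()] = line.slice(colon + 1).trim();
    }
    return { statusLine, headers, body };
}

// Shortened stand-in for the first response; the body is illustrative.
const sample =
    "HTTP/1.1 200 OK\r\n" +
    "Server: Apache\r\n" +
    "Content-Length: 5\r\n" +
    "Content-Type: text/html\r\n" +
    "\r\n" +
    "hello";
const parsed = parseResponse(sample);
console.log(parsed.statusLine, parsed.headers["content-type"], parsed.body);
```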

3.2.3 Related Examples

[Example 1] Assume that HTTP/1.1 works in persistent non-pipelined mode and that one request-response exchange takes one RTT, and the rfc. It takes (4) RTTs until all of the content is received.

[Figure: illustration for Example 1]

[Example 2] Assume that the browser on a host uses HTTP/1.1 in persistent non-pipelined mode and requests the page demo.html, which contains 3 small JPEG images, from a web server. One request-response exchange takes one RTT. From the moment the first web request is sent until all content has been received, the number of RTTs that elapse is (4).
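
The answer to Example 2 can be checked with a short calculation. The sketch below counts RTTs from the moment the first request is sent, so TCP connection setup is not included, and it ignores transmission time. `persistentRtts` is a hypothetical name, and the pipelined case uses the usual idealized assumption that all objects are requested together once the page has arrived.

```javascript
// RTTs from the first request until everything is received over one
// persistent connection (TCP setup excluded, transmission time ignored).
function persistentRtts(objectCount, pipelined) {
    if (objectCount === 0) {
        return 1;                        // only the page itself
    }
    // Non-pipelined: one RTT for the page, then one RTT per object.
    // Pipelined (idealized): one RTT for the page, one RTT for all objects together.
    return pipelined ? 2 : 1 + objectCount;
}

console.log(persistentRtts(3, false));   // 4, as in Example 2
console.log(persistentRtts(3, true));    // 2
```

If the question counted from the start of TCP connection establishment instead, one more RTT would be added to each result.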

4 Cookie

  • Early use of the World Wide Web was very simple: users just viewed static documents stored on different servers. HTTP was therefore designed as a stateless protocol, which simplifies the design of the server.
  • Cookies provide a mechanism by which a web server can "remember" a user without the user actively supplying identifying information. In other words, cookies add state on top of the stateless HTTP protocol.
  • Cookies work as follows:

[Figure: how cookies work]
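
The exchange can be approximated in a few lines of code. This is an in-memory sketch under stated assumptions, not a real server: the cookie name `id`, the counter-based identifiers and the `handleRequest` function are all made up for illustration. The server issues an identifier in a Set-Cookie response header on the first visit, and the browser returns it in a Cookie request header on later visits.

```javascript
// Server-side table that maps a cookie ID to what the server remembers about the user.
const sessions = new Map();
let nextId = 1;

// Hypothetical request handler: takes request headers, returns response headers
// plus the visit count the server has recorded for this user.
function handleRequest(requestHeaders) {
    const match = /(?:^|;\s*)id=([^;]+)/.exec(requestHeaders["Cookie"] || "");
    if (match && sessions.has(match[1])) {
        const record = sessions.get(match[1]);   // known user: update the record
        record.visits += 1;
        return { headers: {}, visits: record.visits };
    }
    const id = String(nextId++);                 // new user: create an ID and a record
    sessions.set(id, { visits: 1 });
    return { headers: { "Set-Cookie": `id=${id}` }, visits: 1 };
}

// Browser side: keep the cookie from the first response and send it back.
const first = handleRequest({});
const cookie = first.headers["Set-Cookie"];
const second = handleRequest({ Cookie: cookie });
console.log(cookie, second.visits);
```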

Origin blog.csdn.net/baidu_39514357/article/details/130070800