Computer networks: HTTP

Concepts

Web basics

  • HTTP (HyperText Transfer Protocol).
  • The three core technologies of the WWW (World Wide Web): HTML, HTTP, and URL.
  • RFC (Request for Comments): the design documents of the Internet.

URI

  • URI (Uniform Resource Identifier).
  • URL (Uniform Resource Locator).
  • URN (Uniform Resource Name), such as urn:isbn:0-486-27557-4.

URI includes both URL and URN. At present only URLs are in common use on the Web, so what you see is almost always a URL.
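To make the URL/URN distinction concrete, here is a minimal sketch using Python's standard `urllib.parse`. The example URL is a made-up address chosen for illustration; the URN is the one from the list above.

```python
from urllib.parse import urlsplit

# A URL identifies a resource by its location: scheme, host, port, path, ...
# (the address below is a hypothetical example)
url = urlsplit("http://www.example.com:8080/docs/index.html?lang=en#top")
print(url.scheme, url.hostname, url.port, url.path, url.query, url.fragment)

# A URN identifies a resource by name only; there is no host (netloc) part.
urn = urlsplit("urn:isbn:0-486-27557-4")
print(urn.scheme, repr(urn.netloc), urn.path)
```

Both strings are URIs and parse with the same function; only the URL carries location information such as a host and port.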

request response message

request message

[Figure: structure of a request message]

response message

[Figure: structure of a response message]

HTTP method

The first line of the request message sent by the client contains the method field.
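As a rough illustration of where the method field sits, the sketch below splits a raw HTTP/1.x request into request line, headers, and body. The function name `parse_request` is hypothetical, and the parser is deliberately simplified (it assumes well-formed input with CRLF line endings).

```python
def parse_request(raw: str):
    """Split a raw HTTP/1.x request into its parts (simplified sketch)."""
    head, _, body = raw.partition("\r\n\r\n")      # a blank line separates headers and body
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ")  # request line: METHOD TARGET VERSION
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


raw = (
    "POST /test/demo_form.asp HTTP/1.1\r\n"
    "Host: w3schools.com\r\n"
    "\r\n"
    "name1=value1&name2=value2"
)
method, target, version, headers, body = parse_request(raw)
print(method, target, version, headers, body)
```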

GET

Retrieve a resource

Most current network requests use the GET method.

POST

Transfer an entity body

The main purpose of POST is not to retrieve a resource but to transfer the data carried in the entity body.
Both GET and POST requests can carry additional parameters, but GET parameters appear in the URL as a query string, while POST parameters are carried in the entity body.

GET /test/demo_form.asp?name1=value1&name2=value2 HTTP/1.1

POST /test/demo_form.asp HTTP/1.1
Host: w3schools.com

name1=value1&name2=value2
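The same contrast can be sketched with Python's standard library. The `Request` objects below are only constructed, never sent, and the `http://` scheme is an assumption added to turn the example host and path into a full URL.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"name1": "value1", "name2": "value2"}
base = "http://w3schools.com/test/demo_form.asp"  # scheme assumed for illustration

# GET: the parameters are encoded into the query string of the URL.
get_req = Request(base + "?" + urlencode(params), method="GET")

# POST: the same parameters are encoded into the entity body instead.
post_req = Request(base, data=urlencode(params).encode("ascii"), method="POST")

print(get_req.get_method(), get_req.selector, get_req.data)
print(post_req.get_method(), post_req.selector, post_req.data)
```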

HEAD

Retrieve the message headers

Same as the GET method, but the response does not include the entity body.
It is mainly used to check whether a URL is valid and when the resource was last updated.

PUT

Upload a file

PUT itself has no authentication mechanism, so anyone could upload files. Because of this security issue, the method is generally not used.

PUT /new.html HTTP/1.1
Host: example.com
Content-type: text/html
Content-length: 16

<p>New File</p>

PATCH

Partially modify a resource

PUT can also be used to modify a resource, but it can only replace the original resource completely, whereas PATCH allows partial modification.

PATCH /file.txt HTTP/1.1
Host: www.example.com
Content-Type: application/example
If-Match: "e0023aa4e"
Content-Length: 100

[description of changes]

DELETE

Delete a file

The opposite of PUT; it likewise has no authentication mechanism.

DELETE /file.html HTTP/1.1

OPTIONS

Query the supported methods

Queries the methods supported by the specified URL.
The response contains a header such as Allow: GET, POST, HEAD, OPTIONS.

CONNECT

Ask a proxy to establish a tunnel

Asks the proxy server to establish a tunnel for the communication. SSL (Secure Sockets Layer) or TLS (Transport Layer Security) is then used to encrypt the content, which is transmitted through the network tunnel.

CONNECT www.example.com:443 HTTP/1.1


TRACE

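Trace the path

The server echoes the request it received back to the client, so the client can see what happened to the request along the communication path.

When sending the request, the client sets a value in the Max-Forwards header field. The value is decremented by 1 each time the request passes through a server. When it reaches 0, forwarding stops and the server that last received the request responds.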
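The Max-Forwards countdown can be sketched as a small simulation. The function name `forward_trace` and the hop names are hypothetical; the sketch assumes that an intermediary receiving a value of 0 answers the request itself instead of forwarding it.

```python
def forward_trace(max_forwards: int, hops: list) -> str:
    """Return the name of the node that answers a TRACE request.

    `hops` lists the intermediaries in order, ending with the origin server.
    """
    for node in hops[:-1]:
        if max_forwards == 0:
            return node       # this intermediary must respond itself
        max_forwards -= 1     # decrement before forwarding to the next hop
    return hops[-1]           # the request reached the origin server


path = ["proxy1", "proxy2", "origin"]
for value in (0, 1, 2, 5):
    print(value, "->", forward_trace(value, path))
```

Choosing a small Max-Forwards value therefore lets a client probe a specific intermediary on the path.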

TRACE is rarely used, and because it is vulnerable to XST (Cross-Site Tracing) attacks it is even less likely to be enabled.

HTTP status code

The status line, which is the first line of the response message returned by the server, contains the status code and reason phrase. They inform the client of the result of the request.

Status code   Category        Meaning
1XX           Informational   The received request is being processed
2XX           Success         The request was processed normally
3XX           Redirection     Additional action is required to complete the request
4XX           Client Error    The server was unable to process the request
5XX           Server Error    An error occurred while the server was processing the request
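Since the category is determined by the first digit, a lookup takes only a few lines. The sketch below uses the standard `http.HTTPStatus` enum for the reason phrases; the `CATEGORIES` table and `category` helper are illustrative names, not part of any library.

```python
from http import HTTPStatus

# First digit of the status code -> category name.
CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def category(code: int) -> str:
    """Return the category of a three-digit HTTP status code."""
    return CATEGORIES[code // 100]


for code in (200, 301, 404, 503):
    print(code, HTTPStatus(code).phrase, "->", category(code))
```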

2XX success

  • 200 OK
  • 204 No Content : The request has been successfully processed, but the returned response message does not contain the body of the entity. It is generally used when you only need to send information from the client to the server, but do not need to return data.
  • 206 Partial Content : Indicates that the client has made a range request. The response message contains the entity content in the range specified by Content-Range.
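A range request and its 206 response might be handled roughly as follows. This is a simplified sketch, not a complete implementation: the `serve_range` name is hypothetical, only a single `bytes=first-last` or `bytes=first-` range is understood, and any other form falls back to a full 200 response.

```python
def serve_range(body: bytes, range_header: str):
    """Answer a single 'bytes=first-last' range request against `body`."""
    unit, _, spec = range_header.partition("=")
    first_s, _, last_s = spec.partition("-")
    if unit != "bytes" or not first_s:
        return 200, {}, body                       # unsupported form: send everything
    first = int(first_s)
    last = int(last_s) if last_s else len(body) - 1
    last = min(last, len(body) - 1)                # clamp to the end of the resource
    if first > last:
        # The requested range lies outside the resource.
        return 416, {"Content-Range": f"bytes */{len(body)}"}, b""
    part = body[first:last + 1]                    # both ends are inclusive
    return 206, {"Content-Range": f"bytes {first}-{last}/{len(body)}"}, part


print(serve_range(b"0123456789", "bytes=2-5"))
```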

3XX redirection

  • 301 Moved Permanently : Permanent redirection
  • 302 Found : Temporary redirection
  • 303 See Other : It has the same function as 302, but 303 clearly requires that the client should use the GET method to obtain resources.
  • Note: Although the HTTP specification does not allow the POST method to be changed to GET when redirecting with 301 and 302, most browsers change POST to GET when following 301, 302 and 303 redirects.
  • 304 Not Modified : Returned for a conditional request, that is, one whose headers contain conditions such as If-Match, If-Modified-Since, If-None-Match, If-Range or If-Unmodified-Since. In practice 304 answers a GET or HEAD that uses If-None-Match or If-Modified-Since when the resource has not changed, so the client can reuse its cached copy. Despite being in the 3XX class, it has nothing to do with redirection.
  • 307 Temporary Redirect : Temporary redirection, similar in meaning to 302, but 307 requires that the browser not change the POST method of the redirected request to GET.
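The method rules in the list above can be summarised in a small decision function. This is a sketch of the commonly observed browser behaviour described in the note, not a statement of what any particular browser implements; the function name is hypothetical.

```python
def method_after_redirect(status: int, method: str) -> str:
    """Method a typical browser uses for the request that follows a redirect."""
    if status == 303 and method != "HEAD":
        return "GET"      # 303 explicitly asks for the resource to be fetched with GET
    if status in (301, 302) and method == "POST":
        return "GET"      # common browser behaviour, as described in the note
    return method         # 307 (and everything else) keeps the original method


for status in (301, 302, 303, 307):
    print(status, "POST ->", method_after_redirect(status, "POST"))
```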

4XX client errors

  • 400 Bad Request : There is a syntax error in the request message.
  • 401 Unauthorized : This status code indicates that the sent request requires authentication information (BASIC authentication, DIGEST authentication). If a request has been made before, it means that the user authentication failed.
  • 403 Forbidden : The request is rejected, and the server does not need to give detailed reasons for the rejection.
  • 404 Not Found

5XX server error

  • 500 Internal Server Error : An error occurred while the server was executing the request.
  • 503 Service Unavailable : The server is temporarily overloaded or down for maintenance, and cannot process requests now.

HTTP header

There are 4 types of header fields: general header fields, request header fields, response header fields and entity header fields.

General header field

Header field name    Description
Cache-Control        Controls caching behavior
Connection           Controls hop-by-hop header fields that proxies do not forward; manages persistent connections
Date                 Date and time the message was created
Pragma               Message directives
Trailer              List of header fields at the end of the message
Transfer-Encoding    Transfer encoding applied to the message body
Upgrade              Upgrade to another protocol
Via                  Information about proxy servers
Warning              Error notification

request header field

Header field name      Description
Accept                 Media types the user agent can handle
Accept-Charset         Preferred character sets
Accept-Encoding        Preferred content encodings
Accept-Language        Preferred (natural) languages
Authorization          Web authentication credentials
Expect                 Expects specific behavior from the server
From                   User's email address
Host                   Server that hosts the requested resource
If-Match               Compares entity tags (ETags)
If-Modified-Since      Compares resource update times
If-None-Match          Compares entity tags (the opposite of If-Match)
If-Range               Sends a byte-range request for the entity if the resource has not been updated
If-Unmodified-Since    Compares resource update times (the opposite of If-Modified-Since)
Max-Forwards           Maximum number of hops the request may be forwarded
Proxy-Authorization    Authentication information the proxy server requires from the client
Range                  Byte-range request for the entity
Referer                URI of the page from which the request originated
TE                     Priority of transfer encodings
User-Agent             Information about the HTTP client program

response header field

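Header field name     Description
Accept-Ranges         Whether byte-range requests are accepted
Age                   Estimated time elapsed since the resource was created
ETag                  Matching information (entity tag) for the resource
Location              Redirects the client to the specified URI
Proxy-Authenticate    Authentication information the proxy server requires from the client
Retry-After           When the client may send the request again
Server                Information about the HTTP server software
Vary                  Cache management information for proxy servers
WWW-Authenticate      Authentication information the server requires from the client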

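entity header field

Header field name    Description
Allow                HTTP methods supported by the resource
Content-Encoding     Encoding applied to the entity body
Content-Language     Natural language of the entity body
Content-Length       Size of the entity body
Content-Location     Alternative URI for the resource
Content-MD5          Message digest of the entity body
Content-Range        Position range of the entity body
Content-Type         Media type of the entity body
Expires              Date and time at which the entity body expires
Last-Modified        Date and time the resource was last modified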

Cookie

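HTTP is stateless, mainly to keep the protocol as simple as possible so that it can handle a large volume of transactions. Cookies were introduced to store state information.

A cookie is data that the server sends to the client. The data is stored in the browser, and the client includes it in subsequent requests. Cookies let the server know whether two requests come from the same client, which makes features such as staying logged in possible.

Creation process

The response message sent by the server contains a Set-Cookie field. After receiving the response, the client stores the cookie content in the browser.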

HTTP/1.0 200 OK
Content-type: text/html
Set-Cookie: yummy_cookie=choco
Set-Cookie: tasty_cookie=strawberry

[page content]

When the client sends later requests, it reads the cookie values from the browser and includes a Cookie field in the request message.

GET /sample_page.html HTTP/1.1
Host: www.example.org
Cookie: yummy_cookie=choco; tasty_cookie=strawberry
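The round trip above can be reproduced with Python's standard `http.cookies` module. This is only a sketch of the client side: a real browser also tracks domain, path and expiry for each cookie.

```python
from http.cookies import SimpleCookie

# Client side: store the values carried by the two Set-Cookie response headers.
jar = SimpleCookie()
jar.load("yummy_cookie=choco")
jar.load("tasty_cookie=strawberry")

# Build the Cookie request header that is sent back with the next request.
cookie_header = "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())
print("Cookie:", cookie_header)
```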

Set-Cookie

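Attribute        Description
NAME=VALUE       Name and value assigned to the cookie (required)
expires=DATE     Expiry date of the cookie (if not specified, it lasts until the browser is closed)
path=PATH        Directory on the server to which the cookie applies (defaults to the directory of the document)
domain=DOMAIN    Domain to which the cookie applies (defaults to the domain of the server that created the cookie)
Secure           The cookie is sent only over secure HTTPS connections
HttpOnly         Prevents the cookie from being accessed by JavaScript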
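On the server side, a Set-Cookie header with these attributes could be assembled as sketched below with the standard `http.cookies` module. The cookie name `sid`, its value and the domain are placeholder values chosen for illustration.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["sid"] = "abc123"                  # NAME=VALUE (placeholder values)
cookie["sid"]["path"] = "/"               # applies to the whole site
cookie["sid"]["domain"] = "example.org"   # hypothetical domain
cookie["sid"]["secure"] = True            # only sent over HTTPS
cookie["sid"]["httponly"] = True          # hidden from JavaScript

header_line = cookie.output()             # a complete "Set-Cookie: ..." header line
print(header_line)
```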

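Difference between Session and Cookie

A session is a means by which the server tracks a user. Each session has a unique identifier, the Session ID. When the server creates a session, the response message it sends to the client contains a Set-Cookie field with a key-value pair named sid; this pair is the Session ID. The client stores the cookie in the browser, and every request message it sends afterwards includes the Session ID. HTTP tracks user state through sessions and cookies working together: the session is kept on the server side and the cookie on the client side.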
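A minimal in-memory sketch of this cooperation follows. The names `sessions`, `create_session` and `lookup_session` are hypothetical, and a real server would also expire sessions and protect the cookie with attributes such as Secure.

```python
import secrets

# Server-side store: Session ID -> state the server keeps about that user.
sessions = {}


def create_session(user: str) -> str:
    """Create a session and return the Set-Cookie header line carrying its ID."""
    sid = secrets.token_hex(16)           # unpredictable Session ID
    sessions[sid] = {"user": user}
    return f"Set-Cookie: sid={sid}; HttpOnly"


def lookup_session(cookie_header: str):
    """Find the session that matches the sid pair in a Cookie header value."""
    for pair in cookie_header.split(";"):
        name, _, value = pair.strip().partition("=")
        if name == "sid":
            return sessions.get(value)
    return None


header = create_session("alice")
sid = header.split("sid=")[1].split(";")[0]
print(lookup_session(f"theme=dark; sid={sid}"))
```

Only the opaque Session ID travels in the cookie; the user state itself never leaves the server.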

When the browser disables cookies

URL rewriting is used instead: sid=xxx is appended to the URL.
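URL rewriting can be sketched with `urllib.parse`. The helper name `add_sid` and the example URLs are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_sid(url: str, sid: str) -> str:
    """Append sid=<sid> to the query string of `url`, keeping existing parameters."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("sid", sid))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_sid("http://www.example.org/cart?item=42", "abc123"))
print(add_sid("http://www.example.org/cart", "abc123"))
```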

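Using cookies to fill in the username and password automatically

A script on the website reads the username and password from the cookie saved in the browser and fills them in automatically.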

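Caching

Advantages:

  • Reduces the load on the server
  • Improves response speed (a cached resource is closer to the client than the resource on the server)

Implementation methods:

  • Let a proxy server cache resources
  • Let the client browser cache resources

Cache-Control field

HTTP controls caching through the Cache-Control header field.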

Cache-Control: private, max-age=0, no-cache

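no-cache directive

In the Cache-Control field of a request message, this directive means the cache server must first check with the origin server whether the cached resource has expired.

In the Cache-Control field of a response message, it means the cache server must validate the cached resource with the origin server before using it.

no-store directive

This directive means the cache server must not store any part of the request or the response.

no-cache does not mean "do not cache"; it means the cached copy must be validated before it is used. Only no-store prevents caching altogether.

max-age directive

In the Cache-Control field of a request message, this directive means a cached resource is acceptable if it has been cached for less than the specified time.

In the Cache-Control field of a response message, it specifies how long the resource may be kept in the cache server.

The Expires field can also tell the cache server when the resource will expire. In HTTP/1.1 the Cache-Control: max-age directive takes precedence; in HTTP/1.0 the Cache-Control: max-age directive is ignored.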
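How a cache might act on these response directives is sketched below. The function names are hypothetical and the logic is intentionally reduced to the three directives discussed here; a real cache also considers Expires, validators and many other directives.

```python
def parse_cache_control(value: str) -> dict:
    """Parse 'private, max-age=0, no-cache' into {'private': None, 'max-age': '0', ...}."""
    directives = {}
    for item in value.split(","):
        name, sep, arg = item.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') if sep else None
    return directives


def can_serve_from_cache(response_cc: str, age_seconds: int) -> bool:
    """True if a stored response may be reused without contacting the origin server."""
    d = parse_cache_control(response_cc)
    if "no-store" in d or "no-cache" in d:
        return False      # no-store: never stored; no-cache: must be validated first
    max_age = d.get("max-age")
    return max_age is not None and age_seconds < int(max_age)


print(parse_cache_control("private, max-age=0, no-cache"))
print(can_serve_from_cache("max-age=60", 30))
```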

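Persistent connections

When a browser visits an HTML page containing multiple images, it requests the image resources in addition to the HTML page itself. If the TCP connection were closed after every HTTP exchange, the overhead of setting up and tearing down connections would be high. With a persistent connection, a single TCP connection carries multiple HTTP exchanges.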
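Persistent connections are managed with the Connection header field. Starting with HTTP/1.1, connections are persistent by default; to close the TCP connection, either the client or the server must request it with Connection: close. Before HTTP/1.1, connections were non-persistent by default; to keep a connection open, Connection: Keep-Alive had to be used.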
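The default behaviour of the two versions can be captured in a few lines. This is a sketch with a hypothetical function name, looking only at the HTTP version and the Connection header value.

```python
from typing import Optional


def is_persistent(version: str, connection: Optional[str]) -> bool:
    """Decide whether the TCP connection stays open after this message."""
    tokens = {t.strip().lower() for t in (connection or "").split(",")}
    if "close" in tokens:
        return False                   # either side asked to close the connection
    if version == "HTTP/1.1":
        return True                    # persistent by default
    return "keep-alive" in tokens      # HTTP/1.0 has to opt in


print(is_persistent("HTTP/1.1", None))
print(is_persistent("HTTP/1.0", None))
```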

Pipelining allows multiple requests to be sent in succession, without waiting for the response to one request before sending the next.

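Forwarding communication data

Proxy

A proxy server accepts requests from clients and forwards them to other servers.

The main purposes of using a proxy are caching, network access control, and access logging.

Proxy servers are divided into forward proxies and reverse proxies. Users are aware of a forward proxy, whereas a reverse proxy generally sits inside the internal network and is invisible to users.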

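Gateway

Unlike a proxy server, a gateway server converts HTTP into another protocol for communication, so that it can request services from non-HTTP servers.

Tunnel

Uses encryption such as SSL to establish a secure communication channel between the client and the server.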

Version comparison

The difference between HTTP/1.0 and HTTP/1.1

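  • HTTP/1.1 uses persistent (long-lived) connections by default, while HTTP/1.0 uses short-lived connections by default.
  • HTTP/1.1 adds a version number to the message for extended compatibility.
  • The cache mechanism of HTTP/1.1 is more flexible.
  • HTTP/1.1 optimizes bandwidth usage, for example with range requests that fetch only part of a resource.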
  • HTTP/1.0 defines only 16 status codes, while HTTP/1.1 adds 24 new ones.
  • In HTTP/1.0 each server is assumed to be bound to a single address, while in HTTP/1.1 multiple virtual hosts can share the same IP address, because requests carry the Host header field.

The difference between HTTP/1.1 and HTTP/2.0

multiplexing

HTTP/2.0 uses multiplexing, using the same TCP connection to handle multiple requests.

header compression

The headers of HTTP/1.1 carry a lot of information and have to be sent repeatedly every time. HTTP/2.0 requires both communication parties to cache a header field table, thereby avoiding repeated transmission.
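The idea of a shared header field table can be illustrated with a toy model. This is not the actual compression format used by HTTP/2.0; it only shows why a table kept in sync on both sides avoids resending repeated headers. The class name and header values are made up for the example.

```python
class HeaderTable:
    """Toy header table; sender and receiver each keep one and update it in step."""

    def __init__(self):
        self.entries = []    # list of (name, value) pairs seen so far

    def encode(self, headers):
        out = []
        for pair in headers:
            if pair in self.entries:
                out.append(self.entries.index(pair))   # already known: send a small index
            else:
                self.entries.append(pair)
                out.append(pair)                        # first time: send the literal
        return out

    def decode(self, encoded):
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self.entries[item])      # look the index up in the table
            else:
                self.entries.append(item)               # learn the new literal
                headers.append(item)
        return headers


sender, receiver = HeaderTable(), HeaderTable()
request_headers = [("host", "www.example.com"), ("user-agent", "demo-client")]

first = sender.encode(request_headers)    # literals on the first request
second = sender.encode(request_headers)   # only indexes on the second request
print(first, second)
print(receiver.decode(first) == receiver.decode(second) == request_headers)
```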

server push

When the client requests a resource, the server can send related resources along with it, so the client does not need to request them separately. For example, if the client requests the index.html page, the server can push index.js to the client as well.

binary format

HTTP/1.1's parsing is text-based, while HTTP/2.0 uses a binary format.

Origin blog.csdn.net/qq_44697754/article/details/128097373