Understand the HTTP protocol to make your website faster

This article covers the basics of the HTTP protocol: its request methods, the common status codes, and the structure of HTTP messages. Hopefully it will be useful in your day-to-day work.


1. Introduction to HTTP protocol

HTTP stands for HyperText Transfer Protocol. It is one of the most widely used network protocols on the Internet, and one of its most important applications is the WWW service. HTTP was originally designed to provide a way to publish and receive HTML (a page markup language) pages.

HTTP has many applications, but by far the most popular is communication between Web browsers and Web servers, that is, WWW applications or Web applications.

WWW stands for World Wide Web, often simply called the Web. It is currently the most popular form of information service on the Internet. The default port of the HTTP-based WWW service is 80. The default port of HTTPS, the encrypted variant of the WWW service, is 443; HTTPS is essential for money-related services such as online banking and payment. Today the terms HTTP service, WWW service, and Web service are used interchangeably. This article treats them as the same thing: the ordinary website service.
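As a small illustration of the default ports mentioned above, the following Python sketch works out which port a URL implies. The helper name `effective_port` and the example URLs in the usage note are made up for illustration; only `urllib.parse.urlsplit` comes from the standard library.

```python
from urllib.parse import urlsplit

# Default ports as described above: 80 for HTTP, 443 for HTTPS.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the port a client would connect to for the given URL."""
    parts = urlsplit(url)
    if parts.port is not None:  # an explicit port in the URL wins
        return parts.port
    return DEFAULT_PORTS[parts.scheme]
```

For example, `effective_port("https://www.baidu.com/")` gives 443, while a URL such as `http://localhost:8080/` keeps its explicit port 8080.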

2. HTTP protocol version

HTTP has gone through several versions since its inception. The most important of them for this article are HTTP/1.0 and HTTP/1.1: HTTP/1.0 was the first version to be widely used, and HTTP/1.1 is the mainstream version currently in use.

1. Introduction to HTTP/1.0

HTTP/1.0 was the first widely used version of HTTP. Building on HTTP/0.9, it added request headers, support for more request methods, and the ability to handle multimedia objects. It made possible the graphical web pages and interactive forms that drove the widespread adoption of the Internet. In HTTP/1.0 the browser and the server keep only a short-lived connection: every request from the browser sets up a new TCP connection, and the server closes that connection once it has finished handling the request. The server does not keep track of clients and does not remember past requests.

2. Introduction to HTTP/1.1

HTTP/1.1 focuses on fixing flaws in the earlier HTTP design. It brings improvements in many areas, including extensibility, caching, bandwidth optimization, persistent connections, the Host header, error notification, message delivery, and content negotiation. HTTP/1.1 is the current mainstream HTTP version on the Internet.

In terms of connections, HTTP/1.1 supports persistent connections. Multiple HTTP requests and responses can be carried over a single TCP connection, which reduces the overhead and latency of repeatedly opening and closing connections.
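To make connection reuse concrete, here is a minimal Python sketch that starts a throwaway server on the local machine and sends several requests through one `http.client.HTTPConnection`. The handler class `EchoPortHandler` and the function `count_connections` are hypothetical names for this demo. The server replies with the client's source port, which stays the same for as long as the same TCP connection is reused.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 makes the standard-library server keep connections open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the client's source port; it does not change while the
        # underlying TCP connection is being reused.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def count_connections(requests: int = 3) -> int:
    """Send several requests over one HTTPConnection; return how many
    distinct client ports (that is, TCP connections) the server saw."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5
        )
        ports = set()
        for _ in range(requests):
            conn.request("GET", "/")
            ports.add(conn.getresponse().read())
        conn.close()
        return len(ports)
    finally:
        server.shutdown()
        server.server_close()
```

If the connection is kept alive as described, `count_connections()` should report a single connection no matter how many requests are sent.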

In terms of headers, HTTP/1.1 adds more request and response headers to extend HTTP's functionality. For example, the Host header lets the Web browser state explicitly which website on the server it wants to visit, so that a single Web server can host multiple virtual websites on the same IP address and port number.
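The idea behind Host-based virtual hosting can be sketched in a few lines of Python. The host names, directory paths, and the function `document_root_for` below are all invented for illustration; a real Web server does this through its own configuration. The sketch also ignores IPv6 literal addresses in the Host header.

```python
# Hypothetical mapping from Host header value to a site's document root.
VIRTUAL_HOSTS = {
    "www.example.com": "/var/www/example",
    "blog.example.com": "/var/www/blog",
}
DEFAULT_ROOT = "/var/www/default"


def document_root_for(raw_request: str) -> str:
    """Pick a document root based on the Host header of a raw HTTP request."""
    # Everything before the first blank line is the request line plus headers.
    header_lines = raw_request.split("\r\n\r\n", 1)[0].split("\r\n")[1:]
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            # Drop an optional ":port" suffix such as "www.example.com:8080".
            host = value.strip().lower().partition(":")[0]
            return VIRTUAL_HOSTS.get(host, DEFAULT_ROOT)
    return DEFAULT_ROOT
```

Two requests arriving at the same IP address and port are thus served from different sites purely because their Host headers differ.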

Persistent connections in HTTP/1.1 also rely on a header. When the Connection request header has the value Keep-Alive, the client is asking the server to keep the connection open after returning the result of this request; when it has the value close, the client is asking the server to close the connection after returning the result. (In HTTP/1.1 connections are persistent by default, so Keep-Alive matters mainly for compatibility with HTTP/1.0.) HTTP/1.1 also provides request and response headers for mechanisms such as authentication, state management, and caching.
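The decision a server makes from the Connection header can be written down roughly as follows. This is a simplified sketch, and the function name `keeps_connection_open` is hypothetical. It assumes the usual rule that HTTP/1.1 connections are persistent unless `close` is sent, while HTTP/1.0 connections are persistent only when `Keep-Alive` is requested.

```python
def keeps_connection_open(http_version: str, headers: dict) -> bool:
    """Decide whether the connection should stay open after the response."""
    # Header names are case-insensitive, and so are the Connection tokens.
    connection = ""
    for name, value in headers.items():
        if name.lower() == "connection":
            connection = value.lower()
    tokens = {token.strip() for token in connection.split(",")}
    if "close" in tokens:
        return False
    if http_version == "HTTP/1.1":
        return True  # persistent by default in HTTP/1.1
    return "keep-alive" in tokens  # HTTP/1.0 needs an explicit opt-in
```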

2.1 HTTP request method

In HTTP communication, each HTTP request message contains a method. It is used to tell the web server what specific actions need to be performed. These actions include: obtaining a specified web page, submitting content to the server, deleting resource files on the server, etc. The methods contained in these HTTP request messages are called HTTP request methods.

Commonly used HTTP request methods

| HTTP method | Description |
| --- | --- |
| GET | The client requests the specified resource, and the server returns it. |
| HEAD | Requests only the HTTP headers of the response, without the message body. |
| POST | Submits client data to the server, for example a registration form. |
| PUT | Replaces the content of the specified document with the data sent from the client to the server. |
| DELETE | Asks the server to delete the resource identified by the Request-URI. |
| MOVE | Asks the server to move the specified page to another network address. (MOVE is a WebDAV extension rather than part of core HTTP/1.1.) |
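To show how a client chooses a method, the sketch below builds a few request objects with Python's standard `urllib.request.Request`. Nothing is sent over the network, and the URL is a placeholder chosen for illustration.

```python
from urllib.request import Request

URL = "http://www.example.com/users/42"  # placeholder URL, never contacted here

# Without data, urllib defaults to GET; with data, it defaults to POST.
get_req = Request(URL)
post_req = Request(URL, data=b"name=alice")

# Other methods are selected explicitly with the `method` argument.
head_req = Request(URL, method="HEAD")
put_req = Request(URL, data=b"name=bob", method="PUT")
delete_req = Request(URL, method="DELETE")

for req in (get_req, head_req, post_req, put_req, delete_req):
    print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would actually send the request, so only do that against a server you control.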

2.2 HTTP status code

1. Introduction to HTTP status codes

An HTTP status code is a numeric code that indicates how a Web server responded to an HTTP request. Whenever a Web client sends an HTTP request to a Web server, the server returns a status code in its response. The code has three digits and tells the Web client whether the request succeeded or whether further action is needed.

The status codes in HTTP/1.1 fall into five categories.

Status code ranges and their meanings

| Status code range | Description |
| --- | --- |
| 100 - 199 | Informational: tells the client that the request was received and that certain actions are expected or under way |
| 200 - 299 | Indicates a successful request |
| 300 - 399 | Redirection, typically for resources that have moved; the Location header usually carries the new address |
| 400 - 499 | Indicates a client error |
| 500 - 599 | Indicates a server-side error |
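The ranges above translate directly into a small lookup. The following Python sketch uses a hypothetical helper named `status_class` and relies on the fact that the first digit of the code decides the category.

```python
def status_class(code: int) -> str:
    """Map a three-digit HTTP status code to its category."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return classes[code // 100]  # the hundreds digit selects the category
```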

HTTP defines many status codes, but only a handful come up regularly in day-to-day work. The important ones seen in production are listed below.

Common status codes in production and their meanings

| Status code | Description |
| --- | --- |
| 200 - OK | The server returned the page successfully. This is the standard status code for a successful HTTP request. |
| 301 - Moved Permanently | Permanent redirect: the requested page has moved permanently to a new location. |
| 403 - Forbidden | Access is forbidden. The request itself is valid, but the server refuses to serve it because it matches a preconfigured rule. This is generally caused by misconfigured server or service permissions. |
| 404 - Not Found | The server cannot find the page the client asked for, usually because the client requested a resource that does not exist on the server. |
| 500 - Internal Server Error | The server encountered an unexpected condition and could not complete the client's request. This is a fairly generic error, usually caused by server settings or a problem in the application. For example, if SELinux is enabled but no rule permits HTTP access, the client gets a 500. |
| 502 - Bad Gateway (important) | Typically the proxy server sent a request to a backend service, and the backend was unavailable or did not complete its response to the gateway. This is usually caused by a problem with a node behind the reverse proxy: the reverse proxy cannot reach the Web service node behind it. |
| 503 - Service Unavailable | The service is currently unavailable, possibly because the server is overloaded or down for maintenance, or because no node behind the reverse proxy is able to serve the request. |
| 504 - Gateway Timeout | Typically the gateway or proxy server sent a request to a backend service, and the backend did not finish processing it within the allotted time. In most cases the backend server is overloaded and fails to return data to the front-end proxy in time. |
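Python's standard library ships these codes and their reason phrases in `http.HTTPStatus`, which makes it easy to double-check a code against its name. The tuple `COMMON_CODES` is just a name chosen for this sketch.

```python
from http import HTTPStatus

# The codes discussed above; the reason phrases come from the standard library.
COMMON_CODES = (200, 301, 403, 404, 500, 502, 503, 504)

for code in COMMON_CODES:
    status = HTTPStatus(code)
    print(f"{status.value} - {status.phrase}")
```

Note that 403 is the code for Forbidden; 401 means Unauthorized, that is, authentication is required.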

2. Viewing HTTP status codes from the command line

On the Linux command line you can use the curl command with suitable options to see the numeric HTTP status code. The commands are as follows:

[root@localhost ~]# curl -I www.baidu.com
HTTP/1.1 200 OK <- 200 is the status code
Server: openresty/1.13.6.1
Date: Thu, 22 Dec 2022 06:37:17 GMT
Content-Type: text/html;charset=utf-8
Content-Length: 5537
Connection: keep-alive
Cache-Control: private
Content-Encoding: gzip
X-AspNet-Version: 4.0.30319
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
X-XSS-Protection: 1;mode=block
X-Powered-By: ASP.NET
[root@localhost ~]# curl -I -s -w "%{http_code}\n" -o /dev/null www.baidu.com
200 <- 200 is the status code
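The same check can be scripted. The following is a rough Python counterpart of `curl -I`, built on the standard `http.client` module. The function name `head_status` and its parameters are assumptions made for this sketch.

```python
import http.client


def head_status(host: str, port: int = 80, path: str = "/", timeout: float = 5.0) -> int:
    """Send a HEAD request and return the numeric status code."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)  # like curl -I: ask for headers only
        response = conn.getresponse()
        return response.status
    finally:
        conn.close()
```

Calling `head_status("www.baidu.com")` performs a real network request. The result depends on the site at that moment, just as with curl.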

2.2.1 HTTP message

An HTTP message consists of multiple lines. Its fields are ASCII strings, and their lengths vary. HTTP messages come in two types: a message sent from the Web client to the Web server is called a request message (Request Message), and a message sent from the Web server to the Web client is called a response message (Response Message). The two have a similar format.

1. Introduction to the HTTP request message (Request Message)

An HTTP request message consists of several parts: request line, request header (header), blank line, and request message body.

HTTP request message format

| Part | Content |
| --- | --- |
| Request line | Request method, URL, protocol version |
| Request headers | Field name 1: value 1, field name 2: value 2, ... For example: Accept: image/gif,image/jpeg and Accept-Language: zh-cn |
| Blank line | Empty, no content |
| Request message body | A GET request normally has no message body; a POST request does. |

The following describes each part of the HTTP request message one by one:

1) Request line

The request line is the first line of the request message and describes what the client wants to do. It consists of the request method, the URL, and the HTTP protocol version, separated by spaces. The following uses GET /index.html HTTP/1.1 as an example to break down the request line.

Fields of the request line

| Request method | URL | HTTP protocol version |
| --- | --- | --- |
| GET | /index.html | HTTP/1.1 |
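Because the three fields are separated by single spaces, splitting a request line takes very little code. The names `RequestLine` and `parse_request_line` below are hypothetical and exist only for this sketch.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    target: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Split a request line into method, URL and protocol version."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        # For instance "GET/index.html HTTP/1.1" lacks the space after GET.
        raise ValueError(f"malformed request line: {line!r}")
    return RequestLine(*parts)
```

The space between the method and the URL is significant: without it the line has only two fields and is rejected.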

2) request header

The request headers consist of name/value pairs, one pair per line, with the name and the value separated by a colon ":". Request headers let the client give the server additional information about the request.

Common request headers

| Request header | Description |
| --- | --- |
| Accept: image/gif,image/jpeg | Media types the client accepts |
| Accept-Language: zh-cn | Language |
| Accept-Encoding: gzip,deflate | Supported compression |
| User-Agent: Mozilla/4.0(compatible;MSIE6.0;Windows NT;...) | Client type |
| Host: www.baidu.com | Host name |

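The request headers most commonly associated with the request message body are Content-Type and Content-Length.

3) Blank line

The last request header is followed by a blank line. By sending a carriage return and a line feed, the client tells the Web server that no more request headers follow.

4) Request message body

The request message body carries the data to be sent to the Web server. It is not used with the HTTP GET method but with the POST method, which suits situations where the user fills in a form.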
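Putting the four parts together, a POST request carrying form data could be assembled as in the sketch below. The function `build_post_request`, the host name, and the form fields in the usage note are invented for illustration. The form encoding `application/x-www-form-urlencoded` is what HTML forms commonly use.

```python
from urllib.parse import urlencode


def build_post_request(host: str, path: str, form: dict) -> bytes:
    """Assemble a raw HTTP/1.1 POST request carrying URL-encoded form data."""
    body = urlencode(form).encode("ascii")
    lines = [
        f"POST {path} HTTP/1.1",  # request line
        f"Host: {host}",  # request headers start here
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",  # size of the body in bytes
        "",  # blank line: end of the headers
        "",
    ]
    # Lines are joined with CRLF; the body follows the blank line directly.
    return "\r\n".join(lines).encode("ascii") + body
```

For example, `build_post_request("www.example.com", "/register", {"user": "alice"})` yields the request line, three headers, a blank line, and then the body `user=alice`.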

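2. Introduction to the HTTP response message (Response Message)

An HTTP response message consists of a start line, response headers, a blank line, and the response message body, a format similar to that of the HTTP request message.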

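General format of an HTTP response message

| Part | Content |
| --- | --- |
| Start line | Protocol and version, numeric status code, status text |
| Response headers | Field name 1: value 1, field name 2: value 2, ... For example: Content-Type: text/html; charset=utf-8 and Content-Length: 78 |
| Blank line | Empty, no content |
| Response message body | <html><head><title>test</title></head><body>123</body></html> |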

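Each part of the response message is described below.

1) Start line

The start line of the response message, also called the status line, describes how the server responded to the client's request. It generally consists of the protocol and version, the numeric status code, and the status text, for example: HTTP/1.1 200 OK.

2) Response headers

As in a request message, the start line is usually followed by several header fields. Each header field has a name and a value, separated by a colon. The headers also end with a blank line. Common headers include:

Content-Type: text/html; charset=utf-8
Content-Length: 78
......

3) Blank line

The last response header is followed by a blank line. By sending a carriage return and a line feed, the server tells the client that no more headers follow.

4) Response message body

The response message body carries the data to be returned to the client. The data can be text or binary (such as images or video).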
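The structure described above is regular enough to parse by hand. The following sketch splits a raw response into its parts; `parse_response` is a hypothetical helper. It assumes a complete response with a reason phrase and no chunked transfer encoding, both of which a real client has to handle more carefully.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into (version, status, reason, headers, body)."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Start line, for example "HTTP/1.1 200 OK".
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return version, int(status), reason, headers, body
```

The body is returned as bytes on purpose, since it may hold binary data such as an image rather than text.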

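2.2.2 How HTTP works and key points

HTTP is an application-layer protocol, layer 7 of the OSI model, and its most important application is the WWW service, which is used below as an example to explain how HTTP communication works. HTTP communication requires a client (the end user) and a server (the Web server). Before the Web client sends a request message to the Web server, a TCP connection must first be established between them over TCP/IP. The complete workflow of an HTTP request is as follows:

1) The user enters the address http://www.baidu.com in the address bar of the Web browser.

2) The Web browser asks a DNS server to resolve the domain name www.baidu.com to the IP address of the Web server; this is the standard DNS resolution process.

3) The Web browser extracts the port number (80 by default) from the address (URL).

4) The Web browser uses the resolved IP address and the port number to establish a TCP connection to the Web server.

5) Once the TCP connection is established, the Web browser sends an HTTP request message to the Web server.

6) The Web server reads the browser's request, processes it, and returns an HTTP response message.

7) The Web server closes the HTTP connection and the TCP connection, and the Web browser displays the requested website content on the screen.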
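The seven steps can be walked through with plain sockets. The sketch below is deliberately simplified: plain HTTP only, IPv4 only, no redirects, no error handling. The function names `plan_request` and `fetch` are made up for this example. `plan_request` covers the URL handling, and `fetch` performs the network steps.

```python
import socket
from urllib.parse import urlsplit


def plan_request(url: str):
    """Steps 1 and 3: take a URL and derive host, port and the request message."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80  # step 3: port 80 unless the URL names another
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    host_header = host if port == 80 else f"{host}:{port}"
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"  # ask the server to close when done (step 7)
        "\r\n"
    ).encode("ascii")
    return host, port, request


def fetch(url: str, timeout: float = 5.0) -> bytes:
    """Steps 2 and 4 to 7: resolve, connect, send the request, read the response."""
    host, port, request = plan_request(url)
    ip_address = socket.gethostbyname(host)  # step 2: DNS resolution
    # Step 4: establish the TCP connection.
    with socket.create_connection((ip_address, port), timeout=timeout) as sock:
        sock.sendall(request)  # step 5: send the request message
        chunks = []
        while True:  # step 6: read until the server closes the connection
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)  # raw response: status line, headers, body
```

Calling `fetch("http://www.baidu.com")` contacts the real site and returns the raw response bytes, which a browser would then parse and render.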

Origin blog.csdn.net/weixin_43805705/article/details/131301349