In-depth analysis of HTTP and HTTPS protocols

HTTP

  • Meaning: Hypertext Transfer Protocol, an application-layer protocol.
  • Default port: 80.
  • Based on the transport layer protocol: TCP.
  • Methods used by HTTP to initiate network requests: most commonly GET and POST (others include PUT, DELETE, and HEAD). The main differences between GET and POST are as follows:
  1. GET parameters are placed in the URL, so they are visible in the address bar, browser history, and server logs; POST parameters are placed in the request body and do not appear in the URL. Neither method encrypts the data: over plain HTTP a POST body can still be read by anyone who can observe the traffic, so sensitive data should be sent over HTTPS.
  2. The length of URL parameters sent with GET is limited in practice, because browsers and servers cap the URL length (the HTTP specification itself sets no limit); a POST body has no such limit apart from whatever the server is configured to accept.
  3. GET parameters must be representable in a URL, so non-ASCII characters have to be percent-encoded; a POST body can carry any data, including binary.
  4. GET parameters can only use URL encoding, while a POST body can use multiple encodings, such as application/x-www-form-urlencoded, multipart/form-data, or application/json.
  5. GET responses can be cached by the browser by default; POST responses are not cached unless caching is explicitly enabled.
  6. GET parameters are placed in the URL, and POST parameters are placed in the request body.
  7. GET is often described as more efficient than POST, although the difference is small in practice.
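
The difference in where the parameters travel can be sketched with Python's standard urllib. This is a minimal illustration that only builds the request objects and sends nothing over the network; the host and parameter values are borrowed from the sample message later in this article.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"userName": "tom", "password": "123456"}
encoded = urlencode(params)  # URL-encoded form: key=value pairs joined by "&"

# GET: the parameters are appended to the URL as a query string.
get_request = Request("http://www.jxsd.cn/index.jsp?" + encoded, method="GET")

# POST: the same parameters travel in the request body instead.
post_request = Request(
    "http://www.jxsd.cn/index.jsp",
    data=encoded.encode("ascii"),
    method="POST",
)

print(get_request.full_url)   # parameters visible in the URL
print(get_request.data)       # None: a GET request carries no body here
print(post_request.full_url)  # no parameters in the URL
print(post_request.data)      # parameters in the body

# Non-ASCII values must be percent-encoded before they can appear in a URL.
non_ascii = urlencode({"q": "中文"})
print(non_ascii)
```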

The first six differences are straightforward, but the seventh needs some explanation.

Why is GET described as more efficient?

The usual explanation is that GET produces one TCP packet, while POST produces two. When a GET request is sent, the client sends the request line and headers together, and the server responds with 200 (data returned successfully). With POST, some clients first send the headers (optionally with an Expect: 100-continue header), wait for the server to respond with 100 Continue, then send the body, after which the server finally responds with 200. In that case GET has one less step, so it takes less time. This behavior is not required by the HTTP specification, however: it depends on the client and on the size of the body, and many clients send the headers and body of a small POST together, so the difference is usually negligible.

Data structure of HTTP request message

POST /index.jsp HTTP/1.1
Accept-Language: zh-cn
Connection: Keep-Alive 
Host: www.jxsd.cn
Content-Length: 28 

userName=tom&password=123456

As shown above, the data structure of the HTTP request message is divided into request line, request header, blank line, and request data.

Request line:

  • Request method field: GET, POST, etc.
  • URL field: the path of the resource to be accessed, relative to the host, such as /index.jsp.
  • HTTP version number field: HTTP/1.1 or HTTP/1.0.

Request header:

  • Accept-Language: The languages the browser prefers, for example zh-cn for Simplified Chinese or zh for Chinese.
  • Connection: Whether to keep the connection open or close it after this request is processed.
  • Host: The domain name of the host the client wants to access.
  • Content-Length: The length of the request message body in bytes.
  • User-Agent: Identifies the client software, such as the browser type and version.
  • Accept-Charset: The character sets acceptable to the browser, such as UTF-8, an encoding of Unicode.
  • Accept-Encoding: The content encodings that the browser can decode, such as gzip.
  • Points to note: Most request headers are optional, but a request that carries a body, such as a POST, must state its length with Content-Length (or use chunked transfer encoding), and Host is mandatory in HTTP/1.1.

Blank lines:

  • A blank line separates the request header from the request data, indicating to the server that the request header ends here.

Request data:

  • If the request method is GET, this item is normally empty.
  • If the request method is POST, this item holds the data to be submitted.
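
The four parts of a request message can be separated with a few lines of Python. This is a minimal sketch rather than a full HTTP parser: parse_request is a hypothetical helper, and the input is a POST version of the sample message with explicit CRLF line endings.

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP request into request line, headers, and body."""
    # The first blank line (CRLF CRLF) marks the end of the header section.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {
        "method": method,
        "target": target,
        "version": version,
        "headers": headers,
        "body": body,
    }


raw_request = (
    "POST /index.jsp HTTP/1.1\r\n"
    "Accept-Language: zh-cn\r\n"
    "Connection: Keep-Alive\r\n"
    "Host: www.jxsd.cn\r\n"
    "Content-Length: 28\r\n"
    "\r\n"
    "userName=tom&password=123456"
)

parsed = parse_request(raw_request)
print(parsed["method"], parsed["target"], parsed["version"])
print(parsed["headers"]["Host"])
print(parsed["body"])
```

Note that Content-Length (28) matches the number of bytes in the body, which is how the server knows where the request data ends.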

Data structure of HTTP response message

HTTP/1.1 200 OK
Date: Sat, 31 Dec 2005 23:59:59 GMT
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 122

<html>
<head>
<title>Wrox Homepage</title>
</head>
<body>
<!-- body goes here -->
</body>
</html>

As shown above, an HTTP response message consists of a response line (also called the status line), response headers, a blank line, and a response body.

Response line:

  • HTTP version number
  • HTTP status code
  • Description of the status code
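  • Status codes fall into five classes: 1xx (informational), 2xx (success), 3xx (redirection), 4xx (client error), and 5xx (server error). Common examples include 200 OK, 302 Found, 404 Not Found, and 500 Internal Server Error.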
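
Python's standard library ships the registered status codes in http.HTTPStatus, which gives a quick way to look up the description that accompanies each code. The selection of codes below is only an illustrative sample.

```python
from http import HTTPStatus

# A sample of frequently seen status codes (chosen for illustration).
common_codes = (200, 301, 302, 304, 400, 403, 404, 500)

common = [(HTTPStatus(code).value, HTTPStatus(code).phrase) for code in common_codes]
for value, phrase in common:
    # The first digit of the code identifies its class (2xx, 3xx, 4xx, 5xx).
    print(f"{value} {phrase} (class {value // 100}xx)")
```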

Response header:

The response headers describe the server and the returned data, and tell the client how to handle that data.

  • Date: The date and time at which the response was generated, expressed in GMT.
  • Content-Type: The MIME type of the returned document, such as "text/html".
  • Content-Length: The length of the response body in bytes.
  • Allow: The request methods supported by the server (such as GET, POST, etc.).
  • Content-Encoding: The encoding applied to the document, such as gzip.
  • Location: Used together with redirect status codes such as 302 to tell the browser the new URL from which it should fetch the document.
  • Expires: The date and time after which the cached response is considered stale; an invalid value such as 0 or -1 means the response has already expired and should not be served from cache.
  • Last-Modified: The time when the document was last modified.
  • Refresh: Tells the browser to refresh the page after the given number of seconds.
  • Server: Tells the browser which server software is handling the request.

Response body:

The response body is the message body of the response. If the request is for raw data, the data is returned; if the request is for an HTML page, the HTML code is returned.
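
To make the layout of a response concrete, the sketch below assembles one by hand and then splits it back into its parts, as a client would. build_response is a hypothetical helper written for this illustration, and the HTML body is a shortened version of the sample above; Content-Length is computed from the encoded body rather than hard-coded.

```python
def build_response(status_code: int, reason: str, body: str) -> bytes:
    """Assemble a response: status line, headers, blank line, body."""
    body_bytes = body.encode("iso-8859-1")
    head = (
        f"HTTP/1.1 {status_code} {reason}\r\n"
        "Content-Type: text/html;charset=ISO-8859-1\r\n"
        f"Content-Length: {len(body_bytes)}\r\n"
        "\r\n"  # blank line that ends the header section
    )
    return head.encode("ascii") + body_bytes


html = "<html><head><title>Wrox Homepage</title></head><body></body></html>"
response = build_response(200, "OK", html)

# Split the message back into response line, headers, and body.
head, _, payload = response.partition(b"\r\n\r\n")
status_line, *header_lines = head.decode("ascii").split("\r\n")
version, code, reason = status_line.split(" ", 2)
headers = dict(line.split(": ", 1) for line in header_lines)

print(version, code, reason)
print(headers["Content-Type"])
print(int(headers["Content-Length"]) == len(payload))
```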

The process of establishing and disconnecting HTTP connections

  1. TCP establishes a connection through a three-way handshake.
  2. The browser sends a request message and request data to the server.
  3. The server sends a response message and response data to the browser.
  4. TCP closes the connection with a four-way handshake (an exchange of FIN and ACK segments).

(Schematic diagram of the steps above: TCP handshake, request, response, TCP close.)

Does HTTP require a three-way handshake for every request?

  • In HTTP/0.9 and HTTP/1.0, each request by default needs a new TCP connection, and therefore a new three-way handshake.
  • In HTTP/1.1 it does not. Connections are persistent by default (the keep-alive behavior, controlled by the Connection header), so multiple requests and responses can reuse one TCP connection. Separately, TCP-level keepalive probes (heartbeat packets) can be sent periodically over an idle connection to show that the peer is still active.
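
Connection reuse under HTTP/1.1 can be observed locally with Python's standard http.server and http.client. This is a sketch under stated assumptions: the Handler class, the loopback address, and the automatically chosen port are all made up for the demonstration. If both requests leave from the same local port, they shared a single TCP connection and only one handshake took place.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
local_ports = []
for _ in range(2):
    conn.request("GET", "/")
    response = conn.getresponse()
    response.read()  # the body must be consumed before the connection is reused
    local_ports.append(conn.sock.getsockname()[1])

conn.close()
server.shutdown()
server.server_close()

print("same TCP connection reused:", local_ports[0] == local_ports[1])
```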

keep-alive function

  • Detect dead peers. The main purpose is to find out quickly that a connection has failed, so that the client can reconnect.
  • Prevent connections from being dropped due to inactivity. Servers and intermediate devices such as NAT gateways and firewalls release idle connections to save resources; regular heartbeat packets show that the connection is still in use and should not be closed.

Features of HTTP

  • Connectionless: In the original model, each connection handles only one request and is closed once the request has been processed.
  • Stateless: Each request from the client is independent, and the server does not keep any client state between requests. When the client sends a second request, the server handles it with no memory of the first.

Some people may ask: HTTP/1.1 uses persistent connections, yet one of the characteristics of HTTP is being connectionless. Is this a contradiction?

  • No, it is not.
  • The persistent connection in HTTP/1.1 refers to the underlying TCP connection: it is not closed after HTTP data is transmitted, and the next HTTP exchange reuses the same TCP connection.
  • "Connectionless" refers to the HTTP exchange itself, that is, the request and response that take place after the TCP connection is established. Once a request has been answered, that exchange is finished, and HTTP keeps no logical connection between it and the next one.

Some people will also ask: HTTP is stateless, but if the server needs to identify the client in order to perform a specific operation, can this be achieved?

  • Yes, it can.
  • Cookies can be used to identify the user behind a request. With a cookie, when the browser sends a second request to the server, the server can "recognize" the requester.
  • Cookies work as follows: the server generates a unique identifier for the user and returns it in the Set-Cookie header of the response. The browser stores the identifier and sends it back in the Cookie header of subsequent requests.
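
The cookie round trip can be sketched with Python's standard http.cookies module. The cookie name session_id and the use of a random UUID as the identifier are assumptions made for this illustration, not something HTTP prescribes.

```python
import uuid
from http.cookies import SimpleCookie

# Server, first response: generate an identifier and emit a Set-Cookie header.
session_id = uuid.uuid4().hex
outgoing = SimpleCookie()
outgoing["session_id"] = session_id
outgoing["session_id"]["path"] = "/"
outgoing["session_id"]["httponly"] = True
set_cookie_header = outgoing.output(header="Set-Cookie:")
print(set_cookie_header)

# Browser, next request: the stored value is sent back in the Cookie header.
cookie_header = f"session_id={session_id}"

# Server, next request: parse the Cookie header to recognize the client.
incoming = SimpleCookie()
incoming.load(cookie_header)
recognized = incoming["session_id"].value == session_id
print("client recognized:", recognized)
```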

The difference between HTTP/1.0 and HTTP/1.1

  • HTTP/1.0 by default handles only one request and response per connection, and does not require the Host field.
  • HTTP/1.1 can handle multiple requests and responses per connection, and requires the Host field.

Two working methods of HTTP/1.1

  • Non-pipelined: The browser does not send the next request until it has received the previous response.
  • Pipelined: After the browser sends the first request, it can send the second request without waiting for the first response; the server still returns the responses in the order in which the requests were received.

HTTPS

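The difference between it and HTTP: HTTPS is HTTP carried over an SSL/TLS layer, so the data is encrypted in transit, and it uses port 443 by default instead of port 80.

SSL layer protocol: SSL (Secure Sockets Layer), now superseded by its successor TLS (Transport Layer Security), is a security protocol used to encrypt network connections. It sits between the transport layer (TCP) and the application layer (HTTP).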

HTTPS process

In a simplified model, an HTTPS request involves two exchanges between the client and the server, which can be subdivided into 8 steps. (Strictly speaking, the key exchange is carried out with TLS handshake messages rather than HTTP requests.)

  1. The client initiates an HTTPS request to the server and connects to port 443 of the server.
  2. The server has a key pair, that is, a public key and a private key, used for asymmetric encryption. The server keeps the private key secret; the public key can be sent to anyone.
  3. The server sends its public key to the client, contained in its digital certificate.
  4. After the client receives the public key, it verifies its legitimacy by checking the digital certificate sent by the server. If there is a problem with the certificate, the HTTPS transmission cannot continue. If the certificate is valid, the client generates a random value to be used as the key for symmetric encryption, called the client key. The client then encrypts the client key asymmetrically with the server's public key, turning it into ciphertext. This ends the first exchange.
  5. The client starts the second exchange by sending the encrypted client key to the server.
  6. After the server receives the ciphertext, it decrypts it with its own private key to recover the client key, then uses the client key to symmetrically encrypt the data, turning the data into ciphertext.
  7. The server sends the ciphertext to the client.
  8. The client receives the ciphertext, decrypts it symmetrically with the client key, and obtains the data sent by the server. This ends the second exchange, and the entire HTTPS transmission is complete.
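
The eight steps can be traced with a toy example. Everything here is an assumption made for illustration and is not secure: the key pair is textbook RSA with tiny primes, and xor_cipher is a made-up stand-in for a real symmetric cipher such as AES. Real HTTPS relies on a TLS library for all of this.

```python
import hashlib
import secrets

# Step 2: the server's key pair (textbook RSA with tiny primes, insecure).
P, Q, E = 61, 53, 17
N = P * Q                            # public modulus, sent with E as the public key
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, never leaves the server


def xor_cipher(key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 keystream derived from the key.

    The same call both encrypts and decrypts.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Steps 3-4: the client receives (N, E), generates a random client key,
# and encrypts it with the server's public key.
client_key = secrets.randbelow(N - 2) + 2
encrypted_key = pow(client_key, E, N)

# Steps 5-6: the server decrypts the client key with its private key,
# then encrypts the response data symmetrically with that key.
server_side_key = pow(encrypted_key, D, N)
ciphertext = xor_cipher(server_side_key, b"hello over https")

# Steps 7-8: the client decrypts the response with the same client key.
plaintext = xor_cipher(client_key, ciphertext)
print(plaintext)
```

Only the short client key is encrypted asymmetrically; the bulk data uses the faster symmetric cipher, which is the reason for combining the two kinds of encryption.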

How does the client verify the legitimacy of the digital certificate?

  • The digital certificate names the organization (certificate authority) that issued it. The client looks that organization up among the trusted root certificates stored in the browser or operating system. If the issuer is not found, it is not trusted and the digital certificate is treated as invalid. If it is found, the client uses the issuer's public key to verify the signature on the certificate.
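
In Python, this check is performed by the standard ssl module. The short sketch below only inspects the default client-side settings and opens no connection; it shows that a default context requires a certificate that chains to a trusted root and also checks that the certificate matches the host name.

```python
import ssl

# A default client context loads the system's trusted root certificates.
context = ssl.create_default_context()

# The server must present a certificate that validates against those roots.
requires_certificate = context.verify_mode == ssl.CERT_REQUIRED

# The certificate must also match the host name being connected to.
checks_hostname = context.check_hostname

print("certificate required:", requires_certificate)
print("host name checked:", checks_hostname)
```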

The meaning and difference between symmetric encryption and asymmetric encryption

  • In asymmetric encryption, encryption and decryption use different keys (a public key and a private key); in symmetric encryption, encryption and decryption use the same key.
  • Asymmetric encryption is safer for exchanging keys: with symmetric encryption both parties must share the same key, and if that key is intercepted or leaked, the encrypted data is exposed; with asymmetric encryption, the private key never has to be transmitted.
  • Asymmetric encryption is computationally more expensive, involving operations such as multiplication and modular arithmetic on large numbers, so it is slow; symmetric encryption relies mainly on bit operations, so it is fast.

Three random numbers during HTTPS establishment

  1. The client generates a random number when it sends its first request.
  2. The server generates a random number when it responds to the request.
  3. The client generates another random number, which is asymmetrically encrypted with the server's public key.

The role of the three HTTPS random numbers

  • For the client: after verifying the legitimacy of the server's public key, it generates the third random number. At this point the client has all 3 random numbers, and from them it derives a new key (Session Secret).
  • For the server: after using its private key to decrypt the ciphertext sent by the client, it obtains the client's third random number, so the server also has all 3 random numbers. From them the server derives the same new key (Session Secret).
  • The new key is the same for both client and server. Their subsequent data transmission uses this key (Session Secret) for symmetric encryption and symmetric decryption.
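
The idea that both sides arrive at the same key from the same three random numbers can be sketched as follows. derive_session_key is a hypothetical helper that uses a single HMAC-SHA256 call; real TLS uses its own, more elaborate key derivation function, so this only illustrates the principle.

```python
import hashlib
import hmac
import secrets


def derive_session_key(client_random: bytes, server_random: bytes,
                       third_random: bytes) -> bytes:
    """Mix the three random numbers into one key (illustrative, not the TLS PRF)."""
    return hmac.new(third_random, client_random + server_random,
                    hashlib.sha256).digest()


client_random = secrets.token_bytes(32)  # 1. sent by the client in the clear
server_random = secrets.token_bytes(32)  # 2. sent by the server in the clear
third_random = secrets.token_bytes(48)   # 3. sent encrypted with the server's public key

# Both sides hold the same three values, so they derive the same key
# without the key itself ever being transmitted.
client_session_key = derive_session_key(client_random, server_random, third_random)
server_session_key = derive_session_key(client_random, server_random, third_random)

print(client_session_key == server_session_key)
```

An eavesdropper sees the first two random numbers but not the third, so it cannot compute the same key.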

Origin blog.csdn.net/qq_40796375/article/details/122662936