HTTP/HTTPS protocol, common status codes, GET/POST requests, HTTP caching mechanism

1. What is the HTTP protocol

HTTP stands for Hypertext Transfer Protocol. It defines how a browser requests Web documents from a Web server and how the server transmits those documents back to the browser. In terms of layering, HTTP is an application-layer protocol. It is an important foundation for exchanging files reliably on the World Wide Web, and it specifies in detail the rules for communication between the client browser and the server. HTTP transfers data (HTML files, image files, and so on) on top of the TCP/IP protocol suite. The "letter" that the client sends to the server is called the request, and the "letter" that the server sends back to the browser is called the response.

Note: A packet capture tool such as HttpWatch can be used to inspect the content of HTTP messages.

2. Versions of the HTTP protocol

HTTP protocol versions covered here: HTTP/1.0 and HTTP/1.1

3. The difference between HTTP/1.0 and HTTP/1.1

In HTTP/1.0, after the client establishes a connection with the web server, only one web resource can be fetched over that connection by default, and the connection is then closed.

In HTTP/1.1, after establishing a connection with the web server, the client can fetch multiple web resources over the same connection, because persistent connections are the default.
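The connection reuse described above can be observed with Python's standard library. The following is only a sketch under stated assumptions: it starts a throwaway local server, and the names `KeepAliveHandler` and `fetch_over_one_connection` as well as the paths are made up for illustration. If the local port reported for both requests is the same, both resources travelled over one TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets the server keep the connection open between requests.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = f"resource {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        # Content-Length tells the client where this response ends,
        # which is what makes reusing the connection possible.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_over_one_connection(paths):
    """Fetch every path over one connection; return (status, body, local_port) tuples."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        results = []
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            body = response.read().decode("utf-8")
            # The local port stays the same as long as the socket is reused.
            results.append((response.status, body, conn.sock.getsockname()[1]))
        return results
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for status, body, port in fetch_over_one_connection(["/a.html", "/b.png"]):
        print(status, body, "local port", port)
```

With an HTTP/1.0 style server (one that closes the connection after each response), the client would have to open a new socket for the second path, and the local ports would normally differ.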

4. HTTP request

When a client connects to a server and asks it for a web resource, the client is said to send an HTTP request to the server.

Details of the HTTP request: message headers

  • Accept: The browser uses this header to tell the server which data types it supports
  • Accept-Charset: The browser uses this header to tell the server which character sets it supports
  • Accept-Encoding: The browser uses this header to tell the server which compression formats it supports
  • Accept-Language: The browser uses this header to tell the server its language preferences
  • Host: The browser uses this header to tell the server which host it wants to reach
  • If-Modified-Since: The browser uses this header to tell the server the time of its cached copy, so the server can reply 304 if the resource has not changed since then
  • Referer: The browser uses this header to tell the server which page the request came from; commonly used for anti-leeching (hotlink protection)
  • Connection: The browser uses this header to tell the server whether to close the connection after the request or keep it alive
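On the wire, a request is plain text: a request line, then one `Name: value` line per header, then an empty line. The sketch below splits such a message into its parts. `parse_request` and `SAMPLE_REQUEST` are illustrative names, and the host and header values are made-up examples rather than anything captured from a real browser.

```python
SAMPLE_REQUEST = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Accept: text/html\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "Accept-Language: en-US\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw HTTP/1.x request into method, target, version, headers, body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: method, request target, protocol version.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


if __name__ == "__main__":
    method, target, version, headers, body = parse_request(SAMPLE_REQUEST)
    print(method, target, version)
    for name, value in headers.items():
        print(f"{name}: {value}")
```

This parser ignores repeated headers and line folding, which a production parser would need to handle.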

Details of the HTTP request: request line

The first token of the request line (for example, GET) is called the request method. The request methods are POST, GET, HEAD, OPTIONS, DELETE, TRACE, PUT, and CONNECT; the most commonly used are GET and POST.

1. OPTIONS
Returns the HTTP request methods that the server supports for a specific resource. A request for '*' can also be sent to the web server to test the server's capabilities.
2. HEAD
Asks the server for a response identical to that of a GET request, except that the response body is not returned. This makes it possible to obtain the meta-information in the response headers without transferring the entire response content.
3. GET
Sends a request for a specific resource; in essence, it asks to obtain a resource from the server. The resource is returned to the client as a set of HTTP headers plus representation data (such as HTML text, images, or video). The GET request itself does not carry representation data.
4. POST
Submits data to the specified resource for processing (for example, submitting a form or uploading a file). The data is contained in the request body. A POST request may create new resources and/or modify existing ones. The corresponding LoadRunner functions for POST requests are web_submit_data and web_submit_form.
5. PUT
Uploads the latest content to the specified resource location.
6. DELETE
Asks the server to delete the resource identified by the Request-URI.
7. TRACE
Echoes back the request as received by the server; mainly used for testing or diagnosis.
8. CONNECT
Reserved in HTTP/1.1 for proxy servers that can switch the connection to tunnel mode.
Note:
1) Method names are case-sensitive. When the requested resource does not support the request method, the server should return status code 405 (Method Not Allowed); when the server does not recognize or does not implement the request method, it should return status code 501 (Not Implemented).
2) An HTTP server should implement at least the GET and HEAD methods; the other methods are optional. In addition to the methods above, particular HTTP servers may also support custom extension methods.
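The 405 versus 501 rule in note 1 can be sketched as a small decision function. This is only an illustration: `KNOWN_METHODS` and `status_for_method` are hypothetical names, and a real server would decide which methods a resource allows from its own routing table.

```python
# Methods this hypothetical server implements at all.
KNOWN_METHODS = {"GET", "HEAD", "POST", "PUT", "DELETE", "OPTIONS", "TRACE", "CONNECT"}


def status_for_method(method, allowed_for_resource):
    """Pick the status code a server should answer with for a given method."""
    # Comparison is case-sensitive on purpose: "get" is not the GET method.
    if method not in KNOWN_METHODS:
        return 501  # Not Implemented: the server does not know this method
    if method not in allowed_for_resource:
        return 405  # Method Not Allowed: known method, but not for this resource
    return 200


if __name__ == "__main__":
    read_only = {"GET", "HEAD"}
    for method in ("GET", "DELETE", "get", "BREW"):
        print(method, status_for_method(method, read_only))
```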

5. HTTP response

An HTTP response represents the data that the server sends back to the client. It consists of a status line, several message headers, and the entity content (the body).

Status codes beginning with 2 (success) indicate that the request was processed successfully.

200 (OK) The server has successfully processed the request. Usually, this means that the server provided the requested page.
201 (Created) The request was successful and the server created a new resource.
202 (Accepted) The server has accepted the request but has not yet finished processing it.
203 (Non-Authoritative Information) The server has successfully processed the request, but the returned information may come from another source.
204 (No Content) The server successfully processed the request but did not return any content.
205 (Reset Content) The server successfully processed the request and did not return any content; unlike 204, it asks the requester to reset the document view (for example, to clear a form).
206 (Partial Content) The server successfully fulfilled a GET request for part of the resource (a range request).

Status codes beginning with 3 (redirection) indicate that further action is required to complete the request. Usually, these status codes are used for redirection.

300 (Multiple Choices) The server can perform several operations in response to the request. The server can select one based on the requester (user agent), or provide a list of operations for the requester to choose from.
301 (Moved Permanently) The requested page has been permanently moved to a new location. When the server returns this response (to a GET or HEAD request), the requester is automatically redirected to the new location.
302 (Found) The server is currently responding with a page from a different location, but the requester should continue to use the original location for future requests.
303 (See Other) The server returns this code when the requester should issue a separate GET request to a different location to retrieve the response.
304 (Not Modified) The requested page has not been modified since the last request. When the server returns this response, the content of the page is not returned.
305 (Use Proxy) The requester can only access the requested page through a proxy. If the server returns this response, it also indicates which proxy the requester should use. This code is deprecated in later HTTP specifications.
307 (Temporary Redirect) The server is currently responding with a page from a different location, but the requester should continue to use the original location for future requests. Unlike 302 as commonly implemented, the request method must not change when the redirect is followed.

Status codes beginning with 4 (client error) indicate that the request may be wrong, which prevents the server from processing it.

400 (Bad Request) The server does not understand the syntax of the request.
401 (Unauthorized) The request requires authentication. For pages that require login, the server may return this response.
403 (Forbidden) The server rejected the request.
404 (Not Found) The server could not find the requested page.
405 (Method Not Allowed) The method specified in the request is not allowed for the requested resource.
406 (Not Acceptable) The server cannot respond with content that has the characteristics requested.
407 (Proxy Authentication Required) This status code is similar to 401 (Unauthorized), but specifies that the requester must authenticate with the proxy.
408 (Request Timeout) The server timed out while waiting for the request.
409 (Conflict) The server encountered a conflict while fulfilling the request. The server should include information about the conflict in the response.
410 (Gone) The server returns this response if the requested resource has been permanently deleted.
411 (Length Required) The server does not accept requests without a valid Content-Length header field.
412 (Precondition Failed) The server did not meet one of the preconditions that the requester set in the request.
413 (Request Entity Too Large) The server cannot process the request because the request entity is too large and exceeds the server's processing capacity.
414 (Request-URI Too Long) The requested URI (usually a URL) is too long for the server to process.
415 (Unsupported Media Type) The format of the request is not supported by the requested page.
416 (Requested Range Not Satisfiable) The server returns this status code if the page cannot provide the requested range.
417 (Expectation Failed) The server cannot meet the requirements of the Expect request header field.

Status codes beginning with 5 (server error) indicate that an internal error occurred while the server was trying to process the request. These are usually errors in the server itself rather than errors in the request.

500 (Internal Server Error) The server encountered an error and could not complete the request.
501 (Not Implemented) The server does not have the functionality needed to complete the request. For example, the server may return this code when it does not recognize the request method.
502 (Bad Gateway) The server was acting as a gateway or proxy and received an invalid response from the upstream server.
503 (Service Unavailable) The server is currently unavailable (due to overload or maintenance downtime). Usually, this is only a temporary state.
504 (Gateway Timeout) The server was acting as a gateway or proxy and did not receive a response from the upstream server in time.
505 (HTTP Version Not Supported) The server does not support the HTTP protocol version used in the request.
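The first digit of a status code is what identifies its class, and Python's standard `http.HTTPStatus` enum carries the official reason phrases. The helper below is a small sketch; the function name `describe_status` and the class labels in `STATUS_CLASSES` are illustrative choices.

```python
from http import HTTPStatus

# Class labels keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code):
    """Return the code, its standard reason phrase, and its class."""
    phrase = HTTPStatus(code).phrase  # raises ValueError for unknown codes
    return f"{code} {phrase} ({STATUS_CLASSES[code // 100]})"


if __name__ == "__main__":
    for code in (200, 203, 304, 404, 504):
        print(describe_status(code))
```

Some reason phrases (for example those of 413 and 416) have been reworded across HTTP specifications and Python versions, so a client should branch on the numeric code rather than on the phrase.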

Details of the HTTP response: common response headers

Common response headers (message headers) in HTTP responses:

  • Location: The server uses this header to tell the browser where to redirect to
  • Server: The server uses this header to tell the browser what kind of server it is
  • Content-Encoding: The server uses this header to tell the browser the compression format of the data
  • Content-Length: The server uses this header to tell the browser the length of the data sent back
  • Content-Language: The server uses this header to tell the browser the language of the content
  • Content-Type: The server uses this header to tell the browser the type of the data sent back
  • Refresh: The server uses this header to tell the browser to refresh at a regular interval
  • Content-Disposition: The server uses this header to tell the browser to treat the data as a download
  • Transfer-Encoding: The server uses this header to tell the browser that the data is sent back in chunks
  • Expires: -1 (tells the browser not to cache the response)
  • Cache-Control: no-cache (tells the browser not to reuse a cached copy without checking with the server)
  • Pragma: no-cache (the older HTTP/1.0 equivalent, usually sent together with the two headers above)
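HTTP headers use the same `Name: value` syntax as e-mail headers, so Python's standard `email` package can read them. The sketch below pulls a few of the headers listed above out of a made-up response. `RAW_RESPONSE_HEADERS`, `summarize_headers`, and the file name are illustrative assumptions.

```python
from email import message_from_string

RAW_RESPONSE_HEADERS = (
    "Content-Type: text/html; charset=utf-8\r\n"
    "Content-Length: 1024\r\n"
    'Content-Disposition: attachment; filename="report.pdf"\r\n'
    "Expires: -1\r\n"
    "Cache-Control: no-cache\r\n"
    "Pragma: no-cache\r\n"
    "\r\n"
)


def summarize_headers(raw):
    """Extract a few commonly used response headers from a raw header block."""
    headers = message_from_string(raw)
    cache_control = headers.get("Cache-Control", "").lower()
    pragma = headers.get("Pragma", "").lower()
    return {
        "media_type": headers.get_content_type(),      # from Content-Type
        "charset": headers.get_content_charset(),      # charset parameter
        "length": int(headers["Content-Length"]),
        "download_name": headers.get_filename(),       # from Content-Disposition
        "no_cache": "no-cache" in cache_control or "no-cache" in pragma,
    }


if __name__ == "__main__":
    for key, value in summarize_headers(RAW_RESPONSE_HEADERS).items():
        print(f"{key}: {value}")
```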

The difference between HTTP and HTTPS

HTTP (Hypertext Transfer Protocol) is used to transfer information between a Web browser and a website server. HTTP sends content in clear text and provides no data encryption. If an attacker intercepts the messages exchanged between the Web browser and the website server, the information in them can be read directly. HTTP is therefore not suitable for transmitting sensitive information such as credit card numbers and passwords.

To address this weakness, another protocol is needed: HTTPS, the Hypertext Transfer Protocol over Secure Sockets Layer. For secure data transmission, HTTPS adds the SSL protocol (in current deployments, its successor TLS) beneath HTTP. SSL relies on certificates to verify the identity of the server, and it encrypts the communication between the browser and the server.

Basic concepts of HTTP and HTTPS

HTTP: the most widely used network protocol on the Internet. It is a request/response standard between client and server, carried over TCP, and is used to transfer hypertext from a WWW server to the local browser. It can make browsers more efficient and reduce network traffic.

HTTPS: an HTTP channel designed for security. Put simply, it is the secure version of HTTP: an SSL layer is added beneath HTTP. The security foundation of HTTPS is SSL, so the details of encryption are handled by SSL.

The HTTPS protocol serves two main functions: one is to establish a secure channel that protects data in transit; the other is to confirm the authenticity of the website.

What is the difference between HTTP and HTTPS?

The main differences between HTTPS and HTTP are as follows:

1. HTTPS requires applying to a CA for a certificate. Free certificates are available from some CAs, but many certificates require a fee.

2. HTTP is the hypertext transfer protocol and transmits information in plain text, whereas HTTPS is a secure transfer protocol encrypted with SSL.

3. HTTP and HTTPS use different connection methods and different default ports: 80 for the former and 443 for the latter.

4. An HTTP connection is simple and stateless. HTTPS is a network protocol built from SSL plus HTTP that provides encrypted transmission and identity authentication, which makes it more secure than HTTP.
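The default ports and the certificate checking mentioned above are visible in Python's standard library without opening any connection. This is a sketch only: `https_defaults` is an illustrative name and `www.example.com` is a placeholder host.

```python
import http.client
import ssl


def https_defaults(host):
    """Report default ports and the verification settings of a default TLS context."""
    # create_default_context() returns a context that verifies certificates
    # against the system's trusted CAs and checks the host name.
    context = ssl.create_default_context()
    # Constructing the objects does not connect; it only fills in defaults.
    secure = http.client.HTTPSConnection(host, context=context)
    plain = http.client.HTTPConnection(host)
    return {
        "http_port": plain.port,
        "https_port": secure.port,
        "certificate_required": context.verify_mode == ssl.CERT_REQUIRED,
        "hostname_checked": context.check_hostname,
    }


if __name__ == "__main__":
    for key, value in https_defaults("www.example.com").items():
        print(f"{key}: {value}")
```

Disabling either check removes the identity authentication that distinguishes HTTPS from plain HTTP, so the defaults are usually the right choice.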

HTTP caching mechanism

Web caches can be roughly divided into database caches, server-side caches (proxy server caches, CDN caches), and browser caches.

The browser cache also covers many things: the HTTP cache, IndexedDB, cookies, localStorage, and so on. Only the HTTP cache is discussed here.

Before we learn more about HTTP caching, let's clarify a few terms:

  • Cache hit ratio: the ratio of the number of requests served from the cache to the total number of requests. The higher, the better.

  • Expired content: content that is marked as "stale" once its configured freshness lifetime has passed. Generally, expired content cannot be used to respond to client requests; the cache must request new content from the origin server or verify that the cached content is still valid.

  • Validation: checking whether expired content in the cache is still valid, and refreshing its expiration time if the check passes.

  • Invalidation: removing content from the cache. When content changes, the now-invalid content must be removed.
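Validation is commonly done with a conditional request: the client sends the time of its cached copy in If-Modified-Since, and the server answers 304 if the resource has not changed since then. The function below sketches the server-side decision. `validate_cached_copy` is an illustrative name, the dates are made-up examples, and ETag-based validation is not covered.

```python
from email.utils import parsedate_to_datetime


def validate_cached_copy(if_modified_since, last_modified):
    """Return 304 if the client's cached copy is still valid, otherwise 200."""
    if if_modified_since is None:
        return 200  # the client has nothing cached, so send the full resource
    cached_at = parsedate_to_datetime(if_modified_since)
    changed_at = parsedate_to_datetime(last_modified)
    # Unchanged since the cached copy was made: no body needs to be sent.
    return 304 if changed_at <= cached_at else 200


if __name__ == "__main__":
    cached = "Wed, 21 Oct 2015 07:28:00 GMT"
    print(validate_cached_copy(cached, "Wed, 21 Oct 2015 07:28:00 GMT"))
    print(validate_cached_copy(cached, "Thu, 22 Oct 2015 07:28:00 GMT"))
```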

Strong cache

When the strong cache is hit, the browser does not send the request to the server. In Chrome's developer tools, the HTTP status code is shown as 200, but the Size column displays (from cache).

Strong caching is controlled by the Expires or Cache-Control fields in the HTTP response headers, which indicate how long the resource may be cached.

[Figure: the first request of the browser]

[Figure: a repeat request from the browser]
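The strong-cache decision can be sketched as a freshness check on the stored response headers. The code below is a simplified model, not how any particular browser implements it: `fresh_until` and `is_strong_cache_hit` are illustrative names, only `max-age`, `no-cache`, `no-store`, and `Expires` are considered, and `max-age` is given priority over `Expires`.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def fresh_until(headers, response_time):
    """Return the moment the cached response stops being fresh, or None."""
    cache_control = headers.get("Cache-Control", "").lower()
    if "no-store" in cache_control or "no-cache" in cache_control:
        return None  # must not be served without asking the server
    match = re.search(r"max-age=(\d+)", cache_control)
    if match:
        # max-age is relative to the time the response was received.
        return response_time + timedelta(seconds=int(match.group(1)))
    expires = headers.get("Expires")
    if expires:
        try:
            return parsedate_to_datetime(expires)  # absolute expiry date
        except (TypeError, ValueError):
            return None  # invalid values such as "-1" mean already expired
    return None


def is_strong_cache_hit(headers, response_time, now):
    """True if the cached response can be reused without contacting the server."""
    limit = fresh_until(headers, response_time)
    return limit is not None and now < limit


if __name__ == "__main__":
    received = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    later = received + timedelta(seconds=30)
    print(is_strong_cache_hit({"Cache-Control": "max-age=60"}, received, later))
    print(is_strong_cache_hit({"Expires": "-1"}, received, later))
```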

The difference between GET and POST

1. Standard answer

  • GET is harmless when the browser goes back, while POST will submit the request again.
  • A URL generated by GET can be bookmarked, while one for POST cannot.
  • GET requests are actively cached by the browser, while POST requests are not unless configured manually.
  • GET requests can only be URL-encoded, while POST supports multiple encodings.
  • GET request parameters are retained in full in the browser history, while POST parameters are not.
  • Parameters sent in the URL of a GET request are limited in length, while POST has no such limit (the limit is imposed by browsers and servers rather than by HTTP itself).
  • For parameter data types, GET only accepts ASCII characters (other characters must be percent-encoded), while POST has no restrictions.
  • GET is less secure than POST, because the parameters are directly exposed in the URL, so it should not be used to transmit sensitive information.
  • GET parameters are passed in the URL, while POST parameters are placed in the request body.
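The last point, parameters in the URL versus in the request body, can be seen by building both kinds of request with Python's standard library. Nothing is sent over the network in this sketch; `build_get`, `build_post`, the URL, and the parameters are illustrative assumptions.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_get(base_url, params):
    """GET: the parameters become the query string of the URL."""
    return Request(f"{base_url}?{urlencode(params)}", method="GET")


def build_post(base_url, params):
    """POST: the same parameters travel in the request body instead."""
    body = urlencode(params).encode("ascii")
    return Request(
        base_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    params = {"q": "http caching", "page": "2"}
    get_request = build_get("http://www.example.com/search", params)
    post_request = build_post("http://www.example.com/search", params)
    print(get_request.get_method(), get_request.full_url, get_request.data)
    print(post_request.get_method(), post_request.full_url, post_request.data)
```

In both cases the parameters are readable by anyone who can observe the traffic unless the request is sent over HTTPS.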

2. In-depth answers

  • GET and POST are two methods of sending requests in the HTTP protocol.
  • HTTP is a TCP/IP-based protocol for how data is communicated on the World Wide Web. (The underlying layer of HTTP is TCP/IP, so the underlying layer of GET and POST is also TCP/IP. In other words, GET and POST both run over TCP connections and can do the same things. Adding a request body to a GET, or URL parameters to a POST, is technically entirely feasible.)
  • GET is often said to generate one TCP data packet while POST generates two. (The claim is that for GET the browser sends the HTTP headers and data together and the server responds with 200 and the data; for POST the browser first sends the headers, the server responds with 100 Continue, the browser then sends the data, and the server responds with 200 OK.) This behavior depends on the client: the two-step exchange only happens when the client sends an Expect: 100-continue header, and HTTP itself does not require it.

Origin blog.csdn.net/hrj970808/article/details/109631077