[Network Principle 3] Application layer protocols HTTP and HTTPS

HTTP


What is HTTP

HTTP: Hypertext Transfer Protocol. It is a widely used application-layer protocol.

"Hypertext" means that the transmitted content is not limited to text (HTML and CSS are text); it also includes other resources such as pictures, video, audio and other binary data.

A complete application is composed of a front end and a back end, and the communication between them depends on HTTP. This is like buying things online: a courier company is needed between the merchant and the buyer, and HTTP is that courier company. The request methods GET and POST correspond to different kinds of delivery (standard delivery, express delivery, and so on).

work process

When we enter a URL in the browser, the browser sends an HTTP request to the corresponding server. After receiving the request, the server processes it and returns an HTTP response.
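The request/response cycle described above can be sketched with Python's standard library. This is only an illustration: the handler class `HelloHandler`, the function `round_trip`, the path `/index.html` and the HTML body are all made-up names and values, and the "server" is a throwaway one running on the local machine rather than a real website.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET with a small HTML page."""

    def do_GET(self):
        body = b"<h1>hello</h1>"
        self.send_response(200)                       # status line
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()                            # blank line after the headers
        self.wfile.write(body)                        # response body

    def log_message(self, format, *args):
        pass  # keep the example quiet


def round_trip(path="/index.html"):
    """Start a local server, send one GET request, return (status, reason, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)         # the HTTP request
        response = conn.getresponse()     # the HTTP response
        result = (response.status, response.reason, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling `round_trip()` plays the role of the browser: one request goes out and one response comes back.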

Press F12 to open Chrome's developer tools and switch to the Network tab. Then refresh the page: each record listed there is one HTTP request/response.

protocol format

HTTP is a text-based protocol. You can use Chrome developer tools or Fiddler to capture packets and analyze the details of HTTP requests/responses.
The left window shows all HTTP requests/responses, and you can select a request to view its details.
The upper right shows the message content of the HTTP request. (Switch to the Raw tab to see the detailed data format.)
The lower right shows the message content of the HTTP response. (Switch to the Raw tab to see the detailed data format.)

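Message format

An HTTP request consists of a request line (method, URL, version), header lines, a blank line, and an optional body. An HTTP response consists of a status line (version, status code, reason phrase), header lines, a blank line, and a body.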
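As a rough illustration of this text format, the sketch below splits a raw request into its parts. The function name `parse_request` and the sample request (the `/login` path and the form fields) are invented for this example, and real parsers handle many cases this one ignores.

```python
def parse_request(raw: str):
    """Split a raw HTTP request into (method, target, version, headers, body)."""
    # The head and the body are separated by one blank line.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    # Request line: method, URL (target) and version, separated by spaces.
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        # Each header is "Name: value" on its own line.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, target, version, headers, body


# Hypothetical request, written out by hand for the example.
raw = (
    "POST /login HTTP/1.1\r\n"
    "Host: www.example.jp\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 20\r\n"
    "\r\n"
    "user=alice&pass=1234"
)

method, target, version, headers, body = parse_request(raw)
print(method, target, version)
print(headers)
print(body)
```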

HTTP request

The parts of an HTTP request are analyzed below, one by one, against the message format above.

Method


GET method: GET is the most commonly used HTTP method. It is typically used to obtain a resource from the server. Entering a URL directly in the browser makes the browser send a GET request.
POST method: POST is also a common method. It is mostly used to submit data entered by the user to the server (for example, on a login page).

The differences between GET and POST:
① GET passes its data through the URL: the data is appended to the URL after a ?, and the parameters are separated by &. HTTP itself does not define a limit on URL length, but browsers and servers impose one in practice, often a few thousand characters, so GET is unsuitable for large amounts of data. POST puts its data in the request body, where the protocol imposes no size limit (a server may still configure one), so a large amount of data can be transmitted.
② GET parameters are part of the URL, so they are visible in the address bar and easy to expose. POST data is placed in the request body, so it is less visible. Over plain HTTP, however, both are transmitted in clear text and can be intercepted and tampered with; only HTTPS protects the data in transit.
③ Responses to GET requests can be cached. When the browser requests the same URL again, it can obtain the data directly from the cache to speed up access. Responses to POST requests are generally not cached, because each submission may change the state of the server.
④ The URL of a GET request, including its parameters, is saved in the browser history and usually in server logs, where malicious programs may find it. The body of a POST request is not saved in the browser history and is usually not written to server logs, which is relatively safer.
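To make the first difference concrete, here is a small sketch that builds (but does not send) a GET and a POST request carrying the same parameters. The URL and the parameter names are placeholders, and `urllib.request.Request` is just one convenient way to show where the data ends up.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder parameters for the example.
params = {"uid": "1", "q": "hello world"}
encoded = urlencode(params)  # key=value pairs joined by "&"

# GET: the data is appended to the URL after "?"; there is no request body.
get_req = Request("http://www.example.jp/dir/index.htm?" + encoded)

# POST: the same data travels in the request body instead.
post_req = Request(
    "http://www.example.jp/dir/index.htm",
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_req.get_method(), get_req.full_url, get_req.data)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

Nothing is sent over the network here; the objects only show how the two methods package the same data.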

URL

What we usually call a "web address" is actually a URL (Uniform Resource Locator). Every file on the Internet has a unique URL, which contains information indicating the location of the file and how the browser should handle it. The components described below refer to this example URL:

http://user:pass@www.example.jp:80/dir/index.htm?uid=1#ch1

http : the scheme (protocol) name. http and https are the most common; other schemes also exist (for example, jdbc:mysql is used when accessing MySQL).
user:pass : login information. Websites generally no longer authenticate users through the URL, so this part is usually omitted.
www.example.jp : server address. This is a domain name, which is resolved into a specific IP address through the DNS system (the IP address can be seen with the ping command).
80 : port number. When the port number is omitted, the browser chooses the port according to the protocol: http uses port 80 by default, and https uses port 443 by default.
dir/index.htm : hierarchical file path.
? : marks the start of the parameters.
uid=1 : query string. It is essentially a set of key-value pairs. Pairs are separated by &, and each key is separated from its value by =. Which keys and values appear, and how many, is decided entirely by the programmer, so the query string can carry whatever custom information the server needs.
ch1 : fragment identifier, mainly used to jump to and quickly locate a position (tag) within the page.
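The same breakdown can be reproduced with Python's `urllib.parse`. The URL below is assembled from the components listed above; treat it as an illustrative example rather than a real address.

```python
from urllib.parse import parse_qs, urlsplit

# Example URL assembled from the components described above.
url = "http://user:pass@www.example.jp:80/dir/index.htm?uid=1#ch1"
parts = urlsplit(url)

print(parts.scheme)                    # protocol scheme name
print(parts.username, parts.password)  # login information
print(parts.hostname)                  # server address (domain name)
print(parts.port)                      # port number
print(parts.path)                      # hierarchical file path
print(parts.query)                     # query string
print(parse_qs(parts.query))           # query string as key -> list of values
print(parts.fragment)                  # fragment identifier
```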

URL encoding and decoding

Characters such as / ? : already have special meanings in a URL, so they cannot appear arbitrarily. If a parameter needs to contain one of these special characters, the character must be escaped (URL-encoded) first.
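A minimal sketch of escaping and unescaping with `urllib.parse.quote` and `unquote`. The sample value is made up; passing `safe=""` asks `quote` to escape `/` as well, which it leaves alone by default.

```python
from urllib.parse import quote, unquote

# A made-up parameter value containing characters that are special in a URL.
value = "a/b?c=d&e"

encoded = quote(value, safe="")  # each special character becomes %XX (hex of its byte)
decoded = unquote(encoded)       # reverse the escaping

print(encoded)
print(decoded)
```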

Version

The HTTP version in wide use is HTTP/1.1.

request header

The headers are key-value pairs. Each pair occupies one line, and a colon separates the key from the value.
Common headers:
Host : the address and port of the server host.
Content-Length : the length of the data in the body.
Content-Type : the format of the data in the request body. Common values:
    application/x-www-form-urlencoded : the format of data submitted by an HTML form.
    application/json : the data is in JSON format.
User-Agent (UA) : attributes of the browser/operating system.
Referer : the page from which the current page was reached.
Cookie : a cookie is a mechanism for the client to save data. A typical application scenario is saving the user's login information.

After deleting the local cookie, it will be judged as not logged in when visiting the website again.

Session : identifies which user is accessing the server.
The session works on the server side. The server creates a session object, which can be used to save data related to the current session. The server maintains a Map of sessions, generating a sessionId as the key and storing the session as the value.

Each session is itself a Map with custom keys and values. The server returns the sessionId to the client through Set-Cookie in the response header. The client then sends the sessionId with every request, so the server can tell which client sent the request, look up the target session in its Map, and read the data saved for that session.

Workflow : When the user logs in, the server adds a new record to its sessions and returns the sessionId/token to the client (for example, through the Set-Cookie field in the HTTP response). Every later request from the client must carry the sessionId/token (for example, in the Cookie field of the HTTP request). After the server receives a request, it uses the sessionId/token to find the corresponding user information in the session data, and then continues with the rest of the processing.
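The workflow above might be sketched as follows. Everything here is illustrative: `SessionStore`, `login` and `current_user` are hypothetical names, the store is a plain in-memory dict standing in for the server's Map, and a real server would also check a password and expire old sessions.

```python
import secrets
from http.cookies import SimpleCookie


class SessionStore:
    """Server-side Map: sessionId -> session (itself a dict of custom data)."""

    def __init__(self):
        self._sessions = {}

    def create(self, **data):
        session_id = secrets.token_hex(16)  # hard-to-guess random key
        self._sessions[session_id] = dict(data)
        return session_id

    def get(self, session_id):
        return self._sessions.get(session_id)


def login(store, username):
    """On login: add a record and return the sessionId via Set-Cookie."""
    session_id = store.create(user=username)
    return {"Set-Cookie": f"sessionId={session_id}; HttpOnly"}


def current_user(store, request_headers):
    """On later requests: read the sessionId from Cookie and look up the session."""
    cookie = SimpleCookie(request_headers.get("Cookie", ""))
    if "sessionId" not in cookie:
        return None
    session = store.get(cookie["sessionId"].value)
    return session["user"] if session else None


store = SessionStore()
response_headers = login(store, "alice")
# The client keeps only the "name=value" part and sends it back in Cookie.
cookie_pair = response_headers["Set-Cookie"].split(";")[0]
print(current_user(store, {"Cookie": cookie_pair}))
```

A request without the cookie, or with an unknown sessionId, is treated as not logged in, which matches what happens after the local cookie is deleted.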

request body

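The content of the body is defined by the programmer: it carries whatever parameters are passed to the server, in the format indicated by Content-Type.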

HTTP response

status code

Indicates whether access to a page succeeded, failed, or ended in some other way.
200 OK : The most common status code, indicating that the access succeeded.
404 Not Found : The resource was not found. The browser requests a URL in order to access a resource on the server; if the resource identified by the URL does not exist, 404 is returned.
403 Forbidden : Access is denied. Some pages require the user to have certain permissions (for example, being logged in). Accessing them without the required permission often results in 403.
500 Internal Server Error : The server hit an internal error. This status code is generally produced when the server code runs into an unexpected situation during execution (for example, an unhandled exception).
504 Gateway Timeout : A gateway or proxy did not receive a response from the upstream server in time. This often happens when the server is under heavy load and takes too long to process a request.
302 Found (Moved Temporarily) : Temporary redirection. Similar to 301, but the resource has only moved temporarily, so clients should continue to use the original URI for future requests.
301 Moved Permanently : Permanent redirection. After the browser receives this response, it automatically sends subsequent requests to the new address. 301 also uses the Location header to indicate the address to redirect to.
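The codes above fall into classes by their first digit, which the following sketch makes explicit using Python's `http.HTTPStatus`. The helper `describe` is a made-up name for this example.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return 'code phrase: class' for a standard HTTP status code."""
    status = HTTPStatus(code)  # raises ValueError for codes it does not know
    if code < 200:
        kind = "informational"
    elif code < 300:
        kind = "success"
    elif code < 400:
        kind = "redirection"
    elif code < 500:
        kind = "client error"
    else:
        kind = "server error"
    return f"{code} {status.phrase}: {kind}"


for code in (200, 301, 302, 403, 404, 500, 504):
    print(describe(code))
```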

response header

The format of the response headers is essentially the same as that of the request headers. Attributes such as Content-Type and Content-Length also have the same meaning as in the request.
Common values of Content-Type in the response are as follows:
1. text/html : the body is HTML
2. text/css : the body is CSS
3. application/javascript : the body is JavaScript
4. application/json : the body is JSON
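A client typically uses Content-Type to decide how to interpret the body. The sketch below is one possible way to do that; `decode_body` is a hypothetical helper, the fallback to UTF-8 when no charset is given is an assumption of this example, and `email.message.Message` is used only as a convenient parser for the header value.

```python
import json
from email.message import Message


def decode_body(content_type: str, body: bytes):
    """Interpret a response body according to its Content-Type header."""
    msg = Message()
    msg["Content-Type"] = content_type
    media_type = msg.get_content_type()               # e.g. "text/html"
    charset = msg.get_content_charset() or "utf-8"    # assumed default
    text = body.decode(charset)
    if media_type == "application/json":
        return json.loads(text)  # JSON becomes Python objects
    return text                  # HTML, CSS, JavaScript stay as text


print(decode_body("application/json", b'{"ok": true}'))
print(decode_body("text/html; charset=utf-8", b"<p>hello</p>"))
```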

HTTPS

The content of the HTTP protocol is transmitted in plain text, so it can be read or tampered with during transmission. HTTPS is also an application-layer protocol. It adds an encryption layer on top of the HTTP protocol.

HTTPS execution process

The client uses HTTPS to access the server.
The server returns a digital certificate, which carries the public key of the server's asymmetric key pair (the private key is kept by the server itself).
The client verifies whether the digital certificate is valid. If it is invalid, the access is terminated. If it is valid, the client:
① generates a shared secret key for symmetric encryption;
② encrypts the data with the shared secret key;
③ encrypts the shared secret key with the server's public key;
④ sends the encrypted secret key and the encrypted data to the server.
The server uses its private key to decrypt the client's shared secret key, and then uses the shared secret key to decrypt the data. After that, the client and the server exchange content encrypted with the shared secret key.
Since symmetric encryption is much more efficient than asymmetric encryption, asymmetric encryption is used only for negotiating the key at the start, and symmetric encryption is used for all subsequent transmission.
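The steps above can be imitated with a toy model. This is emphatically not how real TLS is implemented and it is NOT secure: it uses textbook RSA with two small, publicly known primes and a home-made XOR keystream as the "symmetric cipher", purely so that the flow (wrap a random shared key with the public key, encrypt the data with the shared key) fits in a few lines. All names (`xor_stream`, `client_send`, `server_receive`) are invented for this sketch, and certificate verification is left out.

```python
import hashlib
import secrets

# Toy asymmetric key pair (textbook RSA, insecure: tiny well-known primes).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (Python 3.8+)
PUBLIC_KEY, PRIVATE_KEY = (N, E), (N, D)


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a keystream derived from the key.
    The same call both encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def client_send(message: bytes, public_key):
    n, e = public_key
    shared_key = secrets.token_bytes(8)                    # ① generate shared key
    ciphertext = xor_stream(shared_key, message)           # ② encrypt the data
    wrapped = pow(int.from_bytes(shared_key, "big"), e, n) # ③ encrypt the key
    return wrapped, ciphertext                             # ④ send both


def server_receive(wrapped: int, ciphertext: bytes, private_key):
    n, d = private_key
    shared_key = pow(wrapped, d, n).to_bytes(8, "big")  # recover the shared key
    return xor_stream(shared_key, ciphertext)           # decrypt the data


wrapped, ciphertext = client_send(b"GET /account HTTP/1.1", PUBLIC_KEY)
print(server_receive(wrapped, ciphertext, PRIVATE_KEY))
```

Only the holder of the private key can unwrap the shared key, while the bulk of the data is handled by the much cheaper symmetric step.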

encryption

Encryption performs a series of transformations on plaintext (the information to be transmitted) to generate ciphertext. What travels over the network is then no longer the plaintext but the encrypted ciphertext.
Decryption performs a series of transformations on the ciphertext to restore the plaintext.
Encryption and decryption usually need one or more pieces of auxiliary data to drive the process; this data is called a key.
There are many encryption methods, but they fall into two broad categories: symmetric encryption and asymmetric encryption.

Symmetric encryption

Symmetric encryption uses the same key both to encrypt plaintext into ciphertext and to decrypt ciphertext back into plaintext. The sender and the receiver must share this secret key in order to communicate. Symmetric algorithms provide confidentiality with very good performance.

But a server actually serves many clients at the same time, and each client must use a different secret key (if they all used the same one, the key would spread too easily and hackers could get it as well). The server would therefore have to maintain the association between every client and its key, which is troublesome.

Ideally, the client and the server negotiate the key for the session when they establish the connection. But if the key is transmitted in plain text, a hacker can obtain it, and all subsequent encryption becomes useless. So the transmission of the key must itself be encrypted.

If the key is to be encrypted symmetrically, however, the two sides first have to agree on a "key for the key". This turns into a chicken-and-egg problem: symmetric encryption cannot protect the transmission of its own key, so asymmetric encryption needs to be introduced.

asymmetric encryption

Asymmetric encryption uses two keys: a "public key" and a "private key", which form a pair. The sender encrypts with the receiver's public key, and the receiver decrypts with its own private key. Its biggest disadvantage is speed: it is much slower than symmetric encryption.

Then the next question comes again: how does the client obtain the public key? How does the client determine that the public key is not forged by hackers?

Certificate

When the client and the server first establish a connection, the server returns a certificate to the client. This certificate contains the public key mentioned above, as well as the identity information of the website.
The certificate is like a person's ID card: it serves as the identity of the website. To build an HTTPS website, you need to apply for a certificate from a certificate authority (CA), much like applying for an ID card at the Public Security Bureau.
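On the client side, certificate checking is normally handled by the TLS library. As a small illustration, Python's `ssl.create_default_context()` returns a context that requires a valid certificate and checks that it matches the host name; the snippet only inspects those settings and does not open a connection.

```python
import ssl

# Default client-side settings: verify the server's certificate against the
# system's trusted CAs and check that it was issued for the requested host.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)  # certificate must be valid
print(context.check_hostname)                    # certificate must match the host

# To use it, the context would wrap a TCP socket, for example:
#   context.wrap_socket(sock, server_hostname="www.example.jp")
# (the host name here is only a placeholder).
```

If the certificate cannot be verified, the handshake fails and the access is terminated, which corresponds to the "invalid certificate" branch described earlier.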


Keep going~


Origin blog.csdn.net/qq_43243800/article/details/131581703