【Application layer】A detailed introduction to the HTTPS protocol


Foreword

HTTPS is also an application layer protocol. It adds an encryption layer on top of HTTP. Because HTTP transmits its content in plain text, that content can be read and tampered with in transit. Below we describe in detail how HTTPS solves these problems.


1. What is "encryption"

Encryption is to perform a series of transformations on plaintext (information to be transmitted) to generate ciphertext.

Decryption is to perform a series of transformations on the ciphertext and restore it to plaintext.

Encryption and decryption usually need one or more pieces of auxiliary data to drive the transformation. This data is called a key.

Let's take an example:

Suppose our client wants to send a variable a = 100 to the server and does not want anyone else to learn its value. The client XORs a bitwise with a key to produce a variable c and sends c to the server. The server then XORs c with the same key to recover a. Only the client knows at the start that a is 100; the only value that travels over the network is c.

The client XORing a with the key is encryption, and the server XORing c with the key to recover a is decryption.
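The XOR example can be sketched in a few lines of Python. This is only a toy: the key value below is an arbitrary assumption, and real ciphers are far stronger than a single XOR.

```python
# Toy XOR "encryption" of one integer, mirroring the a = 100 example.

def xor_transform(value: int, key: int) -> int:
    """XOR is its own inverse, so the same function encrypts and decrypts."""
    return value ^ key

key = 0x5A3C   # hypothetical key shared by client and server
a = 100        # plaintext known only to the client

c = xor_transform(a, key)           # client side: encrypt
recovered = xor_transform(c, key)   # server side: decrypt

print(c, recovered)
```

Anyone who sees only c on the network learns nothing useful without the key, while the server gets a back by applying the same operation.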

Let's take a very common example:

Suppose we want to download a game. We search for it in the browser and start the download, only to find that what we actually downloaded is an app store. This happens because, when the browser sends a request to the game server, the server's response contains the game's download path, but the operator hijacks the response, replaces the game's path with the app store's path, and passes the modified response on to the user.

Every packet we send over the network passes through the operator's network equipment (routers, switches, etc.), so that equipment can parse the data we transmit and tamper with it. Clicking the download button actually sends an HTTP request to the server, and the HTTP response contains the app's download link. Once the operator hijacks the response, it swaps in the download address of a different application before the response reaches the user.

In short, HTTP content is transmitted in plain text, and that plain text passes through many intermediate nodes such as routers, Wi-Fi hotspots, communication service operators, and proxy servers. If the traffic is hijacked anywhere along the way, the transmitted content is completely exposed, and the hijacker can also tamper with it without either party noticing. This is a man-in-the-middle attack, and it is why we need to encrypt the information.

2. Common encryption methods

Symmetric encryption: this is the single-key cryptosystem, in which the same secret key is used both to encrypt and to decrypt the information. It is also known as single-key encryption.

Features: the algorithm is public, the amount of computation is small, and encryption is fast and efficient.

The XOR example above is in fact symmetric encryption: both sides complete encryption and decryption with the same key.

Asymmetric encryption: encryption and decryption use two different keys, a public key and a private key.

Features: the algorithms are complex, and security depends on both the algorithm and the keys. Because of that complexity, encryption and decryption are much slower than with symmetric encryption.

The public key and the private key form a pair: data encrypted with the public key can only be decrypted with the private key, and data encrypted with the private key can only be decrypted with the public key. The public key can be given to anyone, while the private key is kept by its owner. The biggest disadvantage is speed, which is far lower than that of symmetric encryption.
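To make the idea of a key pair concrete, here is a minimal sketch using textbook RSA with tiny primes. The article does not name a specific algorithm, so RSA and every number here are assumptions chosen for illustration; keys this small offer no security at all.

```python
# Textbook RSA with tiny primes: illustration only, never use for real data.
# Requires Python 3.8+ for pow(e, -1, phi).

p, q = 61, 53                  # assumed toy primes
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse of e)

public_key = (e, n)
private_key = (d, n)

def apply_key(message: int, key: tuple) -> int:
    """Transform an integer smaller than n with one half of the key pair."""
    exponent, modulus = key
    return pow(message, exponent, modulus)

m = 65
cipher = apply_key(m, public_key)       # anyone holding the public key can encrypt
plain = apply_key(cipher, private_key)  # only the private key undoes it

signed = apply_key(m, private_key)      # the reverse direction also works
checked = apply_key(signed, public_key)

print(cipher, plain, checked)
```

The two directions shown at the end are exactly what the later sections rely on: public-key encryption protects the key exchange, and private-key encryption is the basis of the digital signature.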

3. Data digest (data fingerprint)

Data fingerprint (data digest): the basic principle is to run the information through a one-way hash function to produce a fixed-length digest. A digital fingerprint is not an encryption mechanism, but it can be used to judge whether data has been tampered with.

Digest features: unlike an encryption algorithm, a digest is not encryption in the strict sense because there is no decryption step. It is, however, very hard to deduce the original information from the digest, so digests are usually used for data comparison.
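A short sketch of a data digest with Python's standard hashlib module. SHA-256 is an assumed choice of hash function (the article does not name one), and the example.com links are made-up sample data.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

original = b"download link: https://example.com/game.apk"    # hypothetical content
tampered = b"download link: https://example.com/other.apk"   # one field changed

d1 = digest(original)
d2 = digest(tampered)

print(len(d1), len(d2))        # the digest length is fixed regardless of input
print(d1 == digest(original))  # same input, same digest
print(d1 == d2)                # a small change gives a different digest
```

Comparing digests is enough to notice that the content changed, even though the digest itself cannot be turned back into the original data.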

With these three sets of concepts in hand, let's explore how HTTPS encrypts HTTP traffic:

1. Can using only symmetric encryption solve the problem?

Before the client sends data to the server, it generates a secret key and encrypts the data with it. The server, however, does not have the key the client generated, so to decrypt the data it needs the client to send the key over. But HTTP is plain text, so the key the client sends can be hijacked by a man in the middle (a hacker). Symmetric encryption alone therefore cannot solve the problem.
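The weakness can be simulated in a few lines. This is a sketch under simple assumptions: the list named wire stands in for the network, a repeating-key XOR stands in for a real symmetric cipher, and the key and message are made up.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR, a stand-in for a real symmetric cipher."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

wire = []   # everything placed here is visible to the man in the middle

# Client side: the key has to reach the server somehow, so it goes out in the clear.
key = b"k3y!"                                       # hypothetical symmetric key
wire.append(key)
wire.append(xor_bytes(b"password=hunter2", key))    # the "protected" message

# Man in the middle: reads both items off the wire and decrypts.
stolen_key, stolen_cipher = wire
leaked = xor_bytes(stolen_cipher, stolen_key)

print(leaked)
```

Encrypting the data does not help when the key travels the same unprotected path as the ciphertext.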

2. Using only asymmetric encryption

First we assume that only the server has public and private keys:

When the client starts the key negotiation handshake, the server sends its public key to the client, and the hacker can capture that public key as well. The client then encrypts its data with the public key and sends it to the server. This direction is safe: only the server holds the private key, so only the server can decrypt the data. The problem is the server's response. If the server encrypts it with the public key, it can only be decrypted with the private key, which the client does not have. If the server encrypts it with the private key, the hacker, who also holds the public key, can decrypt and read it.

Now assume that both the client and the server have their own key pairs:

Both the client and the server generate their own public key and private key. During the first handshake, the client sends its public key to the server, and the server sends its own public key back in the response. Once the public keys have been exchanged, the client encrypts its data with the server's public key and the server decrypts it with its own private key. The server then encrypts the response with the client's public key and the client decrypts it with its own private key. This looks fine, but asymmetric encryption is very slow, and the scheme also has a hidden security problem, which we discuss after the next scheme.

3. To solve the efficiency problem, combine symmetric encryption + asymmetric encryption

First, the server has its own public key and private key. The first time the client sends a request, the server gives the client its public key. The client generates a symmetric key C by itself, encrypts C with the server's public key to get x, and sends x to the server. Only the server holds the private key, so only the server can decrypt x to recover C. Now both the client and the server have the symmetric key C, and all later communication is encrypted with it. Symmetric encryption is fast, so the speed problem is solved.
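The exchange of x and C can be sketched as follows. The assumptions are loud ones: textbook RSA built from two small Mersenne primes plays the role of the asymmetric cipher, repeating-key XOR plays the role of the symmetric cipher, and the request text is made up. None of this is secure; it only shows the order of the steps.

```python
import secrets

# Server side: toy RSA key pair (Python 3.8+ for the modular inverse).
p, q = 2**31 - 1, 2**61 - 1     # known Mersenne primes, far too small for real use
n = p * q
e = 65537                       # public exponent; (e, n) is the public key
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent, never leaves the server

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR, a stand-in for a fast symmetric cipher."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# 1. The server sends its public key (e, n) to the client.
# 2. The client generates the symmetric key C and encrypts it to get x.
C = secrets.randbits(64)
x = pow(C, e, n)

# 3. The server decrypts x with its private key and recovers C.
server_C = pow(x, d, n)

# 4. Both sides now use C for symmetric encryption of the actual traffic.
request = xor_bytes(b"GET /index.html", C.to_bytes(8, "big"))
received = xor_bytes(request, server_C.to_bytes(8, "big"))

print(server_C == C, received)
```

The slow asymmetric operation runs only once per connection, to move C; every later message uses the fast symmetric cipher.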

All of the schemes above overlook one problem: if the man in the middle is attacking from the very beginning, the public key exchange itself cannot be completed correctly, as shown in the following figure:

First, the server has its own public key P and private key S. The client sends its first request, and the server's reply carrying P is hijacked by a hacker who has prepared his own public key M and private key D. The hacker saves the server's public key P and sends his own public key M to the client instead. The client generates a symmetric key C, encrypts it with M (believing it to be the server's key) to get X, and sends X toward the server. The hacker hijacks X as well, decrypts it with his private key D to obtain C, then re-encrypts C with the server's real public key P and forwards it to the server. The server decrypts it with its private key S, obtains C, and starts using C to communicate with the client. Neither the server nor the client knows that the hacker now holds the symmetric key C.
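The attack can be replayed step by step with toy numbers. The names P, S, M, D, C and X follow the text; textbook RSA with tiny primes and the value chosen for C are assumptions for illustration only.

```python
def make_keypair(p: int, q: int, e: int = 17):
    """Return ((e, n), (d, n)) for textbook RSA built from the primes p and q."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # Python 3.8+
    return (e, n), (d, n)

def apply_key(value: int, key: tuple) -> int:
    """Encrypt or decrypt an integer with one half of a key pair."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

P, S = make_keypair(61, 53)   # server: public key P, private key S
M, D = make_keypair(67, 71)   # hacker: public key M, private key D

# Step 1: the server sends P, but the hacker swaps it for M in transit.
key_seen_by_client = M

# Step 2: the client encrypts its symmetric key C with the key it received.
C = 1234
X = apply_key(C, key_seen_by_client)

# Step 3: the hacker decrypts X with D, then re-encrypts C with the real P.
stolen_C = apply_key(X, D)
forwarded = apply_key(stolen_C, P)

# Step 4: the server decrypts with S and sees nothing unusual.
server_C = apply_key(forwarded, S)

print(stolen_C, server_C)
```

Both ends finish the handshake successfully, which is exactly why the attack goes unnoticed: the client had no way to check that the key it received really belonged to the server.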

So how do we solve this problem? We only need to stop the hacker from replacing the public key at the very beginning. The essential problem is that the client cannot tell whether the public key it receives is legitimate, so once the client can verify the legitimacy of the public key, the problem is solved.

Before solving this problem, let's introduce some concepts:

4. Certificate

CA certification: before using HTTPS, the server must apply to a CA (certificate authority) for a digital certificate. The certificate contains information about the applicant, the public key, and so on. The server sends the certificate to the browser, and the browser obtains the public key from the certificate. The certificate works like an ID card: it vouches for the authenticity of the server's public key.

The certificate includes a digital signature, which we discuss in detail below, along with company information, the domain name, and so on. Most importantly, the server's public key is in the certificate.

Let's focus on how the digital signature is formed:

First, the company submits its registration information and its public key. The CA hashes this data into a hash value (also called a data digest) and then encrypts the hash value with the CA's own private key. The result is the digital signature. Note that only the CA holds this private key. With the signature in place, checking a certificate's authenticity takes three steps: hash the contents of the certificate to get a data digest, decrypt the digital signature with the CA's public key, and compare the two values. If they are equal, the certificate is genuine and has not been tampered with; otherwise it has been tampered with.
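Signing and verifying can be sketched like this. It is a simplified model, not how real certificates are encoded: the CA key is textbook RSA built from two small Mersenne primes, the digest is SHA-256 reduced modulo the toy modulus, and the certificate body is a made-up byte string.

```python
import hashlib

# Toy CA key pair (Python 3.8+ for the modular inverse).
p, q = 2**31 - 1, 2**61 - 1
ca_n = p * q
ca_e = 65537                                # CA public exponent, known to browsers
ca_d = pow(ca_e, -1, (p - 1) * (q - 1))     # CA private exponent, held only by the CA

def digest_int(data: bytes) -> int:
    """SHA-256 digest reduced modulo ca_n so that textbook RSA can process it."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ca_n

def ca_sign(cert_body: bytes) -> int:
    """CA side: encrypt the digest of the certificate body with the private key."""
    return pow(digest_int(cert_body), ca_d, ca_n)

def verify(cert_body: bytes, signature: int) -> bool:
    """Browser side: decrypt the signature with the CA public key and compare digests."""
    return pow(signature, ca_e, ca_n) == digest_int(cert_body)

cert_body = b"domain=shop.example;public_key=SERVER-PUBLIC-KEY"   # hypothetical contents
signature = ca_sign(cert_body)

genuine_ok = verify(cert_body, signature)
tampered_ok = verify(cert_body.replace(b"SERVER", b"HACKER"), signature)

print(genuine_ok, tampered_ok)
```

Verification needs only the CA's public values, so any browser can run it, while producing a signature that passes needs the private exponent that never leaves the CA.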

The process of applying for a certificate is as follows:

The anti-tampering process is as follows:

NOTE: It is very important that the CAs' public keys are built into all browsers. This is how a browser can tell which websites are trusted and which are not.

Let's analyze why introducing the CA solves the problem:

1. Can a hacker modify the public key?

Obviously not. Suppose the hacker hijacks the certificate the server sends to the client and modifies the public key in it. (Note that the public key in the certificate is the one generated by the server itself, which is different from the CA's public key used to check the signature.) When the CA issued the certificate, it computed a digest over the public key and registration information submitted by the company and encrypted that digest with the CA's own private key. The client browser decrypts the digital signature with the built-in CA public key and compares the result with a digest it computes from the certificate's contents. If the two differ, the certificate has been tampered with.

2. Can the hacker modify the public key and the digital signature at the same time?

No, because producing a matching digital signature requires the CA's private key.

3. Can the hacker swap out the entire certificate?

The hacker cannot create a fake certificate without the CA's private key, so the only option is to apply to a CA for a real one. But a certificate contains the domain name and other information, and every server's domain name is unique. Nobody can obtain a certificate for a domain name that belongs to another company, so a certificate issued to the hacker will not match the domain the client is visiting.
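The three scenarios can be checked against a small model of the browser's test. Everything here is an assumption made for illustration: certificates are plain dictionaries, both key pairs are textbook RSA built from small Mersenne primes, and the domain names are invented.

```python
import hashlib
import json

def make_keypair(p: int, q: int, e: int = 65537):
    """Return ((e, n), (d, n)) for textbook RSA (Python 3.8+)."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)

CA_PUBLIC, CA_PRIVATE = make_keypair(2**31 - 1, 2**61 - 1)       # CA key pair
HACKER_PUBLIC, HACKER_PRIVATE = make_keypair(2**61 - 1, 2**89 - 1)  # hacker's own pair

def digest_int(fields: dict, modulus: int) -> int:
    """Digest of the certificate fields, reduced so the toy RSA can handle it."""
    body = json.dumps(fields, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(body).digest(), "big") % modulus

def sign(fields: dict, private_key: tuple) -> int:
    d, n = private_key
    return pow(digest_int(fields, n), d, n)

def browser_accepts(fields: dict, signature: int, visited_domain: str) -> bool:
    """Check the signature with the built-in CA public key, then the domain."""
    e, n = CA_PUBLIC
    signature_ok = pow(signature, e, n) == digest_int(fields, n)
    return signature_ok and fields["domain"] == visited_domain

genuine = {"domain": "shop.example", "public_key": "SERVER-PUBLIC-KEY"}
genuine_sig = sign(genuine, CA_PRIVATE)

# Scenario 1: only the public key is changed, the old signature is kept.
tampered = dict(genuine, public_key="HACKER-PUBLIC-KEY")
case1 = browser_accepts(tampered, genuine_sig, "shop.example")

# Scenario 2: the hacker also re-signs, but only has his own private key.
case2 = browser_accepts(tampered, sign(tampered, HACKER_PRIVATE), "shop.example")

# Scenario 3: the hacker presents a real certificate issued for his own domain.
hacker_cert = {"domain": "evil.example", "public_key": "HACKER-PUBLIC-KEY"}
case3 = browser_accepts(hacker_cert, sign(hacker_cert, CA_PRIVATE), "shop.example")

print(browser_accepts(genuine, genuine_sig, "shop.example"), case1, case2, case3)
```

Only the untouched certificate passes. The first two attempts fail the signature comparison, and the third passes the signature comparison but fails the domain check.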

With all of this in place, we arrive at the final scheme: asymmetric encryption + symmetric encryption + certificate authentication

First, the server has its own public key and private key. The first time the client sends a request, the server gives the client its certificate. After verifying the certificate, the client generates a symmetric key C by itself, encrypts C with the public key in the certificate to get x, and sends x to the server. Only the server holds the private key, so only the server can decrypt x to recover C. Now both the client and the server have the symmetric key C, and all later communication is encrypted with it. The certificate closes the hole where a man in the middle replaces the public key from the very beginning. This is the encryption principle of HTTPS.
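In practice an HTTPS client library performs the certificate check for us. As a hedged illustration, Python's standard ssl module builds a client context that requires a valid certificate and checks the host name by default. The helper fetch_certificate below is a hypothetical name introduced for this sketch, and it is only defined here, not called, because calling it needs network access.

```python
import socket
import ssl

# A default client context loads the system's trusted CA certificates.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)   # the certificate must verify
print(context.check_hostname)                     # the domain name must match

def fetch_certificate(host: str, port: int = 443) -> dict:
    """Connect over TLS and return the server certificate as parsed by ssl.

    The handshake raises ssl.SSLCertVerificationError if the certificate
    chain or the host name does not check out.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Calling fetch_certificate with a real host name returns fields such as the subject, the issuer and the validity dates, which correspond to the applicant information and CA described above.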

Of course, we can also view the relevant certificates in our browser:

Many of the issuers listed there, other than the CAs themselves, are subordinate organizations of a CA.

It can be seen that a key, whether public or private, is simply a long string of characters that is effectively unique.


Summary

The most essential difference between HTTPS and HTTP is that HTTPS adds an encryption layer on top of HTTP. Data that HTTP would send in plain text is encrypted under HTTPS, which greatly improves security.

Origin blog.csdn.net/Sxy_wspsby/article/details/131528723