【Computer Network】HTTPS protocol

Table of contents

Concept preparation

What is encryption

Why encryption is needed

Common encryption methods

Symmetric encryption

Asymmetric encryption

Data digest (digital fingerprint)

Digital signature

The working process of HTTPS

Option 1: Only use symmetric encryption

Option 2: Only use asymmetric encryption

Option 3: Both parties use asymmetric encryption

Option 4: Asymmetric encryption + symmetric encryption

Man-in-the-middle attack

CA certification

Understanding certificates

Digital signature

Client authentication

Common questions

Is it possible for a middleman to tamper with the certificate?

Why must the digest be encrypted to form a signature when transmitted over the network?

The complete HTTPS process


       

In the previous chapter, we explained the HTTP protocol. HTTP transmits data in clear text, which is very unsafe. To solve this security problem, a new protocol is used: HTTPS.

Concept preparation

For security, we need to encrypt data so that others cannot hijack it and leak private information. So what is encryption?

What is encryption

Encryption performs a series of transformations on plain text (the information to be transmitted) to produce cipher text.

Decryption performs a series of transformations on the cipher text to restore the plain text.

Encryption and decryption usually need one or more pieces of auxiliary data to drive the process. Such data is called a key.

Why encryption is needed

Carrier hijacking

You may have run into this situation: you download a piece of software from the Internet, and when the download finishes it is not the software you wanted at all. This happens because the response to the request was hijacked and replaced with another link.

Every data packet we send over the network passes through the carrier's network equipment (routers, switches, etc.), so that equipment can parse the content of the data and tamper with it.

Clicking the "Download" button actually sends an HTTP request to the server, and the HTTP response contains the download link of the app. When the carrier hijacks the traffic and sees that the request is for software such as Tiantian Dongting, it automatically tampers with the response given to the user, for example replacing the link with the download address of "QQ Browser".

        The diagram is as follows:

Therefore: because HTTP content is transmitted in clear text, the data passes in the clear through many physical nodes such as routers, Wi-Fi hotspots, communication service carriers, and proxy servers. If the information is hijacked in transit, the transmitted content is completely exposed. The hijacker can also tamper with the transmitted information without either party noticing. This is a man-in-the-middle attack, so we need to encrypt the information.

Of course, carriers are not the only ones who can hijack user requests. Hackers can use similar means to hijack traffic and then read or tamper with the data. If the data we submit is a payment password, the consequences are very serious, so HTTP requests need to be encrypted to protect the user's privacy.

Common Encryption Methods

Symmetric encryption

Symmetric encryption uses a single-key cryptosystem: the same key both encrypts and decrypts the information. It is also called single-key encryption. Its defining feature is that encryption and decryption use the same key.

• Common symmetric encryption algorithms (for reference): DES, 3DES (also called TDEA), AES, Blowfish, RC2, etc.

• Features: public algorithms, low computational cost, fast encryption, high efficiency. Symmetric encryption turns plain text into cipher text with a key, and the same key turns the cipher text back into plain text.


Take the simplest example, XOR, and assume the key is key = 8888.

Suppose the plain text a = 1234 needs to be sent to the other party. The sender encrypts it as a ^ key = 9834, so the other party receives 9834 and decrypts it as 9834 ^ key = 1234, which recovers the data that was sent. (XOR-ing a value with the same key twice gives back the original value.)

Of course, bitwise XOR is only the simplest form of symmetric encryption. HTTPS does not use plain bitwise XOR.
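The XOR example above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the text; the function names are made up for illustration.

```python
KEY = 8888  # the shared key from the example above


def xor_encrypt(plain: int, key: int = KEY) -> int:
    """Encrypt by XOR-ing the plain text with the key."""
    return plain ^ key


def xor_decrypt(cipher: int, key: int = KEY) -> int:
    """Decrypt by XOR-ing the cipher text with the same key again."""
    return cipher ^ key


cipher = xor_encrypt(1234)
print(cipher)               # 9834
print(xor_decrypt(cipher))  # 1234
```

Both directions are the same operation, which is exactly what makes the scheme symmetric.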

Asymmetric encryption

Asymmetric encryption needs two keys for encryption and decryption: a public key and a private key. The public key and the private key form a pair.

• Common asymmetric algorithms (for reference): RSA, DSA, ECDSA (DSA and ECDSA are signature algorithms; of the three, only RSA is also used to encrypt data)

• Features: security depends on the algorithm and the key. Because the algorithms are complex, encryption and decryption are very slow, much slower than symmetric encryption. This is the biggest disadvantage.

• Encrypt plain text with the public key to turn it into cipher text

• Decrypt the cipher text with the private key to turn it back into plain text

Suppose we encrypt a piece of data with the public key and send it to the other party. The other party decrypts it with the private key. Only the recipient knows this private key, which keeps the data secure.

Example: A wants to give B some important documents, but B may not be there. So A and B agree in advance. B says: "There is a box on my desk, and I will give you a lock. Put the documents in the box and lock it. When I come back, I will use my key to open the lock and take the documents." In this scenario the lock is the public key and the key is the private key. The public key can be given to anyone (leaking it does no harm), but the private key is held only by B. Only the holder of the private key can decrypt.
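To make the lock-and-key picture concrete, here is a minimal sketch of textbook RSA with tiny primes. The primes, the exponent, and the function names are assumptions chosen for illustration; real keys are 2048 bits or more and use padding, so this is not how a production library works.

```python
# Toy RSA with tiny textbook primes (illustration only).
p, q = 61, 53
n = p * q                     # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent (needs Python 3.8+)

public_key = (n, e)           # the "lock": can be handed to anyone
private_key = (n, d)          # the "key": stays with the recipient


def encrypt(message: int, key: tuple) -> int:
    """Encrypt a number smaller than n with the public key."""
    modulus, exponent = key
    return pow(message, exponent, modulus)


def decrypt(cipher: int, key: tuple) -> int:
    """Decrypt with the matching private key."""
    modulus, exponent = key
    return pow(cipher, exponent, modulus)


cipher = encrypt(65, public_key)
print(decrypt(cipher, private_key))   # 65
```

Every step is modular exponentiation on large numbers, which hints at why asymmetric encryption is so much slower than a simple symmetric operation.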

Data digest (digital fingerprint)

• A data digest (digital fingerprint) is produced by applying a one-way hash function to the information, which yields a fixed-length digest. A digital fingerprint is not an encryption mechanism, but it can be used to determine whether data has been tampered with.

• Common digest algorithms: MD5, SHA1, SHA256, SHA512, etc. These algorithms map an unbounded input space to a finite output space, so collisions are possible (two different inputs produce the same digest), but the probability is very low.

• Digest characteristics: unlike an encryption algorithm, a digest is not encryption in the strict sense because there is no decryption, and it is hard to infer the original information from the digest. In other words, a data digest is irreversible. It is usually used for data comparison.
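The properties listed above can be observed with Python's standard hashlib module. This is a small sketch; SHA256 is picked arbitrarily from the algorithms named above, and the helper name is made up.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Fixed length: a 5-byte input and a 500,000-byte input give equally long digests.
print(len(digest(b"hello")), len(digest(b"hello" * 100_000)))   # 64 64

# Same input, same digest; a one-character change gives a different digest.
print(digest(b"hello") == digest(b"hello"))   # True
print(digest(b"hello") == digest(b"hella"))   # False
```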


For example, when you transfer files with Baidu Netdisk, even very large data, such as dozens of gigabytes, can often be completed in one or two seconds. How is this done?

When the first person uploads a resource, a hash function generates a data digest for it, and the digest is stored on the Baidu Netdisk server. When another person uploads or transfers the same resource, a digest is first generated locally and compared with all the digests stored in the network disk. If a match exists, a link file pointing to the existing resource is created directly, so the transfer is very fast. If nobody has ever uploaded the file before, the upload may be very slow. This is one application of data digests.
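The instant-transfer idea can be sketched as a store keyed by digest. The class below is hypothetical and only models the lookup logic described above; it says nothing about how Baidu Netdisk is actually implemented.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory store that keeps each distinct file only once."""

    def __init__(self):
        self.blobs = {}   # digest -> file content
        self.links = {}   # (user, filename) -> digest

    def upload(self, user: str, filename: str, content: bytes) -> str:
        fingerprint = hashlib.sha256(content).hexdigest()
        if fingerprint in self.blobs:
            status = "instant"          # digest already known: only create a link
        else:
            self.blobs[fingerprint] = content
            status = "full upload"      # first copy: the data itself must be stored
        self.links[(user, filename)] = fingerprint
        return status


store = DedupStore()
movie = b"some very large file" * 1000
print(store.upload("alice", "movie.mkv", movie))   # full upload
print(store.upload("bob", "film.mkv", movie))      # instant
```

In a real client the digest would be computed locally and sent first, so that the file body never has to travel when a match exists.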

Digital signature

Encrypting a digest produces a digital signature. What is it used for? This is explained in detail later in this article.

The working process of HTTPS

Since data security must be ensured, encryption is required. Plain text is no longer transmitted directly over the network; the encrypted cipher text is transmitted instead. There are many encryption methods, but they fall into two broad categories: symmetric encryption and asymmetric encryption.

Option 1: Only use symmetric encryption

If both parties to the communication hold the same key X and nobody else knows it, the communication between them is of course secure (unless the key is cracked).

With symmetric encryption in place, even if the data is intercepted, the hacker does not know the key, so he cannot decrypt the data and does not learn the real content of the request.

But things are not that simple. The server serves many clients at the same time, and every client must use a different key (if they all used the same key, it would spread too easily and a hacker could get it too). The server therefore has to maintain the association between each client and its key, which is troublesome.

One way out is for the two parties to agree on the key when they establish the connection. However, if the key is transmitted in plain text, a hacker can obtain it, and all subsequent encryption is in vain. Therefore the transmission of the key must also be encrypted.

But encrypting the key symmetrically requires first agreeing on a "key for the key", which is a chicken-and-egg problem. Symmetric encryption alone therefore cannot solve key transmission.


Option 2: Only use asymmetric encryption

With asymmetric encryption, the server first sends its public key to the client in clear text, and the client encrypts data with this public key before sending it to the server. The channel from the client to the server then appears safe, because only the server has the corresponding private key and can decrypt data encrypted with the public key.

But how is security ensured on the path from the server to the browser?

If the server encrypts the data with its private key and sends it to the client, the client can decrypt it with the public key. But this public key was originally sent to the browser in clear text. If a middleman captured it, he can also use it to decrypt the information sent by the server.

If the server instead encrypts the data with the public key, the client cannot decrypt it, because the client does not know the private key.

So this method is not feasible either.


Option 3: Both parties use asymmetric encryption

1. The server has the public key S and the corresponding private key S', and the client has the public key C and the corresponding private key C'
2. The client and the server exchange public keys
3. The client sends information to the server: it first encrypts the data with S and then sends it. Only the server can decrypt it, because only the server has the private key S'
4. The server sends information to the client: it first encrypts the data with C and then sends it. Only the client can decrypt it, because only the client has the private key C'

This seems to work, but:
• The efficiency is too low (asymmetric encryption is already very slow, and with both parties using it the speed is even lower)
• There are still security problems

  • In practice, key exchange with asymmetric encryption on both sides is exposed to security issues such as man-in-the-middle (MITM) attacks.
  • In a man-in-the-middle attack, the attacker pretends to be the server when talking to the client and establishes an asymmetric encrypted connection with each party. The attacker exchanges public keys with both the client and the server and replaces each party's key with a key he holds. As a result, the attacker can obtain, view, and modify the communication between the two parties. (This is discussed later.)

Option 4: Asymmetric encryption + symmetric encryption

This option first solves the efficiency problem.

  1. The server has an asymmetric public key S and a private key S'
  2. The client initiates an HTTPS request and obtains the server public key S
  3. The client generates the symmetric key C locally, encrypts it with the public key S, and sends it to the server.
  4. Since the intermediate network devices do not have the private key, even if the data is intercepted, the original plain text cannot be restored and the symmetric key cannot be obtained.
  5. The server decrypts with the private key S' and recovers the symmetric key C sent by the client. It then uses this symmetric key to encrypt the response data returned to the client.
  6. Subsequent communication between the client and the server uses only symmetric encryption. Since only the client and the server know the key, other hosts and devices cannot read the data even if they intercept it.
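The six steps above can be sketched end to end in Python. Everything here is a toy stand-in chosen for illustration: the RSA key is built from two small known primes, and `symmetric_xor` is a made-up keystream cipher standing in for a real symmetric algorithm such as AES. It shows the flow, not the real TLS handshake.

```python
import hashlib
import secrets

# Toy RSA key pair for the server: public key S = (N, E), private key S' = D.
# The primes are two known Mersenne primes, far too small for real use.
P, Q = 2**61 - 1, 2**31 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # needs Python 3.8+


def symmetric_xor(key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a keystream derived from the key.

    Applying it twice with the same key restores the original data.
    """
    seed = key.to_bytes(12, "big")
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(data, stream))


# Step 3: the client generates the symmetric key C and encrypts it with S.
client_key = secrets.randbelow(N)
wrapped_key = pow(client_key, E, N)

# Step 5: the server recovers C with its private key S'.
server_key = pow(wrapped_key, D, N)

# Step 6: both sides now share C and switch to symmetric encryption.
request = symmetric_xor(client_key, b"GET /balance")
response = symmetric_xor(server_key, b"balance=100")
print(symmetric_xor(server_key, request))    # b'GET /balance'
print(symmetric_xor(client_key, response))   # b'balance=100'
```

The slow asymmetric operation runs once, on a short key, and all bulk traffic goes through the fast symmetric cipher.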

Man-in-the-middle attack

In option 4, after the client obtains the public key S, it uses S to encrypt the symmetric key X that it generated. Even if a middleman steals this data, he indeed cannot recover the key X, because only the server has the private key S'. But things change if the attack starts at the very beginning of the exchange. Assume the hacker has already become the middleman:

1. The server has the public key S and the private key S' of the asymmetric encryption algorithm.
2. The middleman has the public key M and the private key M' of the asymmetric encryption algorithm.
3. The client initiates a request to the server, and the server sends the public key S to the client in plain text.
4. The middleman hijacks the message, extracts the public key S and saves it, replaces S in the hijacked message with its own public key M, and sends the forged message to the client.
5. The client receives the message and extracts the public key M (it does not know that the public key has been replaced), generates the symmetric key X, encrypts X with the public key M, and sends the message to the server.
6. The middleman hijacks this message and decrypts it with its own private key M' to obtain the communication key X. It then encrypts X with the server public key S it saved earlier and forwards the message to the server.
7. The server receives the message, decrypts it with its own private key S', and obtains the communication key X.
8. The two parties begin to communicate using symmetric encryption with X. But everything is under the control of the middleman, who can hijack the data, eavesdrop on it, or even modify it.
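The eight steps can be replayed in a short simulation. The key pairs below are toy RSA keys built from known Mersenne primes, and the helper names are invented for this sketch; the point is only to show that the middleman ends up holding X while neither endpoint sees anything wrong.

```python
import secrets


def make_keypair(p: int, q: int, e: int = 65537):
    """Build a toy RSA key pair from two primes: returns (public, private)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # needs Python 3.8+
    return (n, e), (n, d)


def rsa(value: int, key) -> int:
    """Raw RSA operation: encrypts with a public key, decrypts with a private key."""
    n, exponent = key
    return pow(value, exponent, n)


# Steps 1-2: the server and the middleman each own a key pair.
S_pub, S_priv = make_keypair(2**61 - 1, 2**31 - 1)
M_pub, M_priv = make_keypair(2**89 - 1, 2**19 - 1)

# Steps 3-4: the server sends S in plain text; the middleman saves S and swaps in M.
key_sent_by_server = S_pub
key_seen_by_client = M_pub

# Step 5: the client encrypts its symmetric key X with what it believes is S.
X = secrets.randbits(64)
to_server = rsa(X, key_seen_by_client)

# Step 6: the middleman decrypts with M', then re-encrypts with the saved S.
stolen_X = rsa(to_server, M_priv)
forwarded = rsa(stolen_X, key_sent_by_server)

# Step 7: the server decrypts with S' and notices nothing unusual.
server_X = rsa(forwarded, S_priv)
print(stolen_X == X, server_X == X)   # True True
```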

The above attack also applies to options 2 and 3.
What is the essence of the problem?

1. The middleman can tamper with data

2. The client cannot be sure that the received message containing the public key was sent by the target server!

CA certification

Understanding certificates

Before using HTTPS, the server needs to apply for a digital certificate from a CA. The digital certificate contains the certificate applicant's information, public key information, and so on. The server sends the certificate to the browser, and the browser obtains the public key from the certificate. The certificate is like an ID card that proves the authority of the server's public key. The CA plays the role of the government.

The above is the overall process. First, the server generates a public/private key pair and applies for a certificate from the CA. The application contains the public key, the domain name, the applicant, and other information; the private key stays with the server. Then the server waits for the CA's review. After the review is completed, a certificate is issued. The certificate consists of plain-text information plus a digital signature, where the plain-text information includes the domain name, the public key, and other information.

Digital signature

First, a hash function turns the plain-text information into a fixed-length digest, and then the digest is encrypted with the CA's private key. Only the CA knows this private key; nobody else can obtain it. The result is the digital signature.

The plain-text information and the digital signature together form the complete certificate.
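The signing step can be sketched as "hash, then apply the private key". The CA key below is a toy RSA key made from known Mersenne primes, and the certificate fields are placeholders. Real CAs use much larger keys together with a standardized padding scheme, so treat this purely as an illustration of the idea.

```python
import hashlib

# Toy CA key pair (known Mersenne primes; real CA keys are far larger).
P, Q = 2**61 - 1, 2**31 - 1
CA_N, CA_E = P * Q, 65537
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))   # CA private key, needs Python 3.8+


def digest_as_int(plaintext: bytes) -> int:
    """Hash the plain text and shrink the digest to fit under the toy modulus."""
    return int.from_bytes(hashlib.sha256(plaintext).digest(), "big") % CA_N


def sign(plaintext: bytes) -> int:
    """Encrypt the digest with the CA private key to form the signature."""
    return pow(digest_as_int(plaintext), CA_D, CA_N)


plaintext = b"domain=example.com;public_key=SERVER_PUBLIC_KEY"
certificate = {"plaintext": plaintext, "signature": sign(plaintext)}

# Anyone holding the CA public key can recover the digest from the signature.
recovered = pow(certificate["signature"], CA_E, CA_N)
print(recovered == digest_as_int(plaintext))   # True
```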


Client authentication

Once the server has obtained the certificate, it no longer sends the bare public key to the client. It sends the certificate instead, and the certificate contains the public key and other information.

Client authentication:
When the client obtains the certificate, it verifies the certificate (to prevent forged certificates).
        • Determine whether the validity period of the certificate has expired
        • Determine whether the issuing authority of the certificate is trusted (the operating system has trusted certificate issuing authorities built in).
        • Verify whether the certificate has been tampered with: get the public key of the certificate issuing authority from the system, decrypt the signature, and obtain a hash value (data digest), called hash1. Then calculate the hash value of the certificate's plain-text information, called hash2. Compare hash1 and hash2. If they are equal, the certificate has not been tampered with.
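The three checks can be sketched as a single function. The certificate layout, the `TRUSTED_CAS` table, and the toy CA key are all assumptions made for this example. In particular, only the plain-text field is signed here, which is a simplification of what a real certificate covers.

```python
import hashlib
from datetime import date

# Toy CA key pair (known Mersenne primes); only (CA_N, CA_E) is public.
CA_N, CA_E = (2**61 - 1) * (2**31 - 1), 65537
CA_D = pow(CA_E, -1, (2**61 - 2) * (2**31 - 2))   # needs Python 3.8+

# Stand-in for the trusted issuer list built into the operating system.
TRUSTED_CAS = {"Example CA": (CA_N, CA_E)}


def digest_as_int(plaintext: str, n: int) -> int:
    raw = hashlib.sha256(plaintext.encode()).digest()
    return int.from_bytes(raw, "big") % n


def issue(plaintext: str, issuer: str, not_after: date) -> dict:
    """CA side: attach a signature made with the CA private key."""
    signature = pow(digest_as_int(plaintext, CA_N), CA_D, CA_N)
    return {"plaintext": plaintext, "issuer": issuer,
            "not_after": not_after, "signature": signature}


def verify(cert: dict, today: date) -> bool:
    """Client side: the three checks listed above."""
    if today > cert["not_after"]:                  # validity period
        return False
    ca_key = TRUSTED_CAS.get(cert["issuer"])       # trusted issuer
    if ca_key is None:
        return False
    n, e = ca_key
    hash1 = pow(cert["signature"], e, n)           # decrypt the signature
    hash2 = digest_as_int(cert["plaintext"], n)    # recompute the digest
    return hash1 == hash2


cert = issue("domain=example.com;public_key=S", "Example CA", date(2030, 1, 1))
tampered = dict(cert, plaintext="domain=example.com;public_key=M")
print(verify(cert, date(2025, 1, 1)))       # True
print(verify(tampered, date(2025, 1, 1)))   # False
print(verify(cert, date(2031, 1, 1)))       # False
```

The tampered copy fails because swapping the public key changes hash2, while hash1 still reflects the digest the CA originally signed.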


Common questions

Is it possible for a middleman to tamper with the certificate?

The answer is no.
• The middleman does not have the CA's private key, so after tampering he cannot hash the new content and encrypt the digest with that private key. There is therefore no way to produce a matching signature for the tampered certificate.
• If the certificate is tampered with anyway, the client will find, after receiving it, that the hash of the plain text and the decrypted value of the signature are inconsistent. This shows that the certificate has been tampered with and is not trustworthy, so the client stops sending information to the server and nothing is leaked to the middleman.

Why must the digest be encrypted to form a signature when transmitted over the network?

        In network transmission, the purpose of encrypting the digest content to form a signature is to ensure the integrity and authentication of the transmitted data .
        Digital signature is the process of encrypting a digest using a private key. The sender signs the digest using the private key, and the recipient then uses the corresponding public key to verify the validity of the signature. In this way, the receiver can determine whether the message has been tampered with. (As mentioned above)
        During the transmission process, if only the original data is transmitted without signing or encryption, then an attacker in the middle may eavesdrop, tamper with, or forge the data. Once a signature is formed by encrypting the digest and transmitting the signature along with the original data, it can be verified at the receiving end. If the signature is verified successfully, the recipient can be confident of the integrity and authenticity of the data.
        The signature formed through encryption ensures that the data is not tampered with during transmission and verifies the identity of the source of the data. This provides a certain guarantee for protecting the security and credibility of data, which is especially important in network communications.

Here is an example without encryption:

Assume that our certificate is just the simple string hello. Calculate the hash value of this string (for example with md5); the result is BC4B2A76B9719D91.

If any character in hello is tampered with, for example so that it becomes hella, the calculated md5 value changes greatly: BDBD6F9CF51F2FD8.

The server can then return the string hello and the hash value BC4B2A76B9719D91 to the client. How can the client verify whether hello has been tampered with?

It just calculates the hash value of hello and checks whether it is BC4B2A76B9719D91.

But there is still a problem. If the hacker tampers with hello and also recalculates the hash value, the client cannot tell the difference.

Therefore, the hash value cannot be transmitted as plain text; it must be transmitted as cipher text.
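The weakness of a plain-text hash can be reproduced directly. The 16-character value quoted above matches the middle 16 hex characters of the standard 32-character MD5 digest of hello, so the sketch below assumes that form; the function names are made up for illustration.

```python
import hashlib


def md5_16(text: str) -> str:
    """Return the 16-character MD5 form: the middle of the 32-character hex digest."""
    return hashlib.md5(text.encode()).hexdigest()[8:24].upper()


def naive_check(text: str, claimed_hash: str) -> bool:
    """Client-side check when the hash travels as plain text."""
    return md5_16(text) == claimed_hash


# An honest server sends the text together with its hash.
print(md5_16("hello"))                            # BC4B2A76B9719D91
print(naive_check("hello", "BC4B2A76B9719D91"))   # True

# Tampering with the text alone is detected ...
print(naive_check("hella", "BC4B2A76B9719D91"))   # False

# ... but a middleman who also recomputes the hash goes unnoticed.
forged_text = "hella"
forged_hash = md5_16(forged_text)
print(naive_check(forged_text, forged_hash))      # True
```

The last line is the whole argument: a hash that anyone can recompute proves nothing about who produced it, which is why the digest is encrypted with a private key that the middleman does not have.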

The complete HTTPS process

The left side is what the client does, and the right side is what the server does.

Summary:

The working process of HTTPS involves three groups of keys.

The first group (asymmetric encryption): used to verify whether the certificate has been tampered with. The CA holds the private key and uses it to sign the certificate when the server submits its CSR file and applies for the certificate. The client holds the public key (the operating system records which CA certification authorities are trusted and holds their public keys). When the client sends a request, the server returns the certificate carrying the signature. The client uses the CA public key to verify the certificate, which ensures the legitimacy of the certificate and in turn the authority of the server public key carried in the certificate.

The second group (asymmetric encryption): used to negotiate the symmetric encryption key. The client randomly generates a symmetric key, encrypts it with the (now trusted) public key from the received certificate, and sends it to the server. The server obtains the symmetric key by decrypting with its private key.

The third group (symmetric encryption): all subsequent data transmitted between the client and the server is encrypted and decrypted with this symmetric key.

At this point, all the content of https is completed.


Origin blog.csdn.net/weixin_47257473/article/details/132721837