A detailed explanation of how HTTPS ensures security

Introduction to HTTPS

What is HTTPS

HTTPS (Hypertext Transfer Protocol Secure, originally HTTP over Secure Sockets Layer) is an HTTP channel designed for security. Simply put, it is the secure version of HTTP: an SSL layer is added beneath HTTP, and the security of HTTPS is built on SSL, so the details of encryption are handled by SSL.

The role of HTTPS

  • Content encryption: establishes a secure channel to protect data in transit;
  • Identity verification: confirms the authenticity of the website;
  • Data integrity: prevents content from being forged or tampered with by third parties.

Disadvantages of HTTPS

  • Encrypting and decrypting data makes it slower than HTTP

    Asymmetric encryption and decryption are required, and an SSL/TLS handshake is needed on top of the TCP three-way handshake, so the first connection is slower. Many optimizations now reduce this cost.

It is commonly believed that, for security reasons, browsers do not save HTTPS content in the local cache. In fact, HTTPS responses can be cached as long as the HTTP headers say so. By default, Firefox caches HTTPS content only in memory, but if the response header contains Cache-Control: public, the cache is also written to disk. IE caches HTTPS content whenever the HTTP headers allow it; its caching strategy does not depend on whether HTTPS is used.

The difference between HTTPS and HTTP

  • HTTPS requires applying to a CA for a certificate. Free certificates have traditionally been scarce, so a fee is usually required.
  • HTTP is the Hypertext Transfer Protocol and transmits information in plain text; HTTPS is HTTP carried over an SSL-encrypted channel.
  • HTTP and HTTPS establish connections differently and use different default ports: 80 for HTTP and 443 for HTTPS.
  • An HTTP connection is simple and stateless; HTTPS combines SSL with HTTP to provide encrypted transmission and identity authentication, which makes it more secure than HTTP.

HTTP uses port 80 by default; HTTPS uses port 443 by default.

The following figure shows the overall structure of HTTPS. HTTPS now essentially always uses TLS, because it is more secure, so "SSL" in the figure should be read as "SSL/TLS".
(Figure: HTTPS structure)

The sections below give a general introduction to the concepts shown in the figure above.

Encryption and decryption related knowledge

Symmetric encryption

Symmetric encryption (also called secret-key encryption) refers to an encryption algorithm that uses the same key for encryption and decryption. It is sometimes called a traditional cryptographic algorithm: the encryption key can be calculated from the decryption key, and the decryption key can be calculated from the encryption key. In most symmetric algorithms the two keys are simply identical, so this kind of algorithm is also called a secret-key algorithm or a single-key algorithm.
Common symmetric encryption algorithms: DES (Data Encryption Standard), AES (Advanced Encryption Standard), RC4, IDEA

Asymmetric encryption

Unlike a symmetric encryption algorithm, an asymmetric encryption algorithm requires two keys, a public key and a private key, which always come as a pair. Different keys are used for encryption and decryption. Asymmetric encryption is also called public-key encryption. In a key pair, the key that is made available to everyone is called the public key, and the key that is kept secret is called the private key.

Asymmetric encryption algorithms limit the length of the content that can be encrypted: it cannot exceed the length of the key. For example, with the commonly used 2048-bit key length, the content to be encrypted cannot exceed 256 bytes (padding schemes reduce the usable length further).
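The length limit above can be worked out with a little arithmetic. The sketch below assumes RSA; the function name and the padding labels are made up for illustration. The padding overheads are the standard ones for PKCS#1 v1.5 (11 bytes) and OAEP (2 * hash length + 2 bytes).

```python
def max_rsa_plaintext_bytes(key_bits: int, padding: str = "none", hash_len: int = 32) -> int:
    """Upper bound, in bytes, on what a single RSA encryption can carry."""
    k = key_bits // 8  # size of the RSA modulus in bytes
    if padding == "none":
        # Raw RSA: the message must fit inside the modulus.
        return k
    if padding == "pkcs1v15":
        # PKCS#1 v1.5 encryption padding needs at least 11 bytes.
        return k - 11
    if padding == "oaep":
        # OAEP needs 2 * hash_len + 2 bytes (hash_len = 32 for SHA-256).
        return k - 2 * hash_len - 2
    raise ValueError(f"unknown padding: {padding}")


print(max_rsa_plaintext_bytes(2048))              # 256
print(max_rsa_plaintext_bytes(2048, "pkcs1v15"))  # 245
print(max_rsa_plaintext_bytes(2048, "oaep"))      # 190
```

This is why asymmetric encryption is normally used only for small values such as keys, not for bulk data.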

Digest algorithm

A digital digest uses a one-way hash function to "digest" the plaintext into a fixed-length string (for example, 128 bits for MD5 or 256 bits for SHA-256). This string is also called a digital fingerprint. Different plaintexts produce different digests (for a secure hash function, finding two inputs with the same digest is computationally infeasible), and the same plaintext always produces the same digest. The digital digest is the fundamental reason HTTPS can ensure data integrity and detect tampering.
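As a small illustration of these properties, the sketch below uses SHA-256 from Python's standard hashlib module. The choice of SHA-256 and the sample messages are assumptions made for the example, not something the article prescribes.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


original = digest(b"transfer 100 to Alice")
same_again = digest(b"transfer 100 to Alice")
tampered = digest(b"transfer 900 to Alice")

print(len(original) * 4)       # 256 (bits), regardless of input length
print(original == same_again)  # True: same plaintext, same digest
print(original == tampered)    # False: one changed character changes the digest
```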

Digital signature

Digital signature technology combines two techniques: asymmetric encryption and the digital digest. The sender encrypts the digest with its private key and transmits it to the receiver together with the original text. Only the sender's public key can decrypt the encrypted digest. The receiver decrypts it, computes a digest of the received original text with the same hash function, and compares the two. If they are the same, the received information is complete and was not modified during transmission. Otherwise, the information has been modified. In this way a digital signature verifies the integrity of the information.
The digital signature process is as follows:
plaintext --> hash --> digest --> private-key encryption --> digital signature

A digital signature has two functions:
1. It confirms that the message was indeed signed and sent by the sender, because nobody else can forge the sender's signature.
2. It confirms the integrity of the message.

Note: A digital signature only verifies the origin and integrity of the data. Whether the data itself is encrypted is outside the scope of the digital signature.
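The signing pipeline above can be sketched with textbook RSA in pure Python. Everything here is an illustrative assumption: the key is built from two small, publicly known Mersenne primes, there is no padding, and the helper names are invented. It is not secure and is not how a real TLS library signs data. It only shows the hash, private-key and public-key steps.

```python
import hashlib

# Toy RSA key pair (assumption for illustration only; far too small to be secure).
P = 2**61 - 1                      # a known Mersenne prime
Q = 2**89 - 1                      # another known Mersenne prime
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def _digest_as_int(message: bytes) -> int:
    """plaintext --> hash --> digest, reduced so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """digest --> private-key encryption --> digital signature."""
    return pow(_digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare digests."""
    return pow(signature, E, N) == _digest_as_int(message)


signature = sign(b"pay 100 to Alice")
print(verify(b"pay 100 to Alice", signature))  # True
print(verify(b"pay 900 to Alice", signature))  # False: the text was modified
```

Note that the message itself travels in the clear in this sketch, which matches the point above: the signature protects integrity, not confidentiality.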

Digital certificate

Why should there be a digital certificate?

How can the requesting party be sure that the public key it obtained really was issued by the target host and has not been tampered with? And what if the host being contacted is itself engaged in stealing user information? To solve this, an authoritative and trustworthy third-party organization (usually one that has been audited and authorized) is needed to issue the public keys of host organizations in a uniform way. As long as the requesting party obtains the public key through this organization, the problems above are avoided.

Digital certificate issuance process

The user first generates a key pair and sends the public key, together with some identifying information, to the certification center. The certification center verifies the identity and performs whatever steps are necessary to make sure that the request really was sent by the user. It then issues the user a digital certificate that contains the user's personal information and public key, with the certification center's signature attached (signed with the private key of the root certificate). The user can then use the digital certificate for various activities. Digital certificates are issued by independent certificate authorities. Certificates differ from one another: each kind can provide a different level of assurance.

What the certificate contains

  • The name of the certification authority
  • Digital signature of the certificate itself
  • Certificate holder public key
  • Hash algorithm used for certificate signing

Verify the validity of the certificate

Browsers ship with built-in CA root certificates, and each root certificate contains that CA's public key.

  1. The issuing authority named in the certificate is forged: the browser does not recognize it and immediately treats the certificate as dangerous.
  2. The issuing authority named in the certificate does exist: the browser uses the CA name to find the corresponding built-in root certificate and CA public key. It uses the CA's public key to decrypt the signature of the forged certificate, finds that this fails, and treats the certificate as dangerous.
  3. For a tampered certificate: the browser uses the CA's public key to decrypt the digital signature and obtain digest A, then computes digest B of the certificate with the hash algorithm named in the certificate. It compares A and B. If they are equal, the certificate is intact. If they are not equal, it has been tampered with.
  4. A certificate can be revoked before it expires, usually because its private key has been compromised. Newer browsers such as Chrome, Firefox, Opera and Internet Explorer implement the Online Certificate Status Protocol (OCSP) to handle this case: the browser sends the serial number of the certificate presented by the website to the certificate authority via OCSP, and the authority tells the browser whether the certificate is still valid.

Points 1 and 2 deal with forged certificates, point 3 with tampered certificates, and point 4 with revoked certificates.
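In practice these checks are done by the TLS library, not by hand. As a minimal sketch, the snippet below uses Python's standard ssl module, whose default context verifies the certificate chain against the system's trusted root certificates and also checks the host name. The helper name fetch_peer_certificate is made up for this example, and revocation checking (point 4) is not shown here.

```python
import socket
import ssl

# The default context loads the trusted CA root certificates and turns verification on.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # True: a valid certificate is mandatory
print(context.check_hostname)                    # True: the host name must match too


def fetch_peer_certificate(host: str, port: int = 443) -> dict:
    """Connect to host and return its verified certificate as a dict.

    Raises ssl.SSLCertVerificationError if the chain cannot be traced
    to a trusted root or the certificate does not match the host name.
    """
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            return tls_sock.getpeercert()
```

Calling fetch_peer_certificate with a real host name requires network access, so it is only defined here and not executed.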

SSL and TLS

SSL (Secure Sockets Layer)

SSL was developed by Netscape to secure data transmission on the Internet. It uses data encryption to ensure that data cannot be read even if it is intercepted in transit. Its last version is 3.0.

The SSL protocol can be divided into two layers. The SSL Record Protocol is built on a reliable transport protocol (such as TCP) and provides basic functions such as data encapsulation, compression, and encryption for higher-level protocols. The SSL Handshake Protocol is built on top of the SSL Record Protocol and is used to authenticate the communicating parties, negotiate encryption algorithms, and exchange encryption keys before the actual data transmission starts.

TLS (Transport Layer Security)

TLS provides confidentiality and data integrity between two applications.
TLS 1.0 is a protocol defined by the IETF (Internet Engineering Task Force). It is based on the SSL 3.0 specification, is the successor to SSL 3.0, and can be thought of as SSL 3.1. It is specified in an RFC. The protocol consists of two layers: the TLS Record Protocol and the TLS Handshake Protocol. The lower layer is the TLS Record Protocol, which sits on top of a reliable transport protocol (such as TCP).

SSL/TLS protocol function:

  • Authenticate users and servers to ensure that data is sent to the correct client and server;
  • Encrypt data to prevent it from being stolen in transit;
  • Maintain the integrity of the data and ensure that the data is not changed during transmission.

Advantages of TLS over SSL

  1. Keyed-hash message authentication: TLS uses the keyed-hash message authentication code (HMAC), which ensures that a record cannot be altered undetected while it travels over an open network such as the Internet. SSLv3.0 also provides keyed message authentication, but HMAC is more secure than the MAC function used by SSLv3.0.
  2. Enhanced pseudo-random function (PRF): The PRF generates key data. In TLS, the PRF is defined with HMAC and uses two hash algorithms to ensure its security. If either algorithm is broken, the data remains safe as long as the other is not.
  3. Improved Finished message verification: Both TLS and SSLv3.0 send a Finished message to the two endpoints, which verifies that the exchanged messages have not been altered. However, TLS bases this Finished message on the PRF and HMAC values, which is also more secure than SSLv3.0.
  4. Consistent certificate handling: Unlike SSLv3.0, TLS attempts to specify the types of certificates that must be exchanged between TLS implementations.
  5. Specific alert messages: TLS provides more specific, additional alerts to indicate problems detected by either endpoint of a session. TLS also documents when certain alerts should be sent.
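To make the first point concrete, here is a minimal HMAC sketch using Python's standard hmac module. The key, the record contents, the choice of SHA-256 and the helper names are assumptions for the example. A real TLS record MAC also covers a sequence number and header fields, which are left out here.

```python
import hashlib
import hmac


def make_tag(key: bytes, record: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over a record with a shared secret key."""
    return hmac.new(key, record, hashlib.sha256).digest()


def is_authentic(key: bytes, record: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, record), received_tag)


shared_key = b"key agreed during the handshake"  # placeholder value
record = b"GET /account HTTP/1.1"
tag = make_tag(shared_key, record)

print(is_authentic(shared_key, record, tag))                    # True
print(is_authentic(shared_key, b"GET /admin HTTP/1.1", tag))    # False: record altered
print(is_authentic(b"attacker's guess", record, tag))           # False: wrong key
```

Because the tag depends on the secret key, someone who alters the record in transit cannot produce a matching tag.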

SSL/TLS handshake process

The whole process of the SSL and TLS handshake is shown in the following figure. The specific content of each step will be described in detail below:

The client makes the first request

Because clients (such as browsers) differ in which encryption and decryption algorithms they support, both sides must use the same set of algorithms during TLS transmission so that data can be encrypted and decrypted correctly. In the TLS handshake phase, the client must first tell the server which encryption algorithms it supports, so it sends the server a list of its locally supported cipher suites. In addition, the client generates a random number. This random number is stored on the client and also sent to the server. It is later combined with the random number generated by the server to produce the Master Secret discussed below.

The client needs to provide the following information:
- Supported protocol version, such as TLS 1.0
- A random number generated by the client, which is later used to generate the "session key"
- Supported encryption methods, such as RSA public-key encryption
- Supported compression methods

The server responds for the first time

After the server receives the Client Hello, it decides which protocol version and encryption algorithm to use, generates a random number of its own, and sends these to the client together with its certificate. This random number is the second random number in the whole process.

Information that the server needs to provide:
- Protocol version
- Encryption algorithm
- Random number
- Server certificate
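The first two messages can be modelled roughly as follows. This is a simplified sketch under stated assumptions: the class and function names, the server's preference list and the placeholder certificate bytes are all invented for illustration, and real hello messages carry more fields. Versions use the wire encoding, where (3, 1) means TLS 1.0 and (3, 3) means TLS 1.2.

```python
import os
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ClientHello:
    max_version: Tuple[int, int]      # highest protocol version the client supports
    cipher_suites: List[str]          # supported encryption methods
    compression_methods: List[str]    # supported compression methods
    random: bytes = field(default_factory=lambda: os.urandom(32))  # first random number


@dataclass
class ServerHello:
    version: Tuple[int, int]          # protocol version chosen by the server
    cipher_suite: str                 # encryption algorithm chosen by the server
    certificate: bytes                # server certificate (placeholder bytes here)
    random: bytes = field(default_factory=lambda: os.urandom(32))  # second random number


# Hypothetical server configuration, most preferred suite first.
SERVER_MAX_VERSION = (3, 3)
SERVER_CIPHER_SUITES = [
    "TLS_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
]


def respond(hello: ClientHello, certificate: bytes) -> ServerHello:
    """Pick the highest common version and the first common cipher suite."""
    version = min(hello.max_version, SERVER_MAX_VERSION)
    for suite in SERVER_CIPHER_SUITES:
        if suite in hello.cipher_suites:
            return ServerHello(version, suite, certificate)
    raise ValueError("no cipher suite in common, handshake fails")


client_hello = ClientHello((3, 1), ["TLS_RSA_WITH_AES_128_CBC_SHA"], ["null"])
server_hello = respond(client_hello, b"<server certificate>")
print(server_hello.version)       # (3, 1)
print(server_hello.cipher_suite)  # TLS_RSA_WITH_AES_128_CBC_SHA
```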

Client responds again

The client first verifies the certificate sent by the server, and continues only if verification passes. The client then generates another random number (the third random number) and encrypts it with the public key in the server certificate. It also sends a ChangeCipherSpec message, which announces the switch to the negotiated encryption, followed by a hash of all previous handshake messages encrypted with the new key. The server verifies this data, which confirms that nothing has gone wrong before formal communication begins.
The client combines the two earlier random numbers with the newly generated one, using the encryption algorithm agreed with the server, to generate the Session Secret.
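The derivation step can be sketched as follows. As a stand-in, this example uses the HMAC-SHA256 based pseudo-random function defined for TLS 1.2, which may differ from the protocol version the article has in mind (TLS 1.0 combines two hash algorithms instead). The function names are chosen for this example. The result is the 48-byte master secret from which the actual session keys are then expanded.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """Expand secret and seed into length bytes using chained HMAC-SHA256."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()              # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def derive_master_secret(pre_master: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """Combine the three random values into a 48-byte master secret."""
    seed = b"master secret" + client_random + server_random
    return p_sha256(pre_master, seed, 48)


client_random = os.urandom(32)             # first random number, sent in the clear
server_random = os.urandom(32)             # second random number, sent in the clear
pre_master = bytes([3, 3]) + os.urandom(46)  # third random number, sent encrypted

client_side = derive_master_secret(pre_master, client_random, server_random)
server_side = derive_master_secret(pre_master, client_random, server_random)
print(client_side == server_side)  # True: both ends arrive at the same secret
print(len(client_side))            # 48
```

Since both sides feed identical inputs into the same function, they end up with the same secret without ever sending it over the network.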

ChangeCipherSpec
ChangeCipherSpec is an independent protocol that appears in the packet as a single byte of data. It tells the server that the client has switched to the previously negotiated cipher suite and is ready to use it to encrypt and transmit data.

Server responds again

After receiving the encrypted third random number from the client, the server decrypts it with its private key, verifies the data, and generates the secret key in the same way as the client. When everything is ready, it also sends a ChangeCipherSpec to tell the client that it has switched to the negotiated cipher suite and is ready to encrypt data with the cipher suite and the Session Secret. The server then encrypts a Finished message with the Session Secret and sends it to the client, to verify that the encrypted channel established by the handshake works.

Communication between client and server

Once the secret key has been determined, the server and the client encrypt their messages with the agreed key and communicate. At this point the handshake is essentially complete.

It is worth mentioning that:
The SSL protocol uses asymmetric encryption in the handshake phase and symmetric encryption in the transmission phase, which means that data sent over SSL is encrypted with a symmetric key, because asymmetric encryption is slow and resource-intensive. By the time the client and host have established a connection using asymmetric encryption, they have already agreed on the symmetric encryption algorithm and the symmetric key to use during transmission. Because that exchange is itself protected, the symmetric key cannot be stolen or misused, so the symmetric encryption of data in transit is also safe and reliable: no third party other than the client and host can obtain the symmetric key and decrypt the traffic. If someone eavesdrops on the communication, they can learn the encryption method chosen by both parties and two of the three random numbers. The security of the entire session therefore depends only on whether the third random number (the pre-master secret) can be cracked.

Other supplements

For highly confidential data, the server also needs to verify the client, to ensure that the data is sent to a safe and legitimate client. The server can send a CertificateRequest message asking the client to present a certificate that proves its legitimacy. For example, financial institutions often allow only authenticated customers to connect to their network, and give registered customers a USB key that contains a client certificate.

The first two bytes of the pre-master secret are the TLS version number. This version number is used to check the handshake data: in the Client Hello phase, the client sends its list of cipher suites and the SSL/TLS version it supports to the server in clear text. An attacker who intercepts the handshake packets could tamper with them so that the server chooses a less secure cipher suite and version, which makes the data easier to crack. Therefore, the server compares the version number in the decrypted pre-master secret with the version number it received in the Client Hello. If the version number has been lowered, the handshake has been tampered with, and the server immediately stops sending any messages.
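A minimal sketch of that check might look like the following. The function names and the exception used are assumptions for illustration. The 48-byte layout (2 version bytes followed by 46 random bytes) is the one used for RSA key exchange.

```python
import os
from typing import Tuple


def make_pre_master_secret(client_version: Tuple[int, int]) -> bytes:
    """Client side: 2 bytes of protocol version followed by 46 random bytes."""
    return bytes(client_version) + os.urandom(46)


def check_version(pre_master: bytes, client_hello_version: Tuple[int, int]) -> None:
    """Server side: abort if the clear-text Client Hello version was altered."""
    if len(pre_master) != 48:
        raise ValueError("malformed pre-master secret")
    if tuple(pre_master[:2]) != client_hello_version:
        raise ValueError("version mismatch: possible downgrade attack")


pre_master = make_pre_master_secret((3, 3))  # the client really supports TLS 1.2

check_version(pre_master, (3, 3))            # untouched Client Hello: passes silently

try:
    check_version(pre_master, (3, 1))        # attacker rewrote the hello to TLS 1.0
except ValueError as error:
    print(error)                             # version mismatch: possible downgrade attack
```

The check works because the version inside the pre-master secret is encrypted with the server's public key, so an attacker who edits the clear-text hello cannot change it to match.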

Session resumption

There are two ways to resume an earlier session: one is called the session ID, and the other is called the session ticket.

Session ID

The idea of the session ID is very simple: every session has a number (the session ID). If the session is interrupted, then on the next connection, as long as the client presents this number and the server has a record of it, both parties can reuse the existing "session key" without having to generate a new one.

The session ID is currently supported by all browsers, but its disadvantage is that it is often kept on only one server. If the client's request is routed to another server, the session cannot be resumed.
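The mechanism and its limitation can be sketched with an in-memory table. The SessionCache class and its method names are hypothetical, and a real server would also expire old entries.

```python
import os
from typing import Dict, Optional


class SessionCache:
    """Per-server table that maps a session ID to its session key."""

    def __init__(self) -> None:
        self._sessions: Dict[bytes, bytes] = {}

    def store(self, session_key: bytes) -> bytes:
        """Remember a session key and return the ID handed to the client."""
        session_id = os.urandom(32)
        self._sessions[session_id] = session_key
        return session_id

    def resume(self, session_id: bytes) -> Optional[bytes]:
        """Return the stored key, or None if a full handshake is needed."""
        return self._sessions.get(session_id)


server_a = SessionCache()
server_b = SessionCache()  # a second machine behind the same load balancer

session_key = os.urandom(48)
session_id = server_a.store(session_key)

print(server_a.resume(session_id) == session_key)  # True: session resumed
print(server_b.resume(session_id))                 # None: server B has no record
```

The second lookup shows the disadvantage described above: the record lives only on the server that created it.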

Session ticket

The client sends back the session ticket that the server issued in the previous session. The session ticket is encrypted, and only the server can decrypt it. It contains the main information about the session, such as the session key and the encryption method. When the server receives the session ticket and decrypts it, there is no need to generate a new session key.

At the time of writing, only the Firefox and Chrome browsers support it.

Summary

HTTPS adds SSL/TLS between the TCP layer and the HTTP layer to protect the layer above it. It uses symmetric encryption, asymmetric encryption, certificates, and related technologies to encrypt data between the client and the server, and in this way ensures the security of the entire communication.

Origin blog.csdn.net/xulong5000/article/details/109157315