Graphical HTTPS


We all know that HTTPS encrypts information so that third parties cannot obtain sensitive data. That is why banking websites, e-mail, and other services with higher security requirements use the HTTPS protocol.

Introduction to HTTPS

HTTPS is composed of two parts, HTTP + SSL/TLS: a module that handles encryption is added to HTTP. Everything transmitted between the server and the client passes through TLS, so the data on the wire is encrypted. The steps below walk through how the encryption, decryption, and verification work.

1. The client initiates an HTTPS request

There is not much to say here: the user enters an https URL in the browser, and the browser connects to port 443 of the server.

2. Server configuration

A server using the HTTPS protocol must have a digital certificate, which you can either generate yourself or apply for from a certificate authority. The difference is that a self-issued certificate must be explicitly accepted by the client before access can continue, while a certificate obtained from a trusted organization does not trigger a warning page (at the time of writing, StartSSL was a good choice, with one year of free service). The certificate goes together with a key pair: a public key, which the certificate carries, and a private key, which stays on the server. If you are unfamiliar with public and private keys, think of them as a lock and a key. You are the only one in the world who has the key, but you can hand the lock to anyone. Others can use the lock to lock up something important and send it to you, and because only you have the key, only you can see what was locked.

3. Send the certificate

This certificate is essentially the public key, packaged with a lot of additional information such as the issuing authority and the expiration time.

4. The client parses the certificate

This work is done by the client's TLS implementation. First it verifies that the certificate is valid, checking the issuing authority, the expiration time, and so on. If it finds a problem, a warning box pops up saying that there is a problem with the certificate. If the certificate is fine, the client generates a random value and encrypts it with the public key from the certificate. In terms of the analogy above, the random value is locked with the lock, so nobody without the key can see it.
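As a rough sketch of what "the client's TLS" looks like in practice, Python's standard `ssl` module can build a client context that performs these certificate checks. The helper name `make_client_context` is ours, chosen for illustration, and this shows only one client library, not what any particular browser does internally.

```python
import ssl

def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that validates server certificates."""
    # create_default_context() loads the system's trusted CA certificates
    # and turns on both certificate and hostname verification.
    return ssl.create_default_context()

context = make_client_context()

# With these settings the handshake raises ssl.SSLCertVerificationError
# when the certificate has expired, comes from an unknown authority,
# or was issued for a different host name.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

A real client would then pass a connected socket to `context.wrap_socket(sock, server_hostname=host)`, which runs the handshake and the checks described above.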

5. Transmission of encrypted information

This step transmits the random value encrypted with the certificate's public key. The purpose is to get the random value to the server, so that from then on the communication between client and server can be encrypted and decrypted with it.

6. The server decrypts the information

After decrypting with its private key, the server obtains the random value sent by the client, which now serves as a shared secret key, and uses it to encrypt content symmetrically. Symmetric encryption mixes the information and the secret key together through some algorithm, so that the content cannot be recovered without the key. Both the client and the server know this key, so as long as the encryption algorithm is strong enough and the key is complex enough, the data is secure.

7. Transmission of encrypted information

This is the information the server has encrypted with the shared key; it can be restored on the client side.

8. Client decryption information

The client decrypts the information sent by the server with the random value it generated earlier and obtains the content. Throughout the whole process, even if a third party captures the data, there is nothing they can do with it.

Author: Zhu Qilin Source: http://zhuqil.cnblogs.com The copyright of this article belongs to the author and the blog garden. Reprints are welcome, but unless the author agrees otherwise, this statement must be retained and a link to the original text must be given in a prominent position on the article page. Otherwise the author reserves the right to pursue legal liability.

Why do you need HTTPS

HTTP is transmitted in clear text, which means that any node between the sender and the receiver can know what the content of your transmission is. These nodes may be routers, proxies, etc.

Take the most common example, user login. The user enters an account name and password. Over HTTP, anyone who tampers with a proxy server along the way can obtain the password.

User login -> proxy server (manipulation) -> actual authorization server

What about encrypting the password on the sender side? It does not help. Others may not learn the original password, but anyone who captures the encrypted account and password can simply send them again and still log in.
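The replay problem can be sketched in a few lines. This is a toy model, not a real login system: `client_side_hash`, `server_login`, and the stored credentials are hypothetical names and data, and a SHA-256 hash stands in for whatever transformation the sender applies to the password.

```python
import hashlib

def client_side_hash(password: str) -> str:
    """Transform the password in the browser before sending it (no TLS)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical server-side record: the server only knows the transformed form.
STORED_CREDENTIALS = {"xiaoming": client_side_hash("correct horse")}

def server_login(account: str, hashed_password: str) -> bool:
    """The server only ever sees the transformed value, so that is what it checks."""
    return STORED_CREDENTIALS.get(account) == hashed_password

# The legitimate login, sent over plain HTTP.
request = ("xiaoming", client_side_hash("correct horse"))
assert server_login(*request)

# A proxy on the path records the request as it passes through...
captured = request

# ...and replays it later. The attacker never learns "correct horse",
# yet the login still succeeds.
attacker_logged_in = server_login(*captured)
print(attacker_logged_in)
```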

How HTTPS is Secure

HTTPS literally means secure HTTP; it is an upgraded version of HTTP. Readers with some networking background know that HTTP is an application layer protocol and that beneath HTTP sits the transport protocol TCP. TCP is responsible for the transmission, and HTTP defines how the data is packaged.

HTTP –> TCP (clear text transmission)

How is HTTPS different from HTTP? An encryption layer, TLS/SSL, is added between HTTP and TCP.

What is TLS/SSL?

In layman's terms, TLS and SSL are similar things. SSL is an encryption suite responsible for encrypting HTTP data, and TLS is its upgraded version. Today, when people talk about the encryption behind HTTPS, they basically mean TLS.

Transmission encryption process

Originally, the application layer sent data directly to TCP for transmission, but now the application layer sends data to TLS/SSL, encrypts the data, and then sends it to TCP for transmission.

Roughly as follows:

HTTP –> TLS/SSL –> TCP (encrypted transmission)

That's it. Encrypting the data before transmitting it, instead of letting it run naked across a complex and dangerous network, goes a long way toward keeping it secure. Even if the data is intercepted by intermediate nodes, the bad guys cannot read it.

How HTTPS Encrypts Data

Readers with some background in security or cryptography will know the common encryption methods. Generally speaking, encryption is divided into symmetric encryption and asymmetric encryption (also called public key encryption).

Symmetric encryption

Symmetric encryption means that the key used to encrypt data is the same key used to decrypt data.

The advantage of symmetric encryption is that encryption and decryption are usually relatively efficient. The disadvantage is that the sender and the receiver must negotiate and share the same key and make sure it is not leaked to anyone else. In addition, when many parties need to exchange data, a separate key has to be allocated and maintained for each pair of them, and the cost of doing so quickly becomes unacceptable.
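A minimal sketch of the idea, assuming a toy cipher that XORs the data with a keystream derived from the key via SHA-256. The construction and the names `keystream`, `xor_cipher`, and `pairwise_keys` are ours, for illustration only; real systems use vetted ciphers such as AES.

```python
import hashlib
import secrets

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Both sides must already hold the same key.
shared_key = secrets.token_bytes(16)
ciphertext = xor_cipher(shared_key, b"account=xiaoming")
plaintext = xor_cipher(shared_key, ciphertext)

def pairwise_keys(parties: int) -> int:
    """Number of distinct keys when every pair of parties needs its own key."""
    return parties * (parties - 1) // 2
```

The key-management cost grows quadratically: `pairwise_keys(4)` is 6, while `pairwise_keys(1000)` is already 499500.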

Asymmetric encryption

Asymmetric encryption means that the key (public key) used to encrypt data is different from the key (private key) used to decrypt data.

What is a public key? As the name suggests, it is a key that is public and that anyone can look up. This is why asymmetric encryption is also called public key encryption.

Correspondingly, the private key is a non-public key, generally held by the administrator of the website.

What is the relationship between the public key and the private key?

Simply put, data encrypted with the public key can only be decrypted with the private key. Data encrypted with the private key can only be decrypted with the public key.

Many readers know that data encrypted with the public key can be decrypted with the private key, but they overlook one point: data encrypted with the private key can also be decrypted with the public key. This is critical to understanding the entire encryption and authentication system of HTTPS.
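Both directions can be seen with textbook RSA on tiny numbers. This is only a sketch: the primes 61 and 53, the exponent 17, and the helper `apply_key` are illustrative choices, and real RSA uses keys of 2048 bits or more together with padding.

```python
# Textbook RSA with tiny primes, for illustration only.
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)
private_key = (d, n)

def apply_key(key, value):
    """Raise the value to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65  # must be smaller than n

# Direction 1: encrypt with the public key, decrypt with the private key.
locked = apply_key(public_key, message)
assert apply_key(private_key, locked) == message

# Direction 2: encrypt with the private key, decrypt with the public key.
sealed = apply_key(private_key, message)
assert apply_key(public_key, sealed) == message
```

The second direction hides nothing, since everyone has the public key. What it does show is that the value was produced by the holder of the private key, which is the basis of the digital signatures discussed later.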

An example of asymmetric encryption

  • Login user: Xiao Ming
  • Authorized website: a well-known social networking site (hereinafter referred to as XX)

Xiao Ming is a user of the well-known social networking site XX, and for security reasons XX uses asymmetric encryption for login. Xiao Ming enters his account and password on the login page and clicks "Login". The browser encrypts the account and password with the public key and sends a login request to XX. XX's login authorization program decrypts the account and password with the private key and verifies them. Xiao Ming's personal information (including private data) is then encrypted with the private key and sent back to the browser, which decrypts it with the public key and shows it to Xiao Ming.

  • Step 1: Xiao Ming enters the account password -> the browser encrypts with the public key -> the request is sent to XX
  • Step 2: XX decrypts with the private key, passes the verification –> obtains Xiaoming’s social data, encrypts it with the private key –> the browser decrypts the data with the public key, and displays it.

Can asymmetric encryption solve the problem of data transmission security? As mentioned earlier, data encrypted with the private key can be decrypted with the public key, and the public key is public. In other words, asymmetric encryption alone can only guarantee the security of data transmission in one direction.

Also, there is the issue of how the public key is distributed/obtained. These two issues are discussed further below.

Public Key Encryption: Two Obvious Problems

The earlier example of Xiao Ming logging in to the social networking site XX showed that relying on public key encryption alone has two obvious problems.

  1. How to get the public key
  2. Data transfer is only one-way secure

Problem 1: How to obtain the public key

How does the browser get XX's public key? Of course, Xiao Ming could look it up online, and XX could post its public key on its homepage. However, for a social networking site with tens of millions of users, this would cause great inconvenience. After all, most users do not know what a "public key" is.

Problem 2: Data transmission is only one-way secure

As mentioned earlier, only the private key can unlock the data encrypted by the public key, so Xiao Ming's account and password are safe, and he is not afraid of being intercepted halfway.

Then there is a big problem: data encrypted with the private key can also be decrypted with the public key. Since the public key is public, Xiao Ming's private data is once again running naked on the Internet, just in a different way. (Once the intermediate proxy server has the public key, it can decrypt Xiao Ming's data without any trouble.)

The following sections address these two problems in turn.

Problem 1: How to obtain the public key

There are two very important concepts involved here: certificate, CA (Certificate Authority).

Certificate

It can be temporarily understood as the ID card of the website. This ID contains a lot of information, including the public key mentioned above.

That is to say, when Xiao Ming, Xiao Wang, Xiao Guang and other users access XX, they no longer need to search the whole world for XX's public key. When they visit XX, XX sends its certificate to the browser and says, in effect: here, use the public key in this to encrypt your data.

Here is a question: where does the so-called "certificate" come from? This is the job of the CA, described next.

CA (Certificate Authority)

Emphasize two points:

  1. There are many CAs (both at home and abroad) that can issue certificates.
  2. Only a few CAs are considered authoritative and impartial, and browsers trust the certificates these CAs issue. VeriSign is one example. (Not that a CA forging certificates has never happened...)

The details of certificate issuance are not covered here. Put simply, the website submits an application to the CA; once the CA approves it, the certificate is issued to the website. When a user accesses the website, the website sends the certificate to the user.

The contents of the certificate are covered later.

Problem 2: Data transmission is only one-way secure

As mentioned above, data encrypted with the private key can be decrypted and restored with the public key. So, does this mean that the data the website transmits to the user is not secure?

The answer is: yes!!! (Three exclamation marks for emphasis to the third power.)

Seeing this, you may be thinking: if the data still runs naked even with HTTPS, it is so unreliable that we might as well use plain HTTP and save the trouble.

Yet why are the industry's calls for HTTPS growing louder and louder? That clearly contradicts this intuition.

The reason: although HTTPS uses public key encryption, it combines it with other means, such as symmetric encryption, to make authentication and encrypted transmission both efficient and secure.

In a nutshell, the entire simplified encrypted communication process is:

  1. Xiao Ming visits XX, and XX gives its certificate to Xiao Ming (actually to the browser; Xiao Ming does not notice)
  2. The browser gets XX's public key A from the certificate
  3. The browser generates a symmetric key B known only to itself, encrypts it with public key A, and passes it to XX (in reality there is a negotiation process, simplified here for ease of understanding)
  4. XX decrypts with its private key and gets the symmetric key B
  5. All subsequent communication between the browser and XX is encrypted with key B

Note: For each user accessing XX, the generated symmetric key B is theoretically different. For example, Xiaoming, Xiaowang, and Xiaoguang may generate B1, B2, and B3.
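The five steps above can be simulated end to end. This is a toy sketch under loud assumptions: textbook RSA with tiny primes plays the role of the key pair A, an XOR keystream plays the role of the symmetric cipher, and the key B is wrapped one byte at a time only because the toy modulus is so small. None of this matches the real TLS key exchange, which (as noted above) involves a negotiation.

```python
import hashlib
import secrets

# XX's toy RSA key pair (textbook RSA, tiny primes; illustration only).
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, known only to XX
PUBLIC_KEY_A = (E, N)               # would be carried inside the certificate

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Steps 1-2: the browser receives the certificate and reads public key A.
e, n = PUBLIC_KEY_A

# Step 3: the browser generates symmetric key B and wraps it with A.
key_b = secrets.token_bytes(16)
wrapped_b = [pow(byte, e, n) for byte in key_b]

# Step 4: XX unwraps B with its private key.
server_key_b = bytes(pow(c, D, N) for c in wrapped_b)

# Step 5: traffic in both directions is now protected by key B.
request = xor_cipher(key_b, b"account=xiaoming&password=secret")
seen_by_server = xor_cipher(server_key_b, request)
response = xor_cipher(server_key_b, b"Xiao Ming's private profile")
seen_by_browser = xor_cipher(key_b, response)
```

An eavesdropper who holds public key A sees only `wrapped_b`, `request`, and `response`. Without the private exponent it cannot recover B, so the response is protected as well as the request.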

[Figure: the simplified HTTPS encrypted communication process described above]

What are the possible problems with the certificate

After understanding the process of HTTPS encrypted communication, the worry about data running naked should be largely dispelled. However, attentive readers may ask: how can we be sure that the certificate is legitimate and valid?

There are two ways a certificate can be illegitimate:

  1. The certificate is forged: it was not issued by a CA at all
  2. The certificate has been tampered with: for example, the public key of the XX website has been replaced

For example:

We know there is such a thing as a proxy in this world, so Xiao Ming's login to the XX website may actually go like this: the login request first arrives at a proxy server, which then forwards it to the authorization server.

Xiao Ming –> evil proxy server –> login authorization server
Xiao Ming <– evil proxy server <– login authorization server

But there are too many bad people in this world. One day the proxy server turns evil (or is hacked) and intercepts Xiao Ming's request, returning an illegitimate certificate of its own.

Xiaoming –> evil proxy server –x –> login authorization server
Xiaoming <– evil proxy server –x –> login authorization server

If the trusting Xiao Ming believed this certificate, he would be running naked again. Of course that must not be allowed to happen. So what mechanism prevents it?

Next, let's take a look at what the "certificate" contains, and then we can roughly guess how to prevent it.

Certificate Introduction

Before formally introducing the certificate format, here is a short digression on digital signatures and digests, followed by a not-too-deep introduction to certificates.

Why? Because digital signatures and digests are the key weapons against certificate forgery.

Digital Signature and Digest

Simply put, a "digest" is a fixed-length string computed from the transmitted content with a hash algorithm (much as an abstract condenses an article). The digest is then encrypted with the CA's private key, and the result is the "digital signature". (The CA's private key is mentioned here; it is introduced later.)

Plaintext –> hash operation –> digest –> private key encryption –> digital signature

Combining the above content, we know that this digital signature can only be decrypted by the public key of the CA.
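The pipeline "plaintext, hash, digest, private key encryption, digital signature" can be sketched as follows. The CA key pair here is a toy: textbook RSA built from the two Mersenne primes 2^61 - 1 and 2^89 - 1, with the SHA-256 digest reduced modulo n so that it fits. The names `digest_of`, `sign`, and `recover_digest` are ours; real signature schemes add padding and use far larger keys.

```python
import hashlib

# Toy CA key pair: textbook RSA built from two Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
CA_N = P * Q
CA_E = 65537                                # CA public exponent
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))     # CA private exponent

def digest_of(content: bytes) -> int:
    """Plaintext -> hash operation -> digest (reduced to fit the toy modulus)."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % CA_N

def sign(content: bytes) -> int:
    """Digest -> private key encryption -> digital signature."""
    return pow(digest_of(content), CA_D, CA_N)

def recover_digest(signature: int) -> int:
    """Anyone holding the CA's public key can recover the signed digest."""
    return pow(signature, CA_E, CA_N)

content = b"subject=XX; public_key=(17, 3233)"
signature = sign(content)

# The recovered digest matches the digest of the original content...
assert recover_digest(signature) == digest_of(content)
# ...but not the digest of altered content.
assert recover_digest(signature) != digest_of(b"subject=XX; public_key=(3, 55)")
```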

Next, let's take a look at what the mysterious "certificate" contains, and then roughly guess how to prevent illegal certificates.

For more information on digital signatures and digests, please refer to this article .

certificate format

Let me shamelessly paste a large chunk of content first. The certificate format comes from this good article, "OpenSSL and SSL Digital Certificate Concept Post".

There is a lot of content, here we need to pay attention to a few points:

  1. The certificate contains the name of the authority that issued the certificate – CA
  2. Digital signature of certificate content itself (encrypted with CA private key)
  3. Certificate holder's public key
  4. Hash algorithm used for certificate signing

In addition, one thing needs to be added, that is:

  1. The CA has its own certificate, known in the trade as the "root certificate". This root certificate proves the identity of the CA and is essentially an ordinary digital certificate.
  2. Browsers usually have the root certificates of most major authoritative CAs built in.

certificate format

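1. Certificate version number (Version)
The version number indicates the format version of the X.509 certificate. The current values can be:
    1) 0: v1
    2) 1: v2
    3) 2: v3
Values are also predefined for future versions.

2. Certificate serial number (Serial Number)
The serial number is the unique numeric identifier that the CA assigns to the certificate. When a certificate is revoked, its serial number is placed in the CRL issued by the CA,
which is why the serial number must be unique.

3. Signature algorithm identifier (Signature Algorithm)
The signature algorithm identifier specifies the signature algorithm the CA used when issuing the certificate. It specifies:
    1) the public key algorithm
    2) the hash algorithm
example: sha256WithRSAEncryption
The identifier must be registered with an internationally recognized standards organization (such as ISO).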

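4. Issuer name (Issuer)
This field identifies the X.500 DN (Distinguished Name) of the CA that issued the certificate. It includes:
    1) Country (C)
    2) State or province (ST)
    3) Locality (L)
    4) Organization (O)
    5) Organizational unit (OU)
    6) Common name (CN)
    7) E-mail address

5. Validity period (Validity)
Specifies the period during which the certificate is valid, including:
    1) the date and time at which the certificate becomes valid
    2) the date and time at which the certificate expires
Each time the certificate is used, it must be checked that the certificate is within its validity period.

6. Certificate subject name (Subject)
Specifies the unique X.500 name of the certificate holder. It includes:
    1) Country (C)
    2) State or province (ST)
    3) Locality (L)
    4) Organization (O)
    5) Organizational unit (OU)
    6) Common name (CN)
    7) E-mail address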

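7. Certificate holder's public key information (Subject Public Key Info)
This field contains two important pieces of information:
    1) the value of the certificate holder's public key
    2) the identifier of the algorithm used with the public key. This identifier covers the public key algorithm and the hash algorithm.

8. Extensions (extension)
X.509 v3 certificates add extensions to v2, in standard or general form, so that a certificate can carry additional information. Standard extensions are
the widely applicable extensions that X.509 v3 defines on top of v2. Anyone can register other extensions with an authority such as ISO,
and if those extensions become widely used they may later become standard extensions.

9. Issuer unique identifier (Issuer Unique Identifier)
The issuer unique identifier was added to the certificate definition in version 2. When the same X.500 name is used by more than one certificate authority, this field uses a bit string
to uniquely identify the issuer's X.500 name. Optional.

10. Certificate holder unique identifier (Subject Unique Identifier)
The subject unique identifier was added to the X.509 certificate definition in version 2 of the standard. When the same X.500 name is used by more than one certificate holder,
this field uses a bit string to uniquely identify the holder's X.500 name. Optional.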

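11. Signature algorithm (Signature Algorithm)
The algorithm the issuing authority used to sign the certificate content above
example: sha256WithRSAEncryption

12. Signature value (Issuer's Signature)
The issuing authority's signature value over the certificate content above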

How to Identify Illegal Certificates

As mentioned above, the XX certificate contains the following:

  1. The certificate contains the name of the authority that issued the certificate – CA
  2. Digital signature of certificate content itself (encrypted with CA private key)
  3. Certificate holder's public key
  4. Hash algorithm used for certificate signing

The CA root certificate built into the browser contains the following key content:

  1. CA's public key (very important!!!)

OK, now let's explain how to identify the two kinds of illegitimate certificate mentioned above.

Completely fake certificate

This case is relatively simple. Check the certificate:

  1. The issuing authority is made up: the browser does not recognize it and immediately treats the certificate as dangerous
  2. The issuing authority does exist: look up the corresponding built-in CA root certificate and the CA's public key by the CA name
  3. Use the CA's public key to decrypt the digital signature of the forged certificate; it cannot be decrypted into a valid digest, so the certificate is treated as dangerous

Tampered certificate

Suppose the proxy somehow obtains XX's certificate, quietly replaces the public key in it with its own, and then happily waits for users to take the bait. That is too naive:

  1. Check the certificate and find the corresponding CA root certificate and CA's public key according to the CA name.
  2. Use the CA's public key to decrypt the digital signature of the certificate to obtain the corresponding certificate digest AA
  3. Calculate the digest BB of the current certificate according to the hash algorithm used in the certificate signature
  4. Compare AA and BB; they do not match, so the certificate is judged to be dangerous
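Both checks can be sketched with a toy model. Everything here is hypothetical: the certificate is a plain dictionary rather than X.509, "Example Root CA" is a made-up authority, and the CA key pair is textbook RSA built from two Mersenne primes. In this sketch a bogus signature shows up as a digest mismatch rather than as a decryption failure.

```python
import hashlib

# Toy CA key pair (textbook RSA from two Mersenne primes; illustration only).
P, Q = 2**61 - 1, 2**89 - 1
CA_N, CA_E = P * Q, 65537
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))   # held only by the CA

# Root certificates built into the browser: CA name -> CA public key.
TRUSTED_ROOTS = {"Example Root CA": (CA_E, CA_N)}

def digest_of(issuer, subject, public_key, modulus):
    """Hash the certificate content into an integer digest."""
    content = f"{issuer}|{subject}|{public_key}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % modulus

def issue(issuer, subject, public_key):
    """What the CA does: sign the digest of the content with its private key."""
    signature = pow(digest_of(issuer, subject, public_key, CA_N), CA_D, CA_N)
    return {"issuer": issuer, "subject": subject,
            "public_key": public_key, "signature": signature}

def verify(cert):
    """What the browser does, following the steps listed above."""
    root = TRUSTED_ROOTS.get(cert["issuer"])
    if root is None:
        return "dangerous: unknown issuer"
    ca_e, ca_n = root
    # Digest AA: recovered from the signature with the CA's public key.
    aa = pow(cert["signature"], ca_e, ca_n)
    # Digest BB: recomputed from the certificate content as received.
    bb = digest_of(cert["issuer"], cert["subject"], cert["public_key"], ca_n)
    return "ok" if aa == bb else "dangerous: digest mismatch"

genuine = issue("Example Root CA", "XX", "XX-public-key")
tampered = dict(genuine, public_key="proxy-public-key")   # public key swapped
unknown_ca = dict(genuine, issuer="Nobody CA")            # made-up authority
bogus_signature = dict(genuine, signature=12345)          # not signed by the CA
```

Calling `verify` on `genuine` returns `"ok"`, while the other three certificates are rejected. The proxy cannot fix the mismatch because producing a matching signature requires the CA's private key.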

HTTPS handshake process

After all the talk above, how HTTPS secures data encryption and transmission has basically been covered. The more technical details are skipped here.

Finally, two questions remain:

  1. How does the website give the certificate to the user (browser)
  2. How is the symmetric key mentioned above negotiated?

Both questions are answered by what happens in the HTTPS handshake phase. As a whole, HTTPS data transmission, like HTTP, includes two stages: handshake and data transmission.

  1. Handshake: certificate delivery and key negotiation (all in clear text at this stage)
  2. Data transmission: This stage is encrypted, using the symmetric key negotiated in the handshake stage

Ruan Yifeng's article is very well written and easy to understand. Interested readers can read it.

Attachment: "Overview of SSL/TLS Protocol Operating Mechanism": http://www.ruanyifeng.com/blog/2014/02/ssl_tls.html

Afterword

This is a popular science article, and parts of it are not rigorous. If you find any mistakes, please point them out :)

http://www.cnblogs.com/chyingp/p/https-introduction.html
