Four basic characteristics of IM systems: security (and solutions)

Three dimensions of message security

Since security is critical and non-negotiable for instant messaging services, at which points can message security problems arise?

Broadly speaking, following a message from its creation through its circulation, we can describe message security along three dimensions:

1. Message transmission security

2. Message storage security

3. Message content security

How to ensure the security of message transmission

Transmission security is the easiest dimension to understand. An instant messaging service interacts over the network, and most messages and signaling travel across open networks, so there are relatively many places where security risks can arise. For example, DNS hijacking can cause requests intended for the IM service to be intercepted and sent to other servers, resulting in content leakage or failed requests. Likewise, a message transmitted in plaintext can be hijacked and tampered with by an intermediate device before it is forwarded to the IM server, causing business errors and other problems. For message transmission we therefore focus on two issues, "access entrance security" and "transmission link security", which are the two key safeguards in Internet-based instant messaging scenarios.

1. Ensure access security: HttpDNS

An instant messaging service generally exposes a public "access service" as the gateway through which users send and receive messages, and clients reach it through a domain name. For various reasons, access to this entrance often runs into "cannot access" and "wrong address" problems.

The most common problem at the access entrance is DNS hijacking. The usual causes of DNS hijacking for an access domain name are as follows.

The first type is that the router's DNS settings have been tampered with through illegal intrusion. This is common with home broadband routers: because of weak security settings (such as an unchanged default password), hackers or Trojan horses modify the router and point its DNS at a malicious DNS server. Such a server can make certain websites unreachable, and sometimes returns counterfeit content or inserts pop-up advertisements.

The second type is that the operator's LocalDNS may cause the resolution of the access domain name to be hijacked.

Three typical situations are:

(1) To reduce cross-network traffic, some operators cache the content of certain domain names and force those domain names to resolve to the IP address of their own content caching server.

(2) The operator may modify the DNS TTL (Time-To-Live, the DNS cache time), which delays DNS changes from taking effect and affects service availability. In our own experience, the TTL of an online business domain name reached 24 hours in some provinces and cities.

(3) To reduce the load on their own resources, some small operators forward DNS requests to other operators for resolution. The IP addresses returned this way may involve cross-operator access, which can make requests slow or unavailable.

Solution:

(1) If the DNS settings of a broadband router have been tampered with, resetting the router configuration and then changing the default administrator login password usually solves the problem, so we will not go into details here.

(2) For domain name hijacking and scheduling errors caused by the operator's LocalDNS, the most commonly used solution in the industry is HttpDNS. HttpDNS bypasses the operator's LocalDNS and talks to the DNS service directly over HTTP (rather than the standard DNS protocol, which normally runs over UDP), which effectively prevents domain name hijacking by the operator.

Because the HttpDNS server sees the user's real exit IP, it can also choose an access node closer to the user, or return several access IPs at once and let the client pick the fastest one through speed measurement, so overall access scheduling is more accurate. Accurate scheduling does, however, require the HttpDNS service to be backed by a reasonably complete IP library.

At present, most major vendors use HttpDNS as the primary resolver and the operator's LocalDNS as a supplement, and many third-party cloud vendors offer HttpDNS resolution as an external service. The implementation structure of HttpDNS is as follows:

In this structure, the client no longer asks the operator to resolve the domain name. Instead, it calls an independently provided HTTP interface, and the back end of that interface queries the authoritative DNS and keeps its data synchronized.
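As a rough illustration of "HttpDNS first, LocalDNS as a supplement", the sketch below shows one way a client might combine the two and then pick the fastest access IP. It is only a sketch under stated assumptions: the names resolve_access_ips, pick_fastest, and the Resolver type are hypothetical, and the actual HTTP query is passed in as a callable because every HttpDNS provider exposes its own endpoint.

```python
import socket
from typing import Callable, List

# Hypothetical resolver signature: takes a host name, returns a list of IPs.
Resolver = Callable[[str], List[str]]


def local_dns_lookup(host: str) -> List[str]:
    """Supplementary path: ask the system resolver (the operator's LocalDNS)."""
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    # Preserve order while dropping duplicate addresses.
    return list(dict.fromkeys(info[4][0] for info in infos))


def resolve_access_ips(host: str, httpdns: Resolver,
                       fallback: Resolver = local_dns_lookup) -> List[str]:
    """Prefer HttpDNS; fall back if it fails or returns no addresses."""
    try:
        ips = httpdns(host)
        if ips:
            return ips
    except OSError:
        pass  # HttpDNS unreachable, so use the supplementary resolver
    return fallback(host)


def pick_fastest(ips: List[str], measure_rtt: Callable[[str], float]) -> str:
    """Choose the access IP with the lowest measured round-trip time."""
    return min(ips, key=measure_rtt)
```

Injecting the HttpDNS query and the speed measurement as callables keeps the selection logic independent of any particular vendor and makes it easy to exercise without a network.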

2. Ensure transmission link security: the TLS transport layer encryption protocol

The security risks that messages face on the transmission link fall into four categories:

(1) Interruption: the attacker damages or cuts off the network, destroying service availability.

(2) Interception: the attacker illegally steals the content of transmitted messages, which is a passive attack.

(3) Tampering: the attacker illegally modifies the content of transmitted messages, destroying their integrity and true meaning.

(4) Forgery: the attacker forges normal communication messages to impersonate a legitimate user or the IM server.

Solution:

1. For message link interruption, we adopt a multi-channel approach.

In instant messaging systems, active attacks such as "transmission interruption" that disrupt service availability are generally handled with multiple channels to improve link availability. For example, many IM systems automatically try to switch to a failover channel when the main link fails or the connection is unstable. The failover channel can be:

(1) Switch among the multiple "access IPs" returned by the HttpDNS service, in case the intermediate link to one particular "access IP" has been damaged.

(2) Switch from the current transmission protocol to another one, for example from the UDP-based QUIC protocol to a TCP-based private protocol, or wrap the TCP private protocol in an HTTP Tunnel as a second layer of encapsulation (Weibo currently supports this approach), to defend against interruption attacks that target a specific protocol.
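The failover idea above can be sketched as a loop over channels in priority order. This is a minimal sketch, assuming each channel is represented by a hypothetical (name, connect) pair whose connect function either returns a connection object or raises OSError; real transports such as QUIC or an HTTP Tunnel would sit behind those callables.

```python
from typing import Callable, Iterable, Tuple

# A channel is a (name, connect) pair. connect() returns a connection object
# or raises OSError. Both are hypothetical stand-ins for real transports.
Channel = Tuple[str, Callable[[], object]]


def connect_with_failover(channels: Iterable[Channel]) -> Tuple[str, object]:
    """Try each channel in priority order and return the first that connects."""
    errors = []
    for name, connect in channels:
        try:
            return name, connect()
        except OSError as exc:
            # Remember why this channel failed, then move on to the next one.
            errors.append(f"{name}: {exc}")
    raise ConnectionError("all channels failed: " + "; ".join(errors))
```

A caller might pass the QUIC channel first, then the TCP private protocol, then the HTTP Tunnel, so that an attack on one protocol only costs a reconnect.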

2. For interception, tampering, and forgery during message transmission, we use proprietary protocols and TLS as defenses.

The industry has several countermeasures against a third party intercepting message content, maliciously tampering with it, or forging the IM server or client to obtain messages or perform malicious operations.

(1) Proprietary protocol: an instant messaging system that uses a binary proprietary protocol gains some resistance to theft and tampering from its encoding. Compared with systems that transmit plaintext formats such as JSON, XML, or HTML, the cost of cracking content intercepted by a third party is higher, so security is somewhat better.

(2) TLS

A proprietary protocol only raises the cost of an attack, so to address the security issues above the industry generally adopts the TLS protocol to protect business data. TLS combines a "symmetric encryption algorithm", an "asymmetric encryption algorithm", a "key exchange algorithm", a "message authentication code algorithm", "digital signature certificates", and "CA certification", and thereby effectively solves the problems of interception, tampering, and forgery during message transmission.

TLS encryption process:

(1) The asymmetric encryption and key exchange algorithms ensure that the key used to encrypt messages is not cracked or leaked.

(2) The symmetric encryption algorithm encrypts the message and the message authentication code detects modification, so intercepted business data cannot be read or silently tampered with.

(3) Digital signatures and CA certification verify that the public key really belongs to the certificate holder, which prevents forgery of the server's identity.

Compared with a plain TCP connection, which needs only the three-way handshake, TLS also requires algorithm negotiation, key exchange, certificate verification, and so on, so the handshake costs 1-2 additional RTTs (Round-Trip Times). TLS therefore adds some overhead in connection setup and transmission performance.

In response to this problem, TLS 1.3 has been optimized to support 1-RTT and even 0-RTT handshakes, which greatly reduces the extra cost of TLS. TLS 1.3 was finalized in August 2018 (RFC 8446), so large-scale adoption will take some time. WeChat, for example, implemented its "MMTLS protocol based on TLS 1.3" years earlier, while TLS 1.3 was still a draft, to protect message transmission.
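On the client side, the certificate verification and protocol version choices described above can be expressed with Python's standard ssl module. The sketch below is one possible configuration, not a description of how any particular IM product is set up; the function name make_tls_client_context is an assumption.

```python
import ssl


def make_tls_client_context() -> ssl.SSLContext:
    """Client-side TLS settings: verify the server certificate and host name."""
    # create_default_context() loads the system CA certificates and enables
    # certificate and host name verification for client connections.
    ctx = ssl.create_default_context()
    # Refuse protocol versions older than TLS 1.2. TLS 1.3 is negotiated
    # automatically when both sides support it.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

A client would then wrap its TCP socket with ctx.wrap_socket(sock, server_hostname=host), where host is the access domain name, so that a forged server without a valid certificate for that name is rejected during the handshake.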

How to ensure the security of message storage

Because of business needs such as message roaming and offline messaging, most instant messaging services store messages in the IM server's database and keep them for a certain period of time. For private message content and user privacy data, an illegal query by insiders or a bulk dump of the database by attackers (being "dragged to the library") can lead to the disclosure of private information.

Solution:

1. Account and password storage security: "one-way hash" algorithm

Account passwords are generally stored using a "high-strength one-way hash algorithm" together with a "salt" that is unique to each account (the "salt" here is a long random string). Note that MD5 is no longer considered safe for this purpose, and a single pass of a fast hash such as SHA is weak against brute force; deliberately slow password-hashing functions such as bcrypt, scrypt, or PBKDF2 are the better choice. With a "one-way hash", it is difficult to derive the plaintext password from the stored value without brute force, and "salting" makes reverse cracking harder still. Of course, if a hacker obtains both the stored hash and the "salt", these methods only raise the cost of cracking and cannot fully guarantee the security of the password. Comprehensive prevention and control is therefore still needed, covering network isolation, database access permissions, and separated storage.
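A minimal sketch of salted one-way hashing with Python's standard library follows. It swaps in PBKDF2-HMAC-SHA256, a deliberately slow construction, rather than a single SHA or MD5 pass; the function names and the iteration count are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the plaintext is never stored."""
    salt = os.urandom(16)  # a fresh random salt for every account
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)
```

Because each account has its own salt, two users with the same password end up with different stored digests, which defeats precomputed lookup tables.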

2. Message content storage security: end-to-end encryption

As for the storage security of message content, anything stored on the server carries a risk of leakage, whether it is stored as plaintext or as ciphertext. The best ways to protect stored message content are therefore:

(1) Apply "end-to-end encryption" (E2EE) to the message content, so that no intermediate link decrypts the message.

(2) The content of the message is not stored on the server.

With "end-to-end encryption", no intermediate link other than the sender and the receiver can obtain the original content of a message; even the developers cannot "crack" the data. This is arguably the most reassuring form of protection. Many chat applications, such as WhatsApp and Telegram (in its secret chats), use "end-to-end encryption" to protect message content. However, most domestic instant messaging software, such as QQ and WeChat, has not adopted "end-to-end encryption" because of network security requirements.

"End-to-end encryption" is more secure because it differs from server-side TLS encryption: the two communicating parties each generate a key pair and exchange public keys, while the private keys are stored only locally and are never sent to the IM server. The sender encrypts the message with the recipient's public key, so even if the IM server obtains the encrypted data, it cannot decrypt the message because it does not have the recipient's private key.
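The key handling behind this can be illustrated with a toy Diffie-Hellman key agreement, which is one common way to realize "private keys stay local, only public keys pass through the server". This is a teaching sketch only and not how any named product is implemented: the prime below is far too small for real finite-field Diffie-Hellman, all function names are hypothetical, and a real system should rely on a vetted cryptographic library.

```python
import hashlib
import secrets
from typing import Tuple

# TOY parameters, for illustration only. Real systems use vetted libraries
# with standard curves or groups, not this small prime.
P = 2**255 - 19  # a known prime, used here purely as a convenient modulus
G = 2


def generate_key_pair() -> Tuple[int, int]:
    """Return (private, public). Only the public value is sent to the server."""
    private = secrets.randbelow(P - 3) + 2  # never leaves the device
    public = pow(G, private, P)
    return private, public


def shared_key(my_private: int, their_public: int) -> bytes:
    """Derive the same 32-byte key on both ends without ever transmitting it."""
    secret = pow(their_public, my_private, P)
    return hashlib.sha256(secret.to_bytes(32, "big")).digest()
```

Both parties arrive at the same key by combining their own private key with the other side's public key, while the IM server, which only relays the public values and the ciphertext, has neither private key and so cannot derive it.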

Message content security

Content security mainly refers to identifying message content and controlling its dissemination. For example, malicious links sent to live broadcast rooms or groups through instant messages may lure users who click them to phishing websites. In other cases, the spread of politically sensitive or obscene pictures, videos, and other messages has harmful effects, so such content must be identified, handled, and kept from spreading further. Message content security generally relies on third-party content identification services to guard against "risk content".

Solution:

1. Build a sensitive vocabulary database and screen text content against it.

2. Use image recognition technology to identify and dispose of pornographic images and videos, advertising images, and politically sensitive images.

3. Use "voice to text" and OCR (optical character recognition) to extract text from speech and pictures for further analysis and identification.

4. Use crawler technology to analyze the content behind links and identify "risky external links".
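The text-based parts of the list above (items 1 and 4) can be sketched in a few lines. This is a simplified illustration under stated assumptions: the word list is made up, matching is plain substring search, and the extracted links would still need to be handed to a crawler or a third-party service for actual risk analysis.

```python
import re
from typing import List

# Hypothetical word list; a real system would load this from a database.
SENSITIVE_WORDS = {"free prize", "wire money"}
URL_PATTERN = re.compile(r"https?://[^\s]+", re.IGNORECASE)


def find_sensitive_words(text: str) -> List[str]:
    """Return the sensitive words found in the text, ignoring case."""
    lowered = text.lower()
    return sorted(word for word in SENSITIVE_WORDS if word in lowered)


def extract_links(text: str) -> List[str]:
    """Pull out links so they can be passed on for crawler analysis."""
    return URL_PATTERN.findall(text)
```

For large vocabularies, production systems usually replace the substring loop with a multi-pattern matcher such as a trie or an Aho-Corasick automaton.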

Generally speaking, there are many ways to identify unsafe content, and many mature third-party SaaS services can be integrated for this purpose. What the IM server itself mainly needs to do is establish punishment mechanisms that match the "identification". For example, if an individual in a group is identified as posting pornographic videos or pictures, the system can respond by "banning" that user. If several people in a group post illegal videos, the group can be "prohibited from sending multimedia messages" or "disbanded".
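The linkage between identification and punishment described above could look roughly like the following. It is only a sketch: the class name, the action names, and the threshold of two distinct violators per group are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import List, Tuple

GROUP_VIOLATOR_LIMIT = 2  # assumed threshold of distinct violators per group


class PunishmentPolicy:
    """Map identified violations to escalating actions."""

    def __init__(self) -> None:
        self._violators = defaultdict(set)  # group_id -> set of user_ids

    def on_violation(self, group_id: str, user_id: str) -> List[Tuple[str, str]]:
        """Record a violation and return the actions to take as (action, target)."""
        self._violators[group_id].add(user_id)
        actions = [("ban_user", user_id)]
        # Escalate to the whole group once several distinct members offend.
        if len(self._violators[group_id]) >= GROUP_VIOLATOR_LIMIT:
            actions.append(("disable_multimedia", group_id))
        return actions
```

Counting distinct users rather than raw violations keeps one repeat offender from getting an entire group restricted.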

Summary:

In instant messaging, message security is a core requirement of private social scenarios. In general, security can be evaluated along three dimensions.

1. Message transmission security. "Access entrance security" and "transmission link security" are the key safeguards in Internet-based instant messaging scenarios. For "access entrance security", HttpDNS can be used to counter malicious tampering at the router and hijacking by the operator's LocalDNS; the TLS transport layer encryption protocol is the common means of ensuring that messages are not intercepted, tampered with, or forged in transit.

2. Message storage security. For account passwords, a "high-strength one-way hash algorithm" combined with a "salt" makes the stored passwords harder to reverse. For instant messaging scenarios that pursue maximum security, and where policy allows, the server should avoid storing message content as far as possible and adopt "end-to-end encryption" for stronger protection of messages.

3. Message content security. Message content can be screened with methods such as a "sensitive vocabulary", "picture recognition", "OCR and voice-to-text", and "crawler analysis of external links", combined with "linked punishment and disposal", to close the loop from risk identification to handling.

Origin blog.csdn.net/madongyu1259892936/article/details/106071631