Principles of Web-Based Instant Messaging Technology

Because of browser compatibility constraints and the browser's inherent communication model, in which the client sends a request and the server processes it and responds, building a web IM application that works well across browsers requires combining many technologies. This article discusses these technologies in detail and analyzes their principles and processes.

 

Traditional Web Communication Principles

As a thin client, the browser cannot use system calls to communicate directly with a browser on another remote machine. Desktop applications work differently: they can usually open a socket and establish a TCP connection with a process on the remote host, which gives them full-duplex instant communication.

Since its birth, the browser has followed the pattern in which the client sends a request and the server returns the result, and this pattern has not changed as browsers have evolved. Therefore, communication between two clients must be forwarded through a server. For example, if A wants to communicate with B, A first sends the message to the IM application server, and the server forwards it to B according to the receiver named in A's message. The same is true from B to A.
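The forwarding step can be illustrated with a minimal in-memory sketch. The names MessageRelay, Envelope, and DeliverFn are hypothetical and not taken from any real IM server; the sketch also assumes one connected browser per user.

```typescript
// Hypothetical types for illustration only.
interface Envelope {
  from: string;
  to: string; // the receiver carried in the sender's message
  text: string;
}

type DeliverFn = (message: Envelope) => void;

class MessageRelay {
  // One delivery channel per connected user (assumption: one browser per user).
  private channels = new Map<string, DeliverFn>();

  connect(userId: string, deliver: DeliverFn): void {
    this.channels.set(userId, deliver);
  }

  disconnect(userId: string): void {
    this.channels.delete(userId);
  }

  // Forward the message to the receiver named in it.
  // Returns false when the receiver is not connected.
  forward(message: Envelope): boolean {
    const deliver = this.channels.get(message.to);
    if (!deliver) return false; // a real server might queue the message instead
    deliver(message);
    return true;
  }
}

// A sends to B through the relay; B never talks to A directly.
const relay = new MessageRelay();
const inboxB: Envelope[] = [];
relay.connect("B", (m) => {
  inboxB.push(m);
});
relay.forward({ from: "A", to: "B", text: "hello" });
```

How the server actually reaches B's browser (the deliver callback here) is the hard part, and it is what the rest of the article is about.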

Problems to solve when implementing IM with traditional communication methods

We have seen that a web-based IM implementation still requires the browser to send requests to the server. Developing IM software this way means solving the following three problems:

    Full-duplex communication:
    the browser can pull data from the server, and the server can push data to the browser;
    Low latency:
    a message sent from browser A to B must be forwarded to B quickly through the server, and a message from B must reach A just as quickly. In effect, every browser must be able to request data from the server quickly, and the server must be able to push data to the browser quickly;
    Cross-domain support:
    the client browser and the server are usually at different locations on the network. By default, the browser does not let scripts directly access a server under a different domain name. The restriction applies even when the IP address is the same but the domain name differs, or when the domain name is the same but the port differs. It exists mainly for security reasons.
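The cross-domain rule in the last item can be sketched as a small check. The function name isSameOrigin and the example.com URLs are made up for illustration; the sketch relies on the standard URL class, whose origin property combines scheme, host, and port.

```typescript
// Two URLs belong to the same origin only when scheme, host, and port all match.
function isSameOrigin(a: string, b: string): boolean {
  return new URL(a).origin === new URL(b).origin;
}

// Same host, different port: treated as cross-domain.
const differentPort = isSameOrigin(
  "http://im.example.com/chat",
  "http://im.example.com:8080/api"
);

// Different domain name: cross-domain, even if both names resolve to one IP.
const differentHost = isSameOrigin(
  "http://im.example.com/chat",
  "http://api.example.com/chat"
);

// Same scheme, host, and (default) port: same origin.
const same = isSameOrigin(
  "http://im.example.com/chat",
  "http://im.example.com:80/history"
);
```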


Instant Messaging Network note: cross-domain access from the browser raises security issues. One related attack is CSRF; see the following excerpt.

CSRF (cross-site request forgery), also known as one-click attack or session riding, is abbreviated as CSRF or XSRF.

You can understand a CSRF attack this way: an attacker borrows your identity and sends malicious requests in your name. With CSRF, an attacker can send emails and messages in your name, steal your account, and even buy goods or transfer virtual currency. The resulting problems include leakage of personal privacy and threats to property security.

The CSRF attack method was described by security researchers abroad as early as 2000, but it did not receive attention in China until 2006. In 2008, CSRF vulnerabilities were exposed in many large communities and interactive websites at home and abroad, such as NYTimes.com (The New York Times), Metafilter (a large blog site), YouTube, and Baidu HI. Even now, many sites on the Internet remain defenseless against it, so the security industry calls CSRF a "sleeping giant".

Full-duplex low-latency solution

Polling is the simplest solution. The client sends an Ajax request to the server at short, fixed intervals, the server returns the latest data, and the client updates the interface according to the data it receives. In this way, instant communication is achieved indirectly. The advantage is simplicity; the disadvantage is that it puts heavy pressure on the server and wastes bandwidth, because most of the time the data has not changed.
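A minimal sketch of this exchange follows. PollStore, startPolling, and the fetchSince parameter are hypothetical names; in a browser, fetchSince would typically wrap an Ajax call, which is left out here so the sketch stays self-contained.

```typescript
interface ChatMessage {
  id: number;
  from: string;
  text: string;
}

// Server side (in-memory stand-in): every poll is answered immediately,
// even when there is nothing new to return.
class PollStore {
  private messages: ChatMessage[] = [];
  private nextId = 1;

  append(from: string, text: string): ChatMessage {
    const message = { id: this.nextId++, from, text };
    this.messages.push(message);
    return message;
  }

  // Return everything newer than the client's cursor (possibly nothing).
  since(cursor: number): ChatMessage[] {
    return this.messages.filter((m) => m.id > cursor);
  }
}

// Client side: ask again on a fixed interval, whether or not data changed.
// Returns a function that stops the polling.
function startPolling(
  fetchSince: (cursor: number) => Promise<ChatMessage[]>,
  onMessages: (batch: ChatMessage[]) => void,
  intervalMs: number
): () => void {
  let cursor = 0;
  let busy = false; // skip a tick if the previous request is still in flight
  const timer = setInterval(async () => {
    if (busy) return;
    busy = true;
    try {
      const batch = await fetchSince(cursor);
      if (batch.length > 0) {
        cursor = batch[batch.length - 1].id;
        onMessages(batch);
      }
    } finally {
      busy = false;
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Most calls to since() return an empty array, which is exactly the wasted traffic described above.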

 

Long-polling

In the polling solution above, the client sends a request every time, the server responds whether or not the data has changed, and the connection is closed once the request completes. Much of this traffic is unnecessary, which is why long-polling exists. With long-polling, the client sends a request and the server checks whether the requested data has changed (whether there is newer data). If it has, the server responds immediately. Otherwise, the server holds the connection open and checks for new data periodically until either the data is updated or the connection times out. As soon as the client's connection ends, the client sends the request again. This greatly reduces the number of requests the client makes to the server in a given period.
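The hold-until-data-or-timeout behaviour can be sketched as follows. LongPollHub and its method names are hypothetical. One deliberate simplification: instead of checking for new data periodically as described above, this sketch wakes held requests the moment publish() is called, and it assumes every waiting client should receive every message.

```typescript
type Respond = (messages: string[]) => void;

class LongPollHub {
  private queued: string[] = [];
  private waiting = new Map<Respond, ReturnType<typeof setTimeout>>();

  // Handle one long-poll request from a client.
  wait(respond: Respond, timeoutMs: number): void {
    if (this.queued.length > 0) {
      // New data already exists: respond immediately.
      respond(this.queued.splice(0));
      return;
    }
    // No data yet: hold the request open until publish() or the timeout.
    const timer = setTimeout(() => {
      this.waiting.delete(respond);
      respond([]); // empty response; the client is expected to reconnect
    }, timeoutMs);
    this.waiting.set(respond, timer);
  }

  // New data arrives on the server.
  publish(message: string): void {
    if (this.waiting.size === 0) {
      this.queued.push(message); // nobody is waiting; keep it for the next request
      return;
    }
    // Snapshot and clear first, so a client that reconnects from inside
    // its callback is not dropped.
    const held = Array.from(this.waiting.entries());
    this.waiting.clear();
    for (const [respond, timer] of held) {
      clearTimeout(timer);
      respond([message]);
    }
  }
}
```

A client would call wait() again as soon as its callback fires, which reproduces the reconnect loop described above.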

Communication based on http-stream

In the long-polling technique above, the server holds the request (it does not return a response) in order to keep a long-lived connection with the client, and the client keeps re-issuing requests. Comet also includes a stream-based way of communicating, called http-stream. The client keeps the connection to the server open within a single request, and the server then transmits data to the client continuously, like a data stream, instead of sending everything at once. The difference from polling is that the client sends only one request during the entire communication; the server then keeps the long-lived connection open and uses it to send data back to the client.
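On the client, the main consequence is that data arrives in arbitrary chunks of one still-open response, so complete messages have to be cut out of a growing buffer. The sketch below assumes newline-delimited messages; that framing and the name StreamReader are assumptions for illustration, not something the article prescribes.

```typescript
// Incrementally extracts newline-delimited messages from a response
// that is still being received.
class StreamReader {
  private buffer = "";

  // Feed the next chunk; returns the messages completed by this chunk.
  feed(chunk: string): string[] {
    this.buffer += chunk;
    const pieces = this.buffer.split("\n");
    // The last piece has no terminating newline yet, so keep it buffered.
    this.buffer = pieces.pop() ?? "";
    return pieces.filter((piece) => piece.length > 0);
  }
}

const reader = new StreamReader();
const first = reader.feed("hel");       // nothing complete yet
const second = reader.feed("lo\nwor");  // completes "hello"
const third = reader.feed("ld\n");      // completes "world"
```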

This scheme is divided into several different data stream transmission methods.

SSE (Server-sent Events)

To address the fact that the browser can only send data to the server in one direction, HTML5 provides a technology called Server-Sent Events, or SSE (for a detailed introduction, see "SSE Technology Detailed Explanation: A Brand New HTML5 Server Push Event Technology"). The client sends a request to the server, the server then uses the connection established with the client to push data to it, and the client receives and processes the data. Taken on its own, SSE only provides one-way push from the server to the browser, but combined with the browser's ordinary requests it gives two-way communication between client and server in practice. It works by constructing an EventSource object on the client side. The object has a readyState attribute with the following values:

0: connecting to the server;
1: the connection is open;
2: the connection is closed.

The EventSource object also maintains a long-lived connection with the server and reconnects automatically after a disconnection. To force the connection closed, call its close method. The object can listen for the onmessage event: the server sends data to the client in the SSE data format, and the client receives the data when the onmessage event fires and can then process it.
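The data format the server has to follow is line-based text. The sketch below builds one event in that format; formatSseEvent and SseEvent are hypothetical helper names, and the browser-side usage is shown only in comments because EventSource is a browser API (the /events URL is made up).

```typescript
interface SseEvent {
  data: string;
  event?: string; // optional event name; unnamed events trigger onmessage
  id?: string;    // optional event id
}

// Serialize one event: "field: value" lines, terminated by a blank line.
// Multi-line data is sent as several "data:" lines.
function formatSseEvent(e: SseEvent): string {
  let out = "";
  if (e.event !== undefined) out += `event: ${e.event}\n`;
  if (e.id !== undefined) out += `id: ${e.id}\n`;
  for (const line of e.data.split("\n")) out += `data: ${line}\n`;
  return out + "\n";
}

// The readyState values listed above, by name.
const READY_STATE_NAMES = ["CONNECTING", "OPEN", "CLOSED"] as const;

// Browser-side usage (not executed here):
//   const source = new EventSource("/events");
//   source.onmessage = (e) => console.log(e.data);
//   source.close(); // force the connection closed

const simple = formatSseEvent({ data: "hi" });
const named = formatSseEvent({ event: "chat", data: "line one\nline two" });
```

The server would write such strings to a response whose Content-Type is text/event-stream and keep that response open.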

Cross-domain solutions

Space does not permit an introduction to what cross-domain means. Many detailed articles are available on the Internet, so only the solutions are listed here.

XHR-based CORS (Cross-Origin Resource Sharing)

CORS (Cross-Origin Resource Sharing) is a technology that allows browser scripts to send requests to servers under different domain names. It is based on native XHR requests: when XHR calls the open method, the address simply points to a cross-domain address. On the server side, setting the 'Access-Control-Allow-Origin': '*' response header tells the browser that the server permits pages from other origins to read the response. The browser then lifts the usual cross-domain restriction, so the exchange is no different from ordinary XHR communication. The main advantage of this method is that the client code does not need to be modified, and the server only needs to add the 'Access-Control-Allow-Origin': '*' header. It applies to non-IE browsers such as Firefox, Safari, Opera, and Chrome. Compared with same-domain XHR, cross-domain XHR has some limitations, which are required for security. By default, the main limitations are as follows:

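    clients cannot use setRequestHeader to set custom headers unless the server approves them in a preflight response (Access-Control-Allow-Headers);
    cookies are not sent or received unless the request sets withCredentials and the server responds with Access-Control-Allow-Credentials: true, which cannot be combined with the '*' wildcard origin;
    getAllResponseHeaders() exposes only a small safelisted set of response headers unless the server lists more in Access-Control-Expose-Headers.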


All of the above measures are for security reasons, preventing common cross-site scripting attacks (XSS) and cross-site request forgery (CSRF).
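The server-side decision can be sketched as a pure function that picks the response header. corsHeaders and the example origins are hypothetical; the sketch shows both the '*' wildcard from the text and a stricter allowlist variant, which is an addition for illustration.

```typescript
// Decide which CORS response headers to send for a request.
// allowedOrigins is either the wildcard "*" or an explicit allowlist.
function corsHeaders(
  requestOrigin: string | undefined,
  allowedOrigins: string[] | "*"
): Record<string, string> {
  if (allowedOrigins === "*") {
    // Any origin may read the response.
    return { "Access-Control-Allow-Origin": "*" };
  }
  if (requestOrigin !== undefined && allowedOrigins.indexOf(requestOrigin) !== -1) {
    // Echo the specific origin; Vary tells caches the response depends on Origin.
    return { "Access-Control-Allow-Origin": requestOrigin, Vary: "Origin" };
  }
  // No CORS header: the browser will block the script from reading the response.
  return {};
}

const open = corsHeaders("http://chat.example.com", "*");
const listed = corsHeaders("http://chat.example.com", ["http://chat.example.com"]);
const rejected = corsHeaders("http://other.example.net", ["http://chat.example.com"]);
```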

XDR-based CORS

Older versions of IE do not support cross-domain requests through the native XHR object. Instead, IE8-10 implement an XDomainRequest (XDR) object, which is similar to the XHR object and can send cross-domain requests. It has the following main limitations:

    cookies are not sent with the request or returned with the response;
    only the Content-Type field of the request header can be set;
    the response headers cannot be accessed;
    only GET and POST requests are supported;
    only IE8-IE10 are supported.

JSONP-based cross-domain

This method does not require the server to add the Access-Control-Allow-Origin header. It exploits the fact that the script tag on an HTML page is not subject to cross-domain restrictions: the tag's src attribute points to the server address, and the HTTP request is sent through the script tag. When the server receives the request, it returns its data wrapped in a call to a client-side JS function. The principle is similar to that of iframe streaming. When the browser receives the returned script, it parses and executes the call, which updates the interface.
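Both halves of the exchange can be sketched briefly. jsonpUrl, jsonpResponse, the callback query parameter, and the example URL are all hypothetical conventions chosen for illustration; the callback-name check is an added precaution so that arbitrary script cannot be injected through the name.

```typescript
// Client side: build the URL that the script tag's src will point to.
function jsonpUrl(base: string, callbackName: string): string {
  const url = new URL(base);
  url.searchParams.set("callback", callbackName);
  return url.toString();
}

// Server side: wrap the data in a call to the client's function.
function jsonpResponse(callbackName: string, data: unknown): string {
  // Accept only a plain identifier as the callback name.
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
    throw new Error("invalid callback name");
  }
  return `${callbackName}(${JSON.stringify(data)});`;
}

// Browser-side usage (not executed here):
//   window.onData = (payload) => render(payload);   // render is hypothetical
//   const tag = document.createElement("script");
//   tag.src = jsonpUrl("http://im.example.com/messages", "onData");
//   document.head.appendChild(tag);

const requestUrl = jsonpUrl("http://im.example.com/messages", "onData");
const responseBody = jsonpResponse("onData", { text: "hi" });
```

Because the request goes through a script tag, only GET requests are possible with this approach.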

WebSocket

All of the above solutions are workarounds built on either the one-way request from the browser to the server or the one-way push of data from the server to the browser. To strengthen the capabilities of the web, HTML5 provides WebSocket. It is not only a web communication method but also an application layer protocol. It provides native, full-duplex, cross-domain communication between the browser and the server. By establishing a WebSocket connection (which runs over a TCP connection) between the browser and the server, data can be sent from client to server and from server to client at the same time. To understand the principle of this technology, before looking at any code, you need to understand the entire working process of WebSocket.

First, the client creates a new WebSocket object, which sends an HTTP request to the server. The server recognizes that this is a WebSocket request, agrees to the protocol switch, and sends back a response with status code 101. This process is called the handshake. After the handshake, the TCP connection between the client and the server stays open, and on it the server and the client can communicate in both directions. From this point on, the application layer protocol is ws or wss, which is no longer HTTP. The ws protocol requires the client and the server to send data packets (frames) in a defined format so that the other party can understand them.
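The server's side of the handshake can be sketched as below. The function names and the lower-cased header map are assumptions for illustration. The accept-key computation (SHA-1 of the client's key plus a fixed GUID, base64-encoded) is defined by the WebSocket protocol, RFC 6455; the article itself does not go into this detail. The sketch uses Node's crypto module.

```typescript
import { createHash } from "node:crypto";

// Fixed GUID defined by RFC 6455 for computing the accept key.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key.
function acceptKey(clientKey: string): string {
  return createHash("sha1").update(clientKey + WS_GUID).digest("base64");
}

// Build the 101 response for an upgrade request, or return null when the
// request is not a WebSocket handshake.
// Assumption: header names have already been lower-cased by the caller.
function handshakeResponse(
  headers: Record<string, string | undefined>
): string | null {
  const upgrade = (headers["upgrade"] ?? "").toLowerCase();
  const key = headers["sec-websocket-key"];
  if (upgrade !== "websocket" || !key) return null;
  return [
    "HTTP/1.1 101 Switching Protocols",
    "Upgrade: websocket",
    "Connection: Upgrade",
    `Sec-WebSocket-Accept: ${acceptKey(key)}`,
    "",
    "",
  ].join("\r\n");
}

// Sample key taken from RFC 6455.
const sampleResponse = handshakeResponse({
  upgrade: "websocket",
  connection: "Upgrade",
  "sec-websocket-key": "dGhlIHNhbXBsZSBub25jZQ==",
});
```

After this response is written, the same TCP connection carries WebSocket frames instead of HTTP messages.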


Origin blog.csdn.net/wecloud1314/article/details/126483503