Dewu Customer Service One-Stop Workbench: The Road to Lag Optimization

Original | Dewu Technology - Aftermath

1. Background

The customer service one-stop workbench has four functional modules: online chat, telephone, work orders, and tools. Many shared modules, such as work order details and order details, are embedded as iframes, which slows down system loading. In addition, the online messaging module depends heavily on the third-party tinode SDK. Many methods call the API provided by tinode directly and so inherit a number of tinode's unreasonable design choices. Since tinode was adopted, limited iteration resources have meant that its source code was never optimized or improved. After the message communication mode was changed to broadcast, the problem of sessions freezing was exposed.

After reading the message link module in the tinode source code, I found a lot of room for optimization. This article describes the specific optimizations made to the message link.

2. Find the problem

2.1 Defects in message data processing

Reading the source code of the third-party tinode SDK shows that the "receive" and "send" message links on the customer service side have a lot of room for optimization. In the original logic, sending a message first renders the page quickly and then, once tinode returns its response, refreshes and renders the page again. When the customer service receives a message, the entire message list is refreshed as well. Deserialization, sorting, deduplication, status processing, and so on each require multiple loops. With the communication mode changed to broadcast, looping over this much data is a serious performance challenge.

Customer service "receive" and "send" message link overview (1)

Overview of the "receive" and "send" message links of customer service (2)

The red area in the figure contains many for loops and is the most time-consuming part. The reason is that fetching the conversation records between the user and the customer service (the method provided by the original tinode, topic.message(), is executed n times), deserialization, session state processing, sorting, and deduplication each traverse all chat messages, with deserialization being the most expensive step. In addition, JavaScript is single-threaded, so too many traversals block the thread. As a result, when the customer service switches sessions quickly, the loops have not finished and the page has not rendered, which shows up as the interface freezing.

3. Optimization ideas

Each time a user comes in from the client and is routed to an agent's workbench, a session id (sessionId) is generated, and every message in the manual (human agent) conversation under that session has a message id (msgid). A conversation between the customer service and the user involves many rounds of messages. To cut down the repeated loops in the old code that hurt performance, the core task is to avoid traversing the chat message data wherever possible, because there are too many messages. Following this principle, the "deduplication" and "sorting" logic was rewritten so that it no longer traverses the chat messages. The session id and message id mentioned above play a key role here.

3.1 Deduplication

This optimization maintains a global Map data structure, msgidCacheMaps. It has two dimensions, sessionId and msgid, and stores the msgid of every message in the current session (sessionId). In a conversation, a message sent by the human customer service goes through two stages, from virtual message to real message. A virtual message is created as follows: after the customer service sends a message to the gateway during a manual conversation, a virtual message is generated so that the message can be shown in the chat area immediately. Its seq, called virtualSeq, is the previous message's seq + 0.002. When the gateway returns the real seq, virtualSeq is replaced with it. In the virtual message stage, the msgid is saved to the Map. Messages pushed by the system have no msgid, skip this process, and are put directly into the session pool. In the real message stage (tinode returns the seq), msgidCacheMaps is queried by msgid. If the msgid exists, the message is a duplicate, and only its seq needs to be replaced with the real one.
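The two-stage flow described above can be sketched in TypeScript roughly as follows. This is only a minimal illustration under assumptions: the ChatMessage shape, the function names createVirtualMessage and acceptIncoming, and the plain array standing in for the session pool are hypothetical. Only msgidCacheMaps, sessionId, msgid, seq, virtualSeq, and the 0.002 offset come from the text.

```typescript
// Hypothetical message shape; field names other than sessionId, msgid and seq are assumptions.
interface ChatMessage {
  sessionId: string;
  msgid?: string;     // system-pushed messages carry no msgid
  seq: number;        // holds virtualSeq until the real seq arrives
  content: string;
  isVirtual: boolean;
}

// Two dimensions: sessionId -> (msgid -> message).
const msgidCacheMaps = new Map<string, Map<string, ChatMessage>>();

const VIRTUAL_SEQ_STEP = 0.002; // offset named in the article

// Virtual message stage: build the message shown immediately and remember its msgid.
function createVirtualMessage(
  sessionId: string,
  msgid: string,
  content: string,
  prevSeq: number,
): ChatMessage {
  const msg: ChatMessage = {
    sessionId,
    msgid,
    content,
    seq: prevSeq + VIRTUAL_SEQ_STEP, // virtualSeq
    isVirtual: true,
  };
  let bySession = msgidCacheMaps.get(sessionId);
  if (!bySession) {
    bySession = new Map<string, ChatMessage>();
    msgidCacheMaps.set(sessionId, bySession);
  }
  bySession.set(msgid, msg);
  return msg;
}

// Real message stage: a known msgid means a duplicate, so only the seq is replaced.
// Returns true when the incoming message was a duplicate of a virtual message.
function acceptIncoming(
  incoming: { sessionId: string; msgid?: string; seq: number; content: string },
  sessionPool: ChatMessage[],
): boolean {
  const bySession = msgidCacheMaps.get(incoming.sessionId);
  const cached =
    bySession && incoming.msgid ? bySession.get(incoming.msgid) : undefined;
  if (cached) {
    cached.seq = incoming.seq; // swap virtualSeq for the real seq
    cached.isVirtual = false;
    return true;
  }
  // No msgid (system push) or unknown msgid: goes straight into the session pool.
  sessionPool.push({ ...incoming, isVirtual: false });
  return false;
}
```

The point of the lookup is that a duplicate is detected with one Map access per message instead of a pass over the whole chat history.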

3.2 Sorting

For sorting, this optimization uses binary-search insertion and maintains a global Map data structure, seqCacheMaps. Like the deduplication structure above, it has two dimensions, sessionId and seq. The binary search uses seq (the real seq) and virtualSeq (the virtual seq) as its keys: each time a message arrives, the position where its seq can be inserted is found quickly by bisection. In the virtual message stage the message is inserted directly. In the real message stage (msgidCacheMaps already contains this msgid) the seq is replaced directly. There is one complication. During a manual conversation, every message the customer service sends to the user is checked for sensitive words at the gateway, and it is delivered to the client and shown to the user only if no sensitive word is triggered. When a sensitive word is triggered, the gateway intercepts the message, the message never reaches the user, and the gateway does not return a seq. In that case, at the tinode return stage, the earlier virtualSeq is replaced with the previous message's seq + 0.002, so the message keeps an ordered position and the chat area stays in order.
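A minimal TypeScript sketch of binary-search insertion keyed by seq might look like the following. The SeqMessage shape and the function names findInsertIndex, insertBySeq, and replaceVirtualSeq are hypothetical, and storing each session as a sorted array inside seqCacheMaps is an assumption about the layout. The article says the virtual seq is replaced directly; this sketch removes and reinserts the message so the order holds even if the real seq lands elsewhere.

```typescript
// Hypothetical message shape; only seq and msgid are named in the article.
interface SeqMessage {
  seq: number;      // real seq or virtualSeq
  msgid?: string;
  content: string;
}

// Assumed layout: sessionId -> messages kept sorted by seq, ascending.
const seqCacheMaps = new Map<string, SeqMessage[]>();

// Lower bound: index of the first element whose seq is >= the given seq.
function findInsertIndex(list: SeqMessage[], seq: number): number {
  let lo = 0;
  let hi = list.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (list[mid].seq < seq) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Virtual message stage (and any newly arrived message): insert at the found position.
function insertBySeq(sessionId: string, msg: SeqMessage): void {
  let list = seqCacheMaps.get(sessionId);
  if (!list) {
    list = [];
    seqCacheMaps.set(sessionId, list);
  }
  list.splice(findInsertIndex(list, msg.seq), 0, msg);
}

// Real message stage: locate the virtual entry by bisection and give it the real seq.
// Returns false when no entry with that virtualSeq exists.
function replaceVirtualSeq(
  sessionId: string,
  virtualSeq: number,
  realSeq: number,
): boolean {
  const list = seqCacheMaps.get(sessionId);
  if (!list) return false;
  const i = findInsertIndex(list, virtualSeq);
  if (i >= list.length || list[i].seq !== virtualSeq) return false;
  const [msg] = list.splice(i, 1);
  msg.seq = realSeq;
  list.splice(findInsertIndex(list, realSeq), 0, msg);
  return true;
}
```

If the gateway intercepts the message and no real seq arrives, replaceVirtualSeq is simply never called in this sketch, and the message stays at its virtualSeq position right after the previous message.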

"Deduplication" and "Sort" overview

3.3 Cache recycling (end session destruction)

As described under deduplication and sorting above, two global data stores (the msgidCacheMaps and seqCacheMaps Map data structures) are maintained to reduce the number of traversals. However, each customer service agent handles 100+ sessions a day, and each conversation contains roughly 40+ messages back and forth, which already comes to 4,000+ cached messages per agent per day. If the agent also views historical messages, each page adds another 20. Saving without ever deleting would leave a fairly large amount of data in memory and could easily lead to memory exhaustion. So when should the data be deleted? Based on the business situation, the final choice was to destroy the globally mounted hash maps and release the memory when the session ends, when the session is transferred, or when the agent is pushed offline.
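The release rule above could be sketched in TypeScript like this. The event names in SessionEvent, the function handleSessionEvent, and the value types stored in the two maps are assumptions made for illustration. Only the two map names and the three release triggers come from the text.

```typescript
// Hypothetical event names for the three release triggers plus an ordinary event.
type SessionEvent = "message" | "ended" | "transferred" | "pushedOffline";

// Assumed value types: msgid -> seq, and a sorted list of seq values per session.
const msgidCacheMaps = new Map<string, Map<string, number>>();
const seqCacheMaps = new Map<string, number[]>();

// Events after which a session's cached data is no longer needed.
const RELEASE_EVENTS: ReadonlySet<SessionEvent> = new Set<SessionEvent>([
  "ended",
  "transferred",
  "pushedOffline",
]);

// Drops both caches for the session on a release event.
// Returns true when the caches were released, false for any other event.
function handleSessionEvent(sessionId: string, event: SessionEvent): boolean {
  if (!RELEASE_EVENTS.has(event)) return false;
  msgidCacheMaps.delete(sessionId); // the inner Map becomes unreachable and can be collected
  seqCacheMaps.delete(sessionId);
  return true;
}
```

Keying both caches by sessionId is what makes the cleanup cheap: releasing a finished session is two Map deletions, regardless of how many messages it held.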

Overview of Data "Storage" and "Deletion"

3.4 Message Status

Message status here means: read, unread, received, sending, failed to send, and so on.

While the customer service and the user are communicating, the message status shown on both the customer service side and the user side is updated in real time. When the customer service sends a message and the user reads it, the info protocol (a push message notification) is returned to tell the h5 side that the message has been read, and the h5 side then updates the message's status.

  • Original processing method: After the customer service sent a message to the user, all of the user's historical messages in the current session were traversed and every status was reset. With many messages between the user and the customer service, the number of traversals became large and caused serious performance overhead.
  • Optimization plan: First filter out historical messages and messages not sent by the customer service, locate the message by binary search, and change its status directly. When a message from the user is received, the customer service messages in messagePools (all conversations of the current user) are updated to read in reverse order: if the user has sent a message, the messages the customer service sent earlier must have been read. There is no need to traverse every message and set its status as the old logic did, which wasted performance. Everything except messages that are still sending or failed to send is rendered as read.
    • Sending a message: Sending a message now updates the view only twice. First the message is shown on the chat page immediately and then sent (wss). When the ack is received, the message status is updated a second time: the message to update is looked up by msgid, so the full traversal through the topic.message method provided by tinode is no longer needed;
    • Receiving messages: A message update is triggered only when the customer service receives a user message. The current user's full data no longer has to be traversed to apply the new status, and an ack is returned as well;
    • Specific implementation: The client pushes a note event over the long connection to tell H5 that the message has been read. The H5 side records the seq of the read message and updates the status of every customer service message whose seq is less than or equal to that seq, that is: recv (received) => read (read).
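The read-receipt update in the last bullet could be sketched as follows in TypeScript. The PoolMessage shape, the status names other than recv and read, and the function markReadUpTo are hypothetical. The early stop at the first message that is already read is an added assumption: it only holds if read receipts arrive with non-decreasing seq values.

```typescript
// Hypothetical status names; the article names recv and read explicitly.
type MessageStatus = "sending" | "failed" | "recv" | "read";

interface PoolMessage {
  seq: number;
  fromAgent: boolean;   // true for messages sent by the customer service
  status: MessageStatus;
}

// messagePool must be sorted by seq, ascending.
// Marks agent messages with seq <= readSeq as read and returns how many changed.
function markReadUpTo(messagePool: PoolMessage[], readSeq: number): number {
  // Upper bound by bisection: first index whose seq is greater than readSeq.
  let lo = 0;
  let hi = messagePool.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (messagePool[mid].seq <= readSeq) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }

  let updated = 0;
  // Walk backwards from the newest covered message.
  for (let i = lo - 1; i >= 0; i--) {
    const msg = messagePool[i];
    if (!msg.fromAgent) continue;      // user messages carry no read status here
    if (msg.status === "read") break;  // assumption: everything older was handled earlier
    if (msg.status === "recv") {
      msg.status = "read";             // recv => read
      updated++;
    }
    // "sending" and "failed" messages are left untouched.
  }
  return updated;
}
```

With this shape, a receipt normally touches only the messages sent since the previous receipt, rather than the whole conversation.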

3.5 Sensitive word interception processing

After the user enters the IM chat page, the messages exchanged between the user and the agent are checked online for sensitive words (monitoring and blocking of sending only).

  • Original solution: After editing a message, the agent clicked send and the back-end sensitive word interface was called. The message could be sent only after the check passed without triggering a sensitive word. If the network fluctuated and the interface responded slowly, the customer service felt that sending a message was stuck.
  • Optimization scheme: Interception is done at the gateway. When the customer service sends a message, it is rendered to the chat area directly, and the gateway checks whether it triggers a sensitive word. If it does, the gateway returns a status to h5, and h5 changes the message state according to the result and prompts the customer service.
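The optimized flow, render first and let the gateway verdict arrive later, can be sketched in TypeScript as below. Everything named here is hypothetical: the ChatArea class, the GatewayResult shape with its sensitive flag, the status names, and the transport callback standing in for the wss send. The article does not describe the actual gateway payload.

```typescript
// Hypothetical status names for an outgoing message.
type SendStatus = "sending" | "sent" | "blocked";

interface OutgoingMessage {
  msgid: string;
  content: string;
  status: SendStatus;
}

// Hypothetical gateway reply; the real payload is not described in the article.
interface GatewayResult {
  msgid: string;
  sensitive: boolean; // true when a sensitive word was triggered
}

class ChatArea {
  // msgid -> message currently shown in the chat area
  private rendered = new Map<string, OutgoingMessage>();

  // Render immediately, then hand the message to the transport (for example a wss send).
  send(
    msgid: string,
    content: string,
    transport: (msg: OutgoingMessage) => void,
  ): OutgoingMessage {
    const msg: OutgoingMessage = { msgid, content, status: "sending" };
    this.rendered.set(msgid, msg); // shown before any check has completed
    transport(msg);
    return msg;
  }

  // Apply the gateway verdict when it arrives; no blocking call before sending.
  onGatewayResult(result: GatewayResult): OutgoingMessage | undefined {
    const msg = this.rendered.get(result.msgid);
    if (!msg) return undefined;
    msg.status = result.sensitive ? "blocked" : "sent";
    return msg;
  }
}
```

The design choice is that the agent never waits on the sensitive word check: a slow network delays only the status change, not the appearance of the message.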

Sensitive word logic overview

4. Data comparison before and after optimization

The optimized link was released as a whole on a single day, so the before and after comparison uses the release day as the dividing point, as shown below.

4.1 Before optimization

[Figure: incoming-line metrics before optimization]

As shown in the figure above, two metrics were measured over all incoming sessions in a given period:

  • Average first response time: 8.40 seconds
  • Average response time: 19.9 seconds

4.2 After optimization

[Figure: incoming-line metrics after optimization]

As shown in the figure above, the same two metrics were measured over all incoming sessions in a given period:

  • Average first response time: 6.82 seconds, 1.58 seconds (about 19%) less than before optimization
  • Average response time: 18.22 seconds, 1.68 seconds (about 8%) less than before optimization

5. Summary

IM products generally have very large user counts and activity levels, and traffic peaks easily occur at certain special moments, so the technology has to cope with sudden surges. IM also has four main characteristics: real-time delivery, reliability, consistency, and security. There is still a long way to go in optimizing IM. While keeping the business stable, we will keep polishing these four characteristics so that our own IM SDK becomes more and more complete and sets a benchmark for messaging in the industry.

 

*Text/Aftermath

Follow Dewu Technology for new technical articles every Monday, Wednesday, and Friday at 18:30.
If you found this article helpful, please comment, share, and like it~


Origin my.oschina.net/u/5783135/blog/5550084