Web-based instant messaging in practice: supporting hundreds of thousands of persistent connections on a single machine

This article covers how we use the Play framework and the Akka actor model to manage persistent connections and proactively push events from the server side.

Introduction to SSE (Server-Sent Events) technology

Server-Sent Events (SSE) is a client-server communication technology (for background, see the article "Detailed SSE Technology: A New HTML5 Server Push Event Technology" compiled by the Instant Messaging Network). After the client establishes an ordinary HTTP connection to the server, the server pushes a continuous stream of data over that connection whenever an event occurs, without the client having to issue follow-up requests. The client uses the EventSource interface to keep receiving the events or data chunks the server sends as a text event stream, without ever closing the connection. All modern web browsers support the EventSource interface, and ready-made libraries are available for iOS and Android.
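To make this concrete, here is a minimal sketch of what an SSE endpoint can look like on the server side. It is our own illustrative example, assuming Play 2.6+ with Akka Streams; the route name and tick interval are made up and not taken from our production code. It holds a single HTTP response open and pushes an event every few seconds; on the browser side, new EventSource("/events") with an onmessage handler is all a client needs to consume it.

import javax.inject.Inject
import scala.concurrent.duration._

import akka.NotUsed
import akka.stream.scaladsl.Source
import play.api.http.ContentTypes
import play.api.libs.EventSource
import play.api.mvc._

// A minimal SSE endpoint: the HTTP connection stays open and the server
// pushes the current time to the client every few seconds until the
// client disconnects.
class EventsController @Inject()(cc: ControllerComponents) extends AbstractController(cc) {

  def events: Action[AnyContent] = Action {
    val ticks: Source[String, _] = Source
      .tick(0.seconds, 5.seconds, NotUsed)
      .map(_ => s"server time: ${System.currentTimeMillis()}")

    Ok.chunked(ticks via EventSource.flow).as(ContentTypes.EVENT_STREAM)
  }
}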

 

Stress testing with real production traffic

Every system is ultimately tested by real production traffic, but real production traffic is not easy to replicate, because there are few tools that can simulate it faithfully. So how do we test with real production traffic before deploying to the actual production environment? For this we use a technique called "undercover", which will be discussed in detail in our next article.

To keep this post focused on its own topic, let's assume we can already direct real production load at our server cluster. An effective way to probe the limits of a system is to keep increasing the load on a single node, so that problems which would otherwise only appear when the entire production cluster is under extreme pressure are exposed much earlier.

Using this and other techniques, we discovered several limits in the system. The next few sections describe how, through a handful of simple optimizations, we eventually got a single server to support hundreds of thousands of connections.

Problem 1: The maximum number of pending connections on a socket

In some of our earliest stress tests we kept running into a strange problem: we could not establish many connections simultaneously, and roughly 128 appeared to be the limit. Note that the server could easily handle several thousand concurrent connections, but we could not get more than about 128 new connections accepted at the same time. In a real production environment, this is roughly equivalent to 128 members initiating connections to the same server at the same moment.
After doing some research, we found the following kernel parameter:
    
net.core.somaxconn

This kernel parameter sets the maximum number of TCP connections that can sit in the queue waiting for the application to accept them (the listen backlog). If a new connection request arrives while that queue is full, the request is rejected outright. The default value is 128 on many mainstream operating systems.

After increasing this value in the "/etc/sysctl.conf" file, the "connection refused" problem on our Linux server was resolved.
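For example, the backlog limit can be raised by adding a line like the one below to /etc/sysctl.conf and reloading it with "sysctl -p" (the value 1024 is illustrative, not necessarily what we used):

net.core.somaxconn = 1024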

Note that Netty 4.x and above reads this value from the operating system and uses it directly when initializing the Java ServerSocket. If you also want to configure it at the application level, you can set it in the Play application's configuration like this:
    
play.server.netty.option.backlog=1024

Problem 2: Number of JVM threads

The first time a relatively large amount of production traffic hit these servers, we received alerts within a few hours: the load balancer had started failing to connect to some of the servers. Digging further, we found the following in the server logs:
    
java.lang.OutOfMemoryError: unable to create new native thread

Further investigation revealed the cause: LinkedIn's build of the Play framework had a bug in its support for Netty's idle-timeout mechanism, as a result of which the code created a new HashedWheelTimer instance for every incoming connection. The patch that fixed it explains the cause of the bug very clearly.
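The failure mode is easy to reproduce in miniature. The sketch below is our own simplified illustration, not the actual Play or LinkedIn code: each io.netty.util.HashedWheelTimer owns a dedicated worker thread, so allocating one per connection creates a native thread per connection, while sharing a single instance keeps the thread count constant no matter how many connections arrive.

import java.util.concurrent.TimeUnit

import io.netty.util.{HashedWheelTimer, Timeout, TimerTask}

object IdleTimeouts {
  // Leaky pattern: "new HashedWheelTimer()" inside the per-connection code
  // path starts one worker thread per connection and eventually fails with
  // "unable to create new native thread".

  // Fixed pattern: one shared timer (one worker thread) schedules the
  // idle-timeout tasks for every connection in the process.
  private val sharedTimer = new HashedWheelTimer()

  def scheduleIdleTimeout(connectionId: String, idleSeconds: Long): Timeout =
    sharedTimer.newTimeout(new TimerTask {
      override def run(timeout: Timeout): Unit =
        println(s"closing idle connection $connectionId")
    }, idleSeconds, TimeUnit.SECONDS)
}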

If you also run into JVM thread limit issues, chances are there are some thread leaks in your code that need to be addressed. However, if you find that all your threads are actually doing the work you expect, is there a way to change the system to allow you to create more threads and accept more connections?

As always, the answer is interesting: it comes down to the relationship between the memory available to the JVM and the number of threads it can create. A thread's stack size determines how much memory is statically reserved per thread, so the theoretical maximum number of threads is roughly the size of the process's user address space divided by the thread stack size. In practice, however, the JVM also needs memory for dynamic allocation on the heap. With a few simple experiments using a small Java program, we confirmed that the more memory the heap claims, the less is left for thread stacks, so the thread limit decreases as the heap size increases.

The conclusion is that if you want to increase the thread limit, you can reduce the stack size used by each thread (-Xss) and also reduce the memory allocated to the heap (-Xms, -Xmx).
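Below is a sketch of the kind of probe we mean; it is our own illustrative version rather than the exact program from the experiment. It keeps starting threads that block forever and prints how many it managed to create before hitting an OutOfMemoryError; running it with different -Xss and -Xmx settings shows the trade-off described above.

import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicInteger

// Run with e.g. "-Xss256k -Xmx4g", then again with "-Xss1m -Xmx8g",
// and compare the printed thread counts.
object ThreadLimitProbe {
  def main(args: Array[String]): Unit = {
    val created = new AtomicInteger(0)
    val block   = new CountDownLatch(1) // never released; keeps every thread parked

    try {
      while (true) {
        val t = new Thread(() => block.await())
        t.setDaemon(true) // let the JVM exit once main finishes
        t.start()
        created.incrementAndGet()
      }
    } catch {
      case _: OutOfMemoryError =>
        println(s"created ${created.get()} threads before running out of memory")
    }
  }
}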

Problem 3: Ephemeral port exhaustion

We didn't actually hit this limit ourselves, but it is worth covering here because it is one that people usually run into when trying to support hundreds of thousands of connections on a single server. Whenever the load balancer connects to a server node, it uses an ephemeral port. The port is bound to the connection only for the connection's lifetime, hence the name "ephemeral"; when the connection terminates, the port is released and can be reused. Long-lived connections, however, do not terminate the way ordinary HTTP connections do, so the pool of available ephemeral ports on the load balancer is eventually exhausted. At that point no new connections can be established, because every port number the operating system could use for a new connection is already in use. There are many ways to deal with ephemeral port exhaustion on newer load balancers, but they are beyond the scope of this article.
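For reference, on recent Linux kernels you can inspect the ephemeral port range through a kernel parameter (the default shown below may differ on your system):

$ cat /proc/sys/net/ipv4/ip_local_port_range
32768 60999

With this default range, a little under 30,000 ephemeral ports are available from one source address to a given destination address and port. The range can be widened by setting net.ipv4.ip_local_port_range in /etc/sysctl.conf, although that only buys headroom; it does not solve the underlying problem for connections that never close.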

 

Luckily, each of our load balancers can support up to 250,000 connections. If you do reach this kind of limit, work with the team that manages your load balancers to raise the limit on the number of open connections between the load balancer and your server nodes.

Problem 4: File descriptors

Once we had 16 servers in the data center and could handle a decent amount of production traffic, we decided to test the limit on the number of long-lived connections each server could handle. The test method was to shut down a few servers at a time, so that the load balancer directed more and more traffic to the remaining ones. This testing produced a beautiful graph of the number of file descriptors used by the server process on each machine, which we nicknamed "the caterpillar graph".

File descriptors are abstract handles in Unix-like operating systems that are used to access, among other things, network sockets. Not surprisingly, the more persistent connections a server supports, the more file descriptors it needs. You can see that when only 2 of the 16 servers were left, each of them was using 20,000 file descriptors. When we shut down one of those two as well, we saw the following in the logs of the remaining server:
    
java.net.SocketException: Too many open files

By directing all connections to a single server, we had hit its per-process file descriptor limit. The limit available to a process is shown by the "Max open files" value in the file below.
    
$ cat /proc/<pid>/limits
Max open files 30000
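Keeping the <pid> placeholder, you can also count how many descriptors the process is actually using at any moment by listing its fd directory:

$ ls /proc/<pid>/fd | wc -l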

As in the example below, this limit can be raised to 200,000 simply by adding the following lines to /etc/security/limits.conf:

<process username> soft nofile 200000
<process username> hard nofile 200000

Note that there is also a system-wide file descriptor limit, controlled by a kernel parameter that can be set in /etc/sysctl.conf:
    
fs.file-max
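For example, a line like the following raises the system-wide limit (the value is illustrative; pick something comfortably above the sum of your per-process limits):

fs.file-max = 500000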

Having raised the per-process file descriptor limit on all servers this way, we could now comfortably handle more than 30,000 connections per server.

Problem 5: JVM heap

Next we repeated the process above, but this time, when we had directed about 60,000 connections to the last surviving server of the remaining two, things started to go wrong again. The number of allocated file descriptors, and with it the number of active long-lived connections, dropped sharply, and latency rose to unacceptable levels.

Further investigation showed the reason: we had exhausted the 4GB of JVM heap space. A graph of garbage-collection activity made this clear, with each collection reclaiming less and less heap until it was all used up.

We use TLS for all internal communication of our instant messaging service inside the data center. In practice, each TLS connection consumes about 20KB of memory in the JVM, and this cost grows with the number of active long-lived connections, eventually leading to the out-of-memory state described above.
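As a rough sanity check of those numbers: at about 20KB of heap per connection, 60,000 TLS connections account for roughly 1.2GB, and 200,000 would need roughly 4GB, so connection state alone can swallow a large part of a 4GB heap before the application allocates anything else.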

We increased the JVM heap to 8GB (-Xms8g, -Xmx8g) and re-ran the test, continuously sending more and more connections to one server. This time memory ran out again at around 90,000 connections, and the number of connections began to drop.

What was our next step? Since each of our servers has a rather extravagant 64GB of memory, we simply raised the JVM heap size to 16GB. Since then we have never hit this memory limit in performance testing, and we have successfully handled more than 100,000 concurrent long-lived connections in production. However, as you have seen above, some new limit will appear as the pressure keeps building. What do you think it will be? Memory? CPU? Discussion is welcome.

Origin blog.csdn.net/wecloud1314/article/details/126478818