Evolution of Rainbow Bridge Architecture - Performance

1. Introduction

A year ago, "The Evolution of Rainbow Bridge Architecture" focused on two themes: stability and functionality. Over the past year, despite continued growth in business demand and traffic surging several-fold, Rainbow Bridge has maintained zero faults, a solid initial result. In this installment of the architectural evolution, we share the recent adjustments and optimizations made for performance. The biggest change is switching the threading model of the Proxy-DB layer from BIO to the higher-performance NIO. The details of the transformation and its optimizations are introduced below.

This article takes an estimated 20 to 30 minutes to read, and the material is fairly dense. Before reading, we recommend the previous article on the evolution of the Rainbow Bridge architecture, along with some background knowledge of the MySQL protocol.

2. Architecture before the transformation

First, let’s review the panoramic architecture diagram of the Rainbow Bridge:

Proxy three-layer module

The Proxy can be roughly divided into three layers: Frontend, Core, and Backend:

  • Frontend, the service exposure layer: uses Netty as the server and encodes/decodes received and returned data according to the MySQL protocol.
  • Core, the function and kernel layer: implements core features such as data sharding, read/write splitting, and shadow-database routing through capabilities like parsing, rewriting, and routing.
  • Backend, the underlying DB interaction layer: interacts with the database through JDBC and performs operations such as modifying result-set columns and merging result sets.

Problems in BIO mode

Here the Core layer performs pure computation, while both Frontend and Backend involve I/O. The Frontend layer exposes services via Netty in NIO mode, but the Backend uses the traditional vendor-provided JDBC driver, which is BIO. The overall Proxy architecture is therefore still effectively BIO. In the BIO model, each connection requires a dedicated thread, which has some obvious disadvantages:

  • High resource consumption: each request occupies an independent thread, with considerable per-thread overhead; thread switching and scheduling consume extra CPU.
  • Limited scalability: constrained by the system's thread limit, performance drops sharply under large numbers of concurrent connections.
  • I/O blocking: in the BIO model, read/write operations block, so the thread can do no other work in the meantime and resources are wasted.
  • Complex thread management: thread management and synchronization add development and maintenance difficulty.

Consider the simplest scenario: after JDBC issues a request, the calling thread blocks until the database returns data. When many slow queries occur or the database fails, large numbers of threads block, and eventually an avalanche follows. In the previous architecture-evolution article we mitigated some of these BIO problems, for example using thread-pool isolation to prevent a single blocked database from causing a global avalanche.

However, as the number of logical databases grows, the Proxy's thread count keeps expanding, challenging the system's scalability and throughput. It is therefore necessary to upgrade the existing blocking, JDBC-driver-based connections to NIO (non-blocking I/O) connections to the database.

3. Architecture after the transformation

  • BIO->NIO

To change the overall Proxy architecture from BIO to NIO, the simplest approach is to replace the traditional BIO JDBC driver with an NIO database driver. However, research showed that there are few open-source NIO drivers and essentially no established best practice. After reviewing the earlier investigation by the ShardingSphere community ( https://github.com/apache/shardingsphere/issues/13957 ), we decided to try Vert.x as a replacement for JDBC. The final result was unsatisfactory: because of Vert.x's heavily abstracted architecture, the call stack becomes extremely deep on long links; the throughput gain in the final stress test was under 5%, and there were many compatibility issues. So we started over and decided to develop our own database driver and connection pool.

  • Skip unnecessary encoding and decoding stages

The JDBC driver automatically decodes MySQL's byte data into Java objects, and after some processing (metadata modification, result-set merging) the Proxy re-encodes these result sets and returns them upstream. With a self-developed driver we can control the encode/decode process precisely, forward data that the Proxy does not need to process directly upstream, and skip the meaningless encoding and decoding. Which scenarios allow the Proxy to skip result-set processing is covered later.

Self-developed NIO database driver

The database driver mainly encapsulates the interaction protocol with the DB layer into a high-level API. The following two pictures show some core interfaces of Connection and Statement in the java.sql package.

So first we need to understand how to interact with the database. Taking MySQL as an example and using Netty to connect to it, the simplified interaction flow is as follows.

After establishing a connection to MySQL with Netty, all we have to do is follow the data format specified by the MySQL protocol: authenticate first, then send the specific command packets. Below are the authentication and command-execution flows from the official MySQL documentation:

Next we implement the encoding and decoding handlers according to the MySQL documentation. Let's take a brief look at the implemented code.

  • Decoding

This decodes the data packets returned by MySQL: it parses the payload according to the length field, wraps it into a MySQLPacketPayload, and passes it to the corresponding handler for processing.
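For reference, the packet framing that such a decoder relies on can be shown with a minimal stdlib-only sketch (the class name is ours, not the production handler): every MySQL packet starts with a 3-byte little-endian payload length followed by a 1-byte sequence id.

```java
import java.nio.ByteBuffer;

// Minimal sketch of MySQL packet-header parsing (illustrative class name).
// Header layout: 3-byte little-endian payload length + 1-byte sequence id;
// the payload of that length follows and is dispatched to a command handler.
public class MySQLPacketHeader {
    public final int payloadLength;
    public final int sequenceId;

    private MySQLPacketHeader(int payloadLength, int sequenceId) {
        this.payloadLength = payloadLength;
        this.sequenceId = sequenceId;
    }

    // Reads the 4-byte header from the buffer's current position.
    public static MySQLPacketHeader read(ByteBuffer buf) {
        int len = (buf.get() & 0xFF)
                | ((buf.get() & 0xFF) << 8)
                | ((buf.get() & 0xFF) << 16);
        int seq = buf.get() & 0xFF;
        return new MySQLPacketHeader(len, seq);
    }
}
```

In the real decoder this length also tells the framing layer how many bytes to accumulate before a complete payload can be handed over.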

  • Encoding

This converts specific command classes into concrete MySQL data packets. MySQLPacket here has multiple implementation classes, corresponding one-to-one with MySQL's command types.
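As a concrete illustration (the class name is ours, not the article's code), encoding a COM_QUERY command means prefixing the SQL text with the 0x03 command byte and the packet header described above:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative sketch: encode a COM_QUERY command into MySQL wire format.
// Layout: 3-byte little-endian payload length, 1-byte sequence id,
// then the 0x03 (COM_QUERY) marker followed by the SQL text.
public class ComQueryEncoder {
    public static byte[] encode(String sql) {
        byte[] sqlBytes = sql.getBytes(StandardCharsets.UTF_8);
        int payloadLen = sqlBytes.length + 1; // +1 for the command byte
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(payloadLen & 0xFF);
        out.write((payloadLen >> 8) & 0xFF);
        out.write((payloadLen >> 16) & 0xFF);
        out.write(0x00);  // sequence id: a new command starts at 0
        out.write(0x03);  // COM_QUERY command byte
        out.write(sqlBytes, 0, sqlBytes.length);
        return out.toByteArray();
    }
}
```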

We also need an implementation class similar to java.sql.Connection, responsible for assembling MySQLPacket objects and writing them into the Netty channel, and for parsing the decoded MySQLPacketPayload into a ResultSet.

This looks relatively simple, and the interaction flow is much like traditional JDBC. But because the process is now asynchronous and all responses arrive via callbacks, two difficulties arise:

  • Since MySQL cannot accept a new command before the previous one finishes, how do we serialize commands on a single connection?
  • How do we bind each data packet returned by MySQL to the request that initiated the command?

First, NettyDbConnection introduces ConcurrentLinkedQueue, a lock-free, non-blocking queue.

When sending a command, if no command is in progress it is sent directly; otherwise it is placed in the queue, and completion of the previous command triggers execution of the next one. This guarantees serialization of commands on a single connection.

Second, NettyDbConnection passes in a Promise when executing a command. Once all of MySQL's data packets for that command have been returned, the Promise is completed, binding the response to the request that initiated the command.
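The two mechanisms can be sketched together. This is a conceptual illustration, not the article's actual NettyDbConnection code: CompletableFuture stands in for Netty's Promise, and the class and field names are ours. Commands on one connection queue up in a lock-free ConcurrentLinkedQueue, and each queued command carries the future that its response will eventually complete.

```java
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Conceptual sketch: per-connection command serialization plus
// request/response binding via a future carried with each command.
public class SerializedConnection {
    static final class PendingCommand {
        final String sql;
        final CompletableFuture<String> promise;
        PendingCommand(String sql, CompletableFuture<String> promise) {
            this.sql = sql; this.promise = promise;
        }
    }

    private final Queue<PendingCommand> pending = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean inFlight = new AtomicBoolean(false);
    private final Consumer<PendingCommand> transport; // stand-in for the wire send

    public SerializedConnection(Consumer<PendingCommand> transport) {
        this.transport = transport;
    }

    // If no command is in flight, send immediately; otherwise it waits in the queue.
    public CompletableFuture<String> execute(String sql) {
        CompletableFuture<String> promise = new CompletableFuture<>();
        pending.add(new PendingCommand(sql, promise));
        drain();
        return promise;
    }

    // Called when the server's final packet for the current command has arrived:
    // complete the head command's promise, then promote the next queued command.
    public void onResponseComplete(String result) {
        PendingCommand done = pending.poll();
        inFlight.set(false);
        if (done != null) done.promise.complete(result);
        drain();
    }

    private void drain() {
        PendingCommand head = pending.peek();
        if (head != null && inFlight.compareAndSet(false, true)) {
            transport.accept(head);
        }
    }
}
```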

Self-developed NIO database connection pool

The NettyDbConnection class introduced above implements the interaction with MySQL and provides a high-level API for executing SQL. In practice, however, it is infeasible to create a connection for every SQL statement and close it afterwards, so NettyDbConnection must be pooled so that connection lifecycles are managed uniformly. Its role is similar to the traditional connection pool HikariCP; beyond the basic capabilities it includes substantial performance optimization:

  • Connection lifecycle management and control
  • Dynamic scaling of connection pool
  • Comprehensive monitoring
  • Connection asynchronous keepalive
  • Timeout control
  • EventLoop affinity

Apart from EventLoop affinity, the other features should be familiar to anyone who has used a traditional database connection pool, so we won't go into detail. Here we focus on EventLoop affinity.

At the beginning of the article we described the three-layer Proxy module: Frontend, Core, and Backend. If we now replace the Backend layer's database-interaction components with our self-developed driver, the Proxy is both a Netty server and a Netty client, so Frontend and Backend can share one EventLoopGroup. To reduce thread context switching, when a request is received by the Frontend, computed in the Core layer, forwarded to MySQL, its response received, and the result written back to the client, this whole sequence should run on a single EventLoop thread as far as possible.

Specifically, when the Backend chooses a database connection, it prefers one bound to the current EventLoop. This is the EventLoop affinity mentioned earlier: it ensures that in most scenarios a request is handled by the same EventLoop from start to finish. Let's look at the code.

The NettyDbConnectionPool class uses a Map to store the pool's idle connections: the key is the EventLoop, and the value is the queue of idle connections bound to that EventLoop.

When acquiring a connection, one bound to the current EventLoop is preferred; if the current EventLoop has none, a connection is borrowed from another EventLoop.
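The acquire path can be sketched as follows. This is a single-threaded simplification (the real pool would use concurrent queues, and the map key would be a Netty EventLoop; here a plain Object stands in, and the class name is ours):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified, single-threaded sketch of the EventLoop-affinity idle map.
// Key: the EventLoop (modeled as Object); value: idle connections bound to it.
public class AffinityPool<C> {
    private final Map<Object, Deque<C>> idle = new ConcurrentHashMap<>();

    public void release(Object eventLoop, C conn) {
        idle.computeIfAbsent(eventLoop, k -> new ArrayDeque<>()).addLast(conn);
    }

    // Prefer a connection bound to the current EventLoop; if it has none,
    // borrow one from another loop's queue (an affinity miss).
    public C acquire(Object currentLoop) {
        Deque<C> own = idle.get(currentLoop);
        if (own != null) {
            C c = own.pollFirst();
            if (c != null) return c;           // affinity hit
        }
        for (Map.Entry<Object, Deque<C>> e : idle.entrySet()) {
            C c = e.getValue().pollFirst();
            if (c != null) return c;           // borrowed from another loop
        }
        return null; // pool empty: caller would create a connection or wait
    }
}
```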

To improve the EventLoop hit rate, several configuration points deserve attention:

  • The number of EventLoop threads should be consistent with the number of CPU cores.
  • The further the pool's maximum connection count exceeds the number of EventLoop threads, the higher the EventLoop hit rate.

Below is the hit-rate monitoring from a stress-test environment (8C16G, pool maximum connections 10~30); the rate mostly stays around 75%.

Skip unnecessary codecs

As mentioned earlier, some SQL result sets need no processing by the Proxy; the data stream returned by MySQL can be forwarded upstream intact, eliminating the encoding and decoding work. So which SQL needs no Proxy processing? An example illustrates this.

Assume that logical database A contains a table user, sharded across two databases, DB1 and DB2, with sharding algorithm user_id % 2.

  • SQL 1

SELECT id, name FROM user WHERE user_id in (1, 2)

  • SQL 2

SELECT id, name FROM user WHERE user_id in (1)

Obviously, SQL 1 has two shard values and will route to two nodes, while SQL 2 routes to only one node.

SQL 1 cannot skip encoding and decoding because its result sets must be merged. SQL 2 needs no merge; only the column-definition packets of the result set have to be modified, while the actual Row data needs no processing and can be forwarded directly upstream.
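The decision rule from this example can be sketched as a simple routing check. This is our simplification for illustration, not the proxy's real signature: passthrough applies only when the shard values route to a single node, so no merge step is needed.

```java
// Hedged sketch of the passthrough decision from the user_id % 2 example.
// Row data may bypass the codec only when the SQL routes to exactly one shard.
public class PassthroughDecision {
    // Count the distinct shards matched by the given shard values.
    public static long countMatchedShards(long[] userIds, int shardCount) {
        return java.util.Arrays.stream(userIds)
                .map(id -> id % shardCount)
                .distinct()
                .count();
    }

    // Single matched shard => no result-set merge => Row packets can be
    // forwarded upstream as-is (only column metadata is rewritten).
    public static boolean canSkipRowCodec(long[] userIds, int shardCount) {
        return countMatchedShards(userIds, shardCount) == 1;
    }
}
```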

Full-link asynchrony

After the Backend layer replaces the original HikariCP + JDBC with the self-developed connection pool and driver, all blocking operations along the Frontend-Core-Backend link must be rewritten as asynchronous code, implemented with Netty's Promise and Future.

In some scenarios, a Future may already be completed by the time it is obtained. Blindly adding a listener every time lengthens the call stack, so we defined a general utility class for handling Futures: if future.isDone(), the callback is executed directly; otherwise addListener is used. This minimizes the depth of the overall call stack.
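The utility can be sketched like this, with CompletableFuture standing in for Netty's Future (the helper's name and signature are ours, not the article's actual code):

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Sketch of the "run inline if already done" utility described above.
// An already-completed future's callback executes directly, avoiding an
// unnecessary listener registration that would deepen the call stack.
public class FutureUtil {
    public static <T> void onComplete(CompletableFuture<T> future, Consumer<T> callback) {
        if (future.isDone() && !future.isCompletedExceptionally()) {
            callback.accept(future.join()); // already done: execute directly
        } else {
            future.thenAccept(callback);    // not yet done: register a listener
        }
    }
}
```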

Compatibility

In addition to the transformation of the above basic code, a lot of compatibility work also needs to be done:

  • Handling of special database field types
  • JDBC URL parameter compatibility
  • Migrating all ThreadLocal-related data to ChannelHandlerContext
  • Passing log MDC and TraceContext data along the chain
  • ……

4. Performance

After several rounds of performance stress testing, the NIO architecture shows a large improvement over the BIO architecture:

  • Overall maximum throughput increased by 67%
  • LOAD dropped by about 37%
  • Under high load, the BIO version stalled several times, while NIO remained relatively stable.
  • The number of threads is reduced by about 98%

5. Summary

The NIO transformation was a large undertaking with some twists and turns along the way, but the final result is satisfying. Thanks to ShardingSphere's high kernel-level performance and this NIO transformation, Rainbow Bridge can be considered in the first tier of DAL middleware performance.

*Text/Shinichi

This article is original to Dewu Technology. For more exciting articles, please see: Dewu Technology official website

