Optimizing the connection count of sharding-jdbc database sharding | JD Logistics technical team

1. Background:

Both the courier order fulfillment center (cp-eofc) and the logistics platform fulfillment center (jdl-uep-ofc) systems of the distribution platform group use sharding-jdbc from the ShardingSphere ecosystem as their database sharding middleware. The whole cluster is sharded only at the database level: there are 16 MySQL instances in total, each instance hosts 32 databases, and the cluster therefore contains 512 databases.

Every time a client host is added, each MySQL instance gains at least 32 new connections (a connection pool is normally used, so depending on the configured maximum pool size this number may be amplified 5 to 10 times). A system is usually also split into several applications such as web, provider and worker, and these applications share the same set of data sources. As the number of application machines grows, the connection count on each MySQL instance quickly approaches its upper limit. This blocks system expansion: we can no longer scale horizontally by adding machines and can only scale vertically by upgrading machine specifications to cope with traffic growth.

As a core system of JD Logistics, the business is growing rapidly and the traffic the system carries keeps increasing, so the bottleneck that restricts system expansion urgently needs to be resolved.

2. Related concepts of database and table sharding

2.1 Why shard databases and tables

2.1.1 Database sharding

As the business develops, the amount of data in a single database keeps growing, its QPS becomes higher and higher, and the time spent reading from and writing to the database increases accordingly. At that point the read/write performance of a single database inevitably becomes the system's bottleneck. The single database can then be split into multiple databases to share the load and improve performance; distributing the databases across different machines also improves availability.

2.1.2 Table sharding

As the amount of data in a single table grows, queries and updates slow down; even with optimizations inside the database engine, the quantitative change eventually causes a qualitative change and performance drops sharply. At that point the single table can be sharded: its data is split into multiple tables according to certain rules, reducing the data volume per table and improving system performance.

2.2 Introduction to sharding-jdbc

ShardingSphere

It is an ecosystem of open-source distributed database middleware solutions, consisting of three independent products: Sharding-JDBC, Sharding-Proxy and Sharding-Sidecar (planned). They all provide standardized data sharding, distributed transactions and database governance, and can be applied to a wide variety of scenarios such as homogeneous Java applications, heterogeneous languages, containers and cloud native.

Sharding-JDBC

Positioned as a lightweight Java framework, it provides additional services at Java's JDBC layer. The client connects directly to the database, and the framework is delivered as a jar package with no extra deployment or dependencies. It can be understood as an enhanced JDBC driver, fully compatible with JDBC and all kinds of ORM frameworks.

It works with any Java-based ORM framework, such as JPA, Hibernate, MyBatis, Spring JDBC Template, or with JDBC used directly, and with any third-party database connection pool, such as DBCP, C3P0, BoneCP, Druid, HikariCP, etc.

Any database that implements the JDBC specification is supported; currently MySQL, Oracle, SQL Server and PostgreSQL are supported.

Let's first look at the Spring namespace rule configuration example given on the ShardingSphere official website:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xmlns:sharding="http://shardingsphere.io/schema/shardingsphere/sharding" 
    xsi:schemaLocation="http://www.springframework.org/schema/beans 
                        http://www.springframework.org/schema/beans/spring-beans.xsd
                        http://shardingsphere.io/schema/shardingsphere/sharding 
                        http://shardingsphere.io/schema/shardingsphere/sharding/sharding.xsd 
                        ">
    <!-- Data source ds0 -->
    <bean id="ds0" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
        <property name="driverClassName" value="com.mysql.jdbc.Driver" />
        <property name="url" value="jdbc:mysql://localhost:3306/ds0" />
        <property name="username" value="root" />
        <property name="password" value="" />
    </bean>
    <!-- Data source ds1 -->
    <bean id="ds1" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
        <property name="driverClassName" value="com.mysql.jdbc.Driver" />
        <property name="url" value="jdbc:mysql://localhost:3306/ds1" />
        <property name="username" value="root" />
        <property name="password" value="" />
    </bean>
    
    <!-- Sharding strategies -->
    <sharding:inline-strategy id="databaseStrategy" sharding-column="user_id" algorithm-expression="ds$->{user_id % 2}" />
    <sharding:inline-strategy id="orderTableStrategy" sharding-column="order_id" algorithm-expression="t_order$->{order_id % 2}" />
    <sharding:inline-strategy id="orderItemTableStrategy" sharding-column="order_id" algorithm-expression="t_order_item$->{order_id % 2}" />
    
    <!-- Sharding data source configuration -->
    <sharding:data-source id="shardingDataSource">
        <sharding:sharding-rule data-source-names="ds0,ds1">
            <sharding:table-rules>
                <sharding:table-rule logic-table="t_order" actual-data-nodes="ds$->{0..1}.t_order$->{0..1}" database-strategy-ref="databaseStrategy" table-strategy-ref="orderTableStrategy" />
                <sharding:table-rule logic-table="t_order_item" actual-data-nodes="ds$->{0..1}.t_order_item$->{0..1}" database-strategy-ref="databaseStrategy" table-strategy-ref="orderItemTableStrategy" />
            </sharding:table-rules>
        </sharding:sharding-rule>
    </sharding:data-source>
</beans>

Configuration summary:

1. Multiple data sources (ds0, ds1) need to be configured;

2. The algorithm-expression configured in the inline sharding strategy is a Groovy expression evaluated against the sharding-column;

3. In the sharding:table-rule tag of the sharding data source, configure the logical table name (logic-table), the database sharding strategy (database-strategy-ref) and the table sharding strategy (table-strategy-ref); the actual-data-nodes attribute is composed of the data source name and the table name, separated by a dot, and describes the real data nodes behind the logical table;

3. Problem analysis and solution

3.1 Problem Analysis

As mentioned at the beginning of the article, our current MySQL cluster architecture is as follows: 16 MySQL instances, 32 databases per instance, 512 databases in the whole cluster. When a client host starts up and connects to the 32 databases on instance MYSQL_0, it creates 32 data sources; with a maximum of 5 connections configured per connection pool, in the extreme case a single client opens 32 × 5 = 160 connections to one MySQL instance. For a core logistics system it is quite common to scale out by hundreds of machines in a short time, so the maximum connection count of a single instance is reached very quickly.
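To make the scale concrete (illustrative figures, using the pool maximum of 5 from above):

1 client host: 32 data sources × 5 connections = up to 160 connections per MySQL instance

100 client hosts: 100 × 160 = up to 16,000 connections per MySQL instance, far more than a single MySQL instance is typically configured to accept.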

The current way clients connect to the database cluster is shown in the figure:

3.2 Feasible solutions

Our goal is to reduce the number of connections to a single MySQL instance. We discussed the following possible solutions:

3.2.1 Shard only tables, not databases, on a single instance

In this way the client only needs one connection pool per database instance, which greatly reduces the number of connections. However, this solution changes the existing sharding rules and requires a new database cluster into which historical and incremental data are synchronized according to the new rules. The most difficult and risky part is the online switchover, which may cause data inconsistency, and the rollback plan would be very complicated if something went wrong.

3.2.2 Use a database that supports elastic scaling

Use JD's JED, TiDB or another database that supports elastic scaling, and synchronize the data into the new database. The advantage of this type of database is that developers only need to focus on the business and do not have to deal with the low-level details of database connections.

3.2.3 Use Sharding-Proxy

Sharding-Proxy is positioned as a transparent database proxy. We can deploy a set of Sharding-Proxy servers on the server side; the client only needs to connect to the proxy service, and the proxy servers connect to the MySQL cluster. In this way the number of connections to the MySQL cluster depends only on the number of proxy servers and is decoupled from the number of clients.

3.2.4 Transform sharding-jdbc

In theory, as long as we hold a connection to one database on a MySQL instance, we can access the data in the other databases on that instance using the "database_name.table_name" notation (provided, of course, that the user has permission to access those databases). Can we achieve this access pattern by modifying sharding-jdbc?
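A minimal sketch of the idea in plain JDBC, assuming the connecting user has been granted access to all 32 databases on the instance (host name, credentials and columns here are illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CrossDatabaseQueryDemo {

    public static void main(String[] args) throws Exception {
        // Open a single connection to the first database (DB_0) on the instance.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://mysql-0-host:3306/DB_0", "app_user", "***")) {
            // Query a table that physically lives in another database (DB_31)
            // on the same instance, using the "database_name.table_name" notation.
            String sql = "SELECT order_code FROM DB_31.t_order WHERE user_id = ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setLong(1, 35711L);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("order_code"));
                    }
                }
            }
        }
    }
}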

Of the options above, 3.2.1 and 3.2.2 both require building a new database cluster, synchronizing historical and incremental data, and switching data sources online; 3.2.3 requires deploying a set of proxy services, which must themselves be clustered for high availability. The workload and risk of these three options are relatively high, so on the principle of minimum cost we finally chose to transform sharding-jdbc.

3.3 Explore sharding-jdbc

3.3.1 Workflow

The workflow of sharding-jdbc can be divided into the following steps:

  • SQL parsing - lexical and syntactic analysis;

  • SQL routing - match the database and table sharding strategies according to the parsing context and generate the routing path;

  • SQL rewriting - rewrite the logical SQL into SQL that can be executed correctly against the real databases;

  • SQL execution - execute the SQL concurrently using multiple threads;

  • Result merging - merge the result sets returned by the data nodes into a single result set and return it to the requesting client;

Clearly, database and table sharding is handled in the SQL routing stage, so we use the SQL routing logic as the entry point for analyzing the source code.

3.3.2 Source code analysis

The route method of the ShardingStandardRoutingEngine class is the entry point for route calculation; it returns the sharded collection of databases and tables:

The core logic of the route method lives in this class's route0 method, where routeDataSources handles database routing and routeTables handles table routing. The actual route calculation happens in the doSharding method of StandardShardingStrategy, so let's go deeper.

The StandardShardingStrategy class has two member fields: preciseShardingAlgorithm (the precise sharding algorithm) and rangeShardingAlgorithm (the range sharding algorithm). Since our SQL specifies the sharding key only in equality queries, the result computed by preciseShardingAlgorithm is used. PreciseShardingAlgorithm is an interface, so we can implement it to customize the sharding algorithm.
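For reference, the interface to be implemented looks roughly like this (a simplified sketch; the exact package name and parent marker interface depend on the sharding-jdbc version in use):

import java.util.Collection;

public interface PreciseShardingAlgorithm<T extends Comparable<?>> {

    /**
     * Compute the single target (data source or table name) for an
     * equality ("=" or "IN") condition on the sharding column.
     */
    String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<T> shardingValue);
}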

Matching tag support can also be found on the ShardingSphere official website:

So we only need to implement the PreciseShardingAlgorithm interface and reference it in the tag to implement a custom sharding strategy.

3.4 Transformation steps

3.4.1 Database sharding transformation

Currently the application configures 512 data sources, ds_0 to ds_511. After the change we only need to configure 16 data sources, ds_0 to ds_15, each pointing at the first database on one instance.

For the sharding rule we can still use the sharding:inline-strategy tag; we only need to rewrite the Groovy expression. The sharding key is order_code. The previous sharding algorithm, Math.abs(order_code.hashCode()) % 512, takes the hash of the order_code column modulo 512 to get a value in 0~511. We now only need to divide that result by 32 to get a value in 0~15, i.e. the expression becomes (Math.abs(order_code.hashCode()) % 512).intdiv(32).

Configuration of sub-database rules before transformation:

Configuration of sub-database rules after transformation:
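The original screenshots are not reproduced here; based on the expressions described above, the relevant inline-strategy lines look roughly like this (data source names as in 3.4.1):

<!-- Before: route to one of the 512 data sources ds_0 .. ds_511 -->
<sharding:inline-strategy id="databaseStrategy" sharding-column="order_code" algorithm-expression="ds_$->{Math.abs(order_code.hashCode()) % 512}" />

<!-- After: route to one of the 16 data sources ds_0 .. ds_15, one per MySQL instance -->
<sharding:inline-strategy id="databaseStrategy" sharding-column="order_code" algorithm-expression="ds_$->{(Math.abs(order_code.hashCode()) % 512).intdiv(32)}" />

For example, an order_code whose hash maps to database 100 lands on data source ds_3, because 100.intdiv(32) = 3; the table sharding algorithm in 3.4.2 then supplies the DB_100 prefix.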

3.4.2 Table sharding transformation

Implement the PreciseShardingAlgorithm interface and rewrite the table sharding algorithm so that it returns the target in the form "actual database name.table name";

For example: to query the data with user_id = 35711 in the t_order table located in database DB_31, the database sharding algorithm returns the data source "DB_0" (the first database on that instance) and the table sharding algorithm returns "DB_31.t_order";

Custom table sharding algorithm:
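A minimal sketch of such an algorithm, assuming the physical databases are named DB_0 to DB_511 and that each database holds a single t_order table (class name, package and constants are illustrative; the import package differs between sharding-jdbc versions):

import java.util.Collection;

// Package of the sharding types varies by version; this is the 4.x location.
import org.apache.shardingsphere.api.sharding.standard.PreciseShardingAlgorithm;
import org.apache.shardingsphere.api.sharding.standard.PreciseShardingValue;

/**
 * Returns the routing target in the form "actual database name.table name",
 * e.g. "DB_31.t_order", so that the rewritten SQL reaches the right database
 * through the single shared data source of the instance.
 */
public class DatabaseQualifiedTableShardingAlgorithm implements PreciseShardingAlgorithm<String> {

    private static final int DATABASE_COUNT = 512;

    @Override
    public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<String> shardingValue) {
        // Same hash rule as before the transformation: a value in 0~511.
        int databaseIndex = Math.abs(shardingValue.getValue().hashCode()) % DATABASE_COUNT;
        // Qualify the logical table with the actual database name, e.g. "DB_31.t_order".
        return "DB_" + databaseIndex + "." + shardingValue.getLogicTableName();
    }
}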

Define a sharding:standard-strategy tag in the XML and set its precise-algorithm-ref attribute to our custom table sharding algorithm.
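Sketched in the same Spring namespace style as above (the bean class is the illustrative algorithm from 3.4.2):

<!-- Custom table sharding algorithm and the standard strategy that references it -->
<bean id="tableShardingAlgorithm" class="com.example.sharding.DatabaseQualifiedTableShardingAlgorithm" />
<sharding:standard-strategy id="orderTableStrategy" sharding-column="order_code" precise-algorithm-ref="tableShardingAlgorithm" />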

3.4.3 Database connection pool parameter adjustment

Before the transformation, each database had its own data source and connection pool; after the transformation, the 32 databases on one instance share a single data source and connection pool. Parameters such as the maximum number of connections and the minimum number of idle connections therefore need to be adjusted accordingly. This requires a reasonable assessment based on business traffic; the most rigorous approach is to base it on load test results.
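For example, with the DBCP BasicDataSource used in the sample configuration earlier, the per-instance pool might be enlarged along these lines (the concrete values are placeholders and must come from your own traffic estimates or load tests):

<bean id="ds_0" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
    <property name="driverClassName" value="com.mysql.jdbc.Driver" />
    <property name="url" value="jdbc:mysql://mysql-0-host:3306/DB_0" />
    <property name="username" value="app_user" />
    <property name="password" value="***" />
    <!-- Previously each of the 32 pools per instance had a small maximum (e.g. 5);
         now a single shared pool serves the whole instance, so its limits are raised. -->
    <property name="initialSize" value="10" />
    <property name="minIdle" value="10" />
    <property name="maxActive" value="40" />
</bean>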

After the transformation, the way clients connect to the cluster is shown in the figure:

Comparison of the number of database cluster connections before and after optimization:

4. An interlude

When rewriting the Groovy expression for the database sharding rule, I first divided by 32 by simply appending "/32" to the original expression, i.e. Math.abs(order_code.hashCode()) % 512 / 32. During debugging, executing SQL reported a "No database route info" error. Further debugging showed that sharding-jdbc produced a decimal result when evaluating the sharding rule (for example ds_14.6857), so no matching data source could be found. This is because Groovy does not provide a dedicated integer division operator, so the intdiv() method has to be used, and the final expression was rewritten as (Math.abs(order_code.hashCode()) % 512).intdiv(32).
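A quick way to see the difference in a Groovy console (470 is just an example hash-modulo value):

println 470 / 32         // 14.6875 -> yields a non-existent data source such as "ds_14.6875"
println 470.intdiv(32)   // 14      -> yields "ds_14" as expected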

5. Summary

This article introduced the concept and advantages of database and table sharding, as well as the sharding-jdbc sharding middleware, and explored how sharding-jdbc executes its routing rules. Of course, at the beginning of system design there is no universal answer to questions such as whether sharding is necessary at all, or whether more databases or more tables is better. These decisions have to be weighed against the characteristics of the system, such as QPS, TPS, single-table data volume, disk specifications, data retention time, business growth and hot/cold data schemes; every choice has pros and cons, and trade-offs are required.

Author: JD Logistics Zhang Zhongliang

Source: JD Cloud Developer Community Ziyuanqishuo Tech
