Hot Technology Explained: ShardingJdbc Database and Table Sharding in Practice (Part 1)

When developing and maintaining real-time online business systems such as orders, transactions, and payments, rapid growth in business volume often leads to online incidents caused by a single table in a relational database (such as MySQL) growing too large. Most of these incidents are system avalanches triggered by poorly written slow SQL, but some are system-wide performance degradation caused by I/O contention on hot database blocks. In short, unbounded growth of a single table will sooner or later make the system less stable, one way or another.

Therefore, when designing large-scale real-time systems, we cannot focus only on the distributed application architecture; the scalability of the database's real-time storage and computing capacity matters just as much. There are currently two general ways to handle real-time data growth: one is to adopt a distributed database directly (for example, TiDB or OceanBase); the other is to shard a relational database into multiple databases and tables so as to make the most of the existing databases' real-time computing power. In most cases, the latter option is the more realistic one.

This article simulates the order database of a trading system to demonstrate how to shard transaction order data across databases and tables with ShardingJdbc. Three main sharding scenarios come up along the way:

1) A new system that adopts database and table sharding from the start of its design;
2) A legacy system that has been running for some time and needs to move to sharding smoothly;
3) Data migration when scaling an existing sharding scheme (both reducing and increasing the number of sharded tables).

Integrating ShardingJdbc with Spring Boot for database and table sharding

Order data in a trading system is a textbook case for database and table sharding. A trading system needs high real-time performance when processing individual records, so once a single order table grows past a billion rows, it becomes prone both to performance degradation from I/O contention on hot database blocks and to system-wide avalanches triggered by a single careless SQL statement.

Once you decide to shard, however, you must plan the storage layout in advance, estimate how much the data will grow, and prepare a scaling plan for adding databases and tables later. You should also consider how hard it is for the application to adopt the scheme: the details of the sharding logic should be transparent to the application. In general, therefore, we need an intermediate proxy layer that shields the application from the intrusion that sharding would otherwise bring.

The best-known sharding proxy component in the Java community today is ShardingJdbc (now part of the Apache open source project ShardingSphere). ShardingJdbc is essentially a lightweight JDBC driver proxy: to use it you only need to depend on the relevant JAR packages, and no additional service has to be deployed. The sharding logic is defined in the system configuration file, which gives the application transparent proxied access.

Next, we use Spring Boot as an example to show how to integrate ShardingJdbc and shard transaction orders across databases and tables. The steps are as follows:

1) Planning the sharding of order data

If you can foresee how the data volume will grow when first designing the system, planning the sharding scheme in advance is a far-sighted move. There are generally two ways to plan it: 1) horizontal table sharding within a single database: if a single database has relatively strong computing power, the tables can be split horizontally inside that one database; 2) database sharding plus table sharding: if the data grows explosively and a single database's computing resources are limited, the data can be spread across multiple databases as well as multiple tables to raise the overall computing and processing capacity.

In this article's example, the order data is sharded as follows: 1) two database nodes (ds0, ds1); 2) 32 sharded tables (0~31) in each database. The order table is first sharded across databases by "user_id % 2", and then, within each database, split horizontally across tables by "order_id % 32". For example, take an order with a user_id of 1001 and an order number of 20200713001. Under the rules above, 1001 % 2 = 1 and 20200713001 % 32 = 9, so the record is stored in sharded table number 9 (t_order_9) in the ds1 database.
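The routing arithmetic above can be sketched in a few lines of Java. This is only an illustration of the planned rules, assuming non-negative IDs; the class and method names (ShardRouteSketch, dataSourceFor, tableFor, route) are hypothetical and are not part of ShardingJdbc.

```java
// Illustrative sketch of the planned rules: user_id % 2 picks the database,
// order_id % 32 picks the table. Hypothetical names, not ShardingJdbc API.
final class ShardRouteSketch {
    static final int DB_COUNT = 2;      // ds0, ds1
    static final int TABLE_COUNT = 32;  // t_order_0 .. t_order_31

    // Assumes userId is non-negative, so % never yields a negative index.
    static String dataSourceFor(long userId) {
        return "ds" + (userId % DB_COUNT);
    }

    // Assumes orderId is non-negative.
    static String tableFor(long orderId) {
        return "t_order_" + (orderId % TABLE_COUNT);
    }

    static String route(long userId, long orderId) {
        return dataSourceFor(userId) + "." + tableFor(orderId);
    }

    public static void main(String[] args) {
        // The example from the text: user_id 1001, order number 20200713001.
        System.out.println(route(1001L, 20200713001L)); // prints ds1.t_order_9
    }
}
```

In the real project this calculation is done by ShardingJdbc from the configured expressions; the sketch is only a way to check by hand where a given order should land.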

The structure of the logical order table is as follows:

create table t_order (
    id bigint not null primary key auto_increment,
    order_id bigint comment 'Business-side order number (unique within the business system)',
    trade_type varchar(30) comment 'Business transaction type, e.g. topup means wallet top-up',
    amount bigint comment 'Transaction amount, in fen (cents)',
    currency varchar(10) comment 'Currency, cny means Chinese yuan',
    status varchar(2) comment 'Payment status: 0-pending; 1-in progress; 2-succeeded; 3-failed',
    channel varchar(10) comment 'Payment channel code: 0-WeChat Pay, 1-Alipay',
    trade_no varchar(32) comment 'Payment channel serial number',
    user_id bigint(60) comment 'Business-side user id',
    update_time timestamp null default current_timestamp on update current_timestamp comment 'Last update time',
    create_time timestamp null default current_timestamp comment 'Transaction creation time',
    remark varchar(128) comment 'Order remarks',
    key unique_idx_pay_id (order_id),
    key idx_user_id (user_id),
    key idx_create_time (create_time)
);
alter table t_order comment 'Transaction order table';

The logical table above is physically split into sharded tables t_order_{0~31}, distributed across the two database nodes ds0 and ds1.

2) Creating the sample project code structure

First, create a Maven-based Spring Boot project and integrate the MyBatis database access framework. The code structure is as follows:

[Figure: project code structure]

As the figure above shows, we have created a basic Spring Boot project and integrated MyBatis-based database access. The project also keeps unit test code and integration test code separate.

3) Configuring the order sharding rules with Spring Boot + ShardingJdbc

Next, let's look at how to integrate ShardingJdbc into the Spring Boot project and configure it according to the planned sharding rules.

First, add the ShardingJdbc starter dependency for Spring Boot projects, as follows:

<!-- Add the Sharding-JDBC Spring Boot dependencies -->
<!-- Sharding-JDBC For Spring Boot Start -->
<dependency>
    <groupId>org.apache.shardingsphere</groupId>
    <artifactId>sharding-jdbc-spring-boot-starter</artifactId>
    <version>${sharding-sphere.version}</version>
</dependency>
<!-- for spring namespace -->
<dependency>
    <groupId>org.apache.shardingsphere</groupId>
    <artifactId>sharding-jdbc-spring-namespace</artifactId>
    <version>${sharding-sphere.version}</version>
</dependency>
<!-- Sharding-JDBC For Spring Boot End -->

Once the Spring Boot starter dependency is added, ShardingJdbc applies its own data source configuration logic. To avoid conflicts, exclude Spring Boot's default data source auto-configuration class in the main class, as follows:

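import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;

// Exclude the default data source auto-configuration class
@SpringBootApplication(exclude = {DataSourceAutoConfiguration.class})
public class OrderServerApplication {
    public static void main(String[] args) {
        SpringApplication.run(OrderServerApplication.class, args);
    }
}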

With the steps above, ShardingJdbc is integrated into the Spring Boot application at the project level. What remains is to express the planned sharding rules in the configuration file, as follows:

# Print SQL to the console (development only)
spring.shardingsphere.props.sql.show=true
# Configure the actual data sources
spring.shardingsphere.datasource.names=ds0,ds1

# Configure the first data source
spring.shardingsphere.datasource.ds0.type=com.alibaba.druid.pool.DruidDataSource
spring.shardingsphere.datasource.ds0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds0.url=jdbc:mysql://127.0.0.1:3306/order_0?characterEncoding=utf-8
spring.shardingsphere.datasource.ds0.username=root
spring.shardingsphere.datasource.ds0.password=123456

# Configure the second data source
spring.shardingsphere.datasource.ds1.type=com.alibaba.druid.pool.DruidDataSource
spring.shardingsphere.datasource.ds1.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds1.url=jdbc:mysql://127.0.0.1:3306/order_1?characterEncoding=utf-8
spring.shardingsphere.datasource.ds1.username=root
spring.shardingsphere.datasource.ds1.password=123456

# Configure the rules for the t_order table
spring.shardingsphere.sharding.tables.t_order.actual-data-nodes=ds$->{0..1}.t_order_$->{0..31}
# Database sharding strategy for t_order (inline: sharding algorithm based on row expressions)
# Use $->{...} rather than ${...}, which Spring would treat as a property placeholder
spring.shardingsphere.sharding.tables.t_order.database-strategy.inline.sharding-column=user_id
spring.shardingsphere.sharding.tables.t_order.database-strategy.inline.algorithm-expression=ds$->{user_id % 2}
# Table sharding strategy for t_order
spring.shardingsphere.sharding.tables.t_order.table-strategy.inline.sharding-column=order_id
spring.shardingsphere.sharding.tables.t_order.table-strategy.inline.algorithm-expression=t_order_$->{order_id % 32}

# If other tables need sharding, configure them the same way as t_order above
# ...

The configuration above defines two data sources, corresponding to the databases order_0 and order_1. Each database holds the sharded tables {0~31} of the logical table t_order, so order data is distributed across 64 physical tables in total (2 databases x 32 tables). Sharding is done along two dimensions: the user ID is the database sharding key and the order ID is the table sharding key.
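To see what the actual-data-nodes setting covers, the sketch below enumerates the physical tables by hand. It is a plain-Java illustration under the layout planned in this article (2 data sources, 32 tables each); ActualDataNodesSketch and expand are hypothetical names, and the code does not use ShardingJdbc's own expression parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written expansion of the node layout described in the text.
// Hypothetical helper for illustration; not ShardingJdbc API.
final class ActualDataNodesSketch {

    // Builds names of the form ds<db>.t_order_<table> for every combination.
    static List<String> expand(int dbCount, int tablesPerDb) {
        List<String> nodes = new ArrayList<>();
        for (int db = 0; db < dbCount; db++) {
            for (int table = 0; table < tablesPerDb; table++) {
                nodes.add("ds" + db + ".t_order_" + table);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<String> nodes = expand(2, 32);
        System.out.println(nodes.size());                 // prints 64
        System.out.println(nodes.get(0));                 // prints ds0.t_order_0
        System.out.println(nodes.get(nodes.size() - 1));  // prints ds1.t_order_31
    }
}
```

Each of these 64 physical tables has to exist in the corresponding database before the application writes to it.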

4) Writing the order persistence logic to test ShardingJdbc sharding

With the steps above, the sharding logic for the order table is functionally complete. Operating on the order table works exactly like operating on an ordinary table through MyBatis; no extra code is needed for sharding, because ShardingJdbc intercepts SQL at the database driver layer, matches it against the sharding rules, and routes it accordingly.
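As a rough mental model of that interception step, the toy sketch below takes SQL written against the logical table t_order and produces the data source plus the SQL for the physical table. It is deliberately simplified: it uses a string replacement where a real implementation parses the SQL, it assumes non-negative IDs, and LogicalSqlRewriteSketch and rewrite are hypothetical names rather than ShardingJdbc API.

```java
// Toy model of "logical SQL in, routed physical SQL out".
// Hypothetical names; ShardingJdbc itself parses SQL rather than replacing text.
final class LogicalSqlRewriteSketch {
    static final String LOGICAL_TABLE = "t_order";

    // Returns "<dataSource>: <actual SQL>" for a statement on the logical table.
    static String rewrite(String logicalSql, long userId, long orderId) {
        String dataSource = "ds" + (userId % 2);                    // database rule
        String actualTable = LOGICAL_TABLE + "_" + (orderId % 32);  // table rule
        // \b keeps the match to the whole table name, so column names such as
        // order_id are left untouched.
        String actualSql = logicalSql.replaceAll("\\b" + LOGICAL_TABLE + "\\b", actualTable);
        return dataSource + ": " + actualSql;
    }

    public static void main(String[] args) {
        String sql = "INSERT INTO t_order (order_id, user_id) VALUES (?, ?)";
        System.out.println(rewrite(sql, 1001L, 20200713001L));
        // prints ds1: INSERT INTO t_order_9 (order_id, user_id) VALUES (?, ?)
    }
}
```

The point of the sketch is that the application only ever writes SQL against t_order; the physical table name never appears in mapper code.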

As in ordinary Spring MVC development, we write an MVC-based order creation interface and start the application. The result is as follows:

[Figure: order creation interface code in the project]

As the code shows, the programming model is fully consistent with the layered structure of ordinary Java code. Now simulate a call to the order creation interface with the following request parameters:

{
    "orderId":123458,
    "tradeType":"topup",
    "amount":1000,
    "currency":"cny",
    "userId":63631725
}

According to the request parameters, this order should be stored in sharded table number 2 (t_order_2) of database number 1 (ds1); the calculation is "userId: 63631725 % 2 = 1; orderId: 123458 % 32 = 2". After the interface call completes, you can query the database tables to verify the result.

So far we have shown how ShardingJdbc provides application-transparent sharding when the sharding structure is planned in advance. In most cases, however, very few systems have the foresight to plan a long-term distributed storage scheme for their tables up front. Only when the data reaches a certain scale and the system hits the corresponding performance bottleneck does sharding come onto the table as an option.

This generally involves two scenarios:

1) How to roll out sharding smoothly in a single-database, single-table system that has not been sharded yet;
2) For a system that has already implemented sharding, the original databases and tables are no longer enough because the data keeps growing, and capacity has to be expanded a second time.

In either scenario, changing the storage rules requires migrating a large amount of data. For a system running in production, this is like changing the wheels on a car traveling at high speed: the slightest carelessness can crash the system. Programmers who dare to restructure such systems are true warriors!
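To get a feel for how much data a rule change can move, here is a minimal sketch. It assumes order IDs are spread evenly (the sample simply uses IDs 0 to N-1) and that only the table count in "order_id % N" changes, while the database rule stays as it is. ReshardImpactSketch and countMoved are hypothetical names for illustration only.

```java
// Counts how many sampled order IDs would land in a different table index
// after the modulus changes. Hypothetical helper; not ShardingJdbc API.
final class ReshardImpactSketch {

    // Counts ids in [0, sampleSize) whose table index differs between the
    // old rule (id % oldTables) and the new rule (id % newTables).
    static long countMoved(long sampleSize, int oldTables, int newTables) {
        long moved = 0;
        for (long orderId = 0; orderId < sampleSize; orderId++) {
            if (orderId % oldTables != orderId % newTables) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Expanding from 32 to 64 tables per database.
        System.out.println(countMoved(6400, 32, 64)); // prints 3200
        // Shrinking from 32 to 16 tables per database.
        System.out.println(countMoved(6400, 32, 16)); // prints 3200
    }
}
```

Under these assumptions, both doubling and halving the table count relocate half of the sampled rows, which is why such a change on a live system needs a careful migration plan.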

Final words

Feel free to follow my public account [The wind and waves are as calm as the code], where a large number of Java-related articles and learning materials are posted, along with the materials I have compiled.
