ShardingSphere Sharding-JDBC data sharding (database sharding and table sharding)

Summary:

In real-world development there are always a few large tables tied to the core business, where "large" refers to the volume of data. Tables such as the user table, the order table, or the main table of a company's business can quickly reach millions, tens of millions, or even billions of rows, and they keep growing fast. At that point a single table can no longer meet the storage requirements, and even with well-designed indexes queries become very slow. These large tables then need to be split across databases and tables (sharding). To support this, the application has to be able to parse, rewrite, and route SQL and merge result sets, and it also needs distributed transactions, a distributed ID generator, and so on.

There are two typical designs for database middleware (original: https://github.com/Meituan-Dianping/Zebra/wiki/%E6%95%B0%E6%8D%AE%E5%BA%93%E4%B8%AD%E9%97%B4%E4%BB%B6%E4%B8%BB%E6%B5%81%E8%AE%BE%E8%AE%A1 ):

  • Server-side proxy (proxy: a database proxy): a proxy service is deployed independently and manages multiple database instances behind it. The application connects to the proxy server through an ordinary data source (c3p0, druid, dbcp, etc.). All SQL statements are sent to the proxy, which executes them against the underlying databases and returns the results to the application. With this design, the logic of database sharding, table sharding, and read/write splitting is completely transparent to developers. Typical examples: Alibaba's open-source cobar, mycat (developed by the mycat team on top of cobar), mysql-proxy (provided officially by MySQL), and atlas (developed by Qihoo 360 on top of mysql-proxy). Apart from mycat, these projects are essentially no longer maintained.
  • Client-side proxy (datasource: a data source proxy): the application uses a special data source that acts as a proxy and internally manages multiple ordinary data sources (c3p0, druid, dbcp, etc.), each of which connects to a different database. SQL generated by the application is handed to the proxy data source, which processes it as needed (for example, by rewriting the SQL), passes it to the underlying data sources for execution, then merges the results and returns them to the application. The proxy data source usually implements the API defined by the JDBC specification, so it can be integrated directly with ORM frameworks. With this design, the application code has to be changed to use the proxy data source instead of using a connection pool such as c3p0, druid, or dbcp directly. Typical examples: Alibaba's open-source tddl, Dianping's open-source zebra, and Dangdang's open-source sharding-jdbc.

This article records how to use ShardingSphere Sharding-JDBC in Spring Boot to shard MySQL data (that is, to split it across databases and tables).

1. Database environment preparation

Program environment: Spring Boot + MyBatis-Plus

Database environment:

Address          Database          Tables
127.0.0.1:3306   shardingsphere    user_split_0, user_split_1
127.0.0.1:3306   shardingsphere1   user_split_0, user_split_1

The details are shown in the figure below:

2. Introduce related dependencies

<!-- ShardingSphere: data sharding and data masking -->
<dependency>
    <groupId>org.apache.shardingsphere</groupId>
    <artifactId>sharding-jdbc-spring-boot-starter</artifactId>
    <version>4.1.0</version>
</dependency>

3. Configure the data sharding rules

#### spring  ####
spring:
  # Configuration reference: https://shardingsphere.apache.org/document/legacy/4.x/document/cn/manual/sharding-jdbc/configuration/config-spring-boot/#%E6%95%B0%E6%8D%AE%E5%88%86%E7%89%87
  shardingsphere:
    # Data sources
    datasource:
      # Data source aliases
      names: ds0,ds1
      # Database 1 (shardingsphere)
      ds0:
        ### Data source type
        type: com.alibaba.druid.pool.DruidDataSource
        driverClassName: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://127.0.0.1:3306/shardingsphere?useUnicode=true&characterEncoding=utf8&zeroDateTimeBehavior=convertToNull&serverTimezone=GMT%2B8
        username: root
        password: 123456
      # Database 2 (shardingsphere1)
      ds1:
        ### Data source type
        type: com.alibaba.druid.pool.DruidDataSource
        driverClassName: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://127.0.0.1:3306/shardingsphere1?useUnicode=true&characterEncoding=utf8&zeroDateTimeBehavior=convertToNull&serverTimezone=GMT%2B8
        username: root
        password: 123456

    # *** Database and table sharding configuration: start
    sharding:
      # Default data source
      default-data-source-name: ds0

      # Horizontal sharding: database strategy + table strategy, both using inline (row) expressions
      # 1. Default database sharding strategy: shardingsphere --> ds0, shardingsphere1 --> ds1
      default-database-strategy:
        inline:
          sharding-column: user_id
          algorithm-expression: ds$->{user_id % 2}
      # 2. Default table sharding strategy: user_split_0, user_split_1
      default-table-strategy:
        inline:
          sharding-column: age  # user is the logical table; the target table is chosen by the age column
          algorithm-expression: user_split_$->{age % 2}
      # Data nodes
      tables:
        user:
          actual-data-nodes: ds$->{0..1}.user_split_$->{0..1}
      # *** Database and table sharding configuration: end

#    sharding:
#      # Default data source
#      default-data-source-name: ds0
#      default-database-strategy:
#        inline:
#          sharding-column: user_id
#          algorithm-expression: ds$->{user_id % 2}
#      tables:
#        user:
#          # Primary key generation strategy for the user table: snowflake algorithm
#          key-generator:
#            column: user_id
#            type: SNOWFLAKE
#          actual-data-nodes: ds$->{0..1}.user_split_$->{0..1}
#          table-strategy:
#            inline:
#              sharding-column: age
#              algorithm-expression: user_split_$->{age % 2}
#      binding-tables: user

    props:
      # Print SQL
      sql.show: true
      check:
        table:
          metadata:
            # Whether to check sharded-table metadata consistency at startup
            enabled: true
      query:
        with:
          cipher:
            column: true

Rule description:

  • Database sharding strategy: based on the user_id column of the user table; ds0 is used if user_id % 2 = 0, and ds1 if user_id % 2 = 1;
  • Table sharding strategy: based on the age column of the user table; user_split_0 is used if age % 2 = 0, and user_split_1 if age % 2 = 1.
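Taken together, the two rules place every row on one of the four data nodes declared by actual-data-nodes. The following is a minimal sketch of how the expression ds$->{0..1}.user_split_$->{0..1} expands; the class and method names are made up for illustration and are not part of ShardingSphere.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: mimics the expansion of the inline expression
// ds$->{0..1}.user_split_$->{0..1}; this is not ShardingSphere code.
class DataNodeExpansion {

    // Builds every "dataSource.table" combination for the given counts.
    static List<String> expand(int dataSourceCount, int tableCount) {
        List<String> nodes = new ArrayList<>();
        for (int ds = 0; ds < dataSourceCount; ds++) {
            for (int t = 0; t < tableCount; t++) {
                nodes.add("ds" + ds + ".user_split_" + t);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Two data sources with two tables each
        System.out.println(expand(2, 2));
    }
}
```

With two data sources and two tables this yields ds0.user_split_0, ds0.user_split_1, ds1.user_split_0, and ds1.user_split_1.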

Other notes:

  • For the concepts of logical tables, actual tables, and data nodes, see the official documentation.
  • The sharding in this configuration file uses inline (row) expressions. In real projects you can implement the official interfaces yourself to build whatever database and table sharding algorithm your business needs. For how to implement the 4 interfaces, see the sharding section of the official documentation and the related strategies: StandardShardingStrategy, ComplexShardingStrategy, HintShardingStrategy.
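To give a feel for what a custom algorithm might look like, here is a self-contained sketch. The TableShardingFunction interface is hypothetical and defined only for this example; it is not the ShardingSphere API, whose exact interfaces and signatures should be taken from the official documentation. The sketch also shows one reason to write your own algorithm: a plain % on a negative value gives a negative suffix, which would point at a table that does not exist.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical interface for illustration; it is NOT the ShardingSphere API.
interface TableShardingFunction {
    String shard(Collection<String> availableTargets, long shardingValue);
}

// Picks the target whose name ends with "_<value mod target count>".
class ModuloTableSharding implements TableShardingFunction {

    @Override
    public String shard(Collection<String> availableTargets, long shardingValue) {
        // floorMod keeps the suffix non-negative even for negative values,
        // whereas the % operator would yield -1 for an input such as -1.
        long suffix = Math.floorMod(shardingValue, (long) availableTargets.size());
        for (String target : availableTargets) {
            if (target.endsWith("_" + suffix)) {
                return target;
            }
        }
        throw new IllegalArgumentException("No target for value " + shardingValue);
    }

    public static void main(String[] args) {
        Collection<String> tables = Arrays.asList("user_split_0", "user_split_1");
        TableShardingFunction function = new ModuloTableSharding();
        System.out.println(function.shard(tables, 17));
        System.out.println(function.shard(tables, -1));
    }
}
```

Both calls in main resolve to user_split_1; with the inline expression user_split_$->{age % 2}, the second input would instead produce the suffix -1.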

4. Test the sharding results

(1) user_id=100, age=18 (100%2=0, use ds0; 18%2=0, use user_split_0)

(2) user_id=101, age=18 (101%2=1, use ds1; 18%2=0, use user_split_0)

(3) user_id=102, age=17 (102%2=0, use ds0; 17%2=1, use user_split_1)
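The expected target for each of the three cases above can be checked by hand with a few lines of Java. This sketch only reproduces the arithmetic of the two inline expressions; the class and method names are invented for the example, and it does not call ShardingSphere.

```java
// Illustrative only: reproduces the arithmetic of the inline expressions
// ds$->{user_id % 2} and user_split_$->{age % 2}.
class InlineRouteSketch {

    // Database chosen by the user_id column
    static String dataSourceFor(long userId) {
        return "ds" + (userId % 2);
    }

    // Table chosen by the age column
    static String tableFor(int age) {
        return "user_split_" + (age % 2);
    }

    static String route(long userId, int age) {
        return dataSourceFor(userId) + "." + tableFor(age);
    }

    public static void main(String[] args) {
        System.out.println(route(100, 18)); // case (1)
        System.out.println(route(101, 18)); // case (2)
        System.out.println(route(102, 17)); // case (3)
    }
}
```

Running main prints ds0.user_split_0, ds1.user_split_0, and ds0.user_split_1, which matches cases (1) to (3).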

(4) Query results

When a query filters on the shard keys, it is routed to one specific table in one specific database. If the condition contains no shard key, full routing is performed, that is, all four tables across both databases are queried.
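The difference between precise routing and full routing can be sketched as follows. This is a simplified model under the configuration used in this article (two databases keyed by user_id, two tables keyed by age), not ShardingSphere's routing engine; a null argument stands for a shard key that is missing from the query condition.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: shows how the set of candidate data nodes shrinks
// as more shard keys appear in the query condition.
class QueryRouteSketch {

    // A null argument means the key is absent from the WHERE clause.
    static List<String> candidates(Long userId, Integer age) {
        List<String> nodes = new ArrayList<>();
        for (int ds = 0; ds < 2; ds++) {
            if (userId != null && userId % 2 != ds) {
                continue; // database excluded by user_id
            }
            for (int t = 0; t < 2; t++) {
                if (age != null && age % 2 != t) {
                    continue; // table excluded by age
                }
                nodes.add("ds" + ds + ".user_split_" + t);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println(candidates(100L, 18));   // both keys present
        System.out.println(candidates(100L, null)); // only user_id present
        System.out.println(candidates(null, null)); // no shard key
    }
}
```

With both keys the sketch returns a single node, with only user_id it returns both tables of ds0, and with no key it returns all four nodes.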

Sharding-JDBC does all of this for us. Based on the configured sharding strategy, each statement goes through SQL parsing => executor optimization => SQL routing => SQL rewriting => SQL execution => result merging. See the kernel analysis section of the official documentation for details.

Official original text: https://shardingsphere.apache.org/document/legacy/4.x/document/cn/manual/sharding-jdbc/usage/sharding/

Source code: https://github.com/oycyqr/springboot-learning-demo/tree/master/springboot-shardingsphere-split

Origin blog.csdn.net/u014553029/article/details/109274210