Building a database and table sharding plugin, step by step

Foreword

As the amount of data in a system grows, any discussion of database architecture and optimization inevitably brings up the term database and table sharding.

Of course, there are many sharding approaches, such as vertical splitting and horizontal splitting, and many middleware products, such as MyCat and ShardingJDBC.

Choosing a splitting approach that suits the business scenario, and then a familiar open source framework, helps us complete the data-splitting work in a project.

This article does not discuss these approaches and frameworks in depth. Instead, I would like to discuss another scenario:

If only one or a handful of tables in the system need to be split, is it worth introducing a relatively complex middleware product? In particular, if we do not understand how it works, can we be confident of managing it?

With this in mind, if your system has only a few tables to split and no dedicated resources to study open source components, you can implement a simple sharding plugin yourself. Of course, if your system is more complex and handles a larger business volume, it is safer to use open source components or components built by a dedicated team.

I. The principle

Sharding is simple in some ways and very complex in others...

It is simple because its core process is clear: parse the SQL statement, then rewrite it or route it to the real database and table according to the pre-defined configuration.

It is complex because SQL statements are complex and flexible: paging, deduplication, sorting, grouping, aggregation, join queries and other operations all have to be interpreted correctly.

For this reason, even ShardingJDBC lists on its official website which items are supported and which are not.

II. Annotation-based configuration

Instead of complex configuration files, we use a more lightweight annotation-based configuration, defined as follows:

@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@Documented
public @interface Sharding {
    String tableName();     // logical table name
    String field();         // shard key
    String mode();          // algorithm mode
    int length() default 0; // number of tables
}

So where is it used? For example, if the user table needs to be split, we mark the User entity with it.

@Data
@Sharding(tableName = "user",field = "id",mode = "hash",length = 16)
public class User {
    private Long id;
    private String name;
    private String address;
    private String tel;
    private String email;
}

This says that there are 16 user tables in total, and the hash of the user ID determines which one a row belongs to.

Of course, hashing is not the only option; you can also shard by date range.

@Data
@Sharding(tableName = "car",field = "creatTime",mode = "range")
public class Car {
    private long id;
    private String number;
    private String brand;
    private String creatTime;
    private long userId;
}
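
At run time the plugin has to read this configuration back from the entity class. The following is a minimal sketch of that lookup using plain reflection. The annotation has the same shape as the one above; `UserEntity` and `ShardingConfigReader` are hypothetical names used only for illustration, and the real plugin may resolve the configuration differently.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Same shape as the @Sharding annotation defined above.
@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@Documented
@interface Sharding {
    String tableName();
    String field();
    String mode();
    int length() default 0;
}

// Trimmed-down stand-in for the User entity (Lombok's @Data omitted).
@Sharding(tableName = "user", field = "id", mode = "hash", length = 16)
class UserEntity {
    Long id;
}

// Hypothetical helper, not part of the original plugin.
final class ShardingConfigReader {
    private ShardingConfigReader() {}

    /** Returns a readable summary of the sharding rule, or null if the class is not sharded. */
    static String describe(Class<?> entityType) {
        // Works only because the annotation is retained at RUNTIME.
        Sharding s = entityType.getAnnotation(Sharding.class);
        if (s == null) {
            return null;
        }
        return s.tableName() + " by " + s.field()
                + " (" + s.mode() + ", " + s.length() + " tables)";
    }
}
```

Calling `ShardingConfigReader.describe(UserEntity.class)` returns a summary of the rule, while a class without the annotation yields `null`, which an interceptor could treat as "do not rewrite this statement".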

III. Sharding algorithms

Here I have implemented two sharding strategies: HashAlgorithm and RangeAlgorithm.

1. Range sharding

If your system separates hot and cold data, we can put data into different tables by month.

For example, if a vehicle's creation time is 2019-12-10 15:30:00, the record is assigned to the car_201912 table.

We take the year-and-month part of the time and append it to the logical table name.

public class RangeAlgorithm implements Algorithm {
    @Override
    public String doSharding(String tableName, Object value,int length) {
        if (value!=null){
            try{
                // Validate the format first; a mismatch ends up in the catch block
                DateUtil.parseDateTime(value.toString());
                // "2019-12-10 15:30:00" -> "2019-12" -> "201912"
                String replace = value.toString().substring(0, 7).replace("-", "");
                String newName = tableName+"_"+replace;
                return newName;
            }catch (DateException ex){
                logger.error("Invalid time format! Argument: {}, expected format: {}",value.toString(),"yyyy-MM-dd HH:mm:ss");
                return tableName;
            }
        }
        return tableName;
    }
}
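
For readers who want to try the idea without extra dependencies, here is a sketch of the same rule built only on java.time. The `Algorithm` interface is not shown in this article, so its shape here is an assumption inferred from the `doSharding` signature, and `JavaTimeRangeAlgorithm` is a hypothetical name.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Assumed shape of the Algorithm interface; the article does not show it.
interface Algorithm {
    String doSharding(String tableName, Object value, int length);
}

// Variant of the range rule that relies only on java.time.
class JavaTimeRangeAlgorithm implements Algorithm {
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    private static final DateTimeFormatter SUFFIX =
            DateTimeFormatter.ofPattern("yyyyMM");

    @Override
    public String doSharding(String tableName, Object value, int length) {
        if (value == null) {
            return tableName;
        }
        try {
            // Parsing validates the input and gives us the month in one step.
            LocalDateTime time = LocalDateTime.parse(value.toString(), INPUT);
            return tableName + "_" + time.format(SUFFIX);
        } catch (DateTimeParseException ex) {
            // Same fallback as the code above: keep the logical table name.
            return tableName;
        }
    }
}
```

With this sketch, a creation time of 2019-12-10 15:30:00 on the logical table car resolves to car_201912, and an unparseable value falls back to car.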

2. Hash sharding

For the hash algorithm, we first check whether the number of tables is a power of 2. If it is, we compute the index with a bitwise AND; if it is not, we compute it with a modulo operation. This trick is borrowed from the HashMap source code.

public class HashAlgorithm implements Algorithm {
    @Override
    public String doSharding(String tableName, Object value,int length) {
        if (this.isEmpty(value)){
            return tableName;
        }else{
            int h;
            int hash = (h = value.hashCode()) ^ (h >>> 16);
            int index;
            if (is2Power(length)){
                index = (length - 1) & hash;
            }else {
                index = Math.floorMod(hash, length);
            }
            return tableName+"_"+index;
        }
    }
}
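
The helpers `isEmpty` and `is2Power` are not shown above. The sketch below fills them in under my own assumptions and exposes the index calculation separately so it can be checked in isolation; `HashSharding` and its method names are hypothetical.

```java
// Hypothetical helper class; names are illustrative, not the plugin's API.
final class HashSharding {
    private HashSharding() {}

    /** True when n is a positive power of two (exactly one bit set). */
    static boolean is2Power(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    /** Mixes the high 16 bits of hashCode() into the low bits, as HashMap does. */
    static int spread(Object value) {
        int h = value.hashCode();
        return h ^ (h >>> 16);
    }

    /** Maps a shard key to an index in [0, length). */
    static int indexFor(Object value, int length) {
        int hash = spread(value);
        // For a power of two, (length - 1) & hash equals floorMod(hash, length).
        return is2Power(length) ? (length - 1) & hash : Math.floorMod(hash, length);
    }

    /** Returns the physical table name, or the logical name if no index can be computed. */
    static String tableFor(String tableName, Object value, int length) {
        if (value == null || length <= 0) {
            return tableName;
        }
        return tableName + "_" + indexFor(value, length);
    }
}
```

Both branches always produce a non-negative index, even when the hash is negative: the bit mask clears the sign bit, and `Math.floorMod` (unlike `%`) never returns a negative result for a positive divisor.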

IV. Interceptors

With the configuration and the sharding algorithms in place, it is time for the main event. This is where MyBatis interceptors come in handy.

Those of us who write CRUD code all year round know that no business system escapes these four kinds of SQL. Among them, deletion is generally implemented as a logical (soft) delete, so there are basically no DELETE statements.

By comparison, INSERT and UPDATE statements are relatively simple and fixed in format, while SELECT statements tend to be more flexible and complex. So the author defines two interceptors here.

However, before introducing the interceptors, we need to get to know two other things: the SQL parser and the sharding algorithm handler.

1. JSqlParser

JSqlParser parses an SQL statement and converts it into a hierarchy of Java classes. A simple example shows how it works.

public static void main(String[] args) throws JSQLParserException {

    String insertSql = "insert into user (id,name,age) value(1001,'范闲',20)";
    Statement parse = CCJSqlParserUtil.parse(insertSql);
    Insert insert = (Insert) parse;

    String tableName = insert.getTable().getName();
    List<Column> columns = insert.getColumns();
    ItemsList itemsList = insert.getItemsList();
    System.out.println("table:"+tableName+" columns:"+columns+" values:"+itemsList);
}
Output: table:user columns:[id, name, age] values:(1001, '范闲', 20)

As we can see, JSqlParser extracts the syntactic parts of the SQL statement. We can likewise change the contents of these objects to modify the SQL statement.

2. Algorithm handler

There is more than one sharding algorithm, and which one to call is decided at run time. So we first register the algorithms in a Map and then look one up by sharding mode. This is an application of the strategy pattern.

@Component
public class AlgorithmHandler {
    private Map<String, Algorithm> algorithm = new HashMap<>();
    @PostConstruct
    public void init(){
        algorithm.put("range",new RangeAlgorithm());
        algorithm.put("hash",new HashAlgorithm());
    }
    public String handler(String mode,String name,Object value,int length){
        return algorithm.get(mode).doSharding(name, value,length);
    }
}

3. Interceptors

As we know, MyBatis allows you to intercept calls at certain points during the execution of a mapped statement.

If you are not familiar with how this works, see the author's article on how MyBatis interceptors work.

Overall, the process is as follows:

  • MyBatis intercepts the SQL to be executed;
  • JSqlParser parses the SQL and extracts the logical table name;
  • The sharding algorithm is called to obtain the real table name;
  • The SQL is modified and written back to BoundSql;
  • MyBatis executes the modified SQL.

For example, the core code for an insert statement is as follows:

String sql = boundSql.getSql();
Statement statement = CCJSqlParserUtil.parse(sql);
Insert insert = (Insert) statement;
Table table = insert.getTable();
String newName = this.handler.handler(mode, table.getName(), value,length);
table.setName(newName);
ReflectionUtil.setField(boundSql,"sql",insert.toString());
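
The last line of that snippet overwrites the SQL through reflection, which suggests the field has no setter. The sketch below shows what such a call could boil down to, using a stand-in class instead of the real BoundSql so that it runs without MyBatis. `FakeBoundSql` and `FieldWriter` are hypothetical, and treating the field as private and final is my assumption.

```java
import java.lang.reflect.Field;

// Stand-in for BoundSql: the SQL lives in a private final field with no setter.
class FakeBoundSql {
    private final String sql;

    FakeBoundSql(String sql) {
        this.sql = sql;
    }

    String getSql() {
        return sql;
    }
}

// Hypothetical equivalent of the ReflectionUtil.setField call used above.
final class FieldWriter {
    private FieldWriter() {}

    static void setField(Object target, String name, Object value) {
        try {
            Field field = target.getClass().getDeclaredField(name);
            // Lifts the access checks for this private (and final) instance field.
            field.setAccessible(true);
            field.set(target, value);
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException("Cannot set field " + name, ex);
        }
    }
}
```

After `FieldWriter.setField(boundSql, "sql", rewrittenSql)`, a later `getSql()` call sees the rewritten statement. Writing to final fields through reflection is a workaround rather than a supported extension point, so it is worth re-testing after upgrading the JDK or MyBatis.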

V. Queries and pagination

In fact, inserts and updates are relatively simple; query statements are more complex.

However, the plugin does not try to support every kind of query; it can be extended and modified based on real business scenarios.

Paging, however, is basically unavoidable. Take PageHelper: it is also implemented through MyBatis interceptors, so using it together with our sharding plugin may cause conflicts.

So the author built pagination into the sharding plugin. It works basically the same way as PageHelper but does not use it directly. In addition, for queries, whether the query conditions include the shard key is also critical.

1. Queries

With the range algorithm, the business only needs to query data for a specific month or for recent months; with the hash algorithm, we require every query to carry the primary key.

But the second condition often does not hold: the business side cannot guarantee that every query carries the primary key.

In that case, we can only iterate over all the tables, query the data that matches the conditions, and return the merged result:

for (int i=0;i<sharding.length();i++){
    Statement parse = CCJSqlParserUtil.parse(boundSql.getSql());
    sqlParser.processSelect(parse,i);
    cacheKey.update(new Object());
    List<E> query = ExecutorUtil.query(parse.toString());
    result.addAll(query);
}

The drawback of this approach is obvious: poor performance. Another common approach is to maintain a mapping between the query fields and the shard key: first look up the shard key value from the query conditions, then query by the shard key.
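
To make the cost of the full scan concrete, here is a minimal sketch of the scatter-and-merge loop with the database call abstracted into a function. `ShardScanner` and `queryOneTable` are hypothetical names; in the plugin the per-table call would go through the rewritten SQL and the MyBatis executor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: runs the same query against every physical table and merges the rows.
final class ShardScanner {
    private ShardScanner() {}

    static <E> List<E> queryAll(String logicalTable, int length,
                                Function<String, List<E>> queryOneTable) {
        List<E> result = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            // One round trip per physical table: user_0, user_1, ...
            result.addAll(queryOneTable.apply(logicalTable + "_" + i));
        }
        return result;
    }
}
```

The number of round trips grows linearly with the number of tables, and the merged list is only ordered within each table, so any global ORDER BY would have to be re-applied in memory.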

2. Pagination

As noted above, the plugin has pagination built in. The process and implementation are the same as PageHelper's, but PageHelper is not used directly because of the conflict.

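private <E> List<E> queryPage(){
    // 1. Count the total number of matching rows
    Long count = this.getCount();
    page.setTotal(count.intValue());
    // 2. Derive the number of pages from the total
    page.setCountPage();
    // 3. Append the LIMIT clause and run the paged query
    String limitSql = getLimitSql(page,sql);
    List<E> query = ExecutorUtil.query(limitSql);
    page.addAll(query);
    return page;
}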
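
The two calculations behind that flow, the page count and the LIMIT clause, can be sketched as follows. `PageSql` is a hypothetical helper, the `LIMIT offset, size` form is MySQL syntax (my assumption about the target database), and the real `getLimitSql` and `setCountPage` may differ.

```java
// Hypothetical paging helper; "LIMIT offset, size" assumes MySQL.
final class PageSql {
    private PageSql() {}

    /** Number of pages needed for total rows, rounding up. */
    static int pageCount(long total, int pageSize) {
        return (int) ((total + pageSize - 1) / pageSize);
    }

    /** Appends a LIMIT clause for a 1-based page number. */
    static String limitSql(String sql, int pageNum, int pageSize) {
        long offset = (long) (pageNum - 1) * pageSize;
        return sql + " LIMIT " + offset + ", " + pageSize;
    }

    /** Wraps a query so that it returns the number of rows instead of the rows. */
    static String countSql(String sql) {
        return "SELECT COUNT(*) FROM (" + sql + ") tmp_count";
    }
}
```

For example, page 3 with a page size of 10 skips the first 20 rows, so `limitSql("SELECT * FROM user_3", 3, 10)` appends `LIMIT 20, 10`.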

VI. Miscellaneous

In fact, the title of this article gave me quite a headache. 分库分表 (database and table sharding) is the established industry term, but this plugin only splits tables and does not involve splitting databases. Since the focus of the article is the idea, I ended up using the term anyway. Dear readers, please forgive me and do not call it clickbait.

Space is limited, so the article contains only a small amount of code. If you are interested, see https://github.com/taoxun/sharding for a full demo.

The author's code includes test cases and the table-creation SQL; once the tables are created, you can run the project directly.


Origin juejin.im/post/5dfc6cc0518825126f3735d7