How Freyja2 Handles Database and Table Sharding

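In Freyja2, every sharded table must designate one column as its sharding key.

Suppose UserProperty has four fields: id, uid, pid, and num, where uid is the user ID. We shard this table by uid.

There are 2 databases, and each table is split into 5 tables per database.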

db_0:
t_user_0,t_user_1,t_user_2,t_user_3,t_user_4

t_user_property_0,t_user_property_1,t_user_property_2,t_user_property_3,t_user_property_4


db_1:
t_user_0,t_user_1,t_user_2,t_user_3,t_user_4

t_user_property_0,t_user_property_1,t_user_property_2,t_user_property_3,t_user_property_4

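Freyja2 hides the database and table sharding logic from the upper layers.

During development, insert, update, select, and delete operations are written exactly as they would be in a non-sharded project.

Insert a user record: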

insert user(name)values(?)

If user uses the name column as its sharding key, the location of the record whose name is ? is computed as follows:

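	public DbResult getShardingTableName(Object value) {
		String tableName = getTableName();

		// Fold the key's hash code into a non-negative number.
		int hashCode = value.hashCode();
		if (hashCode == Integer.MIN_VALUE) {
			// -Integer.MIN_VALUE overflows and stays negative, so map it to 0.
			hashCode = 0;
		} else if (hashCode < 0) {
			hashCode = -hashCode;
		}
		// The same hash selects both the database and the table inside it.
		int dbNo = hashCode % ShardingUtil.engine.getDbNum();
		int tableNo = hashCode % ShardingUtil.engine.getTableNum();
		tableName = tableName + "_" + tableNo;
		return new DbResult(tableName, tableNo, dbNo);
	}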

The record is stored in table tableNo of database dbNo.
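To make the arithmetic concrete, here is a minimal standalone sketch of the same routing, assuming the layout from the beginning of this post (2 databases, 5 tables per database). The class HashRoutingSketch and the constants DB_NUM and TABLE_NUM are hypothetical stand-ins for ShardingUtil.engine.getDbNum() and getTableNum(); this is an illustration, not Freyja2's actual code.

```java
// Hypothetical sketch of hash-based routing; not Freyja2's actual code.
final class HashRoutingSketch {
    static final int DB_NUM = 2;    // assumed: stands in for getDbNum()
    static final int TABLE_NUM = 5; // assumed: stands in for getTableNum()

    /** Returns {dbNo, tableNo} for a sharding-key value. */
    static int[] route(Object keyValue) {
        int hash = keyValue.hashCode();
        if (hash == Integer.MIN_VALUE) {
            hash = 0; // negating Integer.MIN_VALUE would overflow and stay negative
        } else if (hash < 0) {
            hash = -hash;
        }
        return new int[] { hash % DB_NUM, hash % TABLE_NUM };
    }

    public static void main(String[] args) {
        // "ab".hashCode() is 97 * 31 + 98 = 3105, so dbNo = 1 and tableNo = 0.
        int[] r = route("ab");
        System.out.println("db_" + r[0] + ".t_user_" + r[1]); // prints db_1.t_user_0
    }
}
```

Because both numbers come from the same hash, all 10 database and table combinations are only reachable when the database count and the table count share no common factor, as 2 and 5 do here.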

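The primary key of user, id, is auto-incremented, so the record automatically receives an id value. From then on, a query

written either as select * from User where id = ? or as select * from User where name = ?

can locate the dbNo and tableNo that hold the record.

idSubNum is the capacity of each table, so the id alone is enough to work out the database and table: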

	public DbResult getShardingTableNameById(Object idValue) {
		String tableName = getTableName();
		if (idValue == null || !isSubTable()) {
			// No id given, or the table is not sharded: use the plain table name.
			return new DbResult(tableName, -1, -1);
		}
		// n is the global index of the id range that contains this id.
		int n = ((Integer) idValue) / ShardingUtil.engine.getIdSubNum();
		int dbNo = n / ShardingUtil.engine.getTableNum();
		int tableNo = n % ShardingUtil.engine.getTableNum();
		tableName = tableName + "_" + tableNo;
		return new DbResult(tableName, tableNo, dbNo);
	}
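A worked example may help here. The sketch below repeats the id arithmetic with made-up numbers: 5 tables per database and an idSubNum of 1000. The class IdRoutingSketch, its constants, and the helper firstIdOf are all hypothetical and exist only for illustration.

```java
// Hypothetical sketch of id-range routing; the numbers are illustrative only.
final class IdRoutingSketch {
    static final int TABLE_NUM = 5;     // assumed: tables per database
    static final int ID_SUB_NUM = 1000; // assumed: ids per table (idSubNum)

    /** Returns {dbNo, tableNo} for a primary-key value. */
    static int[] routeById(int id) {
        int n = id / ID_SUB_NUM; // global index of the id range holding this id
        return new int[] { n / TABLE_NUM, n % TABLE_NUM };
    }

    /** Inverse view: the smallest id that routes to the given database and table. */
    static int firstIdOf(int dbNo, int tableNo) {
        return (dbNo * TABLE_NUM + tableNo) * ID_SUB_NUM;
    }

    public static void main(String[] args) {
        int[] r = routeById(7300); // n = 7, so dbNo = 7 / 5 = 1 and tableNo = 7 % 5 = 2
        System.out.println("db_" + r[0] + ".t_user_" + r[1]); // prints db_1.t_user_2
        System.out.println(firstIdOf(1, 2));                  // prints 7000
    }
}
```

Read this way, each physical table would need its auto-increment counter to start inside its own range (7000 for db_1.t_user_2 in this example). That is an inference from the formula rather than something shown in the code above.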

That covers single-table sharding. What about join queries?

Suppose the original application has a query like this:

select * from t_user_property p left join t_user u on p.uid = u.id where u.name = ?

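In a sharded application, this SQL obviously cannot be executed as written.

Two points were made earlier:

1. Freyja2 hides the sharding logic from the upper layers.

2. A user's name and id map to the same database and table.

With Freyja2 the SQL is still written this way. Freyja resolves the dbNo and tableNo of the user from the value of name

and regenerates the sharded SQL (assuming tableNo is 3; dbNo is only used to pick the corresponding data source):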

select * from t_user_property_3 p left join t_user_3 u on p.uid = u.id where u.name = ?

Because t_user_property is also sharded by uid, its database and table are the same as those of t_user.
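The rewriting step can be sketched roughly as follows. This is a naive illustration that appends the table number to whole-word table names with a regular expression; the class SqlRewriteSketch and its rewrite method are hypothetical, and Freyja2's real SQL handling may work quite differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of table-name rewriting; not Freyja2's actual parser.
final class SqlRewriteSketch {
    /** Appends "_<tableNo>" to every whole-word occurrence of the logical table names. */
    static String rewrite(String sql, List<String> logicalTables, int tableNo) {
        String result = sql;
        for (String table : logicalTables) {
            // \b treats '_' as a word character, so t_user does not match inside t_user_property.
            Pattern p = Pattern.compile("\\b" + Pattern.quote(table) + "\\b");
            String replacement = Matcher.quoteReplacement(table + "_" + tableNo);
            result = p.matcher(result).replaceAll(replacement);
        }
        return result;
    }

    public static void main(String[] args) {
        String sql = "select * from t_user_property p left join t_user u "
                + "on p.uid = u.id where u.name = ?";
        System.out.println(rewrite(sql, Arrays.asList("t_user", "t_user_property"), 3));
    }
}
```

A regular expression like this would also rewrite a table name that happens to appear inside a string literal, so a real implementation would more likely work on parsed SQL.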

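There are two slightly more complicated cases:

1. A single-table query that does not include the key column

2. A join query that joins on a non-key column

In the first case, Freyja scans every database and table and merges the rows into one result set. For example: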

select * from t_user where gold > 0
becomes
db_0:
select * from t_user_0 where gold > 0;
select * from t_user_1 where gold > 0;
select * from t_user_2 where gold > 0;
select * from t_user_3 where gold > 0;
select * from t_user_4 where gold > 0;
db_1:
select * from t_user_0 where gold > 0;
select * from t_user_1 where gold > 0;
select * from t_user_2 where gold > 0;
select * from t_user_3 where gold > 0;
select * from t_user_4 where gold > 0;
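Generating that list of statements can be sketched as below. The class FanOutSketch, the fanOut method, and the {table} placeholder are hypothetical names chosen for this illustration; the sketch only builds the per-shard SQL and does not execute anything.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a full-scan plan; not Freyja2's actual code.
final class FanOutSketch {
    /** Builds one statement per physical table, grouped by database name. */
    static Map<String, List<String>> fanOut(String sqlTemplate, String table,
                                            int dbNum, int tableNum) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (int db = 0; db < dbNum; db++) {
            List<String> statements = new ArrayList<>();
            for (int t = 0; t < tableNum; t++) {
                // {table} is an assumed placeholder for the logical table name.
                statements.add(sqlTemplate.replace("{table}", table + "_" + t));
            }
            plan.put("db_" + db, statements);
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, List<String>> plan =
                fanOut("select * from {table} where gold > 0", "t_user", 2, 5);
        for (Map.Entry<String, List<String>> e : plan.entrySet()) {
            System.out.println(e.getKey() + ":");
            for (String sql : e.getValue()) {
                System.out.println(sql + ";");
            }
        }
    }
}
```

With 2 databases and 5 tables each, a single logical query turns into 10 physical queries, which is why queries without the key column are expensive.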

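In the second case:

This case is relatively rare, and a join cannot span data sources. Once a table is sharded, the key column generally becomes the critical column and all SQL has to revolve around it. Handling this case requires some workarounds, and I have not yet dug into how to process this kind of SQL.

Sorting and paging matter mainly for queries on non-key columns. If the SQL includes the key column, sorting and paging behave just as in an ordinary query and are done by the database. But once every database has to be scanned, Freyja2 has to do the sorting and paging itself when it returns the result set.

In the project, the data is split across 2 sharded databases, which makes 3 databases in total: db, db_0, and db_1. db_0 and db_1 store the tables that need sharding, while db holds the tables that do not. After all, not every table needs to be sharded.

As for data migration and capacity expansion, I have not yet come up with a good approach.

I have always thought of sharding as the database's own job. It should be possible to extract the sharding part of Freyja and move it down into the driver layer, decoupling it further from the business code.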
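The in-memory step for a full scan might look like the sketch below: collect the rows from every shard, sort them globally, then cut out the requested page. The class MergePageSketch and its mergeSortPage method are hypothetical; this shows the general technique, not Freyja2's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of sorting and paging across shards; not Freyja2's actual code.
final class MergePageSketch {
    /** Merges per-shard rows, sorts them globally, then applies offset and limit. */
    static <T> List<T> mergeSortPage(List<List<T>> shardResults,
                                     Comparator<? super T> order,
                                     int offset, int limit) {
        List<T> all = new ArrayList<>();
        for (List<T> rows : shardResults) {
            all.addAll(rows);
        }
        all.sort(order);
        // Clamp the page window so out-of-range requests return an empty or shorter page.
        int from = Math.min(Math.max(offset, 0), all.size());
        int to = (int) Math.min((long) from + Math.max(limit, 0), all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    public static void main(String[] args) {
        List<List<Integer>> shards = Arrays.asList(
                Arrays.asList(5, 1), Arrays.asList(4, 2), Arrays.asList(3));
        System.out.println(mergeSortPage(shards, Comparator.<Integer>naturalOrder(), 1, 3));
        // prints [2, 3, 4]
    }
}
```

To avoid pulling every row into memory, each shard query can itself be ordered and limited to the first offset + limit rows, since no shard can contribute more than that to the requested page.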
