Database Sharding (Splitting Databases and Tables) Series (3): Framework vs. In-House Development, and Where to Implement Sharding

Once the team has carefully mapped out the system's business and database and settled on a partitioning plan, the next question is how to implement it. Many open source sharding frameworks and products are available for reference, and many teams also choose to develop their own implementation. Whether a team adopts a framework or builds in-house, it faces the question of which layer should hold the sharding logic. This article analyzes these questions one by one. Link to the original article: http://blog.csdn.net/bluishglc/article/details/7766508 Please credit the source when reprinting!

1. Implementation level of sharding logic

 

From the perspective of a system's program architecture, sharding logic can be implemented at four levels: the DAO layer, the JDBC API layer, the Spring data access encapsulation layer (the various Spring templates) between DAO and JDBC, and a sharding proxy server between the application server and the database.

 

Figure 1. Sharding implementation level and related frameworks/products

  • Implemented at the DAO layer

When a team decides to implement sharding itself, the DAO layer is likely the preferred place to embed the sharding logic. At this level each DAO method knows exactly which table it accesses and with which query parameters, so it can locate the target shard directly from that information, with no need to parse SQL and route by configured rules as a framework does. Another advantage is freedom from ORM constraints. Most applications today depend on some ORM framework in the data access layer, and most sharding frameworks either support no ORM framework or support only one, which severely constrains framework selection and use. A self-built implementation has no such problem, and different shards can even work with different ORM frameworks. For example, most Java applications currently use Hibernate, but there is no fully satisfactory Hibernate-based sharding framework (Hibernate Shards is discussed below), so many teams choose to implement sharding themselves.

 

To summarize briefly, the advantages of implementing sharding at the DAO layer are: it is not restricted by the ORM framework, it is relatively simple to implement, it is easy to customize to the system's characteristics, and it needs no SQL parsing or routing-rule matching, so performance is slightly better. The disadvantages are: there is a certain technical threshold, the workload is larger than relying on a framework (although a framework has its own learning cost), and the result is not general-purpose and works only in one specific system. Of course, the sharding logic at the DAO layer could also be externalized through XML configuration or annotations to form a general framework. However, no such framework exists yet.
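As a rough illustration of the DAO-layer approach, the following is a minimal sketch, not a production design. The names `Shard`, `OrderDao`, and `DaoShardingDemo`, the in-memory maps standing in for databases, and the "user ID modulo shard count" rule are all assumptions made for this example; none of them come from a particular framework.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one physical database; a real DAO would hold a
// DataSource (or ORM session factory) per shard instead of an in-memory map.
class Shard {
    final String name;
    final Map<Long, List<String>> ordersByUser = new HashMap<>();

    Shard(String name) {
        this.name = name;
    }
}

// Hypothetical DAO: every method already knows its table and sharding key,
// so it picks the target shard directly, without parsing any SQL.
class OrderDao {
    private final List<Shard> shards;

    OrderDao(List<Shard> shards) {
        this.shards = shards;
    }

    // Assumed rule for this sketch: shard index = userId mod shard count.
    Shard locate(long userId) {
        int index = (int) Math.floorMod(userId, (long) shards.size());
        return shards.get(index);
    }

    void addOrder(long userId, String item) {
        locate(userId).ordersByUser
                .computeIfAbsent(userId, k -> new ArrayList<>())
                .add(item);
    }

    List<String> findOrders(long userId) {
        return locate(userId).ordersByUser
                .getOrDefault(userId, Collections.emptyList());
    }
}

class DaoShardingDemo {
    public static void main(String[] args) {
        List<Shard> shards = new ArrayList<>();
        shards.add(new Shard("db0"));
        shards.add(new Shard("db1"));
        OrderDao dao = new OrderDao(shards);

        dao.addOrder(7L, "book"); // 7 mod 2 = 1, so this lands in db1
        dao.addOrder(8L, "pen");  // 8 mod 2 = 0, so this lands in db0

        System.out.println(dao.locate(7L).name + " " + dao.findOrders(7L));
    }
}
```

The routing decision sits in one method (`locate`), which is what makes the approach easy to customize, and also what ties it to this one system.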

  • Implemented at the ORM framework layer

There are two directions for implementing sharding at the ORM framework layer. One is to provide sharding support on top of an O/R mapping implementation, positioning the product as a distributed data access framework; the representative here is guzz. The other is to add a sharding mechanism by modifying and enhancing an existing ORM framework; the representative product is Hibernate Shards. Given Hibernate's mainstream status, the industry has a pressing need for a Hibernate sharding framework, but Hibernate Shards as it stands is not satisfactory, mainly because it places too many restrictions on how Hibernate can be used, such as its very limited HQL support. As for MyBatis, there is no mature sharding framework yet. Some have proposed using the MyBatis plug-in mechanism to implement sharding, but unfortunately that mechanism cannot control connections across multiple data sources. With no framework to draw on for MyBatis, a team may have to do the work at the DAO layer or in the Spring template classes.

  • Implemented at the JDBC API layer

The JDBC API layer is a place many people think of for implementing sharding. A JDBC API implementation with built-in sharding logic would make sharding completely transparent to the entire application, and such an implementation could serve directly as a general-purpose sharding product. However, the technical threshold and workload of this approach are clearly beyond what an ordinary team can take on, so essentially no team implements sharding at this level, and there is no open source product of this kind. I know of only one commercial product, dbShards, that uses this approach.

  • Implemented in Spring data access encapsulation layer between DAO and JDBC

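Now that Spring is so prevalent, almost no application built on the Java platform goes without it. Between DAO and JDBC, Spring provides various templates that manage the creation and release of resources and their synchronization with transactions, and most Spring-based applications use template classes as the entry point for data access. This gives us another opportunity to embed sharding logic: provide a template class with the sharding logic built in. In effect this approach is essentially the same as the JDBC API approach: it is likewise transparent to upper-layer code and allows a smooth transition during a sharding migration. It is simpler to implement than the JDBC API approach, however, so a number of frameworks have chosen it. Cobar Client, open-sourced by the Alibaba Group Research Institute, is one implementation of this approach.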
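The template idea can be sketched in plain Java. This is only an illustration of the pattern under stated assumptions: `ShardingTemplate`, `ShardConnection`, and the modulo routing rule are hypothetical names invented here, and the code uses neither Spring's real template classes nor Cobar Client's API.

```java
import java.util.function.Function;

// Hypothetical resource standing in for a connection to one shard.
class ShardConnection {
    final String shardName;
    boolean closed = false;

    ShardConnection(String shardName) {
        this.shardName = shardName;
    }
}

// Hypothetical template: the caller passes a routing key and a callback;
// the template chooses the shard, acquires the resource, runs the callback,
// and always releases the resource afterwards.
class ShardingTemplate {
    private final String[] shardNames;
    ShardConnection lastConnection; // kept only so the demo can inspect it

    ShardingTemplate(String... shardNames) {
        this.shardNames = shardNames.clone();
    }

    <T> T execute(long routingKey, Function<ShardConnection, T> action) {
        // Assumed rule for this sketch: index = routingKey mod shard count.
        int index = (int) Math.floorMod(routingKey, (long) shardNames.length);
        ShardConnection conn = new ShardConnection(shardNames[index]);
        lastConnection = conn;
        try {
            return action.apply(conn);
        } finally {
            conn.closed = true; // release happens here, not in DAO code
        }
    }
}

class TemplateShardingDemo {
    public static void main(String[] args) {
        ShardingTemplate template = new ShardingTemplate("db0", "db1", "db2");
        // The DAO code only supplies the key and the work to do.
        String result = template.execute(7L, conn -> "ran on " + conn.shardName);
        System.out.println(result);
    }
}
```

Because DAO code already goes through the template, swapping in a sharding-aware template leaves the calling code largely unchanged, which is the transparency the paragraph above describes.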

  • Implemented through a proxy between the application server and the database

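A proxy is placed between the application server and the database. Data requests from the application to the database first pass through the proxy, which parses the SQL and routes it to the target shard according to configured routing rules. Because this approach is completely transparent to the application and highly general, many sharding products have adopted it. The better-known products here are MySQL's official proxy tool, MySQL Proxy, and Amoeba, a product developed in China. MySQL Proxy itself implements no sharding logic; it is simply a proxy for MySQL databases that gives developers a place to embed sharding logic. It uses Lua as its programming language, which is something many teams need to consider. Amoeba is a proxy product built specifically for read/write splitting and sharding. It is very simple to use: it requires no programming, only XML configuration. However, Amoeba's lack of transaction support has always been a serious flaw (when a request carrying transaction information arrives from the application, Amoeba strips that information, so even access to a single data node has no transaction). Of course, this depends on the product's positioning and design philosophy; we can only say that Amoeba is unsuitable for systems with very high transaction requirements.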
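The "parse the SQL, then route by configured rules" step can be illustrated with a deliberately naive sketch. `RoutingRule` and `NaiveSqlRouter` are hypothetical names, the regular expression is a stand-in for the full SQL parser a real proxy would need, and nothing here reflects how MySQL Proxy or Amoeba is actually implemented.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical routing rule of the kind a proxy might load from configuration:
// statements on `table` are routed by the value of `column`.
class RoutingRule {
    final String table;
    final String column;
    final int shardCount;

    RoutingRule(String table, String column, int shardCount) {
        this.table = table;
        this.column = column;
        this.shardCount = shardCount;
    }
}

// Naive router: a regular expression stands in for real SQL parsing.
class NaiveSqlRouter {
    static final int BROADCAST = -1; // no single shard could be determined

    private final RoutingRule rule;
    private final Pattern keyPattern;

    NaiveSqlRouter(RoutingRule rule) {
        this.rule = rule;
        // Matches an equality predicate such as "user_id = 42".
        this.keyPattern = Pattern.compile(
                "\\b" + Pattern.quote(rule.column) + "\\s*=\\s*(\\d+)",
                Pattern.CASE_INSENSITIVE);
    }

    // Returns the target shard index, or BROADCAST when the statement does
    // not mention the table or carries no usable sharding key.
    int route(String sql) {
        String lowered = sql.toLowerCase(Locale.ROOT);
        if (!lowered.contains(rule.table.toLowerCase(Locale.ROOT))) {
            return BROADCAST;
        }
        Matcher matcher = keyPattern.matcher(sql);
        if (!matcher.find()) {
            return BROADCAST;
        }
        long key = Long.parseLong(matcher.group(1));
        return (int) (key % rule.shardCount);
    }
}

class ProxyRoutingDemo {
    public static void main(String[] args) {
        NaiveSqlRouter router =
                new NaiveSqlRouter(new RoutingRule("orders", "user_id", 4));
        System.out.println(router.route("SELECT * FROM orders WHERE user_id = 42"));
        System.out.println(router.route("SELECT * FROM orders"));
    }
}
```

Even this toy version shows where the cost of the proxy approach comes from: every statement must be inspected before it can be forwarded, and statements without a sharding key have to be handled as a separate case.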

2. Use a framework or develop in-house?

The discussion above has listed a number of open source frameworks and products. To recap: MySQL Proxy and Amoeba, which are proxy-based; Hibernate Shards, which is based on the Hibernate framework; and Cobar Client, which rewrites Spring's iBATIS template class. Each of these has its own strengths and shortcomings, and an architect can choose among them after in-depth research, based on the actual situation of the project. In general, though, I am cautious about adopting a framework. On the one hand, most frameworks lack validation through successful cases, so their maturity and stability are questionable. On the other hand, whether open source frameworks that came out of successful commercial products (such as some open source projects from Alibaba and Taobao) suit your project also requires in-depth research and analysis by the architect. Ultimately, the choice must weigh the project's characteristics, the team's situation, the technical threshold, and learning costs together.
