Comparison of database middleware for sharding (splitting databases and tables)

Partitioning: transparent to the business. Partitioning only divides the files that store a table's data into many smaller pieces. For example, a MyISAM table in MySQL corresponds to three files: .MYD, .MYI, and .frm.

The data file (.MYD) and the index file (.MYI) are split according to a partitioning rule. The partitioned table is still a single table. Partitions can be placed on different hard disks, but not on different servers.

  • Advantages: there is only one copy of the data, so no replication is required, and performance is better.
  • Disadvantages: the partitioning strategy must be planned carefully so that queries do not have to associate data across partitions. Each partition is a single point of failure: if one partition becomes unavailable, the whole system is affected.
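
As a rough illustration of "split according to a partitioning rule", the sketch below maps a row to a partition by range on a date column. The partition names and boundaries are made up for the example, and this is not how MySQL implements partitioning internally.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical exclusive upper bounds of each partition, in ascending order.
BOUNDS = [date(2016, 1, 1), date(2017, 1, 1), date(2018, 1, 1)]
# One more name than bounds: the last partition catches everything else.
NAMES = ["p2015", "p2016", "p2017", "pmax"]


def partition_for(created: date) -> str:
    """Return the name of the partition that would hold a row with this date."""
    return NAMES[bisect_right(BOUNDS, created)]
```

A query that filters on a single year only needs to touch one partition, which is why the choice of rule matters so much.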

 

Sharding (fragmentation): transparent to the business. Physically, the data is split across multiple servers, with different shards on different servers.

Personally, I see no difference between this and database sharding as described further down; only the name differs. One open question worth raising: are the concept and the handling of sharding the same in relational databases and NoSQL databases?

Readers are encouraged to look up the relevant material to answer that question.

 

Sub-tables (table splitting): when a table grows past a certain size, processing performance becomes insufficient, and the only remaining option is to split the table. That is, the data is divided among multiple tables according to a splitting rule, in the same way a database is split.

In this way one large table becomes several small tables, and no row appears in more than one sub-table, which improves processing efficiency.

There are two options for sub-tables:

1. Sub-tables in the same database: all sub-tables live in one database. Because table names within a database must be unique, each sub-table needs a different name.

  • Advantages: since everything is in one database, common (shared) tables do not need to be copied, and processing is simpler.
  • Disadvantages: since everything is still in one database, bottlenecks in CPU, memory, file IO, and network IO are not relieved; only the number of rows per table is reduced.

      The differing table names also complicate later processing (the MySQL MERGE storage engine is one way to handle this).

2. Sub-tables in different databases: since the sub-tables are in different databases, the same table name can be used in each.

  • Advantages: bottlenecks in CPU, memory, file IO, and network IO can be relieved effectively; the table names are identical, so processing is relatively simple.
  • Disadvantages: common (shared) tables must be replicated and synchronized, because every sub-table uses them.

    Cross-shard operations such as join, group by, and order by are difficult to perform.
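
The two options differ mainly in where the shard number ends up: in the table name or in the database name. A minimal sketch, assuming a simple modulo rule and hypothetical names (`shop`, `orders`, four shards):

```python
from typing import Tuple

SHARDS = 4  # hypothetical number of sub-tables (option 1) or databases (option 2)


def same_db_target(user_id: int) -> Tuple[str, str]:
    """Option 1: one database, each sub-table has a different name."""
    return ("shop", f"orders_{user_id % SHARDS}")


def multi_db_target(user_id: int) -> Tuple[str, str]:
    """Option 2: one database per shard, identical table name in each."""
    return (f"shop_{user_id % SHARDS}", "orders")
```

With option 1 the SQL text itself has to change per row (`orders_2` versus `orders_3`); with option 2 only the connection changes, which is part of why the second option is simpler to process.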

Reference blog: http://www.cnblogs.com/langtianya/p/4997768.html, http://blog.51yip.com/mysql/949.html

 

Database sharding: table splitting and partitioning both separate data within a single database. They improve performance to a degree, but as the volume of business data grows, all data still lives in one database, and network IO and file IO are concentrated there, so CPU, memory, file IO, and network IO can each become a system bottleneck.

When the data volume of the business system approaches or exceeds the capacity of a single server, or the QPS/TPS approaches or exceeds the processing limit of a single database instance, a combination of vertical and horizontal splitting is commonly used to distribute data services and data storage across multiple database servers.

"Splitting the database" is only a colloquial term; the more standard name is data sharding. It is implemented under the guidance of distributed-database theory and aims to make data services and data storage fully transparent to applications.
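
Combining the two directions can be pictured as a two-step lookup: a vertical split decides which group of databases owns a table, and a horizontal split decides which shard inside that group owns a row. The table names, group names, and shard counts below are hypothetical, and the modulo rule is only one possible choice.

```python
# Vertical split: each business area gets its own database group (hypothetical names).
VERTICAL = {"orders": "trade", "order_items": "trade", "users": "account"}
# Horizontal split: number of shards inside each group (hypothetical sizes).
SHARD_COUNT = {"trade": 4, "account": 2}


def locate(table: str, shard_key: int) -> str:
    """Return the database that holds the row of `table` with this shard key."""
    group = VERTICAL[table]                       # vertical step
    return f"{group}_{shard_key % SHARD_COUNT[group]}"  # horizontal step
```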

 

Read-write separation scheme

Read-write separation improves data processing capacity for storing and accessing massive data. Its defining feature is that the database keeps multiple copies.

Write operations are concentrated on one database, while some read operations are spread to the other databases. At the cost of replicating the data, the processing load is spread over multiple databases, which greatly improves data processing capacity.
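
A minimal sketch of the routing decision, not taken from any of the products discussed here. The class name, the node names, and the rule that reads inside a transaction stay on the primary (to avoid reading stale replicated data) are assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread plain reads over the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; None means every statement goes to the primary.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str, in_transaction: bool = False) -> str:
        is_read = sql.lstrip().lower().startswith(("select", "show"))
        if is_read and not in_transaction and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

This is why only "some" reads move: anything that must see its own writes immediately still has to go to the database that accepted the write.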

 

 


  

 

1>Cobar is middleware that provides distributed services for a relational database (MySQL). It lets a traditional database scale out close to linearly while still looking like a single database, so it is transparent to applications.

Cobar sits between the front-end application and the actual databases as a proxy. The interface it exposes to the front end is the MySQL communication protocol. It rewrites each front-end SQL statement, sends it to the appropriate back-end database shards according to the data distribution rules, then merges the results and returns them, simulating the behavior of a single database.

Cobar is a middle-tier solution: it places a proxy layer between the application and MySQL, so every request is forwarded one extra time. A JDBC-based solution has no extra hop, because the application connects directly to the database, which gives it a slight performance advantage.

This does not mean the middle tier is necessarily inferior to a direct client connection. Many factors besides performance matter, and a middle tier makes functions such as monitoring, data migration, and connection management easier to implement.

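Cobar came from Alibaba's B2B business group. It started in 2008, served at Alibaba for more than three years, managed the schemas of 3000+ MySQL databases, and the cluster handled more than 5 billion online SQL requests per day.

Cobar stopped being maintained after its initiator left the company. Later middleware of the same kind, such as MyCAT, was built on top of Cobar, and DRDS, now in service at Alibaba, also reuses code from Cobar-Proxy.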

 

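2>MyCAT is a secondary development by community enthusiasts on top of Alibaba's Cobar. It fixed some of the problems Cobar had at the time and added many new features. The MyCAT community is currently very active, and a number of companies already use MyCAT. Overall it is well supported and expected to keep being maintained. In its current version it is no longer just a MySQL proxy.

Its back end supports mainstream databases such as MySQL, SQL Server, Oracle, DB2, and PostgreSQL, as well as newer NoSQL stores such as MongoDB, with more storage types planned.

MyCAT is a powerful database middleware. It can be used not only for read-write separation, table and database splitting, and disaster recovery management, but also for multi-tenant application development and cloud platform infrastructure, giving an architecture a lot of adaptability and flexibility.

With the upcoming MyCAT intelligent optimization module, data access bottlenecks and hotspots in the system become visible at a glance. Based on these statistics, the back-end storage can be adjusted automatically or manually, mapping different tables to different storage engines, without changing a single line of application code.

MyCAT improves on Cobar in two notable ways: the back end moved from BIO to NIO, which greatly raises concurrency, and result merging was added for Order By, Group By, Limit, and similar clauses.

(Cobar also accepts Order By, Group By, and Limit syntax, but it does not merge the results; it simply returns them to the front end, and the business system has to do the aggregation itself.)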
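
To make the difference concrete, here is a sketch of what merging an `ORDER BY ... LIMIT n` result involves. It is not MyCAT's code: the function name and row format are assumptions. Each shard is assumed to have already returned its own rows sorted by the same key.

```python
import heapq
from itertools import islice


def merge_order_by_limit(shard_results, key, limit):
    """Merge per-shard result lists that are each sorted by `key`, then apply LIMIT."""
    merged = heapq.merge(*shard_results, key=key)  # lazy k-way merge of sorted inputs
    return list(islice(merged, limit))
```

For example, with two shards and `ORDER BY price LIMIT 3`:

```python
shard_a = [{"id": 1, "price": 5}, {"id": 4, "price": 20}]
shard_b = [{"id": 2, "price": 7}, {"id": 3, "price": 9}]
top3 = merge_order_by_limit([shard_a, shard_b], key=lambda r: r["price"], limit=3)
```

Without this step the front end would receive two separately sorted lists and more rows than it asked for, which is the behavior attributed to Cobar above.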

 

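3>TDDL (Taobao Distributed Data Layer, nicknamed "Tou Dou Da Le", roughly "my head is swelling") was developed by Taobao around its own business characteristics. It mainly addresses making database and table splitting transparent to applications, and data replication between heterogeneous databases.

It is a JDBC DataSource implementation based on centralized configuration, with primary/standby switching, read-write separation, dynamic database configuration, and other functions.

TDDL is not standalone middleware; it is better described as a middle layer between the business layer and the JDBC layer. It is shipped as a JAR for the application to call, following the JDBC-shard approach.

TDDL source code: https://github.com/alibaba/tb_tddl
TDDL is relatively complex. Little documentation has been published, only the dynamic data source part is open source, the table and database splitting part has not been open-sourced, and it depends on diamond. It is not recommended.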

 

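4>DRDS is a distributed database service developed in-house by Alibaba (the project is not open source). DRDS grew out of Alibaba's open-source Cobar distributed database engine and absorbed the core Cobar-Proxy source code.

It implements its own parser front end, similar to the MySQL-Proxy protocol, that parses and processes incoming SQL and hides the complex underlying DB topology from the application, giving the same experience as a single-machine database.

It also draws on Taobao TDDL's extensive practical experience with distributed databases, and supports distributed joins, aggregate functions such as SUM/MAX/COUNT/AVG, and sorting.

Through heterogeneous indexes, small-table broadcast, and similar techniques, it solves a series of problems that arise in distributed database scenarios, resulting in a complete distributed database solution.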
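
Supporting aggregate functions across shards is less trivial than it sounds, AVG in particular: averaging the per-shard averages gives the wrong answer when shards hold different numbers of rows. The sketch below shows one common way to combine partial results. It is an illustration only, since DRDS is closed source, and the dictionary layout is an assumption.

```python
def merge_aggregates(partials):
    """Combine per-shard partial results into global SUM / COUNT / MAX / AVG.

    Each partial is a dict with keys 'sum', 'count', and 'max' for one shard.
    """
    partials = [p for p in partials if p["count"] > 0]  # skip shards with no rows
    total = sum(p["sum"] for p in partials)
    count = sum(p["count"] for p in partials)
    return {
        "sum": total,
        "count": count,
        "max": max((p["max"] for p in partials), default=None),
        # AVG is rebuilt from the global SUM and COUNT, not from per-shard averages.
        "avg": total / count if count else None,
    }
```

In other words, a middleware would typically rewrite `AVG(x)` into `SUM(x)` and `COUNT(x)` for each shard and divide only after merging.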

 

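5>Atlas is a data middle-layer project based on the MySQL protocol that sits between the application and MySQL. The Qihoo 360 team built it by optimizing mysql-proxy 0.8.2 and rewriting the Lua parts in C.

It implements both the client and server sides of the MySQL protocol: it talks to the application as a server and to MySQL as a client, hiding the DB details from the application.

Atlas cannot do distributed table splitting: all sub-tables must be in the same database on the same DB server, and all sub-tables must be created in advance, because Atlas has no automatic table creation.

The original version did not support database and table splitting; a version that supports it has since been released. Some users online report that it often crashes under high concurrency, so test thoroughly before adopting it.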

 

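6>DBProxy was built by the Meituan-Dianping DBA team, who made many improvements to Qihoo 360's open-source Atlas to meet internal needs, producing a new highly reliable, highly available enterprise-grade database middleware.

Its main features: read-write separation, load balancing, table splitting, IP filtering, SQL statement blacklist, smooth DB offlining by DBAs, replica traffic configuration, and dynamic loading of configuration items.

The project's GitHub address is https://github.com/Meituan-Dianping/DBProxy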

 

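7>Sharding-JDBC is a horizontal database sharding framework split out of dd-rdb, the relational database module of Dangdang's application framework ddframe. It provides transparent access to split databases and tables.

Sharding-JDBC is the third open-source project in the ddframe series, after dubbox and elastic-job.

Sharding-JDBC wraps the JDBC API directly and can be thought of as an enhanced JDBC driver, so the cost of migrating old code is close to zero:

  • It works with any Java-based ORM framework, such as JPA, Hibernate, Mybatis, or Spring JDBC Template, or with JDBC used directly.
  • It works with any third-party database connection pool, such as DBCP, C3P0, BoneCP, or Druid.
  • In theory it supports any database that implements the JDBC specification. Only MySQL is supported at present, but support for Oracle, SQLServer, and others is planned.

Sharding-JDBC is positioned as a lightweight Java framework. The client connects directly to the database, and the framework is delivered as a JAR, with no proxy layer, no extra deployment, and no other dependencies, so DBAs do not need to change how they operate.

Sharding-JDBC's sharding strategy is flexible: it supports sharding on =, between, in, and other conditions, as well as multiple sharding keys.

Its SQL parsing is fairly complete: it supports aggregation, grouping, sorting, limit, or, and other queries, as well as Binding Tables and Cartesian-product table queries.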
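
To show why the operator on the sharding key matters, the sketch below computes which shards a query has to visit for =, in, and between. It is written in Python for brevity and does not use or imitate the Sharding-JDBC API; the shard count and the modulo rule are assumptions.

```python
SHARDS = 4  # hypothetical shard count; rows are routed by key % SHARDS


def shards_for_equal(value):
    """key = value: exactly one shard."""
    return {value % SHARDS}


def shards_for_in(values):
    """key IN (...): the union of the shards of each listed value."""
    return {v % SHARDS for v in values}


def shards_for_between(low, high):
    """key BETWEEN low AND high (inclusive)."""
    # With modulo routing, any range of SHARDS or more values touches every shard.
    if high - low + 1 >= SHARDS:
        return set(range(SHARDS))
    return {v % SHARDS for v in range(low, high + 1)}
```

An equality condition can be sent to a single shard, while a wide range under a modulo rule fans out to all of them, so the sharding rule and the typical query shape need to be chosen together.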

 

 

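Lesser-known ones:

Heisenberg

From Baidu.
Its advantages: database and table splitting is decoupled from the application, split databases and tables are used just like a single database and table, pressure on the number of DB connections is reduced, configuration can be hot-reloaded, it scales horizontally, it follows the native MySQL protocol, it supports read-write separation, and it has no language restriction:

the mysql client, C, Java, and others can all use it. The Heisenberg server can be inspected through management commands (connection count, thread pools, nodes, and so on) and adjusted. Splitting rules are customized with Velocity scripts, which is quite flexible.

https://github.com/brucexx/heisenberg (the open-source version is no longer maintained)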

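CDS

From JD. Completed Database Sharding.
CDS is a client-side database and table splitting middleware. It implements the standard JDBC API and supports database and table splitting, read-write separation, data operations and maintenance, and many other functions, providing high-performance, high-concurrency, and highly reliable routing and access to massive data.

Business systems can adopt it at nearly zero cost. It currently supports MySQL, Oracle, and SQL Server.
(Architecturally it is similar to Cobar and MyCAT, but it connects directly through JDBC, does not implement a MySQL-like protocol, and has no NIO or AIO. The SQL Parser module uses JSqlParser. Among SQL parsers: druid > JSqlParser > fdbparser.)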

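DDB

From NetEase. Distributed DataBase.
DDB has gone through three major changes of service model: Driver mode -> Proxy mode -> cloud mode.

Driver mode: access through a JDBC driver, delivered as db.jar. Like TDDL, it sits between the application layer and JDBC.

Proxy mode: a group of proxy servers is set up in DDB to provide a standard MySQL service, with the database and table splitting logic implemented inside the proxy servers. Applications access the DDB Proxy through a standard database driver; the Proxy uses a MySQL decoder to restore each request to SQL, and the DDB Driver executes it and returns the result.

Private cloud mode: Cloudadmin, a platform management tool built on NetEase's private cloud, breaks up the functions of the original DDB Master. Some shard-related functions, such as shard management, table management, and user management, are moved into the proxy. Some centralized functions, such as alerting and monitoring, are moved into Cloudadmin. Cloudadmin also provides platform features such as one-click deployment, automatic and manual backup, and version management.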

 

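OneProxy:

Developed by Lou Fangxin, a well-known database expert and former head of the Alipay database team. It is written in C, based on the ideas of MySQL's official proxy. OneProxy is a commercial, paid middleware. Lou dropped some features to focus on performance and stability. Some users who tested it report that it is very stable under high concurrency.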

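Oceanus (database middleware from 58 Tongcheng, 58.com)

Oceanus aims to be a solution, or even a platform, that is functionally simple, dependable, easy to learn, easy to extend, and easy to integrate. It embraces open source and provides plugin mechanisms for integrating other open-source projects. Newcomers can start coding within minutes, the splitting logic is no longer tightly coupled to business code, and scaling out follows a standard procedure, which reduces unexpected errors.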

 

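Vitess:

This middleware is used in production at YouTube, but its architecture is complex. Unlike the middleware above, adopting Vitess requires significant application changes, because the application must use the language-specific API it provides. Some of its design ideas are still worth borrowing.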

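Kingshard:

Kingshard was written in Go, in spare time, by Chen Fei, formerly of the 360 Atlas middleware team. About three people currently work on it. For now it is not yet a mature, production-ready product and needs further refinement.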

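MaxScale and MySQL Router:

Both can be considered official middleware. MaxScale is developed by MariaDB (the fork maintained by MySQL's original author); the current version does not support database and table splitting.

MySQL Router is a middleware released by Oracle, the company now behind official MySQL.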

 

 

Reposted from: https://www.cnblogs.com/wangzhongqiu/p/7100332.html
