Overview of sub-databases and sub-tables (database and table sharding)

1. Why are sub-databases and sub-tables needed?

What can be done when the volume of application data grows too large for a single MySQL server to handle?
Option 1: Increase data processing capacity by upgrading the server hardware, for example by adding storage or CPU. This option is very costly, and if the bottleneck is MySQL itself, better hardware brings only limited gains.
Option 2: Spread the data across different databases so that each database holds less data, which relieves the performance problems of a single database and improves overall database performance. For example, an e-commerce database can be split into several independent databases, and its large tables can be split into several smaller tables.

2. Four implementation forms of sub-database and sub-table

Sub-database and sub-table splitting has two parts: splitting databases and splitting tables. In production, there are usually four methods: vertical sub-database, horizontal sub-database, vertical sub-table and horizontal sub-table.

Vertical sub-table

A table is split into multiple tables by field (column), and each table stores a subset of the fields. For example, product information can be kept in one table and product descriptions in another.
The improvements it brings are:

  • It avoids IO contention and reduces the chance of table locks: users viewing product details and users browsing product information do not affect each other.
  • It makes full use of the efficiency of operations on hot data: fast operations on product information are not held back by slow operations on product descriptions.
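
A vertical table split can be sketched as follows. This is a minimal illustration, not a production schema: in-memory SQLite stands in for MySQL, and the table and column names (product_info, product_description, product_id) are assumptions chosen for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Fields read when browsing product lists
    CREATE TABLE product_info (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        price      REAL NOT NULL
    );
    -- Description field, stored separately under the same primary key
    CREATE TABLE product_description (
        product_id  INTEGER PRIMARY KEY REFERENCES product_info(product_id),
        description TEXT
    );
""")
conn.execute("INSERT INTO product_info VALUES (1, 'Keyboard', 49.9)")
conn.execute("INSERT INTO product_description VALUES (1, 'A long marketing text ...')")


def list_products(conn):
    """Browsing touches only the product_info table."""
    return conn.execute(
        "SELECT product_id, name, price FROM product_info ORDER BY product_id"
    ).fetchall()


def product_detail(conn, product_id):
    """The detail page joins the two halves back together by primary key."""
    return conn.execute(
        "SELECT i.name, i.price, d.description "
        "FROM product_info i "
        "JOIN product_description d ON d.product_id = i.product_id "
        "WHERE i.product_id = ?",
        (product_id,),
    ).fetchone()
```

Only the detail query pays for reading the description; list queries never touch that table.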

Vertical sub-database

Tables are classified by business and distributed to different databases, and each database can be placed on a different server. The core idea is that each database is dedicated to one business.
The improvements it brings are:

  • It removes coupling at the business level and makes business boundaries clear.
  • Data belonging to different businesses can be managed, maintained, monitored and scaled separately, in tiers.
  • In high-concurrency scenarios, vertical sub-database splitting raises the available IO capacity and number of database connections to some extent and eases single-machine hardware bottlenecks.
  • Because the tables are classified by business and distributed across databases that can be deployed on different servers, the load is shared among multiple servers. However, this still does not solve the problem of excessive data volume in a single table.
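
The idea of a dedicated database per business can be sketched as a lookup from table name to database. This is only an illustration under stated assumptions: the mapping, the database names (product_db, order_db, user_db) and the tables are hypothetical, and in-memory SQLite connections stand in for MySQL instances on separate servers.

```python
import sqlite3

# Hypothetical mapping from each table to the business database that owns it.
TABLE_TO_DATABASE = {
    "product_info": "product_db",
    "product_description": "product_db",
    "orders": "order_db",
    "users": "user_db",
}

# One connection per business database; in-memory SQLite stands in for
# separate MySQL servers.
connections = {
    name: sqlite3.connect(":memory:")
    for name in set(TABLE_TO_DATABASE.values())
}


def connection_for(table):
    """Return the connection of the database dedicated to this table's business."""
    try:
        return connections[TABLE_TO_DATABASE[table]]
    except KeyError:
        raise ValueError(f"no database configured for table {table!r}") from None


# The orders table lives only in the order database.
connection_for("orders").execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER)"
)
connection_for("orders").execute("INSERT INTO orders VALUES (1001, 7)")
```

Each table still lives whole in one database, which is why this split alone does not reduce the size of any single table.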

Horizontal sub-database

It splits the data of the same table into different databases according to certain rules, and each database can be placed on a different server.
The improvements it brings are:

  • It relieves the performance bottlenecks caused by large data volume and high concurrency in a single database.
  • It improves system stability and availability.
  • When an application is hard to split vertically at a finer granularity, or the number of rows is still huge after splitting, and a single database hits bottlenecks in read, write or storage performance, horizontal splitting becomes necessary. Horizontal splitting can often resolve the storage capacity and performance bottlenecks of a single database. However, because the same table is now distributed across different databases, data operations need additional routing work, which greatly increases system complexity.
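
The extra routing work mentioned above can be sketched as follows. The rule used here (user ID modulo the number of databases) is just one possible splitting rule, chosen as an assumption for the example; the orders table and the shard count are hypothetical, and in-memory SQLite databases stand in for MySQL instances on different servers.

```python
import sqlite3

DATABASE_COUNT = 2  # hypothetical number of databases

# In-memory SQLite databases stand in for MySQL instances on different servers.
shards = [sqlite3.connect(":memory:") for _ in range(DATABASE_COUNT)]
for shard in shards:
    shard.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
    )


def shard_for(user_id):
    """Routing rule: the database index is the user ID modulo the database count."""
    return shards[user_id % DATABASE_COUNT]


def insert_order(order_id, user_id, amount):
    shard_for(user_id).execute(
        "INSERT INTO orders VALUES (?, ?, ?)", (order_id, user_id, amount)
    )


def orders_of(user_id):
    """A query that carries the splitting key goes to exactly one database."""
    return shard_for(user_id).execute(
        "SELECT order_id, amount FROM orders WHERE user_id = ? ORDER BY order_id",
        (user_id,),
    ).fetchall()


def total_order_count():
    """A query without the splitting key must visit every database and merge."""
    return sum(s.execute("SELECT COUNT(*) FROM orders").fetchone()[0] for s in shards)


insert_order(1, 10, 20.0)
insert_order(2, 11, 35.5)
insert_order(3, 10, 5.0)
```

Queries keyed by user stay on one database, while anything global has to fan out to all of them, which is where the added complexity comes from.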

Horizontal sub-table

In the same database, the data of the same table is split into multiple tables according to certain rules.
The improvements it brings are:

  • It relieves performance problems caused by excessive data volume in a single table.
  • It avoids IO contention and reduces the chance of table locks.
  • Each of the resulting small tables contains only part of the data, which reduces the data volume of any single table and improves retrieval performance.
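
A horizontal table split inside one database can be sketched like this. The naming scheme (orders_0 to orders_3), the table count and the modulo rule are assumptions for illustration, and in-memory SQLite stands in for a single MySQL database.

```python
import sqlite3

TABLE_COUNT = 4  # hypothetical number of physical tables

conn = sqlite3.connect(":memory:")  # SQLite stands in for a single MySQL database
for i in range(TABLE_COUNT):
    conn.execute(f"CREATE TABLE orders_{i} (order_id INTEGER PRIMARY KEY, amount REAL)")


def table_for(order_id):
    """Map an order ID to its physical table name, e.g. 6 -> orders_2."""
    return f"orders_{order_id % TABLE_COUNT}"


def insert_order(order_id, amount):
    # The table name comes from table_for(), never from user input.
    conn.execute(f"INSERT INTO {table_for(order_id)} VALUES (?, ?)", (order_id, amount))


def find_order(order_id):
    """Look up one order; only a single small table is searched."""
    return conn.execute(
        f"SELECT order_id, amount FROM {table_for(order_id)} WHERE order_id = ?",
        (order_id,),
    ).fetchone()


# Eight orders end up evenly spread, two per physical table.
for order_id in range(1, 9):
    insert_order(order_id, order_id * 10.0)
```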

3. Common technical solutions for sub-database and sub-table:

Technical solutions for sub-database and sub-table are generally divided into two categories: application layer dependency middleware and middle layer proxy middleware.

1. Application layer dependency middleware

This type of sub-database and sub-table middleware is strongly coupled with the application and requires the application to depend on the corresponding jar package (taking Java as an example). Well-known examples include TDDL, Dangdang's open-source sharding-jdbc, Mogujie's TSharding, and Ctrip's open-source Ctrip-DAL.

The basic idea of this type of middleware is to re-implement the JDBC API. By re-implementing the interfaces used to operate the database, such as DataSource and PreparedStatement, it gives the application layer basic (note the word "basic") sub-database and sub-table capability transparently, without changes to business code.
The middleware exposes the familiar JDBC API to upper-layer applications. Internally, it derives the SQL that can actually be executed through a series of preparation steps such as SQL parsing, SQL rewriting and SQL routing. The bottom layer then obtains physical connections in the traditional way (for example, from a database connection pool) to execute the SQL, and finally merges the results into a single ResultSet that is returned to the application layer.
Advantages: No additional deployment is required; the middleware is bundled and released together with the application.
Disadvantages: It cannot be used across languages. For example, sharding-jdbc is written in Java and obviously cannot be used in a C# project, which is why Ctrip's DAL also had to write a separate C# client.
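
The rewrite, route and merge steps that such middleware performs internally can be sketched in a few lines. This is a toy model under stated assumptions, not how TDDL or sharding-jdbc are implemented: it is written in Python rather than against JDBC, in-memory SQLite stands in for MySQL, the orders table and shard count are hypothetical, and a regular-expression substitution replaces real SQL parsing.

```python
import re
import sqlite3

LOGICAL_TABLE = "orders"  # hypothetical logical table seen by the application
SHARD_COUNT = 2           # hypothetical number of physical databases

# Each shard is its own in-memory SQLite database holding one physical table.
shards = []
for i in range(SHARD_COUNT):
    db = sqlite3.connect(":memory:")
    db.execute(f"CREATE TABLE {LOGICAL_TABLE}_{i} (order_id INTEGER, user_id INTEGER)")
    shards.append(db)


def route(user_id):
    """SQL routing: pick the target shards; without a splitting key, use all."""
    if user_id is None:
        return list(range(SHARD_COUNT))
    return [user_id % SHARD_COUNT]


def rewrite(sql, shard_index):
    """SQL rewriting: swap the logical table name for the physical one.
    Real middleware parses the SQL; a regex substitution is a toy stand-in."""
    return re.sub(rf"\b{LOGICAL_TABLE}\b", f"{LOGICAL_TABLE}_{shard_index}", sql)


def execute(sql, params=(), user_id=None):
    """Run the logical SQL on every routed shard and merge the rows."""
    rows = []
    for index in route(user_id):
        rows.extend(shards[index].execute(rewrite(sql, index), params).fetchall())
    return sorted(rows)  # result merging: one combined, ordered result


# The caller only ever names the logical table.
execute("INSERT INTO orders VALUES (?, ?)", (1, 10), user_id=10)
execute("INSERT INTO orders VALUES (?, ?)", (2, 11), user_id=11)
execute("INSERT INTO orders VALUES (?, ?)", (3, 12), user_id=12)
all_orders = execute("SELECT order_id, user_id FROM orders")
```

The application code writes SQL against orders and never sees orders_0 or orders_1, which is the transparency the JDBC-level approach aims for.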

2. Middle layer proxy middleware

The core principle of this type of sub-database and sub-table middleware is to place a proxy layer between the application and the database. The upper-layer application connects to the proxy layer using the standard MySQL protocol, and the proxy layer forwards requests to the underlying physical MySQL instances. The only requirement on the application is that it communicates over the MySQL protocol, so even a pure client such as MySQL Workbench can connect directly to the distributed database, and all programming languages are naturally supported. Representative products include the pioneering Amoeba, Alibaba's open-source Cobar, and Mycat, which has a relatively active community.

Original blog post and related extensions:
Sub-database and sub-table: middleware solution comparison
TDDL
sub-database and sub-table - distributed transaction theory and solutions

Origin blog.csdn.net/weixin_43828467/article/details/129910654