Why Split Databases and Tables & Why Use Mycat

In the Internet era, the storage and access of massive data has become a bottleneck in system design and use.

There are two types of scenarios: Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP).

Online Transaction Processing (OLTP), also known as transaction-oriented processing, is characterized by sending raw data to the computing center immediately for processing and returning the results within a very short time. Its real-time read and write requirements are high.

Online Analytical Processing (OLAP) refers to analyzing, querying and reporting data in a multi-dimensional way. It can be used in conjunction with data mining tools and statistical analysis tools to enhance decision-making and analysis functions, with low real-time reading and writing requirements.

 

 

Therefore, the two main techniques for solving this distributed data problem are splitting databases and tables (sharding) and read-write separation, that is, writing to the master and reading from the slaves.
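
Read-write separation can be illustrated with a minimal sketch. The class name, the node names, and the rule "a statement starting with SELECT is a read" are assumptions made for illustration only; this is not how Mycat itself decides, and a real router also has to consider transactions and replication lag.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads over the slaves (round-robin)."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slave is configured.
        self._slaves = itertools.cycle(slaves or [master])

    def route(self, sql):
        # Simplification: only statements starting with SELECT count as reads.
        # Transactions, SELECT ... FOR UPDATE and replication lag are ignored.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._slaves) if is_read else self.master


# Hypothetical node names, for illustration only.
router = ReadWriteRouter("master", ["slave1", "slave2"])
```

With this router, `router.route("INSERT INTO t VALUES (1)")` goes to the master, while successive SELECT statements alternate between the two slaves.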

Simply put, data segmentation means distributing the data we store in a single database across multiple databases (hosts), so that the load of a single device is spread out.

Data sharding can be divided into two modes according to the type of sharding rule. One splits different tables (or schemas) into different databases (hosts); this is called vertical splitting of data. The other splits the data of a single table into multiple databases (hosts) according to certain conditions based on the logical relationships of the data in the table; this is called horizontal splitting of data.

The biggest feature of vertical segmentation is that the rules are simple and the implementation is convenient. It suits systems with very clear business logic. In such a system, it is easy to split the tables used by different business modules into different databases. Splitting by table has less impact on the application, and the splitting rules are simpler and clearer.

Horizontal segmentation is somewhat more complicated than vertical segmentation. Because different rows of the same table have to be split into different databases, the splitting rule is more complicated for the application than splitting by table name, and later data maintenance is also more complicated.

 

Vertical segmentation

A database consists of many tables, each of which corresponds to a different business. Vertical segmentation means classifying tables by business and distributing them to different databases, so that the data and the load are shared among those databases.
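
As a rough sketch of vertical segmentation, routing reduces to a lookup from table name to database. The table and database names below are hypothetical and not taken from any real schema. The second helper shows why some business tables can no longer be joined directly once they live in different databases.

```python
# Hypothetical mapping from table name to the database that owns it.
TABLE_TO_DATABASE = {
    "users": "user_db",
    "user_profiles": "user_db",
    "orders": "order_db",
    "order_items": "order_db",
    "payments": "payment_db",
}


def database_for_table(table):
    """Return the database that holds the given table."""
    try:
        return TABLE_TO_DATABASE[table]
    except KeyError:
        raise ValueError(f"no database configured for table {table!r}") from None


def can_join_locally(*tables):
    """A join can run inside one database only if all its tables live there."""
    return len({database_for_table(t) for t in tables}) == 1
```

Here `can_join_locally("orders", "order_items")` is true, while `can_join_locally("orders", "users")` is false because the two tables belong to different business databases.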

 

Advantages and disadvantages of vertical segmentation:

Advantages:

After the split, the business is clear and the split rules are clear.

Easy integration or expansion between systems.

Data maintenance is simple.

Disadvantages:

Some business tables can no longer be joined; their data can only be combined through interfaces, which increases system complexity.

Because each business has its own limits, a single database can still become a performance bottleneck, and it is not easy to expand the data or improve performance.

Transaction processing is complex.

Because vertical segmentation distributes tables to different databases by business classification, some business tables can still grow too large and hit single-database read-write and storage bottlenecks, so horizontal segmentation is needed to solve this.

 

Horizontal segmentation

Unlike vertical splitting, horizontal splitting does not classify tables. Instead, it distributes the rows of a table into multiple databases according to a rule on some field, so that each table contains part of the data. Put simply, horizontal segmentation is segmentation by rows: some rows of a table go to one database, and other rows go to other databases.
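
A minimal sketch of splitting rows by a field, assuming a hash-modulo rule. The shard names and the choice of CRC32 as the hash are illustrative assumptions; the text above does not prescribe a particular rule.

```python
import zlib

# Hypothetical shard names, for illustration only.
SHARDS = ["db0", "db1", "db2"]


def shard_for(key, shards=SHARDS):
    """Pick a shard from the sharding field: stable hash modulo the shard count."""
    # zlib.crc32 gives the same value in every process, unlike the built-in
    # hash() for strings, which is randomized per interpreter run.
    return shards[zlib.crc32(str(key).encode("utf-8")) % len(shards)]


def split_rows(rows, field, shards=SHARDS):
    """Group the rows of one logical table by the shard that should store them."""
    placement = {name: [] for name in shards}
    for row in rows:
        placement[shard_for(row[field], shards)].append(row)
    return placement
```

Every row lands on exactly one shard, and the same field value always maps to the same shard, which is what lets a router find a row again from its sharding field alone.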

 

 

There are advantages and disadvantages to splitting horizontally.

Advantages:

When the splitting rules are well abstracted, join operations can mostly be done by the database.

No single database becomes a performance bottleneck for large data volumes or high concurrency.

There is less modification on the application side.

The stability and load capacity of the system improve.

Disadvantages:

Splitting rules are hard to abstract.

Transaction consistency across shards is difficult to solve.

Expanding the data repeatedly is difficult, and the maintenance workload is large.

Cross-database join performance is poor.
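
One reason repeated expansion is hard can be shown with a small calculation, assuming a plain modulo rule on integer keys (an assumption for illustration, not a rule prescribed here): changing the number of shards changes the target shard of many existing rows, and all of those rows have to be migrated.

```python
def moved_fraction(keys, old_count, new_count):
    """Fraction of integer keys whose shard changes when the shard count changes."""
    keys = list(keys)
    if not keys:
        return 0.0
    moved = sum(1 for k in keys if k % old_count != k % new_count)
    return moved / len(keys)


print(moved_fraction(range(40), 4, 5))  # 0.8
print(moved_fraction(range(40), 4, 8))  # 0.5
```

For the keys 0 to 39, going from 4 to 5 shards moves 80% of the rows, while doubling from 4 to 8 moves half of them. This is why capacity planning in advance matters when a modulo-style rule is used.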

The sections above covered the differences between vertical and horizontal segmentation and the advantages and disadvantages of each. Each method has its own disadvantages, but the two share the following problems:

Distributed transactions are introduced.

Joins across nodes.

Merging, sorting, and paginating results across nodes.

Managing multiple data sources.
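
The cross-node merge, sort, and pagination problem can be sketched as follows. The function name and the sample data are hypothetical. The sketch assumes every shard returns its rows already sorted, and it shows why each shard must return its first offset + limit rows rather than just one page: any of them may belong to the requested global page.

```python
import heapq
from itertools import islice


def paginate_across_shards(shard_results, offset, limit, key=None):
    """Merge per-shard sorted results and return one global page.

    Each shard must return its rows already sorted by `key`, and at least its
    first `offset + limit` rows, since any of them could fall in the page.
    """
    merged = heapq.merge(*shard_results, key=key)  # lazy k-way merge
    return list(islice(merged, offset, offset + limit))


# Three hypothetical shards, each already sorted.
shards = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
print(paginate_across_shards(shards, offset=3, limit=3))  # [4, 5, 6]
```

The cost grows with the offset: a deep page forces every shard to return, and the merging node to scan, many rows that are then thrown away.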

For data source management, there are currently two main ideas:

A. Client mode: each application module configures and manages the one (or more) data sources it needs and accesses each database directly, completing data integration within the module;

B. An intermediate proxy layer manages all data sources in a unified way, and the back-end database cluster is transparent to the front-end application;

Faced with these two options, probably more than 90% of people would choose the second, especially as the system keeps growing larger and more complex. This is indeed a sound choice. Although the short-term cost may be higher, it greatly helps the scalability of the whole system.

Mycat uses data segmentation to overcome the shortcomings of traditional databases while gaining the easy scalability associated with NoSQL. Its intermediate proxy layer sidesteps the problem of handling multiple data sources and is completely transparent to the application, and Mycat also offers solutions to the problems that arise after data segmentation.

Because joins become difficult after data segmentation, I also share some experience with data segmentation here:

The first principle: do not segment if you can avoid it.

The second principle: if you must segment, choose appropriate segmentation rules and plan in advance.

The third principle: use data redundancy or table grouping to reduce the possibility of cross-database joins.

The fourth principle: because it is hard for database middleware to weigh the trade-offs of join implementations, and high performance is extremely difficult to achieve, business reads should use as few multi-table joins as possible.
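
The third principle can be sketched like this. It assumes a hash-modulo rule, and the table names, field names, and shard count are all hypothetical. The child rows carry a redundant copy of the parent's sharding field, so parent and child are routed by the same value and always land on the same shard, where the join can run locally.

```python
import zlib

SHARD_COUNT = 4  # hypothetical number of shards


def shard_of(customer_id):
    """Stable hash of the shared sharding field, modulo the shard count."""
    return zlib.crc32(str(customer_id).encode("utf-8")) % SHARD_COUNT


def shard_for_order(order):
    # Orders are sharded by the customer who placed them.
    return shard_of(order["customer_id"])


def shard_for_order_item(item):
    # customer_id is stored redundantly on every item, so the item can be
    # routed to its order's shard without looking the order up first.
    return shard_of(item["customer_id"])


order = {"order_id": 1001, "customer_id": 42}
item = {"item_id": 1, "order_id": 1001, "customer_id": 42}
print(shard_for_order(order) == shard_for_order_item(item))  # True
```

The price is the extra column and the need to keep it consistent with the parent row, which is the usual trade-off of redundancy.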

