MyCat Enlightenment: Database Architecture Evolution for Distributed Systems

MyCat is database middleware for splitting data across multiple databases and tables (sharding). With MyCat, an application can query sharded databases and tables without carrying the routing logic in its own business code. This article walks through the evolution of database architecture to explain why MyCat came about and what role it plays.

Single Database Architecture

In the early stage of a project, the priority is to validate the market as quickly as possible, so the main requirement for the business system is fast delivery. At this stage, developers generally write the code for every layer (MVC) in a single project and store all business data in a single database. The overall architecture of the project at this point is as follows:

 

As the figure above shows, the business code for the registration, login, and shopping modules lives in one project, and all three modules read from the same business database.

However, as the project grows and the number of users increases, a single application server can no longer handle the traffic. The common practice at this point is to deploy the project on several servers to spread the load, which temporarily relieves the pressure that user growth puts on the application servers. The project architecture at this point is as follows:

 

But as we deploy more and more application servers, the single database server behind them can no longer handle the traffic either. To relieve the pressure quickly, we generally add a cache layer between the application servers and the database server, so that part of the query load is answered from the cache and never reaches the database. The project architecture at this point is as follows:

Distributed deployment-cache-single database architecture
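The way a cache layer absorbs part of the query load can be illustrated with a minimal cache-aside sketch. Everything here is an assumption made for illustration: the class name CacheAsideReader, the in-memory map standing in for a real cache server, and the databaseLoader function standing in for a real database query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache-aside helper; names are illustrative and not from the article.
class CacheAsideReader<K, V> {
    // In-memory map standing in for a real cache server.
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    // Function standing in for a real database query.
    private final Function<K, V> databaseLoader;
    // Counts how many lookups actually reached the "database".
    private final AtomicInteger databaseHits = new AtomicInteger();

    CacheAsideReader(Function<K, V> databaseLoader) {
        this.databaseLoader = databaseLoader;
    }

    // Return the cached value if present; otherwise load it from the database and cache it.
    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        databaseHits.incrementAndGet();
        V loaded = databaseLoader.apply(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }

    int databaseHits() {
        return databaseHits.get();
    }
}
```

Repeated reads of the same key reach the database only once in this sketch. Writes and cache invalidation are deliberately left out, which is one reason a cache only relieves read pressure.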

However, a cache layer can only relieve database pressure by intercepting some of the requests. As user traffic keeps growing, the database access bottleneck becomes more and more pronounced, and at that point we have to rework the architecture of the data layer itself.

Master-Slave Database Architecture

The common solution at this point is to replace the single database server with a pair of servers in master-slave mode: one database acts as the master and handles writes, while the other acts as the slave and handles queries. The project architecture at this point is as follows:

 

We achieve read-write separation through master-slave replication: all read operations are directed to the slave database, and all write operations are directed to the master database.

Because the database layer now requires that every read goes to the slave and every write goes to the master, we also have to rework the original code.

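// Read path: look up a user through the single shared data source.
public User selectUser(){
    return dataTemplate.selectById(...);
}
// Write path: insert a user through the same data source.
public void insertUser(User user){
    dataTemplate.insert(user);
}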

The above is the code before the change. Whether the operation is a read or a write, it goes through the same data source. To adapt to the new database architecture, we have to decide manually in the code which data source each operation should use.

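// Read path: queries go to the slave through readTemplate.
public User selectUser(){
    return readTemplate.selectById(...);
}
// Write path: inserts go to the master through writeTemplate.
public void insertUser(User user){
    writeTemplate.insert(user);
}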

In the modified code, the developer decides which data source to use for each operation: readTemplate for reads and writeTemplate for writes.

But as programmers, we sense that choosing the data source should not be a manual decision; the code should make it automatically. After all, the rule is very simple: if the statement is a select, use the read data source; otherwise, use the write data source.
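The rule just described can be sketched in a few lines. This is only an illustration of the idea, not how MyCat implements it; the class ReadWriteRouter and the Target enum are hypothetical names, with READ and WRITE standing in for the readTemplate and writeTemplate data sources.

```java
import java.util.Locale;

// Hypothetical router; READ and WRITE stand in for the read and write data sources.
class ReadWriteRouter {
    enum Target { READ, WRITE }

    // SELECT statements go to the read data source; every other statement goes to the write data source.
    static Target route(String sql) {
        String normalized = sql.trim().toLowerCase(Locale.ROOT);
        return normalized.startsWith("select") ? Target.READ : Target.WRITE;
    }

    public static void main(String[] args) {
        System.out.println(route("SELECT * FROM user WHERE id = 1"));
        System.out.println(route("INSERT INTO user (name) VALUES ('a')"));
    }
}
```

A real router needs more care than this prefix check. For example, a SELECT ... FOR UPDATE takes locks and belongs on the master, and reads inside a transaction usually have to follow the writes.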

This is in fact one of MyCat's uses: as database middleware, it takes over the choice of data source. With MyCat in place, we no longer need to care which data source to use, because MyCat hides the differences between them. From the application's point of view there is only one data source, and it handles both reads and writes. The query and insert code above can then become:

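// Read path: the single data source now points at MyCat, which picks the backend.
public User selectUser(){
    return dataTemplate.selectById(...);
}
// Write path: the same data source is used; MyCat sends the write to the master.
public void insertUser(User user){
    dataTemplate.insert(user);
}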

With MyCat in front of a master-slave database architecture, very little code has to change: we only point the data source at the MyCat address, and MyCat routes each statement to the appropriate backend MySQL server.

With the master-slave database architecture, we can support more users and more requests. However, as the business develops further, some problems appear:

  • When we modify the registration module, we have to release the entire project, which disrupts the login and shopping modules.

  • Even if each change is small, we still have to release the whole project package, so every release package is very large.

  • As business volume keeps growing, the database remains under heavy pressure even with master-slave read-write separation, and it is close to its limit.

These are only some of the problems encountered in practice. The real list is longer, and the problems become more prominent as the business continues to grow.

Vertically Split Database Architecture

To keep the business modules from affecting each other, we split the application layer vertically: the registration module, the login module, and the shopping module each become a separate application system, and each reads and writes its own database server. The system architecture at this point is shown in the following figure:

 

After the vertical split, the three problems above are resolved: releasing one module no longer affects the others, each release package covers only one module, and the load is spread across several databases instead of one.

But as the business expands further, we add many more modules: a customer service module, a wallet module, a personal center module, a favorites module, an order module, and so on. With the database architecture designed so far, this leaves us with many data sources scattered across the projects:

  • User database 192.168.0.1

  • Commodity database 192.168.0.2

  • SMS database 192.168.0.3

  • Customer service database 192.168.0.4

  • Wallet database 192.168.0.5

  • ……

For whoever manages these projects, keeping track of so many data sources scattered across different projects is a problem in itself. It is often hard to remember which database each project connects to.

Using MyCat as database middleware solves this problem. Every project connects to the single address that MyCat exposes, and MyCat talks to all of the backend MySQL databases on the projects' behalf. The projects only know about MyCat and do not need to care which database they are actually connected to; MyCat handles that through its own configuration.
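The idea of keeping the backend addresses in one place can be sketched as a simple lookup table. This is an assumption-laden illustration rather than MyCat's actual configuration: the class BackendDirectory and the logical database names are hypothetical, and only the IP addresses come from the list above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup table; a real middleware keeps this mapping in its own configuration.
class BackendDirectory {
    private final Map<String, String> backends = new LinkedHashMap<>();

    BackendDirectory() {
        // Logical names are illustrative; the addresses are the ones listed in the article.
        backends.put("user", "192.168.0.1");
        backends.put("commodity", "192.168.0.2");
        backends.put("sms", "192.168.0.3");
        backends.put("customer_service", "192.168.0.4");
        backends.put("wallet", "192.168.0.5");
    }

    // Resolve a logical database name to the backend host that stores it.
    String hostFor(String logicalDatabase) {
        String host = backends.get(logicalDatabase);
        if (host == null) {
            throw new IllegalArgumentException("Unknown logical database: " + logicalDatabase);
        }
        return host;
    }
}
```

With the mapping held by the middleware, each project only needs the middleware's own address, and moving a database to another host changes one entry instead of every project's settings.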


Horizontally Split Database Architecture

Once the database architecture has gone through the master-slave and vertical split stages, ordinary business reads and writes are no longer a problem. But some core business data, such as the user module's data, can still hit bottlenecks.

For a user system with as many as 100 million users, a single table in the user database still has to hold 100 million rows, even after the master-slave and vertical split optimizations. If all of that data sits in one table, both the inserts at registration and the queries at login are bound to become very slow.

At this point we have to split these high-volume core tables horizontally, that is, spread the records across multiple tables. For example, we may start with a single User table and then split it by the remainder of the user ID divided by 1000, which gives 1000 tables named User_000 to User_999. The project architecture at this point is as follows:

 

When the code queries user data, it first works out which table to use from the remainder of the user ID divided by 1000, and then queries that table. For example, a user whose UserId is 90749738 is stored in the User_738 table, and a user whose UserId is 74847383 is stored in the User_383 table.
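The table lookup described above can be sketched as follows, assuming 1000 tables named User_000 to User_999. The class UserTableRouter and its method are hypothetical names used only for this illustration.

```java
// Hypothetical helper that maps a user ID to its table under a modulo-1000 split.
class UserTableRouter {
    static final long TABLE_COUNT = 1000L;

    // Take the remainder of the user ID divided by 1000 and zero-pad it to three digits.
    static String tableFor(long userId) {
        long index = Math.floorMod(userId, TABLE_COUNT);
        return String.format("User_%03d", index);
    }

    public static void main(String[] args) {
        System.out.println(tableFor(90749738L)); // prints User_738
        System.out.println(tableFor(74847383L)); // prints User_383
    }
}
```

Math.floorMod keeps the index in the range 0 to 999 even if a negative ID were ever passed in, which the plain % operator would not.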

Horizontal splitting removes the read and write bottleneck on core tables that hold massive amounts of data. But it introduces a problem at the code level: before every query, we have to work out from the UserId which table to query. This step is identical across all business modules and should be extracted into a shared component.

As with choosing between the read and write data sources, this kind of mechanical work should be done by the machine rather than by programmers. This is another thing MyCat does for us: by configuring a set of database and table sharding rules, we let MyCat determine automatically which sub-table a query should go to. With MyCat as database middleware, the application no longer needs the redundant code that decides which table to query, so developers can focus on business logic.

Summary

Looking back over the path from a single database, to master-slave read-write separation, to vertically and horizontally split databases, we can see that MyCat takes over three mechanical, repetitive tasks: choosing between read and write data sources, keeping track of many data source addresses, and deciding which sub-table to query.

MyCat's features now go well beyond these three. For example, MyCat supports master-slave switching: when the master database fails because of a network problem or some other fault, MyCat can switch to the slave database automatically so that reads and writes keep working. MyCat positions itself as database middleware, and in principle anything that sits between the application layer and the data layer falls within its scope.


This article covered the background of MyCat and its most basic functions. In the next article, we will try MyCat out with the simplest possible demo.
