A complete guide to MySQL database and table sharding: from novice to master!

Hello everyone, I am Xiaomi, a programmer who loves technology. Today I will talk about database and table sharding in MySQL, a very important topic for developers and DBAs.

 

What is database and table sharding?

First, let's understand what database and table sharding is. It refers to splitting data originally stored in a single database across multiple databases or multiple tables. The purpose is to improve the scalability and performance of the database and to remove the bottlenecks that a single database hits in data volume and concurrent access.

Why do we need database and table sharding?

There are two main reasons:

  • First, as the business develops and the amount of data keeps growing, the storage capacity of a single database may no longer meet demand. Splitting the data across multiple databases increases the storage capacity of the whole system.
  • Second, high concurrent access also needs to be considered. When traffic is too heavy, a single database may not be able to handle all the concurrent requests. Splitting the data into multiple tables according to a rule spreads concurrent requests more evenly and improves the system's concurrent processing capability.

Horizontal database sharding

Horizontal database sharding distributes rows across multiple databases according to a rule. Common rules include the hash of a key, a time range, or a business dimension. Spreading the data over different database instances takes load off any single instance and balances it across the system.
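To make the hash rule concrete, here is a minimal Python sketch. The shard names and the choice of CRC32 as the hash function are my own assumptions for illustration, not something a particular middleware prescribes.

```python
import zlib

# Hypothetical shard names; a real deployment would map each one
# to its own connection settings.
SHARDS = ["db_0", "db_1", "db_2", "db_3"]


def shard_for(key: str) -> str:
    """Pick a database by hashing the shard key.

    zlib.crc32 is stable across processes and machines, unlike Python's
    built-in hash() for strings, which is randomized per interpreter run.
    """
    return SHARDS[zlib.crc32(key.encode("utf-8")) % len(SHARDS)]


if __name__ == "__main__":
    for user in ("user-1", "user-2", "user-3"):
        print(user, "->", shard_for(user))
```

Note that with a plain modulo, changing the number of shards moves most keys to a different database, so the shard count should be planned in advance.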

Let us take an e-commerce project as an example. Assume that our e-commerce system has thousands of products, and each product has a large amount of order data. We can store products in different databases according to the range of their product IDs. For example, with 10,000 as the boundary, products with IDs below 10,000 are stored in database A and products with IDs of 10,000 and above are stored in database B. In this way, each database only needs to handle part of the product data, which improves the concurrent processing capability of the system.
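The range rule from this example can be sketched in a few lines of Python. The database names are placeholders, and sending ID 10,000 itself to database B is an assumption, since the boundary only has to land on one side.

```python
from bisect import bisect_right

# Exclusive upper bounds of each range. With one boundary there are two
# ranges: IDs below 10,000 and IDs of 10,000 and above.
BOUNDARIES = [10_000]
DATABASES = ["database_a", "database_b"]  # hypothetical names


def database_for_product(product_id: int) -> str:
    """Return the database that holds the given product ID."""
    if product_id < 0:
        raise ValueError("product_id must be non-negative")
    # bisect_right counts how many boundaries are <= product_id,
    # which is exactly the index of the range the ID falls into.
    return DATABASES[bisect_right(BOUNDARIES, product_id)]


if __name__ == "__main__":
    for pid in (1, 9_999, 10_000, 25_000):
        print(pid, "->", database_for_product(pid))
```

Adding a third database later only means appending one boundary and one name, although the rows in the affected range still have to be migrated.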

Horizontal table sharding

Horizontal table sharding distributes rows across different tables in the same database according to a rule. This method suits situations where a single table holds so much data that query and write performance degrade. Spreading the data over several smaller tables improves both query performance and write speed.

Let's look at an application of horizontal table sharding. In an e-commerce project, we can split the order table along the time dimension. For example, each month's order data is stored in a separate table, such as order_202101, order_202102, and so on. In this way, each table holds a relatively small amount of data, queries and updates run faster, and the response speed of the system improves.
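As a rough sketch of the monthly rule, the table name can be derived from the order date. The function names below are hypothetical; only the order_YYYYMM naming comes from the example above.

```python
from datetime import date
from typing import List


def order_table_for(order_date: date) -> str:
    """Return the monthly table for one order, e.g. order_202101."""
    return f"order_{order_date:%Y%m}"


def order_tables_between(start: date, end: date) -> List[str]:
    """List every monthly table a query over [start, end] has to touch."""
    tables = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        tables.append(f"order_{year:04d}{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return tables


if __name__ == "__main__":
    print(order_table_for(date(2021, 1, 15)))
    print(order_tables_between(date(2020, 12, 20), date(2021, 2, 3)))
```

The second function shows the cost of this rule: a query that spans several months has to read several tables and merge the results.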

Vertical database sharding

Vertical database sharding distributes data across different databases according to business function. Each business function lives in its own database, which keeps the businesses independent of each other and reduces coupling and dependence between databases.

In addition to horizontal splitting, we can also consider vertical splitting. In an e-commerce project, product information and order information are two independent modules, and their access patterns and data characteristics may differ. We can store product information in one database and order information in another. In this way, traffic to one database does not affect the other, which improves the overall performance of the system.

Vertical table sharding

Vertical table sharding splits a single table by column. Grouping the columns of a table by business function or access frequency reduces the number of columns in each table and improves query performance and storage efficiency.

In an e-commerce project, the product information table may contain a large number of fields, some updated rarely and others updated frequently. We can split the table vertically by update frequency and move the rarely updated fields into their own table. For example, store the basic information and description of a product in one table, and store its inventory and price in another. In this way, frequent updates to inventory and price no longer contend for locks with access to the rarely changing fields, and the concurrency performance of the system improves.
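The split described above can be illustrated at the application level. This is only a sketch: the column names and the two column groups are assumptions made up for the example, and both resulting rows keep product_id so they can be matched up again.

```python
from typing import Dict, Tuple

# Hypothetical column groups for a product table.
STATIC_COLUMNS = {"name", "description"}  # rarely updated
VOLATILE_COLUMNS = {"stock", "price"}     # frequently updated


def split_product_row(row: Dict[str, object]) -> Tuple[dict, dict]:
    """Split one wide product row into a base row and a volatile row."""
    unknown = set(row) - STATIC_COLUMNS - VOLATILE_COLUMNS - {"product_id"}
    if unknown:
        # Refuse to silently drop columns that belong to neither group.
        raise ValueError(f"unassigned columns: {sorted(unknown)}")
    pid = row["product_id"]
    base = {"product_id": pid}
    base.update({c: row[c] for c in STATIC_COLUMNS if c in row})
    volatile = {"product_id": pid}
    volatile.update({c: row[c] for c in VOLATILE_COLUMNS if c in row})
    return base, volatile


if __name__ == "__main__":
    wide = {"product_id": 1, "name": "Mug", "description": "Blue mug",
            "stock": 5, "price": "9.90"}
    print(split_product_row(wide))
```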

Middleware that supports database and table sharding

In practice, we can use middleware to implement database and table sharding. Commonly used options include ShardingSphere, MyCat, and Vitess. Such middleware parses and rewrites SQL, routes each statement to the correct database or table, hides the details of sharding from the application, and provides convenient interfaces and management tools.

Principles of database and table sharding

Some principles need to be followed when sharding databases and tables. Here are the ones I have summarized, taking e-commerce projects as an example:

  1. Split according to business scenarios. For example, put product information and order information in different databases.
  2. Avoid cross-database transactions. For example, placing an order touches both the product inventory and the order table. You can store a redundant copy of the product inventory information in the order table to avoid the overhead of a cross-database transaction.
  3. Avoid cross-database join operations. For example, when querying orders, try to avoid joins between multiple tables. Redundant data or table grouping reduces the likelihood of cross-database joins.
  4. Divide the data ranges reasonably. For example, split databases by product ID range and split tables by time.
  5. Choose your shard key wisely. The choice of shard key is critical and should be based on the characteristics of the data and the query patterns, to avoid data skew and hotspots.
  6. Plan your indexes properly. Choose an index strategy that fits the query scenarios and the data distribution rules to improve query efficiency.
  7. Allocate hardware resources properly. Sharding increases the hardware resource consumption of the system, so resources need to be configured according to the actual situation to ensure performance and stability.
  8. Maintain and monitor regularly. After sharding, regular maintenance and monitoring are required to detect and solve problems in time and keep the system running stably.
  9. Expand and migrate flexibly. As the business develops, databases and tables need to be expanded and migrated to keep the system scalable.
  10. Adjust the backup and recovery strategy. After sharding, backup and recovery strategies need to be adjusted accordingly to ensure data security and reliability.
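Principle 5 can be checked before going live. The sketch below is one possible way to estimate skew for a candidate shard key: it hashes sample key values with CRC32 (my assumption, any stable hash would do) and compares the busiest shard with the average. The function names and the ratio itself are illustrative, not a standard metric.

```python
import zlib
from collections import Counter
from typing import Iterable, List


def shard_counts(keys: Iterable[object], shard_count: int) -> List[int]:
    """Count how many sample rows would land on each shard."""
    counts = Counter(
        zlib.crc32(str(key).encode("utf-8")) % shard_count for key in keys
    )
    return [counts.get(i, 0) for i in range(shard_count)]


def skew_ratio(counts: List[int]) -> float:
    """Busiest shard divided by the mean; 1.0 means perfectly even."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean if mean else 0.0


if __name__ == "__main__":
    # A key with many distinct values spreads out fairly evenly.
    print(skew_ratio(shard_counts(range(10_000), 4)))
    # A key where every row shares one value puts everything on one shard.
    print(skew_ratio(shard_counts(["hot-seller"] * 10_000, 4)))
```

A ratio close to 1.0 suggests an even spread, while a ratio near the number of shards means almost all rows share a single shard and the key is a poor choice.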

Suggestions

Finally, I would like to give you some suggestions:

  • Do not shard if you can avoid it. Database and table sharding increases the complexity and maintenance cost of the system, so it should only be considered once the data volume and concurrent access reach a certain level.
  • If you do shard, choose appropriate splitting rules and plan in advance. Pick rules that match the characteristics and needs of the business to avoid adjustments and changes later.
  • When splitting data, use data redundancy or table grouping to reduce the likelihood of cross-database joins, so that frequent cross-database join operations are avoided.
  • Because it is hard for database middleware to implement joins well, and extremely hard to do so with high performance, business reads should use multi-table joins as little as possible and associate at most three tables in one query. Reducing the frequency of multi-table joins improves the query performance of the system.

END

I hope the above content helps everyone understand database and table sharding in MySQL. It is a complex and important technique, and in practice it needs to be designed and adjusted according to business needs and actual conditions. If you have any questions, please leave them in the comment area and I will try my best to answer them. Thanks for your support!

If you have any questions or want more technical sharing, please follow my WeChat public account "Know what it is and why"!

 


Origin blog.csdn.net/en_joker/article/details/131111570