Database architecture evolution

    A relational database easily becomes a system bottleneck: the storage capacity, number of connections, and processing capacity of a single machine are all limited. When the data volume of a single table grows large and queries span many dimensions, performance can still drop severely under heavy load, even after upgrading the hardware, upgrading the network, and optimizing SQL and indexes. At that point, you need to consider adjusting the database architecture.

1. Master-slave replication, separation of read and write

2. Partition

3. Sub-table

4. Rules and strategies for common partitions and tables

5. Sub-database

6. Problems faced after sub-database/sub-table

7. When to consider segmentation


1. Master-slave replication, separation of read and write

    MySQL master-slave replication makes it possible to separate database reads from writes: writes go to the master and reads are served by the slaves, which improves read performance.

           

    

    However, master-slave replication leaves other performance bottlenecks in place or introduces new ones: writes cannot be scaled out, writes cannot be cached, replication lag appears, the table lock rate rises, tables keep growing, and the cache hit rate drops. At this point, partitioning needs to be considered.
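
    The read/write separation described above can be sketched in a few lines of Python. This is only an illustration, not how any particular MySQL driver or proxy works: the class name `ReadWriteRouter`, the server names, and the list of write keywords are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads round-robin across slaves."""

    # Assumed list of statement keywords that must run on the master.
    WRITE_PREFIXES = ("insert", "update", "delete", "replace",
                      "create", "alter", "drop")

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slave is configured.
        self._readers = itertools.cycle(slaves or [master])

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        first_word = sql.lstrip().split(None, 1)[0].lower()
        if first_word in self.WRITE_PREFIXES:
            return self.master
        return next(self._readers)


router = ReadWriteRouter("master", ["slave-1", "slave-2"])
targets = [
    router.route("SELECT * FROM orders WHERE id = 1"),
    router.route("UPDATE orders SET status = 'paid' WHERE id = 1"),
    router.route("SELECT * FROM orders WHERE id = 2"),
]
print(targets)  # ['slave-1', 'master', 'slave-2']
```

    Note that a read routed to a slave immediately after a write may still see old data because of replication lag, which is one of the bottlenecks listed above.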

2. Partition

    Partitioning divides the data of one table into N blocks. Logically it is still a single table, but underneath it is made up of N physical blocks. Data partitioning is a physical database design technique whose purpose is to reduce the total amount of data read and written by a given SQL operation, and so reduce response time. Partitioning does not create a new table; it distributes the table's data evenly across different disks, systems, or storage media on different servers, while the table itself remains one table. Because the data is spread across different places, retrieval becomes more efficient and the I/O pressure on the database is reduced.

2.1 When to consider partitioning

  • Queries on a table have become slow enough to affect use

  • The SQL has already been optimized

  • Master-slave replication is already in place

  • The data volume is large

  • The data in the table falls into natural segments

  • Operations on data often involve only part of the data, not all of the data

2.2 Horizontal partition

    This form of partitioning splits the table by rows. Groups of rows form separate data sets, which can be handled individually (a single partition) or together (one or more partitions). Every column defined in the table is present in each data set, so the characteristics of the table are preserved.
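
    To make the benefit concrete, here is a small Python sketch of how splitting rows by range lets a query read only some of the partitions. The partition names and year boundaries are hypothetical, and the lookup is a simplified stand-in for what the database does internally.

```python
from bisect import bisect_right

# Hypothetical layout: partition i holds rows whose year is below
# UPPER_BOUNDS[i] and at or above the previous bound.
UPPER_BOUNDS = [2019, 2020, 2021, float("inf")]
PARTITION_NAMES = ["p_before_2019", "p2019", "p2020", "p_2021_onward"]


def partition_for(year):
    """Return the name of the partition that stores rows for this year."""
    return PARTITION_NAMES[bisect_right(UPPER_BOUNDS, year)]


def partitions_for_range(first_year, last_year):
    """Return the partitions a query on first_year..last_year must read."""
    lo = bisect_right(UPPER_BOUNDS, first_year)
    hi = bisect_right(UPPER_BOUNDS, last_year)
    return PARTITION_NAMES[lo:hi + 1]


print(partition_for(2020))               # p2020
print(partitions_for_range(2019, 2020))  # ['p2019', 'p2020']
```

    A query limited to 2019 and 2020 touches two of the four partitions, which is how partitioning reduces the amount of data read by a single operation.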

2.3 Vertical partition

    This form of partitioning reduces the width of the target table by splitting it vertically: specific columns are assigned to specific partitions, and each partition contains the rows for its columns.

3. Sub-table

    Sub-table (table splitting) decomposes one table into N physical tables, each with its own storage space, according to certain rules. When the system reads or writes, it must first work out the corresponding table name from the defined rules and then operate on that table.

3.1 Vertical sub-table

    Vertical table splitting is based on the "columns" of a table. If a table has many fields, you can create an extended table and move the fields that are rarely used or especially long into it. When there are many fields (for example, a large table with more than 100 fields), "splitting a large table into small tables" makes development and maintenance easier and also avoids cross-page problems. MySQL stores data in pages underneath, so a record that takes up too much space spills across pages and causes extra performance overhead. In addition, the database loads data into memory row by row, so when the rows in a table are shorter and accessed more often, memory can hold more of them, the hit rate is higher, and disk I/O is reduced, which improves database performance.
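
    As a rough illustration of the main table plus extended table idea, the Python sketch below splits one wide row into two narrower rows that share the primary key. The column names and the choice of which columns count as "hot" are assumptions for the example.

```python
# Hypothetical set of short, frequently read columns kept in the main table.
HOT_COLUMNS = {"id", "name", "email"}


def split_row(row):
    """Split one wide row into a main-table row and an extended-table row.

    Both halves keep the primary key "id" so they can be joined back.
    """
    main = {k: v for k, v in row.items() if k in HOT_COLUMNS}
    extended = {k: v for k, v in row.items() if k not in HOT_COLUMNS}
    extended["id"] = row["id"]
    return main, extended


def merge_rows(main, extended):
    """Rebuild the original wide row from its two halves."""
    return {**extended, **main}


wide = {"id": 7, "name": "ann", "email": "a@x.io",
        "bio": "a long text field", "avatar": b"\x89PNG"}
main, extended = split_row(wide)
print(main)      # {'id': 7, 'name': 'ann', 'email': 'a@x.io'}
print(extended)  # {'bio': 'a long text field', 'avatar': b'\x89PNG', 'id': 7}
```

    Queries that only need the hot columns read the narrow main rows; the extended table is consulted only when the large fields are actually required.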

3.2 Horizontal sub-table

    When an application is hard to split vertically at a finer granularity, or the number of rows is still huge after vertical splitting and a single database hits read/write or storage bottlenecks, horizontal splitting is required. Following the internal logical relationships of the data, a horizontal sub-table spreads one table across multiple tables according to some condition. Each table holds only part of the data, so the data volume of any single table is reduced.

4. Rules and strategies for common partitions and tables

  • Range
  • Hash
  • Split by time
  • Hash, then take the result modulo the number of sub-tables
  • Save the database configuration in an authentication database, that is, create a separate DB that stores only the mapping between user_id and DB
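
    The strategies listed above can each be written as a small routing function. The following Python sketch is one possible reading of them; the table count, the id boundaries, the `orders_YYYYMM` naming, and the dict standing in for the mapping database are all assumptions.

```python
import zlib
from bisect import bisect_right
from datetime import date

NUM_TABLES = 4  # assumed number of sub-tables


def by_range(user_id, bounds=(1_000_000, 2_000_000, 3_000_000)):
    """Range: ids below 1,000,000 go to table 0, the next million to table 1."""
    return bisect_right(bounds, user_id)


def by_time(day):
    """Split by time: one table per month, for example orders_202102."""
    return f"orders_{day:%Y%m}"


def by_hash_mod(key, num_tables=NUM_TABLES):
    """Hash the key, then take the modulus by the number of sub-tables.

    crc32 is used because it is stable across processes,
    unlike Python's built-in hash() for strings.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_tables


# Lookup: a separate store holds the user_id -> DB mapping.
# A dict stands in for that mapping database here.
USER_TO_DB = {1001: "db_a", 1002: "db_b"}


def by_lookup(user_id):
    """Return the database recorded for this user in the mapping store."""
    return USER_TO_DB[user_id]


print(by_range(1_500_000))          # 1
print(by_time(date(2021, 2, 17)))   # orders_202102
print(by_lookup(1002))              # db_b
```

    Range and time rules keep neighbouring data together, which suits range queries, while hash-modulo spreads load evenly but makes changing the number of tables harder. The lookup table is the most flexible at the cost of an extra query.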

5. Sub-database

    As the data volume grows, the storage space of a single DB may run out, and as the query volume grows, a single database server can no longer keep up. At that point, the database itself can be split across multiple databases.

5.1 Vertical sub-database

    Vertical sub-database stores loosely related tables in different databases according to business coupling. The approach resembles splitting one large system into several small systems, each divided off by business category. It is similar to "microservice governance", where each microservice uses its own database.

5.2 Sub-database and sub-table

    Following the internal logical relationships of the data, the same table is distributed across multiple databases according to some condition. Each table holds only part of the data, so the data volume of any single table is reduced and the data becomes distributed.
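
    One common way to combine the two levels is to derive both the database and the table from the same key. The Python sketch below assumes 2 databases with 4 tables each and made-up names such as `user_db_0` and `user_1`; real deployments choose their own counts and naming.

```python
NUM_DBS = 2        # assumed number of databases
TABLES_PER_DB = 4  # assumed number of sub-tables in each database


def locate(user_id):
    """Map a user_id to a (database name, table name) pair.

    The id is first reduced to one of NUM_DBS * TABLES_PER_DB slots;
    the slot then determines the database and the table inside it.
    """
    slot = user_id % (NUM_DBS * TABLES_PER_DB)
    db_index = slot // TABLES_PER_DB
    table_index = slot % TABLES_PER_DB
    return f"user_db_{db_index}", f"user_{table_index}"


print(locate(13))  # ('user_db_1', 'user_1')
print(locate(8))   # ('user_db_0', 'user_0')
```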

6. Problems faced after sub-database/sub-table

(1) Transaction support: after sub-database and sub-table, transactions become distributed transactions, which are complicated to handle;

(2) Joins across databases and tables can only be handled by aggregating results at the interface (application) level, which increases development complexity;

(3) Sub-database/sub-table and read-write separation both rely on distribution. Guaranteeing strong consistency in a distributed setup inevitably adds latency, which lowers performance and raises system complexity.
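
    Point (2) is easier to see with an example. The Python sketch below imitates a cross-database join done in application code: it queries every shard, merges the results, and attaches user names from a separate store. The in-memory lists and dict are hypothetical stand-ins for real shards and a real user database.

```python
# Two in-memory "shards" of an orders table, plus a separate users store.
ORDER_SHARDS = [
    [{"order_id": 1, "user_id": 10, "amount": 30}],
    [{"order_id": 2, "user_id": 11, "amount": 45},
     {"order_id": 3, "user_id": 10, "amount": 5}],
]
USERS = {10: "ann", 11: "bob"}


def orders_with_user_names():
    """Query every shard, then join against users in application code."""
    merged = []
    for shard in ORDER_SHARDS:  # one query per shard
        merged.extend(shard)
    # Re-establish a global order, which no single shard can provide.
    merged.sort(key=lambda order: order["order_id"])
    return [{**order, "user_name": USERS[order["user_id"]]} for order in merged]


for row in orders_with_user_names():
    print(row["order_id"], row["user_name"], row["amount"])
```

    Work that a single database would do with one JOIN and ORDER BY now lives in application code, which is the extra development complexity the list refers to.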

7. When to consider segmentation

(1) If you can avoid splitting, do not split

    Not every table needs to be split; it depends mainly on how fast the data grows. Splitting adds a certain amount of business complexity. Besides storing and querying data, one of the database's important jobs is to help the business meet its requirements. Do not use the heavy weapon of sub-database and sub-table unless you have to, so as to avoid "over-design" and "premature optimization". Do not split for the sake of splitting. First do what you can with simpler measures, for example: upgrade hardware, upgrade the network, separate reads and writes, optimize indexes, and so on. Only when the data volume reaches the bottleneck of a single table should you consider sub-database and sub-table.

(2) The data volume is so large that routine operation and maintenance affects business access

    The operation and maintenance mentioned here refers to:

  • Database backup: if a single table is too large, the backup needs a lot of disk I/O and network I/O. For example, transferring 1 TB of data at a network rate of 50 MB/s takes about 20,000 seconds, and the risk over the whole process is relatively high.
  • DDL changes on a large table: MySQL can lock the entire table for a long time, during which the business cannot access the table, and the impact is severe. If you use pt-online-schema-change, it creates triggers and a shadow table, which also takes a long time. The whole duration of the operation counts as risk time. Splitting the table and reducing the total data volume helps to reduce this risk.
  • Large tables are accessed and updated frequently, so lock waits are more likely. Splitting the data trades space for time and indirectly reduces access pressure.
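
    The backup figure in the first bullet can be checked with a short calculation. The sketch below reads "50MB" as a transfer rate of 50 MB per second and uses decimal units (1 TB = 10^12 bytes); both are assumptions about what the example intends.

```python
def transfer_seconds(size_bytes, rate_bytes_per_second):
    """Time needed to move size_bytes at a constant transfer rate."""
    return size_bytes / rate_bytes_per_second


ONE_TB = 10**12    # decimal terabyte, in bytes
RATE = 50 * 10**6  # 50 MB per second, in bytes per second

seconds = transfer_seconds(ONE_TB, RATE)
print(seconds)                    # 20000.0
print(round(seconds / 3600, 1))   # 5.6 (hours)
```

    Under these assumptions, a single backup transfer occupies the network for more than five and a half hours.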

(3) As the business develops, some fields need to be split vertically

(4) Rapid growth of data volume

    As the business grows rapidly, the data volume of a single table keeps growing. When performance approaches the bottleneck, horizontal splitting, that is, sub-database and sub-table, needs to be considered. At that point, you must choose a suitable splitting rule and estimate the data capacity in advance.

(5) Security and availability

    Don't put all your eggs in one basket. Vertical splitting at the business level separates the databases of unrelated businesses. Each business has a different data volume and access load, and one business bringing down its database must not drag other businesses down with it. With horizontal splitting, a problem in one database does not affect 100% of users. Each database carries only part of the business data, so overall availability improves.

Origin blog.csdn.net/MOU_IT/article/details/113828140