Splitting databases and tables (sharding): basis and problems

Why splitting databases and tables is necessary

A relational database easily becomes a system bottleneck, because the storage capacity, number of connections, and processing power of a single machine are limited. Once a single table reaches roughly 10 million rows (1000W) or 100 GB, and queries span many dimensions, performance still drops severely under heavy load even after adding replica databases and optimizing indexes. At that point it is time to consider splitting the data. The purpose of splitting is to reduce the load on the database and shorten query time.

Splitting tables inside a single database only solves the problem of one table holding too much data; it does not distribute the tables across databases on different machines. It therefore does little to relieve pressure on the MySQL server, because every table still competes for the CPU, memory, and network I/O of the same physical machine. That contention is best solved by splitting both the databases and the tables.

Splitting types and criteria

Data splitting falls into two types: vertical splitting and horizontal splitting.
Vertical splitting: a table with many columns is split into several tables with fewer columns each.
    tab(a,b,c,d) ---> tab1(a,b) + tab2(c,d)
Horizontal splitting: the rows are split, for example by business type, into multiple tables that all have the same columns.
    tab(a,b,c,d) ---> tab1(a,b,c,d) + tab2(a,b,c,d)
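
To make the difference concrete, here is a minimal Python sketch that applies both kinds of split to a few in-memory rows. The sample rows, the function names, and the `a <= 2` split condition are made up for illustration; a real split would be done with DDL and data migration, not with lists.

```python
# Hypothetical in-memory rows standing in for tab(a, b, c, d).
rows = [
    {"a": 1, "b": "x", "c": 10, "d": True},
    {"a": 2, "b": "y", "c": 20, "d": False},
    {"a": 3, "b": "z", "c": 30, "d": True},
]


def split_vertical(rows, left_cols, right_cols):
    """Split by columns: every row appears in both tables with a subset of its fields."""
    tab1 = [{k: r[k] for k in left_cols} for r in rows]
    tab2 = [{k: r[k] for k in right_cols} for r in rows]
    return tab1, tab2


def split_horizontal(rows, predicate):
    """Split by rows: each row lands in exactly one table and keeps all its fields."""
    tab1 = [r for r in rows if predicate(r)]
    tab2 = [r for r in rows if not predicate(r)]
    return tab1, tab2


# tab(a,b,c,d) ---> tab1(a,b) + tab2(c,d)
v1, v2 = split_vertical(rows, ("a", "b"), ("c", "d"))

# tab(a,b,c,d) ---> tab1(a,b,c,d) + tab2(a,b,c,d), split on an assumed condition
h1, h2 = split_horizontal(rows, lambda r: r["a"] <= 2)
```

In practice a vertical split usually repeats the primary key in every resulting table so that the pieces can be joined back together; the sketch follows the simpler notation used above.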

Splitting criteria: by id range (the drawback is that hot data is unevenly distributed), by hash modulo, or by business scope.
By numeric range: for example, ids 1-1000 go to one shard; a time range works the same way.
By hash modulo: hash the value and take it modulo the number of databases. A remainder of 0 goes to the first database, 1 to the second, and so on.
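
The range and modulo rules above can be sketched as small routing functions. This is an illustration under assumptions, not a production router: `NUM_SHARDS`, `RANGE_SIZE`, and the function names are hypothetical, and `zlib.crc32` is used only because it is a stable hash (Python's built-in `hash()` for strings changes between processes).

```python
import zlib

NUM_SHARDS = 4      # assumed number of databases
RANGE_SIZE = 1000   # assumed number of ids per shard for range routing


def shard_by_range(record_id: int) -> int:
    """Range routing: ids 1-1000 -> shard 0, 1001-2000 -> shard 1, and so on."""
    if record_id < 1:
        raise ValueError("record_id must be >= 1")
    return (record_id - 1) // RANGE_SIZE


def shard_by_id_mod(record_id: int) -> int:
    """Modulo routing on a numeric id: remainder 0 -> shard 0, 1 -> shard 1, ..."""
    return record_id % NUM_SHARDS


def shard_by_hash(key: str) -> int:
    """Hash routing for non-numeric keys: stable hash, then modulo the shard count."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS
```

Range routing keeps neighbouring ids together, which is why recent (hot) ids tend to pile up on the last shard; modulo routing spreads them evenly but makes range scans touch every shard.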


Problems with splitting databases and tables


1. Transaction consistency: use the XA protocol with two-phase commit. Eventual consistency: allow a short delay before the data becomes consistent. Transaction compensation: reconcile and check the data afterwards.
2. Cross-node join queries
  (1) Dictionary tables used in joins: keep a copy in every database.
  (2) Field redundancy (denormalized design): for example, the order table stores user_name alongside user_id.
  (3) Two queries: run the first query to get the ids, then run a second query to fetch the related rows.
3. Primary key (sequence) conflicts: these appear when each shard generates ids from its own sequence.
  Give the sequences different starting values and the same step size, so that the generated ids are staggered.
4. Cross-database sorting and maximum values
  For sorting, fetch the rows from each shard and then sort them together. For a maximum, take the maximum of each shard and then take the maximum of those values.
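
Items 3 and 4 can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes that each shard returns its rows already sorted in ascending order (for example via `ORDER BY`), so the combining step only has to merge them.

```python
import heapq
from itertools import count, islice


def shard_id_sequence(shard_index: int, num_shards: int):
    """Staggered ids: start differs per shard, step is the same for all shards."""
    return count(start=shard_index + 1, step=num_shards)


def merge_sorted_shards(shard_results, limit: int):
    """Merge per-shard results (each already sorted ascending) and keep the first `limit`."""
    return list(islice(heapq.merge(*shard_results), limit))


def global_max(shard_results):
    """Take the maximum of each non-empty shard, then the maximum of those values."""
    return max(max(result) for result in shard_results if result)


# Three shards: ids never collide because the sequences interleave.
first_ids = [list(islice(shard_id_sequence(i, 3), 3)) for i in range(3)]
# first_ids == [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

shard_results = [[1, 5, 9], [2, 3, 10], [4]]
top4 = merge_sorted_shards(shard_results, 4)   # [1, 2, 3, 4]
largest = global_max(shard_results)            # 10
```

With MySQL, the same staggering is commonly configured through the `auto_increment_offset` and `auto_increment_increment` system variables. Note that `global_max` raises `ValueError` if every shard is empty.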

Origin blog.csdn.net/x18094/article/details/114285346