Meituan is too ruthless: a 10-billion-row sharded database, migrated without downtime. How would you handle it?

A few words up front

In the reader exchange groups (50+ of them) run by Nien, a 40-year-old veteran architect, several friends have recently landed interview opportunities at first-tier Internet companies such as Tencent, Meituan, Ali, Pinduoduo, Jitu, Youzan, and Xiyin, and they ran into a few very important interview questions:

  • How do you migrate a database without downtime?
  • How do you migrate a 10-billion-row database table without downtime?
  • And so on.

In fact, there is no single fixed answer.

There are thousands of possible reference answers. Here, Nien has combined industry cases to sort out the most comprehensive and systematic one.

Below is a fresh industry case, "Stability Construction of the Cainiao Points System: Sharding and Migrating Tens of Billions of Rows". From the interview perspective, Nien has restructured and reorganized this solution, and it is now included as a reference answer in the "Nien Java Interview Collection PDF", version V82.

Study this production-grade answer carefully; it will serve you well in future interviews.

The original article: "Cainiao Points System Stability Construction: Sharding & Migrating Tens of Billions of Rows". The original solution was written by Xinghua and published on the "Alibaba Developer" official account.

The following content is Nien's secondary architecture analysis of the original text, based on his own three-level architecture notes and his three-level architecture knowledge system (the "three-level architecture universe").

For the PDFs of "Nien Architecture Notes", "Nien High Concurrency Trilogy", and "Nien Java Interview Collection", go to the official account [Technical Freedom Circle] at the end of this article.


The technical importance of sharding & data migration

Sharding (splitting a database into multiple databases and tables) & data migration is not only a technical problem; it is also a test of a person's technical depth.

It is also a coordination problem: it requires coordinating development, operations, and other teams, and is not something one person finishes alone. It is a test of a person's coordination skills.

That is why sharding & data migration are core topics in big-company interviews and architecture interviews.

The Great Value of the Points System

Great systems are all built through iteration.

In the early days, in order to iterate and trial-and-error quickly, the Cainiao points system was a single application, and its storage was a single database with multiple tables.

With rapid business growth, Cainiao's C-end users have exceeded 300 million, and consumers have gradually expanded from checking, picking up, and sending parcels to playing and shopping.

Playing and shopping mainly means the various activities run by the mall and the different business lines. These activities use a variety of interactive mechanics such as dwelling, tasks, lotteries, sharing, check-ins, assists, redemptions, and prize draws.

Cainiao points represent the user's virtual rights and benefits and act as a hook for these activities.

Cainiao points give the interactive products themselves a certain "value", which increases the time consumers spend on the platform.

Seen from this angle, the Cainiao points system has always played a core role at the bottom layer, carrying core user assets.

The functional architecture of the points system

(Figure: overall functional architecture of the points system.)

The Great Challenges Facing the Points System

During big promotions, the system has to support high-traffic activities, and the whole points system comes under great pressure.

With more than 300 million C-end users, Cainiao can no longer run on a single application, nor can the storage layer survive on a single database with multiple tables.

To support the continued explosive growth of Cainiao's C-end marketing business, the points system needed an upgrade.

The question is: how do you upgrade the points data layer from a single database and single table to a sharded layout without suspending the business?

And also: how is such a high-risk operation carried out step by step?

Before designing the migration plan, the Cainiao points team analyzed the problem carefully and identified the following major challenges:

  • From 1 to N, the system architecture is very different: the original system was built on a single database and a single table, and all earlier design and development assumed that layout. For example, SQL queries relied on multiple indexes built on the single table; after switching to sharding, queries that do not carry the shard key can no longer be served this way.
  • The business cannot be suspended: the marketing business must keep running throughout the migration, which is like changing an airplane's engine in mid-flight, and harder than the approach of some banking systems that suspend service to migrate.
  • Huge data volume: the single tables are extremely large, which puts great pressure on the stability of the data synchronization link.
  • Old system architecture: the points system was built very early on old technical frameworks, some of which are no longer maintained; this raises the cost of the whole transformation and the risk of every change.
  • Many interface versions: for historical reasons, there are many interface versions for issuing and consuming points, carrying legacy baggage, which makes migration difficult and risky.
  • Grayscale-able, monitorable, rollback-able: Alibaba's "three axes" of stability. Supporting grayscale and rollback requires two-way data synchronization, which adds risk to the synchronization link.
    • 1. Grayscale-able: every change must be grayscaled, i.e. its effective scope must be controlled; make a small-scale change first, then widen the scope after verification.
    • 2. Monitorable: during grayscale, the system must be observable so that you understand how the application behaves after the change.
    • 3. Rollback-able: when monitoring shows that a change causes problems, there must be a way to roll it back.
  • Tight schedule: the database split and the data source migration had to be completed before a fixed deadline (a bit more than one month away), and the change-freeze period added to the operational complexity.

Refactoring is extremely risky, but it has to be done

Because it involves restructuring the data itself, the project is extremely risky; if it is mishandled, the team may have to pack up and leave.

Why?

  • First: once data is corrupted or lost, some of it may be unrecoverable.
  • Second: even if it can be recovered, recovery usually takes a very long time, measured in hours or even days, a price the business can hardly afford.

But since the risk is so great, could the team simply skip this sharding and migration project?

No. Why not:

  • Because the Cainiao points system had already become a bottleneck during the 2020 Double 11 stress test.
  • The refactoring and migration could no longer be avoided, and the longer it was postponed, the greater the risk.

As the saying goes: once the arrow is on the bowstring, it has to be fired.

Scheme design for sharding and migrating 10-billion-row tables

Analysis of the existing data

Tables: the points system has two core tables, the points summary table and the points detail table.

Data volume: the data in these two tables reaches tens of billions of rows; the points detail table grows fast, with tens of millions of new rows added every day.

Service interface analysis

The service interfaces provided by the points system mainly include:

  • reading points
  • adding points
  • deducting (freezing) points
  • refunding points

To relieve the performance bottleneck during big promotions in the short term, point reads go through the Tair cache (similar to Redis) and only fall through to the database on a cache miss, while point writes update the database directly.

The pressure on the database therefore comes from writes, not reads.

Observation during big promotions shows that the database bottleneck is indeed the write traffic from adding points.
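As a rough illustration of this read path, here is a minimal cache-aside sketch. It is not Cainiao's actual code; the Cache and PointsDao interfaces below are hypothetical stand-ins for Tair and the points DAO.

```java
// Cache-aside read: try the cache first, fall back to the database on a miss,
// then backfill the cache. Writes go straight to the database, which is why
// write traffic is the bottleneck during big promotions.
public class PointsReadService {

    // Hypothetical abstractions standing in for Tair and the points DAO.
    public interface Cache { Long get(String key); void put(String key, long value, int ttlSeconds); }
    public interface PointsDao { long sumPointsByUserId(long userId); }

    private final Cache cache;
    private final PointsDao pointsDao;

    public PointsReadService(Cache cache, PointsDao pointsDao) {
        this.cache = cache;
        this.pointsDao = pointsDao;
    }

    public long readPoints(long userId) {
        String key = "points:" + userId;
        Long cached = cache.get(key);
        if (cached != null) {
            return cached;                                  // cache hit: no database pressure
        }
        long fromDb = pointsDao.sumPointsByUserId(userId);  // cache miss: fall through to the DB
        cache.put(key, fromDb, 300);                        // backfill with a short TTL
        return fromDb;
    }
}
```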

Scaling the database vertically to the highest configuration would not really solve the problem:

  • on the one hand, the budget would be too high;
  • on the other hand, it still might not carry the projected traffic, and even if it scraped through this time, it could not be scaled horizontally in the future.

Candidate solutions for the performance bottleneck of 10-billion-row tables

A tip from the 40-year-old architect: as an architect, you must always present multiple candidate solutions.

There are three candidate solutions for the performance bottleneck of 10-billion-row tables:

  1. Shard the databases and tables and migrate the data: given the database's performance limits, upgrade the points system from a single database with multiple tables to a sharded layout.
  2. Optimize database operations to reduce the number of database writes when adding points.
  3. Sort business priorities and degrade some scenarios so that point details are not written.

Options 2 and 3 treat the symptoms rather than the root cause; they only postpone the problem, and the longer it is postponed, the higher the cost.

After weighing the options, option 1 was chosen, to solve the performance bottleneck of the 10-billion-row tables once and for all.

Sharding + data migration scheme design

Sharding scheme design

(1) Shard key design:

Since the tables are a points summary table and a points detail table, both are sharded by the user dimension (userId).

(2) Sharding scale:

The data is split into 8 databases and 1024 tables.

For the general methodology of sharding, refer to Nien's seckill video or Nien's push middle-platform video.
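As a minimal sketch (the article does not give the team's exact routing rule), shard routing by userId for an 8-database x 1024-table layout might look like this; the hash function and the "128 consecutive tables per database" mapping below are assumptions, not the original design:

```java
// Routes a userId to one of 1024 physical tables spread over 8 databases.
// The concrete hash and table-to-database mapping are illustrative only.
public final class PointsShardRouter {

    private static final int DB_COUNT = 8;
    private static final int TABLE_COUNT = 1024;

    private PointsShardRouter() {}

    public static int tableIndex(long userId) {
        // Math.floorMod keeps the result non-negative.
        return Math.floorMod(Long.hashCode(userId), TABLE_COUNT);
    }

    public static int dbIndex(long userId) {
        // 1024 tables / 8 databases = 128 consecutive tables per database.
        return tableIndex(userId) / (TABLE_COUNT / DB_COUNT);
    }

    public static String physicalTableName(String logicalTable, long userId) {
        return String.format("%s_%04d", logicalTable, tableIndex(userId));
    }
}

// Example: for userId = 123456789L the table index is 277, so the detail row
// lands in "points_detail_0277", which lives in database 2 (277 / 128).
```

Because both the summary table and the detail table are sharded by the same user dimension, all of one user's rows land in the same shard, so per-user queries never need to fan out across databases.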

(3) Primary key ID design

During data migration, special attention must be paid to primary key IDs: they must be distributed IDs generated by the application.

In other words, the program must assign the primary key explicitly instead of relying on the database's auto-increment primary key, to avoid conflicting or out-of-order IDs.

For the detailed primary key ID scheme, refer to Nien's seckill video or Nien's push middle-platform video.
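The article does not specify which distributed ID scheme was used. As one common choice, a snowflake-style generator (sketched below, purely as an illustration) lets the application assign roughly time-ordered, globally unique primary keys instead of relying on DB auto-increment:

```java
// A minimal snowflake-style ID generator sketch (not the generator used by the
// Cainiao team, whose implementation is not described in the article).
// Layout of the 64-bit signed long: 41 bits timestamp | 10 bits worker id | 12 bits sequence.
public class SnowflakeIdGenerator {

    private static final long EPOCH = 1577836800000L; // 2020-01-01 00:00:00 UTC, arbitrary
    private static final long WORKER_BITS = 10L;
    private static final long SEQUENCE_BITS = 12L;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeIdGenerator(long workerId) {
        if (workerId < 0 || workerId >= (1L << WORKER_BITS)) {
            throw new IllegalArgumentException("workerId out of range");
        }
        this.workerId = workerId;
    }

    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {           // sequence exhausted in this millisecond: wait for the next one
                while ((now = System.currentTimeMillis()) <= lastTimestamp) { /* spin */ }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```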

Data Migration Scheme Design

Sharding inevitably involves data migration, and data migration involves both the full (existing) data and the incremental data.

The key to data migration is ensuring that data is neither duplicated nor lost and that uniqueness is preserved.

The migration therefore has two parts: full migration and incremental migration.

The team used Alibaba's data migration tool Jingwei, which supports both full and incremental migration.

During the full task, Jingwei effectively takes a copy of the source records; the incremental task can then be replayed from the point where the full task started.

How is uniqueness guaranteed? The migration tool guarantees data uniqueness and can update records that already exist.

Nien's hint:

If you are not at Alibaba, you can use an open-source or cloud data synchronization tool.

For example, Alibaba Cloud's DTS (Data Transmission Service) is a very good option.

DTS can migrate and synchronize data between different data sources; you only need to configure the source and target data sources and it runs automatically, with no manual intervention, which is very convenient.

Of course, if you are familiar with Kettle, you can also use Kettle for data synchronization.

In short, there are many similar middleware options, or you can write the migration scripts yourself.

Dynamic read/write switching

Because this is a live migration, the data source must not be switched by releasing a new build of the application.

Instead, the switching is implemented at the program level as a dynamic read/write switch.

The general idea: configure dual data sources in the program, load both DB data sources in advance, pre-wire the switches through configuration, and then switch reads and writes dynamically.

To be monitorable, grayscale-able, and rollback-able, grayscale is also required.

On the configuration side, you need not only the switching switches themselves but also the grayscale switching ratio, all of which can be adjusted through configuration.
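A minimal sketch of this switch model (names such as MigrationSwitchConfig are made up for illustration): the write mode and the read grayscale ratio live in configuration and can be flipped at runtime without a release.

```java
// Config-driven switches for the migration: which library to write, plus the
// share of read traffic (by userId modulo) routed to the new library.
public class MigrationSwitchConfig {

    public enum WriteMode { OLD_ONLY, DOUBLE_WRITE, NEW_ONLY }

    // volatile, so a config-center callback can flip these at runtime
    // without redeploying the application.
    private volatile WriteMode writeMode = WriteMode.OLD_ONLY;
    private volatile int readGrayPerTenThousand = 0;   // 0..10000, e.g. 100 means 1% of users

    public WriteMode getWriteMode() { return writeMode; }

    public void setWriteMode(WriteMode writeMode) { this.writeMode = writeMode; }

    public void setReadGrayPerTenThousand(int value) {
        this.readGrayPerTenThousand = Math.max(0, Math.min(10000, value));
    }

    /** True if reads for this user should go to the new (sharded) library. */
    public boolean readFromNewLibrary(long userId) {
        // A stable slice of users, selected by userId modulo, reads from the new library.
        return Math.floorMod(userId, 10000L) < readGrayPerTenThousand;
    }
}
```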

The zero-downtime, 10-billion-row data migration process

The data migration process consists of roughly 10 steps:

  1. Environment preparation: complete the configuration of the online libraries
  2. Full synchronization: create the full-migration tasks for the two new tables (points summary table and detail table) in the data migration tool
  3. Incremental synchronization: after the full migration completes, enable the incremental task (it automatically backtracks to the start time of the full task; repeated consumption of messages is idempotent)
  4. Data verification: run full data verification to check whether the data is consistent
  5. Cut-over testing: finish the code changes and test them in pre-release (replay captured online traffic, run the various cases, check the cut-over switches, etc.), then release to production once everything passes
  6. Second verification: run full verification and correction again (data leveling)
  7. Enable double-write: turn on double writing (to keep the new data up to date in real time)
  8. Enable grayscale reads: at a low-traffic time (after midnight), cut reads over by userId % x, verify, widen the traffic gradually, and keep observing
  9. Write only to the new library: switch double-write to new-library-only writes, completing the data migration plan
  10. Migration complete: after the system has run stably for a while, take the migration and double-write code offline and release the old library's resources

step1: Environment preparation: online library configuration completed

Complete the configuration of the online libraries, find the business off-peak window, and perform the migration operations during it.

1) Configuration work

  • Apply for the DB resources for the 8 databases
  • Create the logical database and configure the logical tables
  • Configure the routing algorithm for the logical tables
  • Configure the sharding rules in the application
  • Configure multiple data sources

For multi-data-source configuration, you can refer to the multi-data-source setup in Nien's push middle-platform material.
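For illustration only (the original system's connection pool and framework are not named), loading both data sources up front might look like this, using HikariCP as a familiar pool; the URLs and credentials are placeholders:

```java
// Both the old single-database source and the new sharded (logical) source are
// created at startup, so switching between them later is a configuration
// change rather than an application release.
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import javax.sql.DataSource;

public final class DualDataSources {

    public static DataSource build(String jdbcUrl, String user, String password) {
        HikariConfig cfg = new HikariConfig();
        cfg.setJdbcUrl(jdbcUrl);
        cfg.setUsername(user);
        cfg.setPassword(password);
        cfg.setMaximumPoolSize(20);
        return new HikariDataSource(cfg);
    }

    public static void main(String[] args) {
        DataSource oldDb = build("jdbc:mysql://old-host:3306/points", "app", "secret");
        DataSource newDb = build("jdbc:mysql://sharding-proxy:3306/points_logic", "app", "secret");
        // Hand both sources to the dynamic read/write switching layer described earlier.
        System.out.println("old=" + oldDb + ", new=" + newDb);
    }
}
```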

2) Identify the business low-peak window from monitoring

  • Find the business low-peak period from the monitoring data and estimate the operation window. Monitoring showed that the low-traffic window falls roughly between 2:00 and 5:00 a.m.

step2: Start the full data migration

The full data migration cycle is very long.

"Full data migration" refers to the data already in the old database. This is the real bulk of the migration: it moves the rows that already exist in the old tables (historical rows written long ago, as well as rows written very recently, but not rows that will be written in the future). Generally, the rows of the source tables are read with SELECT and then written into the new library with INSERT, REPLACE INTO, UPDATE, and DELETE.

The full migration may take a week or even a month; it cannot be finished in a day or two.

In general, the full migration process is as follows:

Enable full sync

  • Jingwei (Alibaba's data synchronization tool) automatically synchronizes the data into the corresponding physical sub-tables according to the configured sharding rules
  • For the full-task start time, gmt_modified is selected as the condition field
  • During the full migration, source data is read from the standby database by default, and target data is written to the primary database

Both the full copy and the verification put real pressure on the source-side standby database and the target-side primary database, so take care not to affect online services.

  • While migrating data, if a write conflict occurs, the write is converted into an UPDATE and executed
  • Pay attention to how long the full migration will take. Suppose 8 tasks are started at the same time over 8 billion rows and each task sustains 10,000 TPS; the copy then takes 8,000,000,000 ÷ (8 × 10,000) = 100,000 seconds, roughly 28 hours, so the full copy alone needs a bit more than a day

  • Use the data migration dashboard to follow the migration progress in real time (ps: this dashboard is an Alibaba-internal tool; there are many similar open-source systems outside)
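A minimal sketch of the full-copy idea described in this step (not Jingwei itself): page through the source table by primary key and batch-write each page into the target. Table and column names are illustrative; real code would route each row to its physical shard table.

```java
// Chunked full copy: read the source table in primary-key order, page by page,
// and write each page into the target with REPLACE so the job can be re-run safely.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class FullCopyJob {

    public void copy(Connection source, Connection target) throws SQLException {
        long lastId = 0;
        final int pageSize = 1000;
        while (true) {
            try (PreparedStatement read = source.prepareStatement(
                    "SELECT id, user_id, points, gmt_modified FROM points_detail "
                  + "WHERE id > ? ORDER BY id LIMIT ?")) {
                read.setLong(1, lastId);
                read.setInt(2, pageSize);
                try (ResultSet rs = read.executeQuery();
                     PreparedStatement write = target.prepareStatement(
                             // In reality the physical table is chosen per row by the shard router.
                             "REPLACE INTO points_detail_0000 (id, user_id, points, gmt_modified) "
                           + "VALUES (?, ?, ?, ?)")) {
                    int rows = 0;
                    while (rs.next()) {
                        rows++;
                        lastId = rs.getLong("id");
                        write.setLong(1, rs.getLong("id"));
                        write.setLong(2, rs.getLong("user_id"));
                        write.setLong(3, rs.getLong("points"));
                        write.setTimestamp(4, rs.getTimestamp("gmt_modified"));
                        write.addBatch();
                    }
                    if (rows == 0) {
                        return;             // no more rows: the full copy is finished
                    }
                    write.executeBatch();   // REPLACE keeps the copy idempotent on re-runs
                }
            }
        }
    }
}
```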

step3: Enable incremental synchronization

Incremental synchronization: after the full migration completes, enable the incremental task (it automatically backtracks to the start time of the full task; repeated consumption of messages is idempotent).

After the full migration is completed, the newly written data still has to be handled.

Real-time incremental data must be written into the new storage after it lands in the original database, and data verification has to run continuously during this period.

Once verification shows no serious problems, the traffic is cut over step by step until it is fully cut, after which data verification and incremental migration are no longer needed.

When starting the incremental task in the data migration tool (Jingwei), pay attention to the following points:

  • On INSERT, if a primary key conflict occurs, Jingwei converts the INSERT into an UPDATE event, so users do not need to worry about exceptions caused by primary key conflicts.
  • Exception handling: when an incremental task hits an exception, the migration tool retries three times on the same machine by default; if all three retries fail, it raises an alarm and later retries on another machine.

About the synchronization position of incremental sync: the incremental task starts from a binlog position of the source database, also called the consumption point.

Specifically, the consumption point is the position in the binlog queue that has been successfully consumed so far, expressed as a timestamp in the format 'yyyy-MM-dd HH:mm:ss'.
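The INSERT-to-UPDATE conversion above boils down to an idempotent upsert on the primary key. A JDBC-level sketch (not Jingwei's code; the table and columns are illustrative):

```java
// Re-consuming the same binlog event must not create duplicates or throw a
// primary-key conflict: an upsert keyed on the primary key simply rewrites the
// row with the same values, so the write is idempotent.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class IdempotentWriter {

    public void apply(Connection target, long id, long userId, long points) throws SQLException {
        String sql = "INSERT INTO points_detail (id, user_id, points) VALUES (?, ?, ?) "
                   + "ON DUPLICATE KEY UPDATE user_id = VALUES(user_id), points = VALUES(points)";
        try (PreparedStatement ps = target.prepareStatement(sql)) {
            ps.setLong(1, id);
            ps.setLong(2, userId);
            ps.setLong(3, points);
            ps.executeUpdate();
        }
    }
}
```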

step4: Data verification

Run full data verification to check whether the data is consistent.

After full synchronization, the data must be verified continuously.

After incremental synchronization, the data must also be verified continuously.

Once verification shows no serious problems, the traffic is cut over step by step until it is fully cut, after which data verification and incremental migration are no longer needed.

Data verification includes:

  • the full verification service
  • the full correction service
  • points reconciliation (business verification)

Full verification service

The tools used below are Alibaba-internal, but the principles are general; you can refer to Alibaba Cloud's Data Transmission Service (DTS).

  • The full verification service is similar to the full migration service and is configured the same way. After the verification task runs, the number of missing and differing rows is shown on its page.
  • Verifying that the source and target data are consistent is also a required pre-step for the full correction service.

Verification finds the differences in the data so that they can be fixed, which is also called correction.

Full correction service

  • The correction service fixes the rows that differ between the source DB and the target DB to ensure consistency.
  • The verification service must be run before the correction service.

Points reconciliation (business verification)

Business verification is mainly about data consistency from the business perspective.

Note: business verification tasks must take care not to affect the services running online. Verification jobs usually issue many batch queries and scan tables in bulk; badly written verification code can easily bring the database down.

Reconciliation standard: the data in the target database and the source database must be identical (all fields).

Reconciliation scope: it can be approached from two sides, the points summary table and the points details.

Reconciliation process: a scheduled task polls the old and new databases for users whose data has already been migrated and checks them for consistency. Note that because the old and new databases are read one after the other, momentary inconsistencies will appear. Handle this with reconciliation plus retries; what matters is eventual consistency.
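A minimal sketch of that reconciliation-with-retry idea (the PointsReader interface is a hypothetical stand-in for reading a user's total from each library):

```java
// Compare a migrated user's total points in the old and new libraries. Because
// the two reads are not simultaneous, a first mismatch is retried a few times
// before it is reported as a real inconsistency needing correction.
public class PointsReconciler {

    public interface PointsReader { long totalPoints(long userId); }

    private final PointsReader oldDb;
    private final PointsReader newDb;

    public PointsReconciler(PointsReader oldDb, PointsReader newDb) {
        this.oldDb = oldDb;
        this.newDb = newDb;
    }

    /** Returns true if the old and new totals agree, possibly after a few retries. */
    public boolean check(long userId, int maxRetries) throws InterruptedException {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (oldDb.totalPoints(userId) == newDb.totalPoints(userId)) {
                return true;
            }
            Thread.sleep(200L);   // transient lag from the sync link is expected; wait briefly
        }
        return false;             // still different: flag the user for correction
    }
}
```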

1) Sampled data verification

Sample the most recent incremental data by business type or by user id and verify it (the verification tool below is Alibaba-internal; external readers can refer to Alibaba Cloud's DTS). Any problematic data found this way then needs to be corrected.

2) ODPS offline data verification

Use ODPS hourly tables for reconciliation. The idea is simple: use ODPS's data synchronization capability, its offline processing capability, and dynamically written scripts to quickly implement cross-system reconciliation.

No application changes are needed and stability is guaranteed; for scenarios with low real-time requirements, it is worth trying.

step5: Cut-over testing

Finish the code changes and test them in pre-release (replay captured online traffic, run the various cases, check the cut-over switches, etc.), then release to production once everything passes.

step6: Second verification

For safety, before the official cut-over, run the full verification and correction once more (data leveling).

step7: Enable the double-write (insert, delete, update) switch

Overall this is divided into the following steps, switched dynamically through configuration. The points to watch during the switch are as follows (a sketch of the double-write behavior follows this list):

  • Writes to the new library must be logged (instrumented) so they can be observed
  • A write to the new library is not required to succeed (a failure does not affect the service; inconsistent rows are later covered by the incremental task)
  • When both the old and the new library are written, the result returned to the caller is the one from the old data source
  • Data source rollback: once double-write is enabled, if the points totals in the new database turn out to be wrong, you cannot simply roll back onto the new data source
  • When to enable double-write: since incremental sync is already running, double-write does not need to be on before the traffic cut; it is enabled right before switching over to the new library. Why is double-write needed at all here?
    • In the extreme case, the incremental synchronization task may lag (in theory only by seconds)
    • Double-write keeps the new database synchronized in real time
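A minimal sketch of the double-write rules above, reusing the MigrationSwitchConfig sketched in the dynamic switching section; PointsWriter is a hypothetical interface over the old and new DAOs:

```java
// Double-write: the old library stays the source of truth, the new-library
// write is best-effort (logged on failure, never failing the request), and the
// caller always receives the old library's result. NEW_ONLY covers step 9.
public class DoubleWritePointsService {

    public interface PointsWriter { long addPoints(long userId, long delta); }

    private final PointsWriter oldDb;
    private final PointsWriter newDb;
    private final MigrationSwitchConfig config;

    public DoubleWritePointsService(PointsWriter oldDb, PointsWriter newDb,
                                    MigrationSwitchConfig config) {
        this.oldDb = oldDb;
        this.newDb = newDb;
        this.config = config;
    }

    public long addPoints(long userId, long delta) {
        if (config.getWriteMode() == MigrationSwitchConfig.WriteMode.NEW_ONLY) {
            return newDb.addPoints(userId, delta);        // step 9: write only the new library
        }
        long result = oldDb.addPoints(userId, delta);     // old library is authoritative
        if (config.getWriteMode() == MigrationSwitchConfig.WriteMode.DOUBLE_WRITE) {
            try {
                newDb.addPoints(userId, delta);
            } catch (Exception e) {
                // Do not fail the request; the incremental sync / correction job
                // repairs the gap later. Log it for the migration dashboard.
                System.err.println("double-write to new library failed, userId=" + userId + ": " + e);
            }
        }
        return result;                                    // caller sees the old library's result
    }
}
```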

step8: Enable grayscale reads and cut the read traffic

Pick the right window and operate during the low-traffic period (after midnight).

Proceed with grayscale, advancing step by step.

Cut the reads over gradually by userId % x, verify, widen the traffic step by step, and keep observing.

The read cut-over process:

  • Catalogue all query interfaces
  • Write a proxy layer for the DAO layer, e.g. xxProxyDAO.class
  • Control the read-switching logic in that proxy layer
  • Take the last 4 digits of userId modulo the grayscale threshold to dynamically decide which data source a query uses

Note: queries that do not carry the userId routing field require code changes. (A sketch of such a proxy follows.)
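A minimal sketch of such a proxy DAO (the name follows the xxProxyDAO convention from the list above; the rest is illustrative): whitelisted users go to the new library first, then the last four digits of userId against a configured threshold decide the data source.

```java
// Read-side gray routing: whitelist first, then userId % 10000 (its last four
// digits) compared with a configurable threshold, e.g. 100 for 1% of users.
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

public class PointsQueryProxyDao {

    public interface PointsQueryDao { long queryPoints(long userId); }

    private final PointsQueryDao oldDao;   // single-database DAO
    private final PointsQueryDao newDao;   // sharded DAO

    private final Set<Long> whitelist = new CopyOnWriteArraySet<>();
    private volatile int grayThreshold = 0;   // 0..10000, raised step by step: 1% -> 5% -> ... -> 100%

    public PointsQueryProxyDao(PointsQueryDao oldDao, PointsQueryDao newDao) {
        this.oldDao = oldDao;
        this.newDao = newDao;
    }

    public void setGrayThreshold(int threshold) { this.grayThreshold = threshold; }

    public void addToWhitelist(long userId) { whitelist.add(userId); }

    public long queryPoints(long userId) {
        boolean useNew = whitelist.contains(userId)
                || Math.floorMod(userId, 10000L) < grayThreshold;   // last 4 digits of userId
        return useNew ? newDao.queryPoints(userId) : oldDao.queryPoints(userId);
    }
}
```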

When data verification basically stops reporting errors, it means the migration pipeline is fairly stable. Can we then simply switch everything to the new data? Of course not: if the switch goes smoothly, fine, but if anything goes wrong, it affects all users. So the next step is grayscale, that is, cutting traffic over gradually.

The traffic cut is based on taking the user id modulo. It needs a concrete cut-over plan: in which time window, how much traffic to release, and always choosing a moment when traffic is relatively small. After every cut, watch the logs carefully and fix any problems as soon as possible. Releasing traffic is a slow-to-fast process: for example, start by adding 1% at a time, and later ramp up directly by 10% or 20%. Problems are usually discovered at small traffic volumes, and if the small volumes are clean, the later ramp-up can be fast.

The switch-over gradually increases the volume. Among the many grayscale techniques, whitelist verification is done first, and then the user ID modulo 10,000 is used to ramp up step by step.

Grayscale cut-over verification: 1 → 1% → 5% → 10% → 50% → 100% of the traffic.

After the grayscale read cut-over is complete, all read traffic lands on the new library.

step9: Write only to the new library

Switch the double-write to writing only the new library, which completes the data migration plan.

The double-write from step 7 thus becomes a single write.

step10: Complete the migration

Once the cut-over reaches 100%, the migration is complete.

However, the old library cannot be taken offline immediately; it has to be kept for a while as a backup, in case something goes wrong and you need to switch back to it.

In addition, the data in the old library must be kept up to date.

Therefore, a Jingwei synchronization task is started from the new library back to the old library.

After that, keep watching the work-order feedback from each business line as well as every system alert and log, and run performance stress tests against the new library to confirm its stability.

How the zero-downtime migration works for 10-billion-row sharded data, in brief:

Finally, a quick summary of the routine. It boils down to four steps:

Step 1: full (stock) synchronization,

Step 2: incremental synchronization,

Step 3: data verification,

Step 4: grayscale traffic cut-over.

In more detail, there are the 10 steps described above.

Summary of the Data Migration

Very long cycle: the whole migration, from drafting the plan to completing the final cut-over, lasted more than a month.

Reliable solution: whether the data is migrated with database tools or with purpose-built services, the goal is the same: no data differences, no user-perceptible impact, monitorable exceptions, and a solution that can be rolled back.

Sufficient rehearsal: rehearse as much as possible before the migration, and use automated test scripts to surface potential problems early and efficiently.

Migrate as early as possible: storage is stateful, which makes migration inherently hard. Developers need to be forward-looking and careful when selecting a database in the first place, to avoid migrations altogether. When a problem with the chosen database is discovered, decide decisively and migrate as soon as possible; do not put it off because the probability of trouble seems low. Once trouble hits, it becomes a major incident, and the losses are incalculable.

References

https://baijiahao.baidu.com/s?id=1711398612204958365&wfr=spider&for=pc

https://github.com/alibaba/tb_tddl

https://github.com/alibaba/yugong

https://www.aliyun.com/product/dts

Reviewing the earlier interview questions against the Cainiao points system solution

  • How do you migrate a database without downtime?
  • How do you migrate a 10-billion-row database table without downtime?

The scheme above can serve as your reference answer; it should leave the interviewer very satisfied.

Going forward, Nien will keep producing more and stronger answers based on industry cases.

Of course, if you run into high-concurrency interview questions like these, you are welcome to come chat in Nien's [Technical Freedom Circle].

Recommended related reading:

"Tencent is too ruthless: 4 billion QQ accounts and only 1 GB of memory, how do you deduplicate?"

"How to architect 100 billion rows of data at 300K-level QPS? A ceiling-level case"

"How to schedule 10-billion-level orders: a big factory's superb solution"

"Two big factories' 10-billion-level red-envelope architecture schemes"

"How to optimize performance for tens of millions or billions of rows? Textbook-level answers are here"

For the PDFs of "Nien Architecture Notes", "Nien High Concurrency Trilogy", and "Nien Java Interview Collection", go to the official account [Technical Freedom Circle] below ↓↓↓


Original source: https://blog.csdn.net/crazymakercircle/article/details/131573734