Red envelope architectures for 10-billion-scale, ultra-high traffic at two major tech companies

Application scenarios for 10-billion-scale red envelopes

Overview

Holidays such as Double Eleven and the Spring Festival are among the happiest days of the year, and sending and receiving red envelopes in WeChat groups is part of the fun. This year WeChat also launched face-to-face red envelopes, so people can exchange them in person while paying New Year visits. For users this is fun and convenient. On the architecture side, however, it makes the volume of red envelope send and receive data grow exponentially and makes processing considerably more complex.

In 2017, the number of WeChat red envelopes sent peaked on New Year’s Eve at 14.2 billion.

For a business scenario of this scale, concurrency, and peak load, even the engineering teams of US internet companies such as eBay and Amazon may find it hard to imagine what kind of technical architecture is needed to support such traffic.

When the scale of capital transactions reaches tens of billions, how can we ensure the concurrent performance and transaction security of the system?

On China's internet platforms today, two scenarios can be said to reach concurrency in the hundreds of millions:

One is the WeChat red envelope.

The other is the ByteDance red envelope.

Both reach a request load of more than hundreds of millions per unit of time.

Ten Billion Level WeChat Red Packet Technical Architecture

Compared with red envelopes in the traditional sense, the online "red envelopes" that have become popular in the past two years seem to have become the highlight of the Spring Festival. After thousands of years of inheritance and change, giving red envelopes during the Spring Festival has long been a cultural custom deeply rooted in the nation. According to data released by the companies involved, WeChat users sent a total of 8.08 billion red envelopes over the whole of New Year’s Eve, with a peak of 409,000 red envelopes sent and received per second. During the live broadcast of the Spring Festival Gala, there were 51.91 million Weibo posts discussing the Gala, netizen interactions reached 115 million, and netizens grabbed Weibo red envelopes more than 800 million times in total.

After the "shake" red envelope campaign during the 2015 Spring Festival Gala, business volume grew exponentially in the first half of 2015. In particular, the large increase in active WeChat red envelope users made the 2016 New Year's Eve red envelopes a great challenge. To cope with the foreseeable flood of red envelope traffic during the 2016 Spring Festival, the red envelope system went through a series of architectural adjustments and optimizations, mainly multi-site deployment, cache system optimization, concurrency strategy optimization for opening red envelopes, and storage optimization. Some of the most important ideas are introduced below.

Architecture

WeChat users have two access points in China, Shenzhen and Shanghai, customarily called South and North (Shenzhen is South, Shanghai is North). After a user request comes in, each service chooses its deployment method according to its characteristics. In terms of information flow, WeChat red envelope data can be divided into an order dimension and a user dimension.

The order is the key information that runs through sending, grabbing, opening, and viewing the details of a red envelope, and is transaction-type information. The user dimension refers to a user's lists of received and sent red envelopes, and is display-type information. The structure of the red envelope system has the following aspects:

North-South distribution

1. The order layer runs as independent north and south systems, and data is not synchronized between them

Users connect to the nearest access point. When a user requests to send a red envelope, the order is assigned to the north or south system and a north/south flag is embedded in the order number. When users grab or open the red envelope or view its details, the access layer uses the flag on the order number to direct traffic to the corresponding system, where the whole flow completes as a closed loop. Depending on where the sender and the grabber are located, there are four cases:

1) A Shenzhen user sends a red envelope and a Shenzhen user grabs it

The order lands in Shenzhen. The Shenzhen user does not need to cross cities to grab it, and the closed loop completes in Shenzhen.

2) A Shenzhen user sends a red envelope and a Shanghai user grabs it

The order lands in Shenzhen. The Shanghai user connects in Shanghai, the request crosses to Shenzhen over a dedicated line, and the grab completes in the Shenzhen closed loop.

3) A Shanghai user sends a red envelope and a Shanghai user grabs it

The order lands in Shanghai. The Shanghai user does not need to cross cities to grab it, and the closed loop completes in Shanghai.

4) A Shanghai user sends a red envelope and a Shenzhen user grabs it

The order lands in Shanghai. The Shenzhen user connects in Shenzhen, the request crosses to Shanghai over a dedicated line, and the grab completes in the Shanghai closed loop.

The advantage of this design is that the north and south systems share the traffic, which reduces system risk.
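The routing rule above can be sketched in a few lines of Go. This is only an illustration: the `S`/`N` prefix, the order number layout, and the names `NewOrderID` and `RouteByOrderID` are assumptions, since the source does not say how the flag is encoded.

```go
package main

import (
	"fmt"
	"strings"
)

// Region is a hypothetical identifier for the two deployments.
type Region string

const (
	South Region = "S" // Shenzhen
	North Region = "N" // Shanghai
)

// NewOrderID stamps the region chosen at creation time onto the order number.
// The prefix and the zero-padded layout are assumptions for illustration.
func NewOrderID(r Region, seq int64) string {
	return fmt.Sprintf("%s%012d", r, seq)
}

// RouteByOrderID returns the region whose closed loop must handle any later
// grab, open, or detail request, regardless of where the requester connected.
func RouteByOrderID(orderID string) (Region, error) {
	switch {
	case strings.HasPrefix(orderID, string(South)):
		return South, nil
	case strings.HasPrefix(orderID, string(North)):
		return North, nil
	}
	return "", fmt.Errorf("order %q carries no region flag", orderID)
}

func main() {
	id := NewOrderID(South, 42)
	region, err := RouteByOrderID(id)
	fmt.Println(id, region, err)
}
```

A Shanghai user grabbing this order would still be routed to `South`, which corresponds to case 2) above.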

2. User data is write-heavy and read-light; the full data set is stored in Shenzhen, written through an asynchronous queue, and queried across cities when needed

The query entry for user data is buried deep in the WeChat wallet. This means that access to user data will not be heavy; it is also treated as non-critical information that can be degraded, and its real-time requirements are low. Therefore, it is only necessary to split the user data write off from the order dimension when a red envelope is sent or opened, and write it to Shenzhen asynchronously through MQ. A background job periodically reconciles user data against orders to ensure data integrity.

3. Support flexible regulation of north-south traffic

Once the red envelope system is distributed north and south, whether an order lands in Shenzhen or Shanghai can be allocated flexibly, using logic in the access layer only. For example, the access layer can land all red envelope requests in Shenzhen (regardless of whether the user connects from Shanghai or Shenzhen), so that the Shanghai red envelope system receives no requests. This improves the disaster recovery capability of the red envelope system. A background management system for the access layer was also built, which makes it possible to adjust capacity within seconds, so that traffic can be redistributed based on real-time monitoring of north and south request volumes.

4. Traffic transfer on DB failure

Building on the north-south traffic control capability, when a DB failure is detected, red envelope traffic can be shifted to the other side to achieve disaster recovery for the DB failure.

Order Form

Before payment, the order is placed in the cache, and the cache's atomic incr operation is used to generate red envelope order numbers sequentially. The advantages are that cache operations are lightweight and that wasted DB writes are reduced: there is a conversion rate between a user's request to send a red envelope and actual payment, and some users never pay after making the request.
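A minimal sketch of this idea, assuming an in-process counter as a stand-in for the remote cache: `counterCache.Incr` mimics an atomic incr command, and `pendingOrders` keeps unpaid orders out of the DB. All type and method names here are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// counterCache stands in for the remote cache; a real deployment would call
// the cache service's atomic incr instead of a local counter.
type counterCache struct{ n int64 }

func (c *counterCache) Incr() int64 { return atomic.AddInt64(&c.n, 1) }

// pendingOrders holds unpaid orders in memory only; nothing touches the DB
// until payment succeeds.
type pendingOrders struct {
	mu     sync.Mutex
	orders map[int64]int64 // order number -> amount in fen
}

// Create reserves the next order number and records the unpaid order.
func (p *pendingOrders) Create(c *counterCache, amountFen int64) int64 {
	id := c.Incr()
	p.mu.Lock()
	p.orders[id] = amountFen
	p.mu.Unlock()
	return id
}

func main() {
	cache := &counterCache{}
	pending := &pendingOrders{orders: map[int64]int64{}}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			pending.Create(cache, 100)
		}()
	}
	wg.Wait()
	// Every concurrent request received its own order number.
	fmt.Println(len(pending.orders))
}
```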

Asynchronous crediting when opening red envelopes

Information flow is separated from capital flow.

When a red envelope is opened, the opening voucher is first recorded in the DB, and the crediting request is then placed on an asynchronous queue.

Crediting failures are retried through a compensation queue, and finally the red envelope vouchers are reconciled with the crediting records of the user accounts to ensure eventual consistency. As shown below:

The theoretical basis of this design is the separation of fast and slow paths.

Crediting a red envelope is a distributed transaction, which is a slow interface.

Writing the red envelope voucher, by contrast, is fast.

In practice, after grabbing a red envelope, a user mostly cares about who had the "best luck" in the detail list, and rarely checks whether the small amount they grabbed has already been credited.

It is therefore enough to display the user's opening voucher.

Double-layer cache for sending, opening, and other operations

1. The cache serves all queries, using two cache layers

In addition to using ckv as a full cache, a local in-memory cache is added in the data access layer (dao) as a second-level cache, so that all read requests are served from cache.

When a ckv query fails or the record does not exist, the query falls back to the memory cache; when the memory cache query fails or the record does not exist, it falls back to the DB.

The DB itself does not do read-write separation.
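The fallback order described above (ckv, then the local memory cache, then the DB) could look like the following sketch. The `Store` interface and the map-backed tiers are assumptions made for illustration, not the real ckv or dao APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is returned when no tier holds the key.
var ErrNotFound = errors.New("not found")

// Store is a hypothetical read interface shared by all three tiers.
type Store interface {
	Get(key string) (string, error)
}

// mapStore is a toy tier backed by a map.
type mapStore map[string]string

func (m mapStore) Get(key string) (string, error) {
	if v, ok := m[key]; ok {
		return v, nil
	}
	return "", ErrNotFound
}

// ReadThrough tries each tier in order and returns the first hit. Any error,
// whether a failure or a missing record, falls through to the next tier.
func ReadThrough(key string, tiers ...Store) (string, error) {
	for _, t := range tiers {
		if v, err := t.Get(key); err == nil {
			return v, nil
		}
	}
	return "", ErrNotFound
}

func main() {
	ckv := mapStore{}
	mem := mapStore{}
	db := mapStore{"order:1": "opened"}
	v, err := ReadThrough("order:1", ckv, mem, db)
	fmt.Println(v, err)
}
```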

2. DB writes are synchronized to the cache, tolerating a small amount of inconsistency

After a DB write completes, the dao layer synchronizes the memory cache and the business service layer synchronizes ckv. Failures are compensated through an asynchronous queue.

ckv is periodically reconciled against the DB standby to ensure the data is eventually consistent.

High concurrency

The concurrency challenge of WeChat red envelopes comes mainly from large WeChat groups, where many people grab the same red envelope at the same time.

This causes contention for MySQL row locks. To control this concurrency, the team did several things:

1. Requests are routed by red envelope order, logical units are vertically sticky, and transactions are isolated

Logical units are divided by red envelope order, and processing is closed within a unit. When a service makes an RPC call, it uses the hash of the red envelope order number as the key to find the next-hop address. All open requests and query requests for the same red envelope are routed to the same logical machine and the same DB.
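As a rough illustration of hash-based next-hop selection, the sketch below maps an order number to one of several logical units. FNV-1a and the node names are arbitrary choices; the source does not name the hash function or the addressing scheme.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickNode maps an order number to one logical unit, so every request for
// the same red envelope reaches the same machine and the same DB.
func pickNode(orderID string, nodes []string) string {
	h := fnv.New32a()
	h.Write([]byte(orderID))
	return nodes[h.Sum32()%uint32(len(nodes))]
}

func main() {
	nodes := []string{"unit-0", "unit-1", "unit-2", "unit-3"}
	// Both calls print the same unit because the key is the same order.
	fmt.Println(pickNode("order-42", nodes))
	fmt.Println(pickNode("order-42", nodes))
}
```

Plain modulo hashing reshuffles most orders whenever the node list changes; whether the real system uses something more stable, such as consistent hashing, is not stated in the source.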

2. The dao builds a local Memcache in-memory cache to limit concurrency on the same red envelope

On the DB access machine (dao), a local in-memory cache is built. With the red envelope order number as the key, open requests for the same red envelope are counted atomically, which limits the number of concurrent requests for that red envelope that may enter the DB at the same time.

This strategy relies on routing requests by the hash of the red envelope order number, which ensures that all requests for the same red envelope reach the same logical-layer machine.
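The per-order counter can be sketched as below. A mutex-protected map stands in for the local Memcache, and the limit value, the type `OrderLimiter`, and its methods are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// OrderLimiter caps how many open requests for the same red envelope may be
// inside the DB at once.
type OrderLimiter struct {
	mu       sync.Mutex
	inFlight map[string]int
	limit    int
}

func NewOrderLimiter(limit int) *OrderLimiter {
	return &OrderLimiter{inFlight: map[string]int{}, limit: limit}
}

// TryAcquire reports whether the request may proceed to the DB.
func (l *OrderLimiter) TryAcquire(orderID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inFlight[orderID] >= l.limit {
		return false
	}
	l.inFlight[orderID]++
	return true
}

// Release must be called once the DB transaction for the request finishes.
func (l *OrderLimiter) Release(orderID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inFlight[orderID] > 0 {
		l.inFlight[orderID]--
	}
	if l.inFlight[orderID] == 0 {
		delete(l.inFlight, orderID)
	}
}

func main() {
	lim := NewOrderLimiter(2)
	fmt.Println(lim.TryAcquire("order-1"), lim.TryAcquire("order-1"), lim.TryAcquire("order-1"))
	lim.Release("order-1")
	fmt.Println(lim.TryAcquire("order-1"))
}
```

Because the counter is local to one machine, it is only meaningful when the hash routing described above sends every request for a given order to that machine.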

3. Multi-level concurrency control

1) Sending control

Sending a red envelope is the entry point of the business flow, so controlling concurrency here controls the overall concurrency of the red envelope business. Multiple layers of flow control are applied on the sending path to keep the number of valid red envelopes generated within a controllable range.

2) Grabbing control

Receiving a WeChat red envelope takes two steps: grabbing and opening.

The grab step itself limits the concurrency of opening, because grabbing only checks data in the cache and does not need to query the DB.

Requests for red envelopes that are already fully claimed, from users who have already claimed, or for expired red envelopes can be intercepted directly. Flow control is also applied to the requests that are eligible to proceed to opening. After this filtering, the traffic that finally reaches the opening step is greatly reduced and consists only of valid requests.

3) Memory cache control when opening

Concurrency control for opening the same red envelope was described above.

4. DB simplification and splitting

Many factors affect the concurrency capability of a DB. The red envelope system was optimized around its usage scenarios, and two points are worth borrowing:

1) The order table stores only key fields; other fields are stored only in the cache and can be degraded.

In the red envelope detail view, besides the key order information (user, order number, amount, time, status), there are fields such as user avatar, nickname, and greeting. These fields are not critical to the transaction but take up a large amount of storage.

This non-key information is stripped from the order and kept only in the cache for user queries and display; it is not persisted with the order.

This keeps orders lightweight and efficient. When the cache misses, the data can be fetched from the real-time interface as compensation, which reduces the capacity needed by the order DB.

2) The DB is sharded into databases and tables along two dimensions, with hot and cold data separated

Databases and tables are sharded along two dimensions, order hash and order date, in the format db_xxx.t_x_dd.

Here x represents the hash value of the order, and dd represents the day of the month, cycling from 01 to 31.

The order hash dimension spreads orders across different DB servers to balance load.

The day-of-month dimension prevents any single table from growing without bound, so that each day starts with an empty table.

In addition, red envelope orders have a very typical hot/cold pattern.

Hot data is concentrated within a day or two, and access drops sharply over time.

The online hot database only needs to store a few days of data; the rest can be moved to a low-cost cold database at regular intervals.

The daily cyclic tables also make it easier to migrate historical data.
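To make the db_xxx.t_x_dd naming concrete, here is a small sketch that derives a table name from an order number and a day of the month. The shard counts, the FNV hash, and the way the hash is split between database and table are assumptions; the source only gives the name format.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"time"
)

// tableFor returns "db_xxx.t_x_dd" for an order: the hash picks the database
// and table shard, and the day of the month (1-31) picks the daily table.
func tableFor(orderID string, day int, dbCount, tablesPerDB uint32) string {
	h := fnv.New32a()
	h.Write([]byte(orderID))
	sum := h.Sum32()
	db := sum % dbCount
	tbl := (sum / dbCount) % tablesPerDB
	return fmt.Sprintf("db_%03d.t_%d_%02d", db, tbl, day)
}

func main() {
	// Route an order created today, assuming 16 databases with 8 tables each.
	fmt.Println(tableFor("order-42", time.Now().Day(), 16, 8))
}
```

Since the dd suffix cycles monthly, a table for a given day has to be emptied (for example by migrating it to the cold store) before that day of the month comes around again.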

Red envelope algorithm

First, if only one red envelope remains, the entire remaining amount is used for this round, which ensures that all the money is given out.

Otherwise, compute the minimum amount that must be taken in this round so that the remaining amount can still be fully claimed; this is the lower water level of the round. The main steps are as follows:

Compute the lower water level for this round: assume this round takes the minimum of 1 fen (0.01 yuan). If the rest can still be fully claimed with every later round taking 200 yuan, the lower water level is 1 fen. If not, assume every later round takes 200 yuan; whatever is left over must be taken in this round, and that is the lower water level.

Compute the upper water level for this round: assume this round takes 200 yuan. If the remaining money is still enough for every later round to take 1 fen, the upper water level is 200 yuan. If not, assume every later round takes 1 fen and compute the upper water level from what is left.

To keep the amounts from being too disparate, the upper water level is adjusted using the average. If the upper water level is greater than twice the average of the remaining red envelopes, twice the average is used as the upper water level. In other words, the amount grabbed in each round is at most twice the average of what remains.

Finally, take a random number modulo the upper water level. If the result is smaller than the lower water level, use the lower water level; otherwise use the random result as the amount for this round.
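The steps above can be sketched as follows. Amounts are in fen, with the 1 fen minimum and 200 yuan maximum taken from the text; the function names, the seeded `math/rand` source, and the precondition that the total lies between n × 1 fen and n × 200 yuan are assumptions of this sketch, not details from the source.

```go
package main

import (
	"fmt"
	"math/rand"
)

const (
	minFen int64 = 1     // 1 fen
	maxFen int64 = 20000 // 200 yuan
)

// splitOne returns the amount for the current round, given the remaining
// amount and the number of envelopes left (including this one).
func splitOne(remain, left int64, rng *rand.Rand) int64 {
	if left == 1 {
		return remain
	}
	// Lower water level: what must be taken now so that later rounds, each
	// taking at most maxFen, can still empty the pot.
	low := remain - (left-1)*maxFen
	if low < minFen {
		low = minFen
	}
	// Upper water level: leave at least minFen for every later round.
	high := remain - (left-1)*minFen
	if high > maxFen {
		high = maxFen
	}
	// Cap the upper water level at twice the average of what remains.
	if twiceAvg := 2 * remain / left; high > twiceAvg {
		high = twiceAvg
	}
	if high < low {
		high = low // defensive guard; not expected for valid inputs
	}
	amt := rng.Int63() % high
	if amt < low {
		amt = low
	}
	return amt
}

// Split divides totalFen into n envelopes. It assumes
// n*minFen <= totalFen <= n*maxFen.
func Split(totalFen int64, n int, seed int64) []int64 {
	rng := rand.New(rand.NewSource(seed))
	out := make([]int64, 0, n)
	remain := totalFen
	for left := int64(n); left > 0; left-- {
		amt := splitOne(remain, left, rng)
		out = append(out, amt)
		remain -= amt
	}
	return out
}

func main() {
	fmt.Println(Split(10000, 10, 1)) // 100 yuan split into 10 envelopes
}
```

Because the random value is taken modulo the upper water level, a round never draws the upper water level itself unless the lower water level forces it.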

Flexible downgrade scheme

Failures can occur anywhere in the system, so every link needs a contingency plan.

The main degradation measures that WeChat red envelopes take for system failures are listed below.

1. Fall back to the DB when the order cache fails

The order cache has two functions: generating red envelope order numbers and caching orders.

When the cache fails, orders are written directly to the DB, and order numbers are generated independently by an ID generator.

2. Fall back to the DB when the cache fails during grabbing

When a red envelope is grabbed, the cache is queried to intercept invalid requests, such as red envelopes that are fully claimed, users who have already claimed, and expired red envelopes. When the cache fails, queries fall back to the DB, and DB rate-limiting protection is switched on at the same time to keep excessive pressure on the DB from making the service unavailable.

In addition, because the DB does not store user avatars, nicknames, and similar fields (the optimization mentioned above), these are fetched from the real-time interface when the cache is down. If that query also fails, the system degrades further and displays a default avatar and nickname.

3. Multi-level degradation of crediting when opening

When a red envelope is opened, the DB records the opening voucher and the fund transfer is then executed. The voucher must be persisted in real time, while the fund transfer has several levels of degradation:

1) Large red envelopes are transferred in real time, and small red envelopes are queued for asynchronous transfer.

2) All red envelopes are queued for asynchronous transfer.

3) No transfer is performed in the real-time flow, not even asynchronously; funds are credited afterwards in batches based on the vouchers.

In short, once the voucher is persisted, the actual crediting can be real-time or asynchronous, and eventual consistency is guaranteed.
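One way to picture the three transfer levels is as a switch consulted after the voucher has been persisted. The level names, mode strings, and the amount threshold below are hypothetical; the source does not say where the line between large and small amounts is drawn.

```go
package main

import "fmt"

// Level is a hypothetical switch for the three degradation levels.
type Level int

const (
	LargeRealtime Level = iota // large amounts in real time, small ones queued
	AllQueued                  // every transfer queued
	BatchLater                 // no transfer now; credit in batches from vouchers
)

// Illustrative names for how a transfer is carried out.
const (
	ModeRealtime = "realtime"
	ModeQueue    = "queue"
	ModeBatch    = "batch"
)

// transferMode decides how to credit one opened red envelope. It assumes the
// voucher is already persisted; thresholdFen is an assumed cutoff.
func transferMode(level Level, amountFen, thresholdFen int64) string {
	switch level {
	case LargeRealtime:
		if amountFen >= thresholdFen {
			return ModeRealtime
		}
		return ModeQueue
	case AllQueued:
		return ModeQueue
	default:
		return ModeBatch
	}
}

func main() {
	fmt.Println(transferMode(LargeRealtime, 5000, 1000))
	fmt.Println(transferMode(LargeRealtime, 50, 1000))
	fmt.Println(transferMode(AllQueued, 5000, 1000))
	fmt.Println(transferMode(BatchLater, 5000, 1000))
}
```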

4. User list downgrade

User list data in the WeChat red envelope system is non-critical path information and can be downgraded.

First, writes go through MQ asynchronously, and consistency is guaranteed through periodic reconciliation.

Second, the cache holds only two screens of data; if a user queries beyond two screens, the user list DB is queried. Under high system pressure, users can be restricted to viewing only two screens.

The adjusted system stood the test of the 2016 Spring Festival, passed the New Year's Eve peak smoothly, and safeguarded the experience of red envelope users.

The above reference source: Architecture said public account, author: Fang Leming

Fang Leming graduated from South China University of Technology in 2011 with a major in communication and information systems, and joined Tenpay Technology Co., Ltd. after graduation. Since the WeChat payment team was established, he has been mainly responsible for the backend architecture of payment products such as WeChat red envelopes, WeChat transfers, and AA collection.

ByteDance red envelope architecture: 3.6 million QPS at the 10-billion scale

1. Background & Challenges & Goals

1.1 Business Background

(1) Support for eight apps:

ByteDance's 2022 Spring Festival campaign needs to support reward interoperability across eight apps (Douyin, Douyin Volcano, Douyin Extreme Edition, Watermelon, Toutiao, Toutiao Express, Tomato Novels, and Tomato Smooth Listening). Users can take part in the campaign in any of these apps, and rewards earned in one app can be withdrawn and used in the others.

(2) Varied gameplay:

The main activities are card collection, friend-page red envelope rain, red envelope rain, the card collection lottery, and the fireworks show.

(3) Varied rewards:

Reward types include cash red envelopes, subsidized video red envelopes, commercial advertising coupons, e-commerce coupons, payment coupons, consumer finance coupons, insurance coupons, credit card coupons, tea coupons, movie tickets, dou+ coupons, Douyin cultural and creative coupons, avatar accessories, etc.

1.2 Core Challenges

(1) Ultra-high throughput and concurrency, with reward issuance estimated to peak at 3.6 million QPS.

(2) Many reward types, more than ten in total, combined with many issuance scenarios and varied gameplay.

(3) All-round safeguards covering reward system stability, user experience, fund security, and basic operational capabilities, to ensure the campaign runs smoothly.

1.3 Final goal

(1) Reward crediting: data must be highly reliable. Provide a unified error handling mechanism, idempotent crediting, and reward budget control.

(2) Reward display and use: let users view rewards, withdraw cash, use coupons and accessories, and so on.

(3) Stability guarantees: under high-volume crediting, keep the wallet's core path stable and intact, and protect the core user-facing reward experience through common measures such as resource scaling, rate limiting, circuit breaking, degradation, fallbacks, and resource isolation.

(4) Fund security: use mechanisms such as idempotency, reconciliation, and monitoring and alerting to keep funds secure and to make sure every reward that should be issued to a user is issued.

(5) Campaign isolation: isolate reward crediting and display data across the three stages of internal testing, gray release, and the official Spring Festival campaign, so that they do not affect each other.

2. Product requirements

Users can take part in ByteDance's Spring Festival campaign in any of the apps to earn rewards. Taking cash red envelope crediting from Douyin's red envelope rain as an example, the business flow is as follows:

Log in to Douyin → Participate in the campaign → Campaign wallet page → Tap the withdraw button → Withdrawal page → Withdraw → Withdrawal result page.

The campaign wallet page can also be reached from the wallet page.

Core reward issuance scenarios:

  1. Card collection: various coupons are issued during card collection, large cash red envelopes are issued to the lucky card collection "koi" winners, and cash prizes and coupons are issued in the card collection lottery;
  2. Red envelope rain: cash red envelopes, coupons, and video subsidy red envelopes are issued, with red envelopes and coupons peaking at 1.8 million QPS.

3. Wallet asset middle platform design and implementation

In the 2022 Spring Festival campaign, the teams involved were:

UG, the incentive middle platform, video red envelopes, the wallet team, the asset middle platform, and others.

UG is mainly responsible for implementing the campaign gameplay, including the business logic and stability guarantees for specific activities such as card collection, red envelope rain, and the fireworks show.

The wallet team is responsible for reward crediting, reward display, reward use, and fund security under high traffic.

Within it, the asset middle platform is responsible for reward issuance and reward display.

3.1 The overall architecture of the Spring Festival asset middle platform is as follows:

The core systems of the wallet asset middle platform are divided as follows:

  1. Asset order layer:

    Converges the reward crediting paths of the eight apps,

    provides a unified interface protocol for the reward issuance of upstream campaign teams,

    and supports budget control, compensation, order number idempotency, and so on.

  2. Campaign wallet API layer:

    Unifies the reward display path and supports high-traffic scenarios.

3.2 Asset Order Center Design

Core issuance model:

Notes:

The activity ID uniquely identifies a campaign.

This Spring Festival campaign was assigned its own parent activity ID.

A scene ID corresponds one-to-one with a specific reward type

and defines the configuration unique to issuing rewards in that scene.

The configurable capabilities of a scene ID are:

  • Copy text for reward issuance records;
  • Whether compensation is required;
  • Rate limiting configuration;
  • Whether to apply inventory control;
  • Whether to perform reconciliation.
  • Provides pluggable capabilities for optional service access.

Order number design:

The asset order layer supports idempotency at the order number level. The order number is built as

${actID}_${scene_id}_${rain_id}_${award_type}_${statge}

The design of the order number itself guarantees that rewards are not over-issued: in each scene, a user can receive a given reward at most once.
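A minimal sketch of order-number idempotency, assuming an in-memory ledger in place of the account database. The `AwardKey` field names and the `Ledger` type are hypothetical; a real system might enforce the same rule with a unique index on the order number.

```go
package main

import (
	"fmt"
	"sync"
)

// AwardKey holds the parts of the order number. Field names are illustrative.
type AwardKey struct {
	ActID     int64
	SceneID   string
	RainID    string
	AwardType string
	Stage     string
}

// OrderNo joins the parts with underscores, following the pattern above.
func (k AwardKey) OrderNo() string {
	return fmt.Sprintf("%d_%s_%s_%s_%s", k.ActID, k.SceneID, k.RainID, k.AwardType, k.Stage)
}

// Ledger is an in-memory stand-in for the account DB.
type Ledger struct {
	mu      sync.Mutex
	seen    map[string]bool
	balance map[int64]int64 // actID -> balance in fen
}

func NewLedger() *Ledger {
	return &Ledger{seen: map[string]bool{}, balance: map[int64]int64{}}
}

// Credit applies the amount once per order number and reports whether this
// call was the one that credited it.
func (l *Ledger) Credit(k AwardKey, amountFen int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	no := k.OrderNo()
	if l.seen[no] {
		return false // duplicate request: already credited
	}
	l.seen[no] = true
	l.balance[k.ActID] += amountFen
	return true
}

func main() {
	l := NewLedger()
	k := AwardKey{ActID: 1001, SceneID: "rain", RainID: "3", AwardType: "cash", Stage: "prod"}
	fmt.Println(k.OrderNo(), l.Credit(k, 188), l.Credit(k, 188), l.balance[1001])
}
```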

4. Solving the core difficulties

4.1 Difficulty 1: Reward data interoperability across eight apps

There are eight apps, which need to be connected in a unified way.

Douyin-family and Toutiao-family apps have different account systems, so rewards cannot be linked through user IDs.

The specific solution is:

  • Generate a unique actID for each user.
  • The mobile phone number takes priority: if the phone numbers registered in different apps are the same, the actIDs in those apps are the same.

With a unique actID, each user's reward data is bound to the actID, and crediting and queries are done at the actID level, which makes rewards interoperable across the eight apps.
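The actID rule can be illustrated with a small mapper. The source only says that the phone number has the highest priority; falling back to the (app, uid) pair when no phone number is present, and all names below, are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// Account identifies a login in one app. Phone may be empty.
type Account struct {
	App   string
	UID   int64
	Phone string
}

// ActIDMapper hands out actIDs, keyed by phone number when one is available.
type ActIDMapper struct {
	mu   sync.Mutex
	next int64
	ids  map[string]int64
}

func NewActIDMapper() *ActIDMapper {
	return &ActIDMapper{ids: map[string]int64{}}
}

// ActID returns the same ID for accounts that share a phone number, even if
// they belong to different apps with different user ID systems.
func (m *ActIDMapper) ActID(a Account) int64 {
	key := fmt.Sprintf("app:%s:uid:%d", a.App, a.UID) // assumed fallback key
	if a.Phone != "" {
		key = "phone:" + a.Phone
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	if id, ok := m.ids[key]; ok {
		return id
	}
	m.next++
	m.ids[key] = m.next
	return m.next
}

func main() {
	m := NewActIDMapper()
	douyin := Account{App: "douyin", UID: 7, Phone: "13800000000"}
	toutiao := Account{App: "toutiao", UID: 99, Phone: "13800000000"}
	fmt.Println(m.ActID(douyin) == m.ActID(toutiao))
}
```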

The schematic diagram is as follows:

4.2 Difficulty 2: Reward crediting in high-concurrency scenarios

In ultra-high-concurrency scenarios, issuing cash red envelopes is the most critical part, for the following reasons:

  1. Cash red envelope issuance is estimated to peak at 1.8 million TPS.
  2. The cash red envelope itself is of high value, so fund security must be ensured.
  3. Users are highly sensitive to cash, so cost must also be considered while ensuring user experience and functional completeness.

For these reasons, issuing cash red envelopes poses considerable technical challenges.

Sending a red envelope is in effect a transaction: funds flow from the company's cost account into the user's personal account.

(1) Technically, idempotency at the order number level is required: multiple requests with the same order number are credited only once. The order number is built as

${actID}_${scene_id}_${rain_id}_${award_type}_${statge}

The design of the order number itself guarantees that rewards are not over-issued.

(2) To support high concurrency, there are two traditional approaches:

| Scheme | Implementation idea | Advantages | Disadvantages |
| --- | --- | --- | --- |
| Synchronous crediting | Provision computing and storage resources equal to the estimated traffic | 1. Simple to develop; 2. Not error-prone | Wasted storage cost. Taking the account database as an example, stress tests showed that 152 database instances are required to support 300,000 red envelopes, and at least 1152 database instances are required to support 1,800,000 red envelopes, not counting other computing and storage resources such as TCE and Redis. |
| Asynchronous crediting | Provision only part of the computing and storage resources, leaving a gap between actual crediting capacity and estimated traffic | 1. Simple to develop; 2. Not error-prone; 3. No wasted resources | User experience suffers badly. Crediting is significantly delayed; for this year's campaign the delay would exceed ten minutes. After taking part and winning rewards, users would not see them on the campaign wallet page and could not withdraw cash, which would generate many complaints and hurt the effect of the Douyin campaign. |

Both traditional approaches have obvious shortcomings.

So what solution can both save resources and protect the user experience?

In the end, the red envelope rain token scheme was adopted.

It combines asynchronous crediting with a small amount of distributed storage, at the cost of a more complex design.

It is described in detail below.

4.2.1 Red envelope rain token scheme:

Based on the estimated issuance of red envelopes, the token scheme calculates that actual crediting needs to support at least 300,000 TPS, so actual crediting lags behind issuance and a backlog of orders builds up.

Design goals:

Even with a large gap between the estimated issuance rate (1.8 million) and the actual crediting rate (300,000), the user's core experience is guaranteed.

Users must not perceive the backlog when viewing or using rewards on the front-end pages; that is, the viewing and using experience must not be affected. The displayed data includes the balance, accumulated income, and red envelope records; use includes cash withdrawal and so on.

Specific design plan:

Each time a red envelope is issued to a user in a high-traffic scenario, an encrypted token is generated (using asymmetric encryption and containing the red envelope's metadata: amount, actID, issuance time, and so on).

The token is stored on both the client and the server (as mutual backups for disaster recovery), and each user has a token list.

Each time a red envelope is issued, the token's crediting status is recorded in Redis.

The cash red envelope records, balance, and other data that the user sees on the campaign wallet page are then the result of merging: credited red envelope list + token list − tokens already credited.

To make sure the backlog is also invisible during cash withdrawal,

when the user enters the withdrawal page or taps withdraw, the uncredited tokens in the list are forcibly credited.

This ensures that the account balance at withdrawal time is the full amount the user is owed, and that the withdrawal flow is not blocked.
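The merge rule and the forced crediting on withdrawal can be sketched as below. The `Wallet` type is an in-memory stand-in for the account store plus the token list, tokens are assumed to be unique per order number, and decryption and Redis status tracking are left out.

```go
package main

import "fmt"

// Token carries only the fields needed for this sketch.
type Token struct {
	OutTradeNo string
	AmountFen  int64
}

// Wallet combines what is already in the account with the user's tokens.
type Wallet struct {
	credited map[string]int64 // order number -> amount already credited
	tokens   []Token          // issued tokens, assumed unique per order number
}

// AccountBalance is what has actually been credited so far.
func (w *Wallet) AccountBalance() int64 {
	var total int64
	for _, amt := range w.credited {
		total += amt
	}
	return total
}

// DisplayBalance merges credited amounts with tokens not yet credited, so
// the backlog is invisible on the wallet page.
func (w *Wallet) DisplayBalance() int64 {
	total := w.AccountBalance()
	for _, t := range w.tokens {
		if _, done := w.credited[t.OutTradeNo]; !done {
			total += t.AmountFen
		}
	}
	return total
}

// ForceCredit credits every pending token and returns how many it credited.
// It would run when the user opens the withdrawal page or taps withdraw.
func (w *Wallet) ForceCredit() int {
	n := 0
	for _, t := range w.tokens {
		if _, done := w.credited[t.OutTradeNo]; !done {
			w.credited[t.OutTradeNo] = t.AmountFen
			n++
		}
	}
	return n
}

func main() {
	w := &Wallet{
		credited: map[string]int64{"a": 100},
		tokens:   []Token{{"a", 100}, {"b", 50}, {"c", 30}},
	}
	fmt.Println(w.AccountBalance(), w.DisplayBalance())
	w.ForceCredit()
	fmt.Println(w.AccountBalance(), w.DisplayBalance())
}
```

The displayed balance is the same before and after `ForceCredit`; only the real account balance catches up, which is the property the design goals ask for.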

The schematic diagram is as follows:

token data structure:

The token uses the protobuf format.

Unit tests showed that its storage footprint is about half that of JSON, which saves network bandwidth and storage cost;

At the same time, the CPU consumption of serialization and deserialization is also reduced.

// Red envelope rain token structure
type RedPacketToken struct {
   AppID      int64  `protobuf:"varint,1,opt" json:"AppID,omitempty"`      // app ID
   ActID      int64  `protobuf:"varint,2,opt" json:"UserID,omitempty"`     // actID
   ActivityID string `protobuf:"bytes,3,opt" json:"ActivityID,omitempty"`  // activity ID
   SceneID    string `protobuf:"bytes,4,opt" json:"SceneID,omitempty"`     // scene ID
   Amount     int64  `protobuf:"varint,5,opt" json:"Amount,omitempty"`     // red envelope amount
   OutTradeNo string `protobuf:"bytes,6,opt" json:"OutTradeNo,omitempty"`  // order number
   OpenTime   int64  `protobuf:"varint,7,opt" json:"OpenTime,omitempty"`   // draw time
   RainID     int32  `protobuf:"varint,8,opt,name=rainID" json:"rainID,omitempty"` // red envelope rain ID
   Status     int64  `protobuf:"varint,9,opt,name=status" json:"status,omitempty"` // crediting status
}

token security guarantee:

Asymmetric encryption is used so that the token stored on the client is, as far as possible, not cracked.

If the token encryption algorithm is nevertheless broken by attackers, this can be detected through monitoring and alerting, and the token scheme can be downgraded.

4.3 Difficulty 3: Keeping the reward issuance link stable despite many dependencies

Schematic diagram of the downgrade paths in the red envelope issuance flow:

Experience shows that the more complex a function is, the more dependencies it has and the higher the stability risk. How can a system with this many dependencies be kept stable?

solution:

The most basic function, cash red envelope account entry, must be guaranteed: the red envelope a user has won must be credited to the user's account.

The core function needs to support idempotency and budget control (to avoid over-issuing).

The idempotent design of red envelope account entry relies strongly on the database to maintain transactional consistency.

In extreme situations, an intermediate link may fail. Any weak dependency must therefore be downgradable without affecting the main issuance flow.

The shortest path for issuing red envelopes on the wallet side relies only on the compute resources of the service instance and on MySQL storage to complete cash red envelope account entry.

Strong and weak dependencies of the red envelope link:

| psm | Dependent service | Strong dependency? | Downgrade plan | Downgrade impact |
| --- | --- | --- | --- | --- |
| Asset center | tcc | Yes | Downgrade to reading the local cache | None |
| | bytekv | No | Active downgrade switch: skip bytekv and rely on downstream idempotency | None |
| Fund transaction layer | Distributed lock (Redis) | No | Passive downgrade: if the call fails, skip it | Basically none |
| | token Redis | No | Active downgrade switch: do not call Redis | Users notice delayed account entry; many customer complaints |
| | MySQL | Yes | If the master fails, contact the DBA to switch masters | Red envelopes are unavailable during the failure |
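
The handling of strong and weak dependencies listed above can be sketched as follows. This is a hedged illustration only: the Dependency type, the Issue function, and the ok/failing helpers are hypothetical names invented for the example, and real downstream calls are replaced by stub functions.

```go
package main

import (
	"errors"
	"fmt"
)

// Dependency is a hypothetical description of one call on the issuance path.
type Dependency struct {
	Name     string
	Strong   bool         // a failing strong dependency fails the whole request
	Disabled bool         // active downgrade switch (honored for weak dependencies only)
	Call     func() error // stub for the real downstream call
}

func ok() error      { return nil }
func failing() error { return errors.New("timeout") }

// Issue walks the dependencies in order. It returns the names of the weak
// dependencies that were skipped, or an error if a strong dependency failed.
func Issue(deps []Dependency) ([]string, error) {
	var skipped []string
	for _, d := range deps {
		if !d.Strong && d.Disabled {
			skipped = append(skipped, d.Name) // active downgrade: do not call at all
			continue
		}
		if err := d.Call(); err != nil {
			if d.Strong {
				return skipped, fmt.Errorf("%s failed: %w", d.Name, err)
			}
			skipped = append(skipped, d.Name) // passive downgrade: call failed, skip
		}
	}
	return skipped, nil
}

func main() {
	deps := []Dependency{
		{Name: "token Redis", Disabled: true, Call: ok},
		{Name: "distributed lock Redis", Call: failing},
		{Name: "MySQL", Strong: true, Call: ok},
	}
	skipped, err := Issue(deps)
	fmt.Println(skipped, err) // [token Redis distributed lock Redis] <nil>
}
```

Both weak dependencies are skipped and the request still succeeds; only a failure of the strong dependency (MySQL in this sketch) aborts the issuance.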

4.4 Difficulty 4: Budget control for high-volume coupon issuance

In scenarios where large volumes of coupons are issued in bursts, the wallet side works with the algorithm strategy to control coupon inventory and prevent over-issuance.

Implementation:

(1) The wallet asset center maintains the consumed amount and the total issuance amount for each coupon template ID.

(2) Before each coupon is issued, the consumption and total inventory of the coupon template ID are read. A threshold is also set: if the remaining inventory of the coupon is less than 10%, the coupon is no longer issued (a fallback coupon or a blessing message is given instead).

(3) During issuance, the consumption of each coupon template ID is accumulated (atomically, using the Redis incr command) and then compared with the total activity inventory. If the consumption exceeds the total inventory, the request is rejected to prevent over-issuance. This is the last line of defense.
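
Steps (2) and (3) can be sketched as below. This is a minimal sketch under stated assumptions: an in-process atomic counter stands in for the Redis incr command, and the Budget type and TryIssue method are hypothetical names. Whether a rejected increment is rolled back is not stated in the text, so the sketch leaves the counter as it is.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Budget tracks issuance for one coupon template ID.
// The consumed counter is an in-memory stand-in for a Redis key updated with INCR.
type Budget struct {
	Total    int64 // total inventory for the coupon template
	consumed int64 // accessed only through sync/atomic
}

// TryIssue reports whether one more coupon may be issued.
func (b *Budget) TryIssue() bool {
	// Step (2): threshold check. Stop issuing once less than 10% remains.
	used := atomic.LoadInt64(&b.consumed)
	if (b.Total-used)*10 < b.Total {
		return false
	}
	// Step (3): atomically accumulate consumption, then compare with the
	// total inventory. This is the backstop against concurrent over-issuance.
	if atomic.AddInt64(&b.consumed, 1) > b.Total {
		return false
	}
	return true
}

func main() {
	b := &Budget{Total: 100}
	issued := 0
	for i := 0; i < 200; i++ {
		if b.TryIssue() {
			issued++
		}
	}
	fmt.Println(issued) // 91: issuance stops once less than 10% remains
}
```

With a total inventory of 100, the 91st coupon is issued while exactly 10% remains; after that only 9% is left and the threshold check rejects further requests.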

Specific flow chart:

Optimization direction:

(1) When Redis is used for counting under heavy traffic, a single key becomes a hot key; this has to be solved by splitting the key.

(2) In high-traffic scenarios, Redis operations may time out. When the timeout is returned upstream, the upstream retries the issuance, so more inventory is consumed than coupons are actually issued (under-issuance). For this Spring Festival event, the actual activity inventory was therefore set to the estimated inventory plus a percentage buffer, to mitigate the under-issuance caused by timeouts.

4.5 Difficulty 5: Stable reads and writes of hot keys under high QPS

The peak traffic is estimated at 1.8 million QPS for reads and 300,000 QPS for writes.

This is a typical scenario with very high traffic and hot keys, where update latency is not critical and strong data consistency is not required (the number only ever accumulates).

At the same time, disaster recovery and downgrade handling must be in place, and the error between the amount displayed during the event and the distribution value expected by the product must be less than 1%.

4.5.1 Option 1

The first idea that comes to mind is to use the Redis distributed cache to read and write a single key under high QPS. However, all reads and writes of a single key land on one instance, and stress testing showed that a single instance becomes the bottleneck at 30,000 QPS.

One optimization is therefore to split the counter into multiple keys and use the local cache as a fallback.

Write process:

The counter is split into 100 keys. Each time a red envelope is issued, the key is chosen by the request's actID % 100 and the number is accumulated with the incr command. Because idempotence cannot be guaranteed, no retry is made after a timeout.

Read process:

As in the write process, the local cache is read first.

If the local cache value is 0, the values of all the Redis keys are read and added together, and the sum is returned.
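
The key-splitting scheme can be sketched as follows. This is an illustration under assumptions: an in-memory array stands in for the 100 Redis keys (each slot would be one key updated with incr), the local cache is left out for brevity, and ShardedCounter with its methods is a hypothetical name.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const shardCount = 100 // the text splits the counter into 100 keys

// ShardedCounter stands in for 100 Redis keys.
// Each array slot plays the role of one key that would be updated with INCR.
type ShardedCounter struct {
	shards [shardCount]int64
}

// Add routes the write to one shard chosen by actID % 100.
func (c *ShardedCounter) Add(actID, delta int64) {
	idx := int(uint64(actID) % shardCount) // uint64 keeps the index non-negative
	atomic.AddInt64(&c.shards[idx], delta)
}

// Sum reads every shard and adds the values together. This is the
// read amplification the text mentions: one logical read costs 100 key reads.
func (c *ShardedCounter) Sum() int64 {
	var total int64
	for i := range c.shards {
		total += atomic.LoadInt64(&c.shards[i])
	}
	return total
}

func main() {
	var c ShardedCounter
	for actID := int64(0); actID < 250; actID++ {
		c.Add(actID, 2)
	}
	fmt.Println(c.Sum()) // 500
}
```

Writes spread evenly over the shards, but every read has to touch all of them, which is the cost discussed next.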

Problems:

(1) Splitting into 100 keys causes read amplification, requires more Redis resources, and makes storage relatively expensive.

Reads may also time out, so it cannot be guaranteed that all keys are read successfully in one pass, and a returned result may be smaller than the previous one.

(2) For disaster recovery, a backup Redis also requires more storage resources and additional storage cost.

4.5.2 Option 2

Design ideas:

Option 2 optimizes on the basis of Option 1.

In the write path, write requests are merged in the local cache and accumulated atomically.

The read path returns the value of the local cache, which reduces the use of additional storage resources.

Redis is used as centralized storage, so every instance eventually reads the same value.

Specific design:

When each docker instance starts, it launches scheduled tasks: one task reads Redis and one task writes Redis.

Read process:

  1. The local scheduled task runs every second. It reads the value of the single Redis key; if the value is greater than the local cache, the local cache is updated.

  2. The externally exposed sdk simply returns the value of the local cache.

  3. One caveat: there is no data during the first second after an instance starts, so reads block until data is available.

Write process:

  1. Because reads are served from the local cache (which does not expire), only concurrent writes need special handling.

  2. The local write cache variable is updated with Go's atomic.AddInt64, which provides atomic accumulation.

  3. Each time the scheduled task that updates Redis runs, it first copies the local write cache into an amount variable, subtracts that amount from the local write cache (so the same increments are not written twice), and finally applies the amount to the single Redis key with incr, so that the Redis key keeps accumulating.

  4. For disaster recovery, a backup Redis cluster is used and every write is double-written. If the primary cluster goes down, a configuration switch allows reads to be served from the backup Redis. Data consistency between the two Redis clusters is maintained by a scheduled task.
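
The read and write processes above can be sketched as follows. This is a minimal sketch, not the production sdk: Store is a hypothetical interface for the single Redis key, memStore is an in-memory stand-in so the example runs locally, and Flush and Refresh are the bodies that the per-second scheduled tasks (for example a time.Ticker loop) would call. Putting the amount back after a failed flush is an assumption of this sketch; the text does not say how flush failures are handled, and a timed-out incr that actually succeeded would then be counted twice.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Store is a hypothetical abstraction of the single Redis key.
type Store interface {
	IncrBy(delta int64) error // would map to Redis INCRBY
	Get() (int64, error)      // would map to Redis GET
}

// memStore is an in-memory stand-in for Redis so the sketch runs locally.
type memStore struct {
	mu    sync.Mutex
	value int64
}

func (s *memStore) IncrBy(delta int64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.value += delta
	return nil
}

func (s *memStore) Get() (int64, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.value, nil
}

// LocalCounter merges writes locally and serves reads from a local cache.
type LocalCounter struct {
	store   Store
	pending int64 // local write cache, not yet flushed to the store
	cached  int64 // local read cache
}

func NewLocalCounter(store Store) *LocalCounter { return &LocalCounter{store: store} }

// Add is the write path: atomic accumulation into the local write cache.
func (c *LocalCounter) Add(delta int64) { atomic.AddInt64(&c.pending, delta) }

// Value is the read path: it only returns the local cache.
func (c *LocalCounter) Value() int64 { return atomic.LoadInt64(&c.cached) }

// Flush is the write task. The swap takes the pending amount and resets the
// local write cache in one atomic step (copy, then subtract).
func (c *LocalCounter) Flush() error {
	amount := atomic.SwapInt64(&c.pending, 0)
	if amount == 0 {
		return nil
	}
	if err := c.store.IncrBy(amount); err != nil {
		atomic.AddInt64(&c.pending, amount) // assumption: retry on the next tick
		return err
	}
	return nil
}

// Refresh is the read task: update the local cache only if the store is ahead.
func (c *LocalCounter) Refresh() error {
	v, err := c.store.Get()
	if err != nil {
		return err
	}
	for {
		cur := atomic.LoadInt64(&c.cached)
		if v <= cur || atomic.CompareAndSwapInt64(&c.cached, cur, v) {
			return nil
		}
	}
}

func main() {
	store := &memStore{}
	a, b := NewLocalCounter(store), NewLocalCounter(store)
	a.Add(3)
	a.Add(4)
	b.Add(5)
	for _, c := range []*LocalCounter{a, b} {
		_ = c.Flush()
	}
	for _, c := range []*LocalCounter{a, b} {
		_ = c.Refresh()
	}
	fmt.Println(a.Value(), b.Value()) // 12 12
}
```

Each instance sends at most one write and one read to the store per tick, regardless of how many requests it served in between.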

The specific writing flow chart is as follows:

In this solution, the Redis traffic is directly proportional to the number of instances.

According to our investigation, the read side has 20,000 instances on the main site, and the write side has 8,000 instances in the asset center.

The actual QPS that Redis must support is therefore 28,000 / scheduled task interval (in seconds).

Stress testing verified that a single Redis instance can support 20,000 get and 8,000 incr operations per second on a single key.

The interval of the scheduled task is therefore set to 1s. If there are more instances, the interval can be lengthened.
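
The arithmetic behind this choice can be written down as a small helper. It assumes, as the text implies, that every instance sends exactly one request to Redis per scheduled task interval; the function name RedisQPS is hypothetical.

```go
package main

import "fmt"

// RedisQPS returns the request rate that reaches Redis when every instance
// sends one request per scheduled task interval.
func RedisQPS(instances int, intervalSeconds float64) float64 {
	return float64(instances) / intervalSeconds
}

func main() {
	readInstances, writeInstances := 20000, 8000
	fmt.Println(RedisQPS(readInstances+writeInstances, 1)) // 28000
	fmt.Println(RedisQPS(readInstances+writeInstances, 2)) // 14000
}
```

Doubling the interval halves the load on Redis, at the cost of the displayed value lagging behind by up to one more interval.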

4.5.3 Option comparison

| | Advantages | Disadvantages |
| --- | --- | --- |
| Option 1 | 1. Simple to implement, low implementation cost | 1. Wastes storage resources; 2. Disaster recovery is difficult; 3. The value is not guaranteed to keep accumulating |
| Option 2 | 1. Saves resources; 2. Disaster recovery is relatively simple and also saves resource cost | 1. Slightly more complex to implement; concurrent atomic accumulation must be handled |

Conclusion:

Considering the implementation effect, resource cost, and disaster recovery, we finally chose Option 2 to go online.

4.6 Difficulty 6: Fund security in high-traffic scenarios

During this Spring Festival event, the wallet did three things to keep funds safe when issuing cash red envelopes with high traffic and a large budget:

  1. Interception based on overall budget control for cash red envelope issuance
  2. Interception based on an upper limit for the amount of a single cash red envelope
  3. Fund reconciliation for high-traffic red envelope issuance
  • Hour-level reconciliation: h+1 hourly reconciliation is supported for red envelope rain, card collection, and firework red envelope issuance, with a fallback h+2 reconciliation for some scenarios.
  • Near-real-time reconciliation: for red envelope rain records that have already been entered, the wallet asset center and the activity side are reconciled in near real time.

Schematic diagram of multi-dimensional verification:

Near-real-time reconciliation flow chart:

Note:

Monitoring and alerting on the near-real-time reconciliation detect abnormal account entries promptly; when an alert fires, an emergency plan is in place to handle it.

5. Abstracting common patterns

Having been through the design and implementation of the very high-traffic Spring Festival activities, I have some takeaways to share.

5.1 Disaster recovery and downgrading

In high-traffic scenarios, disaster recovery must be done well to guarantee the final online result of the event.

Common industry practices serve as a reference: downgrading, rate limiting, circuit breaking, resource isolation, and estimating storage usage from the expected number of participants and the expected effect of the activity.

5.1.1 Rate limiting

(1) For rate limiting, nginx inbound rate limiting at the api layer, distributed inbound rate limiting, and distributed outbound rate limiting were applied.

These rate limiters are company-level public middleware at ByteDance and have been proven under heavy traffic.

(2) First, a single-instance stress test was carried out. Capacity was then expanded based on the traffic a single instance can carry and the traffic estimated for the service during the Spring Festival event.

Taking the downstream situation into account, tlb inbound traffic, inbound rate limits, and outbound rate limits were each configured in detail.

Rate limiting goal:

Keep the service itself stable, prevent traffic beyond expectations from overwhelming it, prevent an avalanche effect, and protect the core business and the core user experience.

Simple cluster rate limiting is rate limiting at the instance level:

rate limit QPS per instance = total configured rate limit QPS / number of instances

With many machines and low QPS, this can be inaccurate, so the configured value must be adjusted promptly based on actual stress tests.

Distributed inbound and outbound rate limiting each support both high and low QPS; the difference lies only in how the SDK is used and in its functions.

Generally, low QPS requires high precision; the Redis counting method is used, and the user provides their own Redis cluster.

High QPS tolerates lower precision and degenerates into single-instance rate limiting at total QPS / number of tce instances.
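
The instance-level rule above can be illustrated with a small sketch. It is not ByteDance's middleware: PerInstanceLimit and WindowLimiter are hypothetical names, and a simple fixed one-second window is swapped in as the limiting algorithm because the text does not say which one the real limiter uses. The current time is passed in as Unix seconds to keep the example deterministic.

```go
package main

import (
	"fmt"
	"sync"
)

// PerInstanceLimit splits a cluster-wide QPS limit evenly across instances.
// Integer division shows why many machines with low QPS become inaccurate.
func PerInstanceLimit(totalQPS, instances int) int {
	if instances <= 0 {
		return 0
	}
	return totalQPS / instances
}

// WindowLimiter is a fixed one-second window limiter for a single instance.
type WindowLimiter struct {
	mu     sync.Mutex
	limit  int   // requests allowed per second
	window int64 // Unix second of the current window
	count  int   // requests admitted in the current window
}

func NewWindowLimiter(limit int) *WindowLimiter {
	return &WindowLimiter{limit: limit}
}

// Allow reports whether a request arriving at nowUnix (Unix seconds) passes.
func (l *WindowLimiter) Allow(nowUnix int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if nowUnix != l.window { // a new second starts a new window
		l.window = nowUnix
		l.count = 0
	}
	if l.count >= l.limit {
		return false
	}
	l.count++
	return true
}

func main() {
	fmt.Println(PerInstanceLimit(100000, 500)) // 200
	fmt.Println(PerInstanceLimit(100, 300))    // 0: low QPS spread over many machines
	l := NewWindowLimiter(2)
	fmt.Println(l.Allow(1), l.Allow(1), l.Allow(1), l.Allow(2)) // true true false true
}
```

The second call shows the inaccuracy mentioned above: a 100 QPS limit divided across 300 instances rounds down to 0 per instance, which is why low-QPS limits use a shared Redis counter instead.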

5.1.2 Downgrading

In high-traffic scenarios, every core function must have a corresponding downgrade plan so that the core link stays stable in emergencies.

(1) Thorough operational plans were prepared for Spring Festival reward account entry and for the event wallet page. There are 26 downgrade switches in total, so that non-core functions can be sacrificed at critical moments and a single point of failure does not affect the core link.

(2) Taking the cash red envelope link as an example, the ultimate downgrade plan on the wallet side depends only on docker and MySQL; every other dependency can be downgraded. If the MySQL master fails, an emergency master switch can be requested. Although these measures were not needed in the end, they must be designed well in advance to keep the activity safe.

5.1.3 Resource isolation

(1) Improve development efficiency without reinventing the wheel.

Because the wallet asset center already supports Douyin's day-to-day asset issuance, this Spring Festival event reused the existing interfaces and code paths to support reward issuance.

(2) At the same time, cluster isolation was implemented at the service level for this Spring Festival event.

A dedicated activity cluster was created and the underlying storage resources were isolated, so that activity traffic and regular traffic do not affect each other.

5.1.4 Storage estimation

(1) It is not enough to verify that Redis or MySQL can withstand the corresponding traffic; it is also necessary to estimate whether storage capacity is sufficient, based on the expected participation and the data actually issued.

(2) ByteDance's Redis component can be scaled vertically (more storage per instance, up to 10G) or horizontally (up to 500 instances per computer room). Because Redis is synchronized across three computer rooms, only the storage limit of one computer room needs to be considered when calculating storage.

Sufficient buffer must be reserved, because horizontal scaling is a very slow process. If storage runs short in an emergency, the only option is to remove the storage dependency through a configuration switch, which must be designed in advance.

5.1.5 Stress testing

During this Spring Festival event, full-link stress tests were run on wallet reward account entry and on the event wallet page. Some lessons learned:

  1. Before the stress test, build a monitoring dashboard covering the entire link under test, so that problems can be found quickly and conveniently during the test.
  2. For the MySQL database, run a low-traffic stress test before high-traffic official activities such as red envelope rain start. This warms up the database and establishes connections ahead of the traffic peak, which avoids the cost of opening a large number of connections during the official event and keeps the database layer of the red envelope link stable.
  3. During the stress test, the stress-test mark must be propagated so that the full link can identify stress-test traffic and apply special handling, without interfering with normal online business.
  4. Apart from that, stress-test traffic gets no special treatment: its processing flow stays the same as that of online traffic.
  5. During the stress test, verify whether compute and storage resources can withstand the estimated traffic.
  • Plan the stress test: set a reasonable initial traffic level based on historical experience, increase the traffic gradually, and watch the stress-test metrics in real time.
  • Isolate stress-test data from online data in storage. For MySQL and Bytekv, create stress-test tables. For Redis and Abase, use stress-test keys: the online key with a stress-test prefix.
  • Clean up stress-test data promptly. For Redis and Abase, a short expiration time is the most convenient mechanism. If the expiration time was not set, a script can identify keys by their stress-test prefix and delete them.
  6. After the stress test, also check whether the storage resource metrics meet expectations.
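
Propagating the stress-test mark and prefixing storage keys, as described in the list above, might look like the following sketch. The context key, the prefix value "stress_test:", and all function names are assumptions made for illustration; the text does not specify how the mark is carried or what the prefix is.

```go
package main

import (
	"context"
	"fmt"
)

// stressMarkKey is the private context key that carries the stress-test mark.
type stressMarkKey struct{}

// stressPrefix is a hypothetical prefix; the real prefix is not given in the text.
const stressPrefix = "stress_test:"

// NewRequestContext builds a request context that carries the stress-test mark,
// for example after reading it from an incoming request header.
func NewRequestContext(stress bool) context.Context {
	return context.WithValue(context.Background(), stressMarkKey{}, stress)
}

// IsStressTest reports whether the request is stress-test traffic.
func IsStressTest(ctx context.Context) bool {
	v, _ := ctx.Value(stressMarkKey{}).(bool)
	return v
}

// StorageKey returns the key to use in Redis-like storage: the online key for
// normal traffic, and the online key plus a prefix for stress-test traffic.
func StorageKey(ctx context.Context, key string) string {
	if IsStressTest(ctx) {
		return stressPrefix + key
	}
	return key
}

func main() {
	fmt.Println(StorageKey(NewRequestContext(true), "rain:total"))  // stress_test:rain:total
	fmt.Println(StorageKey(NewRequestContext(false), "rain:total")) // rain:total
}
```

Because the business logic calls StorageKey on both paths, stress-test traffic follows the same processing flow as online traffic while its data stays isolated and easy to delete by prefix.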

5.2 Thoughts on microservices

In day-to-day technical design, everyone follows the principles and conventions of microservice design: modules are split by system responsibility and core data model, so that they can be developed and iterated efficiently without affecting one another.

However, microservices also have drawbacks. In scenarios with very high traffic and complex functions, a request passes through many hops, which consumes a great deal of compute.

For this Spring Festival event, the asset center aggregated the microservice link by providing sdk packages instead of rpc to expose basic capabilities such as querying balances, checking whether a user has received a reward, and forcing account entry. The peak access traffic is in the tens of millions, and this saved tens of thousands of CPU cores compared with a microservice architecture.

6. Future evolution of the system

(1) Sort out upstream and downstream needs and pain points, optimize the design and implementation of the asset center, improve its basic capabilities and service architecture, and provide one-stop services so that activities integrating with it can focus on developing their own business logic.

(2) Strengthen real-time and offline data dashboards so that reward issuance data is displayed more clearly and accurately.

(3) Improve configurability and documentation, to reduce the internal cost of onboarding activities and to improve integration efficiency for external event business teams.

Source of the above: ByteDance technical team, "Design and implementation of the entry and display of the large flow reward system for the Spring Festival wallet"


Origin blog.csdn.net/crazymakercircle/article/details/128697096