Requirements Analysis Case Study: A Birthday Coupon Delivery Incident and Requirement Optimization

This article describes a birthday coupon requirement, the production incident caused by not thinking it through, and how the problem was ultimately solved by changing the requirement rather than by a purely technical optimization.
The incident occurred around 2019 and I no longer remember the exact figures, so the numbers are for reference only.

1. Background of the incident

The system was an Internet catering platform with tens of millions of registered users.
Around August or September, in order to reactivate existing customers and improve user retention, the product team put forward a requirement: issue birthday coupons to registered users. More specifically,
give each user a birthday coupon 7 days before their birthday and send a WeChat notification.
The requirement was not complicated. Development and testing took about 2 weeks, and after launch it ran stably for 2 or 3 months. Users received their coupons and redeemed them without problems.
The code logic at the time was roughly as follows:

  • Create a new job that starts at 0:00 every day
  • Use SQL to find all users whose birthday is 7 days away. The SQL pseudocode is roughly as follows:
    select id, wx_openid, username from users where birthday=now+7
  • Traverse these users and, for each one, execute in sequence:
    • Check that the user has not already been given a birthday coupon;
    • Generate a birthday coupon and insert it into the user coupon table;
    • Send a WeChat push message.
  • End, and wait for the next day's run
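The job described above can be sketched as follows. This is a minimal illustration, not the original code: it assumes Python with SQLite standing in for the production database, birthdays stored as ISO dates, and hypothetical names (`create_schema`, `run_birthday_job`, `send_wechat_push`, and the `kind`, `birthday_year`, and `expires_on` columns). It deliberately keeps the serial check, insert, and push per user that later caused trouble.

```python
import sqlite3
from datetime import date, timedelta


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the two tables the sketch needs (column names are assumptions)."""
    conn.executescript(
        """
        CREATE TABLE users (
            id INTEGER PRIMARY KEY, wx_openid TEXT, username TEXT,
            birthday TEXT  -- ISO date, e.g. '1990-01-01'
        );
        CREATE TABLE user_coupons (
            id INTEGER PRIMARY KEY, user_id INTEGER, kind TEXT,
            birthday_year INTEGER, expires_on TEXT
        );
        """
    )


def send_wechat_push(openid: str, text: str) -> None:
    """Hypothetical stand-in for the WeChat push (a slow network call in production)."""
    print(f"push to {openid}: {text}")


def run_birthday_job(conn: sqlite3.Connection, today: date, send=send_wechat_push) -> int:
    """Issue one coupon to each user whose birthday is 7 days after `today`."""
    upcoming = today + timedelta(days=7)
    users = conn.execute(
        "SELECT id, wx_openid, username FROM users "
        "WHERE strftime('%m-%d', birthday) = ? ORDER BY id",
        (upcoming.strftime("%m-%d"),),
    ).fetchall()
    expires_on = (today + timedelta(days=7)).isoformat()  # coupon is valid for 7 days
    issued = 0
    for user_id, openid, username in users:
        # Step 1: skip users who already received this year's birthday coupon.
        already = conn.execute(
            "SELECT 1 FROM user_coupons "
            "WHERE user_id = ? AND kind = 'birthday' AND birthday_year = ?",
            (user_id, upcoming.year),
        ).fetchone()
        if already:
            continue
        # Step 2: one INSERT and one commit per user (the serial pattern).
        conn.execute(
            "INSERT INTO user_coupons (user_id, kind, birthday_year, expires_on) "
            "VALUES (?, 'birthday', ?, ?)",
            (user_id, upcoming.year, expires_on),
        )
        conn.commit()
        # Step 3: synchronous push; the loop waits for every call to return.
        send(openid, f"Happy birthday, {username}! A coupon is waiting for you.")
        issued += 1
    return issued
```

With a few thousand matching users per day this loop is harmless; its cost grows linearly with the number of users who share the target date.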

2. How the incident unfolded

During breakfast hours on December 25, the payment module of the catering system raised an alarm: a large number of payments were timing out.
A preliminary investigation showed that the coupon module's API was responding very slowly, taking seconds or even tens of seconds per request;
further investigation showed a large number of deadlocks and slow queries in the coupon module's database:

  • The deadlocks were conflicts between the transactions inserting birthday coupons and the coupon query and redemption transactions;
  • The slow queries arose because the coupon table had grown by tens of millions of rows of birthday coupon data within a few hours.

The immediate fix was to stop the coupon delivery job and add the missing indexes. The detailed troubleshooting and handling process is not covered here.

Because the payment module was failing and a large number of users complained on Christmas Day, this was classed as a major production incident.
The post-mortem identified the following causes:

  • At registration the birthday defaulted to January 1st. A large number of users kept the default without changing it, so more than 70% of users (tens of millions) had a birthday of January 1st;
  • Coupons were issued serially, with the WeChat push sent synchronously for each one, so the job was extremely slow. It was still running during the day, and once users started arriving their requests collided with it and caused a large number of deadlocks;
  • The user coupon table suddenly grew from millions of rows to tens of millions. Queries that had been fast became slow for lack of effective indexes, causing an avalanche effect.
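The first cause, a default value piling users onto one date, is the kind of skew a simple distribution check can reveal before a job like this ships. The sketch below is one way to run such a check, assuming Python with SQLite, a `users` table with an ISO-date `birthday` column, and a hypothetical helper name `birthday_hotspots`.

```python
import sqlite3


def birthday_hotspots(conn: sqlite3.Connection, limit: int = 3):
    """Return (month-day, user_count, share_of_all_users), most crowded date first."""
    total = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    if total == 0:
        return []
    rows = conn.execute(
        "SELECT strftime('%m-%d', birthday) AS md, COUNT(*) AS n "
        "FROM users GROUP BY md ORDER BY n DESC, md LIMIT ?",
        (limit,),
    ).fetchall()
    return [(md, n, n / total) for md, n in rows]
```

A single date holding a share far above 1/365 of all users, as January 1st did here, signals that a per-date job will have one extremely heavy day.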

3. Technical optimization plan

Based on the causes found in the post-mortem, several technical solutions were discussed internally:

  • Issue coupons in batches instead of one INSERT at a time;
  • Send WeChat pushes asynchronously, through in-memory events or message middleware;
  • Review all SQL in the coupon business and clean up slow queries;
  • Regularly back up and purge expired and unused coupon data;
  • Shard the user coupon table, for example into 100 tables keyed by user ID modulo 100.
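Three of these ideas are easy to illustrate in a few lines. The sketch below is a rough illustration under assumptions, not the team's implementation: it uses Python with SQLite, an in-process `queue.Queue` as the "in-memory event" channel, and hypothetical names (`issue_in_batches`, `start_push_worker`, `shard_table`, a batch size of 1000, and the `user_coupons_NN` table naming).

```python
import queue
import sqlite3
import threading
from datetime import date, timedelta

BATCH_SIZE = 1000  # assumed chunk size; tune against the real database


def issue_in_batches(conn, user_ids, issued_on: date, batch_size: int = BATCH_SIZE) -> int:
    """Insert coupons chunk by chunk, one short transaction per chunk."""
    expires_on = (issued_on + timedelta(days=7)).isoformat()
    batches = 0
    for start in range(0, len(user_ids), batch_size):
        chunk = user_ids[start:start + batch_size]
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO user_coupons (user_id, kind, expires_on) "
                "VALUES (?, 'birthday', ?)",
                [(uid, expires_on) for uid in chunk],
            )
        batches += 1
    return batches


def start_push_worker(push_queue: queue.Queue, send) -> threading.Thread:
    """Drain push requests in a background thread so issuing never waits on the network."""
    def worker():
        while True:
            item = push_queue.get()
            try:
                if item is None:  # sentinel: stop the worker
                    return
                send(item)
            finally:
                push_queue.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def shard_table(user_id: int, shards: int = 100) -> str:
    """Route a user to one of `shards` coupon tables by user ID modulo."""
    return f"user_coupons_{user_id % shards:02d}"
```

Short per-chunk transactions hold locks for less time than a long-running serial job, and the queue decouples the slow push from the database work. In production the queue would more likely be message middleware, so that pending pushes survive a process restart.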

Of these, the first three were fine and were scheduled for implementation;
the fourth might affect the business, because users would no longer be able to find their historical data;
the fifth, sharding, was too much work and would take a relatively long time, so it was put on the list to be scheduled later.

4. Requirement optimization plan and implementation

Reflecting on it later, a few key points about this coupon requirement stood out:

  • Of the tens of millions of users, 80% are inactive, and their birthday coupons will almost never be used;
  • A birthday coupon is valid for 7 days, after which it expires and can no longer be used.

After repeated internal discussion, we arrived at a product-level solution that required some changes to the requirement. The general idea: coupons
issued to users are stored separately, users are prompted to claim them when they enter the system, and unclaimed coupons simply expire. This way the coupon table no longer fills up with data that is never used.
We then went to the product manager to confirm the two points where the change affected the requirement:

  • Users now have to click to claim the coupon, which costs a little in user experience;
  • Users cannot see unclaimed coupons, whereas previously they could see them in the expired state;
    Note: making them visible is also possible. It takes some development work but poses no performance problem.

After evaluation, the experience cost was judged acceptable and users did not need to see unclaimed coupons, so we decided to implement this solution. The
final requirement and technical implementation steps were as follows:

  • Add a new database table, the pending coupon table, with the same fields as the existing user coupon table;
  • When issuing coupons to users, insert them into the new pending coupon table and send the WeChat push;
  • When the user taps the push message or logs in to the system and has a coupon waiting, show a pop-up asking whether to claim it; after the
    user taps to claim, insert the coupon into the user coupon table and record a claim log entry;
  • Periodically delete expired and already-claimed rows from the pending coupon table (a backup can be considered, depending on business needs).
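The claim flow in these steps can be sketched as below. It is a minimal illustration under assumptions: Python with SQLite, hypothetical table and function names (`pending_coupons`, `claim_log`, `issue_pending`, `claim`, `purge_pending`), and a `claimed` flag added to the pending table so that claimed rows can be purged later. The WeChat push and the pop-up are left out.

```python
import sqlite3
from datetime import date, timedelta


def create_tables(conn: sqlite3.Connection) -> None:
    conn.executescript(
        """
        CREATE TABLE pending_coupons (
            id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, kind TEXT NOT NULL,
            expires_on TEXT NOT NULL, claimed INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE user_coupons (
            id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, kind TEXT NOT NULL,
            expires_on TEXT NOT NULL
        );
        CREATE TABLE claim_log (pending_id INTEGER, user_id INTEGER, claimed_on TEXT);
        """
    )


def issue_pending(conn, user_id: int, kind: str, issued_on: date, valid_days: int = 7) -> int:
    """Issue a coupon into the pending table only; user_coupons is untouched."""
    expires_on = (issued_on + timedelta(days=valid_days)).isoformat()
    with conn:
        cur = conn.execute(
            "INSERT INTO pending_coupons (user_id, kind, expires_on) VALUES (?, ?, ?)",
            (user_id, kind, expires_on),
        )
    return cur.lastrowid


def claim(conn, pending_id: int, user_id: int, today: date) -> bool:
    """Move one unexpired, unclaimed coupon into user_coupons; False if not claimable."""
    with conn:  # flag update, coupon insert, and log insert commit together
        cur = conn.execute(
            "UPDATE pending_coupons SET claimed = 1 "
            "WHERE id = ? AND user_id = ? AND claimed = 0 AND expires_on >= ?",
            (pending_id, user_id, today.isoformat()),
        )
        if cur.rowcount != 1:
            return False
        kind, expires_on = conn.execute(
            "SELECT kind, expires_on FROM pending_coupons WHERE id = ?", (pending_id,)
        ).fetchone()
        conn.execute(
            "INSERT INTO user_coupons (user_id, kind, expires_on) VALUES (?, ?, ?)",
            (user_id, kind, expires_on),
        )
        conn.execute(
            "INSERT INTO claim_log (pending_id, user_id, claimed_on) VALUES (?, ?, ?)",
            (pending_id, user_id, today.isoformat()),
        )
    return True


def purge_pending(conn, today: date) -> int:
    """Periodic cleanup: delete claimed rows and rows past their expiry date."""
    with conn:
        cur = conn.execute(
            "DELETE FROM pending_coupons WHERE claimed = 1 OR expires_on < ?",
            (today.isoformat(),),
        )
    return cur.rowcount
```

Only coupons that users actually claim ever reach the user coupon table, so the hot table used by query and redemption grows with active users rather than with registered users.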

This change is useful not only for birthday coupons but also for other promotional campaigns.
After it went live, data growth in the user coupon table fell by more than 80%, and sharding the table no longer needed to be considered.

5. Reflection

This requirement optimization leads to a few broader points:

  • Engineers must understand the business deeply and become experts in the business domain in order to build good products;
  • Accumulated experience guides later product requirements and problems, and can serve as a direct or indirect reference for solutions;
  • There is no silver bullet among technical solutions. In this case, database and table sharding was not the best answer;
    up to the time I left the job, the user coupon table had never been sharded and never had a performance problem;
  • There is no best solution, only a relatively suitable one.

6. Postscript

When we came up with this requirement optimization, I was quite pleased with it at first.
Later, while using apps such as Meituan, Taobao, and McDonald's, I suddenly realized that they had implemented this approach long ago. For example, during Double 11
Taobao shows pop-ups inviting you to claim coupons, and McDonald's shows a pop-up every week for you to claim coupons.

In the past, when I used these apps, I just tapped to claim the coupons without thinking, and occasionally complained about why I had to claim them at all.

It proved the saying true:
Sometimes your epiphany is just someone else's basic skill.

When a problem arises, first look at how other products solve it, and stand on the shoulders of giants.

Origin blog.csdn.net/youbl/article/details/131355430