Social software red envelope technology decryption (11): the most complete decryption of WeChat red envelope random algorithm (including code implementation)

While writing this article I drew on material available online; see the "Reference materials" section for details. Thanks to the original authors for sharing.

1 Introduction

The previous 10 articles in this series did not cover the implementation of any specific red envelope algorithm, mainly for two reasons.

First, the red envelope features of social/IM products are highly homogeneous. How "playable" the red envelope algorithm is counts as a core competitive advantage, a way to differentiate an otherwise identical feature, so it is not disclosed casually.

Second, all kinds of red envelope cheating plug-ins are still on the market. Once these algorithms are made public, plug-in developers are very likely to exploit them.

As a result, if you want to implement a red envelope feature in a social/IM product, you basically have to work out the algorithm yourself; it is hard to find a big company's algorithm to apply directly.

In keeping with the Instant Messaging Network's aim of spreading IM knowledge, I collected and consulted a large amount of online material, consolidated the more reliable sources, and wrote this article. Based on the limited information available, it shares some technical points of the WeChat red envelope random algorithm and presents two reasonably reliable implementation approaches (with runnable code), in the hope of inspiring your own red envelope algorithm development.

Statement: The information in this article is compiled from the Internet and is for study and research purposes only. If anything is wrong, please notify the author.


The original link for this article is: http://www.52im.net/thread-3125-1-1.html

2. Series of articles

3. Summary of the main points of the WeChat red envelope algorithm

This is the only discussion I could find of the technical points of the WeChat red envelope algorithm in which members of the WeChat team took part. It was shared in 2015, not long after WeChat red envelopes became popular. The WeChat engineers probably had few concerns about sharing these details at the time, so they disclosed a limited amount of information. Such material is rare, so I have reorganized it here for reference. The original text follows.

Source: a technical discussion in an InfoQ architecture group, compiled by Zhu Yuhua (personal blog: zhuyuhua.com, currently unavailable).

Background: a friend asked in Moments about the architecture of WeChat red envelopes. With WeChat team members joining the discussion, I (that is, Zhu Yuhua) compiled its technical points into the question-and-answer content below.

3.1 Technical points of algorithm realization

Question: When is the amount of a WeChat red envelope calculated?

Answer: The amount is calculated in real time when the envelope is opened; it is not pre-allocated. The calculation is done purely in memory and needs no storage for pre-computed amounts.

Why calculate in real time? Real-time calculation is more efficient, whereas pre-computation is less efficient and also takes extra storage. Because a red envelope occupies only one record and is valid for only a few days, it does not need much space. Even under heavy load, scaling out horizontally with more machines is enough.

Question: Regarding the real-time behavior, why can you apparently grab a red envelope and then find nothing when you open it?

Answer: With the 2014 red envelopes, you knew the amount as soon as you tapped. There were two operations: first the amount was grabbed, then the money was transferred.

With the 2015 red envelopes, grabbing and opening are separate and take two taps. So you may grab a red envelope and then, after tapping to open it, be told that all of them have already been claimed. Reaching the first page does not mean you got one; it only means that some envelopes were still left at that moment.

Question: Regarding the distribution algorithm, how is the amount in each red envelope calculated? Why do the amounts differ so much?

Answer: It is random, between 0.01 and twice the remaining average. For example, if 100 yuan is sent in 10 red envelopes, the average is 10 yuan each, so the amount drawn fluctuates between 0.01 yuan and 20 yuan.

Once the first 3 red envelopes have been claimed for a total of 40 yuan, 60 yuan and 7 red envelopes remain, so the upper bound becomes (60 / 7 * 2) = 17.14 and the next amount falls between 0.01 yuan and 17.14 yuan.

Note: each time an envelope is claimed, the same rule is applied again to what remains (Tim also thought this algorithm was too complicated and did not know what considerations it was based on).

Calculated this way, the amounts could add up to more than the original total. So when there is not enough left near the end, a different rule applies: make sure every remaining user can still get at least 0.01 yuan.

If the earlier recipients are unlucky, more balance is left and the later amounts are larger, so the actual probability is the same.
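The rule described in this answer can be sketched in Java using integer cents, which avoids floating-point rounding. This is only an illustration of the description above, not WeChat's code: the class and method names are made up, and the clamp that keeps 1 cent back for every later recipient is an assumption about how the "at least 0.01" rule might be enforced.

```java
import java.util.Random;

/** Sketch of the "0.01 to twice the remaining average" rule, in integer cents. */
class AmountRule {

    /** Upper bound for the next amount, in cents: twice the remaining average, rounded down. */
    static long upperBoundCents(long remainCents, int remainCount) {
        return 2 * remainCents / remainCount;
    }

    /**
     * Draws the next amount in cents. Assumes remainCents >= remainCount.
     * The clamp that keeps 1 cent back for every later recipient is an
     * assumption about how the "at least 0.01" rule is enforced.
     */
    static long drawCents(long remainCents, int remainCount, Random rng) {
        if (remainCount == 1) {
            return remainCents; // the last recipient takes whatever is left
        }
        long reserved = remainCount - 1; // 1 cent for each later recipient
        long max = Math.min(upperBoundCents(remainCents, remainCount), remainCents - reserved);
        return 1 + (long) (rng.nextDouble() * max); // uniform over 1..max cents
    }

    public static void main(String[] args) {
        System.out.println(upperBoundCents(10_000, 10)); // 100 yuan, 10 envelopes: prints 2000
        System.out.println(upperBoundCents(6_000, 7));   // 60 yuan, 7 envelopes: prints 1714
        Random rng = new Random();
        long remain = 10_000;
        for (int left = 10; left >= 1; left--) {
            long amount = drawCents(remain, left, rng);
            remain -= amount;
            System.out.printf("%.2f ", amount / 100.0);
        }
        System.out.println();
    }
}
```

The two printed bounds match the 20 yuan and 17.14 yuan figures in the answer. Working in cents also means the amounts always add up to exactly the original total.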

Question: How are red envelopes designed?

Answer: WeChat pulls the amount data from Tenpay, generates the count, red envelope type, and amount, and puts them in a Redis cluster. The app puts requests carrying the red envelope ID into a request queue; if the number of requests exceeds the number of red envelopes, it returns immediately. When the red envelope logic succeeds and a request obtains a token, Tenpay makes a consistency call. As with Bitcoin, both sides save transaction records, and after the transaction the records are handed to a third-party service for auditing. If an inconsistency appears during the transaction, it is forcibly rolled back.

Question: Concurrency: how is it determined that the red envelopes have all been grabbed?

Answer: The cache absorbs invalid requests and filters them out, so the volume that actually reaches the backend is small. The cache records the number of red envelopes and decrements it with an atomic operation; when it reaches 0, all of them have been grabbed. Tenpay prepared for 200,000 transactions per second, but the actual volume was less than 80,000 per second.
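A counter that is decremented atomically and stops at zero can be sketched in a single JVM with `AtomicInteger`. This is a local stand-in for the cache described above, not its real API; the class name `GrabCounter` and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

/** Sketch of the cache-side counter: only `count` callers get past the filter. */
class GrabCounter {
    private final AtomicInteger remaining;

    GrabCounter(int count) {
        remaining = new AtomicInteger(count);
    }

    /** Returns true if the caller claimed one of the remaining envelopes. */
    boolean tryGrab() {
        // Decrement only while the count is positive; never go below zero.
        int before = remaining.getAndUpdate(n -> n > 0 ? n - 1 : n);
        return before > 0;
    }

    int remaining() {
        return remaining.get();
    }

    public static void main(String[] args) {
        GrabCounter counter = new GrabCounter(10);
        long winners = IntStream.range(0, 1000).parallel()
                .filter(i -> counter.tryGrab())
                .count();
        System.out.println(winners + " of 1000 requests got through"); // 10 of 1000
    }
}
```

Every request after the tenth is rejected by a cheap in-memory check, which is the filtering effect the answer describes.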

Question: How are 80,000 writes per second sustained?

Answer: Multi-master sharding and scaling out horizontally with more machines.

Q: What is the data capacity?

Answer: A red envelope only occupies one record, and the validity period is only a few days, so it does not require much space.

Question: Doesn't querying how a red envelope was distributed put heavy load on the system?

Answer: The list of people who grabbed the red envelope and the red envelope itself are in the same cache record, so there is not much query load.

Question: Is there one queue per red envelope?

Answer: There is no queue. Each red envelope is one piece of data, and that data has a counter field.

Question: Has it been shown from the data that every red envelope has an equal probability?

Answer: It is not absolutely equal; it is just a simple rule-of-thumb algorithm.

Question: Can there be two "best luck" recipients?

Answer: Two amounts can be the same, but there is only one "best luck" recipient: the one who grabbed first.
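The tie-breaking rule in this answer comes down to a strict comparison while scanning the amounts in grab order. A minimal sketch follows; the class and method names are hypothetical, and amounts are in cents.

```java
/** Sketch: picks the "best luck" recipient, with ties going to the earliest grab. */
class BestLuck {

    /** Returns the index of the largest amount, or -1 if nothing was claimed. */
    static int bestLuckIndex(long[] centsInGrabOrder) {
        int best = -1;
        for (int i = 0; i < centsInGrabOrder.length; i++) {
            // Strictly greater: an equal later amount does not displace an earlier one.
            if (best < 0 || centsInGrabOrder[i] > centsInGrabOrder[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        long[] amounts = {120, 305, 305, 70};
        System.out.println(bestLuckIndex(amounts)); // 1: the first of the two 3.05 yuan amounts
    }
}
```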

Question: Is the data updated every time a red envelope is claimed?

Answer: Every time a red envelope is grabbed, the remaining amount and the remaining count are updated with CAS (compare-and-swap).

Question: How are red envelopes recorded in the database and credited to the account?

Answer: The database accumulates the count and amount already claimed and inserts a claim record. Crediting the account is an asynchronous background operation.

Question: What if crediting goes wrong? For example, the red envelope count has reached zero but some balance is left?

Answer: There is a "take all" operation at the end. Reconciliation provides a further safeguard.

Question: Since the count is decremented atomically during the grab, shouldn't it be impossible to grab an envelope and then find nothing on opening it?

Answer: The atomic decrement here is not an atomic operation in the strict sense. It is the CAS provided by the cache layer, which keeps retrying by comparing version numbers.
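A version-checked update loop of the kind described here can be sketched as follows. This is a single-process illustration only: the `Snapshot` record shape, the `compareAndSet` helper (a synchronized method standing in for the cache's CAS), and the amount rule inside `open` are all assumptions, not the cache's real interface.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.IntStream;

/** Sketch of a version-checked update on a single red envelope record. */
class VersionedEnvelope {

    /** Immutable snapshot of the record: version, envelopes left, cents left. */
    static final class Snapshot {
        final long version;
        final int remainCount;
        final long remainCents;

        Snapshot(long version, int remainCount, long remainCents) {
            this.version = version;
            this.remainCount = remainCount;
            this.remainCents = remainCents;
        }
    }

    private Snapshot current;

    /** Assumes cents >= count so that everyone can get at least 1 cent. */
    VersionedEnvelope(int count, long cents) {
        current = new Snapshot(0, count, cents);
    }

    synchronized Snapshot read() {
        return current;
    }

    /** Hypothetical stand-in for the cache's CAS: writes only if the version is unchanged. */
    synchronized boolean compareAndSet(long expectedVersion, int newCount, long newCents) {
        if (current.version != expectedVersion) {
            return false; // someone else updated the record first
        }
        current = new Snapshot(expectedVersion + 1, newCount, newCents);
        return true;
    }

    /** Opens one envelope; returns the amount in cents, or -1 if none are left. */
    long open() {
        while (true) {
            Snapshot s = read();
            if (s.remainCount == 0) {
                return -1;
            }
            long amount = s.remainCents; // the last recipient takes the rest
            if (s.remainCount > 1) {
                long max = Math.min(2 * s.remainCents / s.remainCount,
                                    s.remainCents - (s.remainCount - 1));
                amount = 1 + (long) (ThreadLocalRandom.current().nextDouble() * max);
            }
            if (compareAndSet(s.version, s.remainCount - 1, s.remainCents - amount)) {
                return amount;
            }
            // Version mismatch: re-read the record and try again.
        }
    }

    public static void main(String[] args) {
        VersionedEnvelope envelope = new VersionedEnvelope(10, 10_000);
        long total = IntStream.range(0, 200).parallel()
                .mapToLong(i -> envelope.open())
                .filter(a -> a > 0)
                .sum();
        System.out.println(total); // 10000: every cent is handed out exactly once
    }
}
```

Because the count and the amount are written together under one version check, a losing writer never applies a stale amount; it simply re-reads and retries.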

Question: What if the cache and the DB go down?

Answer: Primary/standby failover plus reconciliation.

Question: Why separate grabbing from opening?

Answer: The general idea is to set up several layers of filtering that reduce the traffic and the load layer by layer.

The design came about because grabbing is a business-layer operation while opening is an accounting operation. A single combined operation would be too heavy and would fail too often. At the interface level, the first interface is a pure cache operation that can withstand heavy load. A simple cache query turns away most users as a first round of filtering, which is why most people see the message that the envelopes are all gone.

Question: Is there any policy for sending red envelopes or withdrawing cash after grabbing one?

Answer: Large amounts are credited first.

Someone also drew a schematic diagram of the technical points above (the clearest version that could be found online):

3.2. Simulation of WeChat grabbing red envelopes

Based on the information compiled in the previous section, when someone sends a red envelope for N people with a total of M yuan in a WeChat group, the backend logic is roughly as follows.

3.2.1) Backend operations when sending a red envelope:

  • 1) Add a red envelope record to the database, store it in CKV, and set its expiration time;
  • 2) Add a record to the Cache (possibly Tencent's internal KV database: memory-based, with persistence to disk, and with a kernel-mode network processing module that serves requests as a kernel module) to store the number of people, N, who can grab the red envelope.

3.2.2) Backend operations when grabbing a red envelope:

  • 1) Claiming a red envelope is split into grabbing and opening. Grabbing is done at the Cache layer, where the red envelope count is decremented atomically; when it reaches 0, all envelopes have been grabbed. As a result, the number of opening operations that actually reach the backend is small, because invalid requests are blocked at the Cache layer.
  • The atomic decrement here is not an atomic operation in the strict sense. It is the CAS provided by the Cache layer, which keeps retrying by comparing version numbers, so some conflicts occur. A user who hits a conflict is let through to the next step, which also explains why some users grab an envelope and then find on opening it that they are all gone.
  • 2) Opening a red envelope is done in the database: a database transaction accumulates the count and amount already claimed and inserts a claim record. Crediting the account is asynchronous, which also explains why, during the Spring Festival, the money did not show up in the balance right after a red envelope was claimed.
  • The amount is calculated in real time when the envelope is opened. It is a random number between 0.01 yuan and twice the remaining average. For a red envelope with a total of M yuan, the first amount drawn is at most M * 2 / N (and no amount exceeds M). When an envelope is opened, the remaining amount and count are updated. Tenpay prepared for 200,000 transactions per second; the actual volume was only about 80,000 per second.

4. WeChat red envelope algorithm simulation implementation 1 (including code)

An algorithm has been implemented from the key technical points of the WeChat red envelope random algorithm in the previous section; it is given below for reference. (Note: this section is quoted from the article "A Preliminary Study on WeChat Red Packet Random Algorithm".)

4.1, algorithm conventions

The algorithm is very simple. Like WeChat's, it does not calculate amounts in advance but calculates each one when the red envelope is grabbed.

That is, the amount is random, between 0.01 and the remaining average * 2. (See the question on the distribution algorithm in the previous section.)

4.2, code implementation

The logic of the algorithm is mainly:

public static double getRandomMoney(RedPackage _redPackage) {
    // remainSize: number of red envelopes left
    // remainMoney: amount of money left
    if (_redPackage.remainSize == 1) {
        // The last recipient takes whatever is left, rounded to cents.
        _redPackage.remainSize--;
        return (double) Math.round(_redPackage.remainMoney * 100) / 100;
    }
    Random r     = new Random(); // requires: import java.util.Random;
    double min   = 0.01; // smallest amount: 0.01 yuan
    double max   = _redPackage.remainMoney / _redPackage.remainSize * 2; // twice the remaining average
    double money = r.nextDouble() * max;
    money = money <= min ? min : money;
    money = Math.floor(money * 100) / 100; // truncate to cents
    _redPackage.remainSize--;
    _redPackage.remainMoney -= money;
    return money;
}

The data structure of RedPackage is as follows:

class RedPackage {
    int remainSize;     // number of red envelopes left
    double remainMoney; // amount of money left, in yuan
}

The data is initialized for testing as follows:

static void init() {
    redPackage.remainSize  = 30;
    redPackage.remainMoney = 500;
}

A complete, runnable Java source file is attached to the original post and can be downloaded from: http://www.52im.net/thread-3125-1-1.html

4.3, test results

4.3.1 Single test

Using the initialization data in the code above (30 people grabbing 500 yuan), the code was run twice, with the following results:

// First run

15.69 21.18 24.11 30.85 0.74 20.85 2.96 13.43 11.12 24.87 1.86 19.62 5.97 29.33 3.05 26.94 18.69 34.47 9.4 29.83 5.17 24.67 17.09 29.96 6.77 5.79 0.34 23.89 40.44 0.92

// Second run

10.44 18.01 17.01 21.07 11.87 4.78 30.14 32.05 16.68 20.34 12.94 27.98 9.31 17.97 12.93 28.75 12.1 12.77 7.54 10.87 4.16 25.36 26.89 5.73 11.59 23.91 17.77 15.85 23.42 9.77

The first random red envelope data chart is as follows: 

▲ The x-axis is the order of grabs, and the y-axis is the amount of grabs

The second random red envelope data chart is as follows:

▲ The x-axis is the order of grabs, and the y-axis is the amount of grabs

4.3.2 Averages over multiple runs

Average over 200 runs:

▲ The x-axis is the grab order, and the y-axis is the average amount grabbed at that position

Average over 2000 runs:

▲ The x-axis is the grab order, and the y-axis is the average amount grabbed at that position

The averages in the two figures above show that, with this algorithm, the amount you can expect is almost the same at every position in the grab order, so the algorithm is reasonable in terms of randomness.
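The averaging experiment can be reproduced with a small harness like the one below. It is a sketch rather than the original author's test code: the class `AverageByPosition`, its methods, and the fixed seed are assumptions made so the run is repeatable, while the per-draw rule follows section 4.2.

```java
import java.util.Random;

/** Sketch: averages the amount at each grab position over many simulated runs. */
class AverageByPosition {

    /** Splits total yuan among n people with the rule from section 4.2. */
    static double[] split(double total, int n, Random r) {
        double[] amounts = new double[n];
        double remain = total;
        for (int i = 0; i < n; i++) {
            int left = n - i;
            double money;
            if (left == 1) {
                money = Math.round(remain * 100) / 100.0; // the last one takes the rest
            } else {
                double max = remain / left * 2; // twice the remaining average
                money = Math.max(r.nextDouble() * max, 0.01);
                money = Math.floor(money * 100) / 100; // truncate to cents
            }
            amounts[i] = money;
            remain -= money;
        }
        return amounts;
    }

    /** Mean amount per grab position over the given number of runs. */
    static double[] averages(double total, int n, int runs, long seed) {
        Random r = new Random(seed);
        double[] mean = new double[n];
        for (int run = 0; run < runs; run++) {
            double[] amounts = split(total, n, r);
            for (int i = 0; i < n; i++) {
                mean[i] += amounts[i] / runs;
            }
        }
        return mean;
    }

    public static void main(String[] args) {
        double[] mean = averages(500, 30, 2000, 42L);
        for (int i = 0; i < mean.length; i++) {
            System.out.printf("%2d: %.2f%n", i + 1, mean[i]);
        }
    }
}
```

With 500 yuan and 30 people, each position should average close to 500 / 30, about 16.67 yuan, though the later positions vary more from run to run.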

5. WeChat red envelope algorithm simulation implementation 2 (including code)

I am very interested in random algorithms, and my recent research happens to lean toward random numbers, so I also implemented the WeChat red envelope distribution algorithm myself (for the main points of the algorithm, see section 3 of this article). (Note: this section is quoted from the article "Analysis of WeChat Red Packet Algorithm".)

5.1, code implementation

As section 3 explains, WeChat does not pre-allocate all the amounts up front but calculates each one when the envelope is opened. The advantage is efficiency and real-time behavior. How does this code calculate each amount? See the question on the distribution algorithm in section 3.

Based on this idea, a red envelope distribution algorithm can be written:

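/**
 * An imperfect red envelope algorithm.
 * Adds one amount to l and returns the money left for later recipients.
 * Requires: import java.util.List; import java.util.Random;
 */
public static double rand(double money, int people, List<Double> l) {
    if (people == 1) {
        // The last recipient takes whatever is left, rounded to cents.
        double red = Math.round(money * 100) / 100.0;
        l.add(red);
        return 0;
    }
    Random random = new Random();
    double min = 0.01; // smallest amount: 0.01 yuan
    double max = money / people * 2.0; // twice the remaining average
    double red = random.nextDouble() * max;
    red = red <= min ? min : red;
    red = Math.floor(red * 100) / 100.0; // truncate to cents
    l.add(red);
    double remain = Math.round((money - red) * 100) / 100.0;
    return remain;
}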

The overall idea of the algorithm is very simple. The only special case is the last person: no random number is drawn, and the remaining amount is used directly as that person's red envelope.

5.2, the first analysis

Using the algorithm above, we can analyze how users fare when grabbing red envelopes. The simulation here is a 30 yuan red envelope grabbed by 10 people, repeated 100 times.

The following results can be obtained: 

▲ The x-axis is the order of grabs, and the y-axis is the amount of grabs

It is easy to see from the figure above that the later someone grabs, the greater the risk and the greater the potential gain, and the greater the chance of getting the "best luck".

How are the red envelope amounts distributed?

▲ The x-axis is the grab order, and the y-axis is the average amount grabbed over 100 repetitions

As the figure above shows, they are all fairly close to the average (3 yuan).

What about 1,000 repetitions?

▲ The x-axis is the grab order, and the y-axis is the average amount grabbed over 1,000 repetitions

Even closer.

It can be seen that this algorithm gives everyone a roughly equal expected amount.

5.3, shortcomings

Someone asked this question: 

He then posted several screenshots of his experiments; I include one here. If you are interested, see the Zhihu question for more.

A friend of mine was discussing this with me at the time and told me that there does seem to be a rule that gives the last person to grab a small advantage, such as 0.01 extra.

For example, if you send 6 envelopes totaling 0.09, the last one to grab is very likely to get 0.03.

However, my previous code could not reflect this.

For example, when 10 people grab an envelope of 0.11 yuan, my result is:

It can be seen that the code above still has shortcomings.

So I have a guess:

WeChat may not randomize the full amount. The amount may be adjusted before the red envelopes are distributed: for example, subtract (number of red envelopes * 0.01) in advance, then add 0.01 to each red envelope's random value, which guarantees that every red envelope is at least 0.01.

This guess may explain what that Zhihu user and my friend observed.

5.4, improved algorithm

Make a simple correction to the code on the original basis:

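// Requires: import java.util.List; import java.util.Random;
public static double rand(double money, int people, List<Double> l) {
    if (people == 1) {
        // The last recipient takes the rest of the pool plus the 0.01 base.
        double red = Math.round(money * 100) / 100.0;
        l.add(red + 0.01);
        return 0;
    }
    Random random = new Random();
    double min = 0; // the random part may now be zero
    double max = money / people * 2.0; // twice the remaining average
    double red = random.nextDouble() * max;
    red = red <= min ? min : red;
    red = Math.floor(red * 100) / 100.0; // truncate to cents
    l.add(red + 0.01); // add the 0.01 base back
    double remain = Math.round((money - red) * 100) / 100.0;
    return remain;
}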

With this algorithm, the money value passed to the first call is the total amount minus (number of red envelopes * 0.01), roughly like this:

_money = _money - people * 0.01;
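The same "base amount" idea can also be written end to end in integer cents, which sidesteps the rounding that comes with adding 0.01 to a double. This is a sketch under that assumption, not the article's original code; `BaseAmountSplit` and `split` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Sketch of the "base amount" idea in integer cents. */
class BaseAmountSplit {

    /** Splits totalCents among people; every recipient gets at least 1 cent. */
    static List<Long> split(long totalCents, int people, Random rng) {
        if (people < 1 || totalCents < people) {
            throw new IllegalArgumentException("need at least 1 cent per recipient");
        }
        long pool = totalCents - people; // set aside the 1-cent base for everyone
        List<Long> amounts = new ArrayList<>();
        for (int left = people; left >= 1; left--) {
            long extra;
            if (left == 1) {
                extra = pool; // the last recipient takes the rest of the pool
            } else {
                double max = 2.0 * pool / left; // twice the remaining average
                extra = (long) Math.floor(rng.nextDouble() * max);
            }
            pool -= extra;
            amounts.add(extra + 1); // add the base cent back
        }
        return amounts;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        System.out.println(split(11, 10, rng));   // 0.11 yuan, 10 people
        System.out.println(split(9, 6, rng));     // 0.09 yuan, 6 people
        System.out.println(split(3000, 10, rng)); // 30 yuan, 10 people
    }
}
```

Because only the pool left after the base cents is randomized, no recipient can end up with zero, and the amounts always add up to the original total.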

5.5, the second analysis

5.5.1 Verifying the earlier shortcomings

1) 10 people grab an envelope of 0.11 yuan:

2) 2 people grab an envelope of 0.03 yuan:

3) 6 people grab an envelope of 0.09 yuan:

5.5.2 Will the modified code affect the known conclusions?

A 30 yuan red envelope grabbed by 10 people, repeated 100 times.

▲ The x-axis is the order of grabs, and the y-axis is the amount of grabs 

▲ The x-axis is the order of grabs, and the y-axis is the average value of the grab amount after 100 repetitions

It can be seen from the above two figures that the conclusion has basically not changed.

5.6. Conclusion

From the code experiments above, we can conclude:

1) Whether you grab early or late, the expected amount is the same;

2) WeChat's red envelope algorithm probably pre-allocates a "base amount" of 0.01 to each person;

3) Those who grab later face higher risk and higher potential reward.

5.7, supplement

A few more screenshots from later tests support the earlier point: when n red envelopes are sent with a total of (n+1)*0.01, the last one is always the "best luck".
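Under the base-amount guess from section 5.3, this observation follows directly: after n cents are set aside, only 1 cent is left to randomize, and for every recipient but the last the random part rounds down to zero. The sketch below checks that reasoning; it assumes the guessed model rather than WeChat's actual code, and `LastIsLuckiest` and `extraCents` are hypothetical names.

```java
/** Sketch: with (n + 1) cents for n people, only the last recipient gets the spare cent. */
class LastIsLuckiest {

    /** Extra cents on top of the 1-cent base, for a uniform draw u in [0, 1). */
    static long extraCents(long poolCents, int left, double u) {
        if (left == 1) {
            return poolCents; // the last recipient takes the rest of the pool
        }
        // Twice the remaining average, scaled by u and rounded down to whole cents.
        return (long) Math.floor(u * 2.0 * poolCents / left);
    }

    public static void main(String[] args) {
        int n = 6;
        long pool = 1; // (n + 1) cents in total, minus the n base cents
        for (int left = n; left >= 1; left--) {
            long extra = extraCents(pool, left, Math.random());
            pool -= extra;
            System.out.println("recipient " + (n - left + 1) + ": " + (extra + 1) + " cent(s)");
        }
    }
}
```

With a 1-cent pool and at least two people left, the upper bound 2 * 1 / left is at most 1 cent, so any draw below 1 rounds down to 0. Recipients 1 to 5 therefore print 1 cent and recipient 6 prints 2 cents on every run.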

 

You can try it yourself.

The above suggests that WeChat gives every recipient a minimum of 0.01 before distributing the rest!

6. Reference materials

[1]  A Preliminary Study on the Random Algorithm of WeChat Red Packets

[2]  Analysis of WeChat Red Packet Algorithm

[3]  Introduction to the architecture design of WeChat red envelopes

[4]  How is the random algorithm of WeChat red envelopes implemented?

In addition, many people have discussed the WeChat red envelope algorithm on Zhihu. If you are interested, take a look for more inspiration: "How is the random algorithm of WeChat red envelopes implemented?".


(This article was published synchronously at: http://www.52im.net/thread-3125-1-1.html )


Origin blog.csdn.net/hellojackjiang2011/article/details/108239310