Reservoir Sampling Algorithm

https://blog.csdn.net/huagong_adu/article/details/7619665

https://www.jianshu.com/p/63f6cf19923d

https://www.cnblogs.com/snowInPluto/p/5996269.html

https://www.cnblogs.com/xudong-bupt/p/4053652.html

https://www.jianshu.com/p/51f7089c082b

Concept:

Picking a random element with equal probability from an array of known length is easy, but what if you face a massive data stream of unknown length? The reservoir sampling algorithm solves this problem: it reads the data once and keeps only a fixed-size sample in memory, which makes it very useful when analyzing large data sets.
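The classic single-pass version of the idea (often called Algorithm R) can be sketched as follows. This is a minimal illustration that assumes uniform sampling of k items; the function name `reservoir_sample` and the optional `rng` parameter are hypothetical and do not come from the original post.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Return k items chosen uniformly at random from an iterable of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number i+1 replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir

if __name__ == "__main__":
    # The stream is consumed once; its length is never needed in advance.
    print(reservoir_sample(iter(range(1_000_000)), 5))
```

If the stream holds fewer than k items, the sketch simply returns all of them.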

Application Scenario:

Sample 100 queries from a massive set of advertising data, where the features include PV (the number of searches for the query), adpv (the number of searches that showed ads), adshow (the total number of ad impressions), and click (the number of clicks).

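Reservoir sampling: for each query, generate a random number u in (0,1), compute the key u^(1/PV), and keep the 100 queries with the largest keys. This is the weighted form of reservoir sampling, so a query with a larger PV is more likely to be selected.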
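A minimal sketch of this weighted step, assuming each record is a (query, PV) pair with PV > 0 and that the key is u^(1/PV). The function name `weighted_reservoir_sample` and the input layout are illustrative assumptions, not the original author's code. A min-heap of size k keeps the k largest keys in a single pass.

```python
import heapq
import random

def weighted_reservoir_sample(items, k, rng=None):
    """Sample k queries from (query, pv) pairs, favoring larger pv.

    Assumption: pv is a positive weight; records with pv <= 0 are skipped.
    """
    rng = rng or random.Random()
    heap = []  # min-heap of (key, index, query); heap[0] holds the smallest kept key
    for idx, (query, pv) in enumerate(items):
        if pv <= 0:
            continue
        u = rng.random()
        key = u ** (1.0 / pv)  # larger pv pushes the key toward 1
        entry = (key, idx, query)  # idx breaks ties without comparing queries
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            # The new key beats the smallest kept key, so swap it in.
            heapq.heapreplace(heap, entry)
    return [query for _, _, query in heap]

if __name__ == "__main__":
    # Hypothetical data: query names with made-up PV counts.
    data = [("query_%d" % i, i + 1) for i in range(10_000)]
    print(weighted_reservoir_sample(data, 100)[:5])
```

The heap holds at most k entries, so memory stays constant regardless of how many queries pass through. For very large PV values the keys crowd close to 1.0 in floating point, so comparing log(u) / PV instead is a common alternative that preserves the same ordering.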

 


Origin www.cnblogs.com/Lee-yl/p/11209634.html