Reservoir Sampling (Pond Sampling) to Solve the Random Selection Problem

1. Introduction

Reservoir sampling (also translated as pond sampling) is a family of randomized algorithms for selecting k samples from a set S of n items, where n is very large or unknown. It is especially suitable when the n items cannot all be held in memory. The most common example is Algorithm R, described by Jeffrey Vitter in his paper.

2. Algorithm steps:

Let s denote the input with n items in total, j the index of the current item, and p the reservoir (pond):

  1. Put the first k items of s into the reservoir p; this covers j∈[0,k-1].
  2. For each j>=k, generate a random index r uniformly in the range [0,j], that is, r∈[0,j]:
    if r<k, that is, r∈[0,k-1], replace p[r] with s[j], that is, p[r]=s[j]; otherwise discard s[j]. As a result, s[j] enters the reservoir with probability k/(j+1).
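
The steps do not spell out why every item ends up with the same chance of being kept. A short derivation, using the same 0-based index j and assuming r is drawn uniformly from [0,j], might go as follows (t is an auxiliary step index introduced here for illustration):

```latex
% A fixed slot of p is overwritten at step t (t >= k) only when r equals that slot's index:
\[
  P(\text{slot overwritten at step } t) = \frac{1}{t+1}
\]
% An item s[j] with j >= k enters with probability k/(j+1)
% and must then survive steps j+1, ..., n-1:
\[
  P(s[j] \in p \text{ at the end})
  = \frac{k}{j+1} \prod_{t=j+1}^{n-1} \frac{t}{t+1}
  = \frac{k}{j+1} \cdot \frac{j+1}{n}
  = \frac{k}{n}
\]
% An item s[j] with j < k starts in p and must survive steps k, ..., n-1:
\[
  P(s[j] \in p \text{ at the end})
  = \prod_{t=k}^{n-1} \frac{t}{t+1}
  = \frac{k}{n}
\]
```

In both cases the product telescopes, so each of the n items is in the final sample with probability k/n.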

3. Use cases

3.1 Is it possible to randomly select K elements from a set of unknown size with equal probability?


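    import java.util.ArrayList;
    import java.util.Random;

    // Reservoir sampling (Algorithm R)
    public ArrayList<Integer> randomSelect(ArrayList<Integer> list, int k) {
        // 1. Put the first k elements into the reservoir
        //    (if the list has fewer than k elements, take them all)
        ArrayList<Integer> pool = new ArrayList<>(list.subList(0, Math.min(k, list.size())));
        // 2. For each element with index i >= k, decide at random whether it replaces a slot
        Random random = new Random();
        for (int i = k; i < list.size(); i++) {
            int r = random.nextInt(i + 1); // uniform in [0, i]
            if (r < k) {
                pool.set(r, list.get(i));
            }
        }
        return pool;
    }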
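
The method above still calls list.size(), so it does not by itself show the unknown-size case that the question asks about. A minimal sketch of a streaming variant, assuming the items arrive through an Iterator whose length is never queried, could look like this. The class name StreamReservoir and the method sample are hypothetical names chosen for illustration, not part of the original post.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical helper class; the name and signature are illustrative only.
class StreamReservoir {

    // Returns up to k items chosen uniformly at random from the stream.
    // Items are consumed one at a time and the stream length is never needed.
    static <T> List<T> sample(Iterator<T> stream, int k, Random random) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<T> pool = new ArrayList<>(k);
        int seen = 0; // number of items consumed before the current one
        while (stream.hasNext()) {
            T item = stream.next();
            if (seen < k) {
                // Step 1: the first k items fill the reservoir.
                pool.add(item);
            } else {
                // Step 2: r is uniform in [0, seen]; replace slot r when r < k.
                int r = random.nextInt(seen + 1);
                if (r < k) {
                    pool.set(r, item);
                }
            }
            seen++;
        }
        return pool;
    }
}
```

For example, StreamReservoir.sample(lines.iterator(), 10, new Random()) would keep 10 entries from a collection named lines (a placeholder) while holding only the reservoir in memory. Passing the Random in as a parameter is a design choice made here so that a fixed seed can reproduce a run.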

4. LeetCode 382. Linked List Random Node

Given a singly linked list, randomly select a node of the linked list and return the corresponding node value. Each node has the same probability of being selected.
Implement the Solution class:

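  • Solution(ListNode head) initializes the object with the head of the singly linked list.
  • int getRandom() randomly selects a node from the linked list and returns the value of that node. All nodes in the linked list have equal probability of being selected.

Approach 1 copies the nodes into an array list and looks up a random index:

    import java.util.ArrayList;
    import java.util.Random;

    // Approach 1: array list + binary search
    class Solution {
        ArrayList<ListNode> nodes = new ArrayList<>();
        Random random = new Random();

        public Solution(ListNode head) {
            // Copy every node into the list so that it can be reached by index
            ListNode tail = head;
            while (tail != null) {
                nodes.add(tail);
                tail = tail.next;
            }
        }

        public int getRandom() {
            // Pick a random index in [0, nodes.size())
            int index = random.nextInt(nodes.size());
            // Locate the node by binary search over the indices
            ListNode tar = binarySearch(index);
            return tar.val;
            // Equivalent and simpler: return nodes.get(index).val;
        }

        private ListNode binarySearch(int index) {
            int l = 0, r = nodes.size() - 1;
            while (l <= r) {
                int m = l + (r - l) / 2;
                if (m == index) {
                    return nodes.get(m);
                } else if (m < index) {
                    l = m + 1;
                } else {
                    r = m - 1;
                }
            }
            return null;
        }
    }

Approach 2 applies reservoir sampling with k = 1 and stores only the head of the list:

    import java.util.Random;

    // Approach 2: reservoir sampling
    class Solution {
        ListNode head;
        Random random = new Random();

        public Solution(ListNode head) {
            this.head = head;
        }

        public int getRandom() {
            ListNode tail = head.next;
            int index = 1;      // number of nodes seen so far
            int val = head.val; // current choice, initially the first node

            while (tail != null) {
                // Replace the current choice with probability 1/(index+1)
                int r = random.nextInt(index + 1);
                if (r == 0) {
                    val = tail.val;
                }
                index++;
                tail = tail.next;
            }
            return val;
        }
    }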
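
One way to gain some confidence that the reservoir version of getRandom() is uniform is to call it many times and count the results. The sketch below is only an illustration under stated assumptions: Node is a hypothetical stand-in for LeetCode's ListNode, pick repeats the same k = 1 logic but takes the Random as a parameter so that a seed can be fixed, and UniformityCheck and countPicks are names invented here.

```java
import java.util.Random;

// Hypothetical test helper; all names in this class are illustrative.
class UniformityCheck {

    // Minimal stand-in for LeetCode's ListNode.
    static class Node {
        int val;
        Node next;

        Node(int val) {
            this.val = val;
        }
    }

    // Same k = 1 reservoir logic as getRandom(), with the Random passed in.
    static int pick(Node head, Random random) {
        int val = head.val;
        int index = 1; // number of nodes seen so far
        for (Node cur = head.next; cur != null; cur = cur.next) {
            // Replace the current choice with probability 1/(index+1).
            if (random.nextInt(index + 1) == 0) {
                val = cur.val;
            }
            index++;
        }
        return val;
    }

    // Builds the list 0 -> 1 -> ... -> n-1 (n >= 1) and counts how often each value is picked.
    static int[] countPicks(int n, int trials, Random random) {
        Node head = new Node(0);
        Node tail = head;
        for (int i = 1; i < n; i++) {
            tail.next = new Node(i);
            tail = tail.next;
        }
        int[] counts = new int[n];
        for (int t = 0; t < trials; t++) {
            counts[pick(head, random)]++;
        }
        return counts;
    }
}
```

With n nodes and a large number of trials, each entry of the returned array should be close to trials/n. This is an empirical sanity check, not a proof.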



Origin blog.csdn.net/qq_43665602/article/details/130105870