Binary Search - Medium - Random Selection by Weight

Foreword

        It has been a long time since I last sat down to work through practice problems. I noticed that LeetCode has a "Random" tag and was curious what kind of problems it covers. I did not expect to find this one, which also came up in my company's gamified teaching project, so I wrote up a solution.

        In this article I first give the solution to the problem. Then, in the "Extension" section later in the article, I give a more efficient solution that applies once the data range is reduced, along with a similar case that I ran into in a real game project.

        The main things this problem tests are binary search and whether you can adapt it flexibly. In addition, if the weight array w is not so long and the weights are not so large, the test point shifts from binary search to hashing (a direct lookup table). For details, see the Extension section later in this article.

Problem source: LeetCode 528, Random Pick with Weight

Solution

Analysis

        The basic idea is to express the weight of each position as an interval that it occupies within a numeric range. Concretely, we build a list weightRoute of running totals of the weight list w; the whole range has length sumWeight, the sum of the values in w. To pick a position, we draw a random value weight in $[0, sumWeight)$. If there is an index such that $weight \in [weightRoute[index], weightRoute[index + 1])$, then index is the position we want (for the last index, the upper bound is sumWeight). The binary search in the lookup stage is a standard operation and the idea is simple, so I won't go into detail here.
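        As a small worked example of the interval idea, the sketch below builds the running totals for the weights [1, 3, 2] and prints the interval each index owns. The helper name buildWeightRoute and the sample weights are assumptions made for illustration. Unlike the class in this post, the sketch also stores the final total so that every interval has an explicit upper bound.

```php
<?php
// Hypothetical helper (not from the original post): builds the prefix-sum
// table so that index i owns the half-open interval
// [weightRoute[i], weightRoute[i + 1]).
function buildWeightRoute(array $w): array
{
    $weightRoute = [0];
    foreach ($w as $value) {
        // Append the running total; the final entry equals sumWeight.
        $weightRoute[] = end($weightRoute) + $value;
    }
    return $weightRoute;
}

$w = [1, 3, 2]; // sample weights chosen for illustration
$weightRoute = buildWeightRoute($w);

foreach ($w as $index => $value) {
    printf(
        "index %d owns [%d, %d)\n",
        $index,
        $weightRoute[$index],
        $weightRoute[$index + 1]
    );
}
```

        Here sumWeight is 6, so a random value of 0 selects index 0, the values 1 to 3 select index 1, and the values 4 and 5 select index 2, which matches the weights 1 : 3 : 2.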

Code

        The construction phase takes O(n) time, and each lookup takes O(log n) time.

class Solution {

    protected $sumWeight;
    protected $length;
    // weightRoute[i] is the sum of w[0..i-1], i.e. where the interval of index i starts
    protected $weightRoute = [0];

    function __construct(array $w) {
        // sum total
        $this->sumWeight = array_sum($w);
        $this->length = count($w);
        for ($i = 1; $i < $this->length; $i++) {
            $this->weightRoute[$i] = $w[$i-1] + $this->weightRoute[$i-1];
        }
        // debug output: print the prefix-sum table
        echo json_encode($this->weightRoute) . PHP_EOL;
    }

    /**
     * @return Integer
     */
    function pickIndex() {
        $weight = mt_rand(0, $this->sumWeight - 1);
        $pick = $this->search($weight);
        return $pick;
    }

    /**
     * Binary search for the index in $weightRoute that corresponds to $weight
     * @param int $weight
     * @return int
     */
    function search(int $weight) {
        $i = 0;
        $j = $this->length - 1;
        $middle = intval(($j - $i) / 2);
        $weightRoute = $this->weightRoute;
        while (true) {
            if ($weightRoute[$j] <= $weight) {
                // success
                return $j;
            } elseif ($i === $middle || $j === $middle) {
                // if the answer were $j, the if above would already have returned, so it must be $i
                return $i;
            } elseif ($weightRoute[$middle] <= $weight) {
                // to right
                $i = $middle;
            } else {
                // to left
                $j = $middle - 1;
            }
            $middle = intval(($j - $i) / 2) + $i;
        }
    }
}

        The running results are as follows.

[Screenshot: LeetCode submission result for the binary search solution]

Without using binary search?

        I was curious whether all test cases would pass without binary search. So I changed Solution::pickIndex() to the sequential search below; Solution::__construct() stays the same as above.

    /**
     * @return Integer
     */
    function pickIndex() {
        $weight = mt_rand(0, $this->sumWeight - 1);
        // echo 'w: ' . $weight . PHP_EOL;
        $pick = null;
        for ($i = 0; $i < $this->length; $i++) {
            // echo $i . PHP_EOL;
            if ($this->weightRoute[$i] > $weight) {
                break;
            }
            $pick = $i;
        }
        return $pick;
    }

The final result is as follows.

[Screenshot: LeetCode submission result for the sequential search solution]
        It also passed, with the same memory consumption as the binary search version above, but the full set of test cases took 4 s. Seeing this, I really wanted to complain: if LeetCode put no time limit on this problem, it could be solved without binary search at all (laughing-crying face).
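        To check that the two lookups agree, the sketch below runs a sequential scan and a binary search over the same prefix-sum table for every possible random value. The function names pickLinear and pickBinary are hypothetical, and pickBinary is a conventional upper-midpoint binary search rather than the exact loop used in the class above.

```php
<?php
// Hypothetical standalone functions for illustration; both return the
// largest index whose prefix sum is <= $weight.
function pickLinear(array $weightRoute, int $weight): int
{
    $pick = 0;
    foreach ($weightRoute as $i => $start) {
        if ($start > $weight) {
            break;
        }
        $pick = $i;
    }
    return $pick;
}

function pickBinary(array $weightRoute, int $weight): int
{
    $lo = 0;
    $hi = count($weightRoute) - 1;
    while ($lo < $hi) {
        // Round the midpoint up so that "$lo = $mid" always makes progress.
        $mid = intdiv($lo + $hi + 1, 2);
        if ($weightRoute[$mid] <= $weight) {
            $lo = $mid;
        } else {
            $hi = $mid - 1;
        }
    }
    return $lo;
}

// Prefix sums for w = [1, 3, 2], in the same layout the constructor builds.
$weightRoute = [0, 1, 4];
$sumWeight = 6;

for ($weight = 0; $weight < $sumWeight; $weight++) {
    printf(
        "weight %d -> linear %d, binary %d\n",
        $weight,
        pickLinear($weightRoute, $weight),
        pickBinary($weightRoute, $weight)
    );
}
```

        Both functions give the same index for every value; the difference is only that the scan touches up to n entries per pick, while the binary search touches about log n.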

Extension

        In this section I assume that the data range of the problem is reduced, which allows another, more efficient solution. I also describe a real case similar to this problem that I ran into in a game project.

Problem variation

        This problem is labeled as binary search, but in practice, if the sum of the weights is not particularly large, the following method can be used instead. Its construction stage takes time and space proportional to the sum of the weights, O(sumWeight), and its lookup stage takes O(1) time.

Analysis

        The basic idea is to represent the weight of each position as the number of slots it occupies in a list. Concretely, we build a list weightRoute whose total length is sumWeight, the sum of the values in the weight list w, in which each position i appears w[i] times. To pick a position, we draw a random index in $[0, sumWeight)$, and weightRoute[index] is the position we want. Comparing this with the analysis above, the two are really the same method, but this version trades space for time (the lookup stage improves from O(log n) to O(1)).
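        As a small worked example, the sketch below expands the weights [1, 3, 2] into such a list and draws one position from it. The helper name buildLookupTable and the sample weights are assumptions made for illustration, and the loop is written differently from the class in this section.

```php
<?php
// Hypothetical helper (not from the original post): expands the weights
// into a lookup table in which index i appears w[i] times.
function buildLookupTable(array $w): array
{
    $table = [];
    foreach ($w as $index => $count) {
        for ($k = 0; $k < $count; $k++) {
            $table[] = $index;
        }
    }
    return $table;
}

$table = buildLookupTable([1, 3, 2]);
echo json_encode($table), PHP_EOL; // [0,1,1,1,2,2]

// Every slot is equally likely, so index i is drawn with
// probability w[i] / sumWeight.
$pick = $table[mt_rand(0, count($table) - 1)];
echo "picked index: ", $pick, PHP_EOL;
```

        The table has one slot per unit of weight, which is why this approach only makes sense when sumWeight is small.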

Code

class Solution {

    protected $sumWeight;
    protected $length;
    // weightRoute holds each index i repeated w[i] times
    protected $weightRoute = [];

    function __construct($w) {
        // sum total
        $this->sumWeight = array_sum($w);
        $this->length = count($w);
        for ($i = 0; $i < $this->length;) {
            if ($w[$i]) {
                // one more slot for index $i
                $this->weightRoute[] = $i;
                $w[$i]--;
            } else {
                // weight of index $i is used up, move on
                $i++;
            }
        }
        // debug output: print the expanded table
        echo json_encode($this->weightRoute);
    }

    /**
     * @return Integer
     */
    function pickIndex() {
        $index = mt_rand(0, $this->sumWeight - 1);
        return $this->weightRoute[$index];
    }
}

        In this problem, the following constraints hold.

  1. $1 \le w.length \le 10000$
  2. $1 \le w[i] \le 10^5$

        So in the extreme case the total length of weightRoute reaches $10^9$. Submitting this solution will very likely run out of memory.
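        The arithmetic behind that estimate can be sketched as follows. The figure of 8 bytes per entry is an assumption used only to get a lower bound; a PHP array needs more than that per element.

```php
<?php
// Worst-case size of the expanded table under the constraints listed above.
$maxLength = 10000;   // upper bound on w.length
$maxWeight = 100000;  // upper bound on w[i], i.e. 10^5
$maxEntries = $maxLength * $maxWeight;

// Assumption for illustration: 8 bytes per entry, the size of a bare
// 64-bit integer. Real PHP arrays use more memory per element.
$bytesPerEntry = 8;
$gib = $maxEntries * $bytesPerEntry / (1024 ** 3);

printf("%d entries, at least %.2f GiB\n", $maxEntries, $gib);
```

        Even under this optimistic assumption the table needs several gigabytes, so the expanded list is only practical when the sum of the weights is small.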

Practical application

        This kind of scenario comes up from time to time in game development. Here I take the math game Voyage Era, from our gamified teaching project, as an example. (I don't remember where else it came up in earlier games =.=)
        In Voyage Era, the player runs a company. Before the game starts, the player chooses a goal, and the level is cleared by achieving that goal within the next 21 rounds. In this game, an economic crisis can occur at random in each round, but the probability of a crisis differs from round to round. This problem is very close to the one above, but not identical. If we treat the probabilities as the weights in the problem above, the key differences between the two problems are as follows.

  1. We know that the list of probabilities has a fixed length of 20.
  2. We know the weight (probability) of each round at the time we write the logic code.

        In other words, the weight list w above is a fixed array here. There is therefore no need to write such complicated code in the construction stage: we can define a constant array as the weight list (seen this way, the difficulty of the problem drops from medium to easy). Of course, since PHP does not keep state between requests, the actual implementation differs somewhat. I won't go into details here; the point is simply that this algorithm has this kind of application scenario.
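        A minimal sketch of the constant-array idea is shown below. The constant CRISIS_WEIGHTS, its values, and the function names are all hypothetical, since the real game values are not given in this post; the sketch also uses only 5 entries instead of 20 to keep it short.

```php
<?php
// Hypothetical weights for illustration only; the real game values are
// not given in the post. Entry i is the relative weight of round i.
const CRISIS_WEIGHTS = [1, 1, 2, 2, 3];

// The weights are known when the code is written, so the lookup table can
// be derived once from the constant and reused for the rest of the request.
function crisisLookupTable(): array
{
    static $table = null;
    if ($table === null) {
        $table = [];
        foreach (CRISIS_WEIGHTS as $round => $weight) {
            // Add $weight slots that all point at this round.
            $table = array_merge($table, array_fill(0, $weight, $round));
        }
    }
    return $table;
}

// O(1) pick: every slot is equally likely.
function pickCrisisRound(): int
{
    $table = crisisLookupTable();
    return $table[mt_rand(0, count($table) - 1)];
}

echo "picked round: ", pickCrisisRound(), PHP_EOL;
```

        The static cache only lives for a single request, which is one reason a real PHP implementation would look different; the expanded table could just as well be written out by hand as a constant.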

Origin blog.csdn.net/qq_23937195/article/details/95721151