Fixed-length shuffle algorithm

1. Background

During a written test, I encountered an algorithm problem: randomly pick m distinct numbers from n different numbers. A shuffle algorithm rearranges the original array so that every element of the original array appears at each position of the shuffled array with equal probability, which is exactly what this problem needs.

2. Shuffle algorithm

Three shuffle algorithms can be derived from drawing cards, swapping cards and inserting cards. The draw and swap approaches correspond to the Fisher-Yates Shuffle and the Knuth-Durstenfeld Shuffle, respectively.

2.1 Fisher-Yates Shuffle

This shuffle was first proposed by Ronald A. Fisher and Frank Yates, hence the name Fisher-Yates Shuffle. The basic idea is to repeatedly pick, at random, a number that has not been taken yet from the original array and append it to a new array, as follows:

  1. Initialize the original array and an empty new array; the length of the original array is n (known);
  2. With k numbers still unprocessed, randomly generate an index p in [0,k) (indices start from 0);
  3. Take the p-th number out of the remaining k numbers and append it to the new array;
  4. Repeat steps 2 and 3 until no numbers remain;
  5. The sequence of numbers taken out in step 3 is the shuffled sequence.

The following proves its randomness, that is, each element lands in the i-th position of the new array with probability 1/n (where n is the size of the array).
Proof: the probability P that an element m is placed in the i-th position = the probability that m is not selected in any of the first i-1 draws * the probability that m is selected in the i-th draw, that is,
P = [(n-1)/n] * [(n-2)/(n-1)] * ... * [(n-i+1)/(n-i+2)] * [1/(n-i+1)] = 1/n
For example, in a sequence of 5 numbers, each number is taken out first with probability 1/5, that is, 1/n. The second number taken out goes to the second position of the new array, and each number is taken out second with probability 4/5 * 1/4 = 1/5, which is again 1/n.

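#include <cstdlib>
#include <ctime>
#include <vector>
using namespace std;

#define N 10  // size of the original array
#define M 5   // how many distinct numbers to draw

// Draws M numbers from arr without repetition and appends them to res.
// arr is consumed: every drawn number is erased from it.
void Fisher_Yates_Shuffle(vector<int>& arr, vector<int>& res)
{
    srand((unsigned)time(NULL));
    for (int i = 0; i < M && !arr.empty(); ++i)  // stop early if arr runs out
    {
        int k = rand() % arr.size();   // random index among the remaining numbers
        res.push_back(arr[k]);
        arr.erase(arr.begin() + k);    // O(n) removal of the drawn number
    }
}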

The time complexity is O(n^2), because each erase shifts up to n elements, and the space complexity is O(n) for the new array.
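For the original problem of picking m distinct numbers out of n, the draw method can stop after m draws. The following is only a minimal sketch under some assumptions: the helper name draw_m is hypothetical, and it swaps rand() for std::mt19937 with std::uniform_int_distribution so that the caller controls the seed and the index is drawn without modulo bias.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not from the original post): draws m distinct values
// from pool with the "card draw" method. pool is taken by value, so the
// caller's vector is left untouched.
std::vector<int> draw_m(std::vector<int> pool, std::size_t m, std::mt19937& rng)
{
    assert(m <= pool.size());  // cannot draw more cards than the deck holds
    std::vector<int> picked;
    picked.reserve(m);
    for (std::size_t i = 0; i < m; ++i) {
        // Uniform index over the numbers that are still in the pool.
        std::uniform_int_distribution<std::size_t> dist(0, pool.size() - 1);
        std::size_t k = dist(rng);
        picked.push_back(pool[k]);
        // Erasing shifts the tail of the vector: O(n) per draw, O(m*n) overall.
        pool.erase(pool.begin() + static_cast<std::ptrdiff_t>(k));
    }
    return picked;
}
```

A caller would typically create one generator, for example std::mt19937 rng(std::random_device{}()), and reuse it across calls instead of reseeding each time.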

2.2 Knuth-Durstenfeld Shuffle

Knuth and Durstenfeld improved on the Fisher-Yates algorithm by swapping numbers within the original array, which saves the extra O(n) space. The basic idea is similar: each time, a number is picked at random from the unprocessed data and moved to the end of the array, so the tail of the array holds the numbers that have already been processed.

The algorithm steps are:

  1. Create an array arr of size n that stores the values 1 to n;
  2. Generate a random number x from 0 to n-1 (the last index may also be picked);
  3. Output the value at index x, which is the first random number;
  4. Swap the current tail element of arr with the element at index x;
  5. As in step 2, generate a random number x from 0 to n-2;
  6. Output the value at index x, which is the second random number;
  7. Swap the second-to-last element of arr with the element at index x;

    Continue in the same way until m numbers have been output.
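The steps above stop after m outputs, so they can be sketched as a partial shuffle. This is an illustrative sketch rather than the original author's code: the name pick_m_in_place is hypothetical, and std::mt19937 is assumed in place of rand().

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: partial Knuth-Durstenfeld shuffle that outputs m
// distinct numbers from arr. arr is reordered in place; the picked numbers
// end up in its last m slots.
std::vector<int> pick_m_in_place(std::vector<int>& arr, std::size_t m, std::mt19937& rng)
{
    assert(m <= arr.size());
    std::vector<int> out;
    out.reserve(m);
    std::size_t tail = arr.size();  // arr[tail..] holds the processed numbers
    for (std::size_t i = 0; i < m; ++i) {
        // Steps 2 and 5: random index over the unprocessed part [0, tail - 1].
        std::uniform_int_distribution<std::size_t> dist(0, tail - 1);
        std::size_t x = dist(rng);
        out.push_back(arr[x]);             // steps 3 and 6: output the value at index x
        std::swap(arr[x], arr[tail - 1]);  // steps 4 and 7: move it to the current tail
        --tail;
    }
    return out;
}
```

Only m swaps are performed, so picking m numbers costs O(m) time regardless of n.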

This is the classic shuffle algorithm. Its proof is as follows:
For arr[i], the probability of ending up at position n-1 after shuffling is 1/n (the random number of the first swap is i).
The probability of ending up at position n-2 is [(n-1)/n] * [1/(n-1)] = 1/n (the random number of the first swap is not i, and the second one is the current location of arr[i]; note that if i = n-1, the first swap moves arr[n-1] to a random position).
The probability of ending up at position n-k is [(n-1)/n] * [(n-2)/(n-1)] * ... * [(n-k+1)/(n-k+2)] * [1/(n-k+1)] = 1/n
(the first random number is not i, the second is not the current location of arr[i] (which may change with each swap), ..., and the k-th is the current location of arr[i]).
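The 1/n result can also be checked empirically. The sketch below is an assumption-laden illustration, not part of the original proof: the helper position_counts is hypothetical, and it shuffles the values 0..n-1 many times and counts how often each value lands at each position. With a uniform shuffle, every count should be close to trials/n.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: runs the Knuth-Durstenfeld shuffle `trials` times on
// 0..n-1 and returns counts[v][p] = number of times value v ended at position p.
std::vector<std::vector<long>> position_counts(std::size_t n, long trials, std::mt19937& rng)
{
    assert(n > 0 && trials > 0);
    std::vector<std::vector<long>> counts(n, std::vector<long>(n, 0));
    std::vector<int> arr(n);
    for (long t = 0; t < trials; ++t) {
        // Reset to the identity order before every shuffle.
        for (std::size_t i = 0; i < n; ++i) arr[i] = static_cast<int>(i);
        // Back-to-front pass: swap arr[i] with a random element of arr[0..i].
        for (std::size_t i = n - 1; i > 0; --i) {
            std::uniform_int_distribution<std::size_t> dist(0, i);
            std::swap(arr[i], arr[dist(rng)]);
        }
        for (std::size_t p = 0; p < n; ++p) {
            ++counts[static_cast<std::size_t>(arr[p])][p];
        }
    }
    return counts;
}
```

For n = 4 and 120000 trials, each of the 16 counts should be near 30000; a count far from that would point to a biased shuffle.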

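#include <cstdlib>
#include <ctime>
#include <utility>
#include <vector>
using namespace std;

// Shuffles arr in place, scanning from back to front.
void Knuth_Durstenfeld_Shuffle(vector<int>& arr){
	// Seed once, before the loop: reseeding inside it with the same
	// timestamp would make rand() return the same value every iteration.
	srand((unsigned)time(NULL));
	for(int i = (int)arr.size() - 1; i > 0; i--){
		// Swap arr[i] with a random element of the unprocessed part arr[0..i].
		swap(arr[i], arr[rand() % (i + 1)]);
	}
}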

The time complexity is O(n) and the space complexity is O(1). Compared with the Fisher-Yates version, the time complexity improves from O(n^2) to O(n). The drawbacks are that the length n of the array must be known in advance and that the original array is modified, since the algorithm shuffles in place. Because it scans from back to front, it cannot handle arrays whose length is unknown or that grow dynamically.

Origin blog.csdn.net/J_avaSmallWhite/article/details/109268439