Two solutions to the reverse pair problem (merge sort and tree array, i.e. binary indexed tree) and their variants

Reverse pair problem:
Given an array nums, we call (i, j) a reverse pair if i < j and nums[i] > nums[j].
You need to return the number of reverse pairs in the given array.
For example, the array arr[]={3,5,4,2,1} should output 8, because (3,2) (3,1) (5,4) (5,2) (5,1) (4,2) (4,1) (2,1) are in reverse order.
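
Before looking at the faster solutions, the definition itself can be written down as a brute-force check. This is only a reference sketch for verifying results on small inputs; the function name count_reverse_pairs_naive is my own and does not appear in the programs below.

```cpp
#include <cstddef>
#include <vector>

// Reference count taken straight from the definition:
// every pair of positions i < j with nums[i] > nums[j]. Runs in O(n^2).
long long count_reverse_pairs_naive(const std::vector<int>& nums) {
    long long count = 0;
    for (std::size_t i = 0; i < nums.size(); i++) {
        for (std::size_t j = i + 1; j < nums.size(); j++) {
            if (nums[i] > nums[j]) count++;
        }
    }
    return count;
}
```

For {3,5,4,2,1} it returns 8, matching the example above.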

Solution 1: Count the pairs as a by-product of merge sort. During the merge step, when both halves are non-empty and the current element of the right half is smaller than the current element of the left half, every remaining element of the left half is also larger than that right element, because the left half is already sorted. So we add the number of remaining left elements to the count:

           num_of_reverse_pair+=m-p1;

That's it.

The one thing to watch out for in this problem is equal elements. If the array contains duplicates, then this branch of the reverse_pair() function

       else if (a[p1]<=a[p2]) {
           t[p++] = a[p1++];
       }

must use <=, not <. This holds even if a check for (a[p1]>a[p2]) is added inside the following else to exclude the case a[p1]==a[p2]. The reason is as follows.
Suppose the left half has [5,8,9] remaining and the right half also has [5,8,9] remaining. With <, the program copies the 5 of the right half into the temporary array t, leaving [8,9] on the right. It then sees that the 5 on the left is less than the 8 on the right and copies the left 5 into t. Now both halves are [8,9], and the same steps repeat until every element has been copied to t. num_of_reverse_pair is never increased, so the result is 3 pairs short: (8,5), (9,5) and (9,8).

With <=, the program first copies the 5 of the left half into t. It then finds that the 8 on the left is greater than the 5 on the right, so num_of_reverse_pair increases by 2 (for 8 and 9), and the right 5 is copied into t. Both halves are now [8,9]. In the same way, the program copies the left 8 first, and then num_of_reverse_pair increases by 1, because 9>8.
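
The two walk-throughs can be reproduced with a small sketch. It is not the program below, only a compact variant that returns the count instead of using a global. The names count_pairs_demo and run_demo and the use_le switch are my own: use_le=true merges with <=, while use_le=false merges with < and relies on the extra a[p1]>a[p2] check in the else.

```cpp
#include <vector>

// Counts reverse pairs in a[x, y) while merge sorting it.
// use_le selects the comparison used when both halves are non-empty.
long long count_pairs_demo(std::vector<int>& a, int x, int y,
                           std::vector<int>& t, bool use_le) {
    if (y - x <= 1) return 0;
    int m = x + (y - x) / 2;
    long long cnt = count_pairs_demo(a, x, m, t, use_le)
                  + count_pairs_demo(a, m, y, t, use_le);
    int p1 = x, p2 = m, p = x;
    while (p1 < m || p2 < y) {
        if (p2 >= y) {
            t[p++] = a[p1++];
        } else if (p1 >= m) {
            t[p++] = a[p2++];
        } else if (use_le ? (a[p1] <= a[p2]) : (a[p1] < a[p2])) {
            t[p++] = a[p1++];
        } else {
            // Guarded count: with '<' this branch is also reached on ties.
            if (a[p1] > a[p2]) cnt += m - p1;
            t[p++] = a[p2++];
        }
    }
    for (int i = x; i < y; i++) a[i] = t[i];
    return cnt;
}

// Convenience wrapper that works on a copy of the input.
long long run_demo(std::vector<int> a, bool use_le) {
    std::vector<int> t(a.size());
    return count_pairs_demo(a, 0, static_cast<int>(a.size()), t, use_le);
}
```

For {5,8,9,5,8,9}, run_demo(..., true) gives 3 and run_demo(..., false) gives 0, which is exactly the 3 missing pairs described above.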

#include <iostream>

using namespace std;

int num_of_reverse_pair=0;
void merge_sort(int *a, int x, int y, int *t) {

   if (y - x <= 1)   // zero or one element: nothing to sort (also stops on an empty range)
        return;

   int m = x + (y-x)/2;
   merge_sort(a, x, m, t);
   merge_sort(a, m, y, t);

   int p1=x, p2=m;   //pointers to the left half array and right half array, respectively.
   int p=x; //pointer to the array t

   while(p1<m || p2<y) {
       // if right half array empty, copy the rest of left half array to t
       if (p2 >= y) {
           t[p++] = a[p1++];
       }
       else if (p1 >= m) {
           t[p++] = a[p2++];
       }
       else if (a[p1]<a[p2]) {
           t[p++] = a[p1++];
       }
       else {
           t[p++] = a[p2++];
       }

   }

   for (int i=x; i<y; i++){
       a[i] = t[i];
   }

}

void reverse_pair(int *a, int x, int y, int *t) {

   if (y - x <= 1)   // zero or one element: no pairs to count (also stops on an empty range)
        return;

   int m = x + (y-x)/2;
   reverse_pair(a, x, m, t);
   reverse_pair(a, m, y, t);

   int p1=x, p2=m;   //pointers to the left half array and right half array, respectively.
   int p=x; //pointer to the temporary array t

   while(p1<m || p2<y) {
       //right side is empty
       if (p2 >= y) {
           t[p++] = a[p1++];
       }
       else if (p1 >= m) {  //left side is empty
           t[p++] = a[p2++];

       }
       // Note: for the reverse pair problem, this must be <= !!!
       else if (a[p1]<=a[p2]) { //both sides are non-empty, and left item is smaller than or equal to right item
           t[p++] = a[p1++];
       }
       else {
           num_of_reverse_pair+=m-p1;
           for (int i=p1; i<m; i++) {
               cout<<"("<<a[i]<<", "<<a[p2]<<")"<<endl;
           }
           t[p++] = a[p2++];
       }

   }

   for (int i=x; i<y; i++){
       a[i] = t[i];
   }

   return;
}



int main()
{
    int arr[14] = {-1, 7, 9, 23, 5, 8, 94, 128, 8, 8, 7, 9, 10, 23};

    cout<<"original array:"<<endl;
    for (int i=0; i<sizeof(arr)/sizeof(int); i++)
        cout<<arr[i]<<" ";
    cout<<endl;

    int t[sizeof(arr)/sizeof(int)];
    reverse_pair(arr, 0, sizeof(arr)/sizeof(int), t);
    cout <<"num of reverse pairs are " << num_of_reverse_pair <<endl;

    cout <<"sorted array is"<<endl;
    for (int i=0; i<sizeof(arr)/sizeof(int); i++)
        cout<<arr[i]<<" ";
    cout<<endl;

    return 0;
}

There is also a variant of the reverse pair problem, LeetCode 493.
Given an array nums, we call (i, j) an important reverse pair if i < j and nums[i] > 2*nums[j]. Note the 2 here.

For this variant, we can't simply wrap the counting in the else{} above with if (a[p1] > 2*a[p2]), as follows:

else{
      if (a[p1] > 2*a[p2]) {     
           num_of_reverse_pair+=m-p1;
           for (int i=p1; i<m; i++) {
               cout<<"("<<a[i]<<", "<<a[p2]<<")"<<endl;
           }
      }
      t[p++] = a[p2++];       
}

This misses pairs in some cases. For example, if a[]={2,4,3,5,1}, a[] becomes {2,4,1,3,5} just before the final merge.
When p1=0, p2=2, a[0]=2 is not more than twice a[2]=1, so the if condition above fails and p2++ follows. The number 1 slips through the net: p1 advances later, and the following 4 never gets a chance to be compared with 1, so the pair (4,1) is lost.

The method I use is to traverse [p1,m) directly in the else above and do count++ for every element larger than 2*a[p2]. The code is as follows:

else{
    for (int i=p1; i<m; i++) {
        if (a[i] > 2LL*a[p2]) {
            num_of_reverse_pair+=1;
            cout<<" ("<<a[i]<<", "<<a[p2]<<") ";
        }
    }
    cout<<endl;
    t[p++] = a[p2++];
}

Also note that 2*a[p2] may overflow int, so use 2LL*a[p2] to do the multiplication in long long. 2L is not enough on platforms where long is 32 bits, for example Windows.

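Note that the worst-case complexity of the algorithm above has degraded to O(n^2): the scan of [p1,m) can cost O(n) for every element of the right half. At that point it is no better than checking all pairs directly with two nested loops. However, we can still optimize the algorithm. Since a[p1..m) is already sorted, we can use binary search to find the first element larger than 2*a[p2]; every element from there up to m forms a pair with a[p2]. The complexity then becomes O(n(logn)^2), not counting the loop that prints the pairs.
Counting only in the else branch also assumes that the elements are non-negative. With negative values, a left element can satisfy a[p1]<=a[p2] and still be greater than 2*a[p2] (for example -4 and -3), and such a pair is never examined.

The complete code of the binary search version is as follows: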

// v is long long so that the caller's 2LL*a[p2] is not truncated back to int
int lower_bound_special(int *a, int x, int y, long long v) {
    int m;
    while(x<y) {
        m = x + (y-x)/2;
        if (a[m] <= v)
            x=m+1;
        else 
            y=m;
    }
    return x;
}

void reverse_pair_2(int *a, int x, int y, int *t) {
   if (y - x <= 1)   // zero or one element: no pairs to count (also stops on an empty range)
        return;

   int m = x + (y-x)/2;
   reverse_pair_2(a, x, m, t);
   reverse_pair_2(a, m, y, t);
   int p1=x, p2=m;   //pointers to the left half array and right half array, respectively.
   int p=x; //pointer to the temporary array t

   while(p1<m || p2<y) {
       //right side is empty
       if (p2 >= y) {
           t[p++] = a[p1++];
       }
       else if (p1 >= m) {  //left side is empty
           t[p++] = a[p2++];

       }
       else if (a[p1]<=a[p2]) { //both sides are non-empty, and left item is smaller than or equal to the right item
           t[p++] = a[p1++];
       }
       else {
           int find_index = lower_bound_special(a, p1, m, 2LL*a[p2]);
           if (find_index < m) {   // index 0 is a valid result, so test against m, not 0
                num_of_reverse_pair += m-find_index;
                for (int i=find_index; i<m; i++) {
                    cout<<" ("<<a[i]<<", "<<a[p2]<<") ";
                }
                cout<<endl;
           }

           t[p++] = a[p2++];
       }

   }

   for (int i=x; i<y; i++){
       a[i] = t[i];
   }

   return;
}

A note on the lower_bound_special() function: it returns the first index in the range [x, y) of array a whose element is larger than v (the same position std::upper_bound would report). For example, if a[2] <= v < a[3], it returns 3. If all of a[x..y) are less than or equal to v, it returns y. So its input range is [x,y) and its output range is [x,y]. Let's analyze it carefully:
when a[m] == v, the answer must lie in [m+1,y), so x=m+1;
when a[m] > v, m is a candidate, but an earlier index may qualify as well, so we keep searching in [x,m] by setting y=m;
when a[m] < v, we should continue from m+1, so x=m+1.
Combining case 1 and case 3: when a[m]<=v, x=m+1.
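
For comparison, here is a sketch of a different technique from the one above: count the pairs in a separate two-pointer pass before each merge, then merge as usual. Because both halves are sorted, the boundary in the right half only moves forward, so the whole thing stays O(nlogn). It also does not depend on which branch of the merge is taken, so negative values are handled. The names count_important and important_reverse_pairs are my own.

```cpp
#include <vector>

// Counts pairs (i, j) with i < j and a[i] > 2*a[j] inside a[x, y),
// and sorts a[x, y) as a side effect.
long long count_important(std::vector<int>& a, int x, int y, std::vector<int>& t) {
    if (y - x <= 1) return 0;
    int m = x + (y - x) / 2;
    long long cnt = count_important(a, x, m, t) + count_important(a, m, y, t);

    // Counting pass: for each left element, a[m..j) are the right elements
    // it is more than twice as large as. j never moves backwards.
    int j = m;
    for (int i = x; i < m; i++) {
        while (j < y && a[i] > 2LL * a[j]) j++;
        cnt += j - m;
    }

    // Ordinary merge of the two sorted halves.
    int p1 = x, p2 = m, p = x;
    while (p1 < m || p2 < y) {
        if (p2 >= y || (p1 < m && a[p1] <= a[p2])) t[p++] = a[p1++];
        else t[p++] = a[p2++];
    }
    for (int i = x; i < y; i++) a[i] = t[i];
    return cnt;
}

// Convenience wrapper that works on a copy of the input.
long long important_reverse_pairs(std::vector<int> a) {
    std::vector<int> t(a.size());
    return count_important(a, 0, static_cast<int>(a.size()), t);
}
```

For {2,4,3,5,1} this returns 3, and for {-4,-3} it returns 1, since -4 > 2*(-3).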

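Solution 2: Use a tree array (binary indexed tree, also called a Fenwick tree). Here the numbers in the array a[]={3,5,4,2,1} serve as the subscripts into the tree array. The program reads n followed by n numbers, and it assumes that every number lies in the range 1..n and that n is smaller than N.

The code here is based on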
http://www.cnblogs.com/xiongmao-cpp/p/5043340.html

#include <iostream>
#include <cstring>

using namespace std;
#define N 1010
int c[N];
int n;
int lowbit(int i)
{
    return i&(-i);
}
int insert(int i)
{
    while(i<=n){
        c[i]+=1;
        i+=lowbit(i);
    }
    return 0;
}

int getsum(int i)
{
    int sum=0;
    while(i>0){
        sum+=c[i];
        i-=lowbit(i);
    }
    return sum;
}
void output()
{
    for(int i=1;i<=n;i++) cout<<c[i]<<" ";
    cout<<endl;
}

int main()
{
    while(cin>>n){
        int ans=0;
        memset(c,0,sizeof(c));
        for(int i=1;i<=n;i++){
            int a;
            cin>>a;
            insert(a);

            cout<<"input "<<a<<endl;
            output();

            ans+=i-getsum(a); // count the elements seen so far that are greater than a, i.e. the reverse pairs that end at a
            cout<<"ans= "<<ans<<endl;

        }
        cout<<ans<<endl;
    }
    return 0;
}

Note:
1) lowbit(i) returns the value of the lowest set bit of i, that is, the last 1 in the binary representation of i together with the 0s that follow it. For example i=5, binary 0101, lowbit(5)=1. i=6, binary 0110, lowbit(6)=2. i=8, binary 1000, lowbit(8)=8. Looking at the figure below, we can see that lowbit(i) is the number of elements of A that C[i] covers. Note that the program above has no array A: each input value is used as a subscript of A. For example, if the input is 7, then A[7]=1. For any i that has not been input, A[i]=0.

[Figure: the tree array, showing which elements of A each C[i] covers]

2) insert(i) adds A[i] (that is, sets A[i]=1), so every C that covers A[i] must be incremented by 1. For example, for a=3, c[3], c[4] and c[8] are all incremented. How do we get from 3 to 4 and from 4 to 8? This is where lowbit() comes in. The lowbit of 3 is 1, and 3+1=4. The lowbit of 4 is 4, and 4+4=8.
Note that the loop in insert(i) is limited to i<=n. If the input array has only 5 elements, then for a=3 we stop at 4, because the next index, 8, is already out of bounds. So we only add 1 to c[3] and c[4].

3) getsum(a) returns how many of the elements input so far are less than or equal to a, that is, A[1]+...+A[a]. lowbit() is used here as well. For example, a=5: lowbit(5)=1, 5-1=4, lowbit(4)=4, 4-4=0, so we add c[5] and c[4] together. Both are needed because c[5] covers only A[5] while c[4] covers A[1..4], so the two do not overlap. If a=8, lowbit(8)=8, so c[8] is returned directly.
Remember that A[k] is 1 if k has been input and 0 otherwise. So getsum(a) is the sum A[1]+...+A[a], the number of elements so far that are less than or equal to a. Then i-getsum(a), where i is the number of elements read so far, is the number of earlier elements greater than a, that is, the number of reverse pairs that a forms. Adding this up over all inputs a gives the answer.
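
The program above needs every input to lie in 1..n. One possible way around that, which is my own assumption and not part of the referenced code, is to replace each value by its 1-based rank among the distinct values before using it as a subscript. The sketch below keeps the same insert/getsum logic; the names Fenwick and count_reverse_pairs_bit are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Minimal tree array over subscripts 1..n that stores counts.
struct Fenwick {
    std::vector<int> c;
    explicit Fenwick(int n) : c(n + 1, 0) {}
    void insert(int i) {                      // A[i] += 1
        for (; i < static_cast<int>(c.size()); i += i & (-i)) c[i] += 1;
    }
    int getsum(int i) const {                 // A[1] + ... + A[i]
        int sum = 0;
        for (; i > 0; i -= i & (-i)) sum += c[i];
        return sum;
    }
};

// Counts reverse pairs for arbitrary int values (negatives and duplicates
// included) by mapping each value to its rank among the distinct values.
long long count_reverse_pairs_bit(const std::vector<int>& nums) {
    std::vector<int> sorted(nums);
    std::sort(sorted.begin(), sorted.end());
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());

    Fenwick bit(static_cast<int>(sorted.size()));
    long long ans = 0;
    for (int i = 0; i < static_cast<int>(nums.size()); i++) {
        int rank = static_cast<int>(
            std::lower_bound(sorted.begin(), sorted.end(), nums[i]) - sorted.begin()) + 1;
        bit.insert(rank);
        ans += (i + 1) - bit.getsum(rank);    // elements so far that are greater than nums[i]
    }
    return ans;
}
```

For {3,5,4,2,1} it returns 8, the same answer as the merge sort solution.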

If anything in this article is incorrect, please let me know.
