Algorithms and Data Structures (1.4): divide and conquer (six questions, C++)

The divide and conquer method is a classic topic in algorithms, with difficulty somewhere between greedy and dynamic programming. This time I will share six very representative divide and conquer problems.

Q1: Binary Search


Binary search is probably the most basic application of divide and conquer. Compared with the O(n) time complexity of an ordinary linear search, binary search runs in O(log n), which greatly improves search efficiency.
#include <iostream>
#include <cassert>
#include <vector>

using namespace std;

int binary_search(const vector<int> &a, int x) { // a must be sorted ascending; returns the index of x, or -1 if not found
  int low = 0, high = (int)a.size() - 1;
  int mid;
  while(low <= high){
    mid = (low + high) / 2;
    if(x == a[mid]) return mid;
    else if(x < a[mid]) high = mid - 1;
    else low = mid + 1;
  }
  return -1;
}

int linear_search(const vector<int> &a, int x) {
  for (size_t i = 0; i < a.size(); ++i) {
    if (a[i] == x) return i;
  }
  return -1;
}

int main() {
  int n;
  std::cin >> n;
  vector<int> a(n);
  for (size_t i = 0; i < a.size(); i++) {
    std::cin >> a[i];
  }
  int m;
  std::cin >> m;
  vector<int> b(m);
  for (int i = 0; i < m; ++i) {
    std::cin >> b[i];
  }
  for (int i = 0; i < m; ++i) {
    std::cout << binary_search(a, b[i]) << ' ';
  }
}

The above are the basic implementations of linear search and binary search; both are relatively simple.

The basic idea of binary search is: first make sure the array to be searched is sorted, then repeatedly compare the middle element with the search value to decide which direction the search should move:

If the middle value is less than the search value, only the right half needs to be searched;

If the middle value is greater than the search value, only the left half needs to be searched;

Iterating in this way, we eventually find the element we are looking for (or conclude it is not present).
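
Since this section is about divide and conquer, the same idea can also be written recursively. Here is a minimal recursive sketch (the helper binary_search_rec is my own illustration, not part of the program above; it reuses the same includes):

// Recursive formulation of binary search (illustrative sketch, not the original code).
int binary_search_rec(const vector<int> &a, int x, int low, int high) {
  if (low > high) return -1;            // empty range: x is not present
  int mid = low + (high - low) / 2;
  if (a[mid] == x) return mid;
  if (x < a[mid])
    return binary_search_rec(a, x, low, mid - 1);  // recurse into the left half
  return binary_search_rec(a, x, mid + 1, high);   // recurse into the right half
}

Calling binary_search_rec(a, x, 0, (int)a.size() - 1) gives the same result as the iterative version.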


Q2: Majority Elements


The majority element problem asks: among n elements, find an element that occurs more than half of the time (strictly more than n/2 times); output 1 if such an element exists and 0 if it does not.

If you can write code in any programming language, you can probably see that counting the number of occurrences of each element and checking whether any count exceeds n/2 solves the problem. But that is O(n^2); can we come up with a faster way?
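
As a baseline, here is a minimal sketch of that naive counting approach (the function name is illustrative and it is not part of the solution below):

// Naive O(n^2) majority check: count each element's occurrences directly.
int naive_majority(const vector<int> &a) {
  for (size_t i = 0; i < a.size(); ++i) {
    int count = 0;
    for (size_t j = 0; j < a.size(); ++j)
      if (a[j] == a[i]) ++count;
    if (count > (int)a.size() / 2) return a[i]; // occurs more than n/2 times
  }
  return -1; // no majority element
}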

The answer is to use divide and conquer:

We split the array into left and right halves. If an element is the majority element of the whole array, it must also be the majority element of the left half or of the right half; otherwise its count could not exceed n/2. Suppose we have already obtained the majority element of each half (returning -1 if a half has none). To merge, we scan the whole range and count how many times the left half's candidate and the right half's candidate occur; whichever count exceeds half of the whole range is returned, and -1 is returned if neither does.

The final output just checks whether a majority element exists.

Here is the code:

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

int get_majority_element(vector<int> &a, int left, int right) { // half-open interval [left, right)
  if (left == right) return -1; // empty range: no element
  if (left + 1 == right) return a[left]; // only one element
  int mid = (left + right) / 2;
  int low = get_majority_element(a, left, mid);   // majority candidate of the left half
  int high = get_majority_element(a, mid, right); // majority candidate of the right half
  int m = 0, n = 0;
  for(int i = left; i < right; i++){
    if(a[i] == low) m++;
    if(a[i] == high) n++;
  }
  if(m > (right - left)/2) return low;  // whichever candidate occurs in more than half of [left, right) wins
  if(n > (right - left)/2) return high;
  return -1;
}

int main() {
  int n;
  std::cin >> n;
  vector<int> a(n);
  for (size_t i = 0; i < a.size(); ++i) {
    std::cin >> a[i];
  }
  std::cout << (get_majority_element(a, 0, a.size()) != -1) << '\n';
}

Q3: Improving Quick Sort


Quick sort is usually the most commonly used sorting algorithm, but the standard version has a real weakness: it handles arrays with many duplicate elements poorly. This puzzled me for a long time when I was learning quick sort, so let me explain why. In the usual partition step we take the first element of the range as the pivot and try to place it so that every element before it is smaller and every element after it is larger. But if some element from the second position onward is equal to the pivot, where should it go? Remember, the partition is computing the pivot's final absolute position; an element equal to the pivot belongs neither strictly before nor strictly after it, so a two-way partition has no well-defined place for it, and with many duplicates the splits become unbalanced and the sort slows down badly.

Of course there is a solution: three-way partitioning. Originally we split the array into two parts, everything on the left smaller than the pivot and everything on the right larger. With three-way partitioning the left part is smaller, the middle part is equal, and the right part is larger. A two-way partition returns a single value, the final position of the pivot; a three-way partition has to return two values, marking the boundary between the left and middle parts and the boundary between the middle and right parts, so that the equal block in the middle is never touched again and its data is not disturbed. To return two values we simply use a struct.

An ordinary quick sort partition works like this: two indices, first and last, start from the two ends of the range.

Scan from last towards the front (last--) until the first value a[last] that is less than the key is found, and move it into the hole at a[first];

Scan from first towards the back (first++) until the first value a[first] that is greater than the key is found, and move it into the hole at a[last]:
int partition(vector<int> &a, int first, int last) {
    int key = a[first];              // take the first element as the pivot
    while (first < last) {
        while (first < last && a[last] >= key)
            --last;
        a[first] = a[last];          // fill the hole on the left with the smaller element

        while (first < last && a[first] <= key)
            ++first;
        a[last] = a[first];          // fill the hole on the right with the larger element
    }
    a[first] = key;                  // the pivot lands in its final position
    return first;
}

Now we keep two variables m1 and m2 that record the end of the left (smaller) block and the end of the middle (equal) block respectively (both start at the left end of the range). When we find an element smaller than the key, we shift the middle and right blocks back by one position as a whole and insert the element at m1; note that we cannot simply swap it with the element at m1, because that would scramble the arrangement of the middle and right blocks. When we find an element equal to the key, we advance m2 and swap the element directly into position m2.

Here is the code:

#include <iostream>
#include <vector>
#include <cstdlib>

using namespace std;

struct div3{
  int m1;
  int m2;
};

struct div3 partition3(vector<int> &a, int l, int r) { // after the call: a[l..m1-1] < pivot, a[m1..m2] == pivot, a[m2+1..r] > pivot
  int x = a[l];
  int m1 = l;
  int m2 = l;
  for (int i = l + 1; i <= r; i++) {
    if (a[i] < x) {
      m1++;
      m2++;
      int k = a[i];
      for(int j = i; j > m1 ;j--) a[j] = a[j-1];
      a[m1] = k;
    }
    else if (a[i] == x) {
      m2++;
      swap(a[i], a[m2]);
    }
  }
  swap(a[l], a[m1]);
  struct div3 m;
  m.m1 = m1;
  m.m2 = m2;
  return m;
}


void randomized_quick_sort(vector<int> &a, int l, int r) {
  if (l >= r) {
    return;
  }

  int k = l + rand() % (r - l + 1);
  swap(a[l], a[k]);
  struct div3 m = partition3(a, l, r);

  randomized_quick_sort(a, l, m.m1 - 1);
  randomized_quick_sort(a, m.m2 + 1, r);
}

int main() {
  int n;
  std::cin >> n;
  vector<int> a(n);
  for (size_t i = 0; i < a.size(); ++i) {
    std::cin >> a[i];
  }
  randomized_quick_sort(a, 0, (int)a.size() - 1);
  for (size_t i = 0; i < a.size(); ++i) {
    std::cout << a[i] << ' ';
  }
}


Q4: Number of Inversions

The problem is simple to state: count the number of pairs (a_i, a_j) with i < j and a_i greater than a_j.

Of course, an O(n^2) solution is very easy to write.
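
For reference, a minimal sketch of that O(n^2) count (illustrative only; the divide and conquer solution follows below):

// Naive O(n^2) inversion count: check every pair (i, j) with i < j.
long long naive_inversions(const vector<int> &a) {
  long long count = 0;
  for (size_t i = 0; i < a.size(); ++i)
    for (size_t j = i + 1; j < a.size(); ++j)
      if (a[i] > a[j]) ++count;
  return count;
}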

Now we want to use divide and conquer to reduce the time complexity.

The core of the divide and conquer approach here is how to merge.

Suppose we already know how many inversion pairs lie entirely within the left half and how many lie entirely within the right half.

Besides adding those two counts to the total, we still need to count the pairs whose larger element is in the left half and whose smaller element is in the right half.

So how do we merge?

We can borrow from merge sort: when the left and right halves are each sorted, we can merge them into one sorted array (this is exactly the merge step of merge sort). During the merge, whenever the next element is taken from the right half because it is smaller than the current element of the left half, every element still remaining in the left half is larger than it, so all of those pairs are inversions and can be counted in one step. My approach is therefore to count inversions while doing the merge of a merge sort. For example, merging [2, 5] with [1, 3]: when 1 is taken from the right half, both remaining left elements (2 and 5) form inversions with it, adding 2 to the count.

#include <iostream>
#include <vector>

using namespace std;

long long merge(vector<int> &a, vector<int> &b, size_t left, size_t ave, size_t right) {
  size_t i = left, j = ave, k = left;
  long long count = 0;
  while((i < ave) && (j < right)){
    if(a[i]<=a[j]) b[k++] = a[i++];
    else{
      b[k++] = a[j++];
      count += ave - i; // the crucial step: every element still left in a[i..ave-1] is larger than the element just taken
    }
  }

  while(i < ave) b[k++] = a[i++];
  while(j < right) b[k++] = a[j++];
  for(size_t l = left; l < right; l++){
    a[l] = b[l]; // b is the auxiliary array holding the merged result
  }
  return count;
}

long long get_number_of_inversions(vector<int> &a, vector<int> &b, size_t left, size_t right) {
  long long number_of_inversions = 0;
  if (right <= left + 1) return number_of_inversions;
  size_t ave = left + (right - left) / 2;
  number_of_inversions += get_number_of_inversions(a, b, left, ave);
  number_of_inversions += get_number_of_inversions(a, b, ave, right);
  number_of_inversions += merge(a, b,left, ave, right);
  return number_of_inversions;
}

int main() {
  int n;
  std::cin >> n;
  vector<int> a(n);
  for (size_t i = 0; i < a.size(); i++) {
    std::cin >> a[i];
  }
  vector<int> b(a.size());
  std::cout << get_number_of_inversions(a, b, 0, a.size()) << '\n';
}

Q5: Organizing a Lottery


Briefly, the problem gives you a set of intervals and some points, and asks, for each point, how many intervals contain it.

A very naive algorithm is to count slowly:

vector<int> naive_count_segments(vector<int> starts, vector<int> ends, vector<int> points) {
  vector<int> cnt(points.size());
  for (size_t i = 0; i < points.size(); i++) {
    for (size_t j = 0; j < starts.size(); j++) {
      cnt[i] += starts[j] <= points[i] && points[i] <= ends[j];
    }
  }
  return cnt;
}

Or we can find a way to reduce the time complexity.

We can sort the left endpoints of the intervals on their own,

and the right endpoints on their own (the sorting contributes O(n log n) to the total time).

For each query point, find the number of left endpoints that are less than or equal to it, call it l (using binary search, O(log n) per point);

find the number of right endpoints that are greater than or equal to it, call it r;

and let the total number of intervals be n.

Then the number of intervals containing the point is l + r - n.

To see why: an interval fails to contain the point either because its left endpoint is greater than the point (there are n - l of these) or because its right endpoint is less than the point (n - r of these), and no interval can fail for both reasons at once; so the number of intervals containing the point is n - (n - l) - (n - r) = l + r - n. Try a few small examples to convince yourself.

Here is the code:

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int cl_binary_search(const vector<int> &a, int x) { // number of elements of the sorted array a that are <= x
  int low = 0, high = (int)a.size()-1;
  int mid;
  if(a[high] <= x) return a.size();
  if(a[low] > x) return 0;
  while(low <= high){
    mid = (low + high) / 2;
    if(x == a[mid]) {
      while(x==a[mid]) mid++;
      return mid;
    }
    else if(x < a[mid]) high = mid - 1;
    else low = mid + 1;
  }
  if(a[mid]<x) return mid+1;
  else return mid;
}

int cr_binary_search(const vector<int> &a, int x) { // number of elements of the sorted array a that are >= x
  int low = 0, high = (int)a.size()-1;
  int mid;
  if(a[high] < x) return 0;
  if(a[low] >= x) return a.size();
  while(low <= high){
    mid = (low + high) / 2;
    if(x == a[mid]) {
      while(x==a[mid]) mid--;
      return a.size()-(mid + 1);
    }
    else if(x < a[mid]) high = mid - 1;
    else low = mid + 1;
  }
  if(a[mid]<x) return a.size()-(mid+1);
  else return a.size()-mid;
}

vector<int> fast_count_segments(vector<int> starts, vector<int> ends, vector<int> points) {
  vector<int> cnt(points.size());
  sort(starts.begin(),starts.end());
  sort(ends.begin(),ends.end());
  for(size_t i = 0; i < points.size(); i++){
    int l = cl_binary_search(starts, points[i]);
    int r = cr_binary_search(ends, points[i]);
    int n = ends.size();
    cnt[i] = l + r - n;
  }
  return cnt;
}

int main() {
  int n, m;
  std::cin >> n >> m;
  vector<int> starts(n), ends(n);
  for (size_t i = 0; i < starts.size(); i++) {
    std::cin >> starts[i] >> ends[i];
  }
  vector<int> points(m);
  for (size_t i = 0; i < points.size(); i++) {
    std::cin >> points[i];
  }
  //use fast_count_segments
  vector<int> cnt = fast_count_segments(starts, ends, points);
  for (size_t i = 0; i < cnt.size(); i++) {
    std::cout << cnt[i] << ' ';
  }
}
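
For comparison, the two hand-written binary searches compute the same quantities as std::upper_bound and std::lower_bound, so the fast counting function can also be sketched with the standard library. This is an alternative sketch of my own, not the code above; it reuses the same includes:

// Alternative sketch of fast_count_segments using the standard library.
vector<int> fast_count_segments_std(vector<int> starts, vector<int> ends, vector<int> points) {
  sort(starts.begin(), starts.end());
  sort(ends.begin(), ends.end());
  vector<int> cnt(points.size());
  int n = starts.size();
  for (size_t i = 0; i < points.size(); ++i) {
    // l = number of starts <= points[i], r = number of ends >= points[i]
    int l = upper_bound(starts.begin(), starts.end(), points[i]) - starts.begin();
    int r = ends.end() - lower_bound(ends.begin(), ends.end(), points[i]);
    cnt[i] = l + r - n;
  }
  return cnt;
}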


Q6: Closest Points


This is a very, very brain-burning problem among divide and conquer algorithms.

Following the usual divide and conquer idea, we split the points into two halves by x coordinate. Suppose we have already computed the closest pair on the left (distance d1) and the closest pair on the right (distance d2); how do we merge?

We need to check whether some point on the left and some point on the right are closer to each other than both d1 and d2.


First, put all points whose horizontal distance to the dividing line is less than d (where d = min(d1, d2)) into an array; this is a vertical strip of width 2d around the dividing line.


Points outside this strip can be discarded: a point on the left that is at least d away from the dividing line cannot be within distance d of any point on the right.

Now we have the points inside the strip, stored in an array a, and we need to find the closest pair among them. We could compare every pair in the strip, but that could take a long time; we want something cheaper.

Then sort the points in the strip by their y value. For each a[i], we only need to look at the points whose y value is within d of it; a pigeonhole (packing) argument shows there are at most a handful of such points (the classic bound is about 6), so scanning the strip is linear. Since the strip is re-sorted by y at every level of the recursion, the overall running time is O(n log^2 n); pre-sorting all points by y once would bring it down to O(n log n).


Here is the code:

#include <algorithm>
#include <iostream>
#include <sstream>
#include <iomanip>
#include <vector>
#include <string>
#include <cmath>
#include <stdlib.h>

using namespace std;

struct point{
  int x;
  int y;
};

int cmp(const void *a, const void *b){
  return (*(const struct point *)a).x - (*(const struct point *)b).x;
}

int cmpy(const void *a, const void *b){
  return (*(const struct point *)a).y - (*(const struct point *)b).y;
}

double dis(point p1, point p2){
  double d = sqrt((p1.x - p2.x) * (p1.x - p2.x) + (p1.y - p2.y) * (p1.y - p2.y));
  return d;
}



double minimal_distance(point p[], int low, int high) {
  // p must be sorted by x; returns the smallest pairwise distance among p[low..high]
  if(low == high - 1) return dis(p[low],p[high]); // only two points: return their distance directly
  if(low == high - 2) return min(min(dis(p[low],p[high]),dis(p[low],p[low+1])),dis(p[high-1],p[high])); // three points: brute force
  int mid = (low + high) / 2;
  double left = minimal_distance(p, low, mid);
  double right = minimal_distance(p, mid+1, high);
  double d;
  d = min(left,right);
  if(d==0) return 0;
  
  vector<point> a; // points whose x-distance to the dividing line is below d (the strip)
  for(int i = low; i <= high; i++) {
    if(abs(p[i].x - p[mid].x) < d) a.push_back(p[i]);
  }
  int n = a.size();

  if(n < 2) return d;

  qsort(&a[0], n, sizeof(point), cmpy); // sort the strip by y

  double mindis = dis(a[0],a[1]);
  for(int i = 0; i < n; i++){
    for(int j = i+1; j < n && (a[j].y-a[i].y)<mindis; j++){
      if(dis(a[i],a[j]) < mindis) mindis = dis(a[i],a[j]);
    }
  }

  if(d > mindis) return mindis;
  else return d;
}

int main() {
  size_t n;
  std::cin >> n;
  point p[n];
  for (size_t i = 0; i < n; i++) {
    std::cin >> p[i].x >> p[i].y;
  }
  qsort(p,n,sizeof(point),cmp);
  std::cout << std::fixed;
  std::cout << std::setprecision(9) << minimal_distance(p, 0, n-1) << "\n";
}

tips:

A note on sorting algorithms: writing one yourself usually takes some time, and the ones that come with C++ generally perform better than hand-written code.

There are two options: sort and qsort.

sort is the function provided by <algorithm> in C++, so it is used very widely in C++ code; its strength is that it integrates well with the STL, and many containers can be sorted directly with sort.

qsort is the function from <stdlib.h> in C, and it operates on plain C arrays.

In my own measurements, qsort was more than twice as fast as sort; in online programming tests, some programs that timed out with sort passed with qsort.

So what if you want to use qsort on a C++ container?

qsort(&p[0], size, sizeof(...), cmp);

You can call qsort in this form, passing the address of the container's first element.

At the same time, the comparison function has a very strict required signature:

int cmp(const void *p1, const void *p2){
	int a = (*(const point *)p1).x;
	int b = (*(const point *)p2).x;
	return a - b;
}

This is the cmp function used to sort the point struct in the sixth question.

The exact signature requirements are easy to look up.

Of course, because of the extra overhead a container such as vector carries, using qsort on a container is generally not as fast as using qsort on a plain array.
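
Putting the pieces together, here is a minimal self-contained example (the data values are made up for illustration) showing both calls on a vector of the point struct from the sixth question:

#include <algorithm>
#include <cstdlib>
#include <vector>

using namespace std;

struct point { int x; int y; };

int cmp(const void *p1, const void *p2) {
  return (*(const point *)p1).x - (*(const point *)p2).x;
}

int main() {
  vector<point> p = {{3, 0}, {1, 2}, {2, 1}};
  qsort(&p[0], p.size(), sizeof(point), cmp);                     // C-style qsort on the container's buffer
  sort(p.begin(), p.end(),
       [](const point &a, const point &b) { return a.x < b.x; }); // C++ sort with a comparator
  return 0;
}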


