The beauty of data structures and algorithms (binary search)

Binary search

1. What is binary search?

Binary search works on a sorted data set. At each step it compares the target with the middle element of the current interval and shrinks the interval to half its previous size, until the target is found or the interval becomes empty.

For example, let's play a number guessing game. I pick a number between 0 and 99 and you try to guess it. After each guess I tell you whether it was too big or too small, until you guess right.
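
The guessing game can be sketched in a few lines of Java. This is only an illustration: the class name GuessingGame and the method countGuesses are made up for this example, and the "too big / too small" feedback is simulated by comparing against the secret directly.

```java
class GuessingGame {
    /**
     * Guesses a secret number in [lo, hi] using only "too small / too big"
     * feedback and returns how many guesses were needed (-1 if the secret
     * is outside the range).
     */
    static int countGuesses(int secret, int lo, int hi) {
        int guesses = 0;
        while (lo <= hi) {
            int guess = lo + (hi - lo) / 2; // always guess the middle of the range
            guesses++;
            if (guess == secret) {
                return guesses;
            } else if (guess < secret) {
                lo = guess + 1; // "too small": discard the lower half
            } else {
                hi = guess - 1; // "too big": discard the upper half
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int worst = 0;
        for (int secret = 0; secret <= 99; secret++) {
            worst = Math.max(worst, countGuesses(secret, 0, 99));
        }
        System.out.println("Worst case for 0..99: " + worst + " guesses");
    }
}
```

Whatever number is chosen between 0 and 99, this strategy needs at most 7 guesses.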

2. Time complexity analysis

1. Time complexity

Binary search is a very efficient search algorithm. How efficient is it? Let's analyze its time complexity.

Assume the data size is n. After each comparison the search interval shrinks to half its previous size, and in the worst case the search does not stop until the interval is empty. The sizes of the successive intervals are therefore n, n/2, n/4, ..., n/(2^k), ..., which is a geometric sequence. When n/(2^k) = 1, k is the total number of halvings. Each halving only compares two values, so after k halvings the time complexity is O(k). Solving n/(2^k) = 1 gives k = log2(n), so the time complexity is O(logn).
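
To make the derivation concrete, here is a small sketch that counts how many times a size n can be halved before it reaches 1. The class HalvingSteps and the method halvings are names invented for this illustration; with integer division the result is log2(n) rounded down.

```java
class HalvingSteps {
    /** Counts how many times n can be halved before the size reaches 1. */
    static int halvings(long n) {
        int k = 0;
        while (n > 1) {
            n /= 2; // one binary search step halves the interval
            k++;
        }
        return k;
    }

    public static void main(String[] args) {
        long[] sizes = {1_000L, 1_000_000L, 1L << 32};
        for (long n : sizes) {
            System.out.println("n = " + n + " -> k = " + halvings(n));
        }
    }
}
```

For n = 1000 this gives k = 9, for one million k = 19, and for 2^32 k = 32.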

2. Understanding O(logn)

① This is an extremely efficient time complexity, sometimes even more efficient than a constant-time O(1) algorithm. Why?
② Because logn grows extremely slowly: even when n is very large, logn stays small. For example, when n is 2^32 (about 4.3 billion), log2(n) is only 32.
③ Big O notation drops constant factors, so an algorithm that always takes 1000 or 10000 steps is still written O(1). An O(logn) search that needs only about 32 steps can therefore be much faster than such an "O(1000)" or "O(10000)" algorithm.

If you are not familiar with time complexity, see my earlier blog post, which is devoted to time complexity analysis.

3. How to implement binary search?

1. Non-recursive (loop implementation)

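public int bsearch(int[] a, int n, int value) {
  int low = 0;
  int high = n - 1;

  while (low <= high) {
    int mid = (low + high) / 2; // may overflow for very large indices (see "The value of mid" below)
    if (a[mid] == value) {
      return mid;               // found
    } else if (a[mid] < value) {
      low = mid + 1;            // target can only be in the right half
    } else {
      high = mid - 1;           // target can only be in the left half
    }
  }

  return -1;                    // not found
}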

2. Recursive implementation

// Recursive implementation of binary search
public int bsearch(int[] a, int n, int val) {
  return bsearchInternally(a, 0, n - 1, val);
}

private int bsearchInternally(int[] a, int low, int high, int value) {
  if (low > high) return -1; // empty interval: not found

  int mid = low + ((high - low) >> 1);
  if (a[mid] == value) {
    return mid;
  } else if (a[mid] < value) {
    return bsearchInternally(a, mid + 1, high, value); // search the right half
  } else {
    return bsearchInternally(a, low, mid - 1, value);  // search the left half
  }
}

Precautions

1. Loop exit condition

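The loop condition is low<=high, not low<high. With low<high the loop exits while one element (the one at low == high) is still unchecked, so a target sitting at that position is missed.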
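
A small sketch of why the condition matters. The class LoopConditionDemo and its boolean flag are made up purely for this illustration; the flag switches between the correct and the buggy loop condition.

```java
class LoopConditionDemo {
    /** Binary search with a selectable loop condition, for illustration only. */
    static int search(int[] a, int value, boolean inclusive) {
        int low = 0;
        int high = a.length - 1;
        // inclusive == true  -> low <= high (correct)
        // inclusive == false -> low <  high (buggy: skips a one-element interval)
        while (inclusive ? low <= high : low < high) {
            int mid = low + ((high - low) >> 1);
            if (a[mid] == value) {
                return mid;
            } else if (a[mid] < value) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5};
        System.out.println(search(a, 5, true));  // 2
        System.out.println(search(a, 5, false)); // -1, the last element is never checked
    }
}
```

With the array {1, 3, 5} and target 5, the first step moves low to 2. At that point low == high, so the buggy version stops before looking at a[2].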

2. The value of mid

Writing mid=(low+high)/2 is actually problematic: if low and high are both large, their sum may overflow. An improved form is low+(high-low)/2. To go one step further, the division by 2 can be written as a bit shift: low+((high-low)>>1). Be sure to keep the inner parentheses, because >> has lower precedence than + and the expression would otherwise be evaluated in the wrong order. A shift is never slower than a division, although many compilers already perform this optimization on their own.
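
The overflow and the precedence trap can both be reproduced with plain int arithmetic. The sketch below is only an illustration; the class MidOverflowDemo and its three helper methods are names invented here.

```java
class MidOverflowDemo {
    /** (low + high) / 2: the sum can exceed Integer.MAX_VALUE and wrap around. */
    static int naiveMid(int low, int high) {
        return (low + high) / 2;
    }

    /** low + ((high - low) >> 1): never overflows when 0 <= low <= high. */
    static int safeMid(int low, int high) {
        return low + ((high - low) >> 1);
    }

    /** Missing inner parentheses: parsed as (low + (high - low)) >> 1, i.e. high >> 1. */
    static int missingParens(int low, int high) {
        return low + (high - low) >> 1;
    }

    public static void main(String[] args) {
        int low = 1_500_000_000;
        int high = 2_000_000_000;
        System.out.println(naiveMid(low, high));      // -397483648
        System.out.println(safeMid(low, high));       // 1750000000
        System.out.println(missingParens(low, high)); // 1000000000
    }
}
```

The naive form produces a negative index, which would throw an ArrayIndexOutOfBoundsException when used as a[mid]. The version without the inner parentheses silently computes high/2, which is not even inside [low, high].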

3. Update of low and high

The updates are low=mid+1 and high=mid-1. Pay attention to the +1 and -1. If you write low=mid or high=mid instead, an infinite loop may occur. For example, when low=3 and high=3, mid is also 3; if a[3] is not equal to value, neither bound changes and the loop never exits.
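
The stuck interval can be observed safely by capping the number of iterations. In this sketch the class StuckIntervalDemo, the method stepsWithBuggyUpdate and the maxSteps cap are all assumptions added for the demonstration; a real buggy loop would simply never return.

```java
class StuckIntervalDemo {
    /**
     * Runs a buggy search that updates with low = mid and high = mid,
     * starting from the given interval. Returns the number of iterations
     * executed; the loop is cut off at maxSteps so the demo terminates.
     */
    static int stepsWithBuggyUpdate(int[] a, int low, int high, int value, int maxSteps) {
        int steps = 0;
        while (low <= high && steps < maxSteps) {
            steps++;
            int mid = low + ((high - low) >> 1);
            if (a[mid] == value) {
                return steps;
            } else if (a[mid] < value) {
                low = mid;  // bug: should be mid + 1
            } else {
                high = mid; // bug: should be mid - 1
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 7};
        // low = high = 3 and a[3] != 8: the interval never shrinks.
        System.out.println(stepsWithBuggyUpdate(a, 3, 3, 8, 100)); // 100 (hit the cap)
    }
}
```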


4. Conditions of use (limitation of application scenarios)

The time complexity of binary search is O(logn), so lookups are very efficient.

However, binary search cannot be used everywhere; its application scenarios are quite limited. So when is it suitable, and when is it not?

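1. Sequential list structure

Binary search relies on a sequential list, that is, an array, because it needs to reach the middle element by index in O(1) time. On a linked list, reaching the middle element requires a traversal, so the O(logn) bound no longer holds.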

2. Ordered data

Binary search requires sorted data, and keeping data sorted is costly when it changes often. It is therefore best suited to scenarios where insertions and deletions are infrequent: sort once, then search many times.

3. The amount of data is too small

If the amount of data is very small, binary search is not worthwhile, because the speedup over a simple traversal is negligible. There is one exception: when comparing two elements is expensive. For example, if the array stores strings that are all more than 300 characters long, binary search is still preferable because it minimizes the number of comparisons.
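
As a rough illustration of the exception, the sketch below counts string comparisons for a linear scan and for binary search on only 16 long strings. Everything here is made up for the example: the class ComparisonCountDemo, its methods, and the test data (16 strings sharing a 300-character prefix).

```java
class ComparisonCountDemo {
    /** Number of string comparisons a linear scan performs before finding key. */
    static int linearComparisons(String[] a, String key) {
        int count = 0;
        for (String s : a) {
            count++;
            if (s.equals(key)) {
                break;
            }
        }
        return count;
    }

    /** Number of string comparisons binary search performs before finding key. */
    static int binaryComparisons(String[] a, String key) {
        int count = 0;
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            int mid = low + ((high - low) >> 1);
            count++;
            int cmp = a[mid].compareTo(key); // one potentially expensive comparison
            if (cmp == 0) {
                break;
            } else if (cmp < 0) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return count;
    }

    /** Builds n (at most 100) sorted strings that share a 300-character prefix. */
    static String[] longSortedStrings(int n) {
        StringBuilder prefix = new StringBuilder();
        for (int i = 0; i < 300; i++) {
            prefix.append('x');
        }
        String[] a = new String[n];
        for (int i = 0; i < n; i++) {
            a[i] = prefix + String.format("%02d", i);
        }
        return a;
    }

    public static void main(String[] args) {
        String[] a = longSortedStrings(16);
        String key = a[15]; // worst case for the linear scan
        System.out.println("linear: " + linearComparisons(a, key)); // 16
        System.out.println("binary: " + binaryComparisons(a, key)); // 5
    }
}
```

Even with 16 elements, the worst case drops from 16 comparisons to 5, and each comparison here has to walk through more than 300 characters.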

4. Too much data

If the amount of data is too large, binary search is also unsuitable, because an array needs contiguous memory. For very large data sets it is often impossible to find a single contiguous block of memory big enough to hold everything.

5. Four common binary search variants

1. Find the first element equal to the given value

If we are looking for any element equal to the given value, then as soon as a[mid] equals the value, a[mid] is the answer. But if we want the first element equal to the given value, then when a[mid] equals the value we still need to check whether a[mid] is the first such element.

public int bsearch(int[] a, int n, int value) {
  int low = 0;
  int high = n - 1;
  while (low <= high) {
    int mid = low + ((high - low) >> 1);
    if (a[mid] > value) {
      high = mid - 1;
    } else if (a[mid] < value) {
      low = mid + 1;
    } else {
      // Example: in {0 1 2 3 4 5 **5** 5 6 7 8 9 10}, the starred 5 is the one
      // hit first when searching for 5, but it is not the first 5 in the array.
      // If mid is index 0 or the previous element differs, mid is the first match.
      if ((mid == 0) || (a[mid - 1] != value)) return mid;
      // Otherwise the first match lies further left, so keep searching there.
      else high = mid - 1;
    }
  }
  return -1;
}

2. Find the last element equal to the given value

public int bsearch(int[] a, int n, int value) {
  int low = 0;
  int high = n - 1;
  while (low <= high) {
    int mid = low + ((high - low) >> 1);
    if (a[mid] > value) {
      high = mid - 1;
    } else if (a[mid] < value) {
      low = mid + 1;
    } else {
      if ((mid == n - 1) || (a[mid + 1] != value)) return mid; // mid is the last match
      else low = mid + 1;                                      // a later match exists
    }
  }
  return -1;
}

Let's focus on line 11 of the code, the check in the last else branch. If a[mid] is already the last element of the array, it must be the one we are looking for. If the next element a[mid+1] is not equal to value, then a[mid] is likewise the last element equal to the given value.

If instead a[mid+1] is also equal to value, then the current a[mid] is not the last element equal to the given value. We update low=mid+1, because the element we are looking for must lie in [mid+1, high].
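
To try the first two variants side by side, here is a self-contained sketch that runs both on the array with three 5s used in the code comment of variant 1. The class name FirstLastEqualDemo and the method names firstEqual and lastEqual are chosen for this example only; the logic follows the two listings above.

```java
class FirstLastEqualDemo {
    /** Index of the first element equal to value, or -1 if absent. */
    static int firstEqual(int[] a, int value) {
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            int mid = low + ((high - low) >> 1);
            if (a[mid] > value) {
                high = mid - 1;
            } else if (a[mid] < value) {
                low = mid + 1;
            } else if (mid == 0 || a[mid - 1] != value) {
                return mid;     // nothing equal to the left
            } else {
                high = mid - 1; // an earlier match exists
            }
        }
        return -1;
    }

    /** Index of the last element equal to value, or -1 if absent. */
    static int lastEqual(int[] a, int value) {
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            int mid = low + ((high - low) >> 1);
            if (a[mid] > value) {
                high = mid - 1;
            } else if (a[mid] < value) {
                low = mid + 1;
            } else if (mid == a.length - 1 || a[mid + 1] != value) {
                return mid;     // nothing equal to the right
            } else {
                low = mid + 1;  // a later match exists
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {0, 1, 2, 3, 4, 5, 5, 5, 6, 7, 8, 9, 10};
        System.out.println(firstEqual(a, 5)); // 5
        System.out.println(lastEqual(a, 5));  // 7
    }
}
```

The first probe lands on index 6 in both cases; the extra neighbor check is what pushes the search left to index 5 or right to index 7.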

3. Find the first element greater than or equal to the given value

Now let's look at another class of variants: in a sorted array, find the first element greater than or equal to the given value. For example, given the sequence 3, 4, 6, 7, 10 stored in an array, the first element greater than or equal to 5 is 6.

public int bsearch(int[] a, int n, int value) {
  int low = 0;
  int high = n - 1;
  while (low <= high) {
    int mid = low + ((high - low) >> 1);
    if (a[mid] >= value) {
      if ((mid == 0) || (a[mid - 1] < value)) return mid; // mid is the first element >= value
      else high = mid - 1;                                // an earlier candidate exists
    } else {
      low = mid + 1;
    }
  }
  return -1;
}

If a[mid] is less than the value, the element we are looking for must lie in [mid+1, high], so we update low=mid+1.

When a[mid] is greater than or equal to the given value, we first check whether a[mid] is the first such element. If there is no element before a[mid], or the previous element is less than the value, then a[mid] is the element we are looking for. This logic corresponds to line 7 of the code.

If a[mid-1] is also greater than or equal to the value, the element we are looking for lies in [low, mid-1], so we update high to mid-1.

4. Find the last element less than or equal to the given value

Now let's look at the last variant of binary search: finding the last element less than or equal to a given value. For example, if the array stores 3, 5, 6, 8, 9, 10, the last element less than or equal to 7 is 6. Doesn't it look similar to the previous one? In fact, the idea behind the implementation is the same.

With the groundwork above you can write it yourself, so I won't analyze it in detail. The code is posted below so you can compare it with your own version.

public int bsearch7(int[] a, int n, int value) {
  int low = 0;
  int high = n - 1;
  while (low <= high) {
    int mid = low + ((high - low) >> 1);
    if (a[mid] > value) {
      high = mid - 1;
    } else {
      if ((mid == n - 1) || (a[mid + 1] > value)) return mid; // mid is the last element <= value
      else low = mid + 1;                                     // a later candidate exists
    }
  }
  return -1;
}
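
The last two variants can be checked against the examples given in the text. This is a self-contained sketch; the class BoundSearchDemo and the method names firstGreaterOrEqual and lastLessOrEqual are made up here, while the logic mirrors the two listings above.

```java
class BoundSearchDemo {
    /** Index of the first element >= value, or -1 if every element is smaller. */
    static int firstGreaterOrEqual(int[] a, int value) {
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            int mid = low + ((high - low) >> 1);
            if (a[mid] >= value) {
                if (mid == 0 || a[mid - 1] < value) return mid;
                high = mid - 1;
            } else {
                low = mid + 1;
            }
        }
        return -1;
    }

    /** Index of the last element <= value, or -1 if every element is larger. */
    static int lastLessOrEqual(int[] a, int value) {
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            int mid = low + ((high - low) >> 1);
            if (a[mid] > value) {
                high = mid - 1;
            } else {
                if (mid == a.length - 1 || a[mid + 1] > value) return mid;
                low = mid + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {3, 4, 6, 7, 10};
        int[] b = {3, 5, 6, 8, 9, 10};
        System.out.println(a[firstGreaterOrEqual(a, 5)]); // 6
        System.out.println(b[lastLessOrEqual(b, 7)]);     // 6
    }
}
```

Note that both methods return an index, not the element itself, and that -1 means no element satisfies the condition.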

6. Applicability Analysis

1. For most problems that binary search can solve, we usually prefer a hash table or a binary search tree. Binary search does use less memory, but memory is rarely that scarce.
2. Binary search for "a value equal to the given value" is actually not used very often. Binary search is better suited to "approximate" search problems, such as the variants described in section 5 above.
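
As an aside on "approximate" searches, the Java standard library method java.util.Arrays.binarySearch already reports where a missing key would be inserted: when the key is absent it returns -(insertion point) - 1. The sketch below uses that to find the first element greater than or equal to a value. The class InsertionPointDemo and its method are hypothetical helpers, and the approach assumes the array has no duplicates, because Arrays.binarySearch does not promise which of several equal elements it returns.

```java
import java.util.Arrays;

class InsertionPointDemo {
    /**
     * Index of the first element >= value in a sorted array WITHOUT duplicates,
     * or -1 if every element is smaller.
     */
    static int firstGreaterOrEqual(int[] sorted, int value) {
        int r = Arrays.binarySearch(sorted, value);
        int index = (r >= 0) ? r : -(r + 1); // decode the insertion point
        return index < sorted.length ? index : -1;
    }

    public static void main(String[] args) {
        int[] a = {3, 4, 6, 7, 10};
        System.out.println(firstGreaterOrEqual(a, 5));  // 2 (the element 6)
        System.out.println(firstGreaterOrEqual(a, 11)); // -1
    }
}
```

When duplicates are possible, a hand-written variant like the ones in section 5 is the safer choice.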

Note:
The binary search variants are tricky to write, and small slips in the details easily cause bugs. The error-prone details include the termination condition, how the interval is updated, and which value is returned.

7. Recommended practice problem

Binary search

https://pintia.cn/problem-sets/15/problems/923

This problem asks you to implement the binary search algorithm.

Function interface definition:

Position BinarySearch( List L, ElementType X );

The List structure is defined as follows:

typedef int Position;
typedef struct LNode *List;
struct LNode {
    ElementType Data[MAXSIZE];
    Position Last; /* position of the last element in the linear list */
};

L is a linear list passed in by the user, whose ElementType elements can be compared with >, == and <. The problem guarantees that the input data is in increasing order. The function BinarySearch must find the position of X in Data, that is, its array subscript (note: elements are stored starting from subscript 1). It returns the subscript if X is found, and the special failure flag NotFound otherwise.

Sample judge test program:

#include <stdio.h>
#include <stdlib.h>

#define MAXSIZE 10
#define NotFound 0
typedef int ElementType;

typedef int Position;
typedef struct LNode *List;
struct LNode {
    ElementType Data[MAXSIZE];
    Position Last; /* position of the last element in the linear list */
};

List ReadInput(); /* implemented by the judge, details omitted; elements are stored from subscript 1 */
Position BinarySearch( List L, ElementType X );

int main()
{
    List L;
    ElementType X;
    Position P;

    L = ReadInput();
    scanf("%d", &X);
    P = BinarySearch( L, X );
    printf("%d\n", P);

    return 0;
}

Input sample 1:

5
12 31 55 89 101
31

Sample output 1:

2

Input sample 2:

3
26 78 233
31

Sample output 2:

0

/* One possible solution. Elements occupy subscripts 1..L->Last. */
Position BinarySearch( List L, ElementType X )
{
    int left = 1, right = L->Last;
    while (left <= right)
    {
        int mid = left + (right - left) / 2;
        if (L->Data[mid] == X)
        {
            return mid;        /* found */
        }
        else if (L->Data[mid] < X)
        {
            left = mid + 1;    /* search the right half */
        }
        else
        {
            right = mid - 1;   /* search the left half */
        }
    }
    return NotFound;
}

Origin blog.csdn.net/qq_54729417/article/details/123370749