Binary search in detail

Almost everyone around me thinks binary search is simple. Is that really true? Not quite. Here is what Knuth (one of the inventors of the KMP algorithm) has to say:

Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly tricky...

In other words: the idea is simple, but the devil is in the details.

This article explores the three most common binary search scenarios: finding a number, finding the left boundary, and finding the right boundary.

Moreover, we will get into the details, such as whether the while loop condition should include the equals sign and whether mid should be incremented by one. By analyzing these differences and the reasons behind them, you will be able to write a correct binary search flexibly and accurately.

First, the binary search framework

int binarySearch(int[] nums, int target) {
    int left = 0, right = ...;

    while(...) {
        int mid = (right + left) / 2;
        if (nums[mid] == target) {
            ...
        } else if (nums[mid] < target) {
            left = ...
        } else if (nums[mid] > target) {
            right = ...
        }
    }
    return ...;
}

A useful technique when analyzing binary search: do not use a bare else. Write every case out explicitly with else if, so that all the details are visible. This article uses else if throughout for clarity; once you understand the code, you can simplify it yourself.

The places marked with ... are where the details can go wrong. When you read a binary search implementation, look at these places first. The examples below analyze how each of them can vary.

One more note: computing mid needs care to avoid integer overflow. The recommended form is mid = left + (right - left) / 2. The code in this article ignores this issue for now.
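The overflow mentioned above can be reproduced with a minimal sketch. The class and method names here are hypothetical, and the two bounds are chosen only so that their sum exceeds Integer.MAX_VALUE:

```java
class MidpointDemo {
    // Naive midpoint: left + right can exceed Integer.MAX_VALUE and wrap around.
    static int naiveMid(int left, int right) {
        return (left + right) / 2;
    }

    // Overflow-safe midpoint: right - left cannot overflow when 0 <= left <= right.
    static int safeMid(int left, int right) {
        return left + (right - left) / 2;
    }

    public static void main(String[] args) {
        int left = 1_500_000_000, right = 2_000_000_000;
        System.out.println(naiveMid(left, right)); // -397483648 (the sum wrapped around)
        System.out.println(safeMid(left, right));  // 1750000000
    }
}
```

For arrays of ordinary size both forms give the same result; the difference only shows up when the indices are very large.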

Second, finding a number (basic binary search)

This is the simplest and probably the most familiar scenario: search for a number and return its index if it is present, otherwise return -1.

int binarySearch(int[] nums, int target) {
    int left = 0;
    int right = nums.length - 1; // note

    while (left <= right) { // note
        int mid = (right + left) / 2;
        if (nums[mid] == target)
            return mid;
        else if (nums[mid] < target)
            left = mid + 1; // note
        else if (nums[mid] > target)
            right = mid - 1; // note
    }
    return -1;
}

1. Why is the while loop condition <= rather than <?

A: Because right is initialized to nums.length - 1, the index of the last element, rather than nums.length.

Both initializations appear in binary search functions, and they behave differently. The former corresponds to the closed interval [left, right], while the latter corresponds to the half-open interval [left, right), because index nums.length is out of bounds.

This algorithm uses the interval [left, right], closed at both ends. This interval is the range searched in each iteration, so we may as well call it the "search space".

When should the search stop? Of course, it can stop as soon as the target is found:

    if(nums[mid] == target)
        return mid; 

But if the target is not found, the while loop has to terminate and the function returns -1. When should the while loop terminate? When the search space is empty: there is nothing left to search, which means the target was not found.

The loop while (left <= right) terminates when left == right + 1. Written as an interval, that is [right + 1, right]; with concrete numbers, [3, 2]. The search space is empty at that point, because no number is both greater than or equal to 3 and less than or equal to 2. So terminating the loop here is correct, and the function can return -1 directly.

The loop while (left < right) terminates when left == right. Written as an interval, that is [right, right]; with concrete numbers, [2, 2]. The search space is not empty at that point: it still contains the number 2, yet the loop has stopped. In other words, the interval [2, 2] is skipped and index 2 is never examined, so returning -1 directly here can be wrong.
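To make the skipped index concrete, here is a minimal sketch. The class name and both method names are hypothetical, chosen only for this illustration; the two methods differ only in their loop condition.

```java
class LoopConditionDemo {
    // Closed interval [left, right] with the matching condition left <= right.
    static int searchClosed(int[] nums, int target) {
        int left = 0, right = nums.length - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                return mid;
            } else if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid - 1;
            }
        }
        return -1;
    }

    // Same interval, but the loop stops while [left, left] is still unsearched.
    static int searchStrictUnpatched(int[] nums, int target) {
        int left = 0, right = nums.length - 1;
        while (left < right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                return mid;
            } else if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid - 1;
            }
        }
        return -1; // may be wrong: nums[left] was never examined
    }

    public static void main(String[] args) {
        int[] nums = {1, 3, 5};
        System.out.println(searchClosed(nums, 5));          // 2
        System.out.println(searchStrictUnpatched(nums, 5)); // -1, although nums[2] == 5
    }
}
```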

Of course, you can still use while (left < right) if you insist. We already know the cause of the error, so a small patch is enough:

//...
while(left < right) {
    // ...
}
return nums[left] == target ? left : -1;
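Assembled into a full function, the patch might look like the sketch below. The class and method names are hypothetical. The empty-array guard is an addition that the fragment above does not show: without it, nums[left] would be read out of bounds when nums is empty.

```java
class StrictLoopSearch {
    static int binarySearchStrict(int[] nums, int target) {
        // Added guard (assumption, not part of the fragment above).
        if (nums.length == 0) return -1;

        int left = 0;
        int right = nums.length - 1;
        while (left < right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                return mid;
            } else if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid - 1;
            }
        }
        // The loop can exit with index left unexamined, so check it here.
        return nums[left] == target ? left : -1;
    }

    public static void main(String[] args) {
        int[] nums = {1, 3, 5, 7, 9};
        System.out.println(binarySearchStrict(nums, 9)); // 4
        System.out.println(binarySearchStrict(nums, 4)); // -1
    }
}
```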

2. Why left = mid + 1 and right = mid - 1? Some code I have seen uses right = mid or left = mid, without the plus or minus one. What is going on, and how do I decide?

A: This is another tricky point of binary search, but once you understand the discussion above, it is easy to decide.

We have just made the concept of the "search space" clear, and in this algorithm the search space is closed at both ends, that is, [left, right]. So when we find that index mid is not the target, how do we choose the next search space?

Naturally, we search [left, mid - 1] or [mid + 1, right]. Since mid has already been examined, it should be removed from the search space.

3. What are the limitations of this algorithm?

A: By now you should know every detail of this algorithm and the reason behind each choice. However, the algorithm has limitations.

For example, given the sorted array nums = [1,2,2,2,3] and target = 2, this algorithm returns index 2, which is correct. But if I want the left boundary of target, which is index 1, or the right boundary of target, which is index 3, this algorithm cannot provide it.

Such requirements are common. You might say: find one index of target, then scan linearly to the left or right. That works, but it is not a good idea, because it no longer guarantees the logarithmic time complexity of binary search.
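A rough way to see the cost of the linear scan is to count its steps on an array where every element equals target. The sketch below is only an illustration; the class name, the method name, and the array size of one million are assumptions made for the example.

```java
import java.util.Arrays;

class ScanLeftDemo {
    // Finds any index of target by binary search, then walks left to the
    // first occurrence and returns how many linear steps that walk took.
    static int stepsToLeftBoundary(int[] nums, int target) {
        int left = 0, right = nums.length - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                int steps = 0;
                while (mid > 0 && nums[mid - 1] == target) {
                    mid--;
                    steps++;
                }
                return steps;
            } else if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid - 1;
            }
        }
        return 0; // target not present, nothing to scan
    }

    public static void main(String[] args) {
        int[] nums = new int[1_000_000];
        Arrays.fill(nums, 7);
        // The first probe hits index 499999, then the scan walks all the way to index 0.
        System.out.println(stepsToLeftBoundary(nums, 7)); // 499999
    }
}
```

About half a million steps for a million elements is linear behavior, whereas a pure binary search on the same array needs roughly twenty iterations.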

The sections below discuss these two binary search variants.

Third, binary search for the left boundary

Let's look at the code directly; the marked lines are the details to watch:

int left_bound(int[] nums, int target) {
    if (nums.length == 0) return -1;
    int left = 0;
    int right = nums.length; // note

    while (left < right) { // note
        int mid = (left + right) / 2;
        if (nums[mid] == target) {
            right = mid;
        } else if (nums[mid] < target) {
            left = mid + 1;
        } else if (nums[mid] > target) {
            right = mid; // note
        }
    }
    return left;
}

1. Why is the condition while (left < right) rather than <=?

A: By the same analysis: right is initialized to nums.length rather than nums.length - 1, so the "search space" in each iteration is [left, right), closed on the left and open on the right.

The loop while (left < right) terminates when left == right. The search space [left, left) is then empty, so terminating is correct.

2. Why is there no return -1? What if target does not exist in nums?

A: Let's take this one step at a time. First, understand what the "left boundary" actually means:

For the array nums = [1,2,2,2,3] with target = 2, the algorithm returns 1. This 1 can be read as: there is 1 element in nums that is less than 2.

As another example, for the sorted array nums = [2,3,5,7] and target = 1, the algorithm returns 0, meaning: 0 elements in nums are less than 1. With target = 8 the algorithm returns 4, meaning: 4 elements in nums are less than 8.
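This reading of the return value can be checked against a plain linear count. In the sketch below, leftBound repeats the loop from left_bound (renamed to Java style, with the overflow-safe midpoint), while countLess and the class name are hypothetical helpers added only for the comparison.

```java
class LeftBoundMeaning {
    // Same loop as left_bound above; returns left without any -1 handling.
    static int leftBound(int[] nums, int target) {
        int left = 0;
        int right = nums.length;
        while (left < right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                right = mid;
            } else if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid;
            }
        }
        return left;
    }

    // Linear reference: how many elements are strictly less than target.
    static int countLess(int[] nums, int target) {
        int count = 0;
        for (int x : nums) {
            if (x < target) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] nums = {2, 3, 5, 7};
        for (int target : new int[]{1, 4, 5, 8}) {
            System.out.println(target + ": " + leftBound(nums, target)
                    + " == " + countLess(nums, target));
        }
    }
}
```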

To summarize, the return value of the function (the value of the variable left) lies in the closed interval [0, nums.length], so adding two lines of code is enough to return -1 when appropriate:

while (left < right) {
    //...
}
// target is greater than every element
if (left == nums.length) return -1;
// handled like the patch in the earlier algorithm
return nums[left] == target ? left : -1;
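Put together, the left boundary search with the two extra lines might look like the following sketch. The class and method names are hypothetical, and the midpoint uses the overflow-safe form recommended earlier.

```java
class LeftBoundOrMinusOne {
    // Returns the leftmost index of target, or -1 if target is absent.
    static int leftBound(int[] nums, int target) {
        if (nums.length == 0) return -1;
        int left = 0;
        int right = nums.length; // search space is [left, right)

        while (left < right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                right = mid; // keep shrinking toward the left
            } else if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid;
            }
        }
        // target is greater than every element
        if (left == nums.length) return -1;
        // otherwise left is in bounds; check that it really holds target
        return nums[left] == target ? left : -1;
    }

    public static void main(String[] args) {
        int[] nums = {1, 2, 2, 2, 3};
        System.out.println(leftBound(nums, 2)); // 1
        System.out.println(leftBound(nums, 4)); // -1
        System.out.println(leftBound(nums, 0)); // -1
    }
}
```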

3. Why left = mid + 1 but right = mid? Isn't this different from the previous algorithm?

A: This is easy to explain. Our "search space" is [left, right), closed on the left and open on the right, so after nums[mid] has been examined, the next search space should exclude mid and become one of two intervals: [left, mid) or [mid + 1, right).

4. Why does this algorithm find the left boundary?

A: The key is how the case nums[mid] == target is handled:

    if (nums[mid] == target)
        right = mid;

As you can see, when target is found the algorithm does not return immediately. Instead it lowers the upper bound right of the "search space" and keeps searching in [left, mid), shrinking toward the left until the left boundary is locked in.

5. Why return left rather than right?

A: It makes no difference, because the while loop terminates when left == right.

Fourth, binary search for the right boundary

The code for the right boundary is similar to the code for the left boundary, with only two differences, which are marked:

int right_bound(int[] nums, int target) {
    if (nums.length == 0) return -1;
    int left = 0, right = nums.length;

    while (left < right) {
        int mid = (left + right) / 2;
        if (nums[mid] == target) {
            left = mid + 1; // note
        } else if (nums[mid] < target) {
            left = mid + 1;
        } else if (nums[mid] > target) {
            right = mid;
        }
    }
    return left - 1; // note
}

1. Why does this algorithm find the right boundary?

A: Similarly, the key is here:

    if (nums[mid] == target) {
        left = mid + 1;

When nums[mid] == target, the algorithm does not return immediately. Instead it raises the lower bound left of the "search space", so the interval keeps shrinking toward the right until the right boundary is locked in.

2. Why does it return left - 1 at the end, instead of return left as in the left boundary function? Also, since we are searching for the right boundary, shouldn't it return right?

A: First, the while loop terminates when left == right, so left and right are the same. If you want to emphasize the "right" side, returning right - 1 works just as well.

As for why one is subtracted, this is a peculiarity of searching for the right boundary. The key is this branch:

    if (nums[mid] == target) {
        left = mid + 1;
        // think of it this way: mid = left - 1

Because left is always updated with left = mid + 1, when the while loop ends nums[left] (if that index exists) is certainly not equal to target, while nums[left - 1] may be target.

As for why left must be updated with left = mid + 1, the reason is the same as in the left boundary search, so it is not repeated here.

3. Why is there no return -1? What if target does not exist in nums?

A: As with the left boundary search, the while loop terminates when left == right, so left lies in the range [0, nums.length]. Adding two lines of code returns -1 correctly:

while (left < right) {
    // ...
}
if (left == 0) return -1;
return nums[left-1] == target ? (left-1) : -1;
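As a sketch, the right boundary search with these two lines added could be assembled as follows. The class and method names are hypothetical, and the midpoint uses the overflow-safe form recommended earlier.

```java
class RightBoundOrMinusOne {
    // Returns the rightmost index of target, or -1 if target is absent.
    static int rightBound(int[] nums, int target) {
        if (nums.length == 0) return -1;
        int left = 0, right = nums.length; // search space is [left, right)

        while (left < right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                left = mid + 1; // keep shrinking toward the right
            } else if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid;
            }
        }
        // left == 0 means every element is greater than target
        if (left == 0) return -1;
        // the candidate is the index just before left
        return nums[left - 1] == target ? (left - 1) : -1;
    }

    public static void main(String[] args) {
        int[] nums = {1, 2, 2, 2, 3};
        System.out.println(rightBound(nums, 2)); // 3
        System.out.println(rightBound(nums, 0)); // -1
        System.out.println(rightBound(nums, 4)); // -1
    }
}
```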

Fifth, summary

Let's first sort out the cause and effect behind these differences in detail:

First, the most basic binary search:

Because we initialize right = nums.length - 1,
the "search space" is [left, right],
which determines while (left <= right)
and also left = mid + 1 and right = mid - 1.

Because we only need to find one index of target,
we can return immediately when nums[mid] == target.

Second, binary search for the left boundary:

Because we initialize right = nums.length,
the "search space" is [left, right),
which determines while (left < right)
and also left = mid + 1 and right = mid.

Because we need to find the leftmost index of target,
we do not return immediately when nums[mid] == target,
but tighten the right bound to lock in the left boundary.

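Third, binary search for the right boundary:

Because we initialize right = nums.length,
the "search space" is [left, right),
which determines while (left < right)
and also left = mid + 1 and right = mid.

Because we need to find the rightmost index of target,
we do not return immediately when nums[mid] == target,
but tighten the left bound to lock in the right boundary.

And because tightening the left bound requires left = mid + 1,
the final result must be reduced by one, whether we return left or right.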
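The three variants can also be compared side by side on the example array nums = [1,2,2,2,3] with target = 2. The sketch below is only an illustration: the class name is hypothetical, the method names are Java-style renamings of the functions above, and the boundary functions are the basic versions without the -1 handling, so they are called here only with a target that is present.

```java
class ThreeVariantsDemo {
    // Basic search: closed interval [left, right], returns any matching index.
    static int binarySearch(int[] nums, int target) {
        int left = 0, right = nums.length - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                return mid;
            } else if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid - 1;
            }
        }
        return -1;
    }

    // Left boundary: half-open interval [left, right), tighten right on a match.
    static int leftBound(int[] nums, int target) {
        int left = 0, right = nums.length;
        while (left < right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                right = mid;
            } else if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid;
            }
        }
        return left;
    }

    // Right boundary: half-open interval [left, right), tighten left on a match.
    static int rightBound(int[] nums, int target) {
        int left = 0, right = nums.length;
        while (left < right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                left = mid + 1;
            } else if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid;
            }
        }
        return left - 1;
    }

    public static void main(String[] args) {
        int[] nums = {1, 2, 2, 2, 3};
        System.out.println(binarySearch(nums, 2)); // 2
        System.out.println(leftBound(nums, 2));    // 1
        System.out.println(rightBound(nums, 2));   // 3
    }
}
```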

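If you understood all of the above, congratulations: the details of binary search are no more than this.

In this article you learned:

1. When analyzing binary search code, do not use a bare else; expand everything into else if so it is easier to follow.

2. Pay attention to the "search space" and the termination condition of the while loop. If an element can be skipped, remember to check it at the end.

3. To search for the left or right boundary, only the handling of nums[mid] == target needs to change. When searching for the right boundary, subtract one from the result.

Even when you meet other binary search variants, these techniques will help you write correct code. LeetCode Explore has a dedicated set of binary search exercises that provides three different code templates; take another look at them now, and you will easily see how each template works.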

Origin www.cnblogs.com/kyoner/p/11080078.html