Do you really know binary search?

Please credit the source when reposting.
I wrote this article because I fell into a pit on a LeetCode problem. Let's take a look at it first.

leetcode 34

Given an array of integers nums sorted in ascending order, find the starting and ending position of a given target value.

Your algorithm’s runtime complexity must be in the order of O(log n).

If the target is not found in the array, return [-1, -1].

Example 1:
Input: nums = [5,7,7,8,8,10], target = 8
Output: [3,4]

Example 2:
Input: nums = [5,7,7,8,8,10], target = 6
Output: [-1,-1]

code

The required complexity is O(log n), so it obviously has to be binary search: binary search to any occurrence of target, then expand to both sides to find the boundaries.

public int[] searchRange(int[] nums, int target) {
    if (nums.length == 0) return new int[]{-1, -1};
    int left = 0, right = nums.length - 1, mid;

    while (left < right) {
        mid = (left + right)/ 2;

        if (nums[mid] < target) {
            left = mid;
        } else if (nums[mid] > target){
            right = mid;
        } else {
            left = mid;
            right = mid;
            break;
        }
    }

    if (nums[left] != target) {
        return new int[]{-1, -1};
    } else {
        while (left - 1 >= 0 && nums[left - 1] == target) left--;
        while (right + 1 < nums.length && nums[right + 1] == target) right++;
        return new int[]{left, right};
    }
}

Then I ran a test with [5, 7, 7, 8, 8, 10] and target 6. WTF? An infinite loop??

Where is the problem?

I was very confused. I had always written it this way without any problem, so how could it loop forever? So I walked through the code step by step.

array [5, 7, 7, 8, 8, 10], target 6

left right mid
0 5 2
0 2 1
0 1 0
0 1 0
0 1 0

Found the problem. When left=0 and right=1, mid=0, so nums[left]=5, nums[right]=7, and nums[mid]=5. In the following code, the first branch of the if is always taken, and none of the three values ever changes.

if (nums[mid] < target) {
    left = mid;
} else if (nums[mid] > target) {
    right = mid;
}

And left = mid; is the culprit. Written like this, the loop can never finish once it narrows to two adjacent elements and nums[mid] is smaller than target, because mid rounds down to left and nothing moves. So it should be changed to left = mid + 1;. With that change the test case passes. To sum up, this mistake was due to my own negligence.
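For reference, here is a minimal sketch of the first attempt with only that one line changed. The class name FixedFirstAttempt and the main method are my own additions for illustration; everything else follows the code above.

```java
class FixedFirstAttempt {
    // Same structure as the first attempt; the only change is left = mid + 1.
    static int[] searchRange(int[] nums, int target) {
        if (nums.length == 0) return new int[]{-1, -1};
        int left = 0, right = nums.length - 1, mid;

        while (left < right) {
            mid = (left + right) / 2;
            if (nums[mid] < target) {
                left = mid + 1;   // was left = mid, which could stall forever
            } else if (nums[mid] > target) {
                right = mid;
            } else {
                left = mid;
                right = mid;
                break;
            }
        }

        if (nums[left] != target) {
            return new int[]{-1, -1};
        }
        // Expand to both sides to find the boundaries of the run of target values.
        while (left - 1 >= 0 && nums[left - 1] == target) left--;
        while (right + 1 < nums.length && nums[right + 1] == target) right++;
        return new int[]{left, right};
    }

    public static void main(String[] args) {
        int[] nums = {5, 7, 7, 8, 8, 10};
        System.out.println(java.util.Arrays.toString(searchRange(nums, 6))); // [-1, -1]
        System.out.println(java.util.Arrays.toString(searchRange(nums, 8))); // [3, 4]
    }
}
```

With target 6 the loop now goes (0,5) -> (0,2) -> (0,1) -> (1,1) and stops, instead of repeating (0,1) forever.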

What else could be wrong?

  1. Selection of mid

    What I wrote in the code is mid = (left + right) / 2;, but there is a problem with it: if left and right are both large, their sum can overflow int. A safer way to write it is mid = left + (right - left) / 2;
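A small sketch of the overflow, with two values chosen only for illustration (they are valid int values, not indices of a real array). The class and method names are hypothetical.

```java
class MidOverflowDemo {
    static int naiveMid(int left, int right) {
        return (left + right) / 2;         // left + right may wrap around to a negative int
    }

    static int safeMid(int left, int right) {
        return left + (right - left) / 2;  // right - left cannot overflow when 0 <= left <= right
    }

    public static void main(String[] args) {
        int left = 1_500_000_000, right = 2_000_000_000;
        System.out.println(naiveMid(left, right)); // -397483648
        System.out.println(safeMid(left, right));  // 1750000000
    }
}
```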

  2. How should left and right move?

    This is also why I hit the infinite loop in the first place. I changed left = mid to left = mid + 1, so why not change right = mid to right = mid - 1 as well? It is fine to do so here, because I only want to find some index in the array whose value equals target.

To verify my conclusion, let's look at the implementation of binary search in the jdk, inside the Arrays class.

// Like public version, but without range checks.
private static int binarySearch0(int[] a, int fromIndex, int toIndex, int key) {
    int low = fromIndex;
    int high = toIndex - 1;

    while (low <= high) {
        int mid = (low + high) >>> 1;
        int midVal = a[mid];

        if (midVal < key)
            low = mid + 1;
        else if (midVal > key)
            high = mid - 1;
        else
            return mid; // key found
    }
    return -(low + 1);  // key not found.
}

That's right, here it is: low = mid + 1 and high = mid - 1. But the way mid is computed seems to conflict with what I said earlier.

int mid = (low + high) >>> 1. Here >>> is an unsigned right shift, which fills the high bit with 0. low and high are both non-negative, so their true sum always fits in 32 bits when read as an unsigned value. If low + high overflows int and turns negative, >>> 1 still yields the correct midpoint, whereas / 2 or >> 1 would return a negative number. So this is not merely defensive programming: it is the overflow fix itself.

(Shift operations are cheap, and the jdk uses them in many places, such as ArrayList and HashMap.)

low and high are added directly. Does the jdk not consider overflow? It does. The maximum array length in Java is close to Integer.MAX_VALUE (the exact limit depends on the jvm), so low + high can exceed Integer.MAX_VALUE for a large enough array. When I execute int[] test = new int[Integer.MAX_VALUE / 2]; I get an OutOfMemoryError, but that only reflects the heap size of my jvm, not a limit on array length. The unsigned shift is what makes the direct addition safe.
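To check what (low + high) >>> 1 actually returns when the sum exceeds Integer.MAX_VALUE, here is a small sketch. The class name, method names, and the two sample values are my own choices for illustration.

```java
class UnsignedShiftMidDemo {
    static int divideMid(int low, int high) {
        return (low + high) / 2;    // signed division of a wrapped sum
    }

    static int signedShiftMid(int low, int high) {
        return (low + high) >> 1;   // signed shift keeps the sign bit
    }

    static int unsignedShiftMid(int low, int high) {
        return (low + high) >>> 1;  // unsigned shift reads the sum as a 32-bit unsigned value
    }

    public static void main(String[] args) {
        int low = 1_500_000_000, high = 2_000_000_000;
        System.out.println(low + high);                  // -794967296 (wrapped)
        System.out.println(divideMid(low, high));        // -397483648
        System.out.println(signedShiftMid(low, high));   // -397483648
        System.out.println(unsignedShiftMid(low, high)); // 1750000000
    }
}
```

Only the unsigned shift recovers the real midpoint here, which is why the jdk version does not need the left + (right - left) / 2 form.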

more

Everything above is about standard binary search: find the index of a number in an array, and return a negative number if it is not found. In practice we often meet another variant: find the index of the first occurrence of the number, or of the last occurrence. The jdk binarySearch cannot do this, as the Javadoc itself states:

Searches the specified array of ints for the specified value using the binary search algorithm. If the array contains multiple elements with the specified value, there is no guarantee which one will be found.
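A short usage sketch of Arrays.binarySearch on the example array. The wrapper class and the find method are hypothetical names; the index returned for a duplicated value is deliberately not printed, because the Javadoc does not guarantee it.

```java
import java.util.Arrays;

class JdkBinarySearchDemo {
    // Thin wrapper around the jdk method, only to keep the example short.
    static int find(int[] a, int key) {
        return Arrays.binarySearch(a, key);
    }

    public static void main(String[] args) {
        int[] nums = {5, 7, 7, 8, 8, 10};

        int hit = find(nums, 8);
        // Some index holding 8 is returned, but it may be 3 or 4.
        System.out.println(nums[hit] == 8); // true

        int miss = find(nums, 6);
        // Not found: the result is -(insertion point) - 1, and 6 would be inserted at index 1.
        System.out.println(miss); // -2
    }
}
```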

For "finding the index of the first occurrence of the number", we can think of the binary search as left moving rightward until it lands on the target position (the final result is always the value of left). So, to avoid mistakes, left should be the only pointer that steps past mid (left = mid + 1), while right only ever moves to mid.

As for "finding the index of the last occurrence of the number", it can be regarded as the mirror image of "finding the index of the first occurrence of the number": reverse the array and you are back in the first case. So the choice of mid has to be mirrored as well, rounding up instead of down: mid = right - ((right - left) >> 1)

In many places I see this written as mid = left + ((right - left + 1) >> 1);, which computes the same value but is not easy to understand. Writing mid = right - ((right - left) >> 1) is much easier to read.
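The two ways of writing the rounded-up mid can be compared directly. This is only a brute-force sketch over small index ranges, with hypothetical class and method names, not a proof.

```java
class CeilMidCheck {
    static int mirrored(int left, int right) {
        return right - ((right - left) >> 1);
    }

    static int plusOne(int left, int right) {
        return left + ((right - left + 1) >> 1);
    }

    // Checks that both forms agree for every pair 0 <= left <= right <= n.
    static boolean agreeUpTo(int n) {
        for (int left = 0; left <= n; left++) {
            for (int right = left; right <= n; right++) {
                if (mirrored(left, right) != plusOne(left, right)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeUpTo(200)); // true
        System.out.println(mirrored(0, 1)); // 1, rounds up where left + ((right - left) >> 1) gives 0
    }
}
```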

  1. Binary search for the index of the first occurrence of the number

    while (left < right) {
        mid = left + ((right - left) >> 1);

        if (nums[mid] < target) {
            // The middle value is smaller than target: move left to mid + 1
            left = mid + 1;
        } else {
            // The middle value is not smaller than target, so it may be equal: move right to mid
            right = mid;
        }
    }
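
  2. Binary search for the index of the last occurrence of the number

    while (left < right) {
        mid = right - ((right - left) >> 1); // Note: mid rounds up here!

        if (nums[mid] <= target) {
            // The middle value is less than or equal to target, i.e. not greater: move left to mid,
            // because left is heading for the last index whose value equals target
            left = mid;
        } else {
            // The middle value is greater than target: move right to mid - 1
            right = mid - 1;
        }
    }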
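Why mid has to round up in the second loop can be seen on a two-element array. The sketch below runs the last-occurrence loop with either rounding and gives up after a fixed number of iterations; the class name, the roundUp flag, and the cap parameter are all hypothetical helpers for this demonstration.

```java
class LastOccurrenceMidDemo {
    // Returns the number of loop iterations, or -1 if the loop ran past the cap.
    static int iterations(int[] nums, int target, boolean roundUp, int cap) {
        int left = 0, right = nums.length - 1, count = 0;
        while (left < right) {
            if (++count > cap) return -1;   // treat as "would loop forever"
            int half = (right - left) >> 1;
            int mid = roundUp ? right - half : left + half;
            if (nums[mid] <= target) {
                left = mid;
            } else {
                right = mid - 1;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] nums = {7, 8};
        System.out.println(iterations(nums, 8, false, 100)); // -1: mid stays 0, left never moves
        System.out.println(iterations(nums, 8, true, 100));  // 1: mid is 1, left moves to 1
    }
}
```

It is the same stall as in the very first attempt: left = mid only makes progress when mid is strictly greater than left.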

Complete

Take a closer look at the solution above. There is a problem with the while loops at the end: if the run of target values is very long, don't they end up walking almost the entire array? In the worst case that is O(n), not O(log n).
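To put a number on that worst case, the sketch below counts the steps of the two expansion loops on an array in which every element equals target. The class name, method name, and array size are assumptions made for illustration.

```java
import java.util.Arrays;

class ExpansionCostDemo {
    // Counts how many steps the two expansion loops take, starting from index hit.
    static int expansionSteps(int[] nums, int hit) {
        int target = nums[hit], left = hit, right = hit, steps = 0;
        while (left - 1 >= 0 && nums[left - 1] == target) { left--; steps++; }
        while (right + 1 < nums.length && nums[right + 1] == target) { right++; steps++; }
        return steps;
    }

    public static void main(String[] args) {
        int[] nums = new int[1_000_000];
        Arrays.fill(nums, 8);
        // Wherever the binary search lands, the expansion visits every other element.
        System.out.println(expansionSteps(nums, nums.length / 2)); // 999999
    }
}
```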

With the groundwork above, the problem can be solved in a faster and clearer way: find the positions of the first and last occurrences of target separately. The final code:

public int[] searchRange(int[] nums, int target) {
    if (nums.length == 0) return new int[]{-1, -1};
    int left = 0, right = nums.length - 1, first;

    // Find the position of the first occurrence
    while (left < right) {
        int mid = left + ((right - left) >> 1);
        if (nums[mid] < target) {
            left = mid + 1;
        } else {
            right = mid;
        }
    }

    if (nums[left] != target) {
        return new int[]{-1, -1};
    }
    first = left;

    left = 0;
    right = nums.length - 1;

    // Find the position of the last occurrence
    while (left < right) {
        int mid = right - ((right - left) >> 1);
        if (nums[mid] <= target) {
            left = mid;
        } else {
            right = mid - 1;
        }
    }

    return new int[]{first, left};
}
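One way to gain some confidence in the final version is to compare it with a plain linear scan on random sorted arrays. The harness below is a sketch: SearchRangeCheck, linearRange, agrees, and the seed are hypothetical, and searchRange is a copy of the method above.

```java
import java.util.Arrays;
import java.util.Random;

class SearchRangeCheck {
    // Copy of the final solution above.
    static int[] searchRange(int[] nums, int target) {
        if (nums.length == 0) return new int[]{-1, -1};
        int left = 0, right = nums.length - 1, first;
        while (left < right) {
            int mid = left + ((right - left) >> 1);
            if (nums[mid] < target) left = mid + 1; else right = mid;
        }
        if (nums[left] != target) return new int[]{-1, -1};
        first = left;
        left = 0;
        right = nums.length - 1;
        while (left < right) {
            int mid = right - ((right - left) >> 1);
            if (nums[mid] <= target) left = mid; else right = mid - 1;
        }
        return new int[]{first, left};
    }

    // O(n) reference answer used only for comparison.
    static int[] linearRange(int[] nums, int target) {
        int first = -1, last = -1;
        for (int i = 0; i < nums.length; i++) {
            if (nums[i] == target) {
                if (first == -1) first = i;
                last = i;
            }
        }
        return new int[]{first, last};
    }

    // Compares both methods on random sorted arrays, including empty ones.
    static boolean agrees(long seed, int rounds) {
        Random rnd = new Random(seed);
        for (int r = 0; r < rounds; r++) {
            int[] nums = new int[rnd.nextInt(20)];
            for (int i = 0; i < nums.length; i++) nums[i] = rnd.nextInt(10);
            Arrays.sort(nums);
            for (int target = -1; target <= 10; target++) {
                if (!Arrays.equals(searchRange(nums, target), linearRange(nums, target))) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agrees(42L, 1000)); // true
        System.out.println(Arrays.toString(searchRange(new int[]{5, 7, 7, 8, 8, 10}, 8))); // [3, 4]
    }
}
```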
