【LeetCode】4、Median of Two Sorted Arrays

Difficulty: Hard

Problem description:

  There are two sorted arrays nums1 and nums2 of size m and n respectively.

  Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

  You may assume nums1 and nums2 cannot be both empty.

  Example 1:

nums1 = [1, 3]
nums2 = [2]
The median is 2.0

  Example 2:

nums1 = [1, 2]
nums2 = [3, 4]
The median is (2 + 3)/2 = 2.5

  In other words: given two sorted arrays of lengths m and n, find the median of all their elements combined. The required time complexity is O(log(m+n)).


Problem-solving ideas:

  This problem is rated Hard. Here we give five solutions, from simple to complex.

  Solution one: merge the two arrays, then find the median

  At first glance there is a very intuitive brute-force solution. Both arrays are sorted in ascending order and we want the median of all the elements, so we can simply merge the two arrays into one ascending array and read off the middle element directly.

  Merging the two arrays takes O(m+n) time, and finding the median of a sorted array takes only O(1), so the total time complexity is O(m+n). An additional array of length m+n is needed, so the space complexity is also O(m+n).

  The code is as follows:

public double findMedianSortedArrays(int[] nums1, int[] nums2) {
    int m = nums1==null?0:nums1.length;
    int n = nums2==null?0:nums2.length;
    if(m==0) // nums1 is empty
        return findMedianSortedArrays(nums2);
    if(n==0) // nums2 is empty
        return findMedianSortedArrays(nums1);
    
    int[] nums=new int[m+n];
    int count=0;
    for(int i=0,j=0;i<m || j<n;){  // merge into a single array
        if(i==m)
            nums[count++]=nums2[j++];
        else if(j==n)
            nums[count++]=nums1[i++];
        else{
            if(nums1[i]<nums2[j])
                nums[count++]=nums1[i++];
            else
                nums[count++]=nums2[j++];
        }  
    }
    return findMedianSortedArrays(nums);
}
// Find the median of a single sorted array
public double findMedianSortedArrays(int[] nums){
    int len=nums.length;
    if(len%2==0)
        return (nums[len/2]+nums[len/2-1])/2.0;
    else
        return nums[len/2];
}

  Solution two: improved merge

  Looking at the brute-force approach, our goal is only to find the median, so there is no need to finish merging the whole array; we only need to merge far enough to reach the median. That is, the merged array would have length m+n, but merging about half of it is enough to find the median.

  Let len = m+n. If len is odd, the median is the (len/2+1)-th number of the merged array; if len is even, it is the average of the two middle numbers, i.e. the (len/2)-th and the (len/2+1)-th. Note that these positions are 1-based (the element at index 0 is the 1st number), which matters when writing the code.

  So the idea of solution two is to merge only until the median is determined. The merged array does not need to be stored either: during the loop we only maintain two variables, the (len/2)-th number and the (len/2+1)-th number.

  Since only half of the merge is performed, the running time is proportional to len/2, which is still O(m+n), but the space complexity drops to O(1).

  The code is as follows:

public double findMedianSortedArrays(int[] nums1, int[] nums2) {
    int m = nums1==null?0:nums1.length;
    int n = nums2==null?0:nums2.length;
    
    int i=0,j=0;
    int first=-1,second=-1;
    // no need to store everything; keeping two numbers is enough
    for(int count=0;count<=(m+n)/2;count++){  // stop once the median is reached
        first=second;
        if(i==m)
            second=nums2[j++];
        else if(j==n)
            second=nums1[i++];
        else{
            if(nums1[i]<nums2[j])
                second=nums1[i++];
            else
                second=nums2[j++];
        }  
    }
    
    if((m+n)%2==0)
        return (first+second)/2.0;
    else
        return second;
}

  Solution three: find the k-th smallest number

  Solutions one and two are easy to understand, but their time complexity does not meet the requirement of the problem. The problem asks for logarithmic time, and logarithmic complexity usually comes from either a binary tree or a binary search. Here the data structure is an array, so we should think along the lines of binary search.

  This gives the third idea: the median of a sorted array is simply the element in the middle (or the average of the two middle elements when the length is even), so finding the median can be reduced to finding the k-th smallest number. Once we know how to find the k-th smallest number, for odd len we directly take the (len/2+1)-th smallest number as the median, and for even len we average the (len/2)-th and (len/2+1)-th smallest numbers.

  So the problem becomes: how do we find the k-th smallest number in two sorted arrays?

  There is a neat way to do this: to find the k-th smallest number, compare the (k/2)-th element of each array, i.e. nums1[k/2-1] and nums2[k/2-1]. Whichever is smaller, that element and everything before it in its array cannot be the k-th smallest number and can be discarded. Then repeat the process on the remaining data.

  The principle is not hard to see. Suppose nums1[k/2-1] is smaller than nums2[k/2-1]. The k/2-1 elements before nums1[k/2-1] are no larger than it. Even if all k/2-1 elements before nums2[k/2-1] were also smaller than it (which is not necessarily the case), at most (k/2-1) + (k/2-1) <= k-2 elements would be smaller than nums1[k/2-1]. So nums1[k/2-1] is at most the (k-1)-th smallest number and can never be the k-th, and it and the elements before it can be discarded.

  Conversely, if nums1[k/2-1] is greater than nums2[k/2-1], we can discard nums2[k/2-1] and the elements before it. If the two are equal, we can pick either array to discard from; k/2 elements are still removed.



  So the third solution is: each time, compare the elements at position k/2 in the two arrays, discard the smaller one together with the elements before it, and then keep looking in what remains (note: since up to k/2 elements have been discarded, k must be reduced by the number of elements actually discarded).

  This is clearly a repeated process and can be written recursively. Two things need care: array bounds and the termination condition. If an array has fewer than k/2 elements left, compare against its last element instead, i.e. take the smaller of k/2 and the remaining length. And since each step removes about k/2 elements, eventually either one array becomes empty or k becomes 1; these are the termination conditions. See the code below for details.

  Finally, the complexity. Each step discards about k/2 elements, so the time complexity is O(log k); since k is about half of m+n, this is O(log(m+n)). No auxiliary array is used. Note that the recursive version below still uses O(log(m+n)) stack frames; written as a loop it needs only O(1) extra space.

class Solution {
    // Solution three: equivalent to finding the k-th smallest number of two sorted arrays
    public double findMedianSortedArrays(int[] nums1, int[] nums2) {
        int m = nums1==null?0:nums1.length;
        int n = nums2==null?0:nums2.length;
        int len=m+n;
        if(len%2==0) // even length: average the two middle values
            return (findKthNum(nums1,nums2,0,0,len/2) + findKthNum(nums1,nums2,0,0,len/2 + 1)) / 2.0;
        else
            return findKthNum(nums1,nums2,0,0,len/2+1);
    }
    
    // Find the k-th smallest number; i and j are the start positions in the two arrays
    public int findKthNum(int[] nums1,int[] nums2,int i,int j,int k){
        int m=nums1.length;
        int n=nums2.length;
        if(i>=m) // nums1 is exhausted
            return nums2[j+k-1];
        if(j>=n) // nums2 is exhausted
            return nums1[i+k-1];
        if(k==1)
            return Math.min(nums1[i],nums2[j]);
        
        int index=k/2-1;
        int nums1Index=Math.min(m-1,i+index);
        int nums2Index=Math.min(n-1,j+index);
        if(nums1[nums1Index]<nums2[nums2Index])
            return findKthNum(nums1,nums2,nums1Index+1,j,k-(nums1Index-i+1));
        else
            return findKthNum(nums1,nums2,i,nums2Index+1,k-(nums2Index-j+1));
    }
}
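
  The same elimination can also be written as a loop, which avoids the recursion stack. The following is only a sketch under stated assumptions: the class name KthSmallestIterative and the method names findKth and median are made up for illustration, both arrays are assumed non-null and sorted, and k is assumed to satisfy 1 <= k <= m+n.

```java
class KthSmallestIterative {
    // Returns the k-th smallest (1-based) element across two sorted arrays.
    // Assumption: a and b are non-null, sorted, and 1 <= k <= a.length + b.length.
    static int findKth(int[] a, int[] b, int k) {
        int i = 0, j = 0; // start offsets of the part of each array still in play
        while (true) {
            if (i >= a.length) return b[j + k - 1]; // a is exhausted
            if (j >= b.length) return a[i + k - 1]; // b is exhausted
            if (k == 1) return Math.min(a[i], b[j]);

            int half = k / 2;
            // Clamp the probe index so it never runs past the end of an array.
            int ai = Math.min(a.length - 1, i + half - 1);
            int bj = Math.min(b.length - 1, j + half - 1);
            if (a[ai] < b[bj]) {
                k -= ai - i + 1; // discard a[i..ai]
                i = ai + 1;
            } else {
                k -= bj - j + 1; // discard b[j..bj]
                j = bj + 1;
            }
        }
    }

    // Median via one or two k-th smallest queries, as described in the text.
    static double median(int[] a, int[] b) {
        int len = a.length + b.length;
        if (len % 2 == 1) return findKth(a, b, len / 2 + 1);
        return ((long) findKth(a, b, len / 2) + findKth(a, b, len / 2 + 1)) / 2.0;
    }
}
```

  For example, with nums1 = [1, 3, 5, 7], nums2 = [2, 4, 6, 8, 10] and k = 5, the loop first discards 1 and 3 (k becomes 3), then 2 (k becomes 2), then 4 (k becomes 1), and returns min(5, 6) = 5.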

  Solution four: median of a data stream

  If you have worked through the exercises in 剑指Offer (Coding Interviews), you may remember a related median problem; see: 【剑指Offer】63, Median of a Data Stream. In that problem the numbers are read from a stream one at a time, so we need a suitable data structure to store them. Given the nature of the median, the best choice is a max-heap plus a min-heap: the max-heap stores the left (smaller) half of the data and the min-heap stores the right (larger) half. We keep the data evenly split between the two heaps and ensure that everything in the max-heap is no greater than everything in the min-heap. The median is then determined by the tops of the two heaps.

  Inspired by that problem, we can treat the elements of the two arrays as a data stream, add them one by one to the max-heap and min-heap, and apply the same algorithm. Each insertion costs O(log(m+n)), so processing all m+n elements takes O((m+n)log(m+n)) time in total, which does not meet the requirement of this problem, and the two heaps use O(m+n) extra space.

  For the implementation, see: 【剑指Offer】63, Median of a Data Stream
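
  The referenced post is not reproduced here. As a rough illustration of the two-heap idea only, here is a minimal sketch using java.util.PriorityQueue. The class name TwoHeapMedian and its methods are hypothetical, and the sketch assumes at least one element has been added before median() is called. Since each of the m+n insertions costs O(log(m+n)), the total running time is O((m+n)log(m+n)).

```java
import java.util.Collections;
import java.util.PriorityQueue;

class TwoHeapMedian {
    // Max-heap holds the smaller half, min-heap holds the larger half.
    private final PriorityQueue<Integer> left = new PriorityQueue<>(Collections.reverseOrder());
    private final PriorityQueue<Integer> right = new PriorityQueue<>();

    void add(int x) {
        // Route x through the max-heap so that max(left) <= min(right) always holds.
        left.offer(x);
        right.offer(left.poll());
        // Rebalance: left holds the same number of elements as right, or one more.
        if (right.size() > left.size()) left.offer(right.poll());
    }

    // Assumption: add() has been called at least once.
    double median() {
        if (left.size() > right.size()) return left.peek();
        int lo = left.peek();
        int hi = right.peek();
        return ((long) lo + hi) / 2.0;
    }

    // Feeds both arrays through the heaps as if they were a data stream.
    static double findMedianSortedArrays(int[] nums1, int[] nums2) {
        TwoHeapMedian heaps = new TwoHeapMedian();
        for (int x : nums1) heaps.add(x);
        for (int x : nums2) heaps.add(x);
        return heaps.median();
    }
}
```

  Note that this version never uses the fact that the inputs are already sorted, which is part of why it is slower than solutions three and five.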

  Solution five: partitioning, so that the left maximum is no greater than the right minimum

  Following on from solution four, can we achieve the same effect directly on the arrays, without extra space? This is the approach given in the official LeetCode solution to this problem.

  Suppose we cut an array into two parts. An array of length m has m+1 possible cut positions, 0 through m. Say array A is cut at position i and array B is cut at position j. Combine the part of A left of i with the part of B left of j to form the left half, and the part of A right of i with the part of B right of j to form the right half. Then, as in solution four, if we can guarantee that (1) the data is split evenly between the two halves, and (2) every element in the left half is no greater than every element in the right half, the median depends only on the maximum of the left half and the minimum of the right half (these play the role of the heap tops in solution four).



  To satisfy both conditions, the key is how to determine the positions i and j. Consider two cases:

  (1) If m+n is even, the left and right halves should contain the same number of elements, and the median is (maximum of the left half + minimum of the right half) / 2. The left length equals the right length, i.e. i+j = m-i+n-j, so j = (m+n)/2-i.

  (2) If m+n is odd, we let the left half contain one more element than the right half, and the median is the maximum of the left half. In this case left length = right length + 1, i.e. i+j = m-i+n-j+1, so j = (m+n+1)/2-i.

  With integer division, (m+n)/2 and (m+n+1)/2 are equal when m+n is even, so the two cases can be merged: once i is fixed, j is determined by j = (m+n+1)/2-i.
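
  To make the formula concrete, the small sketch below tabulates j and the resulting half sizes for every cut position i. The class name PartitionIndexDemo and the helper cutForSecondArray are hypothetical names used only for this illustration; the values m = 2 and n = 3 are arbitrary.

```java
class PartitionIndexDemo {
    // For a cut at position i in the array of length m, returns the cut position j
    // in the array of length n so that the left half holds (m + n + 1) / 2 elements.
    static int cutForSecondArray(int m, int n, int i) {
        return (m + n + 1) / 2 - i; // integer division
    }

    public static void main(String[] args) {
        int m = 2, n = 3; // odd total, so the left half gets the extra element
        for (int i = 0; i <= m; i++) {
            int j = cutForSecondArray(m, n, i);
            int leftSize = i + j;
            int rightSize = (m - i) + (n - j);
            System.out.println("i=" + i + " j=" + j + " left=" + leftSize + " right=" + rightSize);
        }
    }
}
```

  With m = 2 and n = 3, every choice of i gives a left half of 3 elements and a right half of 2. With an even total such as m = 2 and n = 2, both halves have 2 elements.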

  The relation between i and j above comes from the first condition. Now consider the second condition: for the left elements to be no greater than the right elements, it suffices that the maximum of the left half is no greater than the minimum of the right half. Because the arrays are sorted, the left maximum can only be A[i-1] or B[j-1], and the right minimum can only be A[i] or B[j]. This again gives two cases:

  (1) If the left maximum is A[i-1], it is automatically no greater than A[i], so it only needs to be compared with B[j]. If A[i-1] <= B[j], the second condition holds. If A[i-1] > B[j], we should decrease i and increase j (if i were larger and j smaller, A[i-1] would exceed B[j] by even more).

  (2) If the left maximum is B[j-1], it is automatically no greater than B[j], so it only needs to be compared with A[i]. If B[j-1] <= A[i], the second condition holds. If B[j-1] > A[i], we should increase i and decrease j (if i were smaller and j larger, B[j-1] would exceed A[i] by even more).



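  With this analysis we know how to find a suitable i: binary search over i, using the two comparisons above to narrow the range upward or downward until a valid i is found, then compute j from the formula. The median then follows easily.

  Note: there is one more issue. Since 0 <= i <= m, we need m <= n to guarantee 0 <= j <= n, so we always partition the shorter array.

  As we can see, this solution is essentially the same idea as the heap-based solution to the data-stream median problem, only with a different data structure. Since the binary search runs over the shorter array, the time complexity is O(log(min(m,n))).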

    public double findMedianSortedArrays(int[] nums1, int[] nums2) {
       // Solution five: partitioning, left maximum no greater than right minimum
        int m = nums1==null?0:nums1.length;
        int n = nums2==null?0:nums2.length;
        
        if(m>n) // ensure m <= n
            return findMedianSortedArrays(nums2,nums1);
        
        // Binary search for a position i in the shorter array such that the left maximum is no greater than the right minimum
        int low=0,high=m;
        while(low<=high){
            int i=(low+high)/2;
            int j=(m+n+1)/2-i;
            if(j!=0 && i!=m && nums1[i] < nums2[j-1]) // i is too small, increase it
                low=i+1;
            else if(i!=0 && j!=n && nums1[i-1]> nums2[j]) // i is too large, decrease it
                high=i-1;
            else{ // i satisfies both conditions
                // maximum of the left half
                int maxLeft=0;
                if(i==0)
                    maxLeft=nums2[j-1];
                else if(j==0)
                    maxLeft=nums1[i-1];
                else
                    maxLeft=Math.max(nums1[i-1],nums2[j-1]);
                
                if((m+n)%2!=0) // odd total: the left maximum is the median
                    return maxLeft;
                
                // minimum of the right half
                int minRight=0;
                if(i==m)
                    minRight=nums2[j];
                else if(j==n)
                    minRight=nums1[i];
                else
                    minRight=Math.min(nums1[i],nums2[j]);
                return (maxLeft+minRight)/2.0;  
            }
        }
        return 0.0; // not reached when both inputs are sorted
    }
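
  One way to gain confidence in a partition-based implementation is to compare it against a simple sort-based reference on many small random inputs. The sketch below is an assumption-laden illustration, not the code above: MedianCrossCheck, bySorting, byPartition, countMismatches and randomSorted are hypothetical names, and byPartition is a variant that uses Integer.MIN_VALUE and Integer.MAX_VALUE as sentinels for empty sides (so it assumes the inputs are not both empty).

```java
import java.util.Arrays;
import java.util.Random;

class MedianCrossCheck {
    // Reference answer: concatenate, sort, and read off the middle.
    static double bySorting(int[] a, int[] b) {
        int[] all = new int[a.length + b.length];
        System.arraycopy(a, 0, all, 0, a.length);
        System.arraycopy(b, 0, all, a.length, b.length);
        Arrays.sort(all);
        int len = all.length;
        if (len % 2 == 1) return all[len / 2];
        return ((long) all[len / 2 - 1] + all[len / 2]) / 2.0;
    }

    // Binary search over cut positions in the shorter array.
    // Assumption: a and b are sorted and not both empty.
    static double byPartition(int[] a, int[] b) {
        if (a.length > b.length) return byPartition(b, a);
        int m = a.length, n = b.length;
        int low = 0, high = m;
        while (low <= high) {
            int i = (low + high) >>> 1;
            int j = (m + n + 1) / 2 - i;
            // Sentinels stand in for a missing neighbor when a cut is at an array edge.
            int aLeft = i == 0 ? Integer.MIN_VALUE : a[i - 1];
            int aRight = i == m ? Integer.MAX_VALUE : a[i];
            int bLeft = j == 0 ? Integer.MIN_VALUE : b[j - 1];
            int bRight = j == n ? Integer.MAX_VALUE : b[j];
            if (aLeft > bRight) {
                high = i - 1; // i is too large
            } else if (bLeft > aRight) {
                low = i + 1;  // i is too small
            } else {
                int maxLeft = Math.max(aLeft, bLeft);
                if ((m + n) % 2 == 1) return maxLeft;
                return ((long) maxLeft + Math.min(aRight, bRight)) / 2.0;
            }
        }
        throw new IllegalArgumentException("inputs must be sorted");
    }

    static int[] randomSorted(Random rnd, int size) {
        int[] x = new int[size];
        for (int k = 0; k < size; k++) x[k] = rnd.nextInt(10);
        Arrays.sort(x);
        return x;
    }

    // Counts the random trials on which the two methods disagree.
    static int countMismatches(long seed, int trials) {
        Random rnd = new Random(seed);
        int mismatches = 0;
        for (int t = 0; t < trials; t++) {
            int[] a = randomSorted(rnd, rnd.nextInt(6));
            // Never generate two empty arrays at once.
            int[] b = randomSorted(rnd, a.length == 0 ? 1 + rnd.nextInt(5) : rnd.nextInt(6));
            if (bySorting(a, b) != byPartition(a, b)) mismatches++;
        }
        return mismatches;
    }
}
```

  Small value ranges (0 to 9 here) are deliberate: they produce many duplicates and empty arrays, which are the cases where boundary handling tends to go wrong.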

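Summary

  As the first Hard problem, this one is genuinely difficult. The five solutions given here move from the straightforward merge to an O(log(min(m,n))) binary search, and the most important takeaway is how flexibly binary search can be applied. We recommend first looking at the 剑指Offer problem Median of a Data Stream; comparing solution four and solution five here then becomes much easier to understand.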

Origin www.cnblogs.com/gzshan/p/10967691.html