Maximum Subarray Sum Problem – Linear-Time Algorithm

There are many algorithms for computing the maximum subarray sum of a given array. A common one uses the divide-and-conquer strategy, but for this problem it has a higher time complexity, O(n log n), and more complicated code. There are simpler algorithms; this article introduces an iterative algorithm that runs in linear time. Since every element has to be examined at least once, linear time is asymptotically the best possible. The first version of the code is as follows:

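// Returns the sum of the maximum subarray. Assumes length >= 1.
int maxSubArray( int* array, int length)
{
    int boundry = array[0];   // maximum sum of a subarray ending at the current element
    int maxArray = array[0];  // maximum subarray sum found so far
    for( int i=1; i<length; ++i )
    {
        // Extend the previous boundary subarray, or start a new one at array[i].
        if( boundry+array[i] >= array[i] )
            boundry += array[i];
        else
            boundry = array[i];
        // Keep the larger of the previous maximum and the new boundary sum.
        if( maxArray < boundry )
            maxArray = boundry;
    }
    return maxArray;
}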

#include<iostream>
using namespace std;
int main()
{
    int a[] = {1,-2,3,10,-4,7,2,-48};
    int num = sizeof(a)/sizeof(a[0]);
    int result = maxSubArray(a, num);
    cout<<"result:"<<result<<endl;// prints 18

    int b[] = {3,-1,5,-1,9,-20,21,-20,20,21};
    num = sizeof(b)/sizeof(b[0]);
    result = maxSubArray(b, num);
    cout<<"result:"<<result<<endl;// prints 42
    return 0;
}

The body of the function is only 12 lines of code: simple, yet efficient.

Detailed idea

The idea for this solution comes from Exercise 4.1-5 in Introduction to Algorithms:

Design a non-recursive, linear-time algorithm for the maximum subarray problem using the following ideas. Start at the left boundary of the array and process it from left to right, recording the maximum subarray seen so far. If the maximum subarray of A[1..j] is known, extend the solution to the maximum subarray of A[1..j+1] using the following property: the maximum subarray of A[1..j+1] is either the maximum subarray of A[1..j], or some subarray A[i..j+1] (1≤i≤j+1). Given the maximum subarray of A[1..j], the maximum subarray of the form A[i..j+1] can be found in linear time.

Note: This article only discusses the sum of the maximum subarray, so whenever the maximum subarray is mentioned below, it refers to that sum.

To stay consistent with the wording of the exercise, subscripts in this section start from 1, so A[1] is the first element.

The maximum subarray of A[1..1] is known: it is the first element. The maximum subarray of A[1..2] is then either the maximum subarray of A[1..1] or the maximum subarray of the form A[i..2]. In other words, the maximum subarray of A[1..2] either contains the second element or does not, so ① we need to take the larger value of the two cases: containing the second element and not containing it.

The value for the case that does not contain the second element is already determined: it is the maximum subarray of A[1..1], which is known. For convenience, call it the previous maximum subarray. The maximum subarray that contains the second element has to be computed separately. For convenience, call it the boundary maximum subarray. The maximum subarray we actually want is the larger of the previous maximum subarray and the boundary maximum subarray.

So how is the boundary maximum subarray computed? Since this case is known to contain the second element, ② we only need to split it into two sub-cases, containing only the second element and containing more than the second element, and again take the larger of the two. The first sub-case is simple: the boundary maximum subarray is just the value of A[2]. The second is also simple: a subarray that contains more than the second element must also contain the element before it, the first element, so its value is A[2] plus the boundary maximum subarray of the previous element. The boundary maximum subarray of A[2] is the larger of these two values.

That is, to determine the maximum subarray of A[1..2], the only additional value required is the boundary maximum subarray of the first element.

Now the situation is clear. Computing the maximum subarray of A[1..2] requires three values: the previous maximum subarray (known), the value of A[2] (known), and the boundary maximum subarray of the previous element.

Clearly, this problem can be solved iteratively, starting from the first element. Each step of the iteration needs only the three values listed in the previous paragraph, and each step provides the basis for the next one. The result is an efficient linear-time algorithm.
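
The reasoning above can be written as a pair of recurrences. The symbols B_j (the boundary maximum subarray sum ending at A[j]) and M_j (the maximum subarray sum of A[1..j]) are notation introduced here for illustration; they do not come from the exercise.

```latex
\begin{aligned}
B_1 &= A[1], & M_1 &= A[1],\\
B_j &= \max\bigl(A[j],\; B_{j-1} + A[j]\bigr), & M_j &= \max\bigl(M_{j-1},\; B_j\bigr) \qquad (2 \le j \le n).
\end{aligned}
```

Each step uses only M_{j-1}, A[j] and B_{j-1}, so M_n is obtained after n-1 constant-time steps.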

Code explanation

Once the idea is understood, the code is easy to explain. At each step, the current boundary maximum subarray is computed from the previous step's boundary maximum subarray and the value of the current element. It is then compared with the previous maximum subarray to determine the current maximum subarray.
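
The two updates can also be expressed with std::max. The following is a minimal sketch, not the article's own listing: the function name maxSubArraySum is hypothetical, and it assumes a non-empty std::vector<int> whose partial sums do not overflow int.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the article): returns the maximum subarray sum.
// Assumes values is non-empty and that no partial sum overflows int.
int maxSubArraySum(const std::vector<int>& values)
{
    int boundary = values[0];  // best sum of a subarray ending at the current element
    int best = values[0];      // best sum of any subarray seen so far
    for (std::size_t i = 1; i < values.size(); ++i)
    {
        // Either extend the previous boundary subarray or start afresh at values[i].
        boundary = std::max(values[i], boundary + values[i]);
        // The overall maximum is the previous maximum or the new boundary sum.
        best = std::max(best, boundary);
    }
    return best;
}
```

Called on the two arrays from the first example, this sketch should return the same sums as maxSubArray, 18 and 42.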

Extension: recording the indices

Because this section is discussed alongside the code, subscripts start from 0. Two custom terms are used here:
Previous maximum subarray: the maximum subarray that does not contain the current element.
Boundary maximum subarray: the larger of two cases, the subarray that contains only the current element and the one that contains the current element together with the elements before it.

Take the array {1,-2,3,10,-4,7,2,-48} as an example. Initially, the start index and the end index are both 0, and the previous maximum subarray and the boundary maximum subarray are both 1.

When the iteration index is 1, the current value is -2 and the boundary maximum subarray of the previous element is 1, so the boundary maximum subarray is -1. The previous maximum subarray is 1, so the maximum subarray of this iteration is still the previous maximum subarray, with value 1, and the indices are not updated.

When the iteration index is 2, the current value is 3 and the boundary maximum subarray of the previous element is -1, so the boundary maximum subarray is 3. The previous maximum subarray is 1, so the maximum subarray of this iteration is the boundary maximum subarray, with value 3. Both the start index and the end index must be updated to the current index, 2.

When the iteration index is 3, the current value is 10 and the boundary maximum subarray of the previous element is 3, so the boundary maximum subarray is 13. The previous maximum subarray is 3, so the maximum subarray of this iteration is the boundary maximum subarray, with value 13. The end index must be updated to the current index, but the start index must not be.

The remaining iterations proceed in the same way.

What indices 2 and 3 have in common is that the boundary maximum subarray is greater than the previous maximum subarray, so the end index is updated. The difference is that at index 2 the boundary maximum subarray contains only the current value, so the start index can be updated as well, whereas at index 3 it also contains the previous element, so only the end index can be updated.
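
The values in this walkthrough can be reproduced by recording the two running sums after every iteration. This is only an illustrative sketch: TraceStep and traceBoundaryAndBest are hypothetical names, and the input is assumed to be non-empty.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of one iteration (names are illustrative only).
struct TraceStep
{
    int boundary;  // boundary maximum subarray sum after this iteration
    int best;      // maximum subarray sum after this iteration
};

// Returns one TraceStep per element of values. Assumes values is non-empty.
std::vector<TraceStep> traceBoundaryAndBest(const std::vector<int>& values)
{
    int boundary = values[0];
    int best = values[0];
    std::vector<TraceStep> steps;
    steps.push_back({boundary, best});  // state at index 0
    for (std::size_t i = 1; i < values.size(); ++i)
    {
        // Same update rule as maxSubArray: extend or restart the boundary subarray.
        if (boundary + values[i] >= values[i])
            boundary += values[i];
        else
            boundary = values[i];
        if (best < boundary)
            best = boundary;
        steps.push_back({boundary, best});
    }
    return steps;
}
```

For {1,-2,3,10,-4,7,2,-48} the recorded (boundary, best) pairs at indices 0 to 3 are (1,1), (-1,1), (3,3) and (13,13), matching the walkthrough.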

The situations in which the indices need to be updated can now be summarized as follows:
Condition ①: the boundary maximum subarray of this iteration contains only the current value and is greater than the previous maximum subarray; then update the start index.
Condition ②: the boundary maximum subarray of this iteration is greater than the previous maximum subarray; then update the end index.

Condition ② for updating the end index is both necessary and sufficient, but condition ① for updating the start index is sufficient without being necessary. Consider the array {4,-5,1,5}. When the index is 2, the boundary maximum subarray is 1 and the previous maximum subarray is 4, so only the first half of condition ① is satisfied. However, the maximum subarray of the entire array, {1,5}, starts at index 2. Therefore, condition ① needs to be revised.
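
To make the failure concrete, here is a sketch of the first-version rules exactly as stated, which is deliberately flawed. NaiveResult and naiveIndexTracking are hypothetical names used only for this demonstration, and the input is assumed to be non-empty.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type (illustrative, not from the article).
struct NaiveResult
{
    int begin;
    int end;
    int sum;
};

// First-version index rules, deliberately flawed:
//   condition 1: the boundary subarray holds only the current value AND
//                beats the previous maximum -> update begin
//   condition 2: the boundary subarray beats the previous maximum -> update end
// Assumes values is non-empty.
NaiveResult naiveIndexTracking(const std::vector<int>& values)
{
    int boundary = values[0];
    int best = values[0];
    int begin = 0;
    int end = 0;
    for (std::size_t i = 1; i < values.size(); ++i)
    {
        // True when restarting at values[i] is better than extending.
        bool onlyCurrent = boundary + values[i] < values[i];
        boundary = onlyCurrent ? values[i] : boundary + values[i];
        if (best < boundary)
        {
            if (onlyCurrent)
                begin = static_cast<int>(i);  // condition 1
            best = boundary;
            end = static_cast<int>(i);        // condition 2
        }
    }
    return {begin, end, best};
}
```

For {4,-5,1,5} this sketch reports begin 0, end 3 and sum 6, although the elements from index 0 to index 3 add up to 5. The restart at index 2 is lost because it did not beat the previous maximum at that moment.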

The two conditions of the second version are:
Condition ①: the current boundary maximum subarray contains only the current value.
Condition ②: the current boundary maximum subarray is greater than the previous maximum subarray.
When condition ① is satisfied, record the current index as the cached index, but do not update the start index. When condition ② is satisfied, set the end index to the current index and the start index to the cached index. Condition ② can only be satisfied at or after an iteration that satisfied condition ① (the initial state, where the boundary maximum subarray is the first element alone, counts as one), so the cached index is always valid when it is used. Condition ① only marks a possible new beginning, because it can be satisfied repeatedly, whereas condition ② always marks an end.

With the idea clear, the code follows directly:

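// Returns a new[]-allocated array {begin index, end index, sum} of the maximum subarray.
// Assumes length >= 1. The caller owns the result and must release it with delete[].
int *maxSubArray( int* array, int length)
{
    int boundry = array[0];   // maximum sum of a subarray ending at the current element
    int maxArray = array[0];  // maximum subarray sum found so far
    int maxEndIndex = 0;      // end index of the maximum subarray
    int maxBeginIndex = 0;    // start index of the maximum subarray
    int tmpBeginIndex = 0;    // cached start index of the current boundary subarray
    for( int i=1; i<length; ++i )
    {
        if( boundry+array[i] >= array[i] )
        {
            boundry += array[i];
        }
        else
        {
            // Condition 1: the boundary subarray restarts here, so cache the index.
            boundry = array[i];
            tmpBeginIndex = i;
        }
        if( maxArray < boundry )
        {
            // Condition 2: new maximum, so commit the end index and the cached start index.
            maxArray = boundry;
            maxEndIndex = i;
            maxBeginIndex = tmpBeginIndex;
        }
    }
    int *result = new int[3];
    result[0] = maxBeginIndex;
    result[1] = maxEndIndex;
    result[2] = maxArray;
    return result;
}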

#include<iostream>
using namespace std;
int main()
{
    int a[] = {1,-2,3,10,-4,7,2,-48};
    int num = sizeof(a)/sizeof(a[0]);
    int* result = maxSubArray(a, num);
    cout<<"Begin:"<<result[0]<<"  End:"<<result[1]<<"  Num:"<<result[2]<<endl;

    int b[] = {3,-1,5,-1,9,-20,21,-20,20,21};
    num = sizeof(b)/sizeof(b[0]);
    result = maxSubArray(b, num);
    cout<<"Begin:"<<result[0]<<"  End:"<<result[1]<<"  Num:"<<result[2]<<endl;

    return 0;
}

The output is:

Begin:2  End:6  Num:18
Begin:6  End:9  Num:42

Note that main never releases the arrays returned by maxSubArray with delete[], so it leaks memory, but that is beside the point here.
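
One way to avoid the manual memory management, sketched here on the assumption that changing the function's signature is acceptable, is to return a small struct by value instead of a new[]-allocated array. MaxSubArrayResult and maxSubArrayByValue are hypothetical names, and the input is assumed to be non-empty.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type (illustrative, not from the article).
struct MaxSubArrayResult
{
    int begin;  // start index of the maximum subarray
    int end;    // end index of the maximum subarray
    int sum;    // sum of the maximum subarray
};

// Same algorithm as the second version of maxSubArray, but the result is
// returned by value, so there is nothing for the caller to delete.
// Assumes values is non-empty.
MaxSubArrayResult maxSubArrayByValue(const std::vector<int>& values)
{
    int boundary = values[0];
    int cachedBegin = 0;  // start index of the current boundary subarray
    MaxSubArrayResult best = {0, 0, values[0]};
    for (std::size_t i = 1; i < values.size(); ++i)
    {
        if (boundary + values[i] >= values[i])
        {
            boundary += values[i];
        }
        else
        {
            boundary = values[i];
            cachedBegin = static_cast<int>(i);
        }
        if (best.sum < boundary)
        {
            best.sum = boundary;
            best.begin = cachedBegin;
            best.end = static_cast<int>(i);
        }
    }
    return best;
}
```

For the two example arrays this sketch should report the same indices and sums as the listing above, and for {4,-5,1,5} it reports begin 2, end 3 and sum 6.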
