Simple understanding of tree arrays and their uses

I. Introduction:

What is a tree array? As the name suggests, it is an array used to simulate a tree structure (in English it is usually called a binary indexed tree, or Fenwick tree). This raises a question: why not build the tree directly? Because there is no need to: a problem that a tree array can handle does not require an explicit tree.

Suppose you need to query and modify an array. A tree array gives you O(logN) time for both operations. Here, a query means the sum of any interval (a single-point query is the special case of an interval of length 1), and a modification means changing a single element.

With an ordinary array, an interval-sum query takes O(N) time and a modification takes O(1) time.

When many queries are interleaved with many modifications, the advantage of the tree array emerges: every operation costs O(logN), instead of O(N) for each query.

Advantages of tree arrays:

1. Single-point update, interval query - the time complexity is O(logN), O(logN) respectively

2. Interval update, single-point query - the time complexity is O(logN), O(logN) respectively

3. Interval update, interval query - the time complexity is O(logN), O(logN) respectively

4. Single-point update, single-point query - use ordinary arrays

Next, what exactly is a tree array?

Start from a binary tree (figure not preserved here).

Then transform it (figure not preserved here):

Take the highest node of each column as the node of the tree array. A is the original array, C is the tree array, then:

Assuming that the leftmost descendant of C[i] is C[j], then C[i] = sum(A[j], A[i]), where sum(A[j], A[i]) denotes the sum of array A over the interval [j,i].

C[1] = A[1];
C[2] = A[1] + A[2];
C[3] = A[3];
C[4] = A[1] + A[2] + A[3] + A[4];
C[5] = A[5];
C[6] = A[5] + A[6];
C[7] = A[7];
C[8] = A[1] + A[2] + A[3] + A[4] + A[5] + A[6] + A[7] + A[8];

To understand this, note that there is a fixed functional relationship between each index i and the interval [j,i] that C[i] covers.

To describe this relationship, first introduce a helper function:

Let lowbit(i) = k, where k is the value of the lowest 1 bit in the binary representation of i. For example, 12 is 1100 in binary, and its lowest 1 is the third digit counting from the right, so lowbit(12) = 2^(3-1) = 4; similarly, lowbit(8) = 8 and lowbit(6) = 2.

Then the length of the interval [j,i] covered by C[i] is lowbit(i).

That is, j = i - lowbit(i) + 1.
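The relationship between i and its interval can be checked mechanically. The following is a minimal C++ sketch, not part of the original article: the helper names `coveredRange` and `buildTree` are illustrative, and `lowbit` is written in its `x & -x` form. It builds C by brute force straight from the definition, which is useful only for checking, not for performance.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Value of the lowest 1 bit of x (x > 0), e.g. lowbit(12) == 4.
int lowbit(int x) { return x & -x; }

// Interval [j, i] of the original array A that C[i] sums over (1-based).
std::pair<int, int> coveredRange(int i) {
    assert(i >= 1);
    return std::make_pair(i - lowbit(i) + 1, i);
}

// Build C from A by brute force, directly from C[i] = A[j] + ... + A[i].
// a[0] is unused so that indices match the 1-based notation of the text.
std::vector<long long> buildTree(const std::vector<long long>& a) {
    assert(!a.empty());
    int n = static_cast<int>(a.size()) - 1;
    std::vector<long long> c(n + 1, 0);
    for (int i = 1; i <= n; ++i) {
        std::pair<int, int> range = coveredRange(i);
        for (int k = range.first; k <= range.second; ++k) c[i] += a[k];
    }
    return c;
}
```

With A[1..8] = 1, 2, ..., 8, `coveredRange(6)` is [5, 6] and `buildTree` gives C[6] = 5 + 6 = 11 and C[4] = 1 + 2 + 3 + 4 = 10, in line with the listing of C[1] to C[8] above.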

How to find lowbit(i)?

There are two main ways:

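// Method 1: x & (x - 1) clears the lowest 1 bit; subtracting isolates it.
int lowbit(int x)
{
    return x - (x & (x - 1));
}

// Method 2 (an alternative definition): x and -x share only the lowest 1 bit.
int lowbit(int x)
{
    return x & -x;
}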

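The second method is the one mainly used. It works as follows:

It relies on how negative numbers are stored: they are stored in two's complement. For the integer operation x & (-x):
When x is 0, the operation is 0 & 0 and the result is 0.
When x is odd, its last bit is 1, so inverting and adding 1 produces no carry. x and -x are therefore exact opposites in every bit except the last one, those bits AND to 0, and the result is 1.
When x is even, take 1100 as an example: inverting it and adding 1 gives 0100.
On inversion, all the 0s to the right of the lowest 1 become 1s, and the lowest 1 becomes 0. Adding 1 then turns those 1s on the right back into 0s, and the lowest 1 (here always meaning the original lowest 1 bit) becomes 1 again. The carry does not reach the bits to the left of the lowest 1, so they stay inverted, and the result of the AND is exactly the value of the lowest 1 bit.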

The time complexity of this lowbit function is O(1).
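As a quick sanity check, the two formulas can be compared against each other. This is a small sketch with hypothetical function names (`lowbitBySubtract`, `lowbitByNegate`, `formsAgree`) chosen only to keep the two variants apart; it assumes two's-complement `int`, as the explanation above does.

```cpp
#include <cassert>

// Method 1: clear the lowest 1 bit with x & (x - 1), then subtract.
int lowbitBySubtract(int x) { return x - (x & (x - 1)); }

// Method 2: in two's complement, x and -x share only the lowest 1 bit.
int lowbitByNegate(int x) { return x & -x; }

// Returns true if both methods agree for every x in [1, limit].
bool formsAgree(int limit) {
    assert(limit >= 1);
    for (int x = 1; x <= limit; ++x) {
        if (lowbitBySubtract(x) != lowbitByNegate(x)) return false;
    }
    return true;
}
```

For instance, both functions return 4 for 12, 8 for 8, and 2 for 6, matching the values worked out by hand earlier.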

Update and query operations for tree arrays:

Update operation: When an element of the original array is updated, every tree array element whose interval contains that element must be updated as well.

Which elements are affected? If A[i] is updated, then C[i] is affected, and so is C[i+lowbit(i)]. Replace i with i+lowbit(i) and repeat until the index exceeds n.

So why C[i+lowbit(i)]? It can be understood through symmetry:

For example, take i = 6, so that C[i] and C[i+lowbit(i)] are C[6] and C[8]:

If C[i+lowbit(i)] is moved down a certain distance, the binary tree rooted at C[i+lowbit(i)] and the binary tree rooted at C[i] have the same number of child nodes (and contain the same number of tree array nodes).

Therefore, every C[j] covers lowbit(j) elements of array A, and the node lowbit(j) positions to its right, C[j+lowbit(j)], is the next node affected by the update.

Update operation:
void updata(int i,int k){    // add k at position i
    while(i <= n){
        c[i] += k;
        i += lowbit(i);
    }
}
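To see which nodes an update actually touches, the loop can be traced without modifying anything. The sketch below is illustrative only; `updatePath` is a hypothetical helper, and n is passed in explicitly rather than taken from a global.

```cpp
#include <cassert>
#include <vector>

int lowbit(int x) { return x & -x; }

// Indices of C that change when A[i] is updated, for an array of size n.
std::vector<int> updatePath(int i, int n) {
    assert(i >= 1 && i <= n);
    std::vector<int> path;
    while (i <= n) {
        path.push_back(i);
        i += lowbit(i);  // jump to the next node whose interval contains A[i]
    }
    return path;
}
```

With n = 8, `updatePath(3, 8)` visits 3, 4, 8 and `updatePath(6, 8)` visits 6, 8. These are exactly the C entries whose sums contain A[3] and A[6] in the listing of C[1] to C[8].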

Query operation:

A query here means the prefix sum from A[1] to A[i], not the sum of an arbitrary interval. The sum of an interval [l,r] of array A is obtained as the difference of two prefix sums, getsum(r) - getsum(l-1).

C[i] holds the sum of only part of array A, not necessarily a prefix sum: it covers exactly lowbit(i) elements. So subtracting lowbit(i) from i jumps to just before the left boundary of the elements covered by C[i]. Repeating this step until i reaches 0 covers the whole prefix.

int getsum(int i){        // sum of A[1..i]
    int res = 0;
    while(i > 0){
        res += c[i];
        i -= lowbit(i);
    }
    return res;
}
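The same kind of trace works for the query loop. This is a minimal sketch with a hypothetical helper `queryPath`, which records the indices of C that are added together for a prefix sum.

```cpp
#include <cassert>
#include <vector>

int lowbit(int x) { return x & -x; }

// Indices of C that are summed to obtain A[1] + ... + A[i].
std::vector<int> queryPath(int i) {
    assert(i >= 0);
    std::vector<int> path;
    while (i > 0) {
        path.push_back(i);
        i -= lowbit(i);  // skip past the lowbit(i) elements covered by C[i]
    }
    return path;
}
```

For i = 7 the path is 7, 6, 4: C[7] covers A[7], C[6] covers A[5..6], and C[4] covers A[1..4], so together they cover A[1..7] exactly once.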

II. Single-point update, interval query

This was already covered in the first part, so it is not expanded on here. The code is repeated for reference:

void updata(int i,int k){    // add k at position i
    while(i <= n){
        c[i] += k;
        i += lowbit(i);
    }
}

int getsum(int i){        // sum of A[1..i]
    int res = 0;
    while(i > 0){
        res += c[i];
        i -= lowbit(i);
    }
    return res;
}
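Putting the two functions together, an interval query is just the difference of two prefix sums. The following is one possible self-contained sketch; the struct name `TreeArray` and its method names are assumptions made for this example, and it uses `long long` sums and a member vector instead of the global `c` and `n`.

```cpp
#include <cassert>
#include <vector>

// Illustrative wrapper for single-point update and interval query.
struct TreeArray {
    int n;
    std::vector<long long> c;  // c[0] is unused; indices are 1-based

    explicit TreeArray(int size) : n(size), c(size + 1, 0) {}

    static int lowbit(int x) { return x & -x; }

    // A[i] += k
    void update(int i, long long k) {
        assert(i >= 1 && i <= n);
        for (; i <= n; i += lowbit(i)) c[i] += k;
    }

    // A[1] + ... + A[i]; returns 0 for i == 0
    long long prefixSum(int i) const {
        assert(i >= 0 && i <= n);
        long long res = 0;
        for (; i > 0; i -= lowbit(i)) res += c[i];
        return res;
    }

    // A[l] + ... + A[r], as the difference of two prefix sums
    long long rangeSum(int l, int r) const {
        assert(l >= 1 && l <= r && r <= n);
        return prefixSum(r) - prefixSum(l - 1);
    }
};
```

If A[1..8] is filled with 1, 2, ..., 8 through `update`, then `rangeSum(3, 6)` is 3 + 4 + 5 + 6 = 18, and a single-point query is simply `rangeSum(i, i)`.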

III. Interval update, single-point query

This uses a difference array. In short:

Suppose A is an array of n elements. Define D[0] = A[0] and D[i] = A[i] - A[i-1] for i > 0. Then D is the difference array of A. For example:

  • A[] = 1 2 3 5 6 9
  • D[] = 1 1 1 2 1 3

When the same number x is added to every element of A in the interval [i,j], only two values of D change: D[i] increases by x and D[j+1] decreases by x.

For example, adding 2 to the elements in the interval [2,5] (positions counted from 1 here) gives:

  • A[] = 1 4 5 7 8 9
  • D[] = 1 3 1 2 1 1
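The example can be reproduced with a plain difference array, before any tree array is involved. This sketch is an illustration under stated assumptions: the function names are invented for this example, positions are 1-based to match the [2,5] update, and a[0] is a zero placeholder so that D[1] = A[1] - 0.

```cpp
#include <cassert>
#include <vector>

// d[i] = a[i] - a[i-1] for i in 1..n, with a[0] == 0 as a placeholder.
// One extra slot at n + 1 lets an update ending at n write d[r + 1] safely.
std::vector<long long> buildDiff(const std::vector<long long>& a) {
    assert(!a.empty() && a[0] == 0);
    int n = static_cast<int>(a.size()) - 1;
    std::vector<long long> d(n + 2, 0);
    for (int i = 1; i <= n; ++i) d[i] = a[i] - a[i - 1];
    return d;
}

// Add x to A[l..r] by touching only two entries of D.
void rangeAdd(std::vector<long long>& d, int l, int r, long long x) {
    int n = static_cast<int>(d.size()) - 2;
    assert(l >= 1 && l <= r && r <= n);
    d[l] += x;
    d[r + 1] -= x;
}

// Recover A from D: A[i] is the prefix sum D[1] + ... + D[i].
std::vector<long long> restore(const std::vector<long long>& d) {
    int n = static_cast<int>(d.size()) - 2;
    std::vector<long long> a(n + 1, 0);
    for (int i = 1; i <= n; ++i) a[i] = a[i - 1] + d[i];
    return a;
}
```

Starting from A = 1 2 3 5 6 9, `rangeAdd(d, 2, 5, 2)` changes D from 1 1 1 2 1 3 to 1 3 1 2 1 1, and `restore` then yields 1 4 5 7 8 9, as in the example.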

Accordingly, build a tree array over the difference array D of array A. This gives interval updates and single-point queries.

An interval update becomes two single-point updates, each done in O(logN) time with the tree array's updata(int i, int k).

A single-point query is a prefix sum of the difference array, which the tree array's prefix-sum query computes in O(logN) time.
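A minimal sketch of this combination is shown below. The struct name `RangeAddPointQuery` and its methods are assumptions for illustration; the tree array inside it stores D rather than A, and positions are 1-based.

```cpp
#include <cassert>
#include <vector>

// Tree array over the difference array D: interval update, single-point query.
struct RangeAddPointQuery {
    int n;
    std::vector<long long> c;  // tree array built over D, 1-based

    explicit RangeAddPointQuery(int size) : n(size), c(size + 1, 0) {}

    static int lowbit(int x) { return x & -x; }

    // D[i] += k; does nothing when i == n + 1, which is harmless because
    // no query ever reads past position n.
    void addToD(int i, long long k) {
        for (; i <= n; i += lowbit(i)) c[i] += k;
    }

    // Add x to every A[l..r]: two single-point updates on D.
    void rangeAdd(int l, int r, long long x) {
        assert(l >= 1 && l <= r && r <= n);
        addToD(l, x);
        addToD(r + 1, -x);
    }

    // A[i] = D[1] + ... + D[i], a prefix sum over the tree array.
    long long pointQuery(int i) const {
        assert(i >= 1 && i <= n);
        long long res = 0;
        for (; i > 0; i -= lowbit(i)) res += c[i];
        return res;
    }
};
```

Initial values can be loaded with `rangeAdd(i, i, value)`. After loading 1 2 3 5 6 9 and calling `rangeAdd(2, 5, 2)`, `pointQuery(2)` returns 4 and `pointQuery(6)` still returns 9.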

IV. Interval update, interval query

Think of it this way: let the original array be A. The sum of A over the interval [0,j] is A[0]+A[1]+...+A[j], where A[k] = D[0]+D[1]+...+D[k].

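Expanding each term and collecting the coefficients of D gives:

A[0]+A[1]+...+A[j] = (j+1)*D[0] + j*D[1] + ... + 1*D[j]

= (j+1)*(D[0]+D[1]+...+D[j]) - (1*D[1] + 2*D[2] + ... + j*D[j])

= (j+1)*A[j] - D[1] - 2*D[2] - 3*D[3] - ... - j*D[j]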
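The identity is easy to verify numerically. The sketch below uses two hypothetical helpers and the 0-based indexing of the derivation; it computes the sum directly and through the difference-array form so the two can be compared.

```cpp
#include <cassert>
#include <vector>

// Left side: A[0] + ... + A[j], computed directly.
long long directSum(const std::vector<long long>& a, int j) {
    assert(j >= 0 && j < static_cast<int>(a.size()));
    long long s = 0;
    for (int k = 0; k <= j; ++k) s += a[k];
    return s;
}

// Right side: (j+1)*A[j] - (1*D[1] + 2*D[2] + ... + j*D[j]),
// where D[i] = A[i] - A[i-1] for i > 0.
long long sumViaDiff(const std::vector<long long>& a, int j) {
    assert(j >= 0 && j < static_cast<int>(a.size()));
    long long weighted = 0;
    for (int i = 1; i <= j; ++i) weighted += i * (a[i] - a[i - 1]);
    return (j + 1) * a[j] - weighted;
}
```

For A = 1 2 3 5 6 9 and j = 5, the direct sum is 26, and the second form gives 6*9 - (1 + 2 + 6 + 4 + 15) = 54 - 28 = 26.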

So we build two tree arrays: one over the array whose general term is D[i], and one over the array whose general term is i*D[i]. The first tree array yields A[j] = D[0]+D[1]+...+D[j], and hence the first term (j+1)*A[j], in O(logN) time; the second yields the remaining terms 1*D[1]+...+j*D[j] in O(logN) time. So the sum of A over the interval [0,j] takes only O(logN) time.

Likewise, an interval update of A changes only two elements of D, so each of the two tree arrays needs only two single-point updates, and the time complexity is still O(logN).
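One way to put this into code is sketched below. The struct and member names are assumptions for this example, and positions are 1-based, so the prefix formula reads A[1]+...+A[j] = (j+1)*(D[1]+...+D[j]) - (1*D[1]+...+j*D[j]), which has the same shape as the 0-based derivation.

```cpp
#include <cassert>
#include <vector>

// Two tree arrays over the difference array: interval update, interval query.
struct RangeAddRangeSum {
    int n;
    std::vector<long long> d;   // tree array over D[i]
    std::vector<long long> id;  // tree array over i * D[i]

    explicit RangeAddRangeSum(int size)
        : n(size), d(size + 1, 0), id(size + 1, 0) {}

    static int lowbit(int x) { return x & -x; }

    // D[i] += k, mirrored into both tree arrays (no-op when i == n + 1).
    void addToD(int i, long long k) {
        for (int p = i; p <= n; p += lowbit(p)) {
            d[p] += k;
            id[p] += k * i;
        }
    }

    // Add x to every A[l..r]: two single-point updates on D.
    void rangeAdd(int l, int r, long long x) {
        assert(l >= 1 && l <= r && r <= n);
        addToD(l, x);
        addToD(r + 1, -x);
    }

    // A[1] + ... + A[j] = (j+1) * sum(D[1..j]) - sum(i * D[i], i = 1..j)
    long long prefixSum(int j) const {
        assert(j >= 0 && j <= n);
        long long sumD = 0, sumID = 0;
        for (int p = j; p > 0; p -= lowbit(p)) {
            sumD += d[p];
            sumID += id[p];
        }
        return (j + 1) * sumD - sumID;
    }

    // A[l] + ... + A[r]
    long long rangeSum(int l, int r) const {
        assert(l >= 1 && l <= r && r <= n);
        return prefixSum(r) - prefixSum(l - 1);
    }
};
```

Loading 1 2 3 5 6 9 with `rangeAdd(i, i, value)` gives `rangeSum(1, 6)` = 26. After `rangeAdd(2, 5, 2)` the array is 1 4 5 7 8 9, so `rangeSum(2, 5)` = 24 and `rangeSum(1, 6)` = 34.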

If there are any mistakes, please point them out. Polite discussion is welcome. Thank you very much.

References: https://www.cnblogs.com/xenny/p/9739600.html

Origin blog.csdn.net/fly_view/article/details/129816579