Part.I Preliminary knowledge
Reference:
Simple and easy-to-understand detailed explanation of tree arrays
Chap.I Some premises and concepts
- The binary representation of negative numbers in the computer
- Prefix sum: the prefix sum at an index is the sum of all array elements up to and including that index. Prefix sums come in one-dimensional and two-dimensional forms. They are an important preprocessing step that can reduce the time complexity of an algorithm. For example, the formula for the one-dimensional prefix sum is:
sum[i] = sum[i-1] + arr[i];
where sum is the prefix-sum array and arr is the original array. With the prefix-sum array, we can find any interval sum in O(1) time.
- Suffix sum: defined analogously, as the sum of all array elements from an index (including itself) to the end of the array.
- Discretization: map a finite number of items from an infinite space into a finite space to improve the time and space efficiency of an algorithm. In plain terms, discretization shrinks the data without changing its relative order. It can be applied when only the relative order of the values matters and their specific magnitudes do not. For example, given the four numbers 1234567, 123456789, 12345678, 123456, we sort them first: 123456 < 1234567 < 12345678 < 123456789 → 1 < 2 < 3 < 4; so the original data can be mapped to 2, 4, 3, 1.
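The two ideas above can be sketched in a few lines of Python. This is only an illustration: the function names `prefix_sums`, `range_sum`, and `discretize` are made up here, and the prefix array is shifted by one position (`sums[0] = 0`) so that the interval formula needs no special case.

```python
from itertools import accumulate

def prefix_sums(arr):
    """Return sums with sums[i] = arr[0] + ... + arr[i-1], so sums[0] = 0."""
    return [0] + list(accumulate(arr))

def range_sum(sums, left, right):
    """Sum of arr[left..right] (0-based, inclusive) in O(1)."""
    return sums[right + 1] - sums[left]

def discretize(values):
    """Map each value to its 1-based rank among the distinct values."""
    ranks = {v: r for r, v in enumerate(sorted(set(values)), start=1)}
    return [ranks[v] for v in values]

sums = prefix_sums([1, 2, 3, 4, 5])
print(range_sum(sums, 1, 3))   # 2 + 3 + 4 = 9
print(discretize([1234567, 123456789, 12345678, 123456]))  # [2, 4, 3, 1]
```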
Chap.II The lowbit function
Setting aside its purpose for now, first understand how this function is computed. As the name suggests, lowbit finds the lowest 1 in the binary representation of a number and returns the value of that bit. For example, if x = 6, its binary representation is 110, so lowbit(x) returns 2, because the lowest 1 has the value 2.
How do we compute lowbit? There are generally two ways:
- First clear the lowest 1 with x & (x - 1) (subtracting 1 does not affect the bits to the left of the lowest 1), then subtract the result from the original number: x - (x & (x - 1)). For example, for x = 12 the 8-bit binary representation is 00001100, x - 1 is 00001011, and x & (x - 1) is 00001000, so x - (x & (x - 1)) is 00000100, which is the lowbit we want.
- Based on how computers represent negative numbers (two's complement), take the bitwise AND of the number and its negation: x & -x. For example, for x = 12 the 8-bit binary representation is 00001100, -x is 11110100, and x & -x is 00000100, which is the lowbit we want.
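As a quick check, here are both methods in Python. The function names `lowbit_clear` and `lowbit_neg` are invented for this sketch. Python integers are not fixed-width, but `&` on a negative number behaves like two's complement, so `x & -x` works the same way.

```python
def lowbit_clear(x: int) -> int:
    """Method 1: clear the lowest 1 with x & (x - 1), then subtract."""
    return x - (x & (x - 1))

def lowbit_neg(x: int) -> int:
    """Method 2: AND the number with its two's-complement negation."""
    return x & -x

for x in (6, 12, 24):
    # columns: x, 8-bit binary form, method 1, method 2
    print(x, format(x, "08b"), lowbit_clear(x), lowbit_neg(x))
```

Both columns agree for every positive x; for 6, 12, and 24 the values are 2, 4, and 8.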
Part.II Tree array
A tree array (also known as a binary indexed tree, BIT, or Fenwick tree) is a data structure. Why construct such a data structure? Because it has unique advantages for certain problems. Consider this problem: there is an array a of length n, and we want to perform two kinds of operations on it: "query" (find the sum of all elements in some interval) and "update" (change the value of one element). Now suppose we perform q updates and q queries, and the updates and queries are interleaved!
If we use the plain array, each "update" takes O(1) time (we change the i-th value directly with a[i]=value), while each "query" takes O(n) time (summing up to n numbers requires a loop of length up to n);
If we use a tree array, the time complexity of each "query" drops to O(log n), but the time complexity of each "update" also becomes O(log n). Why? The reasons are analyzed in detail below.
Chap.I The idea of the tree array
[Figure: tree array structure diagram, from Zhihu @orangebird]
[Figure: alternative tree array structure diagram, from CSDN@FlushHip]
The tree array structure (so called because its structure resembles a tree while it is stored as an array) is based on binary. The figures above convey the idea clearly, but why is the array divided in this way? This is where the lowbit from above comes in. Consider an array a of length 8; the new array is called c, and c is obtained from a through the organization shown in the figures above. Concretely, with 1-based indices, c[i] stores the sum of the lowbit(i) elements of a that end at index i, that is, a[i-lowbit(i)+1] through a[i].
- Query: for example, to find sum(1:7), note that the binary representation of 7 is 111, and $\sum\limits_{i=1}^{7}{a_i}=(a_1+a_2+a_3+a_4)+(a_5+a_6)+a_7$. In pseudocode (with binary array subscripts) this is sum(001:111)=c[111]+c[110]+c[100], that is, sum(1:7)=c[7]+c[7-lowbit(7)]+c[6-lowbit(6)]. The number of terms is about $\lceil \log_2(n) \rceil$, so the time complexity is O(log n).
- Update: for example, to change the value of a[3], note that the binary representation of 3 is 0011; for an array of length 8, I need to update c[3], c[4], c[8]; in binary subscripts, c[0011], c[0100], c[1000]; that is, c[3], c[3+lowbit(3)], c[4+lowbit(4)]. Clearly, its time complexity is also O(log n).
The above explains why the time complexity of both the "query" and "update" operations of the tree array is O(log n).
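To make the two walks concrete, the following sketch only records which 1-based indices of c are touched. The helper names `query_path` and `update_path` are hypothetical and exist just for this trace.

```python
def lowbit(x: int) -> int:
    return x & -x

def query_path(m: int) -> list:
    """1-based indices of c that are added up to obtain sum(1:m)."""
    path = []
    while m > 0:
        path.append(m)
        m -= lowbit(m)   # jump to the block that ends just before this one
    return path

def update_path(i: int, n: int) -> list:
    """1-based indices of c that change when a[i] changes (array length n)."""
    path = []
    while i <= n:
        path.append(i)
        i += lowbit(i)   # jump to the next block that covers index i
    return path

print(query_path(7))      # [7, 6, 4]
print(update_path(3, 8))  # [3, 4, 8]
```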
I once had a question: why not keep both a and c? To do an update, operate directly on a, with time complexity O(1); to do a query, operate on c, with time complexity O(log n). Note, however, that "query" and "update" operations are interleaved. After an "update" is applied to a, c must be rebuilt before a subsequent "query" can reflect the new information, and rebuilding c takes O(n). Done this way, the optimization optimizes nothing.
Chap.II Construction of Tree Array
Based on the discussion above, construct a class that contains the following functions:
- lowbit: get the lowbit of an integer
- BIT: constructor, initialized from a vector<int>
- update: update function, adds val to the i-th number (i>0)
- query: query function, returns the sum of the first m numbers
- print: outputs tree
#include <cstdio>
#include <iostream>
#include <vector>
using namespace std;

class BIT {
private:
    int n;             // the length of the tree
    vector<int> tree;  // the data tree; tree[i-1] holds node i (1-based)
public:
    int lowbit(int x) {
        return x & -x;
    }
    BIT(const vector<int>& a)
    {
        n = a.size();
        tree.assign(n, 0);
        for (int i = 0; i < n; i++)
        {
            update(i + 1, a[i]);
        }
    }
    /**
     * @brief update the tree array
     * @param[in] i   the index, >=1
     * @param[in] val the value of the update, =now-origin
     * @return none
     */
    void update(int i, int val)
    {
        for (; i <= n; tree[i - 1] += val, i += lowbit(i));
    }
    /**
     * @brief query the sum of the first m terms
     * @param[in] m the index, >=1
     * @return the sum
     */
    int query(int m)
    {
        int sum = 0;
        for (; m > 0; sum += tree[m - 1], m -= lowbit(m));
        return sum;
    }
    void print()
    {
        for (int i = 0; i < n; cout << tree[i] << " ", i++);
        cout << endl;
    }
};
Call example:
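int main()
{
    int test[7] = {1, 2, 3, 4, 5, 6, 7};
    vector<int> origin(test, test + 7);
    BIT bt(origin);
    bt.print();                   // print the contents of tree
    cout << bt.query(5) << endl;  // output the sum of the first 5 terms
    bt.update(3, 6);              // add 6 to the 3rd term
    bt.print();                   // print the contents of tree after the update
    cout << bt.query(5) << endl;  // output the sum of the first 5 terms after the update
    getchar();
    return 0;
}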
// ----------------- output ------------------
1 3 3 10 5 11 7
15
1 3 9 16 5 11 7
21
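The problem statement in Part.II asks for the sum over an interval, while the class only returns prefix sums. One way to bridge the gap is to subtract two prefix queries. The Python sketch below is a hypothetical counterpart of the class (the name `Fenwick` and the method `range_sum` are assumptions of this example). Unlike the C++ version, it keeps an unused slot at index 0 so that the 1-based indices map directly onto the list.

```python
class Fenwick:
    """Illustrative Python counterpart of the BIT class above."""

    def __init__(self, values):
        self.n = len(values)
        self.tree = [0] * (self.n + 1)   # index 0 is unused
        for i, v in enumerate(values, start=1):
            self.update(i, v)

    def update(self, i, delta):
        """Add delta to the i-th element (1-based)."""
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def query(self, m):
        """Sum of the first m elements."""
        total = 0
        while m > 0:
            total += self.tree[m]
            m -= m & -m
        return total

    def range_sum(self, left, right):
        """Sum of elements left..right (1-based, inclusive)."""
        return self.query(right) - self.query(left - 1)

ft = Fenwick([1, 2, 3, 4, 5, 6, 7])
print(ft.query(5))         # 15
print(ft.range_sum(3, 5))  # 3 + 4 + 5 = 12
ft.update(3, 6)
print(ft.range_sum(3, 5))  # 9 + 4 + 5 = 18
```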
Part.III Application of tree array
- LeetCode: 2426. Number of Pairs That Satisfy the Inequality
- Jianzhi Offer 51. Reverse pairs in an array
Chap.I LeetCode: 2426. Number of Pairs That Satisfy the Inequality
That's right: I wrote this note precisely because I ran into problem 2426 while practicing, and so I finally bared my fangs (RUA!!).
Sec.I Problem description and analysis
First, the problem description:
You are given two 0-indexed integer arrays nums1 and nums2, each of size n, together with an integer diff. Count the pairs (i, j) that satisfy the following conditions:
- 0 <= i < j <= n - 1, and
- nums1[i] - nums1[j] <= nums2[i] - nums2[j] + diff
Return the number of pairs that satisfy the conditions.
Problem-solving video: bilibili@林茶山艾府
Problem analysis (based on Python):
- First, rearrange the inequality: nums1[i] - nums2[i] <= nums1[j] - nums2[j] + diff. Letting nums[i] = nums1[i] - nums2[i], we only need to find all pairs (i, j) with 0 <= i < j <= n - 1 that satisfy nums[i] <= nums[j] + diff.
- Because nums will inevitably contain elements with the same value, we deduplicate it with set and then sort the result to obtain b.
- Discretization: construct a tree array bt (all elements initialized to 0) whose length equals the number of distinct elements in nums, len(set(nums)). This is equivalent to dividing nums into that many "grades" (ignoring the magnitude of the data and caring only about its relative order is exactly discretization), and the tree array records how many values fall into each grade. The tree array has two main functions. One is add(x), which adds one to the count at index x; the count here corresponds to the array a above, but the tree array stores c, so more than one element needs to be changed. The other is query(x), which returns the sum of the counts at all indices less than or equal to x.
- We traverse nums with a pointer i and fill the tree array bt along the way, so that at each step the tree array holds the number of elements of each "grade" to the left of x=nums[i]. We first use index=bisect_right(b, x + diff) to find in b the smallest index whose element is greater than x+diff, which is also the number of grades less than or equal to x+diff. We then use query(index) to count the elements to the left of nums[i] that are less than or equal to x+diff (that is, the number of m that satisfy nums[m] <= nums[i] + diff and m<i).
- Then use index2=bisect_left(b, x) to find the index of x in b (that is, the "grade" of x), and call add(index2+1) to add x to the tree array in preparation for the next iteration. The +1 is needed because the tree array uses indices starting from 1.
- Summing all of the query(index) results gives the answer.
Note that although this problem uses a tree array, the array stores the number of elements, not the element values. In addition, the tree array is not built all at once; it is built up gradually as elements are added while traversing and querying. With these two points in mind, the video explanation should be easy to follow. I have tried to lay out the idea as clearly as I can, but looking back it is still a bit of a mouthful orz
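Before bringing in the tree array, it may help to see the same counting-by-grade idea with a plain list of counts. The sketch below is a deliberately simplified stand-in, not the final solution: the function name `count_pairs_with_grades` is made up, and `sum(counts[:index])` takes linear time, which is exactly the step the tree array speeds up to O(log n).

```python
from bisect import bisect_left, bisect_right

def count_pairs_with_grades(nums1, nums2, diff):
    nums = [a - b for a, b in zip(nums1, nums2)]
    b = sorted(set(nums))        # distinct values, one "grade" each
    counts = [0] * len(b)        # counts[g] = earlier values with grade g
    ans = 0
    for x in nums:
        index = bisect_right(b, x + diff)   # number of grades <= x + diff
        ans += sum(counts[:index])          # earlier values <= x + diff
        counts[bisect_left(b, x)] += 1      # record the grade of x
    return ans

print(count_pairs_with_grades([3, 2, 5], [2, 2, 1], 1))   # 3
print(count_pairs_with_grades([3, -1], [-2, 2], -1))      # 0
```

For the first input, nums is [1, 0, 4] and b is [0, 1, 4]; the three steps contribute 0, 1, and 2 pairs.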
Sec.II Code Implementation
The following is the C++ implementation:
#include <algorithm>
#include <vector>
using namespace std;

class BIT {
private:
    int length = 0;
    vector<int> tree;
public:
    BIT(int n)
    {
        length = n;
        tree.assign(n, 0);
    }
    int lowbit(int x) {
        return x & -x;
    }
    void add(int i)
    {
        // i=index+1,>=1
        while (i <= length) {
            tree[i - 1]++;
            i = i + lowbit(i);
        }
    }
    int query(int i)
    {
        // i = number of leading grades to sum, >=0
        int sum = 0;
        while (i > 0) {
            sum += tree[i - 1];
            i -= lowbit(i);
        }
        return sum;
    }
};
class Solution {
public:
    long long numberOfPairs(vector<int>& nums1, vector<int>& nums2, int diff) {
        int n = nums1.size();
        vector<int> nums(n, 0);
        for (int i = 0; i < n; i++) {
            nums[i] = nums1[i] - nums2[i];
        }
        vector<int> b(nums);
        sort(b.begin(), b.end());
        b.erase(unique(b.begin(), b.end()), b.end());
        BIT bt(b.size());
        long long ans = 0;  // long long: plain long is only 32 bits on some platforms
        for (int i = 0; i < n; i++)
        {
            // count earlier values that are <= nums[i] + diff
            ans += bt.query(upper_bound(b.begin(), b.end(), nums[i] + diff) - b.begin());
            // record the 1-based grade of nums[i]
            bt.add(lower_bound(b.begin(), b.end(), nums[i]) - b.begin() + 1);
        }
        return ans;
    }
};
Noteworthy points:
- upper_bound(b.begin(),b.end(),val) finds, in the container b (whose data is already sorted), the iterator (which can be thought of as a pointer) to the first element whose value is greater than val, or b.end() if there is no such element. *upper_bound(xx) is the value of that element, and upper_bound(xx)-b.begin() is its index.
- lower_bound(b.begin(),b.end(),val) finds, in the container b (whose data is already sorted), the iterator to the first element whose value is greater than or equal to val; it is otherwise used in the same way as upper_bound.
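Python's standard `bisect` module provides the counterparts of these two functions, returning indices directly rather than iterators: `bisect_left` matches lower_bound and `bisect_right` matches upper_bound. A small sketch on an example list (the values are chosen only for illustration):

```python
from bisect import bisect_left, bisect_right

b = [0, 1, 4]                 # sorted and deduplicated, as in the solution

left_hit = bisect_left(b, 1)      # index of the first element >= 1
right_hit = bisect_right(b, 1)    # index of the first element > 1
left_miss = bisect_left(b, 2)     # 2 is absent: both functions agree
right_miss = bisect_right(b, 2)
past_end = bisect_right(b, 9)     # nothing is greater: returns len(b)

print(left_hit, right_hit)    # 1 2
print(left_miss, right_miss)  # 2 2
print(past_end)               # 3
```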
The following is the Python implementation:
from bisect import bisect_left, bisect_right
from typing import List

class BIT:
    def __init__(self, n: int):
        self.length = n
        self.tree = [0] * n

    def add(self, i: int):
        # i = index + 1, >= 1
        while i <= self.length:
            self.tree[i - 1] += 1
            i += i & -i

    def query(self, i: int) -> int:
        total = 0
        while i > 0:
            total += self.tree[i - 1]
            i -= i & -i
        return total

class Solution:
    def numberOfPairs(self, nums1: List[int], nums2: List[int], diff: int) -> int:
        n = len(nums1)
        nums = [0] * n
        for i in range(n):
            nums[i] = nums1[i] - nums2[i]
        b = sorted(set(nums))
        bt = BIT(len(b))
        ans = 0
        for i in range(n):
            ans += bt.query(bisect_right(b, nums[i] + diff))
            bt.add(bisect_left(b, nums[i]) + 1)
        return ans
Chap.II Jianzhi Offer 51. Reverse pairs in an array
This is a classic problem; after all, it is included in "Jianzhi Offer". It is very similar to the one above, but simpler. So I will not analyze it here and will just post a solution.
Here is the Python code:
from bisect import bisect_left
from typing import List

class Solution:
    def reversePairs(self, nums: List[int]) -> int:
        b = sorted(set(nums))
        ans = 0
        n = len(b)
        bt = BIT(n)
        for x in nums:
            # rank in descending order: the largest value gets rank 1
            temp = n - bisect_left(b, x)
            # earlier values with a smaller rank are strictly larger than x
            ans += bt.query(temp - 1)
            bt.add(temp)
        return ans

class BIT:
    def __init__(self, n: int):
        self.length = n
        self.tree = [0] * n

    def add(self, i: int):
        while i <= self.length:
            self.tree[i - 1] += 1
            i += i & -i

    def query(self, i: int) -> int:
        total = 0
        while i > 0:
            total += self.tree[i - 1]
            i -= i & -i
        return total
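Since no analysis is given for this problem, a cross-check against a direct O(n^2) count can build some confidence in the approach. The sketch below is self-contained and inlines the tree array as a plain list; the function names `count_inversions_bit` and `count_inversions_bruteforce` are assumptions of this example, not part of the solution above.

```python
from bisect import bisect_left

def count_inversions_bit(nums):
    b = sorted(set(nums))
    n = len(b)
    tree = [0] * (n + 1)               # 1-based tree array, index 0 unused
    ans = 0
    for x in nums:
        rank = n - bisect_left(b, x)   # descending rank, largest value = 1
        m = rank - 1
        while m > 0:                   # count earlier values strictly larger than x
            ans += tree[m]
            m -= m & -m
        i = rank
        while i <= n:                  # record x
            tree[i] += 1
            i += i & -i
    return ans

def count_inversions_bruteforce(nums):
    n = len(nums)
    return sum(1 for i in range(n) for j in range(i + 1, n) if nums[i] > nums[j])

print(count_inversions_bit([7, 5, 6, 4]))         # 5
print(count_inversions_bruteforce([7, 5, 6, 4]))  # 5
```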