Heap Evolutionary Journey 4: Fibonacci Heap


This branch of the heap's evolutionary journey covers the Fibonacci heap.
Written to record a wonderful talk from an algorithms seminar.
Throughout, heaps follow min-heap order: the key of a parent node is smaller than the keys of its children.

Fibonacci Heap

Let's first summarize the time complexities of the three data structures mentioned earlier. If, for each operation, we take the best complexity among them, then push needs only O(1), pop_min needs only O(log n), and decrease_key needs only O(1). How nice it would be to have a single data structure that combines all of these advantages. Does such a data structure exist?
Reading the literature, we learn that Michael L. Fredman (together with Robert E. Tarjan) creatively realized this idea in 1987 with the Fibonacci heap, a design whose roots go back to the binomial heap proposed by Jean Vuillemin in 1978.

Time complexity


| Operation    | Linked List | Binary Heap | Binomial Heap  | Fibonacci Heap     |
|--------------|-------------|-------------|----------------|--------------------|
| Push         | O(1)        | O(log n)    | O(1) amortized | O(1) amortized     |
| Pop_min      | O(n)        | O(log n)    | O(log n)       | O(log n) amortized |
| Decrease_key | O(1)        | O(log n)    | O(log n)       | O(1) amortized     |

Push

Compared with the binomial heap, the Fibonacci heap abandons merging during the push and decrease_key stages and defers all merge work to pop_min, which is why a Fibonacci heap and a binomial heap generally have different shapes. This is the lazy principle: instead of merging as soon as two trees can be merged, the Fibonacci heap waits until enough trees have piled up and then cleans them up in one pass, so the apparent laziness actually improves overall efficiency. A push is just an O(1) insertion plus an O(1) min-pointer update, so the total cost of push is still O(1).
Push: the heap carries a pointer to the minimum root node. Vertex 5 is inserted as a single-node tree at the end of the root list and compared against the old minimum root to update the pointer; the same happens for vertex 2.
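To make the laziness concrete, here is a minimal Python sketch of push, under some simplifying assumptions of my own: the root list is a plain Python list rather than the circular doubly linked list a real implementation uses for O(1) splicing, and the class and field names (Node, FibHeap, roots, loser) are illustrative, not taken from the paper.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.parent = None
        self.children = []   # child subtrees
        self.loser = False   # the mark used later by decrease_key


class FibHeap:
    def __init__(self):
        self.roots = []      # root list; no structure maintained here
        self.min = None      # pointer to the minimum root

    def push(self, key):
        """Lazy O(1) insert: add a single-node tree, never merge."""
        node = Node(key)
        self.roots.append(node)                 # O(1) insertion
        if self.min is None or key < self.min.key:
            self.min = node                     # O(1) min-pointer update
        return node


# Matching the figure: insert 5, then 2; the min pointer ends at vertex 2.
h = FibHeap()
h.push(5)
h.push(2)
assert h.min.key == 2
```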

Pop_min

Delete

Promote children

Now look at the pop_min operation: after the minimum element is popped via the min pointer, its child nodes are handled just as in the binomial heap and promoted directly into the root list. Then the root list is traversed, and whenever two trees of the same degree are found, they are merged.
Pop_min: after deleting the minimum root node 1, its child subtrees rooted at 4 and 3 are raised into the root list; then trees of equal degree are merged: first vertices 7 and 3, then the trees rooted at 4 and 3, and finally vertices 5 and 2.

Merge

Let's take a closer look at how this merge step is actually performed in the paper. The paper introduces a root-node array, root_array, whose index represents the degree of a tree; the array works like a row of buckets, each with capacity 1. The trees are put into the buckets one by one, and whenever two trees land in the same bucket, that is, their roots have the same degree, they are merged immediately. When everything has been merged, the surviving trees are moved back to the root list.
Traverse the root list from the beginning. The tree rooted at vertex 7 has degree 0, so it is thrown into bucket 0; the next tree, of degree 1, goes into bucket 1. When the tree rooted at vertex 3 arrives, bucket 0 is already full, which means two trees have the same degree, so they are merged: compare the root keys and make the smaller one the new parent. After the traversal only two trees remain; they are moved back to the root list and the merge is complete.
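Here is a sketch of this bucket merge (often called consolidate) in the same simplified Python style; the dict `buckets` stands in for the paper's root_array, and the names are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.children = []


def link(a, b):
    """Merge two trees of equal degree; the smaller root becomes the parent."""
    if b.key < a.key:
        a, b = b, a
    a.children.append(b)      # the winner's degree grows by exactly one
    return a


def consolidate(roots):
    """Throw each tree into bucket[degree]; merge immediately on collision."""
    buckets = {}              # degree -> tree; each bucket holds at most one
    for tree in roots:
        degree = len(tree.children)
        while degree in buckets:                    # bucket already full:
            tree = link(tree, buckets.pop(degree))  # equal degrees, so merge
            degree = len(tree.children)
        buckets[degree] = tree
    return list(buckets.values())   # survivors go back to the root list
```

After consolidation every remaining root has a distinct degree, which is why at most d_max + 1 trees can survive.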

Decrease key

Finally, let's look at the decrease_key operation on a Fibonacci heap. After a node's key is updated, if heap order is violated, that is, the parent's key is now larger than the child's, the node does not bubble upward as in a binomial heap; instead, its subtree is split off from the original tree and attached directly to the root list.
Decrease key: 0 -> 1: the key of vertex 5 is decreased to 3, which still respects heap order, so nothing needs to be done. 1 -> 2: 3 is decreased to 0, so vertex 0 is dropped onto the root list. 2b -> 3: the key of vertex 7 is decreased to 1, and the tree rooted at the new vertex 1 is thrown onto the root list (the brown marks will be explained shortly, together with 3c).

Another feature of the Fibonacci heap that Michael L. Fredman proposed in 1987 appears in this picture. As mentioned earlier, when a root is popped in pop_min, both the Fibonacci heap and the binomial heap attach its children directly to the root list. But if cutting were done carelessly, an extremely flat tree could arise, like the one on the white background in the figure: its root has far too many children, and promoting them all would take a long time. To prevent this, the Fibonacci heap uses a distinctive mark-the-loser operation, carried out during decrease_key. Rule 1: when a node loses a child, that is, when the child is cut off and attached to the root list, the parent is marked as a loser. (You might well agree that a parent who loses a child has indeed lost something.) Rule 2: when a parent already marked as a loser loses another child, that is, when it has lost two children, it has no face left in the tree: its entire subtree is cut off and attached to the root list as well. These two rules, marking losers and cutting nodes, prevent flat trees, so a root can never accumulate too many children, which saves a great deal of time. Without this operation, the amortized complexity of the 1987 data structure would not improve nearly as much as it does. The figure below gives an example demonstration.
First, vertices 5 and 8 have already been marked as losers. The key of vertex 9 is decreased, and the resulting vertex 6 is thrown onto the root list. This is the second child that vertex 8 loses, so vertex 8 is also dropped onto the root list and its loser mark is cleared. Vertex 5 has now likewise lost its second child, so the operation repeats, until a normal vertex is reached: 4, which is marked as a loser, and the cascade stops.
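The two rules in Python, continuing the simplified sketches above (this assumes the Node fields key/parent/children/loser and the heap fields roots/min introduced in the push sketch; it illustrates the rules, it is not the paper's code):

```python
def cut(heap, node):
    """Detach node from its parent and drop it onto the root list."""
    node.parent.children.remove(node)
    node.parent = None
    node.loser = False                 # roots are never marked
    heap.roots.append(node)


def decrease_key(heap, node, new_key):
    node.key = new_key
    parent = node.parent
    if parent is not None and new_key < parent.key:   # heap order broken
        cut(heap, node)
        # Rule 2: a parent already marked as a loser that loses a second
        # child is cut as well; the cuts cascade up to an unmarked node.
        while parent.parent is not None and parent.loser:
            grandparent = parent.parent
            cut(heap, parent)
            parent = grandparent
        if parent.parent is not None:
            parent.loser = True        # Rule 1: first lost child -> mark
    if new_key < heap.min.key:
        heap.min = node
```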

Amortized time

You may have noticed the word "amortized" in the earlier complexity table. It refers to amortized analysis, which helps us bound an algorithm's complexity more precisely. Previously we took the worst case of every single operation to obtain a theoretical upper bound, which often deviates considerably from the actual upper bound, because a sequence of operations is usually internally related and cannot all run in the worst case. In amortized analysis, the time required for a sequence of data structure operations is averaged over all the operations performed, so the theoretical bound on the cost comes much closer to the actual one: even if a single operation is expensive, the average cost over the sequence can still be small.

In practice, amortized analysis is carried out with aggregate analysis, the accounting method, or the potential method; for the Fibonacci heap complexity analysis we use the potential method. Since each operation may have a different amortized cost, we introduce a potential function Φ(D_i) giving the potential of the data structure after the i-th operation. The amortized cost is defined by a_i = c_i + Φ(D_i) − Φ(D_{i−1}), where a_i is the amortized cost and c_i the actual cost, and summing over n operations gives ∑a_i = ∑c_i + Φ(D_n) − Φ(D_0). Here Φ(D_n) − Φ(D_0) is the total potential difference: extra cost we charge in advance and release when needed. It is as if we hand some operations more money than they need, and the surplus later subsidizes an operation that cannot pay its own way; in the end the total money spent is certainly no more than the worst-case total, while still covering the actual need. Different potential functions may produce different amortized costs, but each of them is an upper bound on the actual cost; the best choice of potential function depends on the bound one wants to prove.
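Written out, the definition and the telescoping sum are:

```latex
a_i = c_i + \Phi(D_i) - \Phi(D_{i-1}), \qquad
\sum_{i=1}^{n} a_i \;=\; \sum_{i=1}^{n} c_i + \Phi(D_n) - \Phi(D_0)
\;\ge\; \sum_{i=1}^{n} c_i \quad \text{whenever } \Phi(D_n) \ge \Phi(D_0),
```

so the total amortized cost bounds the total actual cost as long as the potential never ends below where it started.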
Having understood what amortized analysis means, let's see what it looks like for a Fibonacci heap. The potential function here is Φ = (number of root nodes) + 2 × (number of loser-marked nodes).

Pop_min

(1) Delete the root node and promote its children. The time here is spent mainly on promoting the children, and since there are at most d_max of them, the actual time is O(d_max). As for the potential, the number of roots changes: the child subtrees were not counted before, and now they all sit in the root list, so the root count rises by at most d_max. Substituting into the formula bounds this part's amortized cost by O(d_max).
(2) Suppose there are t trees at the start of the merge phase, m merges are performed, and d trees remain at the end. Traversing the root list costs O(t); each merge costs only O(1) and there are m of them, so merging costs O(m); finally, moving the surviving trees out of the buckets and back to the root list costs O(d). Note the relation t = m + d: each merge reduces the number of trees by exactly one, so m + d stays constant. For the potential, the relevant change is in the number of roots: there were t trees and now there are only d, so the potential drops by t − d = m, which is precisely what pays for the merges.
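Combining the two parts with Φ = #roots + 2 × #losers, and scaling one unit of potential so that it pays for O(1) work, a compact version of the calculation (my arrangement, not a figure from the paper) is:

```latex
\begin{aligned}
\text{(1) promote children:}\quad & c = O(d_{\max}), \qquad \Delta\Phi \le d_{\max},\\
\text{(2) traverse, merge, clean up:}\quad & c = O(t + m + d), \quad \Delta\Phi = d - t = -m,\\[2pt]
\text{total:}\quad & a = O(d_{\max} + t + m + d) - m = O(d_{\max}) = O(\log n),
\end{aligned}
```

using t = m + d, d ≤ d_max + 1 (distinct degrees after merging), and the bound d_max = O(log n) proved in the Fibonacci-number section below.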

Note that in decrease_key, after the node's key is updated, its ancestors may be cut one after another: this is the cascade. The extra cost it generates is covered exactly by the potential difference computed by our potential function: potential deposited earlier, when nodes were marked as losers, is released here to pay for the extra time.
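Concretely, following the standard potential accounting: if one decrease_key triggers c cuts in total, then c new roots appear, at least c − 1 loser marks are cleared, and at most one new mark is set, so

```latex
\begin{aligned}
c_i &= O(c),\\
\Delta\Phi &\le c + 2\bigl(-(c-1) + 1\bigr) \;=\; 4 - c,\\
a_i &= O(c) + 4 - c \;=\; O(1).
\end{aligned}
```

The c − 1 marks cleared during the cascade are exactly the stored potential being released to pay for the extra cuts.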
With these calculations, the Fibonacci heap indeed improves on its predecessors: some operations fall from O(log n) to O(1), others from O(n) to O(log n), a substantial gain in efficiency.

One question remains: why is the Fibonacci heap called the Fibonacci heap? Its naming is actually very similar in spirit to the binomial heap's. Consider a tree of degree k in a Fibonacci heap: how many nodes must it have at least? A rough check of the figure suggests that a degree-k tree in a Fibonacci heap has at least as many nodes as the corresponding number in the Fibonacci sequence.
Here F_{d+2} denotes the (d+2)-th Fibonacci number, and φ is the golden ratio (1+√5)/2.
Let us now prove this rigorously from a mathematical point of view.
First we introduce a "grandchildren" theorem; there are two ways to see it:
① Just follow the picture (skip the text below if the picture is enough). Suppose x is the root of a tree in the heap. A tree of degree 0 obviously has 0 children; call it y1. Two y1 trees merge into a tree of degree 1; after a decrease_key it may become a loser with one child cut off, leaving it 0 children again; call that y2. Continuing, y3's children are y1 and y2, so it keeps at least 1 child, and in general yd's children are y1 through y(d−1). By induction, yd keeps at least d − 2 children. Summarizing the "grandchildren" theorem: for a root of degree d, label its children 1 to d in the order they were attached; then for 1 ≤ i ≤ d, child i has at least i − 2 children of its own, that is, grandchildren of the root.
② Note that a merge only ever combines two trees of equal degree. For a vertex x, number its children y1, y2, …, yd in the order in which they became children of x. When y1 was linked under x, x had no other children, so y1 had degree 0 at that moment. When y2 was linked under x, x already had one child (y1), so y2 also had one child at that moment; but a later decrease_key may have cut away at most one of y2's children (a second loss would have cut y2 itself), so y2 may now have as few as 0 children, i.e. its degree is ≥ 0. Continuing in the same way for each yi, which had degree i − 1 when linked and can have lost at most one child since, yields the property above.
With the "grandchildren" theorem, we use this to find at least a few nodes in the tree of the root node of degree d, and use Nd to represent this value: it has d child nodes, numbered 1 to d, then Nd The sum of points is equal to 1 representing the root node plus all the nodes in the subtree of each child node. A total of d subtrees, the number of nodes from N0 to Nd-2, the minimum number of nodes of the i-th child node is equal to the number of grandchildren = i-2, and finally Nd=1+N0+N0+……+Nd-3+Nd- 2. The similar Nd-1=1+N0+N0+……+Nd-3, so Nd=Nd-1+Nd-2, which is the same as the recurrence formula of the Fibonacci sequence, so this strange data structure It's called a Fibonacci pile.Insert picture description here

Thoughts

Sometimes it pays to let the mess build up:
pile up the small miscellaneous chores and deal with them all together when there is time.

Your parents want lots of grandchildren:
if you lose a child of your own, be careful that your parents don't disown you.

Next: Relaxed Heap


Origin: blog.csdn.net/weixin_44092088/article/details/106121404