Tree structure and heap

Tree structures can improve the operational performance of priority queues.

1. Linear and tree structure

First, analyze why the linear implementations are inefficient:

  • Insertion in sorted order is inefficient because the insertion position must be found by scanning along the table. In a sequence table, O(n) elements may need to be moved; in a linked list, O(n) links may need to be followed.

  • As long as the data is stored in linear order, the O(n) bound cannot be broken. A priority queue with more efficient operations requires a different way of organizing the data.

  • Using the ancestor/descendant order of a tree structure, better operation efficiency is possible.

Generally speaking, determining the highest-priority element does not require comparing it with all other elements. Take a knockout tournament as an example. With n players, n-1 games are needed to determine the champion, but each player plays at most about log2 n games. After the champion is determined, finding the true second place only requires a playoff among the players who lost directly to the champion, that is, the players along the champion's route of victories, which takes no more than log2 n games.
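The counts in the knockout example can be checked with a short simulation. This is only an illustrative sketch: the function name `knockout` and the convention that the smaller value wins each game are assumptions made here, not part of the original text.

```python
def knockout(players):
    """Run a single-elimination tournament in which the smaller value wins.

    Returns (champion, total_games, rounds).
    """
    games = 0
    rounds = 0
    current = list(players)
    while len(current) > 1:
        winners = []
        # Pair up neighbours; one game per pair.
        for k in range(0, len(current) - 1, 2):
            winners.append(min(current[k], current[k + 1]))
            games += 1
        # With an odd number of players, the last one advances without playing.
        if len(current) % 2 == 1:
            winners.append(current[-1])
        current = winners
        rounds += 1
    return current[0], games, rounds


champion, games, rounds = knockout([7, 3, 9, 1, 8, 2, 6, 5])
print(champion, games, rounds)  # 1 7 3
```

With 8 players there are 7 = n-1 games in total, and the champion plays 3 = log2 8 of them.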

2. The heap and its properties

An effective technique for implementing priority queues with a tree structure is called a heap. Structurally, a heap is a complete binary tree with data stored in its nodes, and the data must satisfy a special heap order: the data stored in any node is prior to or equal to (in the order under consideration) the data in its child nodes (if any).

  • On the path from the root to any leaf node of a heap, the data stored in the nodes is (non-strictly) decreasing in the specified priority
  • The highest-priority element in the heap is always in the root node of the binary tree (the top of the heap), so it can be obtained in O(1) time
  • The order relationship between elements on different paths of the tree is not constrained

If the required order puts the smallest element first, the heap is a min-heap (small-top heap); if it puts the largest element first, it is a max-heap (large-top heap).

[Figure: image missing from the source]

The picture above shows the shape of a heap (that is, the shape of a complete binary tree) and one path in the heap. Every level is full except possibly the bottom level, which may be missing nodes on its right side. In the picture, the circles become smaller along the path from root to leaf.

A complete binary tree can be stored naturally and completely in a contiguous linear structure (such as a contiguous table), so a heap can also be stored in a contiguous table, and the parent or child nodes of any node can easily be found through subscript arithmetic.
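As a minimal sketch of that subscript arithmetic, assuming the root is stored at position 0 (the convention used later in this article), the helper names `parent`, `left` and `right` below are introduced only for illustration:

```python
def parent(i):
    """Index of the parent of node i (only meaningful for i > 0)."""
    return (i - 1) // 2


def left(i):
    """Index of the left child of node i."""
    return 2 * i + 1


def right(i):
    """Index of the right child of node i."""
    return 2 * i + 2


# A small min-heap stored level by level in a list.
heap = [1, 3, 2, 7, 4, 5]

print(heap[parent(4)])             # 3
print(heap[left(1)], heap[right(1)])  # 7 4
```

A child index that is greater than or equal to `len(heap)` means the node has no such child.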

Heaps, viewed as complete binary trees, have the following properties:

Q1: If an element is added at the end of a heap (at the end of the corresponding contiguous table), the whole structure can still be regarded as a complete binary tree, but it may not be a heap (the last element may violate heap order)

Q2: If the top of a heap (the element at position 0 in the table) is removed, the remaining elements form two "sub-heaps". The subscript rules for child and parent nodes of the complete binary tree still apply, and heap order still holds on every path

Q3: If a root element is added (stored at position 0) to the table obtained in Q2 (the two sub-heaps), the resulting node sequence can be regarded as a complete binary tree, but it may not be a heap (the root node may violate heap order)

Q4: If the last element of a heap is removed (the rightmost node on the bottom level, which is the last element of the corresponding contiguous table), the remaining elements still form a heap
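Properties Q1 and Q4 can be checked on a small example. The sketch below assumes a min-heap stored in a Python list with the root at position 0; the checker `is_min_heap` is a hypothetical helper written for this illustration.

```python
def is_min_heap(elems):
    """True if every element is >= its parent, with the root at index 0."""
    return all(elems[(i - 1) // 2] <= elems[i] for i in range(1, len(elems)))


heap = [1, 3, 2, 7, 4, 5]
print(is_min_heap(heap))        # True

# Q1: appending at the end keeps the complete-tree shape,
# but the new last element may violate heap order.
print(is_min_heap(heap + [0]))  # False

# Q4: dropping the last element always leaves a heap.
print(is_min_heap(heap[:-1]))   # True
```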

[Figure: image missing from the source]

Implementing a priority queue with a heap

  • With a heap as the priority queue, the highest-priority element can be read directly from the top of the heap in O(1)
  • Inserting an element (sifting up): after a new element is added to the heap (priority queue), the result must again be a heap containing all the original elements plus the new one, O(log n)
  • Popping the smallest element (sifting down): after the smallest element is popped, the remaining elements must be rebuilt into a heap, O(log n)

3. Heap implementation of priority queue

(1) Inserting an element and sifting up

According to Q1, adding an element at the end of a heap still gives a complete binary tree, but not necessarily a heap. To restore such a complete binary tree to a heap, a single sift-up pass is enough.

Method of sifting up:

Repeatedly compare the newly added element (call it e) with the data in its parent node, and swap the two if e is smaller. Through these comparisons and swaps, e keeps moving up. The process stops when the data in e's parent is less than or equal to e, or when e has reached the root. At that point all elements on the paths through e satisfy the required order, and the other paths remain in order, so the complete binary tree satisfies heap order.

  • Put the newly added element after the existing elements (in the contiguous table), then perform one sift-up operation
  • The number of comparisons and swaps in a sift-up does not exceed the length of the longest path in the binary tree, so insertion can be completed in O(log n) time
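The sift-up procedure described above can be traced on a small min-heap. This is a standalone sketch rather than the class given later; the function name `sift_up_insert` is an assumption of this example. Instead of swapping, it keeps a "hole" and moves parents down into it, which produces the same result.

```python
def sift_up_insert(elems, e):
    """Insert e into the min-heap stored in the list elems (root at index 0)."""
    elems.append(None)            # open a hole at the end of the table
    i = len(elems) - 1
    while i > 0 and e < elems[(i - 1) // 2]:
        elems[i] = elems[(i - 1) // 2]   # move the larger parent down
        i = (i - 1) // 2                 # the hole moves up
    elems[i] = e                  # e settles in the final hole


heap = [1, 3, 2, 7, 4, 5]
sift_up_insert(heap, 0)
print(heap)  # [0, 3, 1, 7, 4, 5, 2]
```

Here 0 passes its parent 2 and then the root 1, two steps along a single path, before stopping at position 0.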

(2) Popping an element and sifting down

Pop the top element of the heap, then remove the element at the end of the original heap (call it e) and put it at the top, which gives a complete binary tree. Except for the element at the top, which may violate heap order, all the other elements satisfy it. The structure now needs to be restored to a heap.

[Figure: image missing from the source]

  • Compare e with the top elements (roots) of the two "sub-heaps" A and B; the smallest of the three becomes the top of the whole heap
    • If e is not the smallest, the smallest must be the root of A or B. Suppose the root of A is the smallest: move it to the top of the heap, which is equivalent to deleting the top element of A
    • Now put e into A, whose top has been removed; this is the same problem on a smaller scale
    • The case where the root of B is the smallest is handled in the same way
  • If at some comparison e is the smallest, the local tree with e as its top has become a heap, and so has the entire structure
  • Or e has fallen to the bottom, where it is a heap by itself, and the entire structure has become a heap

To sum up:

  1. Pop the top of the heap, O(1)
  2. Take the last element of the heap as the new root of the complete binary tree, O(1)
  3. Perform one sift-down, O(log n); the number of steps does not exceed the length of a path in the tree
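The three steps can be sketched as a standalone function on a min-heap stored in a list. The name `pop_min` is an assumption of this example, and the function does not guard against an empty list (it would raise IndexError).

```python
def pop_min(elems):
    """Remove and return the smallest element of the min-heap in elems."""
    top = elems[0]                # step 1: the top of the heap
    e = elems.pop()               # step 2: take the last element
    end = len(elems)
    if end > 0:
        # step 3: sift e down from the root
        i, j = 0, 1
        while j < end:
            if j + 1 < end and elems[j + 1] < elems[j]:
                j += 1            # j now points at the smaller child
            if e < elems[j]:
                break             # e is the smallest of the three: done
            elems[i] = elems[j]   # move the smaller child up
            i, j = j, 2 * j + 1
        elems[i] = e
    return top


heap = [1, 3, 2, 7, 4, 5]
smallest = pop_min(heap)
print(smallest, heap)  # 1 [2, 3, 5, 7, 4]
```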

Next, use Python to implement a heap-based priority queue class. A list stores the elements: new elements are added at the end of the table, and the head of the table is the top of the heap.

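# PrioQueueError is not defined in the original listing; a minimal
# definition is assumed here so that the class runs.
class PrioQueueError(ValueError):
    pass


class PrioQueue:
    """
    implementing priority queues using heaps
    """
    def __init__(self, elist=[]):
        # list(elist) copies the argument, so the default [] is never shared
        self._elems = list(elist)
        if self._elems:
            self.buildheap()

    def is_empty(self):
        return not self._elems

    def peek(self):
        if self.is_empty():
            raise PrioQueueError("in peek")
        return self._elems[0]

    def enqueue(self, e):
        self._elems.append(None)  # add a dummy element
        self.shiftup(e, len(self._elems) - 1)

    def shiftup(self, e, last):
        # i is the current hole, j is the index of its parent
        elems, i, j = self._elems, last, (last-1) // 2
        while i > 0 and e < elems[j]:
            elems[i] = elems[j]  # move the parent down into the hole
            i, j = j, (j-1)//2
        elems[i] = e

    def dequeue(self):
        if self.is_empty():
            raise PrioQueueError("in dequeue")
        elems = self._elems
        e0 = elems[0]
        e = elems.pop()
        if len(elems) > 0:
            self.shiftdown(e, 0, len(elems))
        return e0

    def shiftdown(self, e, begin, end):
        # i is the current hole, j is the index of its left child
        elems, i, j = self._elems, begin, begin*2+1
        while j < end:
            if j+1 < end and elems[j+1] < elems[j]:
                j += 1  # elems[j] is now the smaller child
            if e < elems[j]:
                break  # e is the smallest of the three
            elems[i] = elems[j]  # move the smaller child up
            i, j = j, 2*j+1
        elems[i] = e

    def buildheap(self):
        # sift down every node that may have children, from the bottom up
        end = len(self._elems)
        for i in range(end//2, -1, -1):
            self.shiftdown(self._elems[i], i, end)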

Why the constructor makes a copy of the table with list(elist):

  • The copy separates the internal table from the original table, so the two are not shared
  • In the default case a new empty table is created as well, which avoids the Python pitfall of using a mutable object as a default value

In the implementation of shiftup, the new element is not stored first and then swapped repeatedly; instead it is "held" while the correct insertion position is found. The loop condition guarantees that every element passed over has lower priority than e, and these elements are moved down one by one as the search proceeds.
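For comparison, Python's standard library module `heapq` provides the same min-heap operations on a plain list. The sketch below is not the class above, only a cross-check that uses `heapq.heapify`, `heapq.heappush` and `heapq.heappop`; the sample data is made up for the example.

```python
import heapq

data = [5, 1, 8, 3, 2]
heapq.heapify(data)             # build a min-heap in place, like buildheap
heapq.heappush(data, 0)         # insert, like enqueue
smallest = heapq.heappop(data)  # remove the smallest, like dequeue
print(smallest)                 # 0

# Popping repeatedly yields the remaining elements in ascending order.
rest = [heapq.heappop(data) for _ in range(len(data))]
print(rest)                     # [1, 2, 3, 5, 8]
```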

To sum up:

  1. The priority queue is implemented on top of a heap. Building the heap takes O(n) time, and this only needs to be done once.
  2. Insert and pop take O(log n) time. The first step of insertion appends an element to the end of the table, which may cause the list object to reallocate its element storage, so an O(n) worst case can occur.
  3. All operations use only a few simple variables and no other structures, so the auxiliary space complexity is O(1)
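The O(n) bound for building the heap can be observed by counting comparisons. The sketch below is an instrumented, standalone variant of the bottom-up construction; the function name `build_heap_count` and the reverse-sorted test data are assumptions of this example. Each sift-down step costs at most two comparisons, and the heights of all nodes in a complete binary tree sum to less than n, so the count stays below 2n.

```python
def build_heap_count(elems):
    """Build a min-heap in place; return the number of element comparisons."""
    count = 0
    end = len(elems)
    # Nodes from end//2 - 1 down to 0 are exactly the nodes with children.
    for begin in range(end // 2 - 1, -1, -1):
        e, i, j = elems[begin], begin, 2 * begin + 1
        while j < end:
            if j + 1 < end:
                count += 1
                if elems[j + 1] < elems[j]:
                    j += 1
            count += 1
            if e < elems[j]:
                break
            elems[i] = elems[j]
            i, j = j, 2 * j + 1
        elems[i] = e
    return count


for n in (1000, 2000, 4000):
    data = list(range(n, 0, -1))  # reverse order forces real sifting work
    print(n, build_heap_count(data))
```

Doubling n should roughly double the printed count, which is what linear growth looks like.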


Origin blog.csdn.net/weixin_46129834/article/details/113567245