Heap and priority queue

First, the heap

1. Heap definition:

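A heap is a complete binary tree with an ordering constraint on its keys. If the key of every node is less than or equal to the keys of its left and right children, the tree is called a small root heap (min-heap); if the key of every node is greater than or equal to the keys of its children, it is called a large root heap (max-heap).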

In practice, a heap is not stored as a linked tree structure but as an array. How should this be understood?

Simply put, the heap is logically a complete binary tree and physically an array.

The tree and the array are linked through subscripts. The subscripts of a parent (parent), its left child (left), and its right child (right) in the array satisfy the following relationships:

(1) If the subscript parent of the parent is known, the subscripts of its left child and right child are:

                                left = 2 * parent + 1;

                                right = 2 * parent + 2;

(2) If the subscript child of a child is known (whether it is the left child or the right child), the subscript of its parent is: parent = (child - 1) / 2; (integer division, so both children map to the same parent)

This array can also be viewed as a level-order traversal of the above complete binary tree.
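The subscript relationships can be checked on a small array. The following is a minimal sketch, not part of the original article: the class name HeapIndexDemo and the sample keys are assumptions chosen for illustration.

```java
public class HeapIndexDemo {
    // Subscript of the left child of the node stored at subscript parent.
    static int left(int parent) {
        return 2 * parent + 1;
    }

    // Subscript of the right child of the node stored at subscript parent.
    static int right(int parent) {
        return 2 * parent + 2;
    }

    // Subscript of the parent; int division truncates, so the left child
    // (odd subscript) and the right child (even subscript) give the same result.
    static int parent(int child) {
        return (child - 1) / 2;
    }

    public static void main(String[] args) {
        // Level-order layout of this small root heap (sample data):
        //         1
        //       /   \
        //      3     2
        //     / \   /
        //    7   4 5
        long[] heap = {1, 3, 2, 7, 4, 5};
        for (int i = 0; i < heap.length; i++) {
            int l = left(i);
            int r = right(i);
            String leftKey = l < heap.length ? String.valueOf(heap[l]) : "none";
            String rightKey = r < heap.length ? String.valueOf(heap[r]) : "none";
            System.out.println("index " + i + " key " + heap[i]
                    + ": left=" + leftKey + ", right=" + rightKey);
        }
    }
}
```

A child subscript that is greater than or equal to the array length means the child does not exist, which is the leaf test used by the adjustment code.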

2. The operation of the heap

        2.1 Rebuild the heap: Question: How to rebuild the heap when the record at the top of the heap changes? Let's take the small heap as an example:
        [Algorithm idea] First remove the record in the root node of the complete binary tree corresponding to the heap; this record is called the record to be adjusted.

        At this point the root is equivalent to an empty node. Select the record with the smaller key from the left and right children of the empty node. If that key is still smaller than the key of the record to be adjusted, move that record up into the empty node; the child's old position then becomes the new empty node.
        Repeat this process until the keys of both children of the empty node are not smaller than the key of the record to be adjusted, or the empty node is a leaf. At this point, the record to be adjusted can be placed in the empty node.
        This method gradually "screens" the record to be adjusted downward, so it is generally called the "screening" method or "downward adjustment" (sift down).

        We use code to explain:

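public static void shiftDown(long[] array, int size, int index) {
    // index is the position to adjust; size is the number of elements in the heap.
    while (true) {
        // 1. Check whether index is a leaf.
        // In a complete binary tree, a node without a left child must be a leaf.
        int left = 2 * index + 1;
        if (left >= size) {
            // Out of bounds -> no left child -> leaf -> adjustment is finished
            return; // Loop exit 1: reached a leaf
        }

        // 2. Find the smaller of the two children (the minimum, because this is a small heap).
        // First check whether a right child exists.
        int right = left + 1;       // right = 2 * index + 2
        int min = left;             // assume the left child is the smallest; min holds the subscript of the smallest child
        if (right < size && array[right] < array[left]) {
            // right < size must come before array[right] < array[left]; the order cannot be swapped,
            // because the right child has to exist before comparing the two children makes sense.
            // Here the right child exists and its value < the left child's value.
            min = right;            // min is the subscript of the right child
        }

        // 3. Compare the smallest child with the position being adjusted to see whether the heap property holds
        if (array[index] <= array[min]) {
            // Value at index <= smallest child's value: the heap property holds here, so the adjustment is finished
            return; // Loop exit 2: the heap property is already satisfied
        }

        // 4. Swap the two values; physically, swap the array elements at subscripts min and index
        long t = array[index];
        array[index] = array[min];
        array[min] = t;

        // 5. Repeat the same operation at position min (continue the downward adjustment from min)
        index = min;
    }
}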
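To see the downward adjustment on concrete data, the following sketch repeats the same logic in compact form and prints the array after each swap. The class name ShiftDownTrace and the sample array are assumptions made for illustration; the root key 9 stands in for a changed heap top.

```java
import java.util.Arrays;

public class ShiftDownTrace {
    // Same downward adjustment as above, for a small heap stored in array[0..size).
    static void shiftDown(long[] array, int size, int index) {
        while (2 * index + 1 < size) {
            int min = 2 * index + 1;                 // left child
            int right = min + 1;
            if (right < size && array[right] < array[min]) {
                min = right;                         // right child is smaller
            }
            if (array[index] <= array[min]) {
                return;                              // heap property holds
            }
            long t = array[index];
            array[index] = array[min];
            array[min] = t;
            System.out.println("swapped subscripts " + index + " and " + min
                    + " -> " + Arrays.toString(array));
            index = min;
        }
    }

    public static void main(String[] args) {
        // Both subtrees of the root are valid small heaps; only the root is out of place.
        long[] heap = {9, 3, 2, 7, 4, 5};
        shiftDown(heap, heap.length, 0);
        System.out.println(Arrays.toString(heap));   // [2, 3, 5, 7, 4, 9]
    }
}
```

The key 9 first swaps with 2 (the smaller child of the root), then with 5, and stops when it reaches a leaf.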


         2.2 Building the initial heap: Question: How to build the initial heap from an arbitrary sequence?
       [Algorithm idea] Regard the arbitrary sequence as the corresponding complete binary tree. Since each leaf node can be regarded as a single-element heap, the adjustment algorithm above (the "screening" method) can be applied repeatedly, from the bottom up, turning every subtree into a heap until the entire complete binary tree is a heap.
        It can be proved that in this complete binary tree the last non-leaf node is at position ⌊n/2⌋ (counting positions from 1), where n is the number of nodes. "Screening" therefore starts from the ⌊n/2⌋-th node and works backwards, node by node, until the root. With 0-based array subscripts, this starting node is at subscript (n - 2) / 2.

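public static void buildHeap(int[] array) {
    // We assume the whole array is the heap, i.e. the number of elements in the array is the heap size.
    // From the binary tree we can see that it is enough to start at the parent of the last node
    // and apply the downward adjustment from the bottom up.
    for (int i = (array.length - 2) / 2; i >= 0; i--) {
        shiftDown(array, array.length, i);
    }
}

private static void shiftDown(int[] array, int size, int index) {
    // index is the position that currently needs adjusting
    while (index * 2 + 1 < size) {
        int left = index * 2 + 1;
        int right = left + 1;
        // Find the subscript of the smallest child
        int min = left;
        if (right < size && array[min] > array[right]) {
            min = right;
        }
        // Stop if the current node already satisfies the heap property
        if (array[index] <= array[min]) {
            return;
        }
        // Swap the current node with its smallest child
        swap(array, min, index);
        // Continue the downward adjustment
        index = min;
    }
}

private static void swap(int[] array, int min, int index) {
    int t = array[index];
    array[index] = array[min];
    array[min] = t;
}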
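A short run on sample data shows the bottom-up construction. This is only a sketch: the class name BuildHeapDemo, the helper isMinHeap, and the input sequence are assumptions added for illustration, and the two heap methods are repeated so that the example compiles on its own.

```java
import java.util.Arrays;

public class BuildHeapDemo {
    static void buildHeap(int[] array) {
        // Start at the parent of the last node and move back towards the root.
        for (int i = (array.length - 2) / 2; i >= 0; i--) {
            shiftDown(array, array.length, i);
        }
    }

    static void shiftDown(int[] array, int size, int index) {
        while (index * 2 + 1 < size) {
            int min = index * 2 + 1;
            int right = min + 1;
            if (right < size && array[min] > array[right]) {
                min = right;
            }
            if (array[index] <= array[min]) {
                return;
            }
            int t = array[index];
            array[index] = array[min];
            array[min] = t;
            index = min;
        }
    }

    // Hypothetical helper: checks that every child is >= its parent.
    static boolean isMinHeap(int[] array) {
        for (int child = 1; child < array.length; child++) {
            if (array[(child - 1) / 2] > array[child]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] data = {9, 5, 7, 1, 3, 8, 2};
        buildHeap(data);
        System.out.println(Arrays.toString(data));   // [1, 3, 2, 5, 9, 8, 7]
        System.out.println(isMinHeap(data));         // true
    }
}
```

For this 7-element input the loop visits subscripts 2, 1, and 0; the leaves at subscripts 3 to 6 are never adjusted directly because each of them is already a single-element heap.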

Second, the priority queue

        1. Definition: Elements in a priority queue can be inserted in any order, but are retrieved in sorted order. That is, whenever the remove method is called, the smallest element currently in the priority queue is returned. However, the priority queue does not sort all of its elements: if you iterate over the elements, they are not necessarily in sorted order. The priority queue uses an elegant and efficient data structure called a heap. A heap is a self-organizing binary tree in which the add and remove operations move the smallest element to the root without spending time sorting all the elements.
        Like a TreeSet, a priority queue can hold either objects of a class that implements the Comparable interface or objects ordered by a Comparator supplied in the constructor.
        A typical use of priority queues is task scheduling. Each task has a priority, and tasks are added to the queue in random order. Whenever a new task can be started, the task with the highest priority is removed from the queue (since 1 is conventionally the "highest" priority, the remove operation removes the smallest element).
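The task-scheduling use can be sketched with java.util.PriorityQueue. The class TaskSchedulingDemo, the nested Task type, and the task names are hypothetical and only serve the example; the Comparator passed to the constructor orders tasks by their priority number, so priority 1 comes out first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TaskSchedulingDemo {
    // Hypothetical task type: a smaller number means a higher priority.
    static final class Task {
        final int priority;
        final String name;

        Task(int priority, String name) {
            this.priority = priority;
            this.name = name;
        }
    }

    // Adds the tasks in the given order and returns their names in the order they would be started.
    static List<String> runOrder(List<Task> tasks) {
        PriorityQueue<Task> queue =
                new PriorityQueue<>(Comparator.comparingInt((Task t) -> t.priority));
        queue.addAll(tasks);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.remove().name);   // removes the task with the smallest priority number
        }
        return order;
    }

    public static void main(String[] args) {
        List<Task> tasks = Arrays.asList(
                new Task(3, "write report"),
                new Task(1, "fix outage"),
                new Task(2, "review patch"));
        System.out.println(runOrder(tasks));  // [fix outage, review patch, write report]
    }
}
```

The priorities in the sample are distinct on purpose: when two tasks share a priority, the order between them is not guaranteed.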

Implement the priority queue:


import java.util.Arrays;

// Use long directly as the element type; generics are not considered here
public class MyPriorityQueue {
    // Key fields: heap = array + number of valid elements
    private long[] array;
    private int size;

    public int size() {
        return size;
    }

    public boolean isEmpty() {
        return size == 0;
    }

    // The element type is long, so Comparable and Comparator do not need to be considered,
    // and a single constructor is enough
    public MyPriorityQueue() {
        array = new long[16];
        size = 0;
    }

    public void offer(long e) {
        // Insert into the priority queue; afterwards the heap property must still hold
        ensureCapacity();

        array[size] = e;
        size++;

        // [size - 1] is the position of the element just inserted
        shiftUp(array, size - 1);
    }

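    // Precondition: size > 0
    public long peek() {
        // Return the element at the top of the heap
        if (size == 0) {
            throw new RuntimeException("The queue is empty");
        }
        return array[0];
    }

    public long poll() {
        // Return and remove the element at the top of the heap
        if (size == 0) {
            throw new RuntimeException("The queue is empty");
        }

        long e = array[0];

        // Replace the top with the last element, then remove the last position
        array[0] = array[size - 1];
        array[size - 1] = 0;        // 0 marks this position as removed; writing it is optional
        size--;

        // Adjust downward from the top of the heap
        shiftDown(array, size, 0);

        return e;
    }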

    // Check that the priority queue object is in a valid state:
    // 1. 0 <= size && size <= array.length
    // 2. The small-heap property holds (for every non-leaf node, its value <= the values of its children, if they exist)
    public void check() {
        if (size < 0 || size > array.length) {
            throw new RuntimeException("size constraint violated");
        }

        // If every node is fine, the small-heap property holds
        for (int i = 0; i < size; i++) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;

            if (left >= size) {
                // This is a leaf; skip it
                continue;
            }

            // The left child breaks the rule
            if (array[i] > array[left]) {
                throw new RuntimeException(String.format("The value at [%d] is greater than its left child's value", i));
            }

            // The right child breaks the rule
            if (right < size && array[i] > array[right]) {
                throw new RuntimeException(String.format("The value at [%d] is greater than its right child's value", i));
            }
        }
    }

    private void shiftDown(long[] array, int size, int index) {
        while (2 * index + 1 < size) {
            // index is guaranteed to have a left child here
            int min = 2 * index + 1;
            int right = min + 1;
            if (right < size && array[right] < array[min]) {
                min = right;
            }

            if (array[index] <= array[min]) {
                return;
            }

            swap(array, index, min);

            index = min;
        }
    }

    private void swap(long[] array, int i, int j) {
        long t= array[i];
        array[i] = array[j];
        array[j] = t;
    }

    private void ensureCapacity() {
        if (size < array.length) {
            return;
        }

        array = Arrays.copyOf(array, array.length * 2);
    }

    // Upward adjustment does not need size
    private void shiftUp(long[] array, int index) {
        while (index != 0) {
            int parent = (index - 1) / 2;
            if (array[parent] <= array[index]) {
                return;
            }

            swap(array, index, parent);
            index = parent;
        }
    }
}

2. The priority queue implemented in Java:

        An unbounded priority queue based on a priority heap. The elements of the priority queue are ordered according to their natural ordering, or by a Comparator provided when the queue is constructed, depending on which constructor is used. A priority queue does not permit null elements. A priority queue that relies on natural ordering also does not permit insertion of non-comparable objects (doing so may result in ClassCastException).

        The head of this queue is the smallest element with respect to the specified ordering. If multiple elements are tied for the smallest value, the head is one of those elements; ties are broken arbitrarily. The queue retrieval operations poll, remove, peek, and element access the element at the head of the queue.

        A priority queue is unbounded, but it has an internal capacity that governs the size of the array used to store the queue elements. The capacity is always at least as large as the queue size. As elements are added to the priority queue, its capacity grows automatically. The details of the growth policy are not specified.

        This class and its iterator implement all of the optional methods of the Collection and Iterator interfaces. The Iterator provided by the iterator() method is not guaranteed to traverse the elements of the priority queue in any particular order. If you need ordered traversal, consider using Arrays.sort(pq.toArray()).
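The difference between iteration order and sorted order can be shown with a small sketch. The class name OrderedTraversalDemo and the helper sortedSnapshot are assumptions for this example; the sorting itself uses the Arrays.sort(pq.toArray()) idiom mentioned above.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class OrderedTraversalDemo {
    // Returns the queue's elements in ascending order without removing them.
    static Object[] sortedSnapshot(PriorityQueue<Integer> pq) {
        Object[] snapshot = pq.toArray();   // a copy; the queue itself is unchanged
        Arrays.sort(snapshot);              // works because Integer is Comparable
        return snapshot;
    }

    public static void main(String[] args) {
        PriorityQueue<Integer> pq = new PriorityQueue<>(Arrays.asList(5, 1, 4, 2, 3));

        // Iteration follows the internal array layout; no particular order is guaranteed.
        for (Integer value : pq) {
            System.out.print(value + " ");
        }
        System.out.println();

        System.out.println(Arrays.toString(sortedSnapshot(pq)));  // [1, 2, 3, 4, 5]
        System.out.println(pq.size());                            // 5
    }
}
```

Repeatedly calling poll() would also yield the elements in ascending order, but it empties the queue; sorting a snapshot leaves it intact.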

Method summary
 boolean add(E e)
          Inserts the specified element into this priority queue.
 void clear()
          Removes all elements from this priority queue.
 Comparator<? super E> comparator()
          Returns the comparator used to order the elements in this queue, or null if this queue is sorted according to the natural ordering of its elements.
 boolean contains(Object o)
          Returns true if this queue contains the specified element.
 Iterator<E> iterator()
          Returns an iterator over the elements in this queue.
 boolean offer(E e)
          Inserts the specified element into this priority queue.
 E peek()
          Gets but does not remove the head of this queue; returns null if this queue is empty.
 E poll()
          Gets and removes the head of this queue, or null if this queue is empty.
 boolean remove(Object o)
          Removes a single instance of the specified element from this queue, if one exists.
 int size()
          Returns the number of elements in this collection.
 Object[] toArray()
          Returns an array containing all the elements of this queue.
 <T> T[] toArray(T[] a)
          Returns an array containing all the elements of this queue; the runtime type of the returned array is that of the specified array.
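A few of the methods listed above can be exercised as follows. This is a sketch under stated assumptions: the class name PriorityQueueApiDemo and the helper drain are invented for the example, and Comparator.reverseOrder() is used only to show a queue whose head is the largest element.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class PriorityQueueApiDemo {
    // Hypothetical helper: polls until the queue is empty (poll returns null then).
    static List<Integer> drain(PriorityQueue<Integer> pq) {
        List<Integer> out = new ArrayList<>();
        Integer head;
        while ((head = pq.poll()) != null) {
            out.add(head);
        }
        return out;
    }

    public static void main(String[] args) {
        PriorityQueue<Integer> natural = new PriorityQueue<>();
        System.out.println(natural.peek());         // null, the queue is empty
        System.out.println(natural.comparator());   // null, natural ordering is used

        natural.offer(4);
        natural.add(1);
        natural.offer(3);
        System.out.println(natural.peek());                     // 1
        System.out.println(natural.contains(3));                // true
        System.out.println(natural.remove(Integer.valueOf(3))); // true
        System.out.println(natural.size());                     // 2
        System.out.println(drain(natural));                     // [1, 4]

        // With a reversed comparator the head is the largest element.
        PriorityQueue<Integer> largestFirst = new PriorityQueue<>(Comparator.reverseOrder());
        largestFirst.offer(4);
        largestFirst.offer(1);
        largestFirst.offer(3);
        System.out.println(drain(largestFirst));                // [4, 3, 1]
    }
}
```

Unlike the hand-written MyPriorityQueue, which throws on an empty queue, peek and poll here return null when the queue is empty.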

        

3. Summary:

        1. The heap is a binary tree by definition, but it is an array in actual implementation.

                The subscript relationship between the parent node and the child node of the binary tree in the array:

                left = parent * 2 + 1;

                right = parent * 2 + 2;

                parent = (child - 1) / 2;

        2. Purpose of the heap: finding the extreme value (minimum or maximum) in a data set that changes frequently.

        3. The core operation of the heap: downward adjustment, building the initial heap.

        4. A heap can be used to implement a priority queue.

        5. Top-k problem.
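One common way to solve the Top-k problem with a heap is to keep a small heap of at most k elements: the heap top is the smallest of the current k largest values, and a new value replaces it only if it is larger. The sketch below is an illustration, not code from the original article; the class name TopKDemo, the method topK, and the sample data are assumptions.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class TopKDemo {
    // Returns the k largest values of data, largest first (fewer if data has fewer than k values).
    static long[] topK(long[] data, int k) {
        PriorityQueue<Long> heap = new PriorityQueue<>();  // small heap: head is the minimum
        for (long value : data) {
            if (heap.size() < k) {
                heap.offer(value);
            } else if (k > 0 && value > heap.peek()) {
                heap.poll();          // drop the smallest of the current candidates
                heap.offer(value);
            }
        }
        long[] result = new long[heap.size()];
        // poll() returns candidates in ascending order, so fill the result from the back.
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll();
        }
        return result;
    }

    public static void main(String[] args) {
        long[] data = {7, 2, 9, 4, 11, 5};
        System.out.println(Arrays.toString(topK(data, 3)));   // [11, 9, 7]
    }
}
```

Because the heap never holds more than k elements, each insertion or removal costs O(log k), so n input values are processed in O(n log k) time with O(k) extra space.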


Origin blog.csdn.net/weixin_52575498/article/details/123719073