Implementation of the heap sort algorithm in Python

Heapsort is a powerful algorithm for sorting arrays and lists in Python. It's popular because it's very fast and doesn't require any extra space like merge sort and quick sort.

The time complexity of heap sort is O(n*log(n)).

Heap sort is an in-place algorithm, it does not create any data structure to save the intermediate state of the data. Instead, it modifies our original array.

So this saves us a lot of space when the data is very large.

The only downside to this algorithm is that it is very unstable. If multiple elements in our array have the same value at different indices, their positions will change when sorted.

The heapsort algorithm works by recursively creating a min-heap or max-heap, taking the root node, placing it at the first unsorted index in our array, and turning the last heap element into the root node.

This process is repeated recursively until we have only one node left in the heap. Finally, the last heap element is placed at the last index of our array.

If we think about it, the process is similar to the selection sort algorithm in that we take the largest or smallest values ​​and put them at the top of our sorted array.

Implement the heap sort algorithm

Let's first look at implementing the build_heap() function, which takes the original array, the length of the array, and the index of our parent node. Here, if we look at an array, the index of the last parent node is at (n//2 - 1) inside our array.

Likewise, the index of the left child of that particular parent is 2 parent_index + 1 and the index of the right child is 2 parent_index + 2 .

In this example, we are trying to create a max heap. This means that each parent node needs to be larger than its child nodes.

To do this, we'll start at the last parent node and move up to the root of our heap. If we wanted to create a min-heap, we would want all parent nodes to be smaller than their children.

The build_heap() function will check if the left or right child node is larger than the current parent node, and swap the largest node with the parent node.

This function calls itself recursively, because we want to repeat the previous process step by step for all parent nodes in the heap.

The following code snippet demonstrates a working implementation of the built_heap() function mentioned above in Python.

def build_heap(arr, length, parent_index):
    largest_index = parent_index
    left_index = 2 * parent_index + 1
    right_index = 2 * parent_index + 2
    if left_index < length and arr[parent_index] < arr[left_index]:
        largest_index = left_index
    if right_index < length and arr[largest_index] < arr[right_index]:
        largest_index = right_index
    if largest_index != parent_index:
        arr[parent_index],arr[largest_index] = arr[largest_index],arr[parent_index]
        build_heap(arr, length, largest_index)

Now, we have a function that takes the largest value in our array and puts it at the root of our heap. We need a function to get the unsorted array, call the build_heap() function and extract elements from the heap.

The following code snippet demonstrates the implementation of the heapSort() function in Python.

def heapSort(arr):
    length = len(arr)
    for parent_index in range(length // 2 - 1, -1, -1):
        build_heap(arr, length, parent_index)
    for element_index in range(length-1, 0, -1):
        arr[element_index], arr[0] = arr[0], arr[element_index]
        build_heap(arr, element_index, 0)

We step through the build_heap() function of each parent node in our array. Note that we give length//2-1 as the start index and -1 as the end index, with a step size of -1.

This means we start at the last parent node and gradually decrease our index until we reach the root node.

The second loop, for , fetches elements from our heap. It also starts at the last index and stops at the first index of our array.

We swap the first and last elements of our array in this loop and execute the build_heap() function on the newly sorted array by passing 0 as the root index.

Now, we have written our program to implement heap sort in Python. Now it's time to sort an array and test the code above.

#Python小白学习交流群:153708845
arr = [5, 3, 4, 2, 1, 6]
heapSort(arr)
print("Sorted array :", arr)

output:

Sorted array : [1, 2, 3, 4, 5, 6]

As we can see, our array is fully sorted. This means our code works fine.

If we want to sort in descending order, we can create a min-heap instead of the max-heap implemented above.

This article will not explain min-heap, because what is min-heap has been discussed at the beginning of this tutorial.

Our program works as follows. The blocks below show the state of our array at each stage of code execution.

Original Array [5, 3, 4, 2, 1, 6] # input array
Building Heap [5, 3, 6, 2, 1, 4] # after build_heap() pass 1
Building Heap [5, 3, 6, 2, 1, 4] # after build_heap() pass 2
Building Heap [6, 3, 5, 2, 1, 4] # after build_heap() pass 3
Extracting Elements [6, 3, 5, 2, 1, 4] # before swapping and build_heap pass 1
Extracting Elements [5, 3, 4, 2, 1, 6] # before swapping and build_heap pass 2
Extracting Elements [4, 3, 1, 2, 5, 6] # before swapping and build_heap pass 3
Extracting Elements [3, 2, 1, 4, 5, 6] # before swapping and build_heap pass 4
Extracting Elements [2, 1, 3, 4, 5, 6] # before swapping and build_heap pass 5
Sorted array : [1, 2, 3, 4, 5, 6] # after swapping and build_heap pass 5

The build_heap() function is executed 3 times because we only have 3 parents in our heap.

Afterwards, our element extraction phase swaps the first element with the last element and executes the build_heap() function again. This process for length - 1, our array is sorted.

Guess you like

Origin blog.csdn.net/qdPython/article/details/132189911