17  Heaps

18 Key concepts

18.1 Heaps, shape and heap properties

A heap is a specialized tree-based data structure that satisfies two properties: the shape property and the heap property.

The shape property of a heap requires that it is a complete binary tree. This means that all levels of the tree are completely filled, except for possibly the last level, which is filled from left to right. For example, the following is a complete binary tree:

          1
       /     \
      2       3
     / \     /
    4   5   6

The heap property of a heap depends on whether it is a max-heap or a min-heap. In a max-heap, every node is greater than or equal to its children, while in a min-heap, every node is less than or equal to its children. For example, the following is a max-heap:

          8
       /     \
      5       7
     / \     /
    3   4   6

In this tree, every node is greater than or equal to its children.

These properties make heaps useful for implementing priority queues, as they allow for efficient access to the maximum or minimum element in the data structure. Additionally, heaps can be implemented efficiently using an array. With 0-based indexing, the node at index i has its left child at index 2i + 1, its right child at index 2i + 2, and (for i > 0) its parent at index floor((i - 1)/2). The pseudocode in section 19.2 uses 1-based indexing instead, where the children of node i are at indices 2i and 2i + 1.
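As a small illustration of the array layout, the sketch below stores the max-heap from the figure above level by level in a Python list and navigates it with index arithmetic. The helper names parent, left, and right are chosen here for illustration and assume 0-based indexing.

```python
def parent(i: int) -> int:
    """Index of the parent of node i (valid for i > 0)."""
    return (i - 1) // 2


def left(i: int) -> int:
    """Index of the left child of node i."""
    return 2 * i + 1


def right(i: int) -> int:
    """Index of the right child of node i."""
    return 2 * i + 2


# The example max-heap, written out level by level.
heap = [8, 5, 7, 3, 4, 6]

print(heap[left(0)], heap[right(0)])  # children of the root: 5 7
print(heap[parent(5)])                # parent of the node holding 6: 7
```

Because a complete binary tree has no gaps, the list needs no pointers and no unused slots.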

18.2 Heap operations

The most common operations on heaps are insertion and deletion of elements. There are two types of heaps: max-heap and min-heap, and the operations differ slightly depending on which type of heap is used.

18.2.1 Insertion

To insert an element into a heap, the element is added to the next available position at the bottom level of the heap (which will maintain the shape property), and then it is “bubbled up” to its proper position to maintain the heap property. For a max-heap, this means swapping the new element with its parent for as long as it is larger than its parent (or until it reaches the root), while for a min-heap, it means swapping the new element with its parent for as long as it is smaller than its parent (or until it reaches the root).
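The bubble-up step for a max-heap can be sketched as follows. This is a minimal illustration that assumes the heap is stored in a 0-based Python list; the function name max_heap_insert is chosen here and does not come from any library.

```python
def max_heap_insert(heap: list, value) -> None:
    """Insert value into a max-heap stored in a 0-based list."""
    heap.append(value)              # next free slot keeps the tree complete
    i = len(heap) - 1
    while i > 0:
        p = (i - 1) // 2            # index of the parent
        if heap[p] >= heap[i]:      # heap property holds, stop
            break
        heap[p], heap[i] = heap[i], heap[p]
        i = p                       # continue from the parent's position


heap = [8, 5, 7, 3, 4, 6]
max_heap_insert(heap, 9)
print(heap)  # [9, 5, 8, 3, 4, 6, 7]
```

The new element travels along a single root-to-leaf path, so the number of swaps is bounded by the height of the tree.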

18.2.2 Deletion

To delete an element from a heap, the element at the root is removed, and then the heap is restructured to maintain the shape property and heap property. The root is replaced with the last element in the heap, which keeps the tree complete, and this element is then “bubbled down” to its proper position. For a max-heap, this means swapping it with its larger child until it is greater than or equal to both of its children (or until it reaches a leaf node). For a min-heap, it means swapping it with its smaller child until it is less than or equal to both of its children (or until it reaches a leaf node).
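A possible sketch of root deletion for a max-heap, again assuming a 0-based Python list. The name extract_max is chosen here for illustration.

```python
def extract_max(heap: list):
    """Remove and return the largest element of a max-heap (0-based list)."""
    if not heap:
        raise IndexError("extract from an empty heap")
    top = heap[0]
    last = heap.pop()               # removing the last slot keeps the shape
    if heap:
        heap[0] = last              # move the last element to the root
        i, n = 0, len(heap)
        while True:                 # bubble down
            l, r = 2 * i + 1, 2 * i + 2
            largest = i
            if l < n and heap[l] > heap[largest]:
                largest = l
            if r < n and heap[r] > heap[largest]:
                largest = r
            if largest == i:        # both children are no larger, stop
                break
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
    return top


heap = [8, 5, 7, 3, 4, 6]
print(extract_max(heap))  # 8
print(heap)               # [7, 5, 6, 3, 4]
```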

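Heaps can also support other operations such as finding the minimum or maximum element in the heap, and merging two heaps together. Finding the maximum of a max-heap (or the minimum of a min-heap) takes constant time, because that element is always at the root. Two binary heaps can be merged in time linear in their combined size by concatenating their arrays and rebuilding the heap.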

18.3 Heapsort, priority queues

Heapsort is a sorting algorithm that uses a heap data structure to sort an array. The algorithm consists of two main steps:

  1. Build a max-heap from the input array.

  2. Repeatedly extract the maximum element from the heap and place it in the array slot just past the end of the shrinking heap, until only one element remains in the heap.

The array is then sorted in ascending order. Heapsort has a time complexity of O(n log n) in the worst case, and it is an in-place sorting algorithm (i.e., it uses only a constant amount of additional memory), because the sorted elements accumulate at the back of the same array that holds the heap.

Priority queues are abstract data types that allow efficient access to the minimum (or maximum) element in a set of elements. Priority queues can be implemented using heaps, where the minimum (or maximum) element is always at the root of the heap. Priority queues are used in a variety of applications, such as task scheduling, graph algorithms (e.g., Dijkstra’s shortest path algorithm), and Huffman coding.

To insert an element into a priority queue, it is added to the bottom level of the heap, and then “bubbled up” to its proper position to maintain the heap property. To extract the minimum (or maximum) element from a priority queue implemented with a heap, the root element is removed, and then the heap is restructured to maintain the heap property. These operations can be performed in O(log n) time, where n is the number of elements in the heap.
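In Python, the standard-library heapq module provides a min-heap on top of a plain list. The sketch below uses it as a priority queue; the task names and priority values are made up for illustration.

```python
import heapq

# Each entry is (priority, task); a smaller number means more urgent.
tasks = []
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix outage"))
heapq.heappush(tasks, (3, "refactor"))

order = []
while tasks:
    priority, name = heapq.heappop(tasks)  # always removes the smallest entry
    order.append(name)

print(order)  # ['fix outage', 'write report', 'refactor']
```

Since heapq only offers a min-heap, a common way to get max-priority behavior is to push negated priorities.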

19 Learning outcomes

19.1 Check heap and shape properties

The heap property and shape property are two key properties that must be maintained by a binary heap.

The heap property states that for a max-heap, every parent node has a value greater than or equal to its children, while for a min-heap, every parent node has a value less than or equal to its children. To check the heap property of a binary heap, we can compare the value of each parent node with the values of its children. If the heap property is violated, we can swap the parent node with its larger (or smaller) child, and continue the comparison until the heap property is restored.

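The shape property states that a binary heap is a complete binary tree, where all levels except possibly the last level are completely filled, and all nodes in the last level are as far left as possible. To check the shape property of a binary heap stored as a linked tree, we can traverse it level by level and verify that every level above the last is full and that the last level has no gaps to the left of any node. Counting nodes alone is not sufficient: in a tree with h levels (counting the root as level 1), the last level must hold between 1 and 2^(h-1) nodes, but those nodes must also be packed to the left. When the heap is stored in an array with no unused slots, the shape property holds automatically, and insertion and deletion preserve it because they only add or remove the last position.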
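Both checks can be sketched in a few lines. The code below is an illustration under two assumptions: the heap-property check works on a 0-based array, and the shape check works on a linked tree built from a hypothetical Node class defined here.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


def is_max_heap(a: list) -> bool:
    """True if every element is no larger than its parent (0-based array)."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_complete(root: Optional[Node]) -> bool:
    """Level-order scan: no node may appear after a missing position."""
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        elif seen_gap:
            return False            # a node to the right of a gap
        else:
            queue.append(node.left)
            queue.append(node.right)
    return True


good = Node(8, Node(5, Node(3), Node(4)), Node(7, Node(6)))
bad = Node(8, Node(5), Node(7, Node(6)))  # 5 has no children, 7 has one

print(is_max_heap([8, 5, 7, 3, 4, 6]))       # True
print(is_max_heap([8, 5, 7, 3, 9, 6]))       # False, 9 is above its parent 5
print(is_complete(good), is_complete(bad))   # True False
```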

Overall, maintaining the heap property and shape property requires careful manipulation of the binary heap structure during insertion, deletion, and other heap operations, to ensure that the desired properties are maintained at all times.

19.2 Describe heap operations using pseudocode

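Here is pseudocode for the main heap operations. Arrays are 1-based here, so the children of node i are at indices 2i and 2i + 1 and its parent is at index floor(i/2). A.heapsize is the number of array elements currently in the heap.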

19.2.1 Max-Heapify(A, i):

left = 2i
right = 2i + 1
largest = i

if left <= A.heapsize and A[left] > A[largest]:
    largest = left

if right <= A.heapsize and A[right] > A[largest]:
    largest = right

if largest != i:
    swap A[i] and A[largest]
    Max-Heapify(A, largest)

19.2.2 Build-Max-Heap(A)

A.heapsize = A.length
for i = floor(A.length/2) downto 1:
    Max-Heapify(A, i)

19.2.3 Heap-Extract-Max(A)

if A.heapsize < 1:
    error "Heap underflow"
    
max = A[1]
A[1] = A[A.heapsize]
A.heapsize = A.heapsize - 1
Max-Heapify(A, 1)

return max

19.2.4 Heap-Increase-Key(A, i, key):

if key < A[i]:
    error "New key is smaller than current key"
    
A[i] = key
while i > 1 and A[floor(i/2)] < A[i]:
    swap A[floor(i/2)] and A[i]
    i = floor(i/2)

19.2.5 Max-Heap-Insert(A, key)

A.heapsize = A.heapsize + 1
A[A.heapsize] = -infinity
Heap-Increase-Key(A, A.heapsize, key)

These operations allow for the manipulation of a binary heap data structure, maintaining the heap property and shape property as elements are added and removed from the heap.
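For comparison, here is one way Max-Heapify and Build-Max-Heap might look in Python. This is a sketch, not a line-by-line port: it assumes 0-based lists, so the child indices become 2i + 1 and 2i + 2, and it passes the heap size as an explicit heap_size argument instead of storing it on the array.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Restore the max-heap property at index i, assuming both subtrees are heaps."""
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a: list) -> None:
    """Turn an arbitrary list into a max-heap in place."""
    # Indices len(a)//2 and above are leaves, so start at the last internal node.
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))


data = [3, 4, 6, 5, 8, 7]
build_max_heap(data)
print(data)  # [8, 5, 7, 3, 4, 6]
```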

19.3 Implement heapsort using a heap

Heapsort is an efficient in-place sorting algorithm that uses the binary heap data structure to sort an array of elements. Here’s how to implement heapsort using a heap:

  1. Build a max-heap from the array by calling Build-Max-Heap(array).
  2. Swap the first element of the array (which is the largest element in the heap) with the last element of the heap.
  3. Decrement the heap size by 1, so that the element just moved to the back is no longer part of the heap.
  4. Call Max-Heapify(array, 1) to restore the heap property of the remaining elements in the heap.
  5. Repeat steps 2-4 until the heap contains a single element (that is, for the last heap position going from n down to 2).
  6. The array is now sorted in ascending order.

Here is the pseudocode for heapsort:

Heapsort(A):
    Build-Max-Heap(A)
    for i = A.length downto 2:
        swap A[1] and A[i]
        A.heapsize = A.heapsize - 1
        Max-Heapify(A, 1)

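This implementation of heapsort has a time complexity of O(n log n), where n is the number of elements in the array: Build-Max-Heap takes O(n) time, and each of the n - 1 calls to Max-Heapify in the loop takes O(log n) time. Heapsort sorts the array in place, so it needs only O(1) additional memory when Max-Heapify is written iteratively. The recursive Max-Heapify shown above can use O(log n) stack space unless the tail call is eliminated.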
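A runnable sketch of the same procedure in Python, assuming a 0-based list. It uses an iterative bubble-down helper (named sift_down here for illustration) so that no recursion stack is needed.

```python
def heapsort(a: list) -> None:
    """Sort the list a in ascending order, in place."""

    def sift_down(i: int, size: int) -> None:
        # Iterative Max-Heapify restricted to a[0:size].
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            largest = i
            if left < size and a[left] > a[largest]:
                largest = left
            if right < size and a[right] > a[largest]:
                largest = right
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    n = len(a)
    # Step 1: build a max-heap from the whole array.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)
    # Steps 2-5: move the current maximum behind the heap, then shrink the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)


data = [5, 3, 8, 4, 6, 7]
heapsort(data)
print(data)  # [3, 4, 5, 6, 7, 8]
```

The loop variable end doubles as the heap size after each swap, which plays the role of decrementing A.heapsize in the pseudocode.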