One of the most neglected data structures in the CS repertoire is
the heap. Unfortunately, the jargon is overloaded, so “heap” can mean
two different things – but there is a connection, which I’ll explain
in a little while.
In many programming languages, when you can dynamically create
objects in memory, the space from which you allocate the memory to
create them is called the heap. That is not what I’m talking about.
What I am talking about is an extremely simple but powerful structure
that is used for maintaining a collection of values from which you want
to be able to quickly remove the largest object quickly. This may sound
trivial, but this turns out to be an incredibly useful structure. You can use this basic structure to implement a very fast sorting algorithm; to maintain a prioritized set of values/tasks; to manage chunks of memory; etc. (The
prioritized set is the most common case in my experience, but that’s probably because I’ve spent so much of my career working on distributed systems.)
When you’re talking about data structures, one of the important things
you need to do in order to really understand how to choose the best one is to understand exactly what properties a data structure provides: what operations need to be fast; what operations you need to have, but don’t care about the speed; how fast things need to be; how much memory you need to create
one, and so on.
In a heap, what we want is to be able to add an element
to the heap quickly, and to be able to remove the largest element quickly. We’re not going to remove arbitrary elements of the heap: we only remove the largest. We can’t query the heap – that is, we can’t ask “Is X in the heap?” – all we can do is remove the largest value. And while providing these properties, we want to use as little memory as possible.
Now, I haven’t yet defined what I mean by “quickly” in the above definition. There’s a good reason for that: there are a set of tradeoffs
that produce different solutions. For now, we’ll settle for one
version of “quickly”: O(lg n) – that is, in the worst case, the amount of time it takes to either remove the largest value, or to add a new value, is
proportional to the logarithm of the number of values in the heap. We can
provide that kind of performance with a kind of heap called a binary heap.
A binary heap is basically a binary tree with three properties, called
the heap properties of the tree:
- For any node in the tree, the node is larger than
- Every level of the tree except for the leaves is full.
- The leaf level of the tree is full from the leftmost leaf
to the last leaf. (In other words, the last level is filled
left to right.)
So, for example, this figure is a binary heap. It’s got three levels. The first two levels are full; the third level is full on its left side. The next
insert point would be the next leftmost slot, the left child of 6.
We can write a heap in a textual form using parens: we write each node
as a triple: (x, y, z), where x is the value of the parent, and y and z are the parenthesized heaps that are the children of x. So the same heap that I showed above can be written in text form as:
To make it easier to read, we can stretch it across a couple of lines:
(11, (10, (9, (8,,), (2,,)), (5, (3,,), (1,,))), (7, (6,,), (4,,)))
For the moment, we’ll ignore just how this tree is represented. We’ll get
to that later.
It should be obvious how to find the largest element – it’s the root of
the binary heap. But once we remove it, then we don’t have a heap: there’s a hole at the top – so part of the operation of removing the maximum element
has to be re-validating the heap properties.
The way that we do this is called bubble-down What we do is take the last element in the heap – the rightmost leaf – and insert it as the root. Now there’s no hole, but the heap isn’t valid. So we look at the two children of the root, and take the larger one, and swap it with the root. Now the root is larger that both of its children, but the child-heap that now has a new root may be invalid. So we check it, and if it’s not valid, we repeat
the bubble-down operation on that child-heap.
Let’s look at that with an example. Take the heap from the
figure above. If we remove the largest element (11), and then
replace it with the last element of the heap, we get the following
(1, (10, (9, (8,,), (2,,)), (5, (3,,),)) (7, (6,,), (4,,)))
Now, we look at the two children, 10 and 7. 10 is larger, so we swap
the 1 and the 10, which gives us:
(10, (1, (9, (8,,), (2,,)), (5, (3,,),)) (7, (6,,), (4,,)))
Now, we look at the modified subheap – the left child heap. Its two
children are 9 and 5, both larger that 1. 9 is larger, so we swap:
(10, (9, (1, (8,,), (2,,)), (5, (3,,),)) (7, (6,,), (4,,)))
Now we’ve modified the left-left child subheap, and it’s invalid, so we
again swap for the larger child:
(10, (9, (8, (1,,), (2,,)), (5, (3,,),)) (7, (6,,), (4,,)))
Now our heap is valid again.
So how long did that take? We removed the root, and replaced it with
the last child. That’s a couple of steps – not the dominant cost of the operation. Then we did the bubbledown – that’s the main cost. So how many
steps are there in a bubbledown? In the worst case – like the example case above – you’ll need to do a number of swaps equal to the depth of the tree. The depth of the tree is lg(n), where n is the number of values in the tree. So we do at most lg(n) swaps – so the cost of removing the largest element
is O(lg n).
What about adding an element? It’s basically the inverse of removing the root; it’s a bubble-up. We insert the value into the position
after the last leaf. Then we compare the leaf to its parent. If it’s
larger than its parent, we swap the node with its parent. And we repeat going
up the tree.
Let’s look at a quick example of insert. Say we wanted to insert 12
to our heap. We’d start by adding 12 as a leaf:
(10, (9, (8, (1,,), (2,,)), (5, (3,,), (12,,)) (7, (6,,), (4,,)))
It’s larger than its parent, 5, so we swap:
(10, (9, (8, (1,,), (2,,)), (12, (3,,), (5,,)) (7, (6,,), (4,,)))
And so on – we’ll swap two more times, giving us a new heap:
(12, (10, (8, (1,,), (2,,)), (9, (3,,), (5,,)) (7, (6,,), (4,,)))
So again, we’re looking at a maximum of lg(n) swaps to
do an insert. So we’re O(lg n) for insert.
I’m running out of writing time, so I’ll have to leave a
description of how to represent a heap, and an example
implementation of a heap for this evening. But I’ll leave you with one
quick example of how a heap can be used. Suppose I want to sort
a list of integers. We know that the best worst-case time for
sorting is O(n lg n). We can implement a very simple sort by inserting all
of the values into a heap, and then removing them in order. That’s n inserts, each with maximum cost lg(n); and n removes, maximum cost lg(n). So it’s
O(n lg n) to sort the list – and no O(n2) case like quicksort!