# Mathematical Data Structures Part 1: Binary Heaps

Being a PhD in computer science, it’s only natural that I love data structures. In particular, I’m fascinated by the way that the math factors in to the way we structure data. Data structures fit into a beautiful intersection between information theory and algorithms: the way that a good data structure is built is a reflection of what information it really needs to maintain. The best data structure encodes exactly the information it needs in order to do it’s job – no more, and no less. The mathematical impacts of that are beautiful, and sometimes surprising. To try to demonstrate that, I’m going to take a couple of posts, and work my way through one of my favorite examples of a surprising outcome in a structure called a fibonacci heap.

A heap is a structure designed to solve a common problem. You’ve got a collection of objects, each of which has an associated numeric value. You want, at any time, to be able to find and remove the largest value in the collection, and to be able to add new elements to it. Those two operations are the core of the heap. Some variations also allow you to increase the value of objects inside the heap, or to remove values other than the maximum.

There are a lot of different ways to implement a heap. One obvious one is to just maintain a sorted sequence of objects. The problem with that is performance: some of the common operations are painfully slow!

Using the sorted sequence approach, removing the largest value is easy: you just remove the last element of the sequence. That’s very fast: it’s constant time. But you also need to be able to add values to the heap, and that’s not so good.

There’s two basic ways of doing a sequence: an array, or a linked list. In both cases, the performance isn’t acceptable. If we used an array,then in order to add a new object to the collection, we’d need to:

1. Find the correct position for it in the array. We can do that by doing a binary search, which takes time $O(lg n)$ where $n$ is the length of the array. This step isn’t bad – in general, we’re pretty happy with $O(lg n)$ operations.
2. Insert the value into the array – which means shifting all of the elements that come after it one place to the right. That’s $O(n)$ time, which is pretty crappy.

In the linked list approach, inserting the value isn’t a problem – it’s a constant time operation. But finding the position where it should be inserted is linear time. So we’re still talking about linear time.

Similarly, we could use a linked list, where inserting the element is constant time, but then finding its position is $O(n)$ – again, unacceptable.

The problem with the sorted sequence approach isn’t really related to the kind of structure we use to maintain the sorted list; the problem is that we’re maintaining more information that we need. At any time, we want to be able to find the largest element of the heap quickly – we don’t care about the relative positions of any pair of values that don’t include the largest element of the collection! But if we keep a sorted list, every time we insert an element, we’re spending a lot of time comparing things whose comparison we don’t really care about!

To be able to make it faster, we need to build a data structure that doesn’t waste time and effort computing and maintaining information that we don’t want.

So our goal is to find ways of building structures that always let us both find the largest element quickly, and add new elements quickly, without maintaining more information that is really necessary. We’ll start off with a simple but good version, and then work our way through to better ones.

The most basic implementation of a heap is called a binary heap. A binary heap is a binary tree with two key properties:

1. Every node in the tree is larger than its children.
2. The tree is left-full: every level of the tree is full except for the last; and the last level is filled in from left to right.

The left-full property might seem a bit strange, but it turns out to be pretty straightforward. A binary heap can be implemented using an array. The root node is stored in the first position of the array; its children are in positions 2 and 3; the children of node 2 are stored in positions 4 and 5; the childen of position 3 are stored in positions 6 and 7. Using one-based indices, for any node N, it’s children are stored in positions 2N and 2N+1. Adding a new leaf to the tree can always be done by just appending one value to the array. The left-full property just means that you always extend the array by adding an element onto the left.

Implementing a heap this way is simple:

1. To get the maximum value, you just look at the first element of the array – O(1).
2. To remove the largest element from the array, you get the value from the first element of the array, and save it. Then you remove the last element from the array, and bubble it down – swapping it with one of its children if they’re bigger than it. We’ll look at this in more detail, but the bubble down process is O(lg n) in the worst case.
3. Inserting a new element is done by adding it to the end of the array, and then bubbling up, by comparing it to its parent, and swapping if it’s bigger than its parent. Again, it’s $O(lg n)$.

I’m going to show code for this. For fun, I wrote the code in a language called xtend. Xtend is a Java extension that cleans up the syntax, gets rid of semicolons, improves the type system, adds lambdas, and does a few other really neat things.

The whole beast is just a wrapper around an array:

```class BinHeap> {
val ArrayList _contents

new() {
_contents = new ArrayList()
}

...
}
```

If you know Java, this is mostly clear. In xtend, you write constructors using the name “new” instead of the name of the class being constructed.

Then we’ll set up some utilities to make other stuff easier to write.

```  def leftChildPosition(int pos) {
2 * (pos + 1) - 1
}

def rightChildPosition(int pos) {
2 * (pos + 1)
}

def int parentPosition(int pos) {
if (pos == 0) {
throw new MaxHeapException()
} else {
(pos + 1)/ 2 - 1
}
}

def void swap(int one, int two) {
val T first = _contents.get(one)
_contents.set(one, _contents.get(two))
_contents.set(two, first)
}
```

Again, these should be straightforward. The only tricky thing is that the JVM uses zero-based arrays – so the left child of the node in position $N$ is $(2*(N+1) - 1)$: we need to add one to the node number to shift to one-based position; and then subtract one from the result to switch back to zero-based position. We do a similar thing for each of the other position computations.

Now we can get to the interesting bits. How do we get values into the heap?

```def insert(T v) {
val idx = _contents.size()
bubbleUp(idx)
}
```

Insert is exactly what I described in prose above: append the new value onto the end of the array, and then bubble it up. Bubbling is the interesting part:

```  private def void bubbleUp(int pos) {
if (pos > 0) {
val parentPos = parentPosition(pos)
if (_contents.get(pos) > _contents.get(parentPos)) {
swap(pos, parentPos)
bubbleUp(parentPos)
}
}
}
```

Bubbling up from a position $P$ compares $P$ to its parent. If it’s bigger than its parent, it swaps positions with the parent, and then tries to continue bubbling up from its new position.

For example, imagine we had a tree like:

```9
8
5
4
0
6
3
7
2
1
```

Now, suppose we wanted to add the value “10” to this. We’d add 10 to the end of the array, which would make it a child of 6. That would give us:

```  9
8
5
4
0
6
3
10
7
2
1
```

So, we’d compare 10 to its parent – it’s bigger, so we’d swap:

```  9
8
5
4
0
10
3
6
7
2
1
```

Then we’d compare 10 to its new parent, 8. It’s bigger, so we swap:

```  9
10
5
4
0
8
3
6
7
2
1
```

And finally, we’d compare 10 to its new parent, 9. It’s bigger so we swap, and then we’re done.

```  10
9
5
4
0
8
3
6
7
2
1
```

Appending to the end of the array is constant time, so the dominant time cost is the bubbling. The maximum possible number of swaps in the doubling process is depth of the tree minus 1 – and the depth of a full binary tree with N members is $\lceil ln N \rceil$. So it’s $O(lg n)$ swaps, and the overall cost of inserts is $O(lg n)$.

Getting the largest value is trivial:

```def getMax() {
_contents.get(0)
}
```

Removing the largest value is a lot like adding a value: we really play with the last element of the array, and then do a bubbling process – only this time we’ll bubble in the opposite direction:

```  def removeMax() {
if (_contents.size == 0) {
throw new MaxHeapException()
} else {
val result = getMax()
val last = _contents.remove(_contents.size() - 1)
if (_contents.size() > 0) {
_contents.set(0, last)
bubbleDown(0)
}
result
}
}
```

Bubbling down is similar to bubbling up, but it’s a bit more complicated, because we need to look at both children.

```  private def void bubbleDown(int pos) {
val rightChildPos = rightChildPosition(pos)
val leftChildPos = leftChildPosition(pos)
if (leftChildPos >= _contents.size) {
return
}
// Try to bubble left if there is no right child, or if the lift child is
// bigger than the right.
if (rightChildPos >= _contents.size || _contents.get(leftChildPos) > _contents.get(rightChildPos)) {
if (_contents.get(pos) < _contents.get(leftChildPos)) {
swap(pos, leftChildPos)
bubbleDown(leftChildPos)
}

} else {
// Try to bubble right
if (_contents.get(pos) < _contents.get(rightChildPos)) {
swap(pos, rightChildPos)
bubbleDown(rightChildPos)
}
}
}
```

The process is almost the same as bubbling up, but moving in the opposite direction. We're starting with a parent node, and comparing it to its children. If it's bigger than either of its children, then we swap it with the largest child, and then continue bubbling down.

For example, let's look at the same heap we looked at for insert:

```  9
8
5
4
0
6
3
7
2
1
```

If we want to remove 9, we set the value 9 aside, and then remove 3 from the end of the array, and put it at the root of the tree:

```  3
8
5
4
0
6
7
2
1
```

Then we'd compare 3 against its two children, 8 and 7. Since 8 is the larger child, we swap 8 for 3:

```  8
3
5
4
0
6
7
2
1
```

Now we compare 3 with its new children, 5 and 6. 6 is bigger, so we swap 6 with 3:

```  8
6
5
4
0
3
7
2
1
```

3 has no children, so we're done: it's bubbled down as far as it can go.

Note: I messed up this example in the original version of the post. Thanks to John Armstrong for pointing it out.

The cost here is the same as insert, for the same reason. The dominant cost is the bubbling, and the bubbling is bounded by the depth of the tree. So removing the maximum is also $O(lg n)$.

It's worth noting that heaps can be used to build a very reasonable sorting algorithm. To sort a collection, just insert all of the elements of the collection, and then remove them one by one. It's $O(n lg n)$, and it's conceptually quite simple. It's not widely used, because the old classic quicksort is faster - not in big(0) notation, but it ends up with a smaller constant. (In big-O notation, something that takes 3(lg n) steps and something that takes 6(lg n) steps are both $O(lg n)$, but the one whose constant is 3 is still twice as fast as the one whose constant is 6.)

## 8 thoughts on “Mathematical Data Structures Part 1: Binary Heaps”

1. _Arthur

The Linux scheduler, known as the Complely Fair Scheduler, uses a red-black tree instead of a heap.
The process with the most accumulated run time credits is always the leftmost node.

2. John Armstrong

There are a few times you slip into “smallest” instead of “largest”. More seriously, you pick the wrong element in your bubbling-down example; e end of the array is 3, not 1, and so the resulting heap is not left-balanced in the end.

3. valerio

Very clear post! If I can make a suggestion, consider changing the introductory part where you state:

“To remove the largest element from the array, you get the value from the first element of the array, and save it. Then you remove the last element from the array, and bubble it down”

This made no sense to me, until I reached the code below and I found that you’re supposed to take the last element to the top of the heap and THEN bubble it down.

4. Eeyore

How would this compare to having a class that tracks the largest, and a collection for everything else?

1. markcc Post author

If you did that, then every insert would be O(1): you’d just need to compare the current largest against the new element.

But every time you removed the largest element, you’d have to do an O(1) search through the collection for the new largest. Linear time for remove-largest is really, really bad.