Zippers: Making Functional "Updates" Efficient

20 Replies

In the Haskell stuff, I was planning on moving on to some monad-related
stuff. But I had a reader write in, and ask me to write another
post on data structures, focusing on a structured called a
zipper.

A zipper is a remarkably clever idea. It’s not really a single data
structure, but rather a way of building data structures in functional
languages. The first mention of the structure seems to be a paper
by Gerard Huet in 1997, but as he says in the paper, it’s likely that this was
used before his paper in functional code — but no one thought to formalize it
and write it up. (In the original version of this post, I said the name of the guy who first wrote about zippers was “Carl Huet”. I have absolutely no idea where that came from – I literally had his paper on my lap as I wrote this post, and I still managed to screwed up his name. My apologies!)

It also happens that zippers are one of the rare cases of data structures
where I think it’s not necessarily clearer to show code. The concept of
a zipper is very simple and elegant – but when you see a zippered tree
written out as a sequence of type constructors, it’s confusing, rather
than clarifying.

The basic idea of a zipper is to give you a way of efficiently working with data
structures in a functional language. There are a lot of cases where in an imperative
language, there’s some basic operation which is cheap and simple in the imperative
language, because it’s performed by an in-place update. But in a functional language,
you can’t update a field of a data structure: instead, you have to create a new copy of the structure with the altered
field.

For example, consider the list [a b c d e f g]. Implemented
as a cons-list, it’s a list of 7 cons-cells. Suppose you wanted
to replace “e” with “q”. In an imperative language, that’s no problem: just
do a set-car! of the 5th cell. In a functional language, you would
need to create a new list with “q” instead of
“e”. You could re-use the common tail [f g], but you would need
to re-create the other 5 cells: you’d need to create a new cell to
attach “q” to [f g]. Then you’d need to create a new
cell to connect “d” to [q f g]. And so on.

That makes the functional program much slower than the imperative one.
If you’ve got a data structure that conceptually changes over time, and you’re going to make lots of changes,
the cost of doing it functionally can become very high, because of all of the copying
you do instead of mutating a data structure.

In general, it’s very hard to get around that. You can’t update in place
in a functional language (at least, not without some serious cleverness, either
in your code (like monads), you language (like linear types), or your compiler).
But for many applications, there’s some notion
of a focus point – that is, a particular key point where changes
happen — and you can build structures where updates around the focus
can be performed efficiently.

For example, if you’re building a text editor, you’ve got the point
where the cursor is sitting – and the changes all happen around the cursor.
The user might type some characters, or delete some characters – but it always
happens around the cursor.

What a zipper does is take a data structure, and unfold it around a focal
point. Then you can make changes at the focal point very quickly – about as
quickly as an in-place update in an imperative language.

The idea of it is a lot like a gap-buffer. Right now, I’m actually working
on a text-editor. I’m writing it using a gap-buffer. Conceptually, an
edit-buffer is one continuous sequence of characters. But if you represent it
as a continuous sequence of characters, every insert is extremely expensive.
So what you do is split it into two sub-sequences: one consisting of the
characters before the cursor point, and one consisting of the
characters after the cursor point. With that representation,
inserting a character at the cursor point is O(1). Moving by one character is
also O(1). Moving by N characters is O(N). With various improvements, you can
do much better than that – but the key bit is that split between before the
focus point and after it.

A zipper is a tree or graph-based version of a similar idea. For this
discussion, I’ll describe it in terms of trees; the graph version is more complicated,
but you should be able to get the idea from seeing how it works on trees. The
idea is that you take the tree structure, and you split it around a focus. You’re focused on some node in the
tree. You keep track of a set of nodes that come before you, and a
set of nodes that come after you – those are basically like the
pre-gap and post-gap regions of a gap buffer. But because you’re working in a
tree, you need a bit more information: you need to know the path from
the root of the tree down to the current node.

It’s called a zipper because what you do to create this pre-focus, path,
and post-focus bits of the structure is unzip the tree. For example,
look at the tree below. It’s a representation of a string of text represented
by a tree. In this particular tree, all of the data is stored in the leaves.
The internal nodes contain metadata, which I haven’t shown in the diagram.

Now, suppose I want to put the focus on “mno”. To do that, I climb down
the tree, unzipping as I go. I start at the root, node N1. Then I go
right. So I put N1 and its left subtree into the left-context of my
zipper-tree, and add “Right at N1” to the path. That puts the focus at N3. To
get to “mno” from N3, I need to go left. So I put N3 and its right child into
the right context, and add “Left at N3” to the path. Now the focus is at N4.
To get to “mno”, I need to go right: so I put N4 and its left child into the
left context, and add “Right at N4” to the path. Now I’ve got the focus set
where I want it at “mno”; and I’ve got right and left contexts.

With the zipper, you can make all sorts of changes very easily at the
focus. Suppose I want to change the focus node, by inserting some text. I can
do that functionally, without actually changing anything, by creating a new
zipper tree which is identical to the old one, but which changes the value of
the focus node – that is, if I were to add “123” right after “mno”, I could do
it by creating a new focus node “mno123”, with the same path, left, and right
contexts. It takes minimal extra memory to create the copy, because I can
re-use the path and the contexts.

I could also add new children nodes. Suppose that instead of adding
“123” to the focus, I want to keep each leaf containing three characters.
could replace the focus with a new node, N5, which had children “mno”
and “123”. I could re-use the “mno” node, and the path, left, and right
contexts.

That’s the beauty of the zipper: most operations can be in terms of local
changes, re-using most of the structure. If we were using a standard tree,
then to add a new node in the position of “mno”, we would need to create
copies of N4, N3, and N1; instead, we only need to create the one new
node.

Doing other things isn’t that difficult either. Suppose we wanted to move
the focus to “pqr”. We’d need to shift the focus from “mno” to N3, then to N3,
and then to “pqr”. To get from “mno” to N4, we take the last step off of the
path – which says we went right at N4 – so we set the focus to N4, and
re-establish “mno” as its right child. So the focus would be N4, with “jkl” as
its left child, and “mno” as its right child. To get from N4 to N3, we unroll
another step of the path: since we went left at N3, that means that N3 is the
new focus, with N4 as its left child. Then we’d go down to the right from N3,
so we’d add “right at N3” to the path, and “pqr” would be the new focus.
Moving the focus like that is a tad more difficult than just traversing
non-zipper tree, but it’s not significantly slower – and it makes the edits
much, much faster.

So why is it harder to code? Because when we’re dealing with trees, we’re pretty much always dealing with balance. And balance isn’t a local property. No matter which kind of tree you use – red/black, 2/3, AVL – you might need to climb up the tree to do the balance maintenance. That mangles the simple zipper.

You’ve got two choices. One is to re-balance
the tree immediately. You can definitely do that. For
example, if you think of how you do a re-balance in
a red-black tree, you climb up the tree doing fixes until you’ve got things rebalanced. You can definitely do that – by using the zipper to move around the tree. But a big part of the point of the zipper is to keep operations local, and the re-balancing is not a
local operation. Much of the time, you can do things
locally, but sometimes you’ll be stuck re-zipping as you move the focus up the tree fixing the balance; in
the worst case, you need to re-zip the entire tree, all the way to the root.

The alternative is something called scarring. You put marks in the tree called scars that identify places where you made changes that could trigger a rebalance. (Or more generally,
in places where you made an edit that could have violated some invariant of the data structure.) You don’t do the fix immediately – you just mark it with
the scar, and then at some point, whenever it makes sense for your application, you go back to the scars, and fix the tree. (Scaring can also have a more general meaning, which involves memorizing certain paths through
the tree, so that you can make changes at the leave, then a few steps up, then back at the leaf. It’s a similar concept; in both forms of scarring, you’re optimizing to reduce the cost of zipping up and down the tree. )

Either way, it gets a bit more complicated – and when you look at the code
for a zipper, the re-balancing/invariant fixing has a tendency to dominate the complexity
of the code. The zipper itself is so simple and so elegant that it just disappears under
the weight of tree-balancing.

20 thoughts on “Zippers: Making Functional "Updates" Efficient”

Ricardo Herrmann January 13, 2010 at 10:43 pm

Don’t forget mentioning that “The Derivative of a Regular Type is its Type of One-Hole Contexts“. Who said programmers don’t need to know Calculus ? 😉 That would also make a good related post (hint, hint).

Loading...

Reply ↓
Ricardo Herrmann January 13, 2010 at 10:54 pm

Here are some more good posts on ADT differentiation:

Finite Differences of Types
Differentiation of Datastructures for Non-Functional Programmers
Derivatives of Types

Which simply means there’s no need to be smart about the construction of zippers.

Loading...

Reply ↓
MPL January 13, 2010 at 11:50 pm

I heard the structure of zippers compared to the structure of a recursive computation on a tree. There are downward pointing nodes corresponding to the parameters that would be active in the local context, with upwards nodes (so to speak) representing the return path. Viewed in the right way, with a bit of mental stretching, you could see a zipper as a sort of manipulable continuation. Other zipper-like structures can supposedly be constructed in a similar manner. I guess they make a good example of the code/data duality.
Of course, balanced trees and zippers are always going to be slightly at odds, like you said, because zippers try to optimize local access, while balanced trees try to optimize global access. If insertions/edits are not locally clustered, a zipper might not make sense at all.

Loading...

Reply ↓
Ørjan Johansen January 14, 2010 at 12:36 am

MPL: You beat me to the remark about continuations. Actually, I think it’s easier to view continuations as zippers – isn’t that what a call stack is, the unzipped “after” part of an ongoing computation.
Oleg Kiselyov also did the reverse, implementing zippers in a completely generic way as (delimited) continuations, at http://okmij.org/ftp/Computation/Continuations.html#zipper
Now that requires mental stretching. I won’t pretend to understand it.

Loading...

Reply ↓
MPL January 14, 2010 at 1:44 am

Oooh, that looks like great fun, thanks for the link. I’ve been trying to learn Haskell lately (perverse or not, I like starting at the hard end).
Reading about zippers and continuations were what made me finally understood the lisp saying that “data is code and code is data” (in a deeper way than “sexps are made of pairs”): the data reflects past computations, and guides future ones. At its best (map, foldr, zippers), this sort of thinking can produce beautiful things.

Loading...

Reply ↓
ja January 14, 2010 at 2:09 am

Actually, balanced trees and zippers don’t have to be at odds. In fact, fingers (as in finger trees) are just zippers without holes!
The 2-3 finger tree view designed by Hinze & Paterson has two fingers, one at each end. Fingers can actually be located anywhere in many types of tree, including AVL trees. See, for instance, Brown & Tarjan’s “A fast merging algorithm”, especially the Joe-Versus-the-Volcano-esque Figure 7 (at the end of the report):
“A fast merging algorithm” by Brown & Tarjan
Brown & Tarjan do go to some effort to rebalance the AVL tree, but 2-3 finger trees show that it is possible to maintain balance without working quite so hard.
Joe Versus the Volcano

Loading...

Reply ↓
Federico January 14, 2010 at 6:10 am

Thanks so much for the post !! As always a very clear explanation.

Loading...

Reply ↓
Daniel Sobral January 14, 2010 at 9:11 am

That actually misses one of the most important uses of zippers, which is when one needs to “move” in both directions.
Functional data structures usually support only one-way moving. One can only go down on trees, never up to the parent, only right on lists, and so on. This makes it possible to maximize reuse of such structures when making modifications.
Some algorithms, however, need to be able to go in more than one direction. For instance, XPath queries may refer to the parent of a node.
Zipper then provides a way to enable traversal of immutable structures both ways, in an efficient manner.

Loading...

Reply ↓
Ricardo Herrmann January 14, 2010 at 10:09 am

Here are some more good posts on ADT differentiation:

Finite Differences of Types
Differentiation of Datastructures for Non-Functional Programmers
Derivatives of Types

Which simply means there’s no need to be smart about the construction of zippers.

Loading...

Reply ↓
Eugene Day January 14, 2010 at 1:02 pm

I’ve lurked at your blog for a long time and really like it, although I’m not much of a programmer, I really enjoyed the stuff on chaos. Control systems and Operations research are my specialties.
I’m starting a semi-formal, casually peer reviewed science blog, and I’m seeking qualified reviewers. Anyone who would like to drop by is welcome.
eclectologia(at)gmail.com

Loading...

Reply ↓
Brendan Miller January 14, 2010 at 3:13 pm

I love your data structure posts! Keep up the good work please. It’s so rare to see blogs discussing computer sciency issues.
99% of the programming blogs out there just don’t touch anything related to data structure or algorithms… Which makes me said, because that’s really the most fun and intellectually challenging aspect of programming.
I’m also glad you didn’t make this a Haskell specific post. While I’ve used functional languages like ML and Scheme, I haven’t tried Haskell yet, so some of the Haskell specific issues tend to go over my head.

Loading...

Reply ↓
Bard Bloom January 14, 2010 at 3:30 pm

Nice discussion! Thank you. I’m kind of far from functional languages anymore, but it’s good to see your articles on them.

Loading...

Reply ↓
Leon P Smith January 14, 2010 at 5:15 pm

There is another approach to persistent data structures than what is used by purely functional languages, as detailed in Making Data Structures Persistent by Driscoll, Sarnak, Sleator, and Tarjan. (also available here)

Loading...

Reply ↓
augustss January 14, 2010 at 5:38 pm

Huet’s given name is Gérard.

Loading...

Reply ↓
solrize January 14, 2010 at 9:01 pm

You may have been thinking of Carl Hewitt, who worked on the so-called “Actor model” of programming.

Loading...

Reply ↓
andreas January 15, 2010 at 7:25 am

Nice and well written post. I found a small typo. The First N3 in
“We’d need to shift the focus from “mno” to N3, then to N3, and then to “pqr”.” should be N4 as your saying in the next sentence.

Loading...

Reply ↓
MikeRo January 15, 2010 at 1:09 pm

With languages like F#, we can “escape” pure-FP as desired – so I’m wondering when I have the choice, to use a pure-FP DS with zippers, or use “mutable” when necessary.
The question to me is, how does the “hole context” of a zipper effect concurrent operations on the same DS? Seems to me that you’d need some kind of “zipper manager” that had global state or a way to rewind operations (like managing cache coherency) to get the right answer at the end, which is probably more complicated than non-FP approaches to concurrency.

Loading...

Reply ↓
Ørjan Johansen January 16, 2010 at 5:05 pm

MikeRo: It is my very vague understanding (mostly from just reading the abstract) that the second article in that Kiselyov link I gave above is sort of about such “zipper managers”, and that indeed things get very complicated when updating at two or more different locations simultaneously. (I’m not sure if he supports actual concurrent updates or just interleaved sequential ones.)
One thing I believe you lose immediately when using almost any mutability in your data structure is the ability to keep old versions of the data structure hanging around, e.g. for undo.

Loading...

Reply ↓
MPL January 18, 2010 at 12:53 am

@14,15
Skimming the Kiselyov link, it sounds like he’s arguing that you can construct several different solutions for dealing with the data concurrently.
Having a way to rewind/undo operations actually comes free with immutable/functional data structures: just keep a reference to the previous version (well, not free. Eventually the garbage collector will send you a bill). If you don’t need to merge the results back together (which is entirely plausible in some cases), then nothing else has to be done.
If you do need to get all the processes working on the same data again, there’s a few options. You can always do the basic thing and take out locks on the data: this is no better or worse than the mutable solution.
The sophisticated, concurrent shared update solutions are complicated. But that problem is inherently complicated, so that’s not a big loss either. They are all, essentially, ways of merging two sets of updates together.
The huge advantage of these functional data structures is that although it may be difficult to get something that works concurrently at all, it is much harder to get random nonsense errors, like carelessly done mutable shared data structures can do all too easily.
Interestingly enough, back when I was still following the F#/.Net chatter, there were rumors of adding ways to make it impossible to mutate data structures, so you could enforce some of the functional properties. The motivation being concurrent programing, where mutable data structures can cause havoc.

Loading...

Reply ↓
Pingback: Immutable Data Structures | Loominate