This post started off as an introduction to a post about the simply typed lambda calculus. It got a bit out of control – but reading it over, I think that it’s valuable background. So I’ve made it into its own post.
ST type theory is an interesting idea, but it’s mostly interesting as a historical curiosity and background. Type theory started with things like ST, but it turned into something very different. You can see the roots in ST, but modern type theory takes a completely different approach. ST is an axiomatic foundation for a stratified version of type theory. Modern type theory, in contrast, has constructed a universe of mathematics that, rather than being based on an axiomatic foundation of first order predicate logic, is based almost entirely on ideas of computability and computation.
The core of the modern version of type theory is the intuitionistic type theory of Per Martin-Lof, and it’s part of the constructionist school of math.
Like ST type theory, constructivism can also been seen as a reaction to paradox and inconsistency. Some mathematicians decided that the right way to approach the problem of paradox was to build a better, stronger axiomatic foundation. That effort let to things like ST type theory, ZFG/NBG set theory, and the principia mathematica. But others thought that that approach was wrong.
The axiomatic approach ends up with things like the axiom of choice in its foundation. And the axiom of choice leads directly to results that seem to be completely nonsensical. For example, the classic Banach-Tarski paradox.. Banach-Tarski says that there’s a way of taking a sphere the size of an orange, cutting it into some very odd slices, and then reassembling those slices into two oranges exactly the same size as the original. On the face of it, that’s ridiculous!
The key to Banach-Tarski is that it’s built on an operation that’s impossible in reality. It’s exploiting the boundaries of a theory. You start with a sphere. That’s an object on which metrics like “volume” are well-defined. You’re “cutting” it – not really cutting, but actually subdividing it – into pieces. Those pieces are objects where the metrics no longer work. You’ve broken the metric. And then, you’ve got objects for which the entire concept of volume no longer makes any sense, and you reassemble them into new objects – and those new objects are, once again, objects where the volume metric works.
From one point of view, you’ve done something that makes sense. You started with a topological space with meaningful metrics. A topological space is just a set with a bunch of cute structural properties – the neighborhood relationships. Those structural properties in turn have higher order structure properties, which can be used to define metrics like distance and volume. So the original sphere has a meaningful volume. But still, mathematically, it’s just a set with neighborhood relationships.
Since it’s just a topological space, you can divide it into sub-spaces – that is, subsets which retain structural properties. They’re still topological spaces, but now, the neighborhood relations, because of the way that you divided the original set, no longer have the properties that you need to define metrics. They’re not metric spaces.
When you recombine them, you can combine them into two new topological spaces that do have the properties that you need to define metrics – so volumes are, once again, meaningful. But in that process, you’re redefining the metrics, and so they can be redefined in any way that you want, producing any results that you want.
There’s a pretty strong argument that that entire process is an abuse of the underlying axioms. It only works because you can destroy the metric, and then recreate it. When you state what’s going on clearly, it seems like it’s cheating. But according to the standard axiomatic set-theory based math, everything that happens is completely legal. It’s ridiculous and non-sensical, and it produces a result that looks like it must be obviously wrong, but according to the axioms, it’s just fine.
The problem of cheating on the metrics is bad, but it’s not all that’s wrong with Banach-Tarski according to the constructivists. B-T says that you can cut a spherical metric space into pieces. Only you really can’t – it’s all just a clever sleight-of-hand! You can’t describe or specify how to cut the original sphere into pieces. There is no shape, no structure, no process that you can use to actually cut the sphere. The axioms say that such a cut exists, but the actual method of cutting is not just unknown, but essentially unknowable.
The constructivists argue that that’s rubbish. How can you argue that it exists, but we can’t produce it, we can’t find it, we can’t study it. According to the constructivists, it exists only as an artifact of a flawed system. It’s meaningless to say that something exists if it only exists in an unreachable theoretical ether.
The constructivists say that things like Banach-Tarski and the Russell paradox are part of the same basic problem: defining things not by how they can be created, but by how they can be described using some fuzzy airy-fairy gibberish derived from axioms. The solution isn’t to find a better set of axioms, but to find a fundamentally different way of doing things.
The constructivist method says that the way to do math is to build things. That the only way you can really consider a statement to be true is if you have concrete proof: the only way to prove that something exists is to show a concrete example of it. The only way to prove that something is false is with a specific counterexample.
That leads to an absolutely fascinatingly different way of doing math, which is what we’ll see in Martin-Lof’s intuitionistic type theory. When you talk about typed programming languages, you’ll hear a mantra: “The program is the proof”. That is literally true. In Martin-Lof’s intuitionistic type theory, the way that you prove things is by, essentially, writing programs! A proof of a fact is a computation. A proof of an existential statement isn’t just some abstract reasoning that shows that it exists – it’s a computation that produces a concrete example. A proof of a universal statement is a metaproof – a program that, given any value, produces a proof specific to that value.
It avoids the basic problem with the axiomatic problem. You can’t have something ridiculous like the set of all sets that don’t include themselves, because the only way to show that a set exists is to show a process to create it – and there’s no way to create a paradoxical set! You don’t get nonsense like Banch-Tarski, because the division of the spherical topological space into the non-metric subspaces is impossible: you can’t show a process that produces it. In the world of intuitionism and constructivism, existence proofs that don’t produce concrete examples don’t exist!
It’s an impressive thing, and it’s really a mind-blowingly different way of understanding what it is to do mathematics.
For introducing Martin-Lof’s type theory, I’m going to be mostly working from the Nordstrom/Petersson/Smith text, with supplementation from many different sources. For the programming language oriented part, I highly recommend Types and Programming Languages by Benjamin Pierce.
Before we really get started, I’m going to back and do some review. I’m going to go start with an introduction to the simply typed lambda calculus, and intuitionistic logic, because they’re both really essentials for understanding what’s coming next. So that will be the next couple of posts.