Category Archives: Logic

Metamath and the Peano Induction Axiom

In email, someone pointed me at an automated proof system called [Metamath][metamath]. Metamath generates proofs of mathematical statements using ZF set theory. The proofs are actually relatively easy to follow, which is quite unusual for an automated theorem prover. I’ll definitely write more about Metamath some other time, but I thought it would be interesting today to point to [metamaths proof of the fifth axiom of Peano arithmetic][meta-peano], the principle of induction. Here’s a screenshot of the first 15 steps; following the link to see the whole thing.
peano5-proof.jpg
[metamath]: http://us.metamath.org/mpegif/mmset.html#overview
[meta-peano]: http://us.metamath.org/mpegif/peano5.html

The Category Structure for Linear Logic

So, we’re still working towards showing the relationship between linear logic and category theory. As I’ve already hinted, linear logic has something to do with certain monoidal categories. So today, we’ll get one step closer, by talking about just what kind of monoidal category. As I keep complaining, the problem with category theory is that anytime you want to do something interesting, you have to wade through a bunch of definitions to set up your structures. I’ll do my best to keep it short and sweet; as short as it is, it’ll probably take a bit of time to sink in.
First, we need a symmetry property. A monoidal category C (with terminal object t, tensor ⊗, and natural isomorphisms λ, ρ) is *symmetric* if/f for all objects a and b in C, there is a natural isomorphism γa,b : (a ⊗ b) → (b ⊗ a); and the categories monoidal natural isomorphism ρ = λb º γb,t : (b ⊗ t) → b.
In other words, if you reverse the arrows, but leave everything else the same, you still wind up with a monoidal category.
In a similar vein: regular functors are also sometimes called *covariant* functors. If you take a functor and *reverse* the directions of all the arrows, you get what’s known an a *contravariant* functor, or *cofunctor*.
We also need a *closure* property. Suppose C is a symmetric monoidal category. C is *monoidally closed* if/f there is a functor ⇒ : C × C → C, such that for every object b in C, there is an isomorphism Λ : (a⊗b → c) → (a → (b ⇒ c)), and Λ is a natural transformation for a and c. (Remember that a and c are categories here; this is a natural transformation from category to category.)
Basically, this just says that the tensor stays cleanly in the monoid, and there’s always a natural transformation corresponding to any use of tensor.
So what these two really do is pull up some of the basic category properties (closure with respect to composition, symmetry) and apply them to categories and natural transformations themselves.
Finally, we need to be able to talk about what it means for a *functor* to be closed. So suppose we have a monoidally closed category, with the usual components, up to and including the Λ natural transformation. A functor F : C → C is *closed* if, for all a,b in C, there is an arrow fa,b : ba → F(b)F(a) such that for all g : a → b, fa,b º Λ(gº λa) = Λ(F(g) º λF(a)).
In simple english, what a closed functor is is a functor that can be *represented by* an arrow in the category itself. That’s what all of that gobbledygook really means.
So… as a quick preview, to give you an idea of why we just waded through all of this gunk: if you take a monoidally closed category of the appropriate type, then collections of resources are going to be categories; ⊗ is the tensor to join collections of resources; and “-o” implications are the closed functors “⇒”.

Towards a Model for Linear Logic: Monoidal Categories

Time to come back to category theory from out side-trip. Category theory provides a good framework for defining linear logic – and for building a Curry-Howard style type system for describing computations with *state* that evolves over time. Linear logic provides a way of defining a valid *model* for part of linear logic (the multiplicative operators) that aren’t tractable using other modeling techniques.
I’m going to show you the relationship between models for linear logic and category theory. It’s going to take a couple of days to go through the whole shebang, but even the parts that we need to go through to get there are fun.
The first step is to define a *monoidal category*, also known as a *tensor* category. We’ve already done most of that when we built [monads][monads] earlier this week; a monad is a kind of monoidal category.
A monoidal category is a category C with one object t, and a *binary functor* ⊗ : C × C → C. This binary functor is called *tensor*. Tensor has three required properties defined using *natural isomorphisms*, called α, λ, and ρ.
α says that tensor must be associative: α(A,B,C) : (A ⊗ B) ⊗ C → A ⊗ (B ⊗ C).
λ says that tensor has a *left identity*: λA : (I ⊗ λ) → A.
ρ says that tensor has a *right identity*, which is the same as the left identity: ρA : (ρ ⊗ 1) → A;.
And finally, the natural transformations need to make the following diagrams commute for all values of A, B, and C. These are known as the *coherence conditions* for the monoidal natural transformations.
monoid.jpg
A monoidal category is said to be *strict* if α, λ, and ρ are all identities. It turns out that for every monoidal category, there is an *equivalent* (in the sense of natural isomorphism) to a struct monoidal category.
And now, here comes an example of just why category theory is useful. In some of the [detailed models of quantum physics][quantum-condense], they try to describe the structure of different kinds of matter using what they call *topological orders*. The standard theory for describing the topological orders to different states of matter is called *Landau* theory. It turns out that Landau theory doesn’t describe the topological order of high temperature semiconductors or very-low-temperature condensate states. Category theory – in particular, the theory surrounding strict monoidal categories does a better job of describing the topological order of the different states of matter than any other mathematical approach that’s been tried so far.
[monads]: http://scienceblogs.com/goodmath/2006/07/monads_and_programming_languag_1.php
[quantum-condense]: http://dao.mit.edu/~wen/pub/qorder.html

Linear Logic

[Monday][yesterday], I said that I needed to introduce the sequent calculus, because it would be useful for describing things like linear logic. Today we’re going to take a quick look at linear logic – in particular, at *propositional* linear logic; you can expand to predicate linear logic in a way very similar to the way we went from propositional logic to first order predicate logic.
So first: what the heck *is* linear logic?
The answer is, basically, logic where statements are treated as *resources*. So using a statement in an inference step in linear logic *consumes* the resource. This is a strange notion if you’re coming from a regular predicate logic. For example, in regular predicate logic, if we have the statements: “A”, “A ⇒ B”, and “A ⇒ C”, we know that we can conclude “B ∧ C”. In linear logic, that’s not true: using either implication statement would *consume* the “A”. So we could infer “B”, or we could infer “C”, but we could *not* infer both.
When people talk about linear logic, and why it makes sense, they almost always use a vending machine analogy. Suppose I walk up to a vending machine, and I want to buy a soda and a candy bar. I’ve got 8 quarters in my pocket; the soda costs $1.50; the candy bar costs $.75.
In linear logic, I’d say something like the following (the syntax is wrong, but we’ll get to syntax later): (Q,Q,Q,Q,Q,Q,Q,Q), (Q,Q,Q,Q,Q,Q) ⇒ Soda, (Q,Q,Q) ⇒ Candy.
Using the rules, I can by a soda by “spending” 6 of my Qs. I wind up with “(Q,Q) ∧ Soda”, and “(Q,Q,Q) ⇒ Candy”. I’ve consumed 6 Qs, and I’ve consumed the “(Q,Q,Q,Q,Q,Q) ⇒ Soda” implication. I can’t do anything else; I don’t have enough Qs.
The basic statements in linear logic, with intuitive meanings are:
1. A ⊗ B. This is called *multiplicative conjunction*, also known as *simultaneous occurrence*. This means that I definitely have both A and B. This has an *identity unit* called “1”, such that A ⊗ 1 ≡ 1 ⊗ A ≡ A. 1 represents the idea of the absence of any resource.
2. A & B : *additive conjunction*, aka *internal choice*. I can have either A *or* B, and I get to pick which one. The unit is ⊤, pronounced “top”, and represents a sort of “I don’t care” value.
3. A ⊕ B. This is called *additive disjunction*, also known as *external choice*. It means that I get either A or B, but I don’t get to pick which one. The unit here is 0, and represents the lack of an outcome.
4. A ⅋ B : *multiplicative disjunction*, aka *parallel occurence*; I *need to have* both A and B at the same time. The unit for this is ⊥, pronounced “bottom”, and represents the absence of a goal. In the vending machine metaphor, think of it as the “cancel” button on the vending machine: I decided I don’t want any, so I’m not going to spend my resources.
5. A -o B : Linear implication. Consume resource A to *produce* resource B. The normal symbol for this looks like an arrow with a circle instead of an arrowhead; this operator is often called “lolly” because of what the normal symbol looks like. I’m stuck writing it as “-o”, because there’s no HTML entity for the symbol.
6. !A : Positive exponentiation, pronounced “Of course A”. This *produces* an arbitrary number of As. Equivalent to A ⊗ !A.
7. ?A : Negative exponentiation, pronounced “Why not A?”. This *consumes* As.
Ok. So, suppose I want to talk about buying lunch. I’ve got 10 dollars to buy lunch. I’ll be full if I have a salad, a coke, and a tuna sandwich. If I wanted to write “I’ve got a dollar” as “D”, “I have a salad” as “S”, “I have a coke” as “C”, “I have a tuna sandwich” as “T”, and finally, “I’m full” as “F”
* I can write “I have 10 dollars” in LL as: “(D ⊗ D ⊗ D ⊗ D ⊗ D ⊗ D ⊗ D ⊗ D⊗ D ⊗ D)”.
* I can write “Tuna sandwich and salad and coke” as a group of things that I want to have all of as: “T ⅋ S ⅋ C”.
* I can say that I’ll be full if I have lunch as “T ⅋ S ⅋ C -o F”
If I want to talk about buying lunch, I can describe the prices of the things I want using implication:
* A coke costs one dollar: “D -o C”; I can spend one dollar, and in return I get one coke.
* A salad costs 3 dollars: “(D ⊗ D ⊗ D) -o S”
* A tuna sandwich also costs three dollars: “(D ⊗ D ⊗ D) -o S”
Now, I can do some reasoning with these.
* By taking 1 of the dollars, I can get one C. That leaves me with “D ⊗ D ⊗ D ⊗ D ⊗ D ⊗ D ⊗ D ⊗ D⊗ D ⊗ C”
* By taking 3 D, I can get one S. “D ⊗ D ⊗ D ⊗ D ⊗ D⊗ D ⊗ C ⊗ S”.
* By taking 3 D, I can get one T. “D ⊗ D⊗ D ⊗ C ⊗ S ⊗ T”.
* Now I’ve got my lunch. I can eat it and be full, with three dollars left: “D ⊗ D⊗ D ⊗ F”.
Just from this trivial example, you should be able to see why linear logic is cool: the idea of being able to talk about *how* resources are used in an inference process or a computation is really valuable, and linear logic gives you the ability to really work with the concept of resource in a solid, formal way. If you think of it in terms of the Curry-Howard isomorphism [types-as-proofs concept from the simply typed lambda calculus][types-as-proofs], you can imagine using linear logic for types of values that are *consumed* by a computation – i.e., they’re no longer available once they’ve been used.
I’m going to adopt a slightly different format for the sequents for working in linear logic. The way that I produced the center bars in yesterdays post was really painful to write, and didn’t even end up looking particularly good. So, the way that I’m going to right the sequents in this post is to wrap the “top” and “bottom” of the sequent in curly braces, and separate them by a “⇒”, as in:

{GivenContext :- GivenEntailment } ⇒ { InferredContext :- InferredEntailment}

Now, let’s take a look at the sequent rules for propositional linear logic. I’m using the version of these rules from [Patrick Lincoln’s SIGACT ’92 paper][sigact92]. Yeah, I know that’s a bit of a big load there. Don’t worry about it too much; the important part is the concept described up above; the sequents are useful to look at when you have a hard time figuring out what some operator means in inference. For example, you can see the difference between & and ⊕ (which I found confusing at first) by looking at their sequents, to see what they do.
1. **Identity**: { } ⇒ { A :- A }
2. **Cut**: { Γ1 :- A, Σ1   Γ2, A :- Σ2 } ⇒ { Γ1, Γ2 :- Σ12}
3. **Exchange Left**: { Γ1, A, B, Γ2 :- Σ } ⇒ { Γ1, B, A, Γ2 :- Σ }
4. **Exchange Right**: { Γ :- Σ1, A, B, Σ2 } ⇒ { Γ :- Σ1, B, A, Σ2}
5. **⊗ Left**: {Γ, A, B :- Σ} ⇒ { Γ, A ⊗ B :- Σ }
6. **⊗ Right**: { Γ1 :- A, Σ1   Γ2 :- B, Σ2} ⇒ { Γ1, Γ2 :- (A ⊗ B), Σ12}
7. **-o Left**: { Γ1 :- A, Σ1  Γ2, B :- Σ2 } ⇒ { Γ12, (A -o B) :- Σ12}
8. **-o Right**: { Γ, A :- B, Σ} ⇒ { Γ :- A -o B, Σ}
9. **⅋ Left**: { Γ1,A :- Σ1  Γ2 :- B, Σ2 } ⇒ { Γ12, (A ⅋ B) :- Σ12}
10. **⅋ Right**: { Γ :- A, B, Σ} ⇒ { Γ :- A ⅋ B, Σ}
11. **& Left**: { Γ, A :- Σ } ⇒ { Γ,A & B :- Σ}/{ Γ, B :- Σ } ⇒ { Γ,A & B :- Σ}
12. **& Right**: { Γ :- A,Σ   Γ :- B,Σ} ⇒ { Γ :- (A & B), Σ }
13. **⊕ Left**: Γ,A :- Σ   Γ,B :- Σ} ⇒ { Γ,A ⊕ B :- Σ}
14. **⊕ Right**: {Γ :- A,Σ} ⇒ {Γ :- A ⊕ B, Σ}/{Γ :- B,Σ} ⇒ {Γ :- A ⊕ B, Σ}
15. **!W**: {Γ :- Σ} ⇒ {Γ,!A :- Σ}
16. **!C**: {Γ,!A,!A :- Σ} ⇒ { Γ,!A :- Σ }
17. **!D**: { Γ, A :- Σ} ⇒ { Γ,!A :- Σ}
18. **!S**: { !Γ :- A, ?Σ} ⇒ { !Γ :- !A, ?Σ}
19. **?W**: {Γ :- Σ} ⇒ {Γ:- ?A, Σ}
20. **?C**: {Γ :- ?A,?A, Σ} ⇒ { Γ :- ?A, Σ }
21. **?D**: { Γ :- A, Σ} ⇒ { Γ :- ?A, Σ}
22. **?S**: { !Γ, A :- ?Σ} ⇒ { !Γ,?A :- ?Σ}
23. **⊥ Exp Left**: { Γ :- A, Σ} ⇒ { Γ, A :- Σ}
24. **⊥ Exp Right**: { Γ, A :- Σ} ⇒ { Γ :- A, Σ}
25. **0 Left**: { Γ,0 :- Σ } *(Nothing can be inferred)*
26. **⊤ Right**: { Γ :- ⊤,Σ}
27. **⊥ Left**: { ⊥ :- }
28. **⊥ Right**: {Γ :- Σ} ⇒ {Γ :- ⊥,Σ}
29. **1 Left**: { Γ :- Σ} ⇒ { Γ,1 :- Σ}
30. **1 Right**: { :- 1}
This is long enough that I’m not going to get into how this hooks into category theory today, except to point out that if you look carefully at the multiplication and exponentiations, they might seem a tad familiar.
[types-as-proofs]: http://goodmath.blogspot.com/2006/06/finally-modeling-lambda-calculus.html
[sigact92]: http://www.csl.sri.com/~lincoln/papers/sigact92.ps
[yesterday]: http://scienceblogs.com/goodmath/2006/07/a_brief_diversion_sequent_calc.php