Why Axiomatize Set Theory?

Naive set theory is fun, and as we saw with Cantor’s diagonalization, it can produce some incredibly beautiful results. But as we’ve seen before, in the simple world of naive set theory, it’s easy to run into trouble, in the form of Russell’s paradox and a variety of related problems.

For the sake of completeness, I’ll remind you that Russell’s paradox concerns the set R={ s | s ∉ s}. Is R∈R? If R∈R, then by the definition of R∉R. But by definition, if R∉R, then R∈R. So R is clearly not a well-defined set. But there’s nothing about the form of its definition which is prohibited by naive set theory!

Mathematicians, being the annoying buggers that they are, weren’t willing to just give up on set theory over Russell’s paradox. It’s too beautiful, too useful an abstraction, to just give up on it over the self-reference problems. So they went searching for a way of building up set theory axiomatically in a way that would avoid problems by making it impossible to even formulate the problematic statements.

To some extent, from a modern viewpoint, it can seem a little silly. The problems of naive set theory involve self-reference issues. And most modern math geeks have absorbed the fact that self-reference is inevitably a problem. That’s the simplistic understanding that most of us have of Gödel’s incompleteness theorem: that in any formal system, whether it’s set theory or something else, if it’s expressive enough to be a complete system, there’s some way of forcing it to allow the formulation of self-reference statements that introduce problems like Russell’s.

There’s two things to remember when you think that:

  1. The initial work on axiomatizing set theory predates Gödel.
    The late 19th/early 20th century mathematicians didn’t know about incompleteness,
    and believed that you could create a perfect universal mathematics. So trying
    to form a perfect axiomatization of set theory was a very natural thing for them
    to try to do.
  2. Naive set theory makes problems like Russell’s paradox extremely easy to
    form. They’re ubiquitous. Naive set theory is a beautiful idea, but isn’t a
    particularly well-formed mathematical theory. Axiomatizing it, so that it has
    a solid formal basis is a useful thing to do. It means that until you
    push it out to its limit, that it’s a well-formed valid mathematical theory. So
    even recognizing that we can’t complete avoid the kinds of issues that permeate
    naive set theory, we can reduce their impact, and most of the time,
    avoid them entirely.

The first step in axiomatic set theory is the class/set distinction. I
find that a metaphor comparing set theory to predicate logic is useful for understanding
the reason for that distinction.

Consider a naive version of predicate logic. Not first order predicate logic,
but a fully general predicate logic. In a logic like that, we wouldn’t distinguish between
predicates and objects – that is, a predicate can range over predicates. So we can have
statements like P(P). Suppose that P(x) is defined as meaning “the predicate x is always
false”. Now, suppose that we say P(¬P). That’s a statement that it’s false that P is
always false. If P is always false, then that means that P is wrong when it says
that ¬P is always false. We’ve just found Russell’s paradox in predicate logic form –
the liar’s paradox.

How does predicate logic avoid that problem? By creating a first-order version of predicate logic, in which predicates are not allowed to reason about predicates, and you can’t use quantifiers ranging over predicates to create universal statements. So suddenly, P(¬P) becomes an ill-formed statement.

That’s pretty much what the set/class distinction accomplishes. Proper classes –
classes that aren’t sets – aren’t members of other classes. They’re like the predicates,
which you can’t range over in first order logic. Sets are like the things that can be
quantified over in first-order predicate logic. By doing this, we lose a lot of expressive
ability: a lot of things become excluded from the space of things that we can reason over.
But just like first order predicate logic, most of the time, that expressive limit is acceptable, and it allows us to escape the most common forms of problems that make
the system unsound.

So mathematicians created a structure in which we made that distinction – and create a formal system of axioms that defined the entire basis of the theory. With those axioms, set theory becomes a complete, well-founded basis for mathematics.

Next time, we’ll take an initial look at the axioms, and roughly what they mean. Then we’ll dive in and take a deep look at a couple of the more interesting or controversial ones.

0 thoughts on “Why Axiomatize Set Theory?

  1. Chad Groft

    Referring to Russell’s paradox vs. Gödel’s theorem: Another thing you have to remember is that Gödel’s theorem only proves that contradictions arise in complete systems (with decidable sets of axioms, but that’s not an issue here). Just because a system is incomplete does not mean that it’s incorrect. That’s like calling a person a liar because [s]he’s willing to say “I don’t know” on occasion.
    It’s better to think of Russell’s result and Gödel’s as completely separate, even if the method of proof is somewhat similar. Russell is saying that certain systems, like naïve set theory or Frege’s original system, are invalid. Gödel says that if certain other systems, such as PA, ZF, or NF, are valid (and they may well be), then they don’t encompass all truths.

    Reply
  2. Ed Minchau

    Doesn’t fuzzy logic solve the whole Epimenides paradox? That is, Epimenides was half right. So, no more excluded middle, and no more Russell’s paradox.

    Reply
  3. Ryan Dickie

    I really liked this post. You always find a way of explaining things quite clearly and in simple terms.
    But here i’m going to go off on a tangent. Sometimes i try to think of the universe as a computer. I always find it interesting to think that with simple logic we can be incomplete or inconsistent. I find it amusing to think that if we tried to completely model the universe using mathematics we may find similar problems. Is it possible these paradoxes could manifest themselves in nature (or at least the mathematical formulation of nature)?

    Reply
  4. albion tourgee

    A very nice posting, and excellent comments, particularly Chad’s interesting view of the distinction between Russell and Godel. A wonderful book I once read on set theory by Stephen Pollard proposed that set theory is really more about theories than sets. With regard to Ryan’s post, Pollard advocates thinking of these issues in terms of interpretability of one theory in another theory. So, if you think of the universe as a computer, you may be able to make some progress by interpreting computer theory in a branch of mathematics and then working back and forth. Anyway, Mark, keep up the great posts.

    Reply
  5. Kurt

    I suppose I’m overlooking something obvious, but I don’t follow your example here. Why would the statement P(¬P) be considered paradoxical?

    Reply
  6. Dustin

    Kurt: it’s the formal way of writing the liar’s paradox.
    I’ve always got a small chuckle out of the class/set distinction, but the restrictions of sets by the ZF axiom of comprehension is a necessary nit-pick.

    Reply
  7. Dustin

    @albion

    A wonderful book I once read on set theory by Stephen Pollard proposed that set theory is really more about theories than sets.

    I’d go a bit further than to simply propose that it’s more about theories than sets. It’s a rigorous axiomatic theory, and the sets we know and (might) love are just a model for the axiomatic theory. A set is just taken to be one of the primitive objects of ZFC. It’s a little like formal geometry — formal geometry isn’t about particular lines, it’s a theory of properties of primitive objects that conform to its axioms.
    ZFC, when rigorously axiomatized, is a first order logic. In order to do that, the distinction between membership and containment is dropped — so it already takes on a formal character that we can’t use effectively when trying to distinguish between membership and containment, and that’s important when we’re working with concrete examples of sets — like “The Event ‘3 blue marbles are drawn from the bag'”.
    So I agree.

    Reply
  8. Bob O'H

    Ed – I’m guessing (because I haven’t looked at this properly), but the statement “X has a truth value of 0.4” either has a truth value of 0 or 1. Hence the excluded middle appears at the next level up. So, Epimenides and Russell just meet on the next floor.
    Now someone can show why I’m wrong. 🙂
    Bob

    Reply
  9. Dustin

    Also, the point of ZFC is to give a rigorous formulation to the foundation of mathematics. Also, it wasn’t some kind of presuppositionalist hooey — we needed a formulation which would account for the mathematics we already had.
    Fuzzy logic is great for applications and computational problems, but it isn’t something that we can justify the whole of mathematics with. I think there will be a bright future for fuzzy logic in mathematics, since certain mathematical theorems are so difficult to prove with certainty, but it’s still inadequate as a basis for mathematics.
    There are mathematicians who would agree with discarding the excluded middle even in the course of pure mathematics, but you lose everything but constructive proof in that case. Anyway, they still aren’t using fuzzy logic, since there isn’t anything to be approximated. They’re using what they call “intuitionist logic”.

    Reply
  10. Mark C. Chu-Carroll

    Caledonian:
    That’s a phenominally stupid statement. Ever heard of modal logics? Fuzzy logics? They’re fully meaningful, but they don’t have binary truth values. In most modal logics, you have at least true, false, and undetermined truth; fuzzy logic has degrees of truth.

    Reply
  11. Dustin

    The nervous system doesn’t operate on binary truth values, either. Yet, we’re able to gather meaningful information from everyday experience.
    I think fuzzy logic’s intermediate truth values could be helpful in this new trend of computational/empirical investigation of very difficult-to-prove theorems. Say we’d like to prove that objects with property X must have property Z, but have only managed to prove that objects with properties X & Y have property Z. Why not simply come up with intermediate truth values for Y? So now, thing X has property Y with fuzzy truth value .9999. Now, what does that do to the validity of our proof?
    It would be like sensitivity analysis for pure mathematics. A necessity, now that proofs are getting to be tens of thousands of pages in length.

    Reply
  12. Norm Breyfogle

    Caldonian, I’m getting the impression that you’re either smart for your fairly young age and have only mastered either/or thinking, or that you’re not particularly young but have simply not explored anything much beyond either/or thinking. Are either of these speculations true?
    I don’t mean to offend; I’m truly curious.

    Reply
  13. Thony C.

    Caledonian:
    That’s a phenominally stupid statement. Ever heard of modal logics? Fuzzy logics?

    and not to forget quantum logic.

    Reply
  14. Steve Lockwood

    I am interested in a number of points. Graham Priests dialetheism solves the Liar Paradox by accepting the conclusion that the liar is both true and false.
    Within the fields of science all paradoxes have either been resolved or are being resolved, paradoxes don’t exist in nature. Consider Olber’s Paradox.
    We have to consider how very difficult it is for many to grasp the paradoxical nature of many ideas. Russel’s Set is very difficult to grasp. Whereas the Barber Paradox is easy. The down-side is it is also trivial.

    Reply
  15. Dustin

    But there’s no reality to force resolution of mathematical paradoxes. Mathematics is an entirely mental activity which, in its purest form, has nothing to do with physical reality. That’s why Russell’s Paradox is a real paradox, and why set theory needed to be axiomatized to destroy it. Mind you that a paradox, being logically absurd, is the ultimate goal of a proof by contradiction. A system which allows paradoxes is one which would also allow us, by reduction to absurdity, to prove anything.
    Regarding dialetheism, that doesn’t solve the problem — it dismisses it. Dialetheism also isn’t a formal system. I’m going to try to contain my contempt for it, and I’ll simply point out that because it can’t be formalized, it can’t be a basis for mathematics.

    Reply
  16. Canuckistani

    Chad Groft, I’m going to resort to the appeal to authority to defend Mark’s choice here: Douglas Hofstadter used Russell’s paradox to introduce Goedel’s theorems in Goedel, Escher, Bach: an Eternal Golden Braid.
    Ed Minchau, fuzzy logic easily resolves the Sorites paradox, but I’ve never heard of it being applied to the liar’s paradox. I’d want to see the math worked through on that one.

    Reply
  17. Kurt

    Dustin said: Kurt: it’s the formal way of writing the liar’s paradox.
    Well, yes, that’s what Mark said in his post. However, I fail to see the paradoxical nature of statement P(¬P). Now, either I am having a failure of understanding (it certainly wouldn’t be the first time for that), or Mark misspoke in his definition of P. Which do you suppose is the case? And can you convince me of your position?

    Reply
  18. Ørjan Johansen

    Kurt: You are right, Mark’s definition of P is broken. The definition should be P(x) = (x(x) is false), and then P(P) is paradoxical. Applying an “always” leaves the non-paradoxical possibility that P is sometimes but not always false. On the other hand the original famous statement of the liar paradox may have been broken in this way too (“All Cretans lie”, as stated by a Cretan).

    Reply
  19. Chad Groft

    @Dustin:

    Also, the point of ZFC is to give a rigorous formulation to the foundation of mathematics.

    Not really, no. The point of ZFC is to give an axiomatization for the construction of well-orderings and cardinalities. That it gives a foundation for everything else (except possibly category theory) is nice, but unintentional. For one thing, ZFC is much, much stronger than one really needs to do “normal” math, thanks to the replacement axioms. I’ll refer you to the Appendix of Akihiro Kanamori’s The Higher Infinite on this one.
    Also, could you please clarify what you mean by “the distinction between membership and containment is dropped”?
    @Canuckistani:

    I’m going to resort to the appeal to authority to defend Mark’s choice here: Douglas Hofstadter used Russell’s paradox to introduce Goedel’s theorems in [GEB].

    Appeal to authority all you want. The simple fact is that Mark is flat-out wrong in his interpretation of Gödel’s theorems, and has been for some time, whether he cares to acknowledge it or not. Self-reference does in fact create problems for any theory which extends basic number theory (it must be undecidable, therefore incomplete if it has a decidable set of axioms), but that doesn’t automatically make such a theory self-contradictory.
    The problem Russell found is genuinely worse in context than that which Gödel found. A contradiction makes a formal theory completely useless. ZFC is not completely useless (as far as we know).
    Now, I don’t know exactly what Hofstadter said, and I don’t have my copy of GEB on hand, but just because he uses Russell’s paradox to introduce Gödel’s theorem (which is valid, to an extent — as I said, the arguments are very similar, modulo a ton of technical details for Gödel’s) does not mean that their meanings are the same. If Hofstadter claims this — which I very much doubt, lacking a direct quote — then he’s flat-out wrong too.

    Reply
  20. Brian Jaress

    I think you can keep the same definition of P that Mark gave and just have P(P) as a paradox:
    P(not P) = “P is always true”
    P(P) = “P is always false”
    Defining P(x) using x(x) works, but I don’t see any problem with using “always.” (I don’t understand the bit about “always” meaning “sometimes.”)

    Reply
  21. Torbjörn Larsson, OM

    I find it amusing to think that if we tried to completely model the universe using mathematics we may find similar problems. Is it possible these paradoxes could manifest themselves in nature (or at least the mathematical formulation of nature)?

    I don’t think they manifest as in the models, for several reasons.
    Hopefully Chad and others can correct me if I’m wrong, but I think Gödel permits us to add the theorems that we discover isn’t covered by the previous axioms as new axioms – at some risk for errors, perhaps. But AFAIK there is no formal reason to not being able to find consistent formal models of nature. See the parallel axiom and its three possibilities.
    Also, for convenience or lack of knowledge we use several models. We wouldn’t use string/M theory to model evolution, for example. But say that M theory will be the fundamental theory of nature. It is quite likely that it in principle would be reducible to rather simple math foundations, without a lot of the above adding of axioms. And as noted, it must avoid paradoxes and singularities.
    At the same time, formal problems such as paradoxes and singularities inform us about new theories but also it seems they can inform us about nature. Some singularities have physical manifestations, and some computational problems seems to have it too.
    Scott Aaronsson has a paper where he explores P != NP, and finds that when models are difficult computationally they are so for nature as well. NP problems takes time to arrive at a chosen equilibrium and/or isn’t definitive since the systems easily get stuck in non-optimal, quasi-steady state solutions. (Otherwise we could use analog experiments in nature to solve our computational problems.)

    Caledonian:
    That’s a phenominally stupid statement. Ever heard of modal logics? Fuzzy logics?

    and not to forget quantum logic.

    And not to forget facts. By appropriate mappings we can reduce facts to trinary logic: “yes”, “no, “don’t know (yet)”. And we can still reason about facts and theories on facts – or, at least, most of us.

    Reply
  22. Torbjörn Larsson, OM

    And we can still reason about facts and theories on facts – or, at least, most of us.

    Reading Caledonian more carefully, it is revealed that this is seriously in error.
    I meant to write “And we can still reason meaningfully about facts and theories on facts – or, at least, most of us”, of course!

    Reply
  23. Chad Groft

    re: #23:

    but I think Gödel permits us to add the theorems that we discover isn’t covered by the previous axioms as new axioms – at some risk for errors, perhaps.

    That’s absolutely right. If we have a formal theory T, and we have reason to believe that it’s true (on the natural numbers or some system containing them), then the formalization of the statement “T is consistent” is true, but unprovable from T. That means that the stronger theory T’ = T + Con(T) is also true, hence consistent; but we can’t prove that from T’, so the still stronger theory T” = T’ + Con(T’) is true, etc. Look up Torkel Franzen’s book Inexhaustibility if you want to know more.

    But AFAIK there is no formal reason to not being able to find consistent formal models of nature.

    That’s my understanding as well; it may (should?) be possible to find physical laws which completely determine the evolution of the universe. Whether we can actually predict that evolution by any computation is another matter. I’m not sure one way or the other, but I suspect not given the chaotic systems which exist.

    Reply
  24. Kurt

    Brian Jaress said:
    I think you can keep the same definition of P that Mark gave and just have P(P) as a paradox:
    P(not P) = “P is always true”
    P(P) = “P is always false”

    The thing that makes a statement paradoxical is that there is no consistent way to assign a truth value to it. The two statements above are both simply false.
    If that is not apparent, then consider that Mark’s definition of P was: Suppose that P(x) is defined as meaning “the predicate x is always false”. In this case, P is actually pretty trivial. Aside from the choice of domain, there is only one predicate which is always false, namely the constant function F(x)≡false for all x. So P(F)=true, and P(Q)=false for all Q≠F. In other words, P is true some of the time and false some of the time.

    Reply
  25. Brian Jaress

    Kurt: Ah, now I get it. I didn’t understand the thing about “always” because I thought he was talking about following an “always” rule — when he was really talking about breaking it without paradox.
    Now that I understand it, it seems like it should have been obvious.
    Is there a definition where P(not P) is a paradox? (I may be wrong, but I think the x(x) definition gives a paradox for P(P) but not for P(not P)).

    Reply
  26. scottb

    Dustin
    Dialethism *is* a formal system. It’s just not based on classical logic. Priest formulates the whole thing using paraconsistent logic in “In Contradiction”.
    Nor is it dialethism that simply dismisses the problem. ZF is widely thought of as providing a foundation for mathematics, and it’s a pretty good one. But it’s a classical example of how our language limits the thoughts we can have.
    ZF prohibits us from quantifying over proper classes – and yet to *say* that, axiomatically, requires quantifying over a proper class – to say *anything* about all classes requires quantifying over the class of all classes – an object ZF insists doesn’t exist.

    Reply
  27. Xanthir, FCD

    To the best of my knowledge, Chad is still wrong, and Mark is likely right.
    Godel’s exercise forms a contradiction. Or, rather, it *would* form a contradiction if you assume that the logic is complete. Thus, if it’s complete, it’s inconsistent.
    You can, however, abandon completeness to save consistency. You simply say that Godel’s statement is unanswerable in the formal system it was constructed in – it has no truth value.
    The latter course is generally taken, as an incomplete system is simply useless in (hopefully) isolated instances, whereas an inconsistent system is almost completely useless (unless you have a sure-fire way to know that you’re avoiding a (possibly implicit) contradiction in every proof that you make). You also lose reductio ad absurdum, which is an extremely useful and basic proof strategy.
    So, I still believe Mark is correct in his treatment of Russell’s paradox here. You could treat question, “Is R in R?” as paradoxical. Or you could simply say that it’s an unanswerable question. Either way, it runs afoul of the same essential problem that Godel hit on.

    Reply
  28. Torbjörn Larsson, OM

    Chad:
    Thanks for the comments and the reference! I will add it to the inexhaustible staple of to-reads. 🙂
    Yes, I too think that unpredictability and other ‘approximation’ problems would outdo us. And probably purely computational problems as well, some classes of string theories seems to be NP-complete.

    Reply
  29. Chad Groft

    @Xanthir: It amuses me that you say

    To the best of my knowledge, Chad is still wrong

    and follow it with

    Godel’s exercise… *would* form a contradiction if you assume that the logic is complete….. You can, however, abandon completeness to save consistency. You simply say that Godel’s statement is unanswerable in the formal system it was constructed in

    which is exactly what I said. I do see where we part ways, though:

    [Gödel’s statement] has no truth value.

    This is where you go awry: you confuse truth with provability in T. They are not the same. Given the relevant universe (say, the set of natural numbers with 0, successor, +, and *), every first-order sentence has an unambiguous interpretation and a well-defined truth value. Every statement, including Gödel’s, is either true or false; there is no such thing as a statement with, as you say, “no truth value”. The connection between truth and provability is one-way: If all the axioms of T evaluate as true, then so will all the theorems.
    Indeed, this is how Russell’s paradox works: he comes up with a set B which cannot exist, because the statement “B is an element of B” has no well-defined truth value. If your interpretation of truth was correct, this wouldn’t be a problem (and we’d lose the law of excluded middle).
    Gödel, on the other hand, comes up with a sentence s which (provably in T) is equivalent to “there is no encoding of a proof of s from the axioms of T”. If we assume as above that T is valid on the natural numbers, then s is true iff there is no proof of s from T. From here s must be true; otherwise it is both false and provable from T. So s must be true, but you cannot argue this fact from T alone, and the theorems of T do not encompass all truths.
    Are you seeing the distinction yet? Russell shows that certain systems must be false, because contradictory; Gödel shows that certain other systems, which are probably true on philosophical grounds, cannot be the whole story.
    (I should point out that there must be universes where the axioms of T hold but s does not; otherwise s would be provable from T. I state only that s is true in the actual natural numbers.)

    Reply
  30. Chad Groft

    Apologies if this is a double-post; I got an error last time.
    @Xanthir: It’s amusing that you say

    To the best of my knowledge, Chad is still wrong

    and follow it with

    Godel’s exercise… *would* form a contradiction if you assume that the logic is complete….. You can, however, abandon completeness to save consistency. You simply say that Godel’s statement is unanswerable in the formal system it was constructed in

    which is exactly what I said. I do see where we part ways, though:

    [Gödel’s statement] has no truth value.

    This is where you go awry: you confuse truth with provability in T. They are not the same. Given the relevant universe (say, the set of natural numbers with 0, successor, +, and *), every first-order sentence has an unambiguous interpretation and a well-defined truth value. Every statement, including Gödel’s, is either true or false; there is no such thing as a statement with, as you say, “no truth value”. The connection between truth and provability is one-way: If all the axioms of T evaluate as true, then so will all the theorems.
    Indeed, this is how Russell’s paradox works: he comes up with a set B which cannot exist, because the statement “B is an element of B” has no well-defined truth value. If your interpretation of truth was correct, this wouldn’t be a problem (and we’d lose the law of excluded middle).
    Gödel, on the other hand, comes up with a sentence s which (provably in T) is equivalent to “there is no encoding of a proof of s from the axioms of T”. If we assume as above that T is valid on the natural numbers, then s is true iff there is no proof of s from T. From here s must be true; otherwise it is both false and provable from T. So s must be true, but you cannot argue this fact from T alone, and the theorems of T do not encompass all truths.
    Are you seeing the distinction yet? Russell shows that certain systems must be false, because contradictory; Gödel shows that certain other systems, which are probably true on philosophical grounds, cannot be the whole story.
    (I should point out that there must be universes where the axioms of T hold but s does not; otherwise s would be provable from T. I state only that s is true in the actual natural numbers.)

    Reply
  31. Xanthir, FCD

    Okay, I see what you’re saying, Chad. Russell’s Paradox constructs a statement with no well-defined truth value at all, while Godel’s statement merely has no well-defined truth value within T.
    I do see the distinction now. It wasn’t that I misunderstood anything, it’s just that the area you were pointing to was flying right past me. I also see the reasoning behind your original statement, that the two proofs should be regarded as completely separate, though coincidentally similar, results.

    Reply
  32. Pete Dunkelberg

    If U is undecidable, create two models: one with U as a new axiom, the other with not U. No problem.

    Reply
  33. Chad Groft

    Well, yes, Pete, you can do that. The question is whether U is true in the intended model, i.e., then actual natural numbers. As I argued above, the Gödel sentence must be true there.

    Reply

Leave a Reply