The Axioms of Set Theory

35 Replies

Axiomatic set theory builds up set theory from a set of fundamental initial rules. The most common axiomatization, which we’ll be used, is the ZFC system: Zermelo-Fraenkel with choice set theory. The ZFC axiomatization consists of 8 basic rules which are pretty much universally accepted, and two rules that are somewhat controversial – most particularly the last rule, called the axiom of choice.

There’s an interesting parallel between the axioms of ZFC and the axioms of Euclidean logic. Euclidean logic consists of a set of universally accepted rules, and one strange one – the parallel axiom – which was problematic for many years, until someone finally realized that you could create a variety of alternate geometries by using variations on the parallel axiom. Similarly, with set theory, there’s the initial set of basic axioms, and then there’s the axiom of choice; if you throw away the axiom of choice, you can construct set theories with different properties.

Now, it’s time for a first look at the basic list of axioms. An important thing to remember as you look at them is that axiomatic set theory is intended to be a foundational theory of mathematics – and so the only objects that exist in the domain of the theory are sets – no numbers, no functions, nothing. It’s sets, all the way down.

The Axiom of Extensionality: ∀A,B: A=B ⇒ (∀C: C∈A ⇒ C∈B).: This is a formal way of saying that a set is described by its members: two sets are equivalent if and only if they contain the same members.
The Empty-Set Axiom: ∃∅: ∀X: X∉∅.: There exists a set, the empty set, which contains no members.
The Axiom of Pairing: ∀A,B: (∃C: (∀D: D∈C ⇒ (D=A ∨ D=B))): This one is very hairy in logical form. What it really says is that given any two sets A and B, there’s a set C containing only A and B as members. Grind through the logical form, and that’s what it says: for any sets A and B, there’s a set C, and any member of C is equal to either A or B.
The Axiom of Union: ∀A: ∃B: ∀C: C∈B ⇒ (∃D: C∈D ∧ D∈A): Once again, the precise formal statement in FOPL is tough going. There’s an easier way to write it using the union symbol: ∀A: ∃B : B=∪_a∈Aa; that is, for any set A, there’s a set B consisting of the unions the members of A.
The Axiom of Infinity: &exists;N: ∅∈N ∧ (∀x: x∈N ⇒ x∪{x}isin;N): This is the first of the axioms that I think is actually difficult. Some of the earlier ones can be hard to read, but once you get through the notation, they’re really pretty simple. This one, the notation isn’t so bad, but the meaning is a bugger. What is says is, there’s a set that (a) contains the empty set as a member, and (b) for each of its members x, it also contains the singleton set {x} containing x. So, if following the formal statement, we called that set N, N contains ∅, {∅},{{∅}}, {{{∅}}}, etc. What the axiom of infinity does is really two basic things: it gives us our first countably infinite set; and it gives us a construction which can be turned into Peano integers.
The Meta-Axiom of Specification: ∀A: &exists;B: ∀C: C∈B ⇒ C∈A ∧ P(C): This is pretty simple, but it’s got one trick. The axiom of specification is really a second-order axiom. But in order to make it work in a first-order logical framework, we cheat, and say it’s a scheme for what’s actually an infinite set of axioms – for any predicate P, there’s another instantiation of the axiom of specification. What it says is that given a set A and a predicate P, there’s a set B consisting of the members of A for which P is true. Another way of saying that is that for any set A, we can define a subset of A using a predicate which selects members of A.
The Meta-Axiom of Replacement: ∀A: ∃B: ∀y: y∈B ⇒ ∃x∈A: y=F(x).: This is another meta-axiom. It’s a tricky one in its way. One way of looking at it as saying that you can define functions in terms of predicates: P(x,y) is a function if the set defined by it has the necessary properties. We’ll talk more about it later.
The Powerset Construction Axiom: ∀A: ∃B: ∀C⊆A: C∈B: This is a nice, easy one. For any set A, the powerset – that is, the class of all subsets of A – is a set.
The Foundation Axiom: ∀A≠∅: ∃B∈A: A∩B=∅: This one has had its moments of controversy, but it’s really pretty simple in concept. Every set A contains some member B which is a set completely disjoint from A.
The Axiom of Choice: ∀X: ((∀A∈X: A≠∅) ∧
(∀B,C≠∈X: B∩C=∅)) ⇒
(∃Y: ∀I∈X:∃!J∈Y: J∈I): For any disjoint set of non-empty sets X, there is a set Y called the choice set for X containing exactly one member from each element of X.

There we are. That’s it: set theory in a nutshell. You can derive pretty much all of mathematics from those 10 axioms, plus simple first order predicate logic. The integers fall out pretty naturally from the axiom of infinity; once you’ve got the integers, you can use the axiom of pairing to create the rationals; once you’ve got the rationals, you can use these axioms to derive Dedekind cuts to get the reals; once you’ve got the reals, you can use the axiom of replacement to get the transfinites. It just all flows out from these 10 rules.

The amazing thing is that they’re not even particularly hard – the full meanings of some of them should be obvious already; and the others, you’ll see when we get to them that it doesn’t take too much to really understand them. It took some real genius to derive these rules; figuring out how to draw down the entirety of set theory into 10 rules, while preventing problems like Russell’s paradox is an astonishingly difficult task. But once a couple of geniuses did that for us, the rest of us dummies are in great shape: we don’t need to be able to derive them; we just need to understand them.

35 thoughts on “The Axioms of Set Theory”

Davis May 20, 2007 at 11:18 pm

Just a couple of clarification requests — in Specification, it would be helpful to state what P denotes (it’s not clear until you get well into the explanation, which makes a reading of the formal statement difficult on the first pass); in Replacement, the same goes for F — even within the explanation, it’s not entirely clear what F is supposed to denote.

Loading...

Reply ↓
Dustin May 20, 2007 at 11:55 pm

The axiom of specification is really a second-order axiom. But in order to make it work in a first-order logical framework, we cheat, and say it’s a scheme for what’s actually an infinite set of axioms

Man, I totally forgot about that. And that’s the axiom I quote the most in discussions about ZFC.

Loading...

Reply ↓
micah May 20, 2007 at 11:58 pm

Maybe I’m confused, but don’t you need the implication in Specification to run in both directions? Otherwise you could just take B to be the empty set…

Loading...

Reply ↓
Chris Ho-Stuart May 21, 2007 at 12:00 am

The statement of the axiom of extensionality looks wrong.
I think you want double implication signs in there. Of course… it could just be the fonts we are using?
Here is a cut and paste from your blog. The arrows are unicode U-21D2 (rightwards double arrow)
∀A,B: A=B ⇒ (∀C: C∈A ⇒ C∈B)
Here it is, with U-21D4
∀A,B: A=B ⇔ (∀C: C∈A ⇔ C∈B)
In my broswer, the maths fonts come out badly unless I increase the text size.

Loading...

Reply ↓
Dustin May 21, 2007 at 12:06 am

I think it’s just the font. All of the double implications seem to be missing, both in specification and in extensionality.

Loading...

Reply ↓
Mobius May 21, 2007 at 1:01 am

Actually, Axiomatic Set Theory (or Zermelo Fraenkel Set Theory) is independent of the Axiom of Choice (AC). Godel proved in 1936 that is was impossible to disprove Ac using the other axioms. Cohen proved in 1963 that it was impossible to prove AC.

Loading...

Reply ↓
Chris May 21, 2007 at 1:09 am

IMHO the existence of the empty set doesn’t need its own axiom. It follows from the axioms of infinity and specification: Take an infinite set A and the predicate “x != x”, which is always false. Then apply the axiom of specification and you get the empty set.

Loading...

Reply ↓
Davis May 21, 2007 at 2:30 am

Take an infinite set A and the predicate “x != x”, which is always false.

This doesn’t quite do what you intend — it still doesn’t imply that ∀X: X∉∅; it only implies it for X∈A. It’s possible with some more work you could deduce the empty set from Specification — demonstrate there *is* such a set via Specification, and then deduce that ∀X: X∉∅. I suspect this won’t actually work for some reason, but it’s apparently too close for my bedtime for me to see why.

Loading...

Reply ↓
Thony C. May 21, 2007 at 2:54 am

Mobius wrote:

Actually, Axiomatic Set Theory (or Zermelo Fraenkel Set Theory) is independent of the Axiom of Choice (AC). Godel proved in 1936 that is was impossible to disprove Ac using the other axioms. Cohen proved in 1963 that it was impossible to prove AC.

Unless I am very much mistaken you are confusing the Axiom of Choice with the Continuum Hypothesis whose independence from ZFC was indeed proved by the people you name at the times you name meaning that there are, analogous to Euclidian and Non-Euclidian Geometries, Cantorian and Non-Cantorian Set Theories.

Loading...

Reply ↓
Chad Groft May 21, 2007 at 3:07 am

@Mobius and Thony C.: Both the Choice question and the Continuum Hypothesis were “resolved” in the way you claim, and by the same people. Gödel started with ZF, build the class of constructible sets as a “sub-universe”, and showed that this sub-universe satisfied ZFC + GCH. Paul Cohen found a way to start with the constructible sets (modern treatments don’t need to do this, but his version does) and “add new sets” in ways that make ZFC true but CH false, or ZF true but Choice false.
Short version: neither of you are confused, but you might have been working from incomplete information.

Loading...

Reply ↓
Chad Groft May 21, 2007 at 3:11 am

To those of you worried about Specification and the empty set: this is due to the misprints people noticed earlier. The [Meta-]Axiom of Specification should be “For every predicate P and set A, there is a set B which contains precisely those elements of A which satisfy P”. One of the arrows should have two arrowheads on it.
Given this, and given a set A, one can in fact construct the empty set by the predicate P(x) which is “x ≠ x”. The set B one obtains from Specification is in fact the empty set: in order to be a member of B, a set x must be in A and satisfy P(x), and no set has that property; so x is not in B for any x, and B is empty.

Loading...

Reply ↓
Chad Groft May 21, 2007 at 3:19 am

Incidentally, it would be completely valid to simply take the converses of Extensionality, Pairing, Union, and Powerset, rather than make the arrows in them double-headed. The sentence currently in place is logically valid (one of the rules of equality), and once you fix Specification, you can get the actual unordered pair, union, and powerset by that axiom.
The converse versions are sometimes called the weak versions of the axioms. It’s often more convenient to show that some potential model of set theory satisfies the weak versions of the axioms than the strong ones, and only do the hard work once in proving Specification.
One last nit, since nobody else seemed to catch this: The set that Infinity calls for isn’t closed under the operation x -> {x}, it’s closed under x -> x u {x}. Zermelo’s original idea may have called for the first, though; his idea of, say, 5 was {4}, while von Neumann’s was {0,1,2,3,4}. von Neumann’s scheme works much more easily with cardinality and so forth, so most set theorists use it nowadays.

Loading...

Reply ↓
Torbjörn Larsson, OM May 21, 2007 at 3:52 am

IMHO the existence of the empty set doesn’t need its own axiom.

As discussed, that seems to be true. But it also seems that isn’t the axiom that is put in for clarity.
Since I am a naive set type of guy, bear with me. At first I didn’t get the Foundation axiom. Wikipedia clarifies: no set is supposed to be a member of itself. Ah, great; is that how Russel’s paradox is avoided?
No, since if the axioms “were already inconsistent, then adding the axiom of regularity could not make them consistent.” ( http://en.wikipedia.org/wiki/Foundation_axiom ) It is the requirement in the schema of of specification that the new sets it forms are subsets of existing sets that does that.
Instead it is noted on specification, that it can be derived from replacement with the empty set. “the axiom schema of separation is often left out of modern lists of the Zermelo-Fraenkel axioms. However, it’s still important for historical considerations, and for comparison with alternative axiomatizations of set theory,” ( http://en.wikipedia.org/wiki/Axiom_of_separation )
The “axiom of separation” is the axiom of specification. Some axioms have several names, seems there is quite some history here. 🙂
Btw, it seems the requirement for subsets of existing sets seems to be repeated elsewhere which is what makes the axioms a bit hard to read. And perhaps that is why we want to enforce equivalence in at least one of the definitions, if I understand Chad correctly.

Loading...

Reply ↓
Torbjörn Larsson, OM May 21, 2007 at 3:57 am

Also, as I understand it Chad’s comment on ordering explains the Foundation axiom (and why it is called “regularity”), since it isn’t there to avoid Russel’s paradox:
“The axiom of regularity is arguably the least useful ingredient of Zermelo-Fraenkel set theory, since virtually all results in the branches of mathematics based on set theory hold even in the absence of regularity […] However, it is used extensively in establishing results about well-ordering and the ordinals in general.” ( http://en.wikipedia.org/wiki/Foundation_axiom )

Loading...

Reply ↓
Antendren May 21, 2007 at 4:01 am

Am I the only one who learned “Specification” as “Comprehension”?

Loading...

Reply ↓
Dustin May 21, 2007 at 4:26 am

Antendren — No. I did too.

Loading...

Reply ↓
g May 21, 2007 at 4:50 am

“Comprehension” usually means something simpler than what Mark’s called “specification”, namely going from P to { x : P(x) } — without restricting to elements of a given set. An unrestricted comprehension axiom gives you more or less all the other axioms for free, but unfortunately it also gives you Russell’s paradon. Not such a bargain after. Hence the weaker “separation” (or “specification”) scheme, which seems to be effective in avoiding paradoxes. There are other ways of weakening the comprehension axiom, and they lead to different set theories. But ZF is what almost everyone uses.

Loading...

Reply ↓
Justin Moretti May 21, 2007 at 6:45 am

Every set A contains some member B which is a set completely disjoint from A.
This is making my brain explode in clouds of sawdust and toy bunny stuffing: How can any member of a set be disjoint from it? Or is it that I’m trying to fight above my weight and ‘disjoint’ has a special meaning here?

Loading...

Reply ↓
Chris May 21, 2007 at 7:12 am

> How can any member of a set be disjoint from it?
Consider the set IN of natural numbers. The number 0 is an element of IN, it’s identified by the empty set. The intersection of the empty set and IN contains no elements. So IN has a member which is disjoint to IN itself.

Loading...

Reply ↓
Chris May 21, 2007 at 7:16 am

> How can any member of a set be disjoint from it?
Consider the set IN of natural numbers. The number 0 is an element of IN, it’s identified by the empty set. The intersection of the empty set and IN contains no elements. So IN has a member which is disjoint to IN itself.
Another view of this axiom is that there are no infinite inclusion chains:
… D ∈ C ∈ B ∈ A
Every such chain must terminate in a finite number of terms.

Loading...

Reply ↓
Mark C. Chu-Carroll May 21, 2007 at 8:23 am

Justin:
Chris beat me to it. The disjointness part of the foundation axiom prevents infinite chains of membership. And that, among other things, means that you can use inductive proofs based on membership structure of nested sets: you’re guaranteed to have a base case for building the induction.

Loading...

Reply ↓
The Enzo May 21, 2007 at 12:36 pm

I want to get the first transfinite number א tattooed on my forearm. Bad idea?

Loading...

Reply ↓
Chad Groft May 21, 2007 at 12:39 pm

re Foundation: My understanding is that most mathematicians don’t take Foundation and the definition of IN as necessarily true (who’s to say what the number 3 actually is, so long as it behaves correctly), so much as harmless and useful. In this case “harmless” means that, if the rest of the ZF axioms are consistent, then adding Foundation doesn’t change that. “Useful” means that Foundation given the set-theoretic universe a nice hierarchical structure, over which one can use induction and recursion and such, while keeping enough sets to reflect all of mathematics anyhow. (This isn’t obvious from its statement, of course.)

Loading...

Reply ↓
Torbjörn Larsson, OM May 21, 2007 at 1:19 pm

Thank you Chris and Mark for the explanation of the foundation axiom and its uses.
I take it the axiom prevents several disjoint infinite inclusion chains, such as natural numbers and even natural numbers, to mess up the process of induction and thus the definition of ordinals.
These my problems was why I was a naive set person. 😮

Loading...

Reply ↓
Torbjörn Larsson, OM May 21, 2007 at 1:21 pm

Thank you Chris and Mark

And now Chad.

Loading...

Reply ↓
Davis May 21, 2007 at 2:05 pm

The set B one obtains from Specification is in fact the empty set…

Excellent, thanks. I was worried this might not be obviously true; I don’t trust my mathematical judgment when I’m sleepy.

Loading...

Reply ↓
Thony C. May 21, 2007 at 4:13 pm

Apologies to Mobius my ignorance and thanks to Chad for the enlightenment.
One small quibble. Correctly the set theoretical paradox should be called the Zermelo-Russell Paradox as they both discovered it independently of each other. In fact Zermelo discovered it first but Russell was the first to make it public. So credit where credit is due.

Loading...

Reply ↓
Daniel Ehrenberg May 22, 2007 at 10:17 pm

Mark,
This is a really interesting post, but I have one question. Why is the meta-axiom of specification

∀A: ∃B: ∀C: C∈B ⇒ C∈A ∧ P(C)

and not

∀A: ∃B: ∀C: C∈B iff (C∈A ∧ P(C))

? It seems that with the formulation you gave, there’s a possibility that the set B omits arbitrary elements from set A which *do* satisfy the predicate. If so, the construction of set B is trivial–just the empty set. You couldn’t just reverse the conditional, because then set B could just be set A. I’m not sure if thinking about the construction is the right way to go about it, though.
It’s sorta funny how some of these axioms correspond closely to programming concepts, but not really exactly.

Loading...

Reply ↓
Daniel Ehrenberg May 22, 2007 at 10:24 pm

I looked it up, and I think this is how Wikipedia defines it.

Loading...

Reply ↓
Mark C. Chu-Carroll May 23, 2007 at 10:08 am

Daniel and others:
It’s a stupid typo. I mistyped the HTML entity character in a bunch of places; I typed ⇒ where I meant to type ⇔. I’ll try to go through the article with my notes this evening, and figure out which ⇒s are supposed to be ⇔s.

Loading...

Reply ↓
Jonathan Vos Post May 23, 2007 at 12:53 pm

arXiv:0705.3162
Title: A 5-quantifier (in,=)-expression ZF-equivalent to the Axiom of Choice
Authors: Kurt Maes
Comments: 9 pages
In this paper I present an (in, =)-sentence, AC**, with only 5 quantifiers, that logically implies the axiom of choice, AC. Furthermore, using a weak fragment of ZF set theory, I prove that AC implies AC**.
Up to now 6 quantifiers were the minimum and 3 quantifiers don’t suffice since all 3-quantifier (in, =)-sentences are decided in a weak fragment of ZF set theory. Thus the gap is reduced to the undecided case of a 4 quantifier sentence ZF-equivalent to AC.

Loading...

Reply ↓
Pete Dunkelberg May 25, 2007 at 11:32 pm

Zermelo Fried Chicken.
yummy

Loading...

Reply ↓
Torbjörn Larsson, OM May 26, 2007 at 12:33 pm

the axiom prevents several disjoint infinite inclusion chains, such as natural numbers and even natural numbers,

Uups, same cardinality. But natural numbers and reals by cuts would perhaps make my point.

Loading...

Reply ↓
Julia June 2, 2007 at 5:06 pm

Enzo, if you ever get to reading this —
Go for it!
(My first tattoo is going to be a frog.)

Loading...

Reply ↓
Chrysippus Posthumus January 19, 2019 at 1:11 pm

Please tell me what’s wrong with the following.

The history of Mathematical Logic over the past century or so is largely a series of developments set in motion by the discovery of the set-theoretic paradoxes. In order to get around them, logicians formalized systems that Gödel proved Incomplete. Gödel’s were the first of the limitative theorems. One corollary of the First Incompleteness Theorem that Tarski felt obliged to take into account as he developed his canonical formal semantics was the Undefinability of Truth. This result made the semantic paradoxes the important issue it is today.
The logicians who struggle with paradoxes deserve great respect and praise, but when some of the best logicians on Earth can struggle in vain to develop formal languages that express truths that grade-schoolers can state easily, we need to ask ourselves whether we haven’t unnecessarily complicated matters.
So let’s try to think iin ways that grade-schoolers might.
Let us specify a set-theory. First, let it contain the Axiom of Extensionality,
(Ext) (x)(y)[(z)(zϵx ≡ zϵy) ≡ x=y].
AVirtually every set-theory contains some revision of the Naïve Comprehension Axiom Schema,
(NCA) (ⱻz)(y)(yϵz ≡ ψ).
Let ours be the Identity Comprehension Axiom Schema,
(ICA) (ⱻz)(y)[(yϵz ≡ ψ) ≡ y=y].
As phrased, (Ext) and (ICA) place no upper bopunds on the Gödel codes of axioms. ‘ψ’ can be any formula, no matter how long. This will be important when we come to discuss Gödel Theory.
In English, (ICA0 says that every formula defines a set if and only if that set’s members are self-identical. A little more loosely, a set is defined by a formula if and only if it’s possible for its members to exist.
Suppose, first, that ‘yϵz ≡ ψ’ is not self-contradictory, and would ordinarily define a set. The only “limitation” that would be put on the members of the set by the other side of the biconditional would be that each of them must be self-identical, which is no limitation. Now suppose ‘yϵz ≡ ψ’ is self-contradictory, like the formula for the Russell Set. The left side of the biconditional would be logically false, so, for the axiom to be true, the right side woud have to be logically false, so y would have to be non-self-identical. There is no such y, and that means that ‘ψ’ does not determine whether yϵz for any y that exists, or, more loosely, ‘ψ’ does not define a set. Whatever y you pick, ‘yϵz ≡ ψ’ is false.
Thus, (ICA) should allow for every instance of Naïve Comprehension to be an axiom that isn’t self-contradictory, and none that are.
I’ve said that there is no upper bound on the Gödel codes of axioms. Gödel’s Incompleteness Theorems require recursively defined Gödel codes, and all recursively defined quantifiers have upper bounds. (It can be easy to overlook this, since Gödel never needed to make it explicit in his work. His paper used Type Theory, in which all quantifiers have implicit upper bounds, including those in the Axiom of Extensionality.) So the set of Axioms of our set-theory is not recursively definable, and it does not fall into the class of theories that Gödel proved incomplete. This means that Löb’s Theorem has not been proven to apply to them either.
Tarski’s Undefinability Theorem for Truth is a corollary of Gödel’s First Incompleteness Theorem. Let’s consider the Truth-set:
(T) (ⱻz)(y)[(yϵz ≡ Ty) ≡ y=y],
where Ty means ‘y is a true sentence’. (We’ll use Tarski’s definition here.) There are certainly sentences that satisfy ‘Ty’, and sentences that don’t. Tarski proved his Theorem by positing the following set:
(L) (ⱻz)(y yϵz)( ≡ ~Ty).
This is certainly self-contradictory, but no more than the Russell Set, and it can be dealt with in the same way:
(LI) (ⱻz)(y)[(yϵz ≡ ~Ty) ≡ y=y].
(LI) states that there is no Liar set; a welcome conclusion. Since there is none, no one has to worry about how it relates to (T). Other semantic paradoxes should fail in similar ways, and if truth can be defined in our formal system, our system is complete.
Thus, the set-theoretic paradoxes, the semantic paradoxes, and most of the limtative theorems, all fail, leaving us with a logic that can do what a grade-schooler can.

Loading...

Reply ↓