Basics: Axioms

Today’s basics topic was suggested to me by reading a crackpot rant sent to me by a reader. I’ll deal with said crackpot in a different post when I have time. But in the meantime, let’s take a look at axioms.

What is an axiom?

If you want to do any kind of formal or logical reasoning, or any kind of inference, you need to start with some set of known facts. There is simply no way of performing inference starting from absolutely no knowledge. Axioms are the known facts that are accepted as basic, primitive, and unproven: all proofs are ultimately built upon the inference rules of some logic combined with an initial set of axioms.
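
As a rough illustration of "inference rules combined with an initial set of axioms", here is a minimal Python sketch of forward chaining. The statement labels, the `RULES` list, and the `derive` function are all hypothetical names invented for this example; real logics have far richer rules than "if these premises are known, add this conclusion".

```python
def derive(axioms, rules):
    """Return every statement reachable from `axioms` using `rules`.

    Each rule is a pair (premises, conclusion): once all premises are
    known, the conclusion may be added. This is plain forward chaining.
    """
    known = set(axioms)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical toy system: statements are just labels.
RULES = [
    (("A", "B"), "C"),  # from A and B, infer C
    (("C",), "D"),      # from C, infer D
]

print(sorted(derive({"A", "B"}, RULES)))  # ['A', 'B', 'C', 'D']
print(sorted(derive(set(), RULES)))       # []
```

With no axioms at all, the rules never fire and nothing can be derived, which is the point of the paragraph above.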

In math, we tend to try to find minimal sets of axioms: we prefer to take as little for granted as possible, and then build from that basis using logical inference; but sometimes we’ll go off on a philosophical vein for a while, either denying the necessity for any unprovable axioms (Descartes), or freely accepting axioms that seem reasonable, and seeing where they lead (Chaitin).

For example, in basic planar Euclidean geometry, we typically start with some basic math, and five geometric axioms: all other statements in Euclidean planar geometry are supposed to be provable (up to Gödel’s limits) starting from these five axioms:

1. Given any two points, they can be joined by exactly one line.
2. Given any finite, non-zero length line segment, it can be extended infinitely into exactly one line.
3. Given any line segment, there is exactly one circle with one endpoint of the segment as the center, and with the other endpoint on the circle.
4. All right angles are equivalent modulo translation, rotation, and mirroring.
5. Given a line l and a point p which is not on l, there is exactly one line that passes through p but never intersects l.

The last one of those is really the interesting one, because it’s the one which doesn’t really
look like an axiom. Throw out any of the others, and you get an incomplete or inconsistent
geometry; throw out the fifth one, and you get valid geometries that are just different.

Being a tad more fundamental, there is a set of 9 axioms that forms the basis of the ZFC system of math – that is, Zermelo-Fraenkel set theory with the axiom of choice. Most modern math is built on the ZFC axioms. One thing worth pointing out about the ZFC axioms is that although I just said there are 9 of them, ZFC is actually an infinite set of axioms: depending on the exact presentation, at least one of the ZFC axioms is actually an axiom schema: a template for an infinite series of related axioms. (ZFC cannot be finitely axiomatized: assuming it is consistent, no finite list of first-order axioms in the language of set theory has exactly the same theorems.)

For example, in the classic formulation of ZFC, the axiom of replacement is actually a schema for an infinite series of axioms. The axiom of replacement says:

(∀ x: (∃!y : P(x,y))) → (∀A: (∃B:(∀y: y∈B⇔(∃x∈A:P(x,y)))))

Or in (confusing) English, “If for every set x there exists exactly one set y such that the predicate P is true for the pair x and y, then given any set A, there must be a set B where a set y is a member of B if and only if there is some x in A for which P holds for the pair x and y.”

The thing to note there is that P is a free symbol: it can be instantiated by any predicate. But ZFC doesn’t include the ability to quantify over predicates. So this isn’t really a single axiom – it’s a schema that generates a set of axioms, one for each possible value of P.
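
To make the "one axiom per predicate" idea concrete, here is a small Python sketch over finite sets of integers, which stand in for sets. Everything named here (`replacement`, `candidates`, `is_double`, `is_square`) is a hypothetical illustration, not part of ZFC: the function plays the role of the schema, and each predicate passed to it corresponds to one instance. Unlike the real axiom, it only checks that P is functional on the members of A, and it searches a finite list of candidates rather than all sets.

```python
def replacement(A, P, candidates):
    """Build B = {y : some x in A has P(x, y)} for a functional predicate P.

    `candidates` is a finite stand-in for "all sets": the universe that
    is searched for witnesses y. Raises if P does not pick out exactly
    one y for some x in A.
    """
    B = set()
    for x in A:
        witnesses = [y for y in candidates if P(x, y)]
        if len(witnesses) != 1:
            raise ValueError(f"P is not functional at {x!r}")
        B.add(witnesses[0])
    return frozenset(B)


def is_double(x, y):
    return y == 2 * x


def is_square(x, y):
    return y == x * x


A = frozenset({1, 2, 3})
universe = range(20)

# Two different predicates give two different instances of the schema.
print(sorted(replacement(A, is_double, universe)))  # [2, 4, 6]
print(sorted(replacement(A, is_square, universe)))  # [1, 4, 9]
```

In Python we can pass P as an argument; the point of the paragraph above is that first-order ZFC has no such mechanism, so each choice of P has to be written out as its own axiom.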

Whew! I hope I got that right; the ZFC axioms are extremely easy to foul up by misplacing a paren. I’m sure some kind commenter will correct my if I blew it!

We often talk about axiomatizations of some field of math: for example, when I wrote about the natural numbers, I showed an axiomatization of them using the Peano axioms. An axiomatization of a formal system is a reduction of that system to a basic set of axioms from which the other facts about that field can be derived using inference.
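
As a hedged sketch of what "deriving other facts from a small basis" can look like, here is a Python rendering of the Peano-style construction: a zero, a successor operation, and addition defined by the usual two equations. The class and function names (`Zero`, `Succ`, `add`, `from_int`, `to_int`) are choices made for this example, and it models only a fragment of the axioms (it says nothing about induction).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zero:
    """The natural number 0."""


@dataclass(frozen=True)
class Succ:
    """The successor of another natural number."""
    pred: object


def add(m, n):
    # Defining equations: m + 0 = m, and m + S(n) = S(m + n).
    if isinstance(n, Zero):
        return m
    return Succ(add(m, n.pred))


def from_int(k):
    """Build the numeral for a non-negative Python int."""
    n = Zero()
    for _ in range(k):
        n = Succ(n)
    return n


def to_int(n):
    """Count the successors wrapped around Zero."""
    count = 0
    while isinstance(n, Succ):
        count += 1
        n = n.pred
    return count


print(to_int(add(from_int(2), from_int(3))))  # 5
```

Nothing about "2 + 3 = 5" is stated directly; it falls out of the two equations for addition applied to numerals built from zero and successor.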

0 thoughts on “Basics: Axioms”

Flaky March 7, 2007 at 4:28 pm

“But ZFC doesn’t include the ability to quantify over predicates.” Isn’t that a property of the logic you use to express the theory in, rather than the theory?

Harald Hanche-Olsen March 7, 2007 at 4:56 pm

That doesn’t look quite right: P should not be a symbol at all. Rather, P(x,y) should be a formula in the first order language used to express ZFC, which has free variables x, y. If you wish to think of P as a symbol, it is a symbol of the metalanguage which we use to be able to talk about formulas of ZFC in the abstract. (So of course we cannot use a quantifier over P within ZFC, since P is not a symbol within ZFC in the first place.)
But this is rather difficult to discuss in a meaningful way in the context of a “basics” post …
Oh, and to the geometry axioms: There is no way you can get an inconsistent theory from a consistent one by throwing out some axioms. An inconsistent theory is one in which you can derive a contradiction, but if you throw out axioms, you can only derive a subset of the old theorems. In particular, if there were no contradictions there before, there won’t be any after you tossed some of the axioms.

clem March 7, 2007 at 5:24 pm

Do you have any interesting thoughts about the difference between minimizing and maximizing axiomatic structures? (X is a minimal set closed under axioms Y vs X is the collection of all elements satisfying axioms Y) Case in point: the various anti-foundation axioms of non-well-founded set theory.

archgoon March 7, 2007 at 8:27 pm

So, is the next post going to be on rules of inference?

Antendren March 7, 2007 at 8:58 pm

Clem, as long as we’re in first order logic, there is no upper bound on the size of a structure satisfying a given set of axioms. (Unless your axioms are exceedingly boring, to the point where they can only be satisfied by finite structures, and then only by finitely many such finite structures.)

Walker March 7, 2007 at 10:02 pm

As a logician, I always object to calling an axiom a “fact”. While that is true if you are axiomatizing a physical phenomenon, it is way too limiting. More importantly, it is not why axiomatization is so successful.
An axiom is essentially a constraint that serves indirectly as a definition.
Let’s take Hilbert’s first four axioms of synthetic geometry.

1. Every line contains at least two points
2. For every line, there is a point not contained by that line
3. There is a line
4. For every two points there is exactly one line containing them

Now, by themselves, these axioms are not enough. To have an axiom system I also need to identify the undefined terms. These are terms that have no meaning in and of themselves. In this axiom system point, line, and contains are all undefined.
An interpretation is any meaning given to the undefined terms. One interpretation is to define point=”number”, line=”set of numbers”, and contain=”element of”. So is the standard Euclidean meaning for those terms. But I could have really bizarre definitions like point=”student”, line=”school bus” and contain=”on the bus”. It is like the way group theory generalizes numbers.
A model is any interpretation that satisfies all the axioms. For example, the school bus model is an interpretation — but not a model — since if I have two kids on one bus, I need at least one kid on another bus, but now I have a pair that is not on the same bus. This is what I mean when I say an axiom is a definition. It defines what interpretations count as models and what do not.
The advantage of this approach is that any theorem proved from the axioms is true in all models. Hence, theorems about Euclidean geometry apply to non-Euclidean geometry provided that I only use the axioms that they have in common.
This view of axioms highlights exactly why I want as few axioms as possible. Epistemologically, my claim that a certain real world phenomenon is a model of an axiom system is empirical. To know that it is a model, I have to empirically verify that all the axioms are satisfied. This is outside of the realm of mathematical truth and quite susceptible to error. In order to minimize my chance of error, I want to have to check as few axioms as possible.
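
For finite interpretations, the interpretation/model distinction described above can be checked mechanically. The following Python sketch is only an illustration: the function `is_model`, the three-point "triangle", and the particular bus assignment are all made up for this example, and the bus case assumes each student rides exactly one bus. A line is represented by the set of points it contains.

```python
from itertools import combinations


def is_model(points, lines):
    """Check the four axioms above for a finite interpretation.

    `points` is a set of point names; `lines` maps each line name to
    the set of points that line contains.
    """
    # 1. Every line contains at least two points.
    ax1 = all(len(pts) >= 2 for pts in lines.values())
    # 2. For every line, there is a point not contained by that line.
    ax2 = all(any(p not in pts for p in points) for pts in lines.values())
    # 3. There is a line.
    ax3 = len(lines) >= 1
    # 4. For every two points there is exactly one line containing them.
    ax4 = all(
        sum(1 for pts in lines.values() if p in pts and q in pts) == 1
        for p, q in combinations(sorted(points), 2)
    )
    return ax1 and ax2 and ax3 and ax4


# A three-point "triangle": every pair of points shares exactly one line.
triangle_points = {"a", "b", "c"}
triangle_lines = {"ab": {"a", "b"}, "bc": {"b", "c"}, "ca": {"c", "a"}}

# School buses, assuming each student rides exactly one bus.
students = {"ann", "bob", "cat", "dan"}
buses = {"bus1": {"ann", "bob"}, "bus2": {"cat", "dan"}}

print(is_model(triangle_points, triangle_lines))  # True
print(is_model(students, buses))                  # False
```

The bus interpretation fails on the fourth axiom: ann and cat are two points with no line containing both, which is the failure described in the school bus example.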

speedwell March 8, 2007 at 12:13 am

Walker, greatly indebted to you for an “Oh!” moment. Specifically your definition of axiom as “a constraint that serves indirectly as a definition.”

Torbjörn Larsson March 8, 2007 at 2:55 am

“∃!”

‘Exists a unique’, got it. I can imagine the CS use of !∃, presumably ‘not exists’.
So I have to watch out for those exclamation marks!

“There is no way you can get an inconsistent theory from a consistent one by throwing out some axioms.”

So consistent theories are robust against such changes. Interesting, and probably useful at times.

Enigman March 8, 2007 at 5:27 am

Correcting “commenter will correct my if I blew it” to “commenter will correct me if I blew it” even though you didn’t blow it (rather, this was another gemlike Basic in my view, and Walker’s comment similarly) is pedantic, but I feel pedantic criticising the latter’s “In order to minimize my chance of error, I want to have to check as few axioms as possible.”
But surely real systems will usually satisfy axioms only approximately, so that there is very likely to be some error in applications. It is perhaps more a matter of minimizing the important errors? But that would seem to imply that we want axiomatic structures that correspond (in some rough and ready way) with the structures of reality; so that serious deviations can be detected easily, so that less serious deviations will be likely to remain insignificant, etc. Then, since simplicity seems to be an indicator of truth (a puzzling epistemological fact), and since simpler axiom systems are easier to work with, and to add to, etc., hence we want a simple set of axioms?

Doug March 8, 2007 at 1:31 pm

Thanks for your comments Walker!
“If you want to do any kind of formal or logical reasoning, or any kind of inference, you need to start with some set of known facts.”
“5. Given a line l and a point p which is not on l, there is exactly one line that passes through p but never intersects l.
The last one of those is really the interesting one, because it’s the one which doesn’t really look like an axiom.”
Given the usual meaning of the word ‘fact’ it appears that you have an inconsistent set of statements here, or that you’ve used some other definition of the term ‘fact’. So, what do you mean by ‘fact’?
I’d object to calling mathematical axioms ‘facts’ on a few different grounds:
1. Mathematics doesn’t necessarily reference the empirical world. Facts do.
2. Mathematical axioms may work in defining one system, but not in another. For instance the axiom of extension will work fine in (almost) all current versions of “pure” (crisp) set theory. But, it won’t work as sufficient in fuzzy set theory or a crisp set theory which has fuzzy subsets within it, since degrees of membership can vary while the member of the set remains the same. The Euclidean/non-Euclidean geometry is another example. Facts, as usually supposed, work as the same for different observers.
3. Facts do work as checkable by experiment and/or observation in terms of their truthfulness, at least theoretically. One can’t exactly check a mathematical axiom in terms of its truthfulness. One can more or less only evaluate such in terms of its utility. What does it allow us to deduce? What technological/scientific applications get suggested/become easier from our mathematical theory?

clem March 8, 2007 at 7:33 pm

Not quite what I was getting at, Antendren. For example, in the non-well-founded sets case, the question revolves around the nature of extensionality absent the foundation axiom. Under one anti-foundation axiom, if x = {x} and y = {y} then x=y. Under another that is not true. The first axiom (what I refer to as minimizing) essentially considers nwf sets that are equivalence classes of what are distinct nwf sets in the other system (maximizing).

Antendren March 8, 2007 at 8:51 pm

Ah. In that case, what you’re interested in are prime and saturated models. A prime model is one that in some sense has as few elements as possible, while a saturated one has as many as possible at that cardinality. Unfortunately, they only really make sense for complete theories (which ZFC isn’t), and the conditions for when they exist are rather complicated.
