Why we need formality in mathematics

The comment thread from my last Cantor crankery post has continued in a way that demonstrates a common issue when dealing with bad math, so I thought it was worth taking the discussion and promoting it to a proper top-level post.

The defender of the Cantor crankery tried to show what he alleged to be the problem with Cantor, by presenting a simple proof:

If we have a unit line, then this line will have an infinite number of points in it. Some of these points will be an irrational distance away from the origin and some will be a rational distance away from the origin.

Premise 1.

To have more irrational points on this line than rational points (plus 1), it is necessary to have at least two irrational points on the line so that there exists no rational point between them.

Premise 2.

It is not possible to have two irrational points on a line so that no rational point exists between them.

Conclusion.

It is not possible to have more irrational points on a line than rational points (plus 1).

This contradicts Cantor’s conclusion, so Cantor must have made a mistake in his reasoning.

(I’ve done a bit of formatting of this to make it look cleaner, but I have not changed any of the content.)

This is not a valid proof. It looks nice on the surface – it intuitively feels right. But it’s not. Why?

Because math isn’t intuition. Math is a formal system. When we’re talking about Cantor’s diagonalization, we’re working in the formal system of set theory. In most modern math, we’re specifically working in the formal system of Zermelo-Fraenkel (ZF) set theory. And that “proof” relies on two premises, which are not correct in ZF set theory. I pointed this out in verbose detail, to which the commenter responded:

I can understand your desire for a proof to be in ZFC, Peano arithmetic and FOPL, it is a good methodology but not the only one, and I am certain that it is not a perfect one. You are not doing yourself any favors if you let any methodology trump understanding. For me it is far more important to understand a proof, than to just know it “works” under some methodology that simply manipulates symbols.

This is the point I really wanted to get to here. It’s a form of complaint that I’ve seen over and over again – not just in the Cantor crankery, but in nearly all of the math posts.

There’s a common belief among crackpots of various sorts that scientists and mathematicians use symbols and formalisms just because we like them, or because we want to obscure things and make simple things seem complicated, so that we’ll look smart.

That’s just not the case. We use formalisms and notation because they are absolutely essential. We can’t do math without the formalisms; we could do it without the notation, but the notation makes things clearer than natural language prose.

The reason for all of that is because we want to be correct.

If we’re working with a definition that contains any vagueness – even the most subtle unintentional kind (or, actually, especially the most subtle unintentional kind!) – then we can easily produce nonsense. There’s a simple “proof” that we’ve discussed before that shows that 0 is equal to 1. It looks correct when you read it. But it contains a subtle error. If we weren’t being careful and formal, that kind of mistake can easily creep in – and once you allow one, single, innocuous looking error into a proof, the entire proof falls apart. The reason for all the formalism and all the notation is to give us a way to unambiguously, precisely state exactly what we mean. The reason that we insist of detailed logical step-by-step proofs is because that’s the only way to make sure that we aren’t introducing errors.

We can’t rely on intuition, because our intuition is frequently not correct. That’s why we use logic. We can’t rely on informal statements, because informal statements lack precision: they can mean many different things, some of which are true, and some of which are not.

In the case of Cantor’s diagonalization, when we’re being carefully precise, we’re not talking about the size of things: we’re talking about the cardinality of sets. That’s an important distinction, because “size” can mean many different things. Cardinality means one, very precise thing.

Similarly, we’re talking about the cardinality of the set of real numbers compared to the cardinality of the set of natural numbers. When I say that, I’m not just hand-waving the real numbers: the real numbers means something very specific: it’s the unique complete totally ordered field (R, +, *, <) up to isomorphism. To understand that, we’re implicitly referencing the formal definition of a field (with all of its sub-definitions) and the formal definitions of the addition, multiplication, and ordering operations.

I’m not just saying that to be pedantic. I’m saying that because we need to know exactly what we’re talking about. It’s very easy to put together an informal definition of the real numbers that’s different from the proper mathematical set of real numbers. For example, you can define a number system consisting of the set of all numbers that can be generated by a finite, non-terminating computer program. Intuitively, it might seem like that’s just another way of describing the real numbers – but it describes a very different set.

Beyond just definitions, we insist on using formal symbolic logic for a similar reason. If we can reduce the argument to symbolic reasoning, then we’ve abstracted away anything that could bias or deceive us. The symbolic logic makes every statement absolutely precise, and every reasoning step pure, precise, and unbiased.

So what’s wrong with the “proof” above? It’s got two premises. Let’s just look at the first one: “To have more irrational points on this line than rational points (plus 1), it is necessary to have at least two irrational points on the line so that there exists no rational point between them.”.

If this statement is true, then Cantor’s proof must be wrong. But is this statement true? The commenter’s argument is that it’s obviously intuitively true.

If we weren’t doing math, that might be OK. But this is math. We can’t just rely on our intuition, because we know that our intuition is often wrong. So we need to ask: can you prove that that’s true?

And how do you prove something like that? Well, you start with the basic rules of your proof system. In a discussion of a set theory proof, that means ZF set theory and first order predicate logic. Then you add in the definitions you need to talk about the objects you’re interested in: so Peano arithmetic, rational numbers, real number theory, and the definition of irrational numbers in real number theory. That gives you a formal system that you can use to talk about the sets of real numbers, rational numbers, and natural numbers.

The problem for our commenter is that you can’t prove that premise using ZF logic, FOPL, and real number theory. It’s not true. It’s based on a faulty understanding of the behavior of infinite sets. It’s taking an assumption that comes from our intuition, which seems reasonable, but which isn’t actually true within the formal system o mathematics.

In particular, it’s trying to say that in set theory, the cardinality of the set of real numbers is equal to the cardinality of the set of natural numbers – but doing so by saying “Ah, Why are you worrying about that set theory nonsense? Sure, it would be nice to prove this statement about set theory using set theory, but you’re just being picky on insisting that.”

Once you really see it in these terms, it’s an absurd statement. It’s equivalent to something as ridiculous as saying that you don’t need to modify verbs by conjugating them when you speak english, because in Chinese, the spoken words don’t change for conjugation.

45 thoughts on “Why we need formality in mathematics”

  1. Okay, using my intuition (which of course you just told me I shouldn’t do, but alas I am not conversant with ZFC. etc.) if I take the infinite decimal representation of any two non-equal irrational numbers, I can always truncate them and then form an infinite number of rational numbers that are between them. And on the same token, given any two non-equal rational numbers, I can always form an infinite number of irrationals between them. This is in accord with his premise 2 in the proof above. But it also shows why his premise 1 is false, but not a requirement since if there are an infinite number of rational and irrationals between any two points, their cardinality might be different. Which is just another way of saying that premise 1 is false and Cantor is still right.

    1. I’m not sure Premise 2 is right either. Consider two real numbers whose decimal representations are the concatenation of two infinite sequences where the first sequence is the same between them but the second is different. For instance, n1=0.000…001111… (an infinite sequence of 0s followed by an infinite sequence of 1s) and n2=0.000…002222… (an infinite sequence of 0s followed by an infinite sequence of 2s). Any finite trucation of these two numbers will yield the same number, with no room for any numbers between them. In fact, I can’t think of any rational numbers that would fit between their non-trucated forms. Yet, they are clearly not equal.

      1. I’m pretty sure both of the premises are wrong; but identifying any error in a premise is enough to trash the entire proof, so I stopped there.

      2. I don’t believe it’s valid to concatenate infinite sequences. Doesn’t “followed by” actually mean “comes after the last element of the previous sequence”? If a sequence doesn’t have a last element, then nothing can come after it.

        1. I’d argue that it’s not even wrong.

          That is, “concatenate” isn’t a meaningful term when you’re talking about infinite sets. There are so many different ways that you could define it. There are some definitions of concatenate in which it would make sense and be true; there are some definitions of concatenate in which it would make sense and be false; and there are some definitions in which it just doesn’t make any sense.

          That’s exactly the point I was trying to make. In math, we don’t use undefined terms. If we wanted to use the word concatenate, we’d give it a specific formal definition which identified what we mean, instead of just using the natural language work and relying on intuition to fill in the meaning.

          1. Sure, “concatenate” has meaning for infinite sequences. The problem is decimal notation requires a sequence of order type at most ω, but red is offering sequences of order type ω+ω!

      3. Premise 2 is indeed wrong, and it’s a fun little recreational mathematics exercise to prove this.

        Hint: Use continued fractions.

        1. Wait, no, the double negative in the premise threw me. The premise is correct. Given any two distinct irrational numbers, it is possible to construct a rational number between them.

          The problem with red’s argument is that 0.”an infinite sequence of 0s followed by an infinite sequence of 1s” is not a decimal numeral.

      4. A decimal expansion is only infinitely long. It doesn’t make sense to talk about “numbers whose decimal representations are the concatenation of two infinite sequences,” because you have to stop after the first infinite sequence. This is a property of the definition of decimal expansions.

        And Premise 2 is correct; it’s simply saying that the rationals are dense in the reals, which is true.

        1. For precision, it’s worth pointing out that you can, in the right model, with the right definition of “concatenation”, talk about something like the concatenation of two infinite sequences.

          The problem is, he’s using the informal word “concatenation” in a way that doesn’t make any sense in the system that he’s talking about. You can’t “concatenate” two infinite decimal expansions and get the properties that he thinks he’s getting.

  2. Hi MarkCC, I’ve meaning to ask you for your views about the learning resource https://betterexplained.com, as the emphasis there is providing “intuitive” learning guides to mathematics. From reading this article, I assume you’re against such teaching methods, no?

    1. I haven’t looked at their site, so I’m talking in the abstract here.

      My point isn’t that intuition is useless; it’s that you need to be able to understand the limits of intuition.

      When I was in high school, I was an honors math student; my older brother, who was a musician, was in the “normal” math classes.

      He was taught strictly rote. He’s a very smart guy, but coming out of those math classes, he had no idea what something like “percent” means. To figure out what a 20% off sale price was, he’d write down a pair of ratios, cross-multiple, and solve. The process didn’t mean anything to him.

      I consider that to be an absolutely criminal failure of the normal math classes. They taught procedures and memorization without any reason behind it – and that meant that their students had absolutely no concept of what any of it meant – which means that they might as well not have spent time learning any of it at all, because it was just meaningless procedures.

      You need to understand what things mean. In elementary math, that means that it’s absolutely essential to teach intuition. But as you get further and further along into deeper math, you reach a point where you have to accept that your intuition has limits, and that while intuition is a useful tool for understanding things, you can’t rely on it.

      1. “…which means that they might as well not have spent time learning any of it at all, because it was just meaningless procedures.”

        Not a very convincing statement, considering you preceded it by stating that your brother can calculate useful results.

        I get what you mean, but generally speaking mathematicians undervalue the worth of simply being able to calculate useful results even without understanding why the algorithms they apply work.

        1. He could produce a useful result, but in an impractical way, without understanding how or why it worked. That’s not actually useful. Knowing how to set up parallel ratios and cross multiply doesn’t do you much good in practice in the real situations where you encounter discounts.

  3. Nice detail. I probably would have said, “Let’s put aside Cantor’s argument for a moment. How do you think the cardinality of the natural numbers compares to that of the even natural numbers.” My guess is that the person would intuit that there are “fewer” even numbers without distinguishing between the notions of cardinality and proper subsets. It may be an easier context in which to make your point about the need for formalism.

    I once had a similar experience with a person who claimed that the product of two negative numbers “should” be negative because that “makes sense”.

    1. And my comment is rendered moot by your previous post on this topic, which I neglected to read first. Sorry about that.

  4. If avoiding small errors so so important, I wonder how mathematicians can be confident in their proofs without any computer checking. I’m pretty sure I couldn’t write a complicated program that works without a compiler and lots of testing, and even then there are bugs.

    But what seems to happen in practice (I’m told) is that mathematicians can work entirely on paper, skipping most of the formalism, while still being able to generate it on demand. And somehow, small errors in proofs usually don’t turn out to be fatal – they can be patched.

    It all seems rather mysterious. Formalism is very important in theory, but somehow not in practice once you get beyond high school geometry?

    This reminds me of my undergraduate days, being asked to prove things without a clear sense of what the rules actually are. It’s not like they’re all written down in a big book that you can refer to. The whole thing seemed suspiciously *in*formal from a computer programmer’s perspective.

    1. Being able to formalise is important; you can’t spot the mistakes if you can’t formalise your proof. The thing about experienced mathematicians is that they’ve trained their intuition such that the things they intuit tend to match up to the things that they can formalise; this, in turn lets them skip the formalism during the exploratory phase of proving something, and lets them add it in at the end.

    2. In my experience, you train on valid proofs and counter-examples until you build a good intuition for that area. Of the two maths areas I know best – differential geometry and logic – the first is much more amenable to intuition than the second.

    3. Actually, no, the papers written by mathematicians contain plenty of formalism. If any formalism is omitted, it is merely that of some minor bits that have been addressed before many times, and so these minor bits are used freely, just like previously proven theorems are used without replication of the proofs. Mathematical papers tend to introduce new definitions and new notations, and these are done very formally in order to avoid errors.

      For example, any paper dealing with real numbers will assume that the real numbers are the smallest complete ordered field containing the field of rational numbers, where the notion of “complete” has already been defined in the mathematical community. This notion of completeness has various equivalent forms (equivalent because we have proven them so): Axiom of Completeness, Nested Interval Principle, Bolzano-Weierstrass Theorem, etc. The equivalence of these is established in an undergraduate course in Analysis. So the author will use notions of completeness, convergence, and other concepts freely. If the author introduces some new property called “foo sets” of real numbers, then they will be required to very formally define a “foo set”.

      Regarding the checking of proofs, this is done by peer review, and any result of great interest will be checked by far more people than merely the journal-assigned referees. Computer-checking is far from fool-proof, and mathematicians have been wary of the use of computers in proofs. Computers must be programmed by fallible humans, and computers have limitations. The concern for mathematicans is whether a human checked the computer! And when a lot of computer work has been done, say in a proof of The Four Color Theorem, some people regard the proof as not really legitimate and continue to seek a traditional proof. Some of this skepticism has relaxed, but it has raised a new issue of requiring computer-assisted proofs to be open-source so that there can still be a peer review.

      1. I remain somewhat skeptical. It’s apparently not uncommon for mathematical texts to have minor errors that that experienced readers just gloss over (because they know what was actually meant) but trip up the uninitiated. Also, mathematical notation can be idiosyncratic and sometimes ambiguous. I’ve also read that making mathematical proofs fully computer-checkable is very tedious, which makes me a bit suspicious. If it were fully formalized in a consistent language, it seems like it shouldn’t be that hard?

        Completely agree that for static analysis tools to be fully trusted, they should be open source, and perhaps have multiple independent implementations.

        I’m not sure it’s necessary though. Just like you can use a spell-checker to find some errors even if you don’t fully trust it on spelling (or for finding any other kind of writing error), it’s possible to use a static analysis tool that sometimes has false positives or false negatives, or is incomplete. And we do that all the time in computer programming with linters.

        This is similar to how we don’t expect perfection from any single *human* reviewer. Holding static analysis tools to that standard seems a bit unfair.

        1. I don’t entirely disagree with you.

          Unambiguous notation is an aspiration, and all too often, we fail to achieve it.

          But even when we do, the problem is often more the fact that we omit things when we present them to other people, and those omissions cause all sorts of problems.

          In terms of automated proof systems, I think that there’s a significant issue of syntax.

          Our mathematical notation is mostly based on handwriting. It’s built to be easy for a human holding a pen to sketch out – not for a computer to process.

          For the most part, latex has become a sort of primary language for many people – but it’s geared towards producing something that looks like the carefully handwritten stuff.

          I’ve had the misfortune of learning a lot about the internals of TeX (a big part of how I worked my way through grad school was technical support for TeX in the CS and EE departments), and I think it’s difficult or even impossible to extract an unambiguous semantic form. I think that it’s equivalent to just getting the output, scanning it, and trying to parse out the notation in the scanned document. (TeX is a particularly ugly Turing complete computing system under the covers.)

          1. Checking a proof to see if each step was a valid application of an inference rule in the logic would not run afoul of incompleteness.

            The point of incompleteness is that there are some statements that cannot be proven. That’s fine – if they can’t be proven, then there are no proofs of them.

            But if you’ve already got a valid proof, then the thing you’re looking at doesn’t fall under incompleteness.

          2. But if you create a system that is sufficiently unambiguous and self-consistent for a machine to check its validity, you will create a system that cannot prove things that are true. I wonder if you might not want to leave a bit of “wiggle room” to prevent that. This begs the question, can we “get around” the incompleteness theorem to some extent by intuition? Or is the human mind of necessity part of the system?

          3. Static analysis tools don’t need to check everything. All you need is some kind of “this step is not checkable by tool X” annotation. This should get the attention of human reviewers who can review that step more carefully.

          4. blu28, you do not understand Gödel’s theorem. Also, pay attention to the word “some”. There will always be “some” statements in a consistent system that contains Peano arithmetic that cannot be proven true or false. That doesn’t mean you can never prove anything.

          5. I certainly never claimed that you couldn’t prove anything with such a system. Nor do I think it wouldn’t be useful to work towards such a system.

          6. There’s a neat trick you can pull off in a static analysis system to get the “not checkable” thing to work; you require the “not checkable” step to become an axiom.

            You can then extract the axioms from the proof, and examine them more closely; mostly, they’ll be standard axioms, and the system simply tells you that all statements its verified are true given these axioms from the standard libraries, plus these assertions made in the proof that cannot be checked.

            The proof user then looks at the axioms and the assertions, and confirms that they’re happy with that set – if they’re not happy, then they know what hypotheses need to be proved within their acceptable set of axioms to get the proof to hold.

        2. “I remain somewhat skeptical. It’s apparently not uncommon for mathematical texts to have minor errors that that experienced readers just gloss over (because they know what was actually meant) but trip up the uninitiated.”

          You just pulled a switcheroo. Do you want to talk about mathematical textbooks or mathematical papers? Choose one. Mathematical textbooks are not representative of mathematical papers. (Actually, in most disciplines textbooks and papers are written very differently.) Mathematical papers are not for the uninitiated.

          1. Also, remember that mathematical papers cite references. It is common for an author to cite a textbook and direct the reader there for some definitions and notation, so they can rest on the formalism of another work rather than duplicate it. Their new contribution, however, will be done formally.

  5. I think that it would very good to see the proof that convinced MarkCC that Cantor’s diagonal argument is indeed true. So Mark, if you don’t mind, would you provide a link to such a proof?

    1. The proof that convinced me Cantor’s diagonal is true is… Cantor’s diagonal. Shocking, huh?

      It’s a very simple proof, each step of which is well-founded by the axioms of ZF/FOPL + Peano arithmetic and real number theory.

      It’s been written thousands of times in thousands of different forms, including an informal presentation several times right here on this blog.

      If you actually care (which I’m pretty sure you don’t), get a good textbook. I’ve got https://www.amazon.com/Set-Theory-Its-Logic-Revised/dp/0674802071/ref=sr_1_8?ie=UTF8&qid=1473546149&sr=8-8&keywords=axiomatic+set+theory, which isn’t the easiest presentation, but it’s quite thorough, and reasonably clear for a formal text.

      1. I now have the book, “Set Theory and Its Logic”, but I am having difficulty finding the formal proof of Cantor’s diagonal argument. Can you tell me which page it is on?

      2. I find it interesting that Mark pays lip service to “formal proofs”, yet when I ask him for the the formal proof that convinced him that Cantor’s diagonal argument is valid, I get nothing, except perhaps a book and a wild goose chase.

        I think that what happens is that these professors get a hold of gullible students and once indoctrinated, the students will never look at these theories with any skepticism at all.

        Below, is a good example, the argument proposed by this UC Davis professor can’t be valid, because it requires the number in the ith place of the ith row to be changed into a number that is not a 0,1,9 or the original number. So, the proof will work only in base 5 or higher. So, unlistability may just be a property of base 5 or larger numbers and not of the set real numbers at all, but these students will go on thinking that Cantor’s diagonal is valid, just like Mark did.

        1. A “wild goose chase” is referring you to the most well-known canonical textbook on set theory.

          But your half-assed informal statements? Those we should just take at face value, right?

          Yeah. Sure.

          1. Mr. McClain must be clairvoyant, as he posted a very relevant reply to this new rant a couple of months ago, referring to a very well known proof that does not use decimal representation. You can also find it in lots of books, like Terry Tao’s. But it is good to remember that there are several ways of proving Cantor’s theorem. It is funny, by the way.

        2. Let me point out a couple of big problems with your argument here.

          To begin with, with this latest thing, all that you’re doing is showing that you really don’t understand number theory – which is pretty damned important, since you’re trying to talk about proofs in number theory. In particular, you don’t seem to understand that numbers exist independent of representation. If Cantor’s proof works for any representation of the set of real numbers, then it’s valid. If the diagonalization only worked on numbers represented in canonical infinite Egyptian fraction form, that would be fine. It wouldn’t matter if Cantor had needed to build a whole new representation that’s only used in the proof – if there’s one representation in which it works, and that representation can be shown to be a valid way of representing the set of real numbers, then the proof is valid.

          This isn’t just an abstract point. If you go and read Gödel’s proof of incompleteness, he produces a whole new representation of logic using natural numbers, called Gödel numbering. To make the proof work, you have to convert all of your logical statements into Gödel’s numeric representation of first order predicate logic. But you can’t say that that means that Gödel’s proof only applies to logical statements written using Gödel numbering. Gödel showed that his numeric representation was a valid representation for the set of all statements in FOPL, and so it remains proven and true, no matter what representation you use.

          Second, you don’t seem to understand what you’re trying to do here. The burden of proof isn’t on me to satisfy you. You’re trying to overturn one of the most foundational proofs of modern mathematics. It’s been taught to, literally, millions of students over the last century. Every mathematician in the world has studied it.

          You’re trying to say that it’s always been wrong, and that every mathematician who’s studied it somehow missed the obvious fact that it’s incorrect.

          That’s not impossible. But it is very unlikely. And so, we fall back to that old saw: extraordinary claims require extraordinary evidence.

          You’re making an extraordinary claim here. And you’re acting like the default assumption should be that you’re right, and it’s everyone else’s job to prove that you’re wrong.

          Sorry, guy, but that’s not how it works. You’re making extraordinary claims, and so it’s incumbent on you to show that your claims are correct. But all that you’ve got backing up your claims is a bunch of logically invalid “proofs”. Commenters here have patiently explained to you what’s wrong with them. Instead of trying to fix the problems, you’re just throwing a little tantrum because no one is giving you everything that you demand.

          You want to see a complete, meticulously annotated proof of Cantor’s diagonalization? Get a textbook and set theory, and read it. If you want to claim that you’re a big enough genius to overthrow more than a century of work by the best mathematical minds in the world, you can look up the proof that you claim to have refuted in a textbook.

  6. Years ago, I was talking with a friend about the nature of mathematics and how we personally view mathematics. I was tending towards more intuitive approach, not really believing the intuition but still being more guided by it. My friend was more formal, thinking of symbolic games, etc. Somehow, I explained, that regardless of whether I really thought it to be true, there was a part of me that wanted the math to point to a grander design, a truth behind truths.

    We also were talking about The Marshmallow Test in psychology. When some abstract idea is becoming something that you can deal with in your brain, it is important to let it build until it is ready. If you try to grasp it too soon, it may slip out of your fringers. I think whenever I was talking with my friend he was so much more able to let things be until all the required parts were in to make something.

    He has a lot more higher IQ than I do. It is just a fact and it shows. He is also one of the most pleasant people to talk about anything as he is able to put himself in the background and be your intelligence when you try to figure something out. Makes me smarter too! Now that is amazing.

  7. Premise 1 even seems to be fallacious even if you ignore the problem of lack of formalism. Why would you assume that infinite things have the same properties as finite things? This is a false equivocation. Hilbert’s Hotel is a simple enough argument even without formalism to dismiss premise 1 as unwarranted. (I know others pointed out issues with the premises, but I felt this one was different enough to warrant posting.)

  8. One of the motivations for developing homotopy type theory was the observation by Vladimir Vodoevsky of mistakes he had made in his own work. He wanted a new foundation to make computerised checking of proofs easier.

    And there are currently problems with the foundations of symplectic geometry becauase some of the pivotal foundational papers have been found to contain errors. (Symplectic geometers have been suspicious for a while, and have reorganised the subject to minimise the damage).

    So you don’t have to be a crankto get informal mathematics wrong. But the damage to mathemetics is greatest if you are a Fields medalist.

Leave a Reply