Infinite and Non-Repeating Does Not Mean Unstructured

So, I got in to work this morning, and saw a tweet with the following image:

[Image: pi]

Pi is an infinite, non-repeating decimal – meaning that every possible number combination exists somewhere in pi. Converted into ASCII text, somewhere in that string of digits is the name of every person you will ever love, the date, time, and manner of your death, and the answers to all the great questions of the universe. Converted into a bitmap, somewhere in that infinite string of digits is a pixel-perfect representation of the first thing you saw on this earth, the last thing you will see before your life leaves you, and all the moments, momentous and mundane, that will occur between those points.

All information that has ever existed or will ever exist, the DNA of every being in the universe.

EVERYTHING: all contained in the ratio of a circumference and a diameter.

Things like this, that abuse misunderstandings of math in the service of pseudo-mystical rubbish, annoy the crap out of me.

Before I go into detail, let me start with one simple fact: No one knows whether or not π contains every possible finite-length sequence of digits. There is no proof that it does, and there is no proof that it does not. We don’t know. At the moment, no one does. If someone tells you for certain that it does, they’re bullshitting. If someone tells you for certain that it doesn’t, they’re also bullshitting.
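
To make the "we don’t know" concrete, here’s a rough sketch in Python (my choice of language for illustration; the names arccot and pi_decimals are made up for this example). It computes a finite number of digits of π using Machin’s formula with plain integer arithmetic, then searches them for a string. Finding a short string in a finite prefix is easy. It tells you nothing about whether every string shows up eventually.

```python
def arccot(x, unity):
    """Integer approximation of unity * arctan(1/x), via the Taylor series."""
    total = power = unity // x
    k, sign = 3, -1
    while True:
        power //= x * x
        term = power // k
        if term == 0:
            return total
        total += sign * term
        sign = -sign
        k += 2


def pi_decimals(n, guard=10):
    """First n digits of pi after the decimal point, as a string.

    Uses Machin's formula pi = 4 * (4*arctan(1/5) - arctan(1/239)),
    with `guard` extra digits to absorb rounding error.
    """
    unity = 10 ** (n + guard)
    pi_scaled = 4 * (4 * arccot(5, unity) - arccot(239, unity))
    return str(pi_scaled // 10 ** guard)[1:]  # drop the leading "3"


digits = pi_decimals(1000)
print(digits[:20])
print(digits.find("999999"))      # index of the first run of six 9s, or -1
print(digits.find("0123456789"))  # -1 means "not in this prefix", nothing more
```

A result of -1 only says the string isn’t in the digits we looked at. No finite search can settle the question either way.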

But that’s not really important. What bothers me about this is that it abuses a common misunderstanding of infinity. π is an irrational number. So is e. So are the square roots of most integers. In fact, so are most integral roots of most integers – cube roots, fourth roots, fifth roots, etc. All of these numbers are irrational.

What it means to be irrational is simple, and it can be stated in two different ways:

  1. An irrational number is a number that cannot be written as a ratio (fraction) of two finite integers.
  2. An irrational number is a number whose precise representation in decimal notation is an infinitely long non-repeating sequence of digits.
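
The flip side of the second statement is that every ratio of integers has a decimal expansion that either stops or eventually repeats. Here’s a small sketch of why, using long division (the function name decimal_expansion is just something I made up for illustration): there are only finitely many possible remainders, so a remainder must eventually come back, and from then on the digits cycle.

```python
def decimal_expansion(numerator, denominator):
    """Return (prefix, cycle): the fractional digits of numerator/denominator.

    `prefix` is the non-repeating part and `cycle` is the block that repeats
    forever (empty if the expansion terminates).
    """
    seen = {}    # remainder -> position of the digit it produced
    digits = []
    r = numerator % denominator
    while r != 0 and r not in seen:
        seen[r] = len(digits)
        r *= 10
        digits.append(str(r // denominator))
        r %= denominator
    if r == 0:
        return "".join(digits), ""
    start = seen[r]
    return "".join(digits[:start]), "".join(digits[start:])


print(decimal_expansion(1, 7))  # ('', '142857')
print(decimal_expansion(1, 6))  # ('1', '6')
print(decimal_expansion(1, 4))  # ('25', '')
```

An irrational number is exactly a number for which this loop could never stop, which is why no such function can exist for π or e.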

There are many infinitely long sequences of digits. Some will eventually include every finite sequence of digits; some will not.

For a simple example of a sequence that will, eventually, contain every possible sequence of digits: 0.010203040506070809010011012013014015016…. That is, take the sequence of natural numbers, and write them down after the decimal point with a 0 between them. This will, eventually, contain every possible natural number as a substring – and every finite string of digits is the representation of a natural number.
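
If you want to poke at that sequence, here’s a minimal sketch (the function name first_number_completing is hypothetical, invented for this example). It writes out 0, 1, 0, 2, 0, 3, … and reports which natural number had just been written when a target string first appears in full.

```python
def first_number_completing(target, max_n=1_000_000):
    """Return the n whose digits complete the first occurrence of `target`
    in the sequence 0 1 0 2 0 3 ..., or None if not found by max_n."""
    tail = ""  # the last len(target) - 1 digits written so far
    for n in range(1, max_n + 1):
        window = tail + "0" + str(n)
        if target in window:
            return n
        tail = window[-(len(target) - 1):] if len(target) > 1 else ""
    return None


print(first_number_completing("2"))     # 2
print(first_number_completing("00"))    # 11
print(first_number_completing("1234"))  # 1234
print(first_number_completing("007"))   # 71
```

Strings with leading zeros, like "007", still turn up: stick a 1 in front and you have the natural number 1007, which the sequence must reach, and in practice the separators produce the string even earlier.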

For a simple example of a sequence that will not contain every possible sequence of digits, consider 0.01011011101111011111… That is, the sequence of natural numbers written in unary form, separated by 0s. This will never include the number combination “2”. It will never contain the number sequence “4”. It will never even contain the digit sequence for four written in binary, because it will never contain a “1” followed by two “0”s. But it never repeats itself. It goes on and on forever, but it never starts repeating – it keeps adding new combinations that never existed before, in the form of longer and longer sequences of “1”s.
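
Here’s the same kind of sketch for the unary sequence (again, unary_prefix is a name I made up for illustration). It builds a finite prefix and checks the claims above: only two digits ever appear, zeros are always isolated, and each longer run of 1s is something that never occurred before.

```python
def unary_prefix(k):
    """Digits after the decimal point: 0 1 0 11 0 111 ... up to a run of k ones."""
    return "".join("0" + "1" * n for n in range(1, k + 1))


prefix = unary_prefix(200)
print(prefix[:20])                     # 01011011101111011111
print(sorted(set(prefix)))             # ['0', '1']
print("100" in prefix)                 # False: zeros never come in pairs
print("1" * 200 in unary_prefix(199))  # False: the run of 200 ones is new
```

A finite prefix can’t prove anything about the infinite sequence, of course. Here the structure of the construction does the proving, and the code just lets you watch it.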

Infinite and non-repeating doesn’t mean without pattern, nor does it mean without structure. All that it means is non-repeating. Both of the infinite sequences I described above are infinitely long and non-repeating, but both are also highly structured and predictable. One of those has the property that the original quote talked about; one doesn’t.

That’s the worst thing about the original quotation: it’s taking a common misunderstanding of infinity, and turning it into an implication that’s incorrect. The fact that something is infinitely long and non-repeating isn’t special: most numbers are infinitely long and non-repeating. It doesn’t imply that the number contains all information, because that’s not true.

Hell, it isn’t even close to true. Here’s a simple piece of information that isn’t contained anywhere in π: the decimal representation of e. e is, like π, represented in decimal form as an infinitely long sequence of non-repeating digits. e and π are, in fact, deeply related, via Euler’s equation: e^{i\pi} + 1 = 0. But the digits of e never occur in π, because they can’t: in decimal form, they’re both different infinitely long sequences of digits, so one cannot be contained in the other.

Numbers like π and e are important, and absolutely fascinating. If you take the time to actually study them and understand them, they’re amazing. I’ve written about both of them: π here and e here. People have spent their entire lives studying them and their properties, and they’re both interesting and important enough to deserve that degree of attention. We don’t need to make up unproven nonsense to make them interesting. We especially don’t need to make up nonsense that teaches people incorrect “facts” about how infinity works.

26 thoughts on “Infinite and Non-Repeating Does Not Mean Unstructured”

  1. Wyrd Smythe

    Thanks! This answers a question I’ve had for a long time: for some text encoding, do the complete works of Shakespeare appear somewhere in the digits of pi? (Obviously just a specific way of asking if a certain very long sequence of digits necessarily appears.)

    The common belief (as expressed in that tweet) seems to be, “Yes!” But I’ve always had my doubts. It does seem like an abuse of “infinite” to me.

    A “straight” encoding (ASCII or Unicode) seems like it would be too structured to ever appear as is, but then I wonder about a compressed encoding, which would lack the patterns of uncompressed text.

    Someone recently pointed me towards Jorge Luis Borges’ Library of Babel, which offers an interesting take on “all possible sequences.”

    And I’ve always been charmed by Carl Sagan’s fanciful idea (from his SF novel Contact) of messages buried in pi and e and other transcendentals. [For those who’ve not read it, buried deep, deep in the digits of pi we find an obvious raster pattern expressing the image of a circle (disc, actually).]

    1. Paul

      Two main things to think about there.

      Firstly, if Pi is normal, then the complete works of Shakespeare appear within it, with every single encoding you can think of… eventually. A normal number contains *every* finite sequence of digits, so no matter what digits end up standing for Shakespeare works, they’re in there. Same goes for the obvious circle image.

      But we don’t know whether Pi is normal, so we don’t know that every sub-sequence pops up eventually.

      Thus, no reason to think the circle image has to come up.

      On the other hand, if we’re allowed *any* encoding… the Shakespeare question becomes rather trivial. If 1=the complete works of Shakespeare, then the complete works of Shakespeare are encoded in the 2nd digit.

      If you restrict yourself to widely used encoding and compression algorithms, you’re back to guessing. We have no reason to think it *has* to happen, but also no reason to think it wouldn’t.

  2. timkington

    Maybe most people reading that think pi contains everything because its representation is infinitely long, which is wrong.

    Isn’t pi believed to be normal, though? I thought that meant the digits are randomly distributed. If that’s the case, then every finite-length string should appear eventually.

  3. Markus

    IIRC Marcus du Sautoy makes the claim that any finite sequence of digits can be found in PI in his The Code documentary.

  4. Sean

    Just a minor nitpick: Isn’t “finite integer” redundant? There’s no such thing as an “infinite integer” as far as I know.

    1. markcc Post author

      You are correct, but I have frequently had commenters here who didn’t understand that, and argue things like “Of course there’s a 1:1 map between the reals and the integers – just flip the number around the decimal point.”

    1. markcc Post author

      Why?

      Unless you want me to be more specific, and say something like “unless one contains the other as a suffix?”

      1. John Armstrong

        The soul of mathematics is precision, after all. When you’re dealing with cranks you can’t leave such a giant gap in your reasoning just because it sounds like a more satisfying smackdown.

        1. Doug Balcom

          Along the same lines, the following passage also calls for some clarifying revision:

          “That is, take the sequence of natural numbers, and write them down after the decimal point with a 0 between them. This will, eventually, contain every possible natural number as a substring – and every finite string of digits is the representation of a natural number.”

          But not every finite string of digits is a *normalized* (in the sense of no leading zeros) representation of a natural number — thus your explanation here, which seems to conflate these two notions, falls short in its goal.

          Of course, all you need to do to fix the explanation is to point out that any non-normalized decimal representation of a natural number is contained within a normalized one (by simply prepending a leading “1” or any other nonzero digit in front of it), and thus is also guaranteed to show up in the sequence you describe.

          Thus is the soul of mathematics preserved once again.

      2. Yiab

        I think a little more specificity wouldn’t hurt. After all, the digits of the decimal expansion of pi may very well contain the digits of the decimal expansion of e as a computable subsequence, though not necessarily as a consecutive subsequence. In fact, it’s trivial to prove that the digits of the binary expansion of pi contain the digits of the binary expansion of e as a subsequence if we simply ignore the radix point (all you need to know is that an irrational number has an infinite nonrepeating expansion when expressed in any integer base greater than 1, not only decimal).

  5. Anzel

    So we still have yet to prove that pi is normal. It’s worth pointing out that almost every real number is normal (in the Lebesgue measure sense), so if you just randomly picked a real number you’d have your DNA and a doge meme png and anything else. Now pick again, and again…

    With that said, it’s generally really hard to prove that any given irrational number is normal. This is one of those wonderful (/infuriating) math proofs in the vein of “something exists, but don’t ask me to find a non-trivial example”. And I believe that constructing a number that is normal in all bases is a really difficult challenge.

    So to me, pi* would be MORE interesting if it WASN’T normal. And though the idea of trying to hide some inner message in it seems pretty silly to me (how would you possibly make it have any other value?) a la Contact, I personally think a cuter way to “hide” such a message would be to have some number sequence that didn’t exist anywhere in the number. Now that would be a challenge to find.

    *Or tau, the superior constant.

    1. Wyrd Smythe

      Does that require that the DNA and PNG also be normal?

      WRT Contact, the implication was that whatever created the universe created it such that messages were hidden in pi and e and other transcendentals. It’s about the only way that would be possible.

      Also a big fan of Tau…. Tau day (6/28) is coming up!

      1. markcc Post author

        No.

        Normality, applied to an infinitely long sequence of digits, means that every finite string of a given length shows up with the same limiting frequency as every other string of that length. If that’s true, then if you follow the sequence out far enough, you’ll be able to find any desired finite sub-sequence.

        The finite subsequences don’t need to fit any profiles except finiteness. If the sequence is truly normal and infinitely long, then it will, eventually, include every possible finite subsequence.

        1. Wyrd Smythe

          So, just to make sure I’m completely clear on this, IF pi is normal (and we suspect it is), then it does contain any finite sequence of numbers I can imagine, including the raster pattern from Contact, the complete works of Shakespeare directly encoded into ASCII and a sequence of one-million zeros.

          But this isn’t particularly significant, because any normal real number has the same sequences somewhere in its digits.

          Yes?

  6. David Starner

    e is found in pi in the sense of the post if you assume its normality. You can write an expression in ASCII (TeX) that describes e exactly.

  7. Phil Koop

    That reminds me of a classic example of what constitutes a computable function in the sense of the general theory of algorithms. Suppose that Foo(n) is defined to return 1 if the decimal expansion of pi contains a run of at least n consecutive 7’s, and 0 otherwise. Is Foo() computable? Yes.

    There are two possibilities: either there is a longest run, say of length N, or else there is no longest run. In the first case, the program for Foo() is:

    int Foo(int n)
    {
        /* N is the length of the longest run of 7s in pi: a fixed constant, though unknown to us. */
        return n <= N ? 1 : 0;
    }

    In the second case, the program is:

    int Foo(int n)
    {
        return 1;
    }

    So Foo() is certainly computable because there is a program that computes it. We just don't know which program it is. For some reason, this example always winds people up.

  8. ah

    there is this concept that just because something is infinite, it must be ‘more’ in some sense: it must contain everything. this view is not stupid, just misguided. here is an example: in probability theory, when working with continuous time stochastic processes, one abides by this view: you saturate a filtration by making the sigma-algebra right-continuous and including the null sets of the associated natural filtration with respect to the measure. we get something that is ‘complete’ (or saturated) by putting more in.

    when you tell people about something non-infinite (finite) yet totally random, then a little switch goes off: wait.. it’s countable, i can understand it, but it’s still random? (for example, observing particles at a microscopic scale with measuring equipment, i.e. uncertainty) maybe it’s a case of not knowing randomness!

    to the author: have you read the book “thinking fast and slow” by daniel kahneman?

  9. Jon Finstad

    Did you ever consider this: The reason you cannot find what you are looking for in pi or e is because you are trying to guess what a 3D object is by only looking at it from 1D. Each constant is a “side” or a part of a coordinate, if you will. There must be a third irrational infinite constant that would “complete” the picture. You wouldn’t expect the universe to make it that easy, would you? Now if we could only find that constant. lol

  10. Rick Cummings

    Your sequence of 01011011101111011111… DOES contain a representation of 2 and of 4; in fact, it contains a representation of all numbers: 10=1, 110=2, 1110=3, 11110=4, etc., so it can encode the works of Shakespeare.
