# The Math of Vaccinations, Infection Rates, and Herd Immunity

Here in the US, we are, horribly, in the middle of a measles outbreak. And, as usual, anti-vaccine people are arguing that:

• Measles isn’t really that serious;
• Unvaccinated children have nothing to do with the outbreak; and
• More vaccinated people are being infected than unvaccinated, which shows that vaccines don’t help.

A few years back, I wrote a post about the math of vaccines; it seems like this is a good time to update it.

When it comes to vaccines, there’s two things that a lot of people don’t understand. One is herd immunity; the other is probability of infection.

Herd immunity is the fundamental concept behind vaccines.

In an ideal world, a person who’s been vaccinated against a disease would have no chance of catching it. But the real world isn’t ideal, and vaccines aren’t perfect. What a vaccine does is prime the recipient’s immune system in a way that reduces the probability that they’ll be infected.

But even if a vaccine for an illness were perfect, and everyone was vaccinated, that wouldn’t mean that it was impossible for anyone to catch the illness. There are many people who’s immune systems are compromised – people with diseases like AIDS, or people with cancer receiving chemotherapy. (Or people who’ve had the measles within the previous two years!) And that’s not considering the fact that there are people who, for legitimate medical reasons, cannot be vaccinated!

So individual immunity, provided by vaccines, isn’t enough to completely eliminate the spread of a contagious illness. To prevent outbreaks, we rely on an emergent property of a vaccinated population. If enough people are immune to the disease, then even if one person gets infected with it, the disease won’t be able to spread enough to produce a significant outbreak.

We can demonstrate this with some relatively simple math.

Let’s imagine a case of an infection disease. For illustration purposes, we’ll simplify things in way that makes the outbreak more likely to spread than reality. (So this makes herd immunity harder to attain than reality.)

• There’s a vaccine that’s 95% effective: out of every 100 people vaccinated against the disease, 95% are perfectly immune; the remaining 5% have no immunity at all.
• The disease is highly contagious: out of every 100 people who are exposed to the disease, 95% will be infected.

If everyone is immunized, but one person becomes ill with the disease, how many people do they need to expose to the disease for the disease to spread?

Keeping things simple: an outbreak, by definition, is a situation where the number of exposed people is steadily increasing. That can only happen if every sick person, on average, infects more than 1 other person with the illness. If that happens, then the rate of infection can grow exponentially, turning into an outbreak.

In our scheme here, only one out of 20 people is infectable – so, on average, if our infected person has enough contact with 20 people to pass an infection, then there’s a 95% chance that they’d pass the infection on to one other person. (19 of 20 are immune; the one remaining person has a 95% chance of getting infected). To get to an outbreak level – that is, a level where they’re probably going to infect more than one other person, they’d need expose something around 25 people (which would mean that each infected person, on average, could infect roughly 1.2 people). If they’re exposed to 20 other people on average, then on average, each infected person will infect roughly 0.9 other people – so the number of infected will decrease without turning into a significant outbreak.

But what will happen if just 5% of the population doesn’t get vaccinated? Then we’ve got 95% of the population getting vaccinated, with a 95% immunity rate – so roughly 90% of the population has vaccine immunity. Our pool of non-immune people has doubled. In our example scenario, if each person is exposed to 20 other people during their illness, then they will, on average, cause 1.8 people to get sick. And so we have a major outbreak on our hands!

This illustrates the basic idea behind herd immunity. If you can successfully make a large enough portion of the population non-infectable by a disease, then the disease can’t spread through the population, even though the population contains a large number of infectable people. When the population’s immunity rate (either through vaccine, or through prior infection) gets to be high enough that an infection can no longer spread, the population is said to have herd immunity: even individuals who can’t be immunized no longer need to worry about catching it, because the population doesn’t have the capacity to spread it around in a major outbreak.

(In reality, the effectiveness of the measles vaccine really is in the 95 percent range – actually slightly higher than that; various sources estimate it somewhere between 95 and 97 percent effective! And the success rate of the vaccine isn’t binary: 95% of people will be fully immune; the remaining 5% will have a varying degree of immunity And the infectivity of most diseases is lower than the example above. Measles (which is a highly, highly contagious disease, far more contagious than most!) is estimated to infect between 80 and 90 percent of exposed non-immune people. So if enough people are immunized, herd immunity will take hold even if more than 20 people are exposed by every sick person.)

Moving past herd immunity to my second point: there’s a paradox that some antivaccine people (including, recently, Sheryl Atkinson) use in their arguments. If you look at an outbreak of an illness that we vaccinate for, you’ll frequently find that more vaccinated people become ill than unvaccinated. And that, the antivaccine people say, shows that the vaccines don’t work, and the outbreak can’t be the fault of the unvaccinated folks.

Let’s look at the math to see the problem with that.

Let’s use the same numbers as above: 95% vaccine effectiveness, 95% contagion. In addition, let’s say that 2% of people choose to go unvaccinated.

That means thats that 98% of the population has been immunized, and 95% of them are immune. So now 92% of the population has immunity.

If each infected person has contact with 20 other people, then we can expect expect 8% of those 20 to be infectable – or 1.6; and of those, 95% will become ill – or 1.52. So on average, each sick person will infect 1 1/2 other people. That’s enough to cause a significant outbreak. Without the non-immunized people, the infection rate is less than 1 – not enough to cause an outbreak.

The non-immunized population reduced the herd immunity enough to cause an outbreak.

Within the population, how many immunized versus non-immunized people will get sick?

Out of every 100 people, there are 5 who got vaccinated, but aren’t immune. Out of that same 100 people, there are 2 (2% of 100) that didn’t get vaccinated. If every non-immune person is equally likely to become ill, then we’d expect that in 100 cases of the disease, about 70 of them to be vaccinated, and 30 unvaccinated.

The vaccinated population is much, much larger – 50 times larger! – than the unvaccinated.
Since that population is so much larger, we’d expect more vaccinated people to become ill, even though it’s the smaller unvaccinated group that broke the herd immunity!

The easiest way to see that is to take those numbers, and normalize them into probabilities – that is, figure out, within the pool of all vaccinated people, what their likelihood of getting ill after exposure is, and compare that to the likelihood of a non-vaccinated person becoming ill after exposure.

So, let’s start with the vaccinated people. Let’s say that we’re looking at a population of 10,000 people total. 98% were vaccinated; 2% were not.

• The total pool of vaccinated people is 9800, and the total pool of unvaccinated is 200.
• Of the 9800 who were vaccinated, 95% of them are immune, leaving 5% who are not – so
490 infectable people.
• Of the 200 people who weren’t vaccinated, all of them are infectable.
• If everyone is exposed to the illness, then we would expect about 466 of the vaccinated, and 190 of the unvaccinated to become ill.

So more than twice the number of vaccinated people became ill. But:

• The odds of a vaccinated person becoming ill are 466/9800, or about 1 out of every 21
people.
• The odds of an unvaccinated person becoming ill are 190/200 or 19 out of every 20 people! (Note: there was originally a typo in this line, which was corrected after it was pointed out in the comments.)

The numbers can, if you look at them without considering the context, appear to be deceiving. The population of vaccinated people is so much larger than the population of unvaccinated that the total number of infected can give the wrong impression. But the facts are very clear: vaccination drastically reduces an individuals chance of getting ill; and vaccinating the entire population dramatically reduces the chances of an outbreak.

The reality of vaccines is pretty simple.

• Vaccines are highly effective.
• The diseases that vaccines prevent are not benign.
• Vaccines are really, really safe. None of the horror stories told by anti-vaccine people have any basis in fact. Vaccines don’t damage your immune system, they don’t cause autism, and they don’t cause cancer.
• Not vaccinating your children (or yourself!) doesn’t just put you at risk for illness; it dramatically increases the chances of other people becoming ill. Even when more vaccinated people than unvaccinated become ill, that’s largely caused by the unvaccinated population.

In short: everyone who is healthy enough to be vaccinated should get vaccinated. If you don’t, you’re a despicable free-riding asshole who’s deliberately choosing to put not just yourself but other people at risk.

# Zombie Math in the Vortex

Paul Krugman has taken to calling certain kinds of economic ideas zombie economics, because no matter how many times they’re shown to be false, they just keep coming back from the dead. I certainly don’t have stature that compares in any way to Krugmant, but I’m still going to use his terminology for some bad math. There are some crackpot ideas that you just can’t kill.

For example, vortex math. I wrote about vortex math for the first time in 2012, again in early 2013, and again in late 2013. But like a zombie in a bad movie, it’s fans won’t let it stay dead. There must have been a discussion on some vortex-math fan forum recently, because over the last month, I’ve been getting comments on the old posts, and emails taking me to task for supposedly being unfair, closed-minded, ignorant, and generally a very nasty person.

Before I look at any of their criticisms, let’s start with a quick refresher. What is vortex math?

We’re going to create a pattern of single-digit numbers using multiples of 2. Take the number 1. Multiply it by 2, and you get 2. Multiple it by 2, and you get 4. Again, you get 8. Again, and you get 16. 16 is two digits, but we only want one-digit numbers, so we add them together, getting 7. Double, you get 14, so add the digits, and you get 5. Double, you get 10, add the digits, and you get 1. So you’ve got a repeating sequence: 1, 2, 4, 8, 7, 5, …

Take the numbers 1 through 9, and put them at equal distances around the perimeter of a circle. Draw an arrow from a number to its single-digit double. You end up with something that looks kinda-sorta like the infinity symbol. You can also fit those numbers onto the surface of a torus.

That’s really all there is to vortex math. This guy named Marco Rodin discovered that there’s a repeating pattern, and if you draw it on a circle, it looks kinda-like the infinity symbol, and that there must be something incredibly profound and important about it. Launching from there, he came up with numerous claims about what that means. According to vortex math, there’s something deeply significant about that pattern:

1. If you make metallic windings on a toroidal surface according to that pattern and use it as a generator, it will generate free energy.
2. Take that same coil, and run a current through it, and you have a perfect, reactionless space drive (called “the flux thruster atom pulsar electrical ventury space time implosion field generator coil”).
3. If you use those numbers as a pattern in a medical device, it will cure cancer, as well as every other disease.
4. If you use that numerical pattern, you can devise better compression algorithms that can compress any string of bits.
5. and so on…

Essentially, according to vortex math, that repeated pattern of numbers defines a “vortex”, which is the deepest structure in the universe, and it’s the key to understanding all of math, all of physics, all of metaphysics, all of medicine. It’s the fundamental pattern of everything, and by understanding it, you can do absolutely anything.

As a math geek, the problem with stuff like vortex math is that it’s difficult to refute mathematically, because even though Rodin calls it math, there’s really no math to it. There’s a pattern, and therefore magic! Beyond the observation that there’s a pattern, there’s nothing but claims of things that must be true because there’s a pattern, without any actual mathematical argument.

Let me show you an example, from one of Rodin’s followers, named Randy Powell.

I call my discovery the ABHA Torus. It is now the full completion of how to engineer Marko Rodin’s Vortex Based Mathematics. The ABHA Torus as I have discovered it is the true and perfect Torus and it has the ability to reveal in 3-D space any and all mathematical/geometric relationships possible allowing it to essentially accomplish any desired functional application in the world of technology. This is because the ABHA Torus provides us a mathematical framework where the true secrets of numbers (qualitative relationships based on angle and ratio) are revealed in fullness.

This is why I believe that the ABHA Torus as I have calculated is the most powerful mathematical tool in existence because it presents proof that numbers are not just flat imaginary things. To the contrary, numbers are stationary vector interstices that are real and exhibiting at all times spatial, temporal, and volumetric qualities. Being stationary means that they are fixed constants. In the ABHA Torus the numbers never move but the functions move through the numbers modeling vibration and the underlying fractal circuitry that natures uses to harness living energy.

The ABHA Torus as revealed by the Rodin/Powell solution displays a perfectly symmetrical spin array of numbers (revealing even prime number symmetry), a feat that has baffled countless scientists and mathematicians throughout the ages. It even uncovers the secret of bilateral symmetry as actually being the result of a diagonal motion along the surface and through the internal volume of the torus in an expanding and contracting polarized logarithmic spiral diamond grain reticulation pattern produced by the interplay of a previously unobserved Positive Polarity Energetic Emanation (so-called ‘dark’ or ‘zero-point’ energy) and a resulting Negative Polarity Back Draft Counter Space (gravity).

If experimentally proven correct such a model would for example replace the standard approach to toroidal coils used in energy production today by precisely defining all the proportional and angular relationships existent in a moving system and revealing not only the true pathway that all accelerated motion seeks (be it an electron around the nucleus of an atom or water flowing down a drain) but in addition revealing this heretofore unobserved, undefined point energetic source underlying all space-time, motion, and vibration.

Lots of impressive sounding words, strung together in profound sounding ways, but what does it mean? Sure, gravity is a “back draft” of an unobserved “positive polarity energetic emanatation”, and therefore we’ve unified dark energy and gravity, and unified all of the forces of our universe. That sounds terrific, except that it doesn’t mean anything! How can you test that? What evidence would be consistent with it? What evidence would be inconsistent with it? No one can answer those questions, because none of it means anything.

As I’ve said lots of times before: there’s a reason for the formal framework of mathematics. There’s a reason for the painful process of mathematical proof. There’s a reason why mathematicians and scientists have devised an elaborate language and notation for expressing mathematical ideas. And that reason is because it’s easy to string together words in profound sounding ways. It’s easy to string together reasoning in ways that look like they might be compelling if you took the time to understand them. But to do actual mathematics or actual science, you need to do more that string together something that sounds good. You need to put together something that is precise. The point of mathematical notation and mathematical reasoning is to take complex ideas and turn them into precisely defined, unambiguous structures that have the same meaning to everyone who looks at them.

“positive polarity energetic emanation” is a bunch of gobbledegook wordage that doesn’t mean anything to anyone. I can’t refute the claim that gravity is a back-draft negative polarity energetic reaction to dark energy. I can’t support that claim, either. I can’t do much of anything with it, because Randy Powell hasn’t said anything meaningful. It’s vague and undefined in ways that make it impossible to reason about in any way.

And that’s the way that things go throughout all of vortex math. There’s this cute pattern, and it must mean something! Therefore… endless streams of words, without any actual mathematical, physical, or scientific argument.

There’s so much wrong with vortex math, but it all comes down to the fact that it takes some arbitrary artifacts of human culture, and assigns them deep, profound meaning for no reason.

There’s this pattern in the doubling of numbers and reducing them to one digit. Why multiple by two? Because we like it, and it produces a pretty pattern. Why not use 3? Well, because in base-10, it won’t produce a good pattern: [1, 3, 9, 9, 9, 9, ….] But we can pick another number like 7: [1, 7, 5, 8, 2, 5, 8, 2, 5, ….], and get a perfectly good series: why is that series less compelling than [1, 4, 8, 7, 2, 5]?

There’s nothing magical about base-10. We can do the same thing in base-8: [1, 2, 4, 1, 2, 4…] How about base-12, which was used for a lot of stuff in Egypt? [1, 2, 4, 8, 5, 10, 9, 7, 3, 6, 1] – that gives us a longer pattern! What makes base-10 special? Why does the base-10 pattern mean something that other bases, or other numerical representations, don’t? The vortex math folks can’t answer that. (Note: I made an arithmetic error in the initial version of the base-12 sequence above. It was pointed out in comments by David Wallace. Thanks!)

If we plot the numbers on a circle, we get something that looks kind-of like an infinity symbol! What does that mean? Why should the infinity symbal (which was invented in the 17th century, and chosen because it looked sort of like a number, and sort-of like the last letter of the greek alphabet) have any intrinsic meaning to the universe?

It’s giving profound meaning to arbitrary things, for unsupported reasons.

So what’s in the recent flood of criticism from the vortex math guys?

Well, there’s a lot of “You’re mean, so you’re wrong.” And there’s a lot of “Why don’t you prove that they’re wrong instead of making fun of them?”. And last but not least, there’s a lot of “Yeah, well, the fibonacci series is just a pattern of numbers too, but it’s really important”.

On the first: Yeah, fine, I’m mean. But I get pretty pissed at seeing people get screwed over by charlatans. The vortex math guys use this stuff to take money from “investors” based on their claims about producing limitless free energy, UFO space drives, and cancer cures. This isn’t abstract: this kind of nonsense hurts people. They people who are pushing these scams deserve to be mocked, without mercy. They don’t deserve kindness or respect, and they’re not going to get it from me.

I’d love to be proved wrong on this. One of my daughter’s friends is currently dying of cancer. I’d give up nearly anything to be able to stop her, and other children like her, from dying an awful death. If the vortex math folks could do anything for this poor kid, I would gladly grovel and humiliate myself at their feet. I would dedicate the rest of my life to nothing but helping them in their work.

But the fact is, when they talk about the miraculous things vortex math can do? At best, they’re delusional; more likely, they’re just lying. There is no cure for cancer in [1, 2, 4, 8, 7, 5, 1].

As for the Fibonacci series: well. It’s an interesting pattern. It does appear to show up in some interesting places in nature. But there are two really important differences.

1. The Fibonacci series shows up in every numeric notation, in every number base, no matter how you do numbers.
2. It does show up in nature. This is key: there’s more to it than just words and vague assertions. You can really find fragments of the Fibonacci series in nature. By doing a careful mathematical analysis, you can find the Fibonacci series in numerous places in mathematics, such as the solutions to a range of interesting dynamic optimization problems. When you find a way of observing the vortex math pattern in nature, or a way of producing actual numeric solutions for real problems, in a way that anyone can reproduce, I’ll happily give it another look.
3. The Fibonacci series does appear in nature – but it’s also been used by numerous crackpots to make ridiculous assertions about how the world must work!

# Understanding Global Warming Scale Issues

Aside from the endless stream of Cantor cranks, the next biggest category of emails I get is from climate “skeptics”. They all ask pretty much the same question. For example, here’s one I received today:

My personal analysis, and natural sceptisism tells me, that there are something fundamentally wrong with the entire warming theory when it comes to the CO2.

If a gas in the atmosphere increase from 0.03 to 0.04… that just cant be a significant parameter, can it?

I generally ignore it, because… let’s face it, the majority of people who ask this question aren’t looking for a real answer. But this one was much more polite and reasonable than most, so I decided to answer it. And once I went to the trouble of writing a response, I figured that I might as well turn it into a post as well.

The current figures – you can find them in a variety of places from wikipedia to the US NOAA – are that the atmosphere CO2 has changed from around 280 parts per million in 1850 to 400 parts per million today.

Why can’t that be a significant parameter?

There’s a couple of things to understand to grasp global warming: how much energy carbon dioxide can trap in the atmosphere, and hom much carbon dioxide there actually is in the atmosphere. Put those two facts together, and you realize that we’re talking about a massive quantity of carbon dioxide trapping a massive amount of energy.

The problem is scale. Humans notoriously have a really hard time wrapping our heads around scale. When numbers get big enough, we aren’t able to really grasp them intuitively and understand what they mean. The difference between two numbers like 300 and 400ppm is tiny, we can’t really grasp how in could be significant, because we aren’t good at taking that small difference, and realizing just how ridiculously large it actually is.

If you actually look at the math behind the greenhouse effect, you find that some gasses are very effective at trapping heat. The earth is only habitable because of the carbon dioxide in the atmosphere – without it, earth would be too cold for life. Small amounts of it provide enough heat-trapping effect to move us from a frozen rock to the world we have. Increasing the quantity of it increases the amount of heat it can trap.

Let’s think about what the difference between 280 and 400 parts per million actually means at the scale of earth’s atmosphere. You hear a number like 400ppm – that’s 4 one-hundreds of one percent – that seems like nothing, right? How could that have such a massive effect?!

But like so many other mathematical things, you need to put that number into the appropriate scale. The earths atmosphere masses roughly 5 times 10^21 grams. 400ppm of that scales to 2 times 10^18 grams of carbon dioxide. That’s 2 billion trillion kilograms of CO2. Compared to 100 years ago, that’s about 800 million trillion kilograms of carbon dioxide added to the atmosphere over the last hundred years. That’s a really, really massive quantity of carbon dioxide! scaled to the number of particles, that’s something around 10^40th (plus or minus a couple of powers of ten – at this scale, who cares?) additional molecules of carbon dioxide in the atmosphere. It’s a very small percentage, but it’s a huge quantity.

When you talk about trapping heat, you also have to remember that there’s scaling issues there, too. We’re not talking about adding 100 degrees to the earths temperature. It’s a massive increase in the quantity of energy in the atmosphere, but because the atmosphere is so large, it doesn’t look like much: just a couple of degrees. That can be very deceptive – 5 degrees celsius isn’t a huge temperature difference. But if you think of the quantity of extra energy that’s being absorbed by the atmosphere to produce that difference, it’s pretty damned huge. It doesn’t necessarily look like all that much when you see it stated at 2 degrees celsius – but if you think of it terms of the quantity of additional energy being trapped by the atmosphere, it’s very significant.

Calculating just how much energy a molecule of CO2 can absorb is a lot trickier than calculating the mass-change of the quantity of CO2 in the atmosphere. It’s a complicated phenomenon which involves a lot of different factors – how much infrared is absorbed by an atom, how quickly that energy gets distributed into the other molecules that it interacts with… I’m not going to go into detail on that. There’s a ton of places, like here, where you can look up a detailed explanation. But when you consider the scale issues, it should be clear that there’s a pretty damned massive increase in the capacity to absorb energy in a small percentage-wise increase in the quantity of CO2.

# Polls and Sampling Errors in the Presidental Debate Results

My biggest pet peeve is press coverage of statistics. As someone who is mathematically literate, I’m constantly infuriated by it. Basic statistics isn’t that hard, but people can’t be bothered to actually learn a tiny bit in order to understand the meaning of the things they’re covering.

My twitter feed has been exploding with a particularly egregious example of this. After monday night’s presidential debate, there’s been a ton of polling about who “won” the debate. One conservative radio host named Bill Mitchell has been on a rampage about those polls. Here’s a sample of his tweets:

Statistical analysis has a very simple point. We’re interested in understanding the properties of a large population of things. For whatever reason, we can’t measure the properties of every object in that population.

The exact reason can vary. In political polling, we can’t ask every single person in the country who they’re going to vote for. (Even if we could, we simply don’t know who’s actually going to show up and vote!) For a very different example, my first exposure to statistics was through my father, who worked in semiconductor manufacturing. They’d produce a run of 10,000 chips for use in Satellites. They needed to know when, on average, a chip would fail from exposure to radiation. If they measured that in every chip, they’d end up with nothing to sell.)

Anyway: you can’t measure every element of the population, but you still want to take measurements. So what you do is randomly select a collection of representative elements from the population, and you measure those. Then you can say that with a certain probability, the result of analyzing that representative subset will match the result that you’d get if you measured the entire population.

How close can you get? If you’ve really selected a random sample of the population, then the answer depends on the size of the sample. We measure that using something called the “margin of error”. “Margin of error” is actually a terrible name for it, and that’s the root cause of one of the most common problems in reporting about statistics. The margin of error is a probability measurement that says “there is an $N$% probability that the value for the full population lies within the margin of error of the measured value of the sample.”.

Right away, there’s a huge problem with that. What is that variable doing in there? The margin of error measures the probability that the full population value is within a confidence interval around the measured sample value. If you don’t say what the confidence interval is, the margin of error is worthless. Most of the time – but not all of the time – we’re talking about a 95% confidence interval.

But there are several subtler issues with the margin of error, both due to the name.

1. The “true” value for the full population is not guaranteed to be within the margin of error of the sampled value. It’s just a probability. There is no hard bound on the size of the error: just a high probability of it being within the margin..
2. The margin of error only includes errors due to sample size. It does not incorporate any other factor – and there are many! – that may have affected the result.
3. The margin of error is deeply dependent on the way that the underlying sample was taken. It’s only meaningful for a random sample. That randomness is critically important: all of sampled statistics is built around the idea that you’ve got a randomly selected subset of your target population.

Let’s get back to our friend the radio host, and his first tweet, because he’s doing a great job of illustrating some of these errors.

The quality of a sampled statistic is entirely dependent on how well the sample matches the population. The sample is critical. It doesn’t matter how big the sample size is if it’s not random. A non-random sample cannot be treated as a representative sample.

So: an internet poll, where a group of people has to deliberately choose to exert the effort to participate cannot be a valid sample for statistical purposes. It’s not random.

It’s true that the set of people who show up to vote isn’t a random sample. But that’s fine: the purpose of an election isn’t to try to divine what the full population thinks. It’s to count what the people who chose to vote think. It’s deliberately measuring a full population: the population of people who chose to vote.

But if you’re trying to statistically measure something about the population of people who will go and vote, you need to take a randomly selected sample of people who will go to vote. The set of voters is the full population; you need to select a representative sample of that population.

Internet polls do not do that. At best, they measure a different population of people. (At worst, with ballot stuffing, they measure absolutely nothing, but we’ll give them this much benefit of the doubt.) So you can’t take much of anything about the sample population and use it to reason about the full population.

And you can’t say anything about the margin of error, either. Because the margin of error is only meaningful for a representative sample. You cannot compute a meaningful margin of error for a non-representative sample, because there is no way of knowing how that sampled population compares to the true full target population.

And that brings us to the second tweet. A properly sampled random population of 500 people can produce a high quality result with a roughly 5% margin of error and a 95% confidence interval. (I’m doing a back-of-the-envelope calculation here, so that’s not precise.) That means that if the population were randomly sampled, we could say there is in 19 out of 20 polls of that size, the full population value would be within +/- 4% of value measured by the poll. For a non-randomly selected sample of 10 million people, the margin of error cannot be measured, because it’s meaningless. The random sample of 500 people tells us a reasonable estimate based on data; the non-random sample of 10 million people tells us nothing.

And with that, on to the third tweet!

In a poll like this, the margin of error only tells us one thing: what’s the probability that the sampled population will respond to the poll in the same way that the full population would?

There are many, many things that can affect a poll beyond the sample size. Even with a truly random and representative sample, there are many things that can affect the outcome. For a couple of examples:

How, exactly, is the question phrased? For example, if you ask people “Should police shoot first and ask questions later?”, you’ll get a very different answer from “Should police shoot dangerous criminal suspects if they feel threatened?” – but both of those questions are trying to measure very similar things. But the phrasing of the questions dramatically affects the outcome.

What context is the question asked in? Is this the only question asked? Or is it asked after some other set of questions? The preceding questions can bias the answers. If you ask a bunch of questions about how each candidate did with respect to particular issues before you ask who won, those preceding questions will bias the answers.

When you’re looking at a collection of polls that asked different questions in different ways, you expect a significant variation between them. That doesn’t mean that there’s anything wrong with any of them. They can all be correct even though their results vary by much more than their margins of error, because the margin of error has nothing to do with how you compare their results: they used different samples, and measured different things.

The problem with the reporting is the same things I mentioned up above. The press treats the margin of error as an absolute bound on the error in the computed sample statistics (which it isn’t); and the press pretends that all of the polls are measuring exactly the same thing, when they’re actually measuring different (but similar) things. They don’t tell us what the polls are really measuring; they don’t tell us what the sampling methodology was; and they don’t tell us the confidence interval.

Which leads to exactly the kind of errors that Mr. Mitchell made.

And one bonus. Mr. Mitchell repeatedly rants about how many polls show a “bias” by “over-sampling< democratic party supporters. This is a classic mistake by people who don't understand statistics. As I keep repeating, for a sample to be meaningful, it must be random. You can report on all sorts of measurements of the sample, but you cannot change it.

If you’re randomly selecting phone numbers and polling the respondents, you cannot screen the responders based on their self-reported party affiliation. If you do, you are biasing your sample. Mr. Mitchell may not like the results, but that doesn’t make them invalid. People report what they report.

In the last presidential election, we saw exactly this notion in the idea of “unskewing” polls, where a group of conservative folks decided that the polls were all biased in favor of the democrats for exactly the reasons cited by Mr. Mitchell. They recomputed the poll results based on shifting the samples to represent what they believed to be the “correct” breakdown of party affiliation in the voting population. The results? The actual election results closely tracked the supposedly “skewed” polls, and the unskewers came off looking like idiots.

We also saw exactly this phenomenon going on in the Republican primaries this year. Randomly sampled polls consistently showed Donald Trump crushing his opponents. But the political press could not believe that Donald Trump would actually win – and so they kept finding ways to claim that the poll samples were off: things like they were off because they used land-lines which oversampled older people, and if you corrected for that sampling error, Trump wasn’t actually winning. Nope: the randomly sampled polls were correct, and Donald Trump is the republican nominee.

If you want to use statistics, you must work with random samples. If you don’t, you’re going to screw up the results, and make yourself look stupid.

# Why we need formality in mathematics

The comment thread from my last Cantor crankery post has continued in a way that demonstrates a common issue when dealing with bad math, so I thought it was worth taking the discussion and promoting it to a proper top-level post.

The defender of the Cantor crankery tried to show what he alleged to be the problem with Cantor, by presenting a simple proof:

If we have a unit line, then this line will have an infinite number of points in it. Some of these points will be an irrational distance away from the origin and some will be a rational distance away from the origin.

Premise 1.

To have more irrational points on this line than rational points (plus 1), it is necessary to have at least two irrational points on the line so that there exists no rational point between them.

Premise 2.

It is not possible to have two irrational points on a line so that no rational point exists between them.

Conclusion.

It is not possible to have more irrational points on a line than rational points (plus 1).

This contradicts Cantor’s conclusion, so Cantor must have made a mistake in his reasoning.

(I’ve done a bit of formatting of this to make it look cleaner, but I have not changed any of the content.)

This is not a valid proof. It looks nice on the surface – it intuitively feels right. But it’s not. Why?

Because math isn’t intuition. Math is a formal system. When we’re talking about Cantor’s diagonalization, we’re working in the formal system of set theory. In most modern math, we’re specifically working in the formal system of Zermelo-Fraenkel (ZF) set theory. And that “proof” relies on two premises, which are not correct in ZF set theory. I pointed this out in verbose detail, to which the commenter responded:

I can understand your desire for a proof to be in ZFC, Peano arithmetic and FOPL, it is a good methodology but not the only one, and I am certain that it is not a perfect one. You are not doing yourself any favors if you let any methodology trump understanding. For me it is far more important to understand a proof, than to just know it “works” under some methodology that simply manipulates symbols.

This is the point I really wanted to get to here. It’s a form of complaint that I’ve seen over and over again – not just in the Cantor crankery, but in nearly all of the math posts.

There’s a common belief among crackpots of various sorts that scientists and mathematicians use symbols and formalisms just because we like them, or because we want to obscure things and make simple things seem complicated, so that we’ll look smart.

That’s just not the case. We use formalisms and notation because they are absolutely essential. We can’t do math without the formalisms; we could do it without the notation, but the notation makes things clearer than natural language prose.

The reason for all of that is because we want to be correct.

If we’re working with a definition that contains any vagueness – even the most subtle unintentional kind (or, actually, especially the most subtle unintentional kind!) – then we can easily produce nonsense. There’s a simple “proof” that we’ve discussed before that shows that 0 is equal to 1. It looks correct when you read it. But it contains a subtle error. If we weren’t being careful and formal, that kind of mistake can easily creep in – and once you allow one, single, innocuous looking error into a proof, the entire proof falls apart. The reason for all the formalism and all the notation is to give us a way to unambiguously, precisely state exactly what we mean. The reason that we insist of detailed logical step-by-step proofs is because that’s the only way to make sure that we aren’t introducing errors.

We can’t rely on intuition, because our intuition is frequently not correct. That’s why we use logic. We can’t rely on informal statements, because informal statements lack precision: they can mean many different things, some of which are true, and some of which are not.

In the case of Cantor’s diagonalization, when we’re being carefully precise, we’re not talking about the size of things: we’re talking about the cardinality of sets. That’s an important distinction, because “size” can mean many different things. Cardinality means one, very precise thing.

Similarly, we’re talking about the cardinality of the set of real numbers compared to the cardinality of the set of natural numbers. When I say that, I’m not just hand-waving the real numbers: the real numbers means something very specific: it’s the unique complete totally ordered field $(R, +, *, <)$ up to isomorphism. To understand that, we’re implicitly referencing the formal definition of a field (with all of its sub-definitions) and the formal definitions of the addition, multiplication, and ordering operations.

I’m not just saying that to be pedantic. I’m saying that because we need to know exactly what we’re talking about. It’s very easy to put together an informal definition of the real numbers that’s different from the proper mathematical set of real numbers. For example, you can define a number system consisting of the set of all numbers that can be generated by a finite, non-terminating computer program. Intuitively, it might seem like that’s just another way of describing the real numbers – but it describes a very different set.

Beyond just definitions, we insist on using formal symbolic logic for a similar reason. If we can reduce the argument to symbolic reasoning, then we’ve abstracted away anything that could bias or deceive us. The symbolic logic makes every statement absolutely precise, and every reasoning step pure, precise, and unbiased.

So what’s wrong with the “proof” above? It’s got two premises. Let’s just look at the first one: “To have more irrational points on this line than rational points (plus 1), it is necessary to have at least two irrational points on the line so that there exists no rational point between them.”.

If this statement is true, then Cantor’s proof must be wrong. But is this statement true? The commenter’s argument is that it’s obviously intuitively true.

If we weren’t doing math, that might be OK. But this is math. We can’t just rely on our intuition, because we know that our intuition is often wrong. So we need to ask: can you prove that that’s true?

And how do you prove something like that? Well, you start with the basic rules of your proof system. In a discussion of a set theory proof, that means ZF set theory and first order predicate logic. Then you add in the definitions you need to talk about the objects you’re interested in: so Peano arithmetic, rational numbers, real number theory, and the definition of irrational numbers in real number theory. That gives you a formal system that you can use to talk about the sets of real numbers, rational numbers, and natural numbers.

The problem for our commenter is that you can’t prove that premise using ZF logic, FOPL, and real number theory. It’s not true. It’s based on a faulty understanding of the behavior of infinite sets. It’s taking an assumption that comes from our intuition, which seems reasonable, but which isn’t actually true within the formal system o mathematics.

In particular, it’s trying to say that in set theory, the cardinality of the set of real numbers is equal to the cardinality of the set of natural numbers – but doing so by saying “Ah, Why are you worrying about that set theory nonsense? Sure, it would be nice to prove this statement about set theory using set theory, but you’re just being picky on insisting that.”

Once you really see it in these terms, it’s an absurd statement. It’s equivalent to something as ridiculous as saying that you don’t need to modify verbs by conjugating them when you speak english, because in Chinese, the spoken words don’t change for conjugation.

# Cantor Crankery is Boring

Sometimes, I think that I’m being punished.

I’ve written about Cantor crankery so many times. In fact, it’s one of the largest categories in this blog’s index! I’m pretty sure that I’ve covered pretty much every anti-Cantor argument out there. And yet, not a week goes by when another idiot doesn’t pester me with their “new” refutation of Cantor. The “new” argument is always, without exception, a variation on one of the same old boring ones.

But I haven’t written any Cantor stuff in quite a while, and yet another one landed in my mailbox this morning. So, what the heck. Let’s go down the rabbit-hole once again.

The argument that the cranks hate is called Cantor’s diagonalization. Cantor’s diagonalization as argument that according to the axioms of set theory, the cardinality (size) of the set of real numbers is strictly larger than the cardinality of the set of natural numbers.

The argument is based on the set theoretic definition of cardinality. In set theory, two sets are the same size if and only if there exists a one-to-one mapping between the two sets. If you try to create a mapping between set A and set B, and in every possible mapping, every A is mapped onto a unique B, but there are leftover Bs that no element of A maps to, then the cardinality of B is larger than the cardinality of A.

When you’re talking about finite sets, this is really easy to follow. If I is the set {1, 2, 3}, and B is the set {4, 5, 6, 7}, then it’s pretty obvious that there’s no one to one mapping from A to B: there are more elements in B than there are in A. You can easily show this by enumerating every possible mapping of elements of A onto elements of B, and then showing that in every one, there’s an element of B that isn’t mapped to by an element of A.

With infinite sets, it gets complicated. Intuitively, you’d think that the set of even natural numbers is smaller than the set of all natural numbers: after all, the set of evens is a strict subset of the naturals. But your intuition is wrong: there’s a very easy one to one mapping from the naturals to the evens: {n → 2n }. So the set of even natural numbers is the same size as the set of all natural numbers.

To show that one infinite set has a larger cardinality than another infinite set, you need to do something slightly tricky. You need to show that no matter what mapping you choose between the two sets, some elements of the larger one will be left out.

In the classic Cantor argument, what he does is show you how, given any purported mapping between the natural numbers and the real numbers, to find a real number which is not included in the mapping. So no matter what mapping you choose, Cantor will show you how to find real numbers that aren’t in the mapping. That means that every possible mapping between the naturals and the reals will omit members of the reals – which means that the set of real numbers has a larger cardinality than the set of naturals.

Cantor’s argument has stood since it was first presented in 1891, despite the best efforts of people to refute it. It is an uncomfortable argument. It violates our intuitions in a deep way. Infinity is infinity. There’s nothing bigger than infinity. What does it even mean to be bigger than infinity? That’s a non-sequiter, isn’t it?

What it means to be bigger than infinity is exactly what I described above. It means that if you have a two infinitely large sets of objects, and there’s no possible way to map from one to the other without missing elements, then one is bigger than the other.

There are legitimate ways to dispute Cantor. The simplest one is to reject set theory. The diagonalization is an implication of the basic axioms of set theory. If you reject set theory as a basis, and start from some other foundational axioms, you can construct a version of mathematics where Cantor’s proof doesn’t work. But if you do that, you lose a lot of other things.

You can also argue that “cardinality” is a bad abstraction. That is, that the definition of cardinality as size is meaningless. Again, you lose a lot of other things.

If you accept the axioms of set theory, and you don’t dispute the definition of cardinality, then you’re pretty much stuck.

Ok, background out of the way. Let’s look at today’s crackpot. (I’ve reformatted his text somewhat; he sent this to me as plain-text email, which looks awful in my wordpress display theme, so I’ve rendered it into formatted HTML. Any errors introduced are, of course, my fault, and I’ll correct them if and when they’re pointed out to me.)

We have been told that it is not possible to put the natural numbers into a one to one with the real numbers. Well, this is not true. And the argument, to show this, is so simple that I am absolutely surprised that this argument does not appear on the internet.

We accept that the set of real numbers is unlistable, so to place them into a one to one with the natural numbers we will need to make the natural numbers unlistable as well. We can do this by mirroring them to the real numbers.

Given any real number (between 0 and 1) it is possible to extract a natural number of any length that you want from that real number.

Ex: From π-3 we can extract the natural numbers 1, 14, 141, 1415, 14159 etc…

We can form a set that associates the extracted number with the real number that it was extracted from.

Ex: 1 → 0.14159265…

Then we can take another real number (in any arbitrary order) and extract a natural number from it that is not in our set.

Ex: 1 → 0.14159266… since 1 is already in our set we must extract the next natural number 14.

Since 14 is not in our set we can add the pair 14 → 0.1415926l6… to our set.

We can do the same thing with some other real number 0.14159267… since 1 and 14 is already in our set we will need to extract a 3 digit natural number, 141, and place it in our set. And so on.

So our set would look something like this…

 A) 1 → 0.14159265… B) 14 → 0.14159266… C) 141 → 0.14159267… D) 1410 → 0.141 E) 14101 → 0.141013456789… F) 5 → 0.567895… G) 55 → 0.5567891… H) 555 → 0.555067891… … …

Since the real numbers are infinite in length (some terminate in an infinite string of zero’s) then we can always extract a natural number that is not in the set of pairs since all the natural numbers in the set of pairs are finite in length. Even if we mutate the diagonal of the real numbers, we will get a real number not on the list of real numbers, but we can still find a natural number, that is not on the list as well, to correspond with that real number.

Therefore it is not possible for the set of real numbers to have a larger cardinality than the set of natural numbers!

This is a somewhat clever variation on a standard argument.

Over and over and over again, we see arguments based on finite prefixes of real numbers. The problem with them is that they’re based on finite prefixes. The set of all finite prefixes of the real numbers is that there’s an obvious correspondence between the natural numbers and the finite prefixes – but that still doesn’t mean that there are no real numbers that aren’t in the list.

In this argument, every finite prefix of π corresponds to a natural number. But π itself does not. In fact, every real number that actually requires an infinite number of digits has no corresponding natural number.

This piece of it is, essentially, the same thing as John Gabriel’s crankery.

But there’s a subtler and deeper problem. This “refutation” of Cantor contains the conclusion as an implicit premise. That is, it’s actually using the assumption that there’s a one-to-one mapping between the naturals and the reals to prove the conclusion that there’s a one-to-one mapping between the naturals and the reals.

If you look at his procedure for generating the mapping, it requires an enumeration of the real numbers. You need to take successive reals, and for each one in the sequence, you produce a mapping from a natural number to that real. If you can’t enumerate the real numbers as a list, the procedure doesn’t work.

If you can produce a sequence of the real numbers, then you don’t need this procedure: you’ve already got your mapping. 0 to the first real, 1 to the second real, 2 to the third read, 3 to the fourth real, and so on.

So, once again: sorry Charlie: your argument doesn’t work. There’s no Fields medal for you today.

One final note. Like so many other bits of bad math, this is a classic example of what happens when you try to do math with prose. There’s a reason that mathematicians have developed formal notations, formal language, detailed logical inference, and meticulous citation. It’s because in prose, it’s easy to be sloppy. You can accidentally introduce an implicit premise without meaning to. If you need to carefully state every premise, and cite evidence of its truth, it’s a whole lot harder to make this kind of mistake.

That’s the real problem with this whole argument. It’s built on the fact that the premise “you can enumerate the real numbers” is built in. If you wrote it in careful formal mathematics, you wouldn’t be able to get away with that.

# UD Creationists and Proof

A reader sent me a link to a comment on one of my least favorite major creationist websites, Uncommon Descent (No link, I refuse to link to UD). It’s dumb enough that it really deserves a good mocking.

Barry Arrington, June 10, 2016 at 2:45 pm

daveS:
“That 2 + 3 = 5 is true by definition can be verified in a purely mechanical, absolutely certain way.”

This may be counter intuitive to you dave, but your statement is false. There is no way to verify that statement. It is either accepted as self-evidently true, or not. Think about it. What more basic steps of reasoning would you employ to verify the equation? That’s right; there are none. You can say the same thing in different ways such as || + ||| = ||||| or “a set with a cardinality of two added to a set with cardinality of three results in a set with a cardinality of five.” But they all amount to the same statement.

That is another feature of a self-evident truth. It does not depend upon (indeed cannot be) “verified” (as you say) by a process of “precept upon precept” reasoning. As WJM has been trying to tell you, a self-evident truth is, by definition, a truth that is accepted because rejection would be upon pain of patent absurdity.

2+3=5 cannot be verified. It is accepted as self-evidently true because any denial would come at the price of affirming an absurdity.

It’s absolutely possible to verify the statement “2 + 3 = 5”. It’s also absolutely possible to prove that statement. In fact, both of those are more than possible: they’re downright easy, provided you accept the standard definitions of arithmetic. And frankly, only a total idiot who has absolutely no concept of what verification or proof mean would ever claim otherwise.

Verification is the process of testing a hypothesis to determine if it correctly predicts the outcome. Here’s how you verify that 2+3=5:

1. Get two pennies, and put them in a pile.
2. Get three pennies, and put them in a pile.
3. Put the pile of 2 pennies on top of the pile of 3 pennies.
4. Count the resulting pile of pennies.
5. If there are 5 pennies, then you have verified that 2+3=5.

Verification isn’t perfect. It’s the result of a single test that confirms what you expect. But verification is repeatable: you can repeat that experiment as many times as you want, and you’ll always get the same result: the resulting pile will always have 5 pennies.

Proof is something different. Proof is a process of using a formal system to demonstrate that within that formal system, a given statement necessarily follows from a set of premises. If the formal system has a valid model, and you accept the premises, then the proof shows that the conclusion must be true.

In formal terms, a proof operates within a formal system called a logic. The logic consists of:

1. A collection of rules (called syntax rules or formation rules) that define how to construct a valid statement are in the logical language.
2. )

3. A collection of rules (called inference rules) that define how to use true statements to determine other true statements.
4. A collection of foundational true statements called axioms.

Note that “validity”, as mentioned in the syntax rules, is a very different thing from “truth”. Validity means that the statement has the correct structural form. A statement can be valid, and yet be completely meaningless. “The moon is made of green cheese” is a valid sentence, which can easily be rendered in valid logical form, but it’s not true. The classic example of a meaningless statement is “Colorless green ideas sleep furiously”, which is syntactically valid, but utterly meaningless.

Most of the time, when we’re talking about logic and proofs, we’re using a system of logic called first order predicate logic, and a foundational system of axioms called ZFC set theory. Built on those, we define numbers using a collection of definitions called Peano arithmetic.

In Peano arithmetic, we define the natural numbers (that is, the set of non-negative integers) by defining 0 (the cardinality of the empty set), and then defining the other natural numbers using the successor function. In this system, the number zero can be written as $z$; one is $s(z)$ (the successor of $z$); two is the successor of 1: $s(1) = s(s(z))$. And so on.

Using Peano arithmetic, addition is defined recursively:

1. For any number $x$, $x + 0 = x$.
2. For any number numbers x and y: $s(x)+y=x+s(y)$.

So, using peano arithmetic, here’s how we can prove that $2+3=5$:

1. In Peano arithemetic form, $2+3$ means $s(s(z)) + s(s(s(z)))$.
2. From rule 2 of addition, we can infer that $s(s(z)) + s(s(s(z)))$ is the same as $s(z) + s(s(s(s(z))))$. (In numerical syntax, 2+3 is the same as 1+4.)
3. Using rule 2 of addition again, we can infer that $s(z) + s(s(s(s(z)))) = z + s(s(s(s(s(z)))))$ (1+4=0+5); and so, by transitivity, that 2+3=0+5.
4. Using rule 1 of addition, we can then infer that $0+5=5$; and so, by transitivity, 2+3=5.

You can get around this by pointing out that it’s certainly not a proof from first principles. But I’d argue that if you’re talking about the statement “2+3=5” in the terms of the quoted discussion, that you’re clearly already living in the world of FOPL with some axioms that support peano arithmetic: if you weren’t, then the statement “2+3=5” wouldn’t have any meaning at all. For you to be able to argue that it’s true but unprovable, you must be living in a world in which arithmetic works, and that means that the statement is both verifiable and provable.

If you want to play games and argue about axioms, then I’ll point at the Principia Mathematica. The Principia was an ultimately misguided effort to put mathematics on a perfect, sound foundation. It started with a minimal form of predicate logic and a tiny set of inarguably true axioms, and attempted to derive all of mathematics from nothing but those absolute, unquestionable first principles. It took them a ton of work, but using that foundation, you can derive all of number theory – and that’s what they did. It took them 378 pages of dense logic, but they ultimately build a rock-solid model of the natural numbers, and used that to demonstrate the validity of Peano arithmetic, and then in turn used that to prove, once and for all, that 1+1=2. Using the same proof technique, you can show from first principles, that 2+3=5.

But in a world in which we don’t play semantic games, and we accept the basic principle of Peano arithmetic as a given, it’s a simple proof. It’s a simple proof that can be found in almost any textbook on foundational mathematics or logic. But note how Arrington responds to it: by playing word-games, rephrasing the question in a couple of different ways to show off how much he knows, while completely avoiding the point.

What does it take to conclude that you can’t verify or prove something like 2+3=5? Profound, utter ignorance. Anyone who’s spent any time learning math should know better.

But over at Uncommon Descent? They don’t think they need to actually learn stuff. They don’t need to refer to ungodly things like textbook. They’ve got their God, and that’s all they need to know.

To be clear, I’m not anti-religion. I’m a religious Jew. Uncommon Descent and their rubbish don’t annoy me because I have a grudge against theism. I have a grudge against ignorance. And UD is a huge promoter of arrogant, dishonest ignorance.

# Elon Musk’s Techno-Religion

A couple of people have written to me asking me to say something about Elon Musk’s simulation argument.

Unfortunately, I haven’t been able to find a verbatim quote from Musk about his argument, and I’ve seen a couple of slightly different arguments presented as being what Musk said. So I’m not really going to focus so much on Musk, but instead, just going to try to take the basic simulation argument, and talk about what’s wrong with it from a mathematical perspective.

The argument isn’t really all that new. I’ve found a couple of sources that attribute it to a paper published in 2003. That 2003 paper may have been the first academic publication, and it might have been the first to present the argument in formal terms, but I definitely remember discussing this in one of my philophy classes in college in the late 1980s.

Here’s the argument:

1. Any advanced technological civilization is going to develop massive computational capabilities.
2. With immense computational capabilities, they’ll run very detailed simulations of their own ancestors in order to understand where they came from.
3. Once it is possible to run simulations, they will run many of them to explore how different parameters will affect the simulated universe.
4. That means that advanced technological civilization will run many simulations of universes where their ancestors evolved.
5. Therefore the number of simulated universes with intelligent life will be dramatically larger than the number of original non-simulated civilizations.

If you follow that reasoning, then the odds are, for any given form of intelligent life, it’s more likely that they are living in a simulation than in an actual non-simulated universe.

As an argument, it’s pretty much the kind of crap you’d expect from a bunch of half drunk college kids in a middle-of-the-night bullshit session.

Let’s look at a couple of simple problems with it.

The biggest one is a question of size and storage. The heart of this argument is the assumption that for an advanced civilization, nearly infinite computational capability will effectively become free. If you actually try to look at that assumption in detail, it’s not reasonable.

The problem is, we live in a quantum universe. That is, we live in a universe made up of discrete entities. You can take an object, and cut it in half only a finite number of times, before you get to something that can’t be cut into smaller parts. It doesn’t matter how advanced your technology gets; it’s got to be made of the basic particles – and that means that there’s a limit to how small it can get.

Again, it doesn’t matter how advanced your computers get; it’s going to take more than one particle in the real universe to simulate the behavior of a particle. To simulate a universe, you’d need a computer bigger than the universe you want to simulate. There’s really no way around that: you need to maintain state information about every particle in the universe. You need to store information about everything in the universe, and you need to also have some amount of hardware to actually do the simulation with the state information. So even with the most advanced technology that you can possible imagine, you can’t possible to better than one particle in the real universe containing all of the state information about a particle in the simulated universe. If you did, then you’d be guaranteeing that your simulated universe wasn’t realistic, because its particles would have less state than particles in the real universe.

This means that to simulate something in full detail, you effectively need something bigger than the thing you’re simulating.

That might sound silly: we do lots of things with tiny computers. I’ve got an iPad in my computer bag with a couple of hundred books on it: it’s much smaller than the books it simulates, right?

The “in full detail” is the catch. When my iPad simulates a book, it’s not capturing all the detail. It doesn’t simulate the individual pages, much less the individual molecules that make up those pages, the individual atoms that make up those molecules, etc.

But when you’re talking about perfectly simulating a system well enough to make it possible for an intelligent being to be self-aware, you need that kind of detail. We know, from our own observations of ourselves, that the way our cells operates is dependent on incredibly fine-grained sub-molecular interactions. To make our bodies work correctly, you need to simulate things on that level.

You can’t simulate the full detail of a universe bigger that the computer that simulates it. Because the computer is made of the same things as the universe that it’s simulating.

There’s a lot of handwaving you can do about what things you can omit from your model. But at the end of the day, you’re looking at an incredibly massive problem, and you’re stuck with the simple fact that you’re talking, at least, about building a computer that can simulate an entire planet and its environs. And you’re trying to do it in a universe just like the one you’re simulating.

But OK, we don’t actually need to simulate the whole universe, right? I mean, you’re really interested in developing a single species like yourself, so you only care about one planet.

But to make that planet behave absolutely correctly, you need to be able to correctly simulate everything observable from that planet. Its solar system, you need to simulate pretty precisely. The galaxy around it needs less precision, but it still needs a lot of work. Even getting very far away, you’ve got an awful lot of stuff to simulate, because your simulated intelligences, from their little planet, are going to be able to observe an awful lot.

To simulate a planet and its environment with enough precision to get life and intelligence and civilization, and to do it at a reasonable speed, you pretty much need to have a computer bigger than the planet. You can cheat a little bit, and maybe abstract parts of the planet; but you’ve got to do pretty good simulations of lots of stuff outside the planet.

It’s possible, but it’s not particularly useful. Because you need to run that simulation. And since it’s made up of the same particles as the things it’s simulating, it can’t move faster than the universe it simulates. To get useful results, you’d need to build it to be massively parallel. And that means that your computer needs to be even larger – something like a million times bigger.

If technology were to get good enough, you could, in theory, do that. But it’s not going to be something you do a lot of: no matter how advanced technology gets, building a computer that can simulate an entire planet and its people in full detail is going to be a truly massive undertaking. You’re not going to run large numbers of simulations.

You can certainly wave you hands and say that the “real” people live in a universe without the kind of quantum limit that we live with. But if you do, you’re throwing other assumptions out the window. You’re not talking about ancestor simulation any more. And you’re pretending that you can make predictions based on our technology about the technology of people living in a universe with dramatically different properties.

This just doesn’t make any sense. It’s really just techno-religion. It’s based on the belief that technology is going to continue to develop computational capability without limit. That the fundamental structure of the universe won’t limit technology and computation. Essentially, it’s saying that technology is omnipotent. Technology is God, and just as in any other religion, it’s adherents believe that you can’t place any limits on it.

Rubbish.

# One plus one equals Two?

My friend Dr24hours sent me a link via twitter to a new piece of mathematical crackpottery. It’s the sort of thing that’s so trivial that I might lust ignore it – but it’s also a good example of something that someone commented on in my previous post.

This comes from, of all places, Rolling Stone magazine, in a puff-piece about an actor named Terrence Howard. When he’s not acting, Mr. Howard believes that he’s a mathematical genius who’s caught on to the greatest mathematical error of all time. According to Mr. Howard, the product of one times one is not one, it’s two.

After high school, he attended Pratt Institute in Brooklyn, studying chemical engineering, until he got into an argument with a professor about what one times one equals. “How can it equal one?” he said. “If one times one equals one that means that two is of no value because one times itself has no effect. One times one equals two because the square root of four is two, so what’s the square root of two? Should be one, but we’re told it’s two, and that cannot be.” This did not go over well, he says, and he soon left school. “I mean, you can’t conform when you know innately that something is wrong.”

I don’t want to harp on Mr. Howard too much. He’s clueless, but sadly, he’s a not too atypical student of american schools. I’ll take a couple of minutes to talk about what’s wrong with his stuff, but in context of a discussion of where I think this kind of stuff comes from.

In American schools, math is taught largely by rote. When I was a kid, set theory came into vogue, but by and large math teachers didn’t understand it – so they’d draw a few meaningless Venn diagrams, and then switch into pure procedure.

An example of this from my own life involves my older brother. My brother is not a dummy – he’s a very smart guy. He’s at least as smart as I am, but he’s interested in very different things, and math was never one of his interests.

I barely ever learned math in school. My father noticed pretty early on that I really enjoyed math, and so he did math with me for fun. He taught me stuff – not as any kind of “they’re not going to teach it right in school”, but just purely as something fun to do with a kid who was interested. So I learned a lot of math – almost everything up through calculus – from him, not from school. My brother didn’t – because he didn’t enjoy math, and so my dad did other things with him.

When we were in high school, my brother got a job at a local fast food joint. At the end of the year, he had to do his taxes, and my dad insisted that he do it himself. When he needed to figure out how much tax he owed on his income, he needed to compute a percentage. I don’t know the numbers, but for the sake of the discussion, let’s say that he made $5482 that summer, and the tax rate on that was 18%. He wrote down a pair of ratios: $\frac{18}{100} = \frac{x}{5482}$ And then he cross-multiplied, getting: $18 \times 5482 = 100 \times x$ $98676 = 100 \times x$ and so $x = 986.76$. My dad was shocked by this – it’s such a laborious way of doing it. So he started pressing at my brother. He asked him, if you went to a store, and they told you there was a 20% off sale on a pair of jeans that cost$18, how much of a discount would you get? He didn’t know. The only way he knew to figure it out was to do the whole ratios, cross-multiply, and solve. If you told him that 20% off of $18 was$5, he would have believed you. Because percentages just didn’t mean anything to him.

Now, as I said: my brother isn’t a dummy. But none of his math teachers had every taught him what percentages meant. He had no concept of their meaning: he knew a procedure for getting the value, but it was a completely blind procedure, devoid of meaning. And that’s what everything he’d learned about math was like: meaningless procedures performed by rote, without any comprehension.

That’s where nonsense like Terence Howard’s stuff comes from: math education that never bothered to teach students what anything means. If anyone had attempted to teach any form of meaning for arithmetic, the ridiculous of Mr. Howard’s supposed mathematics would be obvious.

For understanding basic arithmetic, I like to look at a geometric model of numbers.

Put a dot on a piece of paper. Label it “0”. Draw a line starting at zero, and put tick-marks on the line separated by equal distances. Starting at the first mark after 0, label the tick-marks 1, 2, 3, 4, 5, ….

In this model, the number one is the distance from 0 (the start of the line) to 1. The number two is the distance from 0 to 2. And so on.

Addition is just stacking lines, one after the other. Suppose you wanted to add 3 + 2. You draw a line that’s 3 tick-marks long. Then, starting from the end of that line, you draw a second line that’s 2 tick-marks long. 3 + 2 is the length of the resulting line: by putting it next to the original number-line, we can see that it’s five tick-marks long, so 3 + 2 = 5.

Multiplication is a different process. In multiplication, you’re not putting lines tip-to-tail: you’re building rectangles. If you want to multiply 3 * 2, what you do is draw a rectangle who’s width is 3 tick-marks long, and whose height is 2 tick-marks long. Now divide that into squares that are 1 tick-mark by one tick-mark. How many squares can you fit into that rectangle? 6. So 3*2 = 6.

Why does 1 times 1 equal 1? Because if you draw a rectangle that’s one hash-mark wide, and one hash-mark high, it forms exactly one 1×1 square. 1 times 1 can’t be two: it forms one square, not two.

If you think about the repercussions of the idea that 1*1=2, as long as you’re clear about meanings, it’s pretty obvious that 1*1=2 has a disastrously dramatic impact on math: it turns all of math into a pile of gibberish.

What’s 1*2? 2. 1*1=2 and 1*2=2, therefore 1=2. If 1=2, then 2=3, 3=4, 4=5: all integers are equal. If that’s true, then… well, numbers are, quite literally, meaningless. Which is quite a serious problem, unless you already believe that numbers are meaningless anyway.

In my last post, someone asked why I was so upset about the error in a math textbook. This is a good example of why. The new common core math curriculum, for all its flaws, does a better job of teaching understanding of math. But when the book teaches “facts” that are wrong, what they’re doing becomes the opposite. It doesn’t make sense – if you actually try to understand it, you just get more confused.

That teaches you one of two things. Either it teaches you that understanding this stuff is futile: that all you can do is just learn to blindly reproduce the procedures that you were taught, without understanding why. Or it teaches you that no one really understands any of it, and that therefore nothing that anyone tells you can possibly be trusted.

# Bad Math Books and Cantor Cardinality

A bunch of readers sent me a link to a tweet this morning from Professor Jordan Ellenberg:

The tweet links to the following image:

(And yes, this is real. You can see it in context here.)

This is absolutely infuriating.

This is a photo of a problem assignment in a math textbook published by an imprint of McGraw-Hill. And it’s absolutely, unquestionably, trivially wrong. No one who knew anything about math looked at this before it was published.

The basic concept underneath this is fundamental: it’s the cardinality of sets from Cantor’s set theory. It’s an extremely important concept. And it’s a concept that’s at the root of a huge amount of misunderstandings, confusion, and frustration among math students.

Cardinality, and the notion of cardinality relations between infinite sets, are difficult concepts, and they lead to some very un-intuitive results. Infinity isn’t one thing: there are different sizes of infinities. That’s a rough concept to grasp!

Here on this blog, I’ve spent more time dealing with people who believe that it must be wrong – a subject that I call Cantor crackpottery – than with any other bad math topic. This error teaches students something deeply wrong, and it encourages Cantor crackpottery!

Let’s review.

Cantor said that two collections of things are the same size if it’s possible to create a one-to-one mapping between the two. Imagine you’ve got a set of 3 apples and a set of 3 oranges. They’re the same size. We know that because they both have 3 elements; but we can also show it by setting aside pairs of one apple and one orange – you’ll get three pairs.

The same idea applies when you look at infinitely large sets. The set of positive integers and the set of negative integers are the same size. They’re both infinite – but we can show how you can create a one-to-one relation between them: you can take any positive integer $i$, and map it to exactly one negative integer, $0 - i$.

That leads to some unintuitive results. For example, the set of all natural numbers and the set of all even natural numbers are the same size. That seems crazy, because the set of all even natural numbers is a strict subset of the set of natural numbers: how can they be the same size?

But they are. We can map each natural number $i$ to exactly one even natural number $2i$. That’s a perfect one-to-one map between natural numbers and even natural numbers.

Where it gets uncomfortable for a lot of people is when we start thinking about real numbers. The set of real numbers is infinite. Even the set of real numbers between 0 and 1 is infinite! But it’s also larger than the set of natural numbers, which is also infinite. How can that be?

The answer is that Cantor showed that for any possible one-to-one mapping between the natural numbers and the real numbers between 0 and 1, there’s at least one real number that the mapping omitted. No matter how you do it, all of the natural numbers are mapped to one value in the reals, but there’s at least one real number which is not in the mapping!

In Cantor set theory, that means that the size of the set of real numbers between 0 and 1 is strictly larger than the set of all natural numbers. There’s an infinity bigger than infinity.

I think that this is what the math book in question meant to say: that there’s no possible mapping between the natural numbers and the real numbers. But it’s not what they did say: what they said is that there’s no possible map between the integers and the fractions. And that is not true.

Here’s how you generate the mapping between the integers and the rational numbers (fractions) between 0 and 1, written as a pseudo-Python program:

 i = 0
for denom in Natural:
for num in 1 .. denom:
if num is relatively prime with denom:
print("%d => %d/%d" % (i, num, denom))
i += 1


It produces a mapping (0 => 0, 1 => 1, 2 => 1/2, 3 => 1/3, 4 => 2/3, 5 => 1/4, 6 => 3/4, …). It’ll never finish running – but you can easily show that for any possible fraction, there’ll be exactly one integer that maps to it.

That means that the set of all rational numbers between 0 and 1 is the same size as the set of all natural numbers. There’s a similar way of producing a mapping between the set of all fractions and the set of natural numbers – so the set of all fractions is the same size as the set of natural numbers. But both are smaller than the set of all real numbers, because there are many, many real numbers that cannot be written as fractions. (For example, $\pi$. Or the square root of 2. Or $e$. )

This is terrible on multiple levels.

1. It’s a math textbook written and reviewed by people who don’t understand the basic math that they’re writing about.
2. It’s teaching children something incorrect about something that’s already likely to confuse them.
3. It’s teaching something incorrect about a topic that doesn’t need to be covered at all in the textbook. This is an algebra-2 textbook. You don’t need to cover Cantor’s infinite cardinalities in Algebra-2. It’s not wrong to cover it – but it’s not necessary. If the authors didn’t understand cardinality, they could have just left it out.
4. It’s obviously wrong. Plenty of bright students are going to come up with the the mapping between the fractions and the natural numbers. They’re going to come away believing that they’ve disproved Cantor.

I’m sure some people will argue with that last point. My evidence in support of it? I came up with a proof of that in high school. Fortunately, my math teacher was able to explain why it was wrong. (Thanks Mrs. Stevens!) Since I write this blog, people assume I’m a mathematician. I’m not. I’m just an engineer who really loves math. I was a good math student, but far from a great one. I’d guess that every medium-sized high school has at least one math student every year who’s better than I was.

The proof I came up with is absolutely trivial, and I’d expect tons of bright math-geek kids to come up with something like it. Here goes:

1. The set of fractions is a strict subset of the set of ordered pairs of natural numbers.
2. So: if there’s a one-to-one mapping between the set of ordered pairs and the naturals, then there must be a one-to-one mapping between the fractions and the naturals.
3. On a two-d grid, put the natural numbers across, and then down.
4. Zigzag diagonally through the grid, forming pairs of the horizontal position and the vertical position: (0,0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2), (3, 0), (2, 1), (1, 2), (0, 3).
5. This will produce every possible ordered pair of natural numbers. For each number in the list, produce a mapping between the position in the list, and the pair. So (0, 0) is 0, (2, 0) is 3, etc.

As a proof, it’s sloppy – but it’s correct. And plenty of high school students will come up with something like it. How many of them will walk away believing that they just disproved Cantor?