The other day, I received an email that actually excited me! It’s a question related to Cantor’s diagonalization, but there’s absolutely nothing cranky about it! It’s something interesting and subtle. So without further ado:

Cantor’s diagonalization says that you can’t put the reals into 1 to 1 correspondence with the integers. The well-ordering theorem seems to suggest that you can pick a least number from every set including the reals, so why can’t you just keep picking least elements to put them into 1 to 1 correspondence with the reals. I understand why Cantor says you can’t. I just don’t see what is wrong with the other arguments (other than it must be wrong somehow). Apologies for not being able to state the argument in formal maths, I’m around 20 years out of practice for formal maths.

As we’ve seen in too many discussions of Cantor’s diagonalization, it’s a proof that shows that it is impossible to create a one-to-one correspondence between the natural numbers and the real numbers.

The Well-ordering says something that seems innoccuous at first, but which, looked at in depth, really does appear to contradict Cantor’s diagonalization.

A set $S$ is well-ordered if there exists a total ordering $<=$ on the set, with the additional property that for any subset $T \subseteq S$, $T$ has a smallest element.

The well-ordering theorem says that every non-empty set can be well-ordered. Since the set of real numbers is a set, that means that there exists a well-ordering relation over the real numbers.

The problem with that is that it appears that that tells you a way of producing an enumeration of the reals! It says that the set of all real numbers has a least element: Bingo, there’s the first element of the enumeration! Now you take the set of real numbers excluding that one, and it has a least element under the well-ordering relation: there’s the second element. And so on. Under the well-ordering theorem, then, every set has a least element; and every element has a unique successor! Isn’t that defining an enumeration of the reals?

The solution to this isn’t particularly satisfying on an intuitive level.

The well-ordering theorem is, mathematically, equivalent to the axiom of choice. And like the axiom of choice, it produces some very ugly results. It can be used to create “existence” proofs of things that, in a practical sense, don’t exist in a usable form. It proves that something exists, but it doesn’t prove that you can ever produce it or even identify it if it’s handed to you.

So there is an enumeration of the real numbers under the well ordering theorem. Only the less-than relation used to define the well-ordering is not the standard real-number less than operation. (It obviously can’t be, because under well-ordering, every set has a least element, and standard real-number less-than doesn’t have a least element.) In fact, for any ordering relation $\le_x$ that you can define, describe, or compute, $\le_x$ is not the well-ordering relation for the reals.

Under the well-ordering theorem, the real numbers have a well-ordering relation, only you can’t ever know what it is. You can’t define any element of it; even if someone handed it to you, you couldn’t tell that you had it.

It’s very much like the Banach-Tarski paradox: we can say that there’s a way of doing it, only we can’t actually do it in practice. In the B-T paradox, we can say that there is a way of cutting a sphere into these strange pieces – but we can’t describe anything about the cut, other than saying that it exists. The well-ordering of the reals is the same kind of construct.

How does this get around Cantor? It weasels its way out of Cantor by the fact that while the well-ordering exists, it doesn’t exist in a form that can be used to produce an enumeration. You can’t get any kind of handle on the well-ordering relation. You can’t produce an enumeration from something that you can’t create or identify – just like you can’t ever produce any of the pieces of the Banach-Tarski cut of a sphere. It exists, but you can’t use it to actually produce an enumeration. So the set of real numbers remains non-enumerable even though it’s well-ordered.

If that feels like a cheat, well… That’s why a lot of people don’t like the axiom of choice. It produces cheatish existence proofs. Connecting back to something I’ve been trying to write about, that’s a big part of the reason why intuitionistic type theory exists: it’s a way of constructing math without stuff like this. In an intuitionistic type theory (like the Martin-Lof theory that I’ve been writing about), it doesn’t exist if you can’t construct it.

# Understanding Global Warming Scale Issues

Aside from the endless stream of Cantor cranks, the next biggest category of emails I get is from climate “skeptics”. They all ask pretty much the same question. For example, here’s one I received today:

My personal analysis, and natural sceptisism tells me, that there are something fundamentally wrong with the entire warming theory when it comes to the CO2.

If a gas in the atmosphere increase from 0.03 to 0.04… that just cant be a significant parameter, can it?

I generally ignore it, because… let’s face it, the majority of people who ask this question aren’t looking for a real answer. But this one was much more polite and reasonable than most, so I decided to answer it. And once I went to the trouble of writing a response, I figured that I might as well turn it into a post as well.

The current figures – you can find them in a variety of places from wikipedia to the US NOAA – are that the atmosphere CO2 has changed from around 280 parts per million in 1850 to 400 parts per million today.

Why can’t that be a significant parameter?

There’s a couple of things to understand to grasp global warming: how much energy carbon dioxide can trap in the atmosphere, and hom much carbon dioxide there actually is in the atmosphere. Put those two facts together, and you realize that we’re talking about a massive quantity of carbon dioxide trapping a massive amount of energy.

The problem is scale. Humans notoriously have a really hard time wrapping our heads around scale. When numbers get big enough, we aren’t able to really grasp them intuitively and understand what they mean. The difference between two numbers like 300 and 400ppm is tiny, we can’t really grasp how in could be significant, because we aren’t good at taking that small difference, and realizing just how ridiculously large it actually is.

If you actually look at the math behind the greenhouse effect, you find that some gasses are very effective at trapping heat. The earth is only habitable because of the carbon dioxide in the atmosphere – without it, earth would be too cold for life. Small amounts of it provide enough heat-trapping effect to move us from a frozen rock to the world we have. Increasing the quantity of it increases the amount of heat it can trap.

Let’s think about what the difference between 280 and 400 parts per million actually means at the scale of earth’s atmosphere. You hear a number like 400ppm – that’s 4 one-hundreds of one percent – that seems like nothing, right? How could that have such a massive effect?!

But like so many other mathematical things, you need to put that number into the appropriate scale. The earths atmosphere masses roughly 5 times 10^21 grams. 400ppm of that scales to 2 times 10^18 grams of carbon dioxide. That’s 2 billion trillion kilograms of CO2. Compared to 100 years ago, that’s about 800 million trillion kilograms of carbon dioxide added to the atmosphere over the last hundred years. That’s a really, really massive quantity of carbon dioxide! scaled to the number of particles, that’s something around 10^40th (plus or minus a couple of powers of ten – at this scale, who cares?) additional molecules of carbon dioxide in the atmosphere. It’s a very small percentage, but it’s a huge quantity.

When you talk about trapping heat, you also have to remember that there’s scaling issues there, too. We’re not talking about adding 100 degrees to the earths temperature. It’s a massive increase in the quantity of energy in the atmosphere, but because the atmosphere is so large, it doesn’t look like much: just a couple of degrees. That can be very deceptive – 5 degrees celsius isn’t a huge temperature difference. But if you think of the quantity of extra energy that’s being absorbed by the atmosphere to produce that difference, it’s pretty damned huge. It doesn’t necessarily look like all that much when you see it stated at 2 degrees celsius – but if you think of it terms of the quantity of additional energy being trapped by the atmosphere, it’s very significant.

Calculating just how much energy a molecule of CO2 can absorb is a lot trickier than calculating the mass-change of the quantity of CO2 in the atmosphere. It’s a complicated phenomenon which involves a lot of different factors – how much infrared is absorbed by an atom, how quickly that energy gets distributed into the other molecules that it interacts with… I’m not going to go into detail on that. There’s a ton of places, like here, where you can look up a detailed explanation. But when you consider the scale issues, it should be clear that there’s a pretty damned massive increase in the capacity to absorb energy in a small percentage-wise increase in the quantity of CO2.

# Okonomilatkes!

I’m working on some type theory posts, but it’s been slow going.

In the meantime, it’s Chanukah time. Every year, my family makes me cook potato latkes for Chanukah. The problem with that is, I don’t particularly like potato latkes. This year, I came up with the idea of trying to tweak them into something that I’d actually enjoy eating. What I came up with is combining a latke with another kind of fried savory pancake that I absolutely love: the japanese Okonomiyaki. The result? Okonomilatkes.

Ingredients:

• 1/2 head green cabbage, finely shredded.
• 1 1/2 pounds potatoes
• 1/2 cup flour
• 1/2 cup water
• 1 beaten egg
• 1/2 pound crabstick cut into small pieces
• Tonkatsu sauce (buy it at an asian grocery store in the japanese section. The traditional brand has a bulldog logo on the bottle.)
• Katsubuoshi (shredded bonito)
• Japanese mayonaise (sometimes called kewpie mayonaise. You can find it in squeeze bottles in any asian grocery. Don’t substitute American mayo – Japanese mayo is thinner, less oily, a bit tart, sweeter, and creamier. It’s really pretty different.)
• 1 teaspoon salt
• 1/2 teaspoon baking powder.

Instructions

1. In a very hot pan, add about a tablespoon of oil, and when it’s nearly smoking, add the cabbage. Saute until the cabbage wilts and starts to brown. Remove from the heat, and set aside to cool.
2. Using either the grater attachment of a food processor, or the coarse side of a box grater, shred the potatoes. (I leave the skins on, but if that bugs you, peel them first).
3. Squeeze as much water as you can out of the shredded potatoes.
4. Mix together the water, flour, baking powder, egg, and salt into a thin batter.
5. Add the potatoes, cabbage, and crabstick to the batter, and stir together.
6. Split this mixture into four portions.
7. Heat a nonstick pan on medium high heat, add a generous amount of oil, and add one quarter of the batter. Let it cook until nicely browned, then flip, and cook the other side. On my stove, it takes 3-5 minutes per side. Add oil as needed while it’s cooking.
8. Repeat with the other 3 portions
9. To serve, put a pancake on a plate. Squeeze a bunch of stripes of mayonaise, then add a bunch of the tonkatsu sauce, and sprinkle with the katsubuoshi.

# Polls and Sampling Errors in the Presidental Debate Results

My biggest pet peeve is press coverage of statistics. As someone who is mathematically literate, I’m constantly infuriated by it. Basic statistics isn’t that hard, but people can’t be bothered to actually learn a tiny bit in order to understand the meaning of the things they’re covering.

My twitter feed has been exploding with a particularly egregious example of this. After monday night’s presidential debate, there’s been a ton of polling about who “won” the debate. One conservative radio host named Bill Mitchell has been on a rampage about those polls. Here’s a sample of his tweets:

Statistical analysis has a very simple point. We’re interested in understanding the properties of a large population of things. For whatever reason, we can’t measure the properties of every object in that population.

The exact reason can vary. In political polling, we can’t ask every single person in the country who they’re going to vote for. (Even if we could, we simply don’t know who’s actually going to show up and vote!) For a very different example, my first exposure to statistics was through my father, who worked in semiconductor manufacturing. They’d produce a run of 10,000 chips for use in Satellites. They needed to know when, on average, a chip would fail from exposure to radiation. If they measured that in every chip, they’d end up with nothing to sell.)

Anyway: you can’t measure every element of the population, but you still want to take measurements. So what you do is randomly select a collection of representative elements from the population, and you measure those. Then you can say that with a certain probability, the result of analyzing that representative subset will match the result that you’d get if you measured the entire population.

How close can you get? If you’ve really selected a random sample of the population, then the answer depends on the size of the sample. We measure that using something called the “margin of error”. “Margin of error” is actually a terrible name for it, and that’s the root cause of one of the most common problems in reporting about statistics. The margin of error is a probability measurement that says “there is an $N$% probability that the value for the full population lies within the margin of error of the measured value of the sample.”.

Right away, there’s a huge problem with that. What is that variable doing in there? The margin of error measures the probability that the full population value is within a confidence interval around the measured sample value. If you don’t say what the confidence interval is, the margin of error is worthless. Most of the time – but not all of the time – we’re talking about a 95% confidence interval.

But there are several subtler issues with the margin of error, both due to the name.

1. The “true” value for the full population is not guaranteed to be within the margin of error of the sampled value. It’s just a probability. There is no hard bound on the size of the error: just a high probability of it being within the margin..
2. The margin of error only includes errors due to sample size. It does not incorporate any other factor – and there are many! – that may have affected the result.
3. The margin of error is deeply dependent on the way that the underlying sample was taken. It’s only meaningful for a random sample. That randomness is critically important: all of sampled statistics is built around the idea that you’ve got a randomly selected subset of your target population.

Let’s get back to our friend the radio host, and his first tweet, because he’s doing a great job of illustrating some of these errors.

The quality of a sampled statistic is entirely dependent on how well the sample matches the population. The sample is critical. It doesn’t matter how big the sample size is if it’s not random. A non-random sample cannot be treated as a representative sample.

So: an internet poll, where a group of people has to deliberately choose to exert the effort to participate cannot be a valid sample for statistical purposes. It’s not random.

It’s true that the set of people who show up to vote isn’t a random sample. But that’s fine: the purpose of an election isn’t to try to divine what the full population thinks. It’s to count what the people who chose to vote think. It’s deliberately measuring a full population: the population of people who chose to vote.

But if you’re trying to statistically measure something about the population of people who will go and vote, you need to take a randomly selected sample of people who will go to vote. The set of voters is the full population; you need to select a representative sample of that population.

Internet polls do not do that. At best, they measure a different population of people. (At worst, with ballot stuffing, they measure absolutely nothing, but we’ll give them this much benefit of the doubt.) So you can’t take much of anything about the sample population and use it to reason about the full population.

And you can’t say anything about the margin of error, either. Because the margin of error is only meaningful for a representative sample. You cannot compute a meaningful margin of error for a non-representative sample, because there is no way of knowing how that sampled population compares to the true full target population.

And that brings us to the second tweet. A properly sampled random population of 500 people can produce a high quality result with a roughly 5% margin of error and a 95% confidence interval. (I’m doing a back-of-the-envelope calculation here, so that’s not precise.) That means that if the population were randomly sampled, we could say there is in 19 out of 20 polls of that size, the full population value would be within +/- 4% of value measured by the poll. For a non-randomly selected sample of 10 million people, the margin of error cannot be measured, because it’s meaningless. The random sample of 500 people tells us a reasonable estimate based on data; the non-random sample of 10 million people tells us nothing.

And with that, on to the third tweet!

In a poll like this, the margin of error only tells us one thing: what’s the probability that the sampled population will respond to the poll in the same way that the full population would?

There are many, many things that can affect a poll beyond the sample size. Even with a truly random and representative sample, there are many things that can affect the outcome. For a couple of examples:

How, exactly, is the question phrased? For example, if you ask people “Should police shoot first and ask questions later?”, you’ll get a very different answer from “Should police shoot dangerous criminal suspects if they feel threatened?” – but both of those questions are trying to measure very similar things. But the phrasing of the questions dramatically affects the outcome.

What context is the question asked in? Is this the only question asked? Or is it asked after some other set of questions? The preceding questions can bias the answers. If you ask a bunch of questions about how each candidate did with respect to particular issues before you ask who won, those preceding questions will bias the answers.

When you’re looking at a collection of polls that asked different questions in different ways, you expect a significant variation between them. That doesn’t mean that there’s anything wrong with any of them. They can all be correct even though their results vary by much more than their margins of error, because the margin of error has nothing to do with how you compare their results: they used different samples, and measured different things.

The problem with the reporting is the same things I mentioned up above. The press treats the margin of error as an absolute bound on the error in the computed sample statistics (which it isn’t); and the press pretends that all of the polls are measuring exactly the same thing, when they’re actually measuring different (but similar) things. They don’t tell us what the polls are really measuring; they don’t tell us what the sampling methodology was; and they don’t tell us the confidence interval.

Which leads to exactly the kind of errors that Mr. Mitchell made.

And one bonus. Mr. Mitchell repeatedly rants about how many polls show a “bias” by “over-sampling< democratic party supporters. This is a classic mistake by people who don't understand statistics. As I keep repeating, for a sample to be meaningful, it must be random. You can report on all sorts of measurements of the sample, but you cannot change it.

If you’re randomly selecting phone numbers and polling the respondents, you cannot screen the responders based on their self-reported party affiliation. If you do, you are biasing your sample. Mr. Mitchell may not like the results, but that doesn’t make them invalid. People report what they report.

In the last presidential election, we saw exactly this notion in the idea of “unskewing” polls, where a group of conservative folks decided that the polls were all biased in favor of the democrats for exactly the reasons cited by Mr. Mitchell. They recomputed the poll results based on shifting the samples to represent what they believed to be the “correct” breakdown of party affiliation in the voting population. The results? The actual election results closely tracked the supposedly “skewed” polls, and the unskewers came off looking like idiots.

We also saw exactly this phenomenon going on in the Republican primaries this year. Randomly sampled polls consistently showed Donald Trump crushing his opponents. But the political press could not believe that Donald Trump would actually win – and so they kept finding ways to claim that the poll samples were off: things like they were off because they used land-lines which oversampled older people, and if you corrected for that sampling error, Trump wasn’t actually winning. Nope: the randomly sampled polls were correct, and Donald Trump is the republican nominee.

If you want to use statistics, you must work with random samples. If you don’t, you’re going to screw up the results, and make yourself look stupid.

# Why we need formality in mathematics

The comment thread from my last Cantor crankery post has continued in a way that demonstrates a common issue when dealing with bad math, so I thought it was worth taking the discussion and promoting it to a proper top-level post.

The defender of the Cantor crankery tried to show what he alleged to be the problem with Cantor, by presenting a simple proof:

If we have a unit line, then this line will have an infinite number of points in it. Some of these points will be an irrational distance away from the origin and some will be a rational distance away from the origin.

Premise 1.

To have more irrational points on this line than rational points (plus 1), it is necessary to have at least two irrational points on the line so that there exists no rational point between them.

Premise 2.

It is not possible to have two irrational points on a line so that no rational point exists between them.

Conclusion.

It is not possible to have more irrational points on a line than rational points (plus 1).

This contradicts Cantor’s conclusion, so Cantor must have made a mistake in his reasoning.

(I’ve done a bit of formatting of this to make it look cleaner, but I have not changed any of the content.)

This is not a valid proof. It looks nice on the surface – it intuitively feels right. But it’s not. Why?

Because math isn’t intuition. Math is a formal system. When we’re talking about Cantor’s diagonalization, we’re working in the formal system of set theory. In most modern math, we’re specifically working in the formal system of Zermelo-Fraenkel (ZF) set theory. And that “proof” relies on two premises, which are not correct in ZF set theory. I pointed this out in verbose detail, to which the commenter responded:

I can understand your desire for a proof to be in ZFC, Peano arithmetic and FOPL, it is a good methodology but not the only one, and I am certain that it is not a perfect one. You are not doing yourself any favors if you let any methodology trump understanding. For me it is far more important to understand a proof, than to just know it “works” under some methodology that simply manipulates symbols.

This is the point I really wanted to get to here. It’s a form of complaint that I’ve seen over and over again – not just in the Cantor crankery, but in nearly all of the math posts.

There’s a common belief among crackpots of various sorts that scientists and mathematicians use symbols and formalisms just because we like them, or because we want to obscure things and make simple things seem complicated, so that we’ll look smart.

That’s just not the case. We use formalisms and notation because they are absolutely essential. We can’t do math without the formalisms; we could do it without the notation, but the notation makes things clearer than natural language prose.

The reason for all of that is because we want to be correct.

If we’re working with a definition that contains any vagueness – even the most subtle unintentional kind (or, actually, especially the most subtle unintentional kind!) – then we can easily produce nonsense. There’s a simple “proof” that we’ve discussed before that shows that 0 is equal to 1. It looks correct when you read it. But it contains a subtle error. If we weren’t being careful and formal, that kind of mistake can easily creep in – and once you allow one, single, innocuous looking error into a proof, the entire proof falls apart. The reason for all the formalism and all the notation is to give us a way to unambiguously, precisely state exactly what we mean. The reason that we insist of detailed logical step-by-step proofs is because that’s the only way to make sure that we aren’t introducing errors.

We can’t rely on intuition, because our intuition is frequently not correct. That’s why we use logic. We can’t rely on informal statements, because informal statements lack precision: they can mean many different things, some of which are true, and some of which are not.

In the case of Cantor’s diagonalization, when we’re being carefully precise, we’re not talking about the size of things: we’re talking about the cardinality of sets. That’s an important distinction, because “size” can mean many different things. Cardinality means one, very precise thing.

Similarly, we’re talking about the cardinality of the set of real numbers compared to the cardinality of the set of natural numbers. When I say that, I’m not just hand-waving the real numbers: the real numbers means something very specific: it’s the unique complete totally ordered field $(R, +, *, <)$ up to isomorphism. To understand that, we’re implicitly referencing the formal definition of a field (with all of its sub-definitions) and the formal definitions of the addition, multiplication, and ordering operations.

I’m not just saying that to be pedantic. I’m saying that because we need to know exactly what we’re talking about. It’s very easy to put together an informal definition of the real numbers that’s different from the proper mathematical set of real numbers. For example, you can define a number system consisting of the set of all numbers that can be generated by a finite, non-terminating computer program. Intuitively, it might seem like that’s just another way of describing the real numbers – but it describes a very different set.

Beyond just definitions, we insist on using formal symbolic logic for a similar reason. If we can reduce the argument to symbolic reasoning, then we’ve abstracted away anything that could bias or deceive us. The symbolic logic makes every statement absolutely precise, and every reasoning step pure, precise, and unbiased.

So what’s wrong with the “proof” above? It’s got two premises. Let’s just look at the first one: “To have more irrational points on this line than rational points (plus 1), it is necessary to have at least two irrational points on the line so that there exists no rational point between them.”.

If this statement is true, then Cantor’s proof must be wrong. But is this statement true? The commenter’s argument is that it’s obviously intuitively true.

If we weren’t doing math, that might be OK. But this is math. We can’t just rely on our intuition, because we know that our intuition is often wrong. So we need to ask: can you prove that that’s true?

And how do you prove something like that? Well, you start with the basic rules of your proof system. In a discussion of a set theory proof, that means ZF set theory and first order predicate logic. Then you add in the definitions you need to talk about the objects you’re interested in: so Peano arithmetic, rational numbers, real number theory, and the definition of irrational numbers in real number theory. That gives you a formal system that you can use to talk about the sets of real numbers, rational numbers, and natural numbers.

The problem for our commenter is that you can’t prove that premise using ZF logic, FOPL, and real number theory. It’s not true. It’s based on a faulty understanding of the behavior of infinite sets. It’s taking an assumption that comes from our intuition, which seems reasonable, but which isn’t actually true within the formal system o mathematics.

In particular, it’s trying to say that in set theory, the cardinality of the set of real numbers is equal to the cardinality of the set of natural numbers – but doing so by saying “Ah, Why are you worrying about that set theory nonsense? Sure, it would be nice to prove this statement about set theory using set theory, but you’re just being picky on insisting that.”

Once you really see it in these terms, it’s an absurd statement. It’s equivalent to something as ridiculous as saying that you don’t need to modify verbs by conjugating them when you speak english, because in Chinese, the spoken words don’t change for conjugation.

# Noodles with Dried Shrimp and Scallion Oil

During July, my kids go away to camp. So my wife and I have the opportunity to try new restaurants without having to drag the munchkins around. This year, we tried out a new chinese place in Manhattan, called Hao noodle house. One of the dishes we had was a simple noodle dish: noodles lightly dressed with soy sauce and scallion oil, and then topped with a scattering of scallion and dried shrimp.

Dried shrimp are, in my opinion, a very undervalued and underused ingredient. They’re very traditional in a lot of real Chinese cooking, and they give things a really nice taste. They’ve also got an interesting, pleasant chewy texture. So when there was a dried shrimp dish on the menu, I wanted it. (The restaurant also had dan dan noodles, which are a favorite of my wife, but she was kind, and let me indulge.)

The dish was absolutely phenomenal. So naturally I wanted to figure out how to make it at home. I finally got around to doing it tonight, and I got really lucky: everything worked out perfectly, and it turned out almost exactly like the restaurant. My wife picked the noodles at the chinese grocery that looked closest, and they were exactly right. I guessed at the ingredients from the flavors, and somehow managed to get them spot on on the first try.

That kind of thing almost never happens! It always takes a few tries to nail down a recipe. But this one just turned out the first try!

So what’s the dish like? It’s very Chinese, and very different from what most Americans would expect. If you’ve had a mazeman ramen before, I’d say that’s the closest thing to it. It’s a light, warm, lightly dressed noodle dish. The sauce is very strong if you taste it on its own, but when it’s dressed onto hot noodles, it mellows quite a bit. The dried shrimp are briney and shrimpey, but not overly fishy. All I can say is, try it!

There are two parts to the sauce: a soy mixture, and a scallion oil. The scallion oil should be made a day in advance, and allowed to stand overnight. So we’ll start with that.

• one large bunch scallions
• 1 1/2 cups canola oil
• two slices crushed ginger
• generous pinch salt
1. Coarsely chop the scallions – whites and greens.
2. Put the scallions, ginger, and salt into a food processor, and pulse until they’re well chopped.
3. Add the oil, and let the processor run on high for about a minute. You should end up with a thick pasty pale green goo.
4. Put it in the refrigerator, and let it sit overnight.
5. The next day, push through a sieve, to separate the oil from the scallion pulp. Discard the scallions. You should be left with an amazing smelling translucent green oil.

Next, the noodles and sauce.

• Noodles. We used a kind of noodle called guan miao noodle. If you can’t find that,
then white/wheat soba or ramen would be a good substitute.
• 1/2 cup soy sauce
• 2 tablespoons sugar
• 1 cup chicken stock
• 2 slices ginger
• one clove garlic
• 2 tablespoons dried shrimp
1. Cover the dried shrimp with cold water in a bowl, and let sit for 1/2 hour.
2. Put the dried shrimp, soy sauce, sugar, chicken stock, ginger, and garlic into a saucepan, and simmer on low heat for five minutes. Then remove the garlic and ginger.
3. For each portion, take about 2 tablespoons of the soy, and two tablespoons of the scallion oil, and whisk together to form something like a vinaigrette.
4. Cook the noodles according to the package. (For the guan miao noodles, they boiled in unsalted water for 3 minutes.)
5. Toss with the soy/oil mixture.
6. Serve the dressed noodles into bowls, and put a few of the simmered dried shrimp on top.
7. Drizzle another teaspoon each of the scallion oil and soy sauce over each serving.
8. Scatter a few fresh scallions on top.

And eat!

# Cantor Crankery is Boring

Sometimes, I think that I’m being punished.

I’ve written about Cantor crankery so many times. In fact, it’s one of the largest categories in this blog’s index! I’m pretty sure that I’ve covered pretty much every anti-Cantor argument out there. And yet, not a week goes by when another idiot doesn’t pester me with their “new” refutation of Cantor. The “new” argument is always, without exception, a variation on one of the same old boring ones.

But I haven’t written any Cantor stuff in quite a while, and yet another one landed in my mailbox this morning. So, what the heck. Let’s go down the rabbit-hole once again.

The argument that the cranks hate is called Cantor’s diagonalization. Cantor’s diagonalization as argument that according to the axioms of set theory, the cardinality (size) of the set of real numbers is strictly larger than the cardinality of the set of natural numbers.

The argument is based on the set theoretic definition of cardinality. In set theory, two sets are the same size if and only if there exists a one-to-one mapping between the two sets. If you try to create a mapping between set A and set B, and in every possible mapping, every A is mapped onto a unique B, but there are leftover Bs that no element of A maps to, then the cardinality of B is larger than the cardinality of A.

When you’re talking about finite sets, this is really easy to follow. If I is the set {1, 2, 3}, and B is the set {4, 5, 6, 7}, then it’s pretty obvious that there’s no one to one mapping from A to B: there are more elements in B than there are in A. You can easily show this by enumerating every possible mapping of elements of A onto elements of B, and then showing that in every one, there’s an element of B that isn’t mapped to by an element of A.

With infinite sets, it gets complicated. Intuitively, you’d think that the set of even natural numbers is smaller than the set of all natural numbers: after all, the set of evens is a strict subset of the naturals. But your intuition is wrong: there’s a very easy one to one mapping from the naturals to the evens: {n → 2n }. So the set of even natural numbers is the same size as the set of all natural numbers.

To show that one infinite set has a larger cardinality than another infinite set, you need to do something slightly tricky. You need to show that no matter what mapping you choose between the two sets, some elements of the larger one will be left out.

In the classic Cantor argument, what he does is show you how, given any purported mapping between the natural numbers and the real numbers, to find a real number which is not included in the mapping. So no matter what mapping you choose, Cantor will show you how to find real numbers that aren’t in the mapping. That means that every possible mapping between the naturals and the reals will omit members of the reals – which means that the set of real numbers has a larger cardinality than the set of naturals.

Cantor’s argument has stood since it was first presented in 1891, despite the best efforts of people to refute it. It is an uncomfortable argument. It violates our intuitions in a deep way. Infinity is infinity. There’s nothing bigger than infinity. What does it even mean to be bigger than infinity? That’s a non-sequiter, isn’t it?

What it means to be bigger than infinity is exactly what I described above. It means that if you have a two infinitely large sets of objects, and there’s no possible way to map from one to the other without missing elements, then one is bigger than the other.

There are legitimate ways to dispute Cantor. The simplest one is to reject set theory. The diagonalization is an implication of the basic axioms of set theory. If you reject set theory as a basis, and start from some other foundational axioms, you can construct a version of mathematics where Cantor’s proof doesn’t work. But if you do that, you lose a lot of other things.

You can also argue that “cardinality” is a bad abstraction. That is, that the definition of cardinality as size is meaningless. Again, you lose a lot of other things.

If you accept the axioms of set theory, and you don’t dispute the definition of cardinality, then you’re pretty much stuck.

Ok, background out of the way. Let’s look at today’s crackpot. (I’ve reformatted his text somewhat; he sent this to me as plain-text email, which looks awful in my wordpress display theme, so I’ve rendered it into formatted HTML. Any errors introduced are, of course, my fault, and I’ll correct them if and when they’re pointed out to me.)

We have been told that it is not possible to put the natural numbers into a one to one with the real numbers. Well, this is not true. And the argument, to show this, is so simple that I am absolutely surprised that this argument does not appear on the internet.

We accept that the set of real numbers is unlistable, so to place them into a one to one with the natural numbers we will need to make the natural numbers unlistable as well. We can do this by mirroring them to the real numbers.

Given any real number (between 0 and 1) it is possible to extract a natural number of any length that you want from that real number.

Ex: From π-3 we can extract the natural numbers 1, 14, 141, 1415, 14159 etc…

We can form a set that associates the extracted number with the real number that it was extracted from.

Ex: 1 → 0.14159265…

Then we can take another real number (in any arbitrary order) and extract a natural number from it that is not in our set.

Ex: 1 → 0.14159266… since 1 is already in our set we must extract the next natural number 14.

Since 14 is not in our set we can add the pair 14 → 0.1415926l6… to our set.

We can do the same thing with some other real number 0.14159267… since 1 and 14 is already in our set we will need to extract a 3 digit natural number, 141, and place it in our set. And so on.

So our set would look something like this…

 A) 1 → 0.14159265… B) 14 → 0.14159266… C) 141 → 0.14159267… D) 1410 → 0.141 E) 14101 → 0.141013456789… F) 5 → 0.567895… G) 55 → 0.5567891… H) 555 → 0.555067891… … …

Since the real numbers are infinite in length (some terminate in an infinite string of zero’s) then we can always extract a natural number that is not in the set of pairs since all the natural numbers in the set of pairs are finite in length. Even if we mutate the diagonal of the real numbers, we will get a real number not on the list of real numbers, but we can still find a natural number, that is not on the list as well, to correspond with that real number.

Therefore it is not possible for the set of real numbers to have a larger cardinality than the set of natural numbers!

This is a somewhat clever variation on a standard argument.

Over and over and over again, we see arguments based on finite prefixes of real numbers. The problem with them is that they’re based on finite prefixes. The set of all finite prefixes of the real numbers is that there’s an obvious correspondence between the natural numbers and the finite prefixes – but that still doesn’t mean that there are no real numbers that aren’t in the list.

In this argument, every finite prefix of π corresponds to a natural number. But π itself does not. In fact, every real number that actually requires an infinite number of digits has no corresponding natural number.

This piece of it is, essentially, the same thing as John Gabriel’s crankery.

But there’s a subtler and deeper problem. This “refutation” of Cantor contains the conclusion as an implicit premise. That is, it’s actually using the assumption that there’s a one-to-one mapping between the naturals and the reals to prove the conclusion that there’s a one-to-one mapping between the naturals and the reals.

If you look at his procedure for generating the mapping, it requires an enumeration of the real numbers. You need to take successive reals, and for each one in the sequence, you produce a mapping from a natural number to that real. If you can’t enumerate the real numbers as a list, the procedure doesn’t work.

If you can produce a sequence of the real numbers, then you don’t need this procedure: you’ve already got your mapping. 0 to the first real, 1 to the second real, 2 to the third read, 3 to the fourth real, and so on.

So, once again: sorry Charlie: your argument doesn’t work. There’s no Fields medal for you today.

One final note. Like so many other bits of bad math, this is a classic example of what happens when you try to do math with prose. There’s a reason that mathematicians have developed formal notations, formal language, detailed logical inference, and meticulous citation. It’s because in prose, it’s easy to be sloppy. You can accidentally introduce an implicit premise without meaning to. If you need to carefully state every premise, and cite evidence of its truth, it’s a whole lot harder to make this kind of mistake.

That’s the real problem with this whole argument. It’s built on the fact that the premise “you can enumerate the real numbers” is built in. If you wrote it in careful formal mathematics, you wouldn’t be able to get away with that.