Category Archives: Bad Math

Back to an old topic: Bad Vaccine Math

The very first Good Math/Bad Math post ever was about an idiotic bit of antivaccine rubbish. I haven’t dealt with antivaccine stuff much since then, because the bulk of the antivaccine idiocy has nothing to do with math. But the other day, a reader sent me a really interesting link from what my friend Orac calls a “wretched hive of scum and quackery”, naturalnews.com, in which they try to argue that the whooping cough vaccine is an epic failure:

(NaturalNews) The utter failure of the whooping cough (pertussis) vaccine to provide any real protection against disease is once again on display for the world to see, as yet another major outbreak of the condition has spread primarily throughout the vaccinated community. As it turns out, 90 percent of those affected by an ongoing whooping cough epidemic that was officially declared in the state of Vermont on December 13, 2012, were vaccinated against the condition — and some of these were vaccinated two or more times in accordance with official government recommendations.

As reported by the Burlington Free Press, at least 522 cases of whooping cough were confirmed by Vermont authorities last month, which was about 10 times the normal amount from previous years. Since that time, nearly 100 more cases have been confirmed, bringing the official total as of January 15, 2013, to 612 cases. The majority of those affected, according to Vermont state epidemiologist Patsy Kelso, are in the 10-14-year-old age group, and 90 percent of those confirmed have already been vaccinated one or more times for pertussis.

Even so, Kelso and others are still urging both adults and children to get a free pertussis shot at one of the free clinics set up throughout the state, insisting that both the vaccine and the Tdap booster for adults “are 80 to 90 percent effective.” Clearly this is not the case, as evidenced by the fact that those most affected in the outbreak have already been vaccinated, but officials are apparently hoping that the public is too naive or disengaged to notice this glaring disparity between what is being said and what is actually occurring.

It continues in that vein. The gist of the argument is:

  1. We say everyone needs to be vaccinated, which will protect them from getting the whooping cough.
  2. The whooping cough vaccine is, allegedly, 80 to 90% effective.
  3. 90% of the people who caught whooping cough were properly vaccinated.
  4. Therefore the vaccine can’t possibly work.

What they want you to do is look at that 80 to 90 percent effectiveness rate, see that only 10-20% of vaccinated people should be susceptible to whooping cough, and compare that 10-20% to the 90% of actual infected people who were vaccinated. 20% (the upper bound on the susceptible portion of vaccinated people, according to the quoted statistic) is clearly much smaller than 90% – therefore it’s obvious that the vaccine doesn’t work.

Of course, this is rubbish. It’s a classic apples-to-orange-grove comparison. You’re comparing percentages, when those percentages are measuring different groups – groups with wildly different sizes.

Take a pool of 1000 people, and suppose that 95% are properly vaccinated (the current DTAP vaccination rate in the US is around 95%). That gives you 950 vaccinated people and 50 unvaccinated people.

In the vaccinated pool, let’s assume that the vaccine was fully effective on 90% of them (that’s the highest estimate of effectiveness, which will result in the lowest number of susceptible vaccinated people – aka the best possible scenario for the anti-vaxers). That gives us 95 vaccinated people who are susceptible to whooping cough.

There’s the root of the problem. Using numbers that are ridiculously friendly to the anti-vaxers, we’ve still got a population of susceptible vaccinated people nearly twice as large as the unvaccinated population. So we’d expect, right out of the box, that better than 2/3rds of the cases of whooping cough would be among the vaccinated people.

In reality, the numbers are much worse for the antivax case. The percentage of people who were ever vaccinated is around 95%, because you need the vaccination to go to school. But that’s just the childhood dose. DTAP is a vaccination that needs to be periodically boosted, or the immunity wanes. And the percentage of people who’ve had boosters is extremely low. Among adolescents, according to the CDC, only a bit more than half have had DTAP boosters; among adults, less than 10% have had a booster within the last 5 years.

What’s your susceptibility if you’ve gone more than 5 years without vaccination? Somewhere around 40% of people who haven’t had a booster in the last five years are susceptible.

So let’s just play with those numbers a bit. Assume, for simplicity, that 50% of the people are adults and 50% are children, and assume that all of the children are fully up-to-date on the vaccine. Then the susceptible population among the vaccinated is: 10% of the children (10% of 475), plus 10% of the adults that are up-to-date (10% of 10% of 475), plus 40% of the adults that aren’t up-to-date (40% of 90% of 475). That works out to about 223 susceptible people among the vaccinated, versus 50 unvaccinated – so roughly 82% of the susceptible population has been vaccinated, and you’d expect roughly 82% of the actual cases of whooping cough to be among people who’d been vaccinated. Suddenly, the antivaxers’ case doesn’t look so good, does it?
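
If you want to check that arithmetic, here’s a quick sketch that just mechanizes the assumptions above (my own code, in Scala, since that’s what I use these days):

// The base-rate arithmetic from the scenario above: 1000 people, 95% ever
// vaccinated, half children (all up-to-date), half adults (10% boosted).
val pool = 1000.0
val vaccinated = 0.95 * pool             // 950 people
val unvaccinated = pool - vaccinated     // 50 people, all assumed susceptible
val children = vaccinated / 2            // 475, all up-to-date
val adults = vaccinated / 2              // 475

val susceptibleVaccinated =
  0.10 * children +                      // 10% of up-to-date children
  0.10 * (0.10 * adults) +               // 10% of the boosted 10% of adults
  0.40 * (0.90 * adults)                 // 40% of the unboosted 90% of adults

// susceptibleVaccinated comes to about 223; 223 / (223 + 50) is about 0.82 –
// so you’d expect roughly 82% of cases to show up in the vaccinated group.
val share = susceptibleVaccinated / (susceptibleVaccinated + unvaccinated)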

Consider, for a moment, what you’d expect among a non-vaccinated population. Pertussis is highly contagious. If someone in your household has pertussis, and you’re susceptible, you’ve got a better than 90% chance of catching it. It’s that contagious. Routine exposure – not sharing a household, but going to work, to the store, etc., with people who are infected – still gives you about a 50% chance of infection if you’re susceptible.

In the state of Vermont, where NaturalNews is claiming that the evidence shows that the vaccine doesn’t work, how many cases of pertussis have they seen? Around 600, out of a state population of 600,000 – an infection rate of one tenth of one percent. 0.1 percent, from a virulently contagious disease.

That’s the highest level of pertussis that we’ve seen in the US in a long time. But at the same time, it’s really a very low number for something so contagious. To compare for a moment: there’s been a huge outbreak of Norovirus in the UK this year. Overall, more than one million people have caught it so far this winter, out of a total population of 62 million, for a rate of about 1.6% – sixteen times the rate of infection of pertussis.

Why is the rate of infection with this virulently contagious disease so different from the rate of infection with that other virulently contagious disease? Vaccines are a big part of it.

Vortex Garbage

A reader who saw my earlier post on the Vortex math talk at a TEDx conference sent me a link to an absolutely dreadful video that features some more crackpottery about the magic of vortices.

It says:

The old heliocentric model of our solar system,
planets rotating around the sun, is not only boring,
but also incorrect.

Our solar system moves through space at 70,000km/hr.
Now picture this instead:

(Image of the sun with a rocket/comet trail propelling
it through space, with the planets swirling around it.)

The sun is like a comet, dragging the planets in its wake.
Can you say “vortex”?

The science of this is terrible. The sun is not a rocket. It does not propel itself through space. It does not have a tail. It does not leave a significant “wake”. (There is interstellar material, and the sun moving through it does perturb it, but it’s not a wake: the interstellar material is orbiting the galactic center just like the sun. Gravitational effects do cause perturbations, but it’s not like a boat moving through still water, producing a wake.) Even if you stretch the definition of “wake”, the sun certainly does not leave a wake large enough to “drag” the planets. In fact, if you actually look at the solar system, the plane of the ecliptic – the plane where the planets orbit the sun – is at a roughly 60 degree angle to the galactic ecliptic. If planetary orbits were a drag effect, then you would expect the orbits to be perpendicular to the galactic ecliptic. But they aren’t.

If you look at it mathematically, it’s even worse. The video claims to be making a distinction between the “old heliocentric” model of the solar system, and their new “vortex” model. But in fact, mathematically, they’re exactly the same thing. Look at it from a heliocentric point of view, and you’ve got the heliocentric model. Look at the exact same system from a point that’s not moving relative to the galactic center, and you get the vortex. They’re the same thing. The only difference is how you look at it.

And that’s just the start of the rubbish. Once they get past their description of their “vortex” model, they go right into the woo. Vortex is life! Vortex is spirituality! Oy.

If you follow their link to their website, it gets even sillier, and you can start to see just how utterly clueless the author of this actually is:

(In reference to a NASA image showing the interstellar “wind” and the heliopause)

Think about this for a minute. In this diagram it seems the Solar System travel to the left. When the Earth is also traveling to the left (for half a year) it must go faster than the Sun. Then in the second half of the year, it travels in a “relative opposite direction” so it must go slower than the Sun. Then, after completing one orbit, it must increase speed to overtake the Sun in half a year. And this would go for all the planets. Just like any point you draw on a frisbee will not have a constant speed, neither will any planet.

See, it’s a problem that the planets aren’t moving at a constant speed. They speed up and slow down! Oh, the horror! The explanation is that they’re caught by the sun’s wake! So they speed up when they get dragged, until they pass the sun (how does being dragged by the sun ever make them faster than the sun? Who knows!), and then they’re not being dragged anymore, so they slow down.

This is ignorance of physics – of the very concepts of frame of reference and symmetry – on an absolutely epic scale.

There’s quite a bit more nonsense, but that’s all I can stomach this evening. Feel free to point out more in the comments!

Types Gone Wild! SKI at Compile-Time

Over the weekend, a couple of my Foursquare coworkers and I were chatting on twitter, and one of my smartest coworkers, a great guy named Jorge Ortiz, pointed out that type inference in Scala (the language we use at Foursquare, and also pretty much my favorite language) is Turing complete.

Somehow, I hadn’t seen this before, and it absolutely blew my mind. So I asked Jorge for a link to the proof. The link he sent me is a really beautiful blog post. It doesn’t just prove that Scala type inference is Turing complete, but it does it in a remarkably beautiful way.

Before I get to the proof, what does this mean?

A system is Turing complete when it can perform any possible computation that could be performed on any other computing device. The Turing machine is, obviously, Turing complete. So is lambda calculus, the Minsky machine, the Brainfuck computing model, and the Scala programming language itself.

If type inference is Turing complete, then that means that you can write a Scala program where, in order to type-check the program, the compiler has to run an arbitrary program to completion. It means that there are, at least theoretically, Scala programs where the compiler will take forever – literally forever – to determine whether or not a given program contains a type error. Needless to say, I consider this to be a bad thing. Personally, I’d really prefer to see the type system be less flexible. In fact, I’d go so far as to say that this is a fundamental error in the design of Scala, and a big strike against it as a language. Having a type-checking system which isn’t guaranteed to terminate is bad.

But let’s put that aside: Scala is pretty entrenched in the community that uses it, and they’ve accepted this as a tradeoff. How did the blog author, Michael Dürig, prove that Scala type checking is Turing complete? By showing how to implement a variant of lambda calculus called SKI combinator calculus entirely with types.

SKI calculus is seriously cool. We know that lambda calculus is Turing complete. It turns out that for any lambda calculus expression, there’s a way of rewriting it without any variables, and without any lambdas at all, using three canonical master functions. If you’ve got those three, then you can write anything, anything at all. The three are called S, K, and I.

  • The S combinator is: S x y z = x z (y z).
  • The K combinator is: K x y = x.
  • The I combinator is: I x = x.

They come from intuitionistic logic, where they’re fundamental axioms that describe how intuitionistic implication works. K is the rule A → (B → A); S is the rule (A → (B → C)) → ((A → B) → (A → C)); and I is A → A.
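
Before looking at the type-level encoding, it may help to see the three combinators as ordinary values. Here’s a small sketch of my own, in plain Scala (the names s, k, i and the Int instantiation are just for illustration):

// The S, K, and I combinators as ordinary curried functions.
def i[A](x: A): A = x
def k[A, B](x: A)(y: B): A = x
def s[A, B, C](x: A => B => C)(y: A => B)(z: A): C = x(z)(y(z))

// The classic sanity check: S K K behaves exactly like I.
//   S K K z  =  K z (K z)  =  z
def skk(z: Int): Int = s[Int, Int => Int, Int](k)(k)(z)
// skk(42) == 42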

Given any lambda calculus expression, you can rewrite it as a chain of SKIs. (If you’re interested in seeing how, please just ask in the comments; if enough people are interested, I’ll write it up.) What the author of the post did is show how to implement the S, K, and I combinators in Scala types.

trait Term {
  type ap[x <: Term] <: Term // apply this term, as a function, to the argument x
  type eval <: Term          // reduce this term to its canonical form
}

He’s created a type Term, which is the supertype of any computable fragment written in this type-SKI. Since everything is a function, all terms have to have two methods: one of them is a one-parameter “function” which applies the term to a parameter, and the second is a “function” which simplifies the term into canonical form.

He implements the S, K, and I combinators as traits that extend Term. We’ll start with the simplest one, the I combinator.

trait I extends Term {
  type ap[x <: Term] = I1[x]
  type eval = I
}

trait I1[x <: Term] extends Term {
  type ap[y <: Term] = eval#ap[y]
  type eval = x#eval
}

I needs to take a parameter, so its apply type-function takes a parameter x, and returns a new type I1[x] which has the parameter encoded into it. Evaluating I1[x] does exactly what you’d want from the I combinator with its parameter – it returns it.

The apply “method” of I1 looks strange. What you have to remember is that in lambda calculus (and in the SKI combinator calculus), everything is a function – so even after evaluating I#ap[x] to some other type, it’s still a type function. So it still needs to be applicable. Applying it is exactly the same thing as applying its parameter.

So if you have any type A, and you write something like var a : I#ap[A]#eval, the type of a will evaluate to A. If you apply I#ap[A]#ap[Z], it’s equivalent to taking the result of evaluating I#ap[A], giving you A, and then applying that to Z.

The K combinator is much more interesting:

// The K combinator
trait K extends Term {
  type ap[x <: Term] = K1[x]
  type eval = K
}

trait K1[x <: Term] extends Term {
  type ap[y <: Term] = K2[x, y]
  type eval = K1[x]
}

trait K2[x <: Term, y <: Term] extends Term {
  type ap[z <: Term] = eval#ap[z]
  type eval = x#eval
}

It's written in curried form: applying the type trait K to a parameter gives you a type trait K1, and applying that to a second parameter gives you a type trait K2.

The implementation is a whole lot trickier, but it's the same basic mechanics. Applying K#ap[X] gives you K1[X]. Applying that to Y with K1[X]#ap[Y] gives you K2[X, Y]. Evaluating that gives you X.

The S combinator is more of the same.

// The S combinator
trait S extends Term {
  type ap[x <: Term] = S1[x]
  type eval = S
}

trait S1[x <: Term] extends Term {
  type ap[y <: Term] = S2[x, y]
  type eval = S1[x]
}

trait S2[x <: Term, y <: Term] extends Term {
  type ap[z <: Term] = S3[x, y, z]
  type eval = S2[x, y]
}

trait S3[x <: Term, y <: Term, z <: Term] extends Term {
  type ap[v <: Term] = eval#ap[v]
  type eval = x#ap[z]#ap[y#ap[z]]#eval
}


Michid then goes on to show examples of how to use these beasts. He implements equality testing, and then shows how to test if different type-expressions evaluate to the same thing. And all of this happens at compile time. If the equality test fails, then it's a type error at compile time!
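
To give a flavor of what those compile-time checks look like, here's a minimal sketch of my own – it uses Scala's built-in =:= evidence rather than Michid's equality construct, and an opaque test term X that I made up for the occasion:

// An opaque "variable" term, just for testing reductions:
trait X extends Term {
  type ap[x <: Term] = X
  type eval = X
}

object SkiChecks {
  // Each line compiles only if type-level evaluation proves both sides equal.
  // S K K applied to X reduces to X – i.e., S K K behaves like I:
  implicitly[S#ap[K]#ap[K]#ap[X]#eval =:= X]
  implicitly[I#ap[X]#eval =:= X]
}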

It's a brilliant little proof. Even if you can't read Scala syntax, and you don't really understand Scala type inference, as long as you know SKI, you can look at the equality comparisons, and see how it works in SKI. It's really beautiful.

Static Typing: Give me a break!

I’m a software engineer. I write code for a living. I’m also a programming language junkie. I love programming languages. I’m obsessed with programming languages. I’ve taught myself more programming languages than any sane person has any reason to know.

Learning that many languages, I’ve developed some pretty strong opinions about what makes a good language, and what kind of things I really want to see in the languages that I use.

My number one preference: strong static typing. That’s part of a more general preference, for preserving information. When I’m programming, I know what kind of thing I expect in a parameter, and I know what I expect to return. When I’m programming in a weakly typed language, I find that I’m constantly throwing information away, because I can’t actually say what I know about the program. I can’t put my knowledge about the expected behavior into the code. I don’t think that that’s a good thing.

But… (you knew there was a but coming, didn’t you?)

This is my preference. I believe that it’s right, but I also believe that reasonable people can disagree. Just because you don’t think the same way that I do doesn’t mean that you’re an idiot. It’s entirely possible for someone to know as much as I do about programming languages and have a different opinion. We’re talking about preferences.

Sadly, that kind of attitude is something that is entirely too uncommon. I seriously wonder sometimes if I’m crazy, because it seems like everywhere I look, on every topic, no matter how trivial, most people absolutely reject the idea that it’s possible for an intelligent, knowledgeable person to disagree with them. It doesn’t matter what the subject is: politics, religion, music, or programming languages.

What brought this little rant on is that someone sent me a link to a comic, called “Cartesian Closed Comic”. It’s a programming language geek comic. But what bugs me is this comic. Which seems to be utterly typical of the kind of attitude that I’m griping about.

See, this is a pseudo-humorous way of saying “Everyone who disagrees with me is an idiot”. It’s not that reasonable people can disagree. It’s that people who disagree with me only disagree because they’re ignorant. If you like static typing, you probably know type theory. If you don’t like static typing, there’s almost no chance that you know anything about type theory. So the reason that those stupid dynamic typing people don’t agree with people like me is because they just don’t know as much as I do. And the reason that arguments with them don’t go anywhere isn’t because we have different subjective preferences: it’s because they’re just too ignorant to understand why I’m right and they’re wrong.

Everything about this argument is crap, and it pisses me off.

Most programmers – whether they prefer static typing or not – don’t know type theory. Most of the arguments about whether to use static or dynamic typing aren’t based on type theory. It’s just the same old snobbery, the “you can’t disagree with me unless you’re an idiot”.

Among intelligent skilled engineers, the static versus dynamic typing thing really comes down to a simple, subjective argument:

Static typing proponents believe that expressing intentions in a statically checkable form is worth the additional effort of making all of the code type-correct.

Dynamic typing proponents believe that it’s not: that strong typing creates an additional hoop that the programmer needs to jump through in order to get a working system.

Who’s right? In fact, I don’t think that either side is universally right. Building a real working system is a complex thing. There’s a complex interplay of design, implementation, and testing. What static typing really does is take some amount of stuff that could be checked with testing, and allow the compiler to check it in an abstract way, instead of with specific tests.
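
As a trivial illustration of that tradeoff (my own example, in Scala):

// The signature records the intent; the compiler checks every call site.
def mean(xs: List[Double]): Double = xs.sum / xs.size

// mean(List("one", "two"))   // rejected at compile time: type mismatch
// In a dynamically typed language, the same mistake only surfaces when a
// test (or a user) actually runs that call – so you write a test for it.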

Is it easier to write code with type declarations, or with additional tests? Depends on the engineers and the system that they’re building.

Sloppy Dualism Denies Free Will?

When I was an undergrad in college, I was a philosophy minor. I spent countless hours debating ideas about things like free will. My final paper was a 60 page rebuttal to what I thought was a sloppy argument against free will. Now, it’s been more years since I wrote that than I care to admit – and I still keep seeing the same kind of sloppy arguments: arguments that I’d argue are ultimately circular, because they hide their conclusion in their premises.

There’s an argument against free will that I find pretty compelling. I don’t agree with it, but I do think that it’s a solid argument:

Everything in our experience of the universe ultimately comes down to physics. Every phenomenon that we can observe is, ultimately, the result of particles interacting according to basic physical laws. Thermodynamics is the ultimate, fundamental ruler of the universe: everything that we observe is a result of a thermodynamic process. There are no exceptions to that.

Our brain is just another physical device. It’s another complex system made of an astonishing number of tiny particles, interacting in amazingly complicated ways. But ultimately, it’s particles interacting the way that particles interact. Our behavior is an emergent phenomenon, but ultimately, we don’t have any ability to make choices, because there’s no mechanism that allows us free choice. Our choices are determined by the physical interactions, and our consciousness of those results is just a side-effect of that.

If you want to argue that free will doesn’t exist, that argument is rock solid.

But for some reason, people constantly come up with other arguments – in fact, much weaker arguments that come from what I call sloppy dualism. Dualism is the philosophical position that says that a conscious being has two different parts: a physical part, and a non-physical part. In classical terms, you’ve got a body which is physical, and a mind/soul which is non-physical.

In this kind of argument, you rely on that implicit assumption of dualism, essentially asserting that whatever physical process we can observe isn’t really you, and that therefore by observing any physical process of decision-making, you infer that you didn’t really make the decision.

For example…

And indeed, this is starting to happen. As the early results of scientific brain experiments are showing, our minds appear to be making decisions before we’re actually aware of them — and at times by a significant degree. It’s a disturbing observation that has led some neuroscientists to conclude that we’re less in control of our choices than we think — at least as far as some basic movements and tasks are concerned.

This is something that I’ve seen a lot lately: when you do things like functional MRI, you can find that our brains settle on a decision before we consciously become aware of making the choice.

Why do I call it sloppy dualism? Because it’s based on the idea that somehow the piece of our brain that makes the decision is different from the part of our brain that is our consciousness.

If our brain is our mind, then everything that’s going on in our brain is part of our mind. You can’t take a piece of our brain and say “Whoops, that piece of your brain isn’t you, so when it made the decision, it was deciding for you, rather than you deciding for yourself.”

By starting with the assumption that the physical process of decision-making we can observe is something different from your conscious choice of the decision, this kind of argument is building the conclusion into the premises.

If you don’t start with the assumption of sloppy dualism, then this whole argument says nothing. If we don’t separate our brain from our mind, then this whole experiment says nothing about the question of free will. It says a lot of very interesting things about how our brain works: it shows that there are multiple levels to our minds, and that we can observe those different levels in how our brains function. That’s a fascinating thing to know! But does it say anything about whether we can really make choices? No.

The Investors vs. the Tabby

There’s an amusing article making its rounds of the internet today, about the successful investment strategy of a cat named Orlando.

A group of people at the Observer put together a fun experiment. They asked three groups to pretend that they had 5000 pounds, and asked each of them to invest it, however they wanted, in stocks listed on the FTSE. They could only change their investments at the end of a calendar quarter. At the end of the year, they compared the results of the three groups.

Who were the three groups?

  1. The first was a group of professional investors – people who are, at least in theory, experts at analyzing the stock market and using that analysis to make profitable investments.
  2. The second was a classroom of students, who are bright, but who have no experience at investment.
  3. The third was an orange tabby cat named Orlando. Orlando chose stocks by throwing his toy mouse at a target board randomly marked with investment choices.

As you can probably guess by the fact that we’re talking about this, Orlando the tabby won, by a very respectable margin. (Let’s be honest: if the professional investors came in first, and the students came in second, no one would care.) At the end of the year, the students had lost 160 pounds on their investments. The professional investors ended with a profit of 176 pounds. And the cat ended with a profit of 542 pounds – more than triple the profit of the professionals.

Most people, when they saw this, had an immediate reaction: “see, those investors are a bunch of idiots. They don’t know anything! They were beaten by a cat!”

And on one level, they’re absolutely right. Investors and bankers like to present themselves as the best of the best. They deserve their multi-million dollar earnings, because, so they tell us, they’re more intelligent, more hard-working, more insightful than the people who earn less. And yet, despite their self-alleged brilliance, professional investors can’t beat a cat throwing a toy mouse!

It gets worse, because this isn’t a one-time phenomenon: there’ve been similar experiments that selected stocks by throwing darts at a news-sheet, or by rolling dice, or by picking slips of paper from a hat. Many times, when people have done these kinds of experiments, the experts don’t win. There’s a strong implication that “expert investors” are not actually experts.

Does that really hold up? Partly yes, partly no. But mostly no.

Before getting to that, there’s one thing in the article that bugged the heck out of me: the author went out of their way to defend the humans, presenting their performance as if positive outcomes were due to human intelligence, and negative ones were due to bad luck. In fact, I think that in this experiment, it was all luck.

For example, the author discusses how the professionals were making more money than the cat up to the last quarter of the year, and presents it as human intelligence out-performing the random cat. But there’s no reason to believe that. There’s no evidence that there’s anything qualitatively different about the last quarter that made it less predictable than the first three.

The headmaster at the students’ school actually said “The mistakes we made earlier in the year were based on selecting companies in risky areas. But while our final position was disappointing, we are happy with our progress in terms of the ground we gained at the end and how our stock-picking skills have improved.” Again, there’s absolutely no reason to believe that the students’ stock-picking skills miraculously improved in the final quarter; it’s much more likely that they just got lucky.

The real question that underlies this is: is the performance of individual stocks in a stock market actually predictable, or is it predominantly random? Most of the evidence that I’ve seen suggests that there’s a combination: on a short timescale, it’s predominantly random, but on longer timescales it becomes much more predictable.
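
Here’s a little simulation sketch (entirely my own, with made-up but plausible parameters) of what “predominantly random on short timescales” means for a contest like this one:

import scala.util.Random

// One simulated year (252 trading days) of returns. "Skill" is a small
// upward shift in the daily mean; the day-to-day noise dwarfs it.
val rng = new Random()
def yearReturn(dailyEdge: Double): Double =
  (1 to 252).map(_ => dailyEdge + rng.nextGaussian() * 0.01).sum

val expert = yearReturn(0.0003)            // a genuine ~8%/year edge
val cats = Seq.fill(10)(yearReturn(0.0))   // ten purely random pickers
// Run this a few times: the luckiest random "cat" usually beats the expert,
// even though the expert's expected return really is higher.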

But people absolutely do not want to believe that. We humans are natural pattern-seekers. It doesn’t matter whether we’re talking about financial markets, pixel-patterns in a bitmap, or answers on a multiple choice test: our brains look for patterns. If you randomly generate data, and you look at it long enough, with enough possible strategies, you’ll find a pattern that fits. But it’s an imposed pattern, and it has no predictive value. It’s like the images of Jesus on toast: we see patterns in noise. So people see patterns in the market, and they want to believe that it’s predictable.

Second, people want to take responsibility for good outcomes, and excuse bad ones. If you make a million dollars betting on a horse, you’re going to want to say that it was your superior judgement of the horses that led to your victory. When an investor makes a million dollars on a stock, of course he wants to say that he made that money because he made a smart choice, not because he made a lucky choice. But when that same investor loses a million dollars, he doesn’t want to say that he lost a million dollars because he’s stupid; he wants to say that he lost money because of bad luck, of random factors beyond his control that he couldn’t predict.

The professional investors were doing well during part of the year: therefore, during that part of the year, they claim that their good performance was because they did a good job judging which stocks to buy. But when they lost money during the last quarter? Bad luck. But overall, their knowledge and skills paid off! What evidence do we have to support that? Nothing: but we want to assert that we have control, that experts understand what’s going on, and are able to make intelligent predictions.

The students’ performance was lousy, and if they had invested real money, they would have lost a tidy chunk of it. But their teacher believes that their performance in the last quarter wasn’t luck – it was that their skills had improved. Nonsense! They were lucky.

On the general question: Are “experts” useless for managing investments?

It’s hard to say for sure. In general, experts do perform better than random, but not by a huge margin, certainly not by as much as they’d like us to believe. The Wall Street Journal used to do an experiment where they compared dartboard stock selection against human experts, and against passive investment in the Dow Jones Index stocks over a one-year period. The pros won 60% of the time. That’s better than chance: the experts’ knowledge/skills were clearly benefiting them. But: blindly throwing darts at a wall could beat the experts 2 out of 5 times!

When you actually do the math and look at the data, it appears that human judgement does have value. Taken over time, human experts do outperform random choices, by a small but significant margin.

What’s most interesting is a time-window phenomenon. In most studies, the human performance relative to random choice is directly related to the amount of time that the investment strategy is followed: the longer the timeframe, the better the humans perform. In daily investments, like day-trading, most people don’t do any better than random. The performance of day-traders is pretty much in line with what you’d expect from random choice. Monthly, it’s still mostly a wash. But if you look at yearly performance, you start to see a significant difference: humans do typically outperform random choice by a small but definite margin. If you look at longer time-frames, like 5 or 10 years, then you start to see really sizeable differences. The data makes it look like daily fluctuations of the market are chaotic and unpredictable, but that there are long-term trends that we can identify and exploit.

A Bad Mathematical Refutation of Atheism

At some point a few months ago, someone (sadly I lost their name and email) sent me a link to yet another Cantor crank. At the time, I didn’t feel like writing another Cantor crankery post, so I put it aside. Now, having lost it, I was using Google to try to find the crank in question. I didn’t, but I found something really quite remarkably idiotic.

(As a quick side-comment, my queue of bad-math-crankery is, sadly, empty. If you’ve got any links to something yummy, please shoot it to me at markcc@gmail.com.)

The item in question is this beauty. It’s short, so I’ll quote the whole beast.

MYTH: Cantor’s Set Theorem disproves divine omniscience

God is omniscient in the sense that He knows all that is not impossible to know. God knows Himself, He knows and does, knows every creature ideally, knows evil, knows changing things, and knows all possibilites. His knowledge allows free will.

Cantor’s set theorem is often used to argue against the possibility of divine omniscience and therefore against the existence of God. It can be stated:

  1. If God exists, then God is omniscient.
  2. If God is omniscient, then, by definition, God knows the set of all truths.
  3. If Cantor’s theorem is true, then there is no set of all truths.
  4. But Cantor’s theorem is true.
  5. Therefore, God does not exist.

However, this argument is false. The non-existence of a set of all truths does not entail that it is impossible for God to know all truths. The consistency of a plausible theistic position can be established relative to a widely accepted understanding of the standard model of Cantorian set theorem. The metaphysical Cantorian premises imply that Cantor’s theorem is inapplicable to the things that God knows. A set of all truths, if it exists, must be non-Cantorian.

The attempted disproof of God’s omniscience is, from a meta-mathematical standpoint, is inadequate to the extent that it doesn’t explain well-known mathematical contexts in which Cantor’s theorem is invalid. The “disproof” doesn’t acknowledge standard meta-mathematical conceptions that can analogically be used to establish the relative consistency of certain theistic positions. The metaphysical assertions concerning a set of all truths in the atheistic argument above imply that Cantor’s theorem is inapplicable to a set of all truths.

This is an absolute masterwork of crankery! It’s a remarkably silly argument on so many levels.

  1. The first problem is just figuring out what the heck he’s talking about! When you say “Cantor’s theorem”, what I think of is one of Cantor’s actual theorems: “For any set S, the powerset of S is larger than S.” But that is clearly not what he’s referring to. I did a bit of searching to make sure that this wasn’t my error, but I can’t find anything else called Cantor’s theorem. (I’ve written out the actual theorem, with its one-line proof, in a note after this list.)
  2. So what the heck does he mean by “Cantor’s set theorem”? From his text, it appears to be a statement something like: “there is no set of all truths”. The closest actual mathematical statement that I can come up with to match that is Gödel’s incompleteness theorem. If that’s what he means, then he’s messed it up pretty badly. The closest I can come to stating incompleteness informally is: “In any formal mathematical system that’s powerful enough to express Peano arithmetic, there will be statements that are true, but which cannot be proven”. It’s long, complex, not particularly intuitive, and it’s still not a particularly good statement of incompleteness.

    Incompleteness is a difficult concept, and as I’ve written about before, it’s almost impossible to state incompleteness in an informal way. When you try to do that, it’s inevitable that you’re going to miss some of its subtleties. When you try to take an informal statement of incompleteness, and reason from it, the results are pretty much guaranteed to be garbage – as he’s done. He’s using a mis-statement of incompleteness, and trying to reason from it. It doesn’t matter what he says: he’s trying to show how “Cantor’s set theorem” doesn’t disprove his notion of theism. Whether it does or not doesn’t matter: no matter what statement X you pick, you can’t show that “Cantor’s set theorem” (or Gödel’s incompleteness theorem, or anything else) fails to disprove X by arguing against something that isn’t X.

  3. Ignoring his mis-identification of the supposed theorem, the way that he stated it is actually meaningless. When we talk about sets, we’re using the word set in the sense of either ZFC or NBG set theory. Mathematical set theory defines what a set is, using first order predicate logic. His version of “Cantor’s set theorem” talks about a set which cannot be a set!

    He wants to create a set of truths. In set theory terms, that’s something you’d define with the axiom of specification: you’d use a predicate ranging over your objects to select the ones in the set. What’s your predicate? Truth. At best, that’s going to be a second-order predicate. You can’t form sets using second-order predicates! The entire idea of “the set of truths” isn’t something that can be expressed in set theory.

  4. Let’s ignore the problems with his “Cantor’s theorem” for the moment. Let’s pretend that the “set of all truths” was well-defined and meaningful. How does his argument stand up? It doesn’t: it’s a terrible argument. It’s ultimately nothing more than “Because I say so!” hidden behind a collection of impressive-sounding words. The argument, ultimately, is that the set of all truths as understood in set theory isn’t the same thing as the set of all truths in theology (because he says that they’re different), therefore you can’t use a statement about the set of all truths from set theory to talk about the set of all truths in theology.
  5. I’ve saved what I think is the worst for last. The entire thing is a strawman. As a religious science blogger, I get almost as much mail from atheists trying to convince me that my religion is wrong as I do from Christians trying to convert me. After doing this blogging thing for six years, I’m pretty sure that I’ve been pestered with every argument, both pro- and anti-theistic that you’ll find anywhere. But I’ve never actually seen this argument used anywhere except in articles like this one, which purport to show why it’s wrong. The entire argument being refuted is a total fake: no one actually argues that you should be an atheist using this piece of crap. It only exists in the minds of crusading religious folk who prop it up and then knock it down to show how smart they supposedly are, and how stupid the dirty rotten atheists are.
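
For reference – this is standard material, my own summary rather than anything from the piece being mocked – here’s the actual Cantor’s theorem from point 1, with its one-line proof:

Cantor’s theorem: for every set S, |S| < |P(S)|, where P(S) is the powerset of S.

Proof sketch: given any function f : S → P(S), define the diagonal set D = { x ∈ S | x ∉ f(x) }. If D = f(y) for some y ∈ S, then y ∈ D ⇔ y ∉ f(y) – a contradiction. So no f is surjective, and P(S) is strictly bigger than S.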

Let's Get Rid of Zero!

One of my tweeps sent me a link to a delightful pile of rubbish: a self-published “paper” by a gentleman named Robbert van Dalen that purports to solve the “problem” of zero. It’s an amusing pseudo-technical paper that defines a new kind of number which doesn’t work without the old numbers, and which gets rid of zero.

Before we start, why does Mr. van Dalen want to get rid of zero?

So what is the real problem with zeros? Zeros destroy information.

That is why they don’t have a multiplicative inverse: because it is impossible to rebuilt something you have just destroyed.

Hopefully this short paper will make the reader consider the author’s firm believe that: One should never destroy anything, if one can help it.

We practically abolished zeros. Should we also abolish simplifications? Not if we want to stay practical.

There’s nothing I can say to that.

So what does he do? He defines a new version of both integers and rational numbers. The new integers are called accounts, and the new rationals are called super-rationals. According to him, these new numbers get rid of that naughty information-destroying zero. (He doesn’t bother to define real numbers in his system; I assume that either he doesn’t know or doesn’t care about them.)

Before we can get to his definition of accounts, he starts with something more basic, which he calls “accounting naturals”.

He doesn’t bother to actually define them – he handwaves his way through, and sort-of defines addition and multiplication, with:

Addition
a + b == a concat b
Multiplication
a * b = a concat a concat a … (with b repetitions of a)

So… a sloppy definition of positive integer addition, and a handwave for multiplication.

What can we take from this introduction? Well, our author can’t be bothered to define basic arithmetic properly. What he really wants to say is, roughly, Peano arithmetic, with 0 removed. But my guess is that he has no idea what Peano arithmetic actually is, so he handwaves. The real question is, why did he bother to include this at all? My guess is that he wanted to pretend that he was writing a serious math paper, and he thinks that real math papers define things like this, so he threw it in, even though it’s pointless drivel.

With that rubbish out of the way, he defines an “Account” as his new magical integer, as a pair of “accounting naturals”. The first member of the pair is called the credit, and the second part is the debit. If the credit is a and the debit is b, then the account is written (a%b). (He used backslash instead of percent; but that caused trouble for my wordpress config, so I switched to percent-sign.)

Addition:
a%b ++ c%d = (a+c)%(b+d)
Multiplication
a%b ** c%d = ((a*c)+(b*d))%((a*d)+(b*c))
Negation
- a%b = b%a

So… for example, consider 5*6. We need an “account” for each: We’ll use (7%2) for 5, and (9%3) for 6, just to keep things interesting. That gives us: 5*6 = (7%2)*(9%3) = (63+6)%(21+18) = 69%39, or 30 in regular numbers.
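
Just to check that arithmetic, here’s a quick executable sketch of accounts (my own code, not from the paper; his (a%b) becomes a case class, and ** follows his definition):

// An "account" (a%b): credit a, debit b. Mirrors the definitions above.
final case class Account(credit: Int, debit: Int) {
  def ++(o: Account) = Account(credit + o.credit, debit + o.debit)
  def **(o: Account) = Account(credit * o.credit + debit * o.debit,
                               credit * o.debit + debit * o.credit)
  def unary_- : Account = Account(debit, credit)
  def toInt = credit - debit       // collapse back to an ordinary integer
}

// 5 * 6, encoded as (7%2) ** (9%3):
val product = Account(7, 2) ** Account(9, 3)   // Account(69, 39)
// product.toInt == 30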

Yippee, we’ve just redefined multiplication in a way that makes us use good old natural number multiplication, only now we need to do it four times, plus 2 additions, to multiply two numbers! Wow, progress! (Of a sort. I suppose that if you’re a cloud computing provider, where you’re renting CPUs, then this would be progress.)

Oh, but that’s not all. See, each of these “accounts” isn’t really a number. The numbers are equivalence classes of accounts. So once you get the result, you “simplify” it, to make it easier to work with.

So make that 4 multiplications, 2 additions, and one subtraction. Yeah, this is looking nice, huh?

So… what does it give us?

As far as I can tell, absolutely nothing. The author promises that we’re getting rid of zero, but it sure looks like this has zeros: 1%1 is zero, isn’t it? (And even if we pretend that there is no zero, since Mr. van Dalen specifically doesn’t define division on accounts, we don’t even get anything nice like closure.)

But here’s where it gets really rich. See, this is great, cuz there’s no zero. But as I just said, it looks like 1%1 is 0, right? Well it isn’t. Why not? Because he says so, that’s why! Really. Here’s a verbatim quote:

An Account is balanced when Debit and Credit are equal. Such a balanced Account can be interpreted as (being in the equivalence class of) a zero but we won’t.

Yeah.

But, according to him, we don’t actually get to see these glorious benefits of no zero until we add rationals. But not just any rationals, dum-ta-da-dum-ta-da! super-rationals. Why super-rationals, instead of account rationals? I don’t know. (I’m imagining a fraction with blue tights and a red cape, flying over a building. That would be a lot more fun than this little “paper”.)

So let’s look at the glory that is super-rationals. Suppose we have two accounts, e = a%b, and f = c%d. Then a “super-rational” is a ratio like e/f.

So… we can now define arithmetic on the super-rationals:

Addition
e/f +++ g/h = ((e**h)++(g**f))/(f**h); or in other words, pretty much exactly what we normally do to add two fractions. Only now those multiplications are much more laborious.
Multiplication
e/f *** g/h = (e**g)/(f**h); again, standard rational mechanics.
Multiplication Inverse (aka Reciprocal)
`e/f = f/e (he introduces this hideous notation for no apparent reason – backquote is reciprocal. Why? I guess for the same reason that he did ++ and +++ – aka, no particularly good reason.)

So, how does this actually help anything?

It doesn’t.

See, zero is now not really properly defined anymore, and that’s what he wants to accomplish. We’ve got the simplified integer 0 (aka “balance”), defined as 1%1. We’ve got a whole universe of rational pseudo-zeros – 0/1, 0/2, 0/3, 0/4, all of which are distinct. In this system, (1%1)/(4%2) (aka 0/2) is not the same thing as (1%1)/(5%2) (aka 0/3)!

The “advantage” of this is that if you work through this stupid arithmetic, you essentially get something sort-of close to 0/0 = 0. Kind-of. (There’s no rule for converting a super-rational to an account; but assuming that if the denominator is 1, you can eliminate it, you get 1/0 = 0.)

I’m guessing that he intends identities to apply, so: (4%1)/(1%1) = ((4%1)/(2%1)) *** `((2%1)/(1%1)) = ((4%1)/(2%1)) *** ((1%1)/(2%1)) = (1%1)/(2%1). So 1/0 = 0/1 = 0… If you do the same process with 2/0, you end up getting the result being 0/2. And so on. So we’ve gotten closure over division and reciprocal by getting rid of zero, and replacing it with an infinite number of non-equal pseudo-zeros.

What’s his answer to that? Of course, more hand-waving!

Note that we also can decide to simplify a Super- Rational as we would a Rational by calculating the Greatest Common Divisor (GCD) between Numerator and Denominator (and then divide them by their GCD). There is a catch, but we leave that for further research.

The catch that he just waved away? Exactly what I just pointed out – an infinite number of pseudo-0s, unless, of course, you admit that there is a zero, in which case they all collapse down to be zero… in which case this is all pointless.

Essentially, this is all a stupidly overcomplicated way of saying something simple, but dumb: “I don’t like the fact that you can’t divide by zero, and so I want to define x/0=0.”

Why is that stupid? Because dividing by zero is undefined for a reason: it doesn’t mean anything! The nonsense of it becomes obvious when you really think about identities. If 4/2 = 2, then 2*2=4; if x/y=z, then x=z*y. But mix zero into that: if 4/0 = 0, then 0*0=4. That’s nonsense.

You can also see it by rephrasing division in English. Asking “what is four divided by two?” is asking “If I have 4 apples, and I want to distribute them into 2 equal piles, how many apples will be in each pile?”. If I say that with zero – “I want to distribute 4 apples into 0 piles; how many apples will there be in each pile?” – you’re not distributing the apples into piles. You can’t, because there are no piles to distribute them to. That’s exactly the point: you can’t divide by zero.

If you do as Mr. van Dalen did, and basically define x/0 = 0, you end up with a mess. You can handwave your way around it in a variety of ways – but they all end up breaking things. In the case of this account nonsense, you end up replacing zero with an infinite number of pseudo-zeros which aren’t equal to each other. (Or, if you define the pseudo-zeros as all being equal, then you end up with a different mess, where (2/0)/(4/0) = 2/4, or other weirdness, depending on exactly how you define things.)

The other main approach is another pile of nonsense I wrote about a while ago, called nullity. Zero is an inevitable necessity to make numbers work. You can hate the fact that division by zero is undefined all you want, but the fact is, it’s both necessary and right. Division by zero doesn’t mean anything, so mathematically, division by zero is undefined.

For every natural number N, there's a Cantor Crank C(N)

More crankery? Of course! What kind? What else? Cantor crankery!

It’s amazing that so many people are so obsessed with Cantor. Cantor just gets under peoples’ skin, because it feels wrong. How can there be more than one infinity? How can it possibly make sense?

As usual in math, it all comes down to the axioms. In most math, we’re working from a form of set theory – and the results of the axioms of set theory are quite clear: given the way that we define numbers, and the way that we define sizes, this is the way it is.

Today’s crackpot doesn’t understand this. But interestingly, the focus of his problem with Cantor isn’t the diagonalization. He thinks Cantor went wrong way before that: Cantor showed that the set of even natural numbers and the set of all natural numbers are the same size!

Unfortunately, his original piece is written in Portuguese, and I don’t speak Portuguese, so I’m going from a translation, here.

The Brazilian philosopher Olavo de Carvalho has written a philosophical “refutation” of Cantor’s theorem in his book “O Jardim das Aflições” (“The Garden of Afflictions”). Since the book has only been published in Portuguese, I’m translating the main points here. The enunciation of his thesis is:

Georg Cantor believed to have been able to refute Euclid’s fifth common notion (that the whole is greater than its parts). To achieve this, he uses the argument that the set of even numbers can be arranged in biunivocal correspondence with the set of integers, so that both sets would have the same number of elements and, thus, the part would be equal to the whole.

And his main arguments are:

It is true that if we represent the integers each by a different sign (or figure), we will have a (infinite) set of signs; and if, in that set, we wish to highlight with special signs, the numbers that represent evens, then we will have a “second” set that will be part of the first; and, being infinite, both sets will have the same number of elements, confirming Cantor’s argument. But he is confusing numbers with their mere signs, making an unjustifiable abstraction of mathematical properties that define and differentiate the numbers from each other.

The series of even numbers is composed of evens only because it is counted in twos, i.e., skipping one unit every two numbers; if that series were not counted this way, the numbers would not be considered even. It is hopeless here to appeal to the artifice of saying that Cantor is just referring to the “set” and not to the “ordered series”; for the set of even numbers would not be comprised of evens if its elements could not be ordered in twos in an increasing series that progresses by increments of 2, never of 1; and no number would be considered even if it could be freely swapped in the series of integeres.

He makes two arguments, but they both ultimately come down to: “Cantor contradicts Euclid, and his argument just can’t possibly make sense, so it must be wrong”.

The problem here is: Euclid, in “The Elements”, included several different collections of basic statements as part of his axioms. One of them was the following list of five rules, the “common notions”:

  1. Things which are equal to the same thing are also equal to one another.
  2. If equals be added to equals, the wholes are equal.
  3. If equals be subtracted from equals, the remainders are equal.
  4. Things which coincide with one another are equal to one another.
  5. The whole is greater than the part.

The problem that our subject has is that Euclid’s axiom isn’t an axiom of mathematics. Euclid proposed it, but it doesn’t work in number theory as we formulate it. When we do math, the axioms that we start with do not include this axiom of Euclid.

In fact, Euclid’s axioms aren’t what modern math considers axioms at all. These aren’t really primitive ground statements. Most of them are statements that are provable from the actual axioms of math. For example, the second and third axioms are provable using the axioms of Peano arithmetic. The fourth one doesn’t appear to be a statement about numbers at all; it’s a statement about geometry. And in modern terms, the fifth one is either a statement about geometry, or a statement about measure theory.

The first argument is based on some strange notion of signs distinct from numbers. I can’t help but wonder if this is an error in translation, because the argument is so ridiculously shallow. Basically, it concedes that Cantor is right if we’re considering the representations of numbers, but then goes on to draw a distinction between representations (“signs”) and the numbers themselves, and argues that for the numbers, the argument doesn’t work. That’s the beginning of an interesting argument: numbers and the representations of numbers are different things. It’s definitely possible to make profound mistakes by confusing the two. You can prove things about representations of numbers that aren’t true about the numbers themselves. Only he doesn’t actually bother to make an argument beyond simply asserting that Cantor’s proof only works for the representations.

That’s particularly silly because Cantor’s proof that the even naturals and the naturals have the same cardinality doesn’t talk about representation at all. It shows that there’s a 1 to 1 mapping between the even naturals and the naturals. Period. No “signs”, no representations.

The second argument is, if anything, even worse. It’s almost the rhetorical equivalent of sticking his fingers in his ears and shouting “la la la la la”. Basically – he says that when you’re producing the set of even naturals, you’re skipping things. And if you’re skipping things, those things can’t possibly be in the set that doesn’t include the skipped things. And if there are things that got skipped and left out, well, that means that it’s ridiculous to say that the set that included the left out stuff is the same size as the set that omitted the left out stuff, because, well, stuff got left out!!!

Here’s the point. Math isn’t about intuition. The properties of infinitely large sets don’t make intuitive sense. That doesn’t mean that they’re wrong. Things in math are about formal reasoning: starting with a valid inference system and a set of axioms, and then using the inference to reason. If we look at set theory, we use the axioms of ZFC. And using the axioms of ZFC, we define the size (or, technically, the cardinality) of sets. Using that definition, two sets have the same cardinality if and only if there is a one-to-one mapping between the elements of the two sets. If there is, then they’re the same size. Period. End of discussion. That’s what the math says.

Cantor showed, quite simply, that there is such a mapping:

{ i ↦ i × 2 | i ∈ ℕ }

There it is. It exists. It’s simple. It works, by the axioms of Peano arithmetic and the axiom of comprehension from ZFC. It doesn’t matter whether it fits your notion of “the whole is greater than the part”. The entire proof is that set comprehension. It exists. Therefore the two sets have the same size.
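
If you want to see that pairing written out concretely, here’s a two-line sketch (mine):

// Cantor's mapping: pair each natural i with the even natural 2*i.
val pairs = Iterator.from(0).map(i => (i, 2 * i))
// pairs.take(5).toList == List((0,0), (1,2), (2,4), (3,6), (4,8))
// Every natural appears exactly once on the left, every even natural exactly
// once on the right: a one-to-one mapping, which is the entire proof.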

Everyone should program, or Programming is Hard? Both!

I saw something on twitter a couple of days ago, and I promised to write this blog post about it. As usual, I’m behind on all the stuff I want to do, so it took longer to write than I’d originally planned.

My professional specialty is understanding how people write programs. Programming languages, development environments, code management tools, code collaboration tools, etc. – that’s my bread and butter.

So, naturally, this ticked me off.

The article starts off by, essentially, arguing that most of the programming tutorials on the web stink. I don’t entirely agree with that, but to me, it’s not important enough to argue about. But here’s where things go off the rails:

But that’s only half the problem. Victor thinks that programming itself is broken. It’s often said that in order to code well, you have to be able to “think like a computer.” To Victor, this is absurdly backwards– and it’s the real reason why programming is seen as fundamentally “hard.” Computers are human tools: why can’t we control them on our terms, using techniques that come naturally to all of us?

And… boom! My head explodes.

For some reason, so many people have this bizarre idea that programming is this really easy thing that programmers just make difficult out of spite or elitism or cluelessness or something, I’m not sure what. And as long as I’ve been in the field, there’s been a constant drumbeat from people to say that it’s all easy, that programmers just want to make it difficult by forcing you to think like a machine. That what we really need to do is just humanize programming, and it will all be easy and everyone will do it and the world will turn into a perfect computing utopia.

First, the whole “think like a machine” thing is a verbal shorthand that tries to make programming as we do it sound awful. It’s not just that programming is hard: those damned engineers are claiming that you need to dehumanize yourself to do it!

To be a programmer, you don’t need to think like a machine. But you do need to understand how machines work – because what you’re really doing when you program is building a machine!

When you’re writing a program, on a theoretical level, what you’re doing is designing a machine that performs some mechanical task. That’s really what a program is: it’s a description of a machine. And what a programming language is, at heart, is a specialized notation for describing a particular kind of machine.
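
To make that concrete, here’s a tiny sketch in Python (the turnstile, its states, and its events are my own invented example, not anything from the article). The program below is nothing more than a description of a simple machine:

```python
# A toy turnstile, described as a state machine. The program is,
# quite literally, a description of the machine.
TRANSITIONS = {
    ("locked", "coin"): "unlocked",    # paying unlocks the turnstile
    ("locked", "push"): "locked",      # pushing while locked does nothing
    ("unlocked", "push"): "locked",    # passing through re-locks it
    ("unlocked", "coin"): "unlocked",  # extra coins are wasted
}

def run(events, state="locked"):
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state

print(run(["coin", "push", "push"]))  # locked
```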

No one goes to an automotive engineer and tells him that there’s something wrong with the way transmissions are designed, because designing them forces you to understand how gears work. But that’s pretty much exactly the argument that Victor is making.

How hard is it to program? That all depends on what you’re trying to do. Here’s the thing: the complexity of the machine that you need to build is what determines the complexity of the program. If you’re trying to build a really complex machine, then a program describing it is going to be really complex.

Period. There is no way around that. That is the fundamental nature of programming.

In the usual argument, one thing that I constantly see is something along the lines of “programming isn’t plumbing: everyone should be able to do it”. And my response to that is: of course. Just like everyone should be able to do their own plumbing.

That sounds like an amazingly stupid thing to say. Especially coming from me: the one time I tried to fix my broken kitchen sink, I did over a thousand dollars’ worth of damage.

But: plumbing isn’t just one thing. It’s lots of related but different things:

  • There are people who design plumbing systems for managing water distribution and waste disposal for an entire city. That’s one aspect of plumbing. And it’s an incredibly complicated thing to do; I don’t care how smart you are, you’re not going to be able to do it well without learning a whole lot about how plumbing works.
  • Then there are people who design the plumbing for a single house. That’s plumbing, too. It’s still hard, and it requires a lot of specialized knowledge – most of which is quite different from what the city-scale designer needs.
  • There are people who don’t design plumbing, but are able to build the full plumbing system for a house from scratch using plans drawn by a designer. Once again, that’s still plumbing. But it’s yet another set of knowledge and skills.
  • There are people who can come into a house when something isn’t working and, without ever seeing the design, figure out what’s wrong and fix it. (There’s a guy in my basement right now, fixing a drainage problem that left my house without hot water, again! He needed to do a lot of work to learn how to do that, and there’s no way that I could do it myself.) That’s yet another set of skills and knowledge – and it’s still plumbing.
  • There are non-professional people who can fix leaky pipes and replace damaged bits. With a bit of work, almost anyone can learn to do it. Still plumbing. But definitely: everyone really should be able to do at least some of this.
  • And there are people like me who can use a plumbing snake and a plunger when the toilet clogs. That’s still plumbing, but it requires no experience and no training, and absolutely everyone should be able to do it, without question.

All of those things involve plumbing, but they require vastly different amounts and kinds of training and experience.

Programming is exactly the same. There are different kinds of programming, which require different kinds of skills and knowledge. The tools and training methods that we use are vastly different for those different kinds of programming – so different that for many of them, people don’t even realize that they are programming. Almost everyone who uses computers does do some amount of programming:

  • When someone puts together a presentation in PowerPoint, with things that move around, appear, and disappear on command: that is programming.
  • When someone puts a formula into a spreadsheet: that is programming. (See the sketch just after this list.)
  • When someone builds a website – even a simple one – and uses either a set of tools, or CSS and HTML, to put the site together: that is programming.
  • When someone writes a macro in Word or Excel: that is programming.
  • When someone sets up an autoresponder to answer their email while they’re on vacation: that is programming.
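
To make the spreadsheet case concrete, here’s one of those everyday programs made explicit. The formula and the numbers are my own invented example; the point is just that a conditional typed into a cell and a conditional written in Python are the same kind of thing:

```python
# A spreadsheet user who types =IF(B2 >= 50, B2 * 0.9, B2) into a cell
# has written a small conditional program. Here's the same computation,
# written out explicitly as a Python function.
def discounted_price(price):
    if price >= 50:           # the IF(...) condition
        return price * 0.9    # the "then" branch: 10% off
    return price              # the "else" branch: unchanged

print(discounted_price(80.0))  # 72.0
print(discounted_price(20.0))  # 20.0
```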

People like Victor dismiss all of those things as not really being programming, and then gripe about how all programming is supercomplexmagicalsymbolic gobbledygook. Most people do write programs without knowing it, precisely because they’re doing it with tools that present the programming task as something so natural to them that they don’t even recognize that they’re programming.

But on the other hand, the idea that you should be able to program without understanding the machine you’re using or the machine that you’re building: that’s also pretty silly.

When you get beyond the surface, and start to get to doing more complex tasks, programming – like any skill – gets a lot harder. You can’t be a plumber without understanding how pipe connections work, what the properties of the different pipe materials are, and how things flow through them. You can’t be a programmer without understanding something about the machine. The more complicated the kind of programming task you want to do, the more you need to understand.

Someone who does PowerPoint presentations doesn’t need to know very much about the computer. Someone who wants to write spreadsheet macros needs to understand something about how the computer processes numbers – for instance, what happens to errors in calculations that use floating point. Someone who wants to build an application like Word needs to know a whole lot about how a single computer works, including details like how the computer displays things to people. Someone who wants to build Google doesn’t need to know how computers render text clearly on the screen, but they do need to know how computers work, and also how networks and communications work.
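
As a tiny illustration of the kind of floating point surprise I mean (my own example, not from the article): binary floating point can’t represent 0.1 exactly, so even trivial spreadsheet-style arithmetic can go subtly wrong.

```python
import math
from decimal import Decimal

# 0.1 has no exact binary representation, so adding it ten times
# does not give exactly 1.0.
total = sum([0.1] * 10)
print(total)         # 0.9999999999999999
print(total == 1.0)  # False

# The usual fixes: compare within a tolerance, or use exact decimals.
print(math.isclose(total, 1.0))         # True
print(sum([Decimal("0.1")] * 10) == 1)  # True
```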

To be clear, I don’t think that Victor is being dishonest. But the way that he presents things often does come off as dishonest, which makes it all the worse. To give one demonstration: he presents a comparison between how we teach programming and how we teach cooking. In it, he talks about how we’d teach people to make a soufflé. He shows a picture of raw ingredients on one side and a fully baked soufflé on the other, and says, essentially: “This is how we teach people to program. We give them the raw ingredients, and say fool around with them until you get the soufflé.”

The thing is: that’s exactly how we really teach people to cook – he’s just presenting it far out of context. If we want someone to be able to prepare exactly one recipe, we give them complete, detailed, step-by-step instructions. But once they know the basics, we don’t do that anymore. We encourage them to start fooling around. “Yeah, that soufflé is great. But what would happen if I steeped some cardamom in the cream? What if I left out the vanilla? Would it turn out as good? Would that be better?” In fact, if you never do that experimentation, you’ll probably never learn to make a truly great soufflé! Because the ingredients are never exactly the same, and the way it turns out is going to depend on the vagaries of your oven, the weather, the particular batch of eggs you’re using, the amount of gluten in the flour, and so on.

Writing complicated programs is complicated. To write programs that manipulate symbolic data, you need to understand how the data symbolizes things. To write a program that manipulates numbers, you need to understand how the numbers work, and how the computer represents them. To build a machine, you need to understand the machine that you’re building. It’s that simple.