Monthly Archives: July 2006

Using Good Math to Study Evolution Using Fitness Landscapes

Via [Migrations][migrations], I’ve found out about a really beautiful computational biology paper that very elegantly demonstrates how, contrary to the [assertions of bozos like Dembski][dembski-nfl], an evolutionary process can adapt to a fitness landscape. The paper was published in the PLoS journal “Computational Biology”, and it’s titled [“Evolutionary Potential of a Duplicated Repressor-Operator Pair: Simulating Pathways Using Mutation Data”][plos].
Here’s their synopsis of the paper:
>The evolution of a new trait critically depends on the existence of a path of
>viable intermediates. Generally speaking, fitness decreasing steps in this path
>hamper evolution, whereas fitness increasing steps accelerate it.
>Unfortunately, intermediates are hard to catch in action since they occur only
>transiently, which is why they have largely been neglected in evolutionary
>studies.
>
>The novelty of this study is that intermediate phenotypes can be predicted
>using published measurements of Escherichia coli mutants. Using this approach,
>the evolution of a small genetic network is simulated by computer. Following
>the duplication of one of its components, a new protein-DNA interaction
>develops via the accumulation of point mutations and selection. The resulting
>paths reveal a high potential to obtain a new regulatory interaction, in which
>neutral drift plays an almost negligible role. This study provides a
>mechanistic rationale for why such rapid divergence can occur and under which
>minimal selective conditions. In addition it yields a quantitative prediction
>for the minimum number of essential mutations.
And one more snippet, just to show where they’re going, and to try to encourage you to make the effort to get through the paper. This isn’t an easy read, but it’s well worth the effort.
>Here we reason that many characteristics of the adaptation of real protein-DNA
>contacts are hidden in the extensive body of mutational data that has been
>accumulated over many years (e.g., [12-14] for the Escherichia coli lac
>system). These measured repression values can be used as fitness landscapes, in
>which pathways can be explored by computing consecutive rounds of single base
>pair substitutions and selection. Here we develop this approach to study the
>divergence of duplicate repressors and their binding sites. More specifically,
>we focus on the creation of a new and unique protein-DNA recognition, starting
>from two identical repressors and two identical operators. We consider
>selective conditions that favor the evolution toward independent regulation.
>Interestingly, such regulatory divergence is inherently a coevolutionary
>process, where repressors and operators must be optimized in a coordinated
>fashion.
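The procedure described in that quote is simple enough to caricature in a few lines of code. Here’s a toy sketch (mine, not the authors’; their simulations use measured repression values from the real E. coli lac system as the fitness landscape, while this uses a made-up stand-in): each round, try every single-base substitution, and let selection keep the fittest variant.

    import random

    BASES = "ACGT"

    def fitness(seq):
        # Stand-in for the measured repression values the paper uses:
        # here, fitness is just similarity to an arbitrary target operator site.
        target = "ATCGATCG"
        return sum(a == b for a, b in zip(seq, target))

    def one_round(seq):
        # All single base-pair substitutions of seq, plus seq itself.
        variants = [seq]
        for i in range(len(seq)):
            for b in BASES:
                if b != seq[i]:
                    variants.append(seq[:i] + b + seq[i + 1:])
        # Selection: keep the fittest variant.
        return max(variants, key=fitness)

    seq = "".join(random.choice(BASES) for _ in range(8))
    for generation in range(10):
        seq = one_round(seq)
    print(seq, fitness(seq))

The interesting questions in the paper are exactly the ones this cartoon leaves out: what the real, measured landscape looks like, whether viable intermediates exist along the way, and how much neutral drift matters.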
This is a gorgeous paper, and it shows how to do *good* math in the area of search-based modeling of evolution. Instead of the empty refrain of “it can’t work”, this paper presents a real model of a process, shows what it can do, and *makes predictions* that can be empirically verified to match observations. This, folks, is how it *should* be done.
[migrations]: http://migration.wordpress.com/2006/07/12/duplication_and_coevolutionary_modeling/
[dembski-nfl]: http://scienceblogs.com/goodmath/2006/06/dembski_and_no_free_lunch_with_2.php
[plos]: http://compbiol.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pcbi.0020058

GM/BM Friday: Pathological Programming Languages

In real life, I’m not a mathematician; I’m a computer scientist. Still a math geek, mind you, but what I really do is very much in the realm of applied math, researching how to build systems to help people program.
One of my pathological obsessions is programming languages. Since I first got exposed to TRS-80 Model 1 BASIC back in middle school, I’ve been absolutely nuts about programming languages. Last time I counted, I’d learned about 130 different languages; and I’ve picked up more since then. I’ve written programs in most of them. Like I said, I’m nuts.
Anyway, I decided that it would be amusing to inflict my obsession on you, my readers, with a new feature: the Friday pathological programming language. You see, there are plenty of *crazy* people out there; and many of them like to invent programming languages. Some very small number of them try to design good languages and succeed; a much larger number try to design good languages and fail; and *then* there are the folks who design the languages I’m going to talk about. They’re the ones who set out to design *bizarre* programming languages, and succeed brilliantly. They call them “esoteric” programming languages. I call them evil.
Today, the beautiful grand-daddy of the esoteric language family: the one, the only, the truly and deservedly infamous: [Brainfuck!][bf], designed by Urban Müller. (There are a number of different implementations available; just follow the link.)
Only 8 commands – including input and output – all written using symbols. And yet Turing complete; and not just Turing complete, but actually based on a *real* [formal theoretical design][pprimeprime]. And it’s even been implemented [*in hardware*!][bf-hard]
BrainFuck is based on something very much like a twisted cross between a [Turing machine][turing] and a [Minsky machine][minsky]. It’s got the idea of a tape of storage cells, like the Turing machine. But unlike the Turing machine, each cell of the tape stores a number, which can be incremented or decremented, like a Minsky machine. And like a Minsky machine, the only control flow is a test for zero.
The 8 instructions:
1. **>**: move the tape head one cell forward.
2. **<**: move the tape head one cell backward.
3. **+**: increment the value in the current tape cell.
4. **-**: decrement the value in the current tape cell.
5. **.**: output the value in the current tape cell as a character.
6. **,**: read one character of input, and store its value in the current tape cell.
7. **[**: if the current tape cell is zero, jump forward to the instruction after the matching **]**.
8. **]**: jump back to the matching **[**, so the loop repeats until the current cell becomes zero.

So, for example, here’s a hello-world program in BF:

    ++++++++[>+++++++++<-]>.<+++++[>++++++<-]>-.+++++++..+++.
    <++++++++[>>++++<<-]>>.<<++++++++[>---<-]>.<++++++++[>+++<-]>.+++.------.--------.>+.
Let’s pull that apart just a bit so that we can hope to understand.
* “++++++++”: store the number “8” in the current tape cell. We’re going to use that as a loop index, so the loop is going to repeat 8 times.
* “[>+++++++++<-]>.”: each pass through the loop adds 9 to the next cell and decrements the loop index; after 8 passes that cell holds 72, and when the index hits zero we step forward and output it. 72 as a character is “H”.
* “<+++++[>++++++<-]>-.”: reload the index with 5, add 6 to the value cell on each pass (another 30, giving 102), then subtract one and output. That’s 101, or “e”.
Continues in pretty much the same vein, using a couple of tape cells, and running loops to generate the values of the characters. Beautiful, eh?
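If you want to play with it yourself, an interpreter takes only a few minutes to write. Here’s a minimal sketch in Python (mine, not Müller’s; a serious implementation would at least do proper error handling and smarter end-of-input behavior), run on the hello-world program from above:

    import sys

    def brainfuck(program, tape_size=30000):
        tape = [0] * tape_size
        pointer = 0
        # Pre-compute the matching brackets for the two jump instructions.
        jumps, stack = {}, []
        for i, ch in enumerate(program):
            if ch == '[':
                stack.append(i)
            elif ch == ']':
                j = stack.pop()
                jumps[i], jumps[j] = j, i
        pc = 0
        while pc < len(program):
            ch = program[pc]
            if ch == '>':
                pointer += 1
            elif ch == '<':
                pointer -= 1
            elif ch == '+':
                tape[pointer] = (tape[pointer] + 1) % 256
            elif ch == '-':
                tape[pointer] = (tape[pointer] - 1) % 256
            elif ch == '.':
                sys.stdout.write(chr(tape[pointer]))
            elif ch == ',':
                data = sys.stdin.read(1)
                tape[pointer] = ord(data) if data else 0
            elif ch == '[' and tape[pointer] == 0:
                pc = jumps[pc]      # skip the loop body
            elif ch == ']' and tape[pointer] != 0:
                pc = jumps[pc]      # back to the top of the loop
            # Any character that isn't one of the 8 commands simply falls
            # through and is ignored: that's how BF "comments" work.
            pc += 1

    brainfuck("++++++++[>+++++++++<-]>.<+++++[>++++++<-]>-.+++++++..+++."
              "<++++++++[>>++++<<-]>>.<<++++++++[>---<-]>."
              "<++++++++[>+++<-]>.+++.------.--------.>+.")

Running it prints “Hello World!”.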
If that didn’t seem impressive enough, [here][bf-fib] is a really gorgeous implementation of a Fibonacci sequence generator, with documentation. The BF compiler used to write this ignores any character other than the 8 commands, so the comments don’t need to be marked in any way; they just need to be really careful not to use punctuation.

+++++++++++ number of digits to output
> #1
+ initial number
>>>> #5
++++++++++++++++++++++++++++++++++++++++++++ (comma)
> #6
++++++++++++++++++++++++++++++++ (space)
<<<<< #1
copy #1 to #7
[>>>>>>+>+<<<<<<>>>>>>[<<<<<<>>>>>>-]

++++++++++  set the divisor #8
[
subtract from the dividend and divisor
->+>+<<>>[<<>>-]
set #10
+
if #9 clear #10
[-][<>>+<<>[-]]
jump back to #8 (divisor position)
<>> #11
copy to #13
[>>+>+<<>>[<<>>-]
set #14
+
if #13 clear #14
[-][<>[-]]
<<<<<<>>>> #12
if #12 output value plus offset to ascii 0
[++++++++++++++++++++++++++++++++++++++++++++++++.[-]]
subtract #11 from 10
++++++++++  #12 is now 10
- #12
output #12 even if it's zero
++++++++++++++++++++++++++++++++++++++++++++++++.[-]
<<<<<<<<<<< #1
check for final number
copy #0 to #3
>>+>+<<<>>>[<<<>>>-]
>.>.<<<[-]]
<>+>+<<>>[<<>>-]<<[-]>[-]<<<-
]

[bf]: http://www.muppetlabs.com/~breadbox/bf/
[bf-fib]: http://esoteric.sange.fi/brainfuck/bf-source/prog/fibonacci.txt
[turing]: http://goodmath.blogspot.com/2006/03/playing-with-mathematical-machines.html
[minsky]: http://goodmath.blogspot.com/2006/05/minsky-machine.html
[bf-hard]: http://www.robos.org/?bfcomp
[pprimeprime]: http://en.wikipedia.org/wiki/P_prime_prime

Why I Hate Religious Bayesians

Last night, a reader sent me a link to yet another wretched attempt to argue for the existence of God using Bayesian probability. I really hate that. Over the years, I’ve learned to dread Bayesian arguments, because so many of them are things like this, where someone cobbles together a pile of nonsense, dressing it up with a gloss of mathematics by using Bayesian methods. Of course, it’s always based on nonsense data; but even in the face of a lack of data, you can cobble together a Bayesian argument by pretending to analyze things in order to come up with estimates.

You know, if you want to believe in God, go ahead. Religion is ultimately a matter of personal faith and spirituality. Arguments about the existence of God always ultimately come down to that. Why is there this obsessive need to justify your beliefs? Why must science and mathematics be continually misused in order to prop up your belief?

Anyway… Enough of my whining. Let’s get to the article. It’s by a guy named Robin Collins, and it’s called “God, Design, and Fine-Tuning”.

Let’s start right with the beginning.

>Suppose we went on a mission to Mars, and found a domed structure in which everything was set up just right for life to exist. The temperature, for example, was set around 70°F and the humidity was at 50%; moreover, there was an oxygen recycling system, an energy gathering system, and a whole system for the production of food. Put simply, the domed structure appeared to be a fully functioning biosphere. What conclusion would we draw from finding this structure? Would we draw the conclusion that it just happened to form by chance? Certainly not. Instead, we would unanimously conclude that it was designed by some intelligent being. Why would we draw this conclusion? Because an intelligent designer appears to be the only plausible explanation for the existence of the structure. That is, the only alternative explanation we can think of–that the structure was formed by some natural process–seems extremely unlikely. Of course, it is possible that, for example, through some volcanic eruption various metals and other compounds could have formed, and then separated out in just the right way to produce the “biosphere,” but such a scenario strikes us as extraordinarily unlikely, thus making this alternative explanation unbelievable.

>The universe is analogous to such a “biosphere,” according to recent findings in physics. Almost everything about the basic structure of the universe–for example, the fundamental laws and parameters of physics and the initial distribution of matter and energy–is balanced on a razor’s edge for life to occur. As eminent Princeton physicist Freeman Dyson notes, “There are many . . .lucky accidents in physics. Without such accidents, water could not exist as liquid, chains of carbon atoms could not form complex organic molecules, and hydrogen atoms could not form breakable bridges between molecules” (1979, p.251)–in short, life as we know it would be impossible.

Yes, it’s the good old ID argument about “It looks designed, so it must be”. That’s the basic argument all the way through; they just dress it up later. And as usual, it’s wrapped up in one incredibly important assumption, which they cannot and do not address: that we understand what it would mean to change the fundamental structure of the universe.

What would it mean to change, say, the ratio of the strengths of the electromagnetic force and gravity? What would matter look like if we did? Would stars be able to exist? Would matter be able to form itself into the kinds of complex structures necessary for life?

We don’t know. In fact, we don’t even really have a clue. And not knowing that, we cannot meaningfully make any argument about how likely it is for the universe to support life.

They do pretend to address this:

>Various calculations show that the strength of each of the forces of nature must fall into a very small life-permitting region for intelligent life to exist. As our first example, consider gravity. If we increased the strength of gravity on earth a billionfold, for instance, the force of gravity would be so great that any land-based organism anywhere near the size of human beings would be crushed. (The strength of materials depends on the electromagnetic force via the fine-structure constant, which would not be affected by a change in gravity.) As astrophysicist Martin Rees notes, “In an imaginary strong gravity world, even insects would need thick legs to support them, and no animals could get much larger.” (Rees, 2000, p. 30). Now, the above argument assumes that the size of the planet on which life formed would be an earth-sized planet. Could life forms of comparable intelligence to ourselves develop on a much smaller planet in such a strong-gravity world? The answer is no. A planet with a gravitational pull of a thousand times that of earth — which would make the existence of organisms of our size very improbable– would have a diameter of about 40 feet or 12 meters, once again not large enough to sustain the sort of large-scale ecosystem necessary for organisms like us to evolve. Of course, a billion-fold increase in the strength of gravity is a lot, but compared to the total range of strengths of the forces in nature (which span a range of 10^40 as we saw above), this still amounts to a fine-tuning of one part in 10^31. (Indeed, other calculations show that stars with life-times of more than a billion years, as compared to our sun’s life-time of ten billion years, could not exist if gravity were increased by more than a factor of 3000. This would have significant intelligent life-inhibiting consequences.) (3)

Does this really address the problem? No. How would matter be different if gravity were a billion times stronger, and EM didn’t change? We don’t know. For the sake of this argument, they pretend that mucking about with those ratios wouldn’t alter the nature of matter at all. That’s what they’re going to build their argument on: the universe must support life exactly like us: it’s got to be carbon-based life on a planetary surface that behaves exactly like matter does in our universe. In other words: if you assume that everything has to be exactly as it is in our universe, then only our universe is suitable.

They babble on about this for quite some time; let’s skip forwards a bit, to where they actually get to the Bayesian stuff. What they want to do is use the likelihood principle to argue for design. (Of course, they need to obfuscate, so they cite it under three different names, and finally use the term “the prime principle of confirmation” – after all, it sounds much more convincing than “the likelihood principle”!)

The likelihood principle is a variant of Bayes’ theorem, applied to experimental systems. The basic idea of it is to take the Bayesian principle of modifying an event probability based on a prior observation, and to apply it backwards to allow you to reason about the probability of two possible priors given a final observation. In other words, take the usual Bayesian approach of asking: “Given that Y has already occurred, what’s the probability of X occurring?”; turn it around, and say “X occurred. For it to have occurred, either Y or Z must have occurred as a prior. Given X, what are the relative probabilities for Y and Z as priors?”
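To make that concrete (the formula below is just the standard odds form of Bayes’ theorem, not anything from Collins’s paper): if E is the evidence and H1 and H2 are the competing hypotheses, then

    \frac{P(H_1 \mid E)}{P(H_2 \mid E)} \;=\; \frac{P(H_1)}{P(H_2)} \times \frac{P(E \mid H_1)}{P(E \mid H_2)}

The likelihood principle, Collins’s “prime principle of confirmation”, is only about the last factor: E counts as evidence favoring H1 over H2 when P(E|H1) > P(E|H2), and the ratio of the two likelihoods measures how strongly. Note that the principle by itself says nothing about the priors P(H1) and P(H2), which is why, even on its own terms, this kind of argument can only ever claim “support,” not probability.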

There is some controversy over when the likelihood principle is applicable. But let’s ignore that for now.

>To further develop the core version of the fine-tuning argument, we will summarize the argument by explicitly listing its two premises and its conclusion:
>
>Premise 1. The existence of the fine-tuning is not improbable under theism.
>
>Premise 2. The existence of the fine-tuning is very improbable under the atheistic single-universe hypothesis. (8)
>
>Conclusion: From premises (1) and (2) and the prime principle of confirmation, it follows that the fine-tuning data provides strong evidence to favor of the design hypothesis over the atheistic single-universe hypothesis.
>
>At this point, we should pause to note two features of this argument. First, the argument does not say that the fine-tuning evidence proves that the universe was designed, or even that it is likely that the universe was designed. Indeed, of itself it does not even show that we are epistemically warranted in believing in theism over the atheistic single-universe hypothesis. In order to justify these sorts of claims, we would have to look at the full range of evidence both for and against the design hypothesis, something we are not doing in this paper. Rather, the argument merely concludes that the fine-tuning strongly supports theism over the atheistic single-universe hypothesis.

That’s pretty much their entire argument. That’s as mathematical as it gets. Doesn’t stop them from arguing that they’ve mathematically demonstrated that theism is a better hypothesis than atheism, but that’s really their whole argument.

Here’s how they argue for their premises:

>Support for Premise (1).
>
>Premise (1) is easy to support and fairly uncontroversial. The argument in support of it can be simply stated as follows: since God is an all good being, and it is good for intelligent, conscious beings to exist, it not surprising or improbable that God would create a world that could support intelligent life. Thus, the fine-tuning is not improbable under theism, as premise (1) asserts.

Classic creationist gibberish: pretty much the same stunt that Swinburne pulled. They pretend that there are only two possibilities. Either (a) there’s exactly one God which has exactly the properties that Christianity attributes to it; or (b) there are no gods of any kind.

They’ve got to stick to that – because if they admitted more than two possibilities, they’d have to actually consider why their deity is more likely than any of the other possibilities. They can’t come up with an argument that Christianity is better than atheism if they acknowledge that there are thousands of possibilities as likely as theirs.

>Support for Premise (2).
>
>Upon looking at the data, many people find it very obvious that the fine-tuning is highly improbable under the atheistic single-universe hypothesis. And it is easy to see why when we think of the fine-tuning in terms of the analogies offered earlier. In the dart-board analogy, for example, the initial conditions of the universe and the fundamental constants of physics can be thought of as a dart-board that fills the whole galaxy, and the conditions necessary for life to exist as a small one-foot wide target. Accordingly, from this analogy it seems obvious that it would be highly improbable for the fine-tuning to occur under the atheistic single-universe hypothesis–that is, for the dart to hit the board by chance.

Yeah, that’s pretty much it. The whole argument for why fine-tuning is less probable in a universe without a deity than in a universe with one. Because “many people find it obvious”, and because they’ve got a clever dartboard analogy.

They make a sort of token effort to address the obvious problems with this, but they’re really all nothing but more empty hand-waving. I’ll just quote one of them as an example; you can follow the link to the article to see the others if you feel like giving yourself a headache.

>Another objection people commonly raise against the fine-tuning argument is that as far as we know, other forms of life could exist even if the constants of physics were different. So, it is claimed, the fine-tuning argument ends up presupposing that all forms of intelligent life must be like us. One answer to this objection is that many cases of fine-tuning do not make this presupposition. Consider, for instance, the cosmological constant. If the cosmological constant were much larger than it is, matter would disperse so rapidly that no planets, and indeed no stars could exist. Without stars, however, there would exist no stable energy sources for complex material systems of any sort to evolve. So, all the fine-tuning argument presupposes in this case is that the evolution of life forms of comparable intelligence to ourselves requires some stable energy source. This is certainly a very reasonable assumption.
>
>Of course, if the laws and constants of nature were changed enough, other forms of embodied intelligent life might be able to exist of which we cannot even conceive. But this is irrelevant to the fine-tuning argument since the judgement of improbability of fine-tuning under the atheistic single-universe hypothesis only requires that, given our current laws of nature, the life-permitting range for the values of the constants of physics (such as gravity) is small compared to the surrounding range of non-life-permitting values.

Like I said at the beginning: the argument comes down to a hand-wave that if the universe didn’t turn out exactly like ours, it must be no good. Why does a lack of hydrogen fusion stars like we have in our universe imply that there can be no other stable energy source? Why is it reasonable to constrain the life-permitting properties of the universe to be narrow based on the observed properties of the laws of nature as observed in our universe?

Their argument? Just because.

Protecting the Homeland: the Terrorists' Target List

Longtime readers of GM/BM will remember [this post][homeland], where I discussed the formula used by the Department of Homeland Security for allocating anti-terrorism funds. At the time, I explained:
>It turns out that the allocation method was remarkably simple. In their
>applications for funding, cities listed assets that they needed to protect.
>What DHS did was take the number of listed assets from all of the cities that
>were going to be recipients of funds, and give each city an amount of funding
>proportional to the number of assets they listed.
>
>So, the Empire State building is equal to the neighborhood bank in Omaha. The
>stock exchange on Wall Street is equal to the memorial park in Anchorage,
>Alaska. Mount Sinai hospital is equal to the county hospital in the suburbs of
>Toledo, Ohio. The New York subway system (18.5 billion passenger-miles per
>year) is equal to the Minneapolis transit system (283 million passenger-miles
>per year). The Brooklyn Bridge is equal to the George Street bridge in New
>Brunswick, NJ.
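Just to drive home how crude that is: the whole allocation scheme, as described, is about four lines of arithmetic. (This is my reconstruction of the method described above, with made-up numbers, not anything from DHS.)

    def allocate(total_funds, assets_listed):
        # Each city's share is proportional to the *count* of assets it listed,
        # with no weighting at all for what those assets actually are.
        total_assets = sum(assets_listed.values())
        return {city: total_funds * count / total_assets
                for city, count in assets_listed.items()}

    # Purely hypothetical numbers, just to show the shape of the formula.
    print(allocate(100000000, {"City A": 120, "City B": 80}))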
Well, according to the [New York Times][nyt] (login required), it appears that I gave *too much credit* to the DHS. They weren’t saying that, for example, Wall Street was equivalent to the memorial park in Anchorage. What they were saying is that the Wall Street stock exchange is equivalent to the Mule Day Parade in Columbia, Tennessee; Mt. Sinai hospital is equivalent to an unnamed donut shop; the Macy’s Thanksgiving parade is equivalent to the Bean Fest in Mountain View, Arkansas.
Questioned about the foolishness of this insane list, a DHS spokesperson responded “We don’t find it embarrassing, the list is a valuable tool.”
Don’t you feel safer now that you know how the government is using what they keep stressing is *your money* to protect you?
[homeland]: http://goodmath.blogspot.com/2006/06/astoundingly-stupid-math-bullshit-of_02.html
[nyt]: http://www.nytimes.com/2006/07/12/washington/12assets.html?_r=1&oref=login

Monads and Programming Languages

One of the questions that a ton of people sent me when I said I was going to write about category theory was “Oh, good, can you please explain what the heck a monad is?”

The short version is: a monad is a functor from a category to itself, equipped with a bit of extra structure (we’ll get to the formal definition below). The way that this works in a programming language is that you can view many things in programming languages in terms of monads. In particular, you can take things that involve mutable state, and magically hide the state.

How? Well – the state (the set of bindings of variables to values) is an object in a category, State. The monad is a functor from State → State. Since the functor is a functor from a category to itself, the value of the state is implicit – they’re the object at the start and end points of the functor. From the viewpoint of code outside of the monad functor, the states are indistinguishable – they’re just something in the category. For the functor itself, the value of the state is accessible.

So, in a language like Haskell with a State monad, you can write functions inside the State monad; and they are strictly functions from State to State; or you can write functions outside the state monad, in which case the value inside the state is completely inaccessible. Let’s take a quick look at an example of this in Haskell. (This example came from an excellent online tutorial which, sadly, is no longer available.)

Here’s a quick declaration of a State monad in Haskell:

class MonadState m s | m -> s where
  get :: m s
  put :: s -> m ()

instance MonadState (State s) s where
  get   = State $ \s -> (s,s)
  put s = State $ \_ -> ((),s)

This is Haskell syntax saying we’re defining a state as an object which stores one value. It has two functions: get, which retrieves the value from a state; and put, which updates the value hidden inside the state.

Now, remember that Haskell has no actual assignment statement: it’s a pure functional language. So what “put” actually does is create a new state with the new value in it.

How can we use it? We can only access the state from a function that’s inside the monad. In the example, they use it for a random number generator; the state stores the value of the last random number generated, which will be used as a seed for the next. Here we go:

getAny :: (Random a) => State StdGen a
getAny = do g <- get
            (x,g') <- return $ random g
            put g'
            return x

Now – remember that the only functions that exist *inside* the monad are "get" and "put". "do" is syntactic sugar for inserting a sequence of statements into a monad. What actually happens inside of a do is that *each expression* in the sequence is a functor from State to State; each expression takes as an input parameter the output from the previous one. "getAny" takes a state monad as an input; and then it implicitly passes the state from expression to expression.

"return" is the only way *out* of the monad; it basically says "evaluate this expression outside of the monad". So, "return $ randomR bounds g" is saying, roughly, "evaluate randomR bounds g" outside of the monad; then apply the monad constructor to the result. The return is necessary there because the full expression on the line *must* take and return an instance of the monad; if we just say "(x,g') <- randomR bounds g", we'd get an error, because we're inside of a monad construct: the monad object is going be be inserted as an implicit parameter, unless we prevent it using "return". But the resulting value has to be injected back into the monad – thus the "$", which is a composition operator. (It's basically the categorical º). Finally, "return x" is saying "evaluate "x" outside of the monad – without the "return", it would treat "x" as a functor on the monad.

The really important thing here is to recognize that each line inside of the "do" is a functor from State → State; and since the start and end points of the functor are implicit in the structure of the functor itself, you don't need to write it. So the state is passed down the sequence of instructions – each of which maps State back to State.
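If the Haskell is obscuring the idea, here's the same plumbing spelled out very explicitly in Python (my illustration, not part of the original example): a "stateful computation" is just a function from a state to a (value, new state) pair, and the glue function bind is what threads the state from each step to the next – which is exactly the work that "do" is quietly doing for you.

    import random

    # A stateful computation is a function: state -> (value, new_state).

    def unit(value):
        # Analogue of Haskell's return: wrap a plain value, pass the state through.
        return lambda state: (value, state)

    def bind(step, make_next_step):
        # Run one step, then feed its value and its output state to the next step.
        def combined(state):
            value, new_state = step(state)
            return make_next_step(value)(new_state)
        return combined

    def get(state):
        return (state, state)

    def put(new_state):
        return lambda state: (None, new_state)

    # get_any: read the seed, generate a number from it, store a new seed.
    # (Bumping the seed by one is a crude stand-in for Haskell's StdGen handling.)
    get_any = bind(get, lambda seed:
               bind(unit(random.Random(seed).random()), lambda x:
                bind(put(seed + 1), lambda _:
                 unit(x))))

    print(get_any(42))   # -> (some number, 43): the state was threaded implicitly

Every step here is a function from a state to a value plus a new state, and bind composes them so that the state never has to be mentioned in between – that's the whole trick.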

Let's get to the formal part of what a monad is. There's a bit of funny notation we need to define for it. (You can't do anything in category theory without that never-ending stream of definitions!)

  1. Given a category C, 1_C is the *identity functor* from C to C.
  2. For a category C, if T is a functor C → C, then T² is TºT (and so on for higher powers: T³ = TºTºT, etc.).
  3. For a given functor T, the identity natural transformation from T to T is written 1_T.

Suppose we have a category, C. A *monad on C* is a triple (T,η,μ), where T is a functor from C → C, and η and μ are natural transformations; η: 1_C → T, and μ: (TºT) → T. (1_C is the identity functor for C in the category of categories.) These must have the following properties:

First, μ º Tμ = μ º μT: starting from TºTºT, it doesn’t matter whether you flatten the inner pair or the outer pair first; both routes give the same transformation down to T. (This is the associativity condition.)

Second, μ º Tη = μ º ηT = 1_T: using η to insert an extra T on either side and then flattening with μ gets you right back where you started. (These are the unit conditions.)

Basically, what these really come down to is an associativity property ensuring that T behaves properly over composition, and an identity transformation that behaves as we would expect. These two properties together mean that any order of applications of T will behave properly, preserving the structure of the category underlying the monad.

PEAR yet again: the theory behind paranormal gibberish (repost from blogger)

This is a repost from GM/BMs old home; the original article appeared
[here][old]. I’m reposting because someone is attempting to respond to this
article, and I’d rather keep all of the ongoing discussions in one place. I also
think it’s a pretty good article, which some of the newer readers here may not
have seen. As usual for my reposts, I’ve fixed the formatting and made a few
minor changes. This article was originally posted on May 29.
I’ve been looking at PEAR again. I know it may seem sort of like beating a dead
horse, but PEAR is, I think, something special in its way: it’s a group of
people who pretend to use science and mathematics in order to support all sorts
of altie-woo gibberish. This makes them, to me, particularly important targets
for skeptics: if they were legit, and they were getting the kinds of results
that they present, they’d be demonstrating something fascinating and important.
But they’re not: they’re trying to use the appearance of science to undermine
science. And they’re incredibly popular among various kinds of crackpottery:
what led me back to them this time is the fact that I found them cited as a
supporting reference in numerous places:
1. Two different “UFOlogy” websites;
2. Eric Julien’s dream-prophecy of a disastrous comet impact on earth (which was supposed to have happened back in May; he’s since taken credit for *averting* said comet strike by raising consciousness);
3. Three different websites where psychics take money in exchange for psychic predictions or psychic healing;
4. Two homeopathy information sites;
5. The house of thoth, a general clearinghouse site for everything wacky.
Anyway, while looking at the stuff that all of these wacko sites cited from
PEAR, I came across some PEAR work which isn’t just a rehash of the random
number generator nonsense, but instead an attempt to define, in mathematical
terms, what “paranormal” events are, and what they mean.
It’s quite different from their other junk; and it’s a really great example of
one of the common ways that pseudo-scientists misuse math. The paper is called
“M* : Vector Representation of the Subliminal Seed Regime of M5“, and you can
find it [here][pear-thoth].
The abstract gives you a pretty good idea of what’s coming:
>A supplement to the M5 model of mind/matter interactions is proposed
>wherein the subliminal seed space that undergirds tangible reality and
>conscious experience is characterized by an array of complex vectors whose
>components embody the pre-objective and pre-subjective aspects of their
>interactions. Elementary algebraic arguments then predict that the degree of
>anomalous correlation between the emergent conscious experiences and the
>corresponding tangible events depends only on the alignment of these
>interacting vectors, i. e., on the correspondence of the ratios of their
>individual ”hard” and ”soft” coordinates. This in turn suggests a
>subconscious alignment strategy based on strong need, desire, or shared purpose
>that is consistent with empirical experience. More sophisticated versions of
>the model could readily be pursued, but the essence of the correlation process
>seems rudimentary.
So, if we strip out the obfuscation, what does this actually say?
Umm… “*babble babble* complex vectors *babble babble babble* algebra *babble babble* ratios *babble babble* correlation *babble babble*.”
Seriously: that’s a pretty good paraphrase. That entire paragraph is *meaningless*. It’s a bunch of nonsense mixed in with a couple of pseudo-mathematical terms in order to make it sound scientific. There is *no* actual content in that abstract. It reads like a computer-generated paper from
[SCIgen][scigen] .
(For contrast, here’s a SCIgen-generated abstract: “The simulation of randomized algorithms has deployed model checking, and current trends suggest that the evaluation of SMPs will soon emerge. In fact, few statisticians would disagree with the refinement of Byzantine fault tolerance. We confirm that although multicast systems [16] can be made homogeneous, omniscient, and autonomous, the acclaimed low-energy algorithm for the improvement of DHCP [34] is recursively enumerable.”)
Ok, so the abstract is the pits. To be honest, a *lot* of decent technical papers have really lousy abstracts. So let’s dive in, and look at the actual body of the paper, and see if it improves at all.
They start by trying to explain just what their basic conceptual model is. According to the authors, the world is fundamentally built on consciousness; most events start in a pre-conscious realm of ideas called the “*seed region*”; and as they emerge from the seed region into experienced reality, they manifest in two different ways: as “events” in the material domain, and as “experiences” or “perceptions” in the mental domain. They then claim that in order for something from the seed region to manifest, it requires an interaction of at least two seeds.
Now, they try to start using pseudo-math to justify their gibberish.
Suppose we have two of these seed beasties, S1 and S2. Now, suppose we have a mathematical representation of them as “vectors”; they write the vector for a seed S as [S].
A “normal” event, according to them, is one where the events combine in what they call a “linear” way (scare-quotes theirs): [S1] + [S2] = [S1 + S2]. On the other hand, events that are perceived as anomalous are events for which that’s not true: [S1] + [S2] ≠ [S1 + S2].
We’re already well into the land of pretend mathematics here. We have two non-quantifiable “seeds”; but we can add them together… We’re pulling group-theory type concepts and notations, and applying them to things that absolutely do not have any of the prerequisites for those concepts to be meaningful.
But let’s skip past that for a moment, because it gets infinitely sillier shortly.
They draw a cartesian graph with four quadrants, and label them (going clockwise from the first quadrant): T (for tangible), I (for intangible – aka, not observable in tangible reality), U (for unconscious), and C (conscious). So the upper-half is what they consider to be observable, and the bottom half is non-observable; and the left side is mind and the right side is matter. Further, they have a notion of “hard” and “soft”; objective is hard, and subjective is soft. They proceed to give a list of ridiculous pairs of words which they claim are different ways of expressing the fundamental “hard/soft” distinction, including “masculine/feminine”, “particulate/wavelike”, “words/music”, and “yang/yin”.
Once they’ve gotten here, they get to my all-time favorite PEAR statement; one which is actually astonishingly obvious about what they’re really up to:
>It is then presumed that if we appropriate and pursue some established
>mathematical formalism for representing such components and their interactions,
>the analytical results may retain some metaphoric relevance for the emergence
>of anomalous mind/matter manifestations.
I love the amount of hedging involved in that sentence! And the admission that
they’re just “appropriating” a mathematical formalism for no other purpose than
to “retain some metaphoric relevance”. I think that an honest translation of
that sentence into non-obfuscatory english is: “If we wrap this all up in
mathematical symbols, we can make it look as if this might be real science”.
So, they then proceed to say that they can represent the seeds as complex numbers: S = s + iσ. But “s” and “sigma” can’t just be simply “pre-material” and “pre-mental”, because that would be too simple. Instead, they’re “hard” and “soft”; even though we’ve just gone through the definition which categorized hard/soft as a better characterization of material and mental. Oh, and they have to make sure that this looks sufficiently mathematical, so instead of just saying that it’s a complex number, they present it in *both* rectangular and polar coordinates, with the equation for converting between the two notations written out inside the same definition area. No good reason for that, other than to have something more impressive looking.
Then they want to define how these “seeds” can propagate up from the very lowest reaches of their non-observable region into actual observable events, and for no particular reason, they decide to use the conjugate product equation randomly selected from quantum physics. So they take a random pair of seeds (remember that they claim that events proceed from a combination of at least two seeds), and add them up. They claim that the combined seed is just the normal vector addition (which they proceed to expand in the most complex looking way possible); and they also take the “conjugate products” and add them up (again in the most verbose and obfuscatory way possible); and then take the difference between the two sums. At this point, they reveal that for some reason, they think that the simple vector addition corresponds to “[S1] + [S2]” from earlier; and the conjugate is “[S1+S2]”. No reason for this correspondence is given; no reason for why these should be equal for “non-anomalous” events; it’s just obviously the right thing to do according to them. And then, of course, they repeat the whole thing in polar notation.
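As far as I can decode it, the “non-linearity” they’re so excited about is nothing more than the following bit of first-semester complex arithmetic. This is my reconstruction of what their two sums amount to, with S_k = s_k + iσ_k:

    |S_1 + S_2|^2 \;=\; |S_1|^2 + |S_2|^2 + 2\,(s_1 s_2 + \sigma_1 \sigma_2)

The cross term 2(s1s2 + σ1σ2) is as large as possible when the two vectors point in the same direction, and vanishes when they’re orthogonal, which is all that their later talk about “alignment” amounts to.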
It just keeps going like this: randomly pulling equations out of a hat for no particular reason, using them in bizarrely verbose and drawn out forms, repeating things in different ways for no reason. After babbling onwards about these sums, they say that “Also to be questioned is whether other interaction recipes beyond the simple addition S1,2 = S1 + S2 could profitably be explored.”; they suggest multiplication; but decide against it just because it doesn’t produce the results that they want. Seriously! In their words “but we show that this doesn’t generate similar non-linearities”: that is, they want to see “non-linearities” in the randomly assembled equations, and since multiplying doesn’t have that, it’s no good to them.
Finally, we’re winding down and getting to the end: the “summary”. (I was taught that when you write a technical paper, the summary or conclusion section should be short and sweet. For them, it’s two full pages of tight text.) They proceed to restate things, complete with repeating the gibberish equations in yet another, slightly different form. And then they really piss me off. Statement six of their summary says “Elementary complex algebra then predicts babble babble babble”. Elementary complex algebra “predicts” no such thing. There is no real algebra here, and nothing about algebra would remotely suggest anything like what they’re claiming. It’s just that this is a key step in their reasoning chain, and they absolutely cannot support it in any meaningful way. So they mask it up in pseudo-mathematical babble, and claim that the mathematics provides the link that they want, even though it doesn’t. They’re trying to use the credibility and robustness of mathematics to keep their nonsense above water, even though there’s nothing remotely mathematical about it.
They keep going with the nonsense math: they claim that the key to larger anomalous effects resides in “better alignment” of the interacting seed vectors (because the closer the two vectors are, in their framework, the larger the discrepancy between their two ways of “adding” vectors); and that alignments are driven by “personal need or desire”. And it goes downhill from there.
This is really wretched stuff. To me, it’s definitely the most offensive of the PEAR papers. The other PEAR stuff I’ve seen is abused statistics from experiments. This is much more fundamental – instead of just using sampling errors to support their outcome (which is, potentially, explainable as incompetence on the part of the researchers), this is clear, deliberate, and fundamental misuse of mathematics in order to lend credibility to nonsense.
[old]: http://goodmath.blogspot.com/2006/05/pear-yet-again-theory-behind.html
[pear-thoth]: http://goodmath.blogspot.com/2006/05/pear-yet-again-theory-behind.html
[scigen]: http://pdos.csail.mit.edu/scigen/

Subtraction: Math Too Hard for a Conservative Blogger

This has been written about [elsewhere][lf], but I can’t let such a perfect example of the fundamental innumeracy of so many political pundits pass me by without commenting.
Captain Ed of [Captains Quarters][cq] complains about a speech by John Edwards in which Edwards mentions 37 million people below the poverty line:
>Let’s talk about poverty. Where did John Edwards get his numbers? The US Census
>Bureau has a ready table on poverty and near-poverty, and the number 37 million
>has no relation to those below the poverty line. If his basis is worry, well,
>that tells us nothing; what parent doesn’t worry about putting food on the
>table and clothes on the children, except for rich personal-injury attorneys?
>That threshold is meaningless.
Now, let’s look at the very figures that our brilliant captain links to:

    Table 6. People Below 125 Percent of Poverty Level and the Near Poor: 1959 to 2004
    (Numbers in Thousands)

    Year      Total      Below 1.25            Between 1.00 and 1.25
                         Number     Percent    Number     Percent
    2004      290,605    49,666     17.1       12,669     4.4

Ok… So, approximately 50 million people below 1.25 * the poverty line… And approximately 13 million people above the poverty line, but below 1.25 times it…
Now, What kind of brilliant mathematician does it take to figure out how many people are below the poverty line from this table? What kind of sophisticated math do we need to use to figure it out? College calculus? No. High school algebra? No. 3rd grade subtraction? There we are.
50 – 13 = ?
My daughter, who is in *kindergarten*, can do this using her *fingers*: about 37 million people below the poverty line – which is, of course, exactly the number Edwards used. But apparently math like this is completely beyond our Captain. (As much as I hate to admit it, this isn’t a phenomenon of people on the political right being innumerate; this kind of innumeracy is widespread on both ends of the political spectrum.)
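For the record, here’s the whole computation, using the exact figures from the table:

    # Figures from the Census table above, in thousands of people.
    below_125_percent = 49666        # everyone below 125% of the poverty line
    between_100_and_125 = 12669      # above the line, but below 125% of it

    below_poverty_line = below_125_percent - between_100_and_125
    print(below_poverty_line)        # 36997 thousand: roughly 37 million people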
[lf]: http://lawandpolitics.blogspot.com/2006_07_01_lawandpolitics_archive.html#115250909380587878
[cq]: http://www.captainsquartersblog.com/mt/archives/007436.php

Yoneda's Lemma

So, at last, we can get to Yoneda’s lemma, as I [promised earlier][yoneda-promise]. What Yoneda’s lemma does is show us how for many categories (in fact, most of the ones that are interesting) we can take the category C, and understand it using a structure formed from the functors from C to the category of sets. (From now on, we’ll call the category of sets **Set**.)
So why is that such a big deal? Because the functors from C to **Set** define a *structure* formed from sets that represents the properties of C. Since we have a good intuitive understanding of sets, that means that Yoneda’s lemma
gives us a handle on how to understand all sorts of difficult structures by looking at the mapping from those structures onto sets. In some sense, this is what category theory is really all about: we’ve taken the intuition of sets and functions; and used it to build a general way of talking about structures. Our knowledge and intuition for sets can be applied to all sorts of structures.
As usual for category theory, there’s yet another definition we need to look at, in order to understand the categories for which Yoneda’s lemma applies.
If you recall, a while ago, I talked about something called *[small categories][smallcats]*: a small category is a category for which the class of objects is a set, and not a proper class. Yoneda’s lemma applies to a class of categories slightly less restrictive than the small categories, called the *locally small categories*.
The definition of locally small categories is based on something called the Hom-classes of a category. Given a category C, the hom-classes of C are a partition of the morphisms in the category. Given any two objects a and b in Obj(C), the hom-class **Hom**(a,b) is the class of all morphisms f : a → b. If **Hom**(a,b) is a set (instead of a proper class), then it’s called the hom-set of a and b.
A category C is *locally small* if and only if all of the hom-classes of C are sets: that is, if for every pair of objects in Obj(C), the morphisms between them form a set, and not a proper class.
So, on to the lemma.
Suppose we have a locally small category C. Then for each object a in Obj(C), there is a *natural functor* corresponding to a mapping to **Set**. This is called the hom-functor of a, and it’s generally written: *h*a = **Hom**(a,-). *h*a is a functor which maps an object x in C to the set of morphisms **Hom**(a,x).
If F is a functor from C to **Set**, then for all a ∈ Obj(C), the natural transformations from *h*a to F are in one-to-one correspondence with the elements of F(a): that is, the structure-preserving mappings from the hom-functor *h*a into F are completely pinned down by the elements of the set F(a).
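In the usual compact notation (this is the standard statement of the lemma, not something from the earlier posts):

    \mathrm{Nat}\big(\mathbf{Hom}(a,-),\, F\big) \;\cong\; F(a)

and the isomorphism is natural in both a and F. Taking F to be another hom-functor **Hom**(b,-) gives the important special case: natural transformations from **Hom**(a,-) to **Hom**(b,-) correspond exactly to the morphisms from b to a in C, which is the precise sense in which nothing about C is lost when we pass to set-valued functors.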
So set-valued functors, together with the natural transformations between them, are enough to capture all of the structure of C inside **Set**.
Yesterday, we saw how mapping *up* the abstraction hierarchy can make some kinds of reasoning easier. Yoneda says that for some things where we’d like to use our intuitions about sets and functions, we can also *map down* the abstraction hierarchy.
(If you saw my posts on group theory back at GM/BMs old home, this is a generalization of what I wrote about [the symmetric groups][symmetry]: the fact that every group G is isomorphic to a subgroup of the symmetric group on G.)
Coming up next: why computer science geeks like me care about this abstract nonsense. What does all of this gunk have to do with programming and programming languages? What the heck is a monad? And more.
[symmetry]: http://goodmath.blogspot.com/2006/04/permutations-and-symmetry-groups.html
[yoneda-promise]: http://scienceblogs.com/goodmath/2006/06/category_theory_natural_transf.php
[smallcats]: http://scienceblogs.com/goodmath/2006/06/more_category_theory_getting_i.php

Using Natural Transformations: Recreating Closed Cartesian Categories

Today’s contribution on category theory is going to be short and sweet. It’s an example of why we really care about [natural transformations][nt]. Remember the trouble we went through working up to define [cartesian categories and cartesian closed categories][ccc]?
As a reminder: a [functor][functor] is a structure preserving mapping between categories. (Functors are the morphisms of the category of small categories); natural transformations are structure-preserving mappings between functors (and are morphisms in the category of functors).
Since we know that the natural transformation can be viewed as a kind of arrow, then we can take the definitions of iso-, epi-, and mono-morphisms, and apply them to natural transformations, resulting in *natural isomorphisms*, *natural monomorphisms*, and *natural epimorphisms*.
Expressed this way, a cartesian category is a category C where:
1. C contains a terminal object t; and
2. (∀ a,b ∈ Obj(C)), C contains a product object a×b; and
a *natural isomorphism* Δ which, for every object x, maps each pair of arrows (x → a, x → b) to a single arrow (x → (a×b))
What this really says is: if we look at categorical products, then for a cartesian category there’s a structure-preserving way of understanding the product in terms of arrows: a pair of arrows into a and b can be packaged up, naturally and reversibly, as a single arrow into the product a×b.
The closed cartesian category is just the same exact trick using the exponential: A CCC is a category C where:
1. C is a cartesian category, and
2. (∀ a,b ∈ Obj(C)), C contains an object b^a, and a natural isomorphism Λ, where (∀ y ∈ Obj(C)) Λ : (y×a → b) → (y → b^a). (For the programmers reading along, Λ is currying; see the sketch below.)
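That parenthetical deserves a moment of unpacking: the exponential object b^a plays the role of “the type of functions from a to b,” and Λ is currying. Here’s a tiny sketch of that reading in Python (my illustration, not from the original post):

    def curry(f):
        # Λ : (y×a -> b)  -->  (y -> (a -> b))
        return lambda y: lambda a: f(y, a)

    def uncurry(g):
        # ...and the inverse direction of the isomorphism.
        return lambda y, a: g(y)(a)

    def pay(person, amount):              # a function out of a product
        return person + " pays " + str(amount)

    pay_alice = curry(pay)("alice")       # fix the y component
    print(pay_alice(10))                  # alice pays 10
    print(uncurry(curry(pay))("bob", 5))  # round trip: bob pays 5

The two directions compose to the identity in both orders, which is exactly what the “isomorphism” part demands.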
Look at these definitions; then go back and look at the old definitions that we used without the new constructions of the natural transformation. That will let you see what all the work to define natural transformations buys us. Category theory is all about structure; with categories, functors, and natural transformations, we have the ability to talk about extremely sophisticated structures and transformations using a really simple, clean abstraction.
[functor]: http://scienceblogs.com/goodmath/2006/06/more_category_theory_getting_i.php
[nt]: http://scienceblogs.com/goodmath/2006/06/category_theory_natural_transf.php
[ccc]: http://scienceblogs.com/goodmath/2006/06/categories_products_exponentia_1.php

Lying with Statistics: Abortion Rates

Via [Feministe][feministe], we see a wingnut named Tim Worstall [trying to argue something about sexual education][worstall]. It’s not entirely clear just what the heck he thinks his argument is; he wants to argue that sexual education “doesn’t work”; his argument about this is based on abortion rates. This
is an absolutely *classic* example of how statistics are misused in political arguments. So let’s take a look, and see what’s wrong.
[feministe]: http://www.feministe.us/blog/archives/2006/07/10/lies-damn-lies-and-statistics/
[worstall]: http://timworstall.typepad.com/timworstall/2006/07/sex_education_w.html#comment-19323490
He quotes an article from the Telegraph, a UK newspaper. The telegraph article cites statistics from the UK department of health. Here’s what Worstall has to say:
>Yup, gotta hand it to them, the campaigners are right. Sex education obviously works
>
>Abortions have reached record levels, and nearly a third of women who have an abortion have had one
>or more before.
>
>Department of Health statistics reveal that abortions in England and Wales rose by more than 700 in
>2005, from 185,713 in 2004 to 186,416.
>…
>Some 31 per cent of women had one or more previous abortions, a figure that rises to 43 per cent
>among black British women.
>
>The ever increasing amount of sex education, the ever easier provision of contraception is clearly driving down the number of unwanted pregnancies.
Clearly, Worstall and the author of the Telegraph piece want us to believe that there’s a significant *increase* in the number of abortions in the UK; and that this indicates some problem with the idea of sex-ed.
So what’s wrong with this picture?
First, let’s just look at those numbers, shall we? We’re talking about a year over year increase of *700* abortions from a base of *185,000*. How significant is that? Well, do the math: 0.37%. Yes, about one third of one percent. Statistically significant? Probably not. (Without knowing exactly how those numbers are gathered, including whether or not there’s a significant possibility of abortions being underreported, there’s no way to be absolutely sure, but 1/3 of 1% from a population of 185,000 or so is not likely to be significant.)
But it gets worse. Take a good look at those statistics: what do they measure? They’re a raw number of abortions. But what does that number actually mean? Statistics like that taken out of context are very uninformative. Let’s put them in context. From the [statistics for England and Wales][stats]:
[stats]: http://www.johnstonsarchive.net/policy/abortion/ab-ukenglandwales.html
In the year 2003, there were 621,469 live births, and 190,660 abortions. In 2004, there were 639,721 live births, and 194,179 abortions. Now, these stats are from the UK Office of National Statistics. Note that the numbers *do not match* the numbers cited earlier. In fact, taken as bare statistics, these numbers show a *much larger* increase in abortions: about 1.8%.
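It’s worth actually doing that arithmetic rather than waving at it; here’s my quick check, using the figures quoted above:

    # Department of Health figures quoted by Worstall (England and Wales):
    abortions_2004_doh, abortions_2005_doh = 185713, 186416
    increase = abortions_2005_doh - abortions_2004_doh
    print(increase, 100.0 * increase / abortions_2004_doh)      # 703, about 0.38 percent

    # ONS figures for the earlier pair of years:
    births_2003, abortions_2003 = 621469, 190660
    births_2004, abortions_2004 = 639721, 194179
    raw_increase = abortions_2004 - abortions_2003
    print(raw_increase, 100.0 * raw_increase / abortions_2003)  # 3519, about 1.8 percent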
But, put in context… Take the number of abortions as a percentage of non-miscarried pregnancies (which we need to do because the miscarriage statistics for the years 2003 and 2004 are not available), and we find that
the number of abortions per 1000 pregnancies actually *declined* from 292/1000 in 2003 to 290/1000 in 2004. And that number from 2003 was a decline from 2002, which was a decline from 2001. So for the last four years for which statistics are available, the actual percentage of pregnancies ending in abortions has been nearly constant; but closely studying the numbers shows that the number has been *declining* for those four years.
In fact, if we look at abortion statistics overall, what we find is that from the legalization of abortion in the UK, there was a consistent increase until about 1973 (when the number of abortions reached 167,000), and since then, the number has ranged upwards and downwards with no consistent pattern.
So – what we’ve got here is a nut making an argument that’s trying to use statistics to justify his political stance. However, the *real* statistics, in context, don’t say what he wants them to say. So – as usual for a lying slimebag – he just selectively misquotes them to make it *look like* they say what he wants them to.