I’m away on vacation this week, taking my kids to Disney World. Since I’m not likely to have time to write while I’m away, I’m taking the opportunity to re-run some old classic posts which were first posted in the summer of 2006. These posts are mildly revised.
 Back when I first wrote this post, I was taking a break from some puzzling debugging.
Since I was already a bit frazzled, and I felt like I needed some comic relief, I decided to
hit one of my favorite comedy sites, Answers in Genesis. I can pretty much always find
something sufficiently stupid to amuse me on their site. On that fateful day, I came across a
gem called “Information, science and biology”, by the all too appropriately named
“Werner Gitt”. It’s yet another attempt by a creationist twit to find some way to use
information theory to prove that life must have been created by god.
 This article really interested me in the bad-math way, because I’m a big fan of information theory. I don’t pretend to be anything close to an expert in it, but I’m
fascinated by it. I’ve read several texts on it, taken one course in grad school, and had the incredible good fortune of getting to know Greg Chaitin, one of the co-inventors of algorithmic information theory. Basically, it’s safe to say that I know enough about
information theory to get myself into trouble.
 In contrast to my admission above, it looks like the Gitt hasn’t actually read any real
information theory, much less understood it. All that he’s done is heard Dembski presenting
one of his wretched mischaracterizations, and then regurgitated and expanded upon it.
Dembski was bad enough; building on an incomplete understanding of Dembski’s misrepresentations, misunderstandings, and outright errors produces a result
that is just astonishingly ridiculous. It’s actually a splendid example of my mantra on this blog: “the worst math is no math”; the entire article pretends to be doing math – but its actual mathematical content is nil. Still, to the day of this repost, I continue
to see references to this article as “Gitt’s math” or “Gitt’s proof”.
 Gitt starts his article by thoroughly butchering an introduction to Shannon
information theory.  I’ll just let that breeze by; no sense belaboring the obvious. After
his botched introduction, he moves on to the rubbish that I’ll focus on.
The highest information density known to us is that of the DNA (deoxyribonucleic acid)
molecules of living cells. This chemical storage medium is 2 nm in diameter and has a 3.4 nm
helix pitch (see Figure 1). This results in a volume of 10.68×10^-21
cm^3 per spiral. Each spiral contains ten chemical letters (nucleotides), resulting
in a volumetric information density of 0.94×10^21 letters/cm^3. In
the genetic alphabet, the DNA molecules contain only the four nucleotide bases, that is,
adenine, thymine, guanine and cytosine. The information content of such a letter is 2
bits/nucleotide. Thus, the statistical information density is 1.88×10^21
bits/cm^3.
 This is, of course, utter gibberish. DNA is not the “highest information density
known”. In fact, the concept of information density is not well-defined. Without a good definition, it’s meaningless: How do you compare the “information density” of a DNA molecule with the information density of an electromagnetic wave emitted by a pulsar? You can’t: it’s meaningless to compare. This is just a sign of the kind of nonsense to come: Gitt is a guy who doesn’t believe that he needs to be bothered with trivial little details like
definition. He’s a big idea guy!
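 For what it’s worth, the arithmetic in the quoted passage does check out; it’s the concept that’s broken. Here’s a quick sketch in Python that reproduces his numbers, assuming (my assumption, not something the article spells out) that one turn of the helix is modeled as a plain cylinder 2 nm across and 3.4 nm tall, and that all four bases are equally likely. The variable names are mine.

```python
import math

# Figures quoted in Gitt's passage; the cylinder model is an assumption.
DIAMETER_NM = 2.0
PITCH_NM = 3.4
LETTERS_PER_TURN = 10
BITS_PER_LETTER = math.log2(4)  # four equally likely bases -> 2 bits

NM3_PER_CM3 = 1e21  # 1 cm = 1e7 nm, so 1 cm^3 = 1e21 nm^3

# Volume of one helix turn, treated as a cylinder.
volume_nm3 = math.pi * (DIAMETER_NM / 2) ** 2 * PITCH_NM
volume_cm3 = volume_nm3 / NM3_PER_CM3

letters_per_cm3 = LETTERS_PER_TURN / volume_cm3
bits_per_cm3 = letters_per_cm3 * BITS_PER_LETTER

print(f"volume per turn: {volume_cm3:.3e} cm^3")
print(f"letters per cm^3: {letters_per_cm3:.3e}")
print(f"bits per cm^3: {bits_per_cm3:.3e}")
```

Under those assumptions the result is roughly 1.87×10^21 bits/cm^3, which matches the quoted 1.88×10^21 up to rounding. Getting the multiplication right doesn’t make “highest information density known” a meaningful claim.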
 Anyway… we can define a kind of information density as bits per cubic centimeter. Of course, that’s still not well-defined; how do we decide what’s a bit? Naively it
seems obvious, but when you think about it in detail, you’ll realize where the ambiguity comes in. Is a bit a specific chunk that can be one of several options – as in the segments
of DNA? Is a bit the magnetic alignment of a bit of iron? Is a bit the charge of an ion? Any of those are perfectly plausible definitions of a unit of information encoded into a physical
form. Depending on how you define it, you can come up with a number of different “highest information density known to us”.
 Consider, for example, the information density of a crystal, like a
diamond. A diamond is an incredibly compact crystal of carbon atoms. There are no perfect
diamonds: all crystals contain irregularities and impurities. Consider how dense the
information of that crystal is: the position of every flaw, every impurity, the positions of
the subset of carbon atoms in the crystal that are carbon-14 as opposed to carbon-12.
 Just take the impurities, and look up the density of a diamond. Assume that there’s one
non-carbon atom per billion in the diamond – that’s probably on the low-end of the number
of impurities. Use its position in the diamond lattice as a bit indicator. Assume that the
impurity encodes only one bit – even though you could encode quite a lot more. Now, work
out the “information density” of the diamond.
Considerably denser than DNA, huh?
 After this is where it really starts to get silly. Our Gitt claims that Shannon
theory is incomplete, because after all, it’s got a strictly quantitative measure of
information: it doesn’t care about what the message means. So he sets out to “fix”
that problem. He proposes five levels of information: statistics, syntax, semantics,
pragmatics, and apobetics. He claims that Shannon theory (and in fact information theory
as a whole) only concerns itself with the first; it’s incomplete because it doesn’t
differentiate between syntactically valid and invalid information, much less attempt to reason about the higher levels. 
Let’s take a quick run through the five, before I start mocking them.
- Statistics: This is what information theory refers to as information content, expressed in terms of an event sequence (as I said, he’s following Dembski); so we’re looking at a series of events, each of which is receiving a character of a message, and the information added by each event is how surprising that event was. That’s why he calls it statistical.
- Syntax: The structure of the language encoded by the message. At this level, it is assumed that every message is written in a code; you can distinguish between “valid” and “invalid” messages by checking whether they are valid strings of characters for the given code.
- Semantics: What the message means.
- Pragmatics: The primitive intention of the transmitter of the message; the specific events/actions that the transmitter wanted to occur as a result of sending the message.
- Apobetics: The purpose of the message.
According to him, level 5 is the most important one.
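 The statistical level is the only one of the five with a standard mathematical definition, so it’s worth pinning down. A minimal sketch, assuming the usual textbook formulation: an event with probability p carries log2(1/p) bits, and the entropy of a source is the average of that over its symbols. The function names are hypothetical, and estimating probabilities from the symbol counts of a single message is a simplification for illustration.

```python
import math
from collections import Counter


def surprisal_bits(probability: float) -> float:
    """Bits of information gained by observing an event of this probability."""
    return math.log2(1 / probability)


def entropy_bits_per_symbol(message: str) -> float:
    """Shannon entropy of the empirical symbol distribution of `message`."""
    counts = Counter(message)
    total = len(message)
    return sum((n / total) * surprisal_bits(n / total) for n in counts.values())


# Four equally frequent letters: every letter is equally surprising.
print(entropy_bits_per_symbol("ACGT"))
# A heavily skewed message: most letters are unsurprising, so the average drops.
print(entropy_bits_per_symbol("AAAAAAAT"))
```

The first call is where the “2 bits/nucleotide” figure in the quoted passage comes from. Nothing in the calculation refers to syntax, meaning, or purpose, which is exactly the property Gitt objects to.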
 Before moving on, I’ll just briefly note: formulating things this way is assuming
the conclusion. What he wants to prove is that all real information includes
a message which was sent with intent and purpose – and thus can’t be created by
anything other than an intelligent sender. But he’s already assuming in his definition of information that it must have these components – including the intention of the sender and the purpose of the message.
 Throughout the article, he constantly writes “theorems”. He clearly doesn’t understand what the word “theorem” means, because these things are just statements that he would like to be true, but which are unproven, and often unprovable. These aren’t theorems. In math, the word “theorem” means something very specific. A theorem isn’t just “a statement that I think is true”, or “a statement that I want to specifically label because it’s important”. A theorem is a proven statement. If
you don’t show a proof for it, it’s not a theorem. No matter how obvious it seems, no
matter how straightforward, it’s not a theorem if you don’t have a proof.
Now let’s look at a few examples of his so-called theorems. I’m quoting the
entire theorems here – a series of them and the start of the discussion
that follows. This is really how he presents “theorems”. This comes
from his section on what he calls the syntax level of information.
Theorem 4: A code is an absolutely necessary condition for the representation
of information.
Theorem 5: The assignment of the symbol set is based on convention and
constitutes a mental process.
Theorem 6: Once the code has been freely defined by convention, this definition
must be strictly observed thereafter.
Theorem 7: The code used must be known both to the transmitter and receiver if
the information is to be understood.
Theorem 8: Only those structures that are based on a code can represent
information (because of Theorem 4). This is a necessary, but still inadequate,
condition for the existence of information.
These theorems already allow fundamental statements to be made at the level of
the code. If, for example, a basic code is found in any system, it can be
concluded that the system originates from a mental concept.
 How do we conclude that a code is a necessary condition for the representation of information? We just assert it. Worse, how do we conclude that only things that are based on a code represent information? Again, just an assertion – but an incredibly strong one. He is asserting that nothing without a
structured encoding is information. And this is also the absolute crux of his argument: information only exists as a part of a code designed by an intelligent process. 
Despite the fact that he claims to be completing Shannon theory, the rest of this article has nothing to do with math. It’s all words: “theorems” like the ones quoted above, but becoming progressively more outrageous and unjustified.
For example, his theorem 11:
The apobetic aspect of information is the most important, because it embraces
the objective of the transmitter. The entire effort involved in the four lower
levels is necessary only as a means to an end in order to achieve this
objective.
After this, we get to his conclusion, which is quite a prize.
On the basis of Shannon’s information theory, which can now be regarded as
being mathematically complete, we have extended the concept of information as
far as the fifth level. The most important empirical principles relating to the
concept of information have been defined in the form of theorems.
See, to him, a theorem is nothing but a “form”: a syntactic structure. And this whole article, to him, is mathematically complete.
The Bible has long made it clear that the creation of the original groups of
fully operational living creatures, programmed to transmit their information to
their descendants, was the deliberate act of the mind and the will of the
Creator, the great Logos Jesus Christ.
We have already shown that life is overwhelmingly loaded with information; it
should be clear that a rigorous application of the science of information is
devastating to materialistic philosophy in the guise of evolution, and strongly
supportive of Genesis creation.
 That’s where he wanted to go all through this train-wreck. DNA is the highest-possible
density information source. It’s a message originated by god, and transmitted by each
generation to its children. 
 And as usual for the twits (or Gitts) that write this stuff, they’re pretending to put
together logical/scientific/mathematical arguments for god; but they can only do it by
specifically including the necessity of god as a premise. In this case, he asserts that DNA
is a message; and a message must have an intelligent agent creating it. Living things
cannot be the original creators of the message, since the DNA had to be created before us.
Therefore there must be a god.
Same old shit.

I like your “classics”! Thanks for the re-post.
Disclaimer: I don’t know what I’m talking about.
I think there is some kind of absolute information over volume.
Let there be a Space S of size V.
This Space could be empty, full of things etc.
[hic sunt leones] This Space has only a finite (huge) set of states, that is, the ways it could be partially or totally filled, and how.
So each V has a corresponding T (# of states V could have).
So there is such a thing as an Information Density: D(S)=T(V(S), S)/V(S).
Now if T(V(S), S) doesn’t depend on S (the particular Space) but only on V, we could simplify it to: D(V)=T(V)/V
If T is linearly dependent on V then D = K (where K is a constant of our universe)
So the absolute information density exists and it is a constant – boring…
Of course I agree with Mark Chu-Carroll about the meaninglessness of ideological wish fulfillment disguised as Math.
When I was about 5, I saw a high school Geometry textbook, and proceeded to have my father draw circles with a compass, which I would subdivide with line segments in multiple ways, with points of intersection labeled A, B, C, A’, B’, C”’ and so forth, and have my father (who had better printing than I had at my age) write “theorems” about AB = AB’= A”B” and the like. For a few minutes, my father happily complied. But then he stopped, and explained that my dictation superficially looked like Math, but it was wrong. And Theorems have to be true. An important lesson.
He then underlined it by posing a simple problem about circles to me. He drew two dots on a page of paper, and asked me how many circles passed through those two dots.
“One,” I said, and drew the smallest circle that passed through the two dots, so that they were at opposite ends of a diameter.
“Wrong,” he said. There are an infinite number of solutions. And he drew a dozen more circles that passed through those two dots, with the dots being at opposite ends of chords of the circles.
I was impressed by how I’d self-limited my imagination.
My father then put the lesson in italics by teaching me about ellipses, so that I realized that everything I knew about circles was only a special case.
Not bad for my Dad, who only graduated Harvard (cum laude) in 3 years because they gave him Math credit for the Navigation that he taught pilots when he was a Flight Instructor in World War II. And whose only Math publication was a letter to the editor of a Rhode Island newspaper insisting that 1 was a prime.
I miss him every day, just as Mark Chu-Carroll misses his Dad every day. Ironically, both my father and his father died because of undiagnosed or misdiagnosed and/or mistreated diseases. The issue is, not that these ID creeps are wrong, or “not even wrong”, but that their disinformation has a damaging effect on the life or death issues of the medical profession.
“Information density” has a cute angle in some science fiction of Greg Bear, where he suggests that packing too many bits per cubic centimeter makes the information implode into a black hole.
Your example of diamond impurities doesn’t quite work since one cannot have impurities of arbitrary forms in certain lattice positions. The chemistry won’t work. I suspect that correcting for this sort of issue won’t change things much but I don’t know enough chemistry to determine that.
I’m also concerned with the diamond information density calculation. The DNA is also largely carbon, and encodes one bit of information with far fewer than 1e9 atoms (
um, don’t forget to close your tags, tho 😉
One of my favourite things-to-say about this topic is that a “code” must have an understander at both ends, and so calling DNA a “code” is begging the question. It isn’t a code – it’s a chemical that reacts with other chemicals.
However, you can certainly have C-12, C-13, or C-14 in any position. So in the most general case, that would be over a bit per atom. Less if you constrain it to match natural isotope abundance.
Of course, you could play the same game with DNA and isotopes.
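The figures being traded in this thread are easy to check. A rough sketch, assuming a diamond density of 3.51 g/cm^3 and a molar mass of 12.011 g/mol for carbon (textbook values, not from the post), and ignoring entirely whether any of these encodings could actually be written or read:

```python
import math

AVOGADRO = 6.02214076e23           # atoms per mole
DIAMOND_DENSITY_G_PER_CM3 = 3.51   # assumed textbook value
CARBON_MOLAR_MASS_G = 12.011       # assumed standard atomic weight
GITT_DNA_BITS_PER_CM3 = 1.88e21    # figure quoted in the post

atoms_per_cm3 = DIAMOND_DENSITY_G_PER_CM3 / CARBON_MOLAR_MASS_G * AVOGADRO

# Scheme from the post: one impurity per billion atoms, one bit per impurity.
impurity_bits_per_cm3 = atoms_per_cm3 * 1e-9

# Scheme from the comment above: each atom is freely C-12, C-13, or C-14,
# so each site carries log2(3) bits (natural isotope abundance is ignored).
isotope_bits_per_cm3 = atoms_per_cm3 * math.log2(3)

print(f"atoms per cm^3:       {atoms_per_cm3:.2e}")
print(f"impurity scheme:      {impurity_bits_per_cm3:.2e} bits/cm^3")
print(f"isotope scheme:       {isotope_bits_per_cm3:.2e} bits/cm^3")
print(f"DNA figure (quoted):  {GITT_DNA_BITS_PER_CM3:.2e} bits/cm^3")
```

With these assumptions the one-bit-per-billion-atoms scheme comes out around 10^14 bits/cm^3, well below the DNA figure quoted in the post, while the per-atom isotope scheme comes out around 10^23, well above it. Which number counts as “the” density depends on what you decide a bit is.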
The information content possibilities for any crystal (diamond is ok as an example) are enormous. Impurities are one way to vary the crystal structure from the perfect (and probably impossible to achieve) ideal structure. Isotopes are another. But structural imperfections (dislocations, or even simply a missing atom) of many varieties also exist. Suppose we discuss a diamond of 12 grams – one mole, or 6×10^23 atoms (well, it’s also 60 carats, a decent size). All real crystals have many defects, but let’s suppose we could make a perfect crystal and then introduce just one defect at a chosen spot. This gives Avogadro’s number of possible locations. We could certainly choose a code to represent each possible location as a bit (not sure if this is correct). Then we could introduce 2 imperfections simultaneously. Or any number. All of human history, art, philosophy, literature, could be encoded in that one crystal. And there are a lot of crystals.
Bob
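On the “not sure if this is correct” point: a single defect placed at one of N possible sites distinguishes N messages, which is log2(N) bits rather than N bits. A small sketch, treating defects as identical and every site as usable (both idealizations), with a hypothetical helper name:

```python
import math

# Avogadro's number as an exact integer: sites in one mole of carbon atoms.
SITES_IN_ONE_MOLE = 602_214_076_000_000_000_000_000


def bits_for_defects(sites: int, defects: int) -> float:
    """log2 of the number of ways to place identical defects on lattice sites."""
    return math.log2(math.comb(sites, defects))


print(bits_for_defects(SITES_IN_ONE_MOLE, 1))  # one defect
print(bits_for_defects(SITES_IN_ONE_MOLE, 2))  # two defects at once
```

One defect in a mole of atoms is therefore worth about 79 bits, and two are worth about 157. The ceiling for this kind of scheme, where every site independently either has a defect or doesn’t, is one bit per site.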
You spent the whole post picking nits and axe grinding. How about contending with the author’s argument?
“What he wants to prove is that all real information includes a message which was sent with intent and purpose – and thus can’t be created by anything other than an intelligent sender.”
That’s true. Information does not exist without something perceiving meaning in it. It requires context and a field of meaning which it will encode. If DNA is the result of evolution (pick your brand), then it contains no information because it is, by definition, random and not intelligent. We can say it has information, but really that’s nonsense to say that random assemblies of atoms that just happen to trigger some chemicals in our minds that say, “Oh, patterns, neat!” really have any of this concept of “information” in them.
To restate this another way, if you do not set the semantic field of meaning, my thinking human friend, then you have no data. DNA is not a recipe for tasty bread. DNA is constrained by our mind and senses to only be considered in its interactions that build and run natural life. We constrain it to a semantic field and recognize a limited scope of its behavior and existence.
To say information exists apart from a perceiver is like thought without a thinker.
The real battle in this article hasn’t been touched: Is there a message or is this noise in the cosmos? If you dare to say DNA is more than a random bit of noise, then tell me, what on earth might that message be saying?
Apologies for taking far too many paragraphs to come to that conclusion.
Anonymous: you’re consistently wrong in each and every sentence.
“Information does not exist without something perceiving meaning in it.”
Wrong. Entropy has a formal definition which does not require any observer or any meaning. Thermodynamic entropy likewise does not require any observer or any meaning. Or do you Creationist loons believe and fear Maxwell’s Demon?
“It requires context and a field of meaning which it will encode.”
Wrong. The set of possible messages is context enough. Meaning does not enter into the picture.
“If DNA is the result of evolution (pick your brand), then it contains no information because it is, by definition, random and not intelligent.”
You have not even bothered to read this blog. Evolution IS NOT RANDOM. Mutation and other sources of variation may be random, but NATURAL SELECTION IS NOT RANDOM. Nor does it have “meaning.”
Everything you write is either wrong, or so incoherent as to be “not even wrong.”
Try reading this blog first, even if you refuse to read any actual paper or book on Information Theory or Biology which is not written by a certifiable lunatic or liar.
And Merry Christmas, God Bless us, every one.
Looks like the commenting software ate my post – a less than sign is apparently too much math. Regardless, MarkCC’s proposed diamond encoding, with one bit per billion carbon atoms, is quite information poor compared to DNA. We can imagine denser encodings, but without detailing all the processes for reading, writing, and the storage conditions, it’s hard to reasonably compare them. Hard disk makers struggle with these issues all the time; they are perfectly well defined.
There does seem to be a physical limit to information density though: see http://en.wikipedia.org/wiki/Holographic_principle and http://community.livejournal.com/ref_sciam/1190.html . Supposedly it’s about 1e66 bits per cm3.
None of this affects the main argument, which is about the conflation of distinct meanings of the word “information”. But it’s an interesting diversion.
I published, 1979, in a refereed IEEE venue (Google “quintillabit” to find the PDF) about how many bits could be stored in a 12-gram diamond. I coined the term in that paper “1 Shannon = 1 mole of bits.”
The most interesting recent foundational paper on Evolutionary dynamics, to me, was Yaneer Bar Yam et al’s review article in the journal Complexity, which reveals limits of traditional views of evolutionary dynamics. Among the insights are:
* Fitness cannot be treated as a property of individual genes or individual organisms when local contexts vary (symmetry breaking causes averaging approximations to fail).
* In a spatial system organisms inherit the environment that is affected by their parents.
* Overexploiting organisms may have many offspring but their descendants suffer the consequences.
* Altruism arises in evolution because selfish individuals diminish the survivability of their descendants.
Is this the paper you mean (PDF)? On a fast scan it seems to cover all those points.
i am impressed that you got beyond the nonsense after shannons 2nd theorem. I lost interest after that.
But wait…to double the information content of this post…
I AM IMPRESSED THAT YOU GOT BEYOND THE NONSENSE AFTER SHANNONS @ND THEOREM. I LOST INTEREST AFTER THAT.
@Jonathan Vos Post #11
Jonathan, the formal definition of entropy has to do with a philosophical argument being kicked around by mathematicians. You’re working off of a category error.
I’m sorry that my taking liberty with language (using the term “random”) distracted you from my point. I’m also sorry you chose to pick fights over my irrelevant understanding of your understanding of evolution. I encourage you and Mark to engage the author in a fair manner. Understand what they are trying to say. Distinguish that from flaws in their polemic. Lay out your case.
If all this is, however, just mocking people you disagree with, then you have no need for fair or respectful argumentation.
Wanker!