Friday Pathological Programming: Programming Through Grammars in Thue

Today’s treat: [Thue][thue], pronounced two-ay, after a mathematician named Axel Thue.
Thue is a language based on [semi-Thue][semi-thue] grammars. Semi-Thue grammars are equivalent to Chomsky-0 grammars, except that they’re even more confusing: a semi-Thue grammar doesn’t distinguish between terminal and non-terminal symbols. (Since Chomsky-0 grammars are equivalent to Turing machines, we know that Thue is Turing complete.) The basic idea is that a program is a set of rewrite rules that tell you how to convert one string into another. The language itself is incredibly simple; programs written in it are positively mind-warping.
There are two parts to a Thue program. First is a set of grammar rules, which are written in what looks like pretty standard BNF: “*<lhs> ::= <rhs>*”. After that, there is a section which is the initial input string for the program. The two parts are separated by the “::=” symbol by itself on a line.
As I said, the rules are pretty much basic BNF: whatever is on the left-hand side can be replaced with whatever is on the right-hand side of the rule, whenever a matching substring is found. There are two exceptions, for input and output. Any rule whose right-hand side is “:::” reads a line of input from the user, and replaces its left-hand side with whatever it read from the input. And any rule that has “~” after the “::=” prints whatever follows the “~” to standard out, and replaces its left-hand side with nothing.
So, a hello world program:
x ::=~ Hello world
::=
x
Pretty simple: the input string is “x”; there’s a rule for “x” which replaces it with nothing, and prints out “Hello world”.
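If it helps to see the semantics spelled out in code, here’s a rough sketch of the rewriting loop in Python. It’s *not* the real Thue interpreter – the rule parsing, whitespace handling, and termination details are all simplified, and `run_thue` and its rule tuples are just names I made up for this sketch – but it’s enough to play with the examples in this post.

```python
import random

def run_thue(rules, state, inputs=()):
    """Sketch of Thue's rewriting loop.

    rules  -- a list of (lhs, rhs) pairs, already parsed out of a program
    state  -- the initial string (the part after the lone "::=" line)
    inputs -- lines to hand to any "lhs::=:::" input rules
    """
    inputs = list(inputs)
    printed = []
    while True:
        # find every spot where some left-hand side matches the current string
        matches = [(lhs, rhs, i)
                   for (lhs, rhs) in rules
                   for i in range(len(state))
                   if state.startswith(lhs, i)]
        if not matches:
            break                              # no rule applies: the program halts
        lhs, rhs, i = random.choice(matches)   # Thue chooses nondeterministically
        if rhs == ':::':                       # input rule
            replacement = inputs.pop(0) if inputs else ''
        elif rhs.startswith('~'):              # output rule: print, replace with nothing
            printed.append(rhs[1:])
            replacement = ''
        else:
            replacement = rhs
        state = state[:i] + replacement + state[i + len(lhs):]
    return state, printed

# the hello-world program: one rule, initial string "x"
print(run_thue([('x', '~Hello world')], 'x'))   # -> ('', ['Hello world'])
```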
How about something more interesting: adding one to a binary number. This assumes that the binary number starts off surrounded by underscore characters, which mark the beginning and the end of the string.
1_::=1++
0_::=1
01++::=10
11++::=1++0
_0::=_
_1++::=10
//::=:::
::=
_//_
Let’s tease that apart a bit.
1. Start at the right-hand side of the number:
1. “1_::=1++” – If you see a “1” on the right-hand end of the number,
replace it with “1++”, removing the “_” on the end of the line. The rest
of the computation is going to use the “++” to mark the point in the
string where the increment is going to alter the string next.
2. “0_::=1” – If you see a “0” on the right-hand end, then replace it with “1”. This doesn’t insert a “++”, so none of the other rules are going to do anything: the program stops. That’s what you want: if the string ended in a zero, then adding one to it just changes that 0 to a 1.
2. Now, we get to something interesting: doing the addition:
1. If the string by the “++” is “01”, then adding one will change it to “10”, and that’s the end of the addition. So we just change it to “10”, and get rid of the “++”.
2. If the string by the “++” is “11”, then we change the last digit to “0”, and move the “++” over to the left – that’s basically like adding 1 to 9 in addition: write down a zero, carry the one to the left.
3. Finally, some termination cleanup. If we get to the left edge, and there’s a zero, after the “_”, we can get rid of it. If we get to the left edge, and there’s a “1++”, then we replace it with 10 – basically adding a new digit to the far left; and get rid of the “++”, because we’re done.
3. Next, the input rule: “//::=:::” says “If you see ‘//’, then read some input from the user, and replace the ‘//’ with whatever you read.”
4. Now we have the marker for the end of the rules section, and the string to start the program is “_//_”. So that triggers the input rule, which inputs a binary number, and puts it between the “_”s.
Let’s trace this on a couple of smallish numbers.
* Input: “111”
1. “_111_” → “_111++”
2. “_111++” → “_11++0”
3. “_11++0” → “_1++00”
4. “_1++00” → “1000”
* Input: “1011”
1. “_1011_” → “_1011++”
2. “_1011++” → “_101++0”
3. “_101++0” → “_1100”.
Not so hard, eh?
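For what it’s worth, the increment program runs fine through the little `run_thue` sketch from earlier (again, just an approximation of the real interpreter):

```python
# the increment rules from above, fed "111" as the user's input line
increment = [('1_', '1++'), ('0_', '1'), ('01++', '10'), ('11++', '1++0'),
             ('_0', '_'), ('_1++', '10'), ('//', ':::')]
print(run_thue(increment, '_//_', inputs=['111']))   # -> ('1000', [])
```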
Decrement is the same sort of thing; here’s decrement with an argument hardcoded instead of reading from input:
0_::=0--
1_::=0
10--::=01
00--::=0--1
_1--::=@
_0--::=1
_0::=
::=
_1000000000000_
Now, something more complicated. From [here][vogel], a very nicely documented program that computes fibonacci numbers. It’s an interesting mess – it converts things into a unary form, does the computation in unary, then converts back. This program uses a fairly neat trick to get comments. Thue doesn’t have any comment syntax. But they use “#” on the left hand side as a comment marker, and then make sure that it never appears anywhere else – so none of the comment rules can ever be invoked.
----
#::= fibnth.t – Print the nth fibonacci number, n input in decimal
#::= (C) 2003 Laurent Vogel
#::= GPL version 2 or later (http://www.gnu.org/copyleft/gpl.html)
#::= Thue info at: http://www.catseye.mb.ca/esoteric/thue/
#::= This is modified thue where only foo::=~ outputs a newline.
#::= ‘n’ converts from decimal to unary-coded decimal (UCD).
n9::=*********,n
n8::=********,n
n7::=*******,n
n6::=******,n
n5::=*****,n
n4::=****,n
n3::=***,n
n2::=**,n
n1::=*,n
n0::=,n
n-::=-
#::= ‘.’ prints an UCD number and eats it.
.*********,::=.~9
.********,::=.~8
.*******,::=.~7
.******,::=.~6
.*****,::=.~5
.****,::=.~4
.***,::=.~3
.**,::=.~2
.*,::=.~1
.,::=.~0
~9::=~9
~8::=~8
~7::=~7
~6::=~6
~5::=~5
~4::=~4
~3::=~3
~2::=~2
~1::=~1
~0::=~0
#::= messages moving over ‘,’, ‘*’, ‘|’
*<::=<*
,<::=<,
|<::=<|
0>*::=*0>
0>,::=,0>
0>|::=|0>
1>*::=*1>
1>,::=,1>
1>|::=|1>
2>*::=*2>
2>,::=,2>
2>|::=|2>
#::= Decrement n. if n is >= 0, send message ‘2>’;
#::= when n becomes negative, we delete the garbage using ‘z’ and
#::= print the left number using ‘.’.
*,-::=,2>
,,-::=,-*********,
|,-::=z
z*::=z
z,::=z
z|::=.
#::= move the left number to the right, reversing it (0 becomes 9, …)
#::= reversal is performed to help detect the need for carry. As the
#::= order of rule triggering is undetermined in Thue, a rule matching
#::= nine consecutive * would not work.
*c<::=c
,c<::=c
|c
0>d*::=d
1>d::=d*********,
#::= when the copy is done, ‘d’ is at the right place to begin the sum.
2>d::=f<|
#::= add left to right. 'f' moves left char by char when prompted by a '<'.
#::= it tells 'g' to its right what the next char is.
*f
,f
#::= when done for this sum, decrement nth.
|f’ ‘g’ drops an ‘i’ that increments the current digit.
0>g::=ig
*i::=<
,i::=i,*********
|i::=’ tells ‘g’ to move left to the next decimal position,
#::= adding digit 0 if needed (i.e. nine ‘*’ in reversed UCD)
*1>g::=1>g*
,1>g::=g::=|’ tells ‘g’ that the sum is done. We then prepare for copy (moving
#::= a copy cursor ‘c’ to the correct place at the beginning of the left
#::= number) and reverse (using ‘j’) the right number back.
*2>g::=2>g*
,2>g::=2>g,
|2>g::=c|*********j
#::= ‘j’ reverses the right number back, then leaves a copy receiver ‘d’
#::= and behind it a sum receiver ‘g’.
*j*::=j
j,*::=,********j
j,,::=,*********j,
j,|::=’ sent by ‘-‘
#::= will start when hitting ‘g’ the first swap + sum cycle
#::= for numbers 0 (left) and 1 (reversed, right).
::=
|n?-|,|********g,|
----
Ok. Now, here’s where we go *really* pathological. Here, in all its wonderful glory, is a [Brainfuck][brainfuck] interpreter in Thue, written by Mr. Frederic van der Plancke. (You can get the original version, with documentation and example programs, along with a Thue interpreter written in Python, from [here][vdp-thue].)
----
#::=# BRAINFUNCT interpreter in THUE
#::=# by Frederic van der Plancke; released to the Public Domain.
o0::=0o
i0::=0i
o1::=1o
i1::=1i
o]::=]o
i]::=]i
o[::=[o
i[::=[i
oz::=zo
oZ::=Zo
iz::=zi
iZ::=Zi
0z::=z0
1z::=z1
]z::=z]
[z::=z[
_z::=z_
0Z::=Z0
1Z::=Z1
]Z::=Z]
[Z::=Z[
_Z::=Z_
}}]::=}]}
&}]::=]&}
&]::=]^
}[::=[{
&[::=[0::=0}
{1::=1{
?Z[::=[&
?z[::=[^
?z]::=]
?Z]::=]^
[{{::={[{
[{::=[
[::=[^
]{::={]
]::={]
0::=
1::=1
0{::={0
1{::={1
^[::=?[iii
^]::=?]iii
}|::=}]|WARN]WARN
&|::=&]|WARN]WARN
WARN]WARN::=~ERROR: missing ‘]’ in brainfunct program; inserted at end
^000::=000^ooo
^001::=001^ooi
^010::=010^oio
^011::=011^oii
^100::=100^ioo
^101::=101^ioi
ooo|::=|ooo
ooi|::=|ooi
oio|::=iio|oio
oii|::=|oii
ioo|::=|ioo
ioi|::=|ioi
iii|::=|iii
iio|!::=|
|z::=z|
|Z::=Z|
o_::=_o
i_::=_i
0!::=!0
1!::=!1
_!::=!_
/!::=!/
oooX!::=Xooo
oooY::=$Y
0$::=!1
1$::=$0
X$::=X!1
ooiX!::=Xooi
ooiY::=@1Y
1@1::=@00
0@1::=@11
1@0::=@01
0@0::=@00
X@1::=X!WARNDEC0WARN
WARNDEC0WARN::=~WARNING: attempt at decrementing zero (result is still zero)
X@00::=X@0
X@01::=X!1
X@0Y::=X!0Y
oioX!::=LLYLR
0LL::=LL0
1LL::=LL1
_LL::=!X!
|LL::=|!0X!
LR0::=0LR
LR1::=1LR
LRY::=_
oiiX!::=_oii
oiiY::=X!%
%0::=0%
%1::=1%
%_0::=Y0
%_1::=Y1
%_/::=Y0_/
iooX!::=X(in)
(in)0::=(in)
(in)1::=(in)
(in)Y::=Yi
i/0::=Zi/
i/1::=zi/
i/_::=!/
XY!::=X!0Y
0Y!::=0!Y
1Y!::=1!Y
YZ::=0Y
Yz::=1Y
ioiX!::=X(out)
(out)0::=0(out)O0
(out)1::=1(out)O1
(out)Y::=OO!Y
O0::=~0
O1::=~1
OO::=~_
iiiX!0::=ZX!0
iiiX!1::=zX!1
+::=000
-::=001
<::=010
>::=011
,::=100
.::=101
:::=|X!0Y0_/
::=
^*
[vdp-thue]: http://fvdp.homestead.com/files/eso_index.html#BFThue
[thue]: http://esoteric.voxelperfect.net/wiki/Thue
[semi-thue]: http://en.wikipedia.org/wiki/Semi-Thue_grammar
[brainfuck]: http://scienceblogs.com/goodmath/2006/07/gmbm_friday_pathological_progr.php
[vogel]: http://lvogel.free.fr/thue.htm

Causeless Math from Dembski and Friend

Over at his blog, William Dembski, my least-favorite pathetic excuse for a mathematician, [has cited an article][dembski] written by one John Davison about the theory of evolution. According to Dembski, this article explains “why the naked emperor still lingers”; that is, why the theory of evolution is still around even though it’s so obviously wrong. (Update: I originally typed “Paul Davison” instead of “John Davison”, I don’t know why. Any “Paul Davison”s out there, sorry for associating your name with this dreck. Additionally, the article is posted on Dembski’s blog, but it wasn’t posted by Dembski himself; it was posted by one of the site moderators “Scott”.)
It’s easy to see why Dembski likes this article. I’ve always found Dembski’s writing to be obnoxiously pretentious; this guy writes in that same snotty faux-intellectual style. Here’s a taste from the beginning of Davison’s article.
>Darwinism has persisted because it failed to recognize the nature of first
>causes. It is only natural to assume that results had causes and it is the duty
>of the scientist to find and reveal those causes. At this science has been
>incredibly successful. Many examples are presented in medical science with the
>discovery of the causes, treatments and cures for hundreds of diseases. All of
>Chemistry has been revealed from the consideration of how atomic forces have
>caused molecules to have the structure and properties that they have. This is
>analytical science and it is great science.
>
>I like titles presented as questions because that is what science is really
>supposed to be all about – answering questions. One cannot answer a question
>until it has been posed.
>
>I have used this technique in the past with “Is evolution finished” and most
>recently, also in the current issue of Rivista di Biologia, “Do we have an
>evolutionary theory?”
>
>You will note that I choose my words carefully. I do not question that it has
>persisted because that is self-evident, but rather how has that been possible?
>
>I have the answer and here it is in abbreviated form.
See what I mean? This section also already starts to hint at what’s wrong; but what really set me off, and led me to write about it here, on a math blog, is what comes next:
>Darwinism has persisted because it failed to recognize the nature of first
>causes. It is only natural to assume that results had causes and it the duty of
>the scientist to find and reveal those causes. At this science has been
>incredibly successful. Many examples are presented in medical science with the
>discovery of the causes, treatments and cures for hundreds of diseases. All of
>Chemistry has been revealed from the consideration of how atomic forces have
>caused molecules to have the structure and properties that they have. This is
>analytical science and it is great science.
>
>But does this approach have limits beyond which it cannot DIRECTLY proceed?
>This is another very critical question and I will answer it with a resounding
>yes.
>
>Those limits are met when we attempt to identify the causes of the tools with
>which we proceed. I will use mathematics as an example. Mathematics has
>rightfully been described as “The Queen of the Sciences.” Without math there
>could be no science, at least a science as we know it.
Yeah, he’s going to invoke mathematics as his argument. And of course, it’s going to be *bad*. **Really** bad. Stupid bad.
>So here comes the moment of truth as it were. What is the cause of mathematics?
>More accurately we should ask – what WAS the cause of mathematics because it
>has always been there just waiting to be discovered. That discovery began with
>the Pythagoreans and continues to this day.
>
>Mathematics has no discernable cause does it? Now what does this all have to do
>with evolution? It has everything to do with evolution because both ontogeny
>and phylogeny, like mathematics have no discernable cause.
Yes, the root of his argument is that mathematics has *no cause*. And evolution, like mathematics, also has no discernable cause.
What the hell does this *mean*? Well, to be frank, absolutely bloody nothing. This is what is crudely known as “talking out your ass”.
>And so we come to the answer to the question posed in my title.
>
>Darwinism has persisted because it assumes a detectable, discernable cause, a
>cause which never existed. It even claims to tell us all about this
>non-existent cause. The cause is random changes in genes (mutations) coupled
>with nature acting to select which of these should survive. These two
>processes, genetic variation and selection, have been the sole means by which
>organisms have evolved.
Yeah, y’see, evolution has *no cause*, just like mathematics. But the theory of evolution has hung around not because it actually explains anything; not because it has evidence to support it; not because it matches the facts; it’s because it creates an *illusion* of a cause.
>Now what is the actual tangible evidence to support this model? That is another
>very good question by the way. That is what science is all about, asking
>questions and then trying to answer them. In this case the answers that emerge
>are very clear.
That’s a very good question indeed. Shame he doesn’t bother to answer it.
>Natural selection first of all is very real. Its effect is to prevent change
>rather than to promote it. This was first recognized by Mivart and then
>subsequently and independently by Reginald C. Punnett and then Leo Berg.
Yeah, y’see there were these two guys, and like we were talking? and they said that natural selection prevents change, and they were, like, really convincing.
That’s his “tangible evidence” for the argument that evolution as a theory has persisted because it creates an illusion of cause where there is none.
>So you see there are really two reasons that Darwinism has persisted.
>
>The first I have already presented. It assumes a cause which never existed. The
>second reason it has persisted is because it has also assumed that no one ever
>existed who questioned the cause which never existed.
And yet again, random bullshit comes out of nowhere. Evolution has persisted because it denies the existence of people who question it.
>Like mathematics, both ontogeny and phylogeny never had exogenous causes. Both
>are manifestations of the expression of preformed pre-existent blocks of highly
>specific information which has been released over the millennia as part of a
>self-limiting process known as organic evolution, a phenomenon, my opinion, no
>longer in progress.
And again, we come back to that horrible comparison to math. Math, according to Davison is “causeless”; it consists of a set of facts that exist independently of any cause. Likewise, he claims that evolution is “causeless”; it’s nothing but the expression of a set of genetic information that has been coded into life from the very beginning. Evidence? He’s so smart, he doesn’t need any stinking evidence! Evidence is for stuff that has a cause!
>Everything we are now learning supports this interpretation which I have
>presented in summary form in my recent paper – “A Prescribed Evolutionary
>Hypothesis.”
Everything we’re learning supports this. Of course, he doesn’t mention *any* of it; not one fact, not one scrap of evidence; not anything about all of the genomes we’ve mapped out; not the name of one biologist who’s done work supporting this, not one paper that talks about this evidence. Nothing.
*This* is what Dembski thinks of as a *good* article arguing in favor of ID.
[dembski]: http://www.uncommondescent.com/index.php/archives/1353

Something Nifty: A Taste of Simple Continued Fractions

One of the annoying things about how we write numbers is the fact that we generally write things one of two ways: as fractions, or as decimals.
You might want to ask, “Why is that annoying?” (And in fact, that’s what I want you to ask, or else there’s no point in my writing the rest of this!)
It’s annoying because fractions and decimals can only describe
*rational* numbers – that is, numbers that are a perfect ratio of two integers. And *most* numbers aren’t rational.
But it’s even more annoying than that: if you use decimals, then there are lots of rational numbers that you can’t represent exactly (e.g., 1/3); and if you use fractions, then it’s hard to express the idea that the fraction isn’t exact. (How do you write π as a fraction? 22/7 is the standard fractional approximation, but how do you say that π is *almost*, but not quite, 22/7?)
So what do we do?
One of the answers is something called *continued fractions*. A continued fraction is a very neat thing. Here’s the idea: take a number whose fractional form you don’t know. Pick the nearest simple fraction 1/n that’s just a *little bit too large*. If you were looking at, say, 0.4, you’d take 1/2 – it’s a bit bigger. Now you’ve got an approximation, but it’s too large. That means that the denominator is *too small*. So you add a correction to the denominator to make it a little bit bigger. And you just keep doing that – you approximate the correction to the denominator by adding a fraction to the denominator that’s just a little too big, and then you add a correction to *that* correction.
Let’s look at an example: 2.3456
1. It’s close to 2. So we start with 2 + (0.3456)
2. Now, we start approximating the fraction. The way we do that is we take the *reciprocal* of 0.3456 and take the integer part of it: 1/0.3456 ≈ 2.8935, which rounded down is 2. So we make it 2 + 1/2; and the leftover we still have to account for in the denominator is about 0.8935.
3. We take the reciprocal of the leftover again, and get 1, with a leftover of about 0.1192.
4. We take the reciprocal again, and get 8, with a leftover of about 0.391.
5. Next we get 2, with a leftover of about 0.556.
6. Then 1, with a leftover of 0.8.
7. Then 1, with a leftover of 0.25.
8. Then 4, with a leftover of 0; so now we have an exact result.
So as a continued fraction, 2.3456 looks like:
continued.jpg
As a shorthand, continued fractions are normally written using a list notation inside of square brackets: the integer part, followed by a semicolon, followed by a comma-separated list of the denominators of each of the fractions. So our continued fraction for 2.3456 would be written [2; 2, 1, 8, 2, 1, 1, 4].
There’s a very cool visual way of understanding that algorithm. I’m not going to show it for 2.3456, because it’s a bit too much… But let’s look at something simpler: let’s try to write 9/16ths as a continued fraction. Basically, we make a grid consisting of 16 squares across by 9 squares up and down. We draw the *largest* square we can on that grid. The number of squares of that size that we can draw is the next term of the continued fraction. Now there’s a rectangle left over: we draw the largest squares we can in it. And so on:

square-continued.jpg

So the continued fraction for 9/16ths is [0; 1, 1, 3, 2].
Using continued fractions, we can represent *any* rational number in a finite-length continued fraction.
One incredibly nifty thing about this way of writing numbers is: what’s the reciprocal of 2.3456, aka [2; 2, 1, 8, 2, 1, 1, 4]? It’s [0; 2, 2, 1, 8, 2, 1, 1, 4]. We just add a zero to the front as the integer part, and push everything else one place to the right. If there had been a zero in front, we would have removed it and pulled everything else one place to the left.
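If you want to check expansions like these yourself, here’s a small Python sketch of the algorithm described above, using exact fractions so floating-point error doesn’t creep into the terms. (The function name `continued_fraction` is just something I picked for this sketch.)

```python
from fractions import Fraction

def continued_fraction(x):
    """Expand a rational number into its continued-fraction terms."""
    x = Fraction(x)
    terms = []
    while True:
        n = x.numerator // x.denominator   # the integer part
        terms.append(n)
        x -= n                             # the leftover...
        if x == 0:
            return terms
        x = 1 / x                          # ...gets flipped and expanded in turn

print(continued_fraction(Fraction(23456, 10000)))   # [2, 2, 1, 8, 2, 1, 1, 4]
print(continued_fraction(Fraction(10000, 23456)))   # [0, 2, 2, 1, 8, 2, 1, 1, 4]
print(continued_fraction(Fraction(9, 16)))          # [0, 1, 1, 3, 2]
```

Note how the reciprocal really is just the same list with a 0 pushed onto the front.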
Irrational numbers are represented as *infinite* continued fractions, so there’s an infinite series of correction fractions. You can understand it as a series of ever-improving approximations of the value of the number. And you can define it using a recurrence relation (that is, a recursive formula) for how to get to the next term.
For example, π = [3; 7, 15, 1, 292, 1, …]. If we work that out, the first six terms of the continued fraction for π come out in decimal form to 3.14159265392. That matches π to 9 decimal places. Not bad, eh?
A very cool property of continued fractions is: the continued fraction of the square root of any non-square integer *always eventually repeats*. Even cooler? What’s the square root of two as a continued fraction? [1; 2, 2, 2, 2, …].

e – the Unnatural Natural Number

Looks like I’ve accidentally created a series of articles here about fundamental numbers. I didn’t intend to; originally, I was just trying to write the zero article I’d promised back during the donorschoose drive.
Anyway. Today’s number is *e*, aka Euler’s number, aka the natural log base. *e* is a very odd number, but very fundamental. It shows up constantly, in all sorts of strange places where you wouldn’t expect it.
What is e?
----------
*e* is a transcendental irrational number. It’s roughly 2.718281828459045. It’s also the base of the natural logarithm: by definition, if ln(x)=y, then *e*^y=x. Given my highly warped sense of humor, and my love of bad puns (especially bad *geek* puns), I like to call *e* the *unnatural natural number*. (It’s natural in the sense that it’s the base of the natural logarithm; but it’s not a natural number according to the usual definition of natural numbers. Hey, I warned you that it was a bad geek pun.)
But that’s not a sufficient answer. We call it the *natural* logarithm. Why is that bizarre irrational number just a bit smaller than 2 3/4 *natural*?
Take the curve y=1/x. The area under the curve from 1 to n is the natural log of n. *e* is the point on the x axis where the area under the curve from 1 is equal to one:
ln.jpg
It’s also what you get if you add up the reciprocals of the factorials of every natural number: (1/0! + 1/1! + 1/2! + 1/3! + 1/4! + …)
It’s also what you get if you take the limit: *lim*n → ∞ (1 + 1/n)^n.
It’s also what you get if you work out this very strange looking series:

2 + 1/(1+1/(2+2/(3+3/(4+..))))

It’s also the base of a very strange equation: the derivative of *e*^x is… *e*^x.
And of course, as I mentioned yesterday, it’s the number that makes the most amazing equation in mathematics work: *e*^(iπ) = -1.
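If you want to convince yourself that these really are all the same number, here’s a quick numerical check in Python. It’s just a sanity-check sketch: the truncation depths (20 factorial terms, n = 10^7, 30 levels of the nested fraction) are arbitrary choices of mine, not anything canonical.

```python
import math

# the reciprocal-factorial series
factorial_series = sum(1 / math.factorial(k) for k in range(20))

# the compound-interest limit (1 + 1/n)^n
n = 10**7
compound_limit = (1 + 1/n) ** n

# the "strange looking series": 2 + 1/(1 + 1/(2 + 2/(3 + 3/(4 + ...))))
nested = 0.0
for k in range(30, 0, -1):
    nested = k / (k + nested)
strange_series = 2 + nested

print(factorial_series)   # ≈ 2.718281828459045
print(compound_limit)     # ≈ 2.7182817 (close; this limit converges slowly)
print(strange_series)     # ≈ 2.718281828459045
print(math.e)             # 2.718281828459045
```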
Why does it come up so often? It’s really deeply fundamental. It’s tied to the fundamental structure of numbers. It really is a deeply *natural* number; it’s tied into the shape of a circle, to the basic 1/x curve. There are dozens of different ways of defining it, because it’s so deeply embedded in the structure of *everything*.
Wikipedia even points out that if you put $1 into a bank account paying 100% interest compounded continuously, at the end of the year, you’ll have exactly *e* dollars. (That’s not too surprising; it’s just another way of stating the limit definition of *e*, but it’s got a nice intuitiveness to it.)

Lighter Topics – what do you want to know?

The category theory series is finally winding down; I’ve got one topic I’d like to write about, and then I’ll have had my fill of category theory for a while. I don’t want to dive right in to another really deep topic like topology, so I’m looking for some subjects that people are interested in that can be covered in one or two posts. I could come up with some by myself (and probably will), but there are a lot of things like the zero article which so many people seemed to enjoy which I could write about, but probably wouldn’t think of on my own.
So, what would you like to see one or two posts on?

i : the Imaginary Number

After the amazing response to my post about zero, I thought I’d do one about something that’s fascinated me for a long time: the number *i*, the square root of -1. Where’d this strange thing come from? Is it real (not in the sense of real numbers, but in the sense of representing something *real* and meaningful)?
History
---------
The number *i* has its earliest roots in some of the work of early Arabic mathematicians; the same people who really first understood the number 0. But they weren’t quite as good with *i* as they were with 0: they didn’t really get it. They had some concept of roots of a cubic equation, where sometimes the tricks for finding the roots of the equation *just didn’t work*. They knew there was something going on, some way that the equation needed to have roots, but just what that really meant, they didn’t get.
Things stayed that way for quite a while. Various others, like the Greeks, encountered them in various ways when things didn’t work, but no one *really* grasped the idea that algebra required numbers that were more than just points on a one-dimensional number-line.
The next step was in Italy, centuries later. During the 16th century, people were searching for solutions to the cubic equations – the same thing that the Arabic scholars were looking at. But getting some of the solutions – even solutions to equations with real roots – required playing with the square root of -1 along the way. It was first really described by Rafael Bombelli in the context of the solutions to the cubic; but Bombelli didn’t really think that they were *real*, *meaningful* numbers: it was viewed as a useful artifact of the process of solving the equations, but it wasn’t accepted.
It got its name as the *imaginary number* as a result of a diatribe by Rene Descartes, who believed it was a phony artifact of sloppy algebra. He did not accept that it had any meaning at all: thus it was an “imaginary” number.
They finally came into wide acceptance as a result of the work of Euler in the 18th century. Euler was probably the first to really, fully comprehend the complex number system created by the existence of *i*. And working with that, he discovered one of the most fascinating and bizarre mathematical discoveries ever, known as *Euler’s equation*. I have no idea how many years it’s been since I was first exposed to this, and I *still* have a hard time wrapping my head around *why* it’s true.

e^(iθ) = cos θ + i sin θ

And what *that* means, when you plug in θ = π, is:

e^(iπ) = -1

That’s just astonishing. The fact that there is *such* a close relationship between i, π, and e is just shocking to me.
What *i* does
---------------
Once the reality of *i* as a number was accepted, mathematics was changed irrevocably. Instead of the numbers described by algebraic equations being points on a line, suddenly they become points *on a plane*. Numbers are really *two dimensional*; and just like the integer “1” is the unit distance on the axis of the “real” numbers, “i” is the unit distance on the axis of the “imaginary” numbers. As a result numbers *in general* become what we call *complex*: they have two components, defining their position relative to those two axes. We generally write them as “a + bi” where “a” is the real component, and “b” is the imaginary component.

complex-axis.jpg

The addition of *i* and the resulting addition of complex numbers is a wonderful thing mathematically. It means that *every* polynomial equation has roots; in particular, a polynomial equation in “x” with maximum exponent “n” will always have exactly “n” complex roots (counting multiplicity).
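Here’s a trivially small illustration of that in Python, using the built-in complex type: x² + 1 = 0 has no real solutions, but it has complex ones.

```python
import cmath

# x^2 + 1 = 0 has no solutions on the real number line...
print(cmath.sqrt(-1))     # 1j  -- ...but it does in the complex plane
print((1j) ** 2 + 1)      # 0j  -- and i really is a root
```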
But that’s just an effect of what’s really going on. The real numbers are *not* algebraically closed: you can write polynomial equations using nothing but real numbers that have no real solutions. With the addition of *i*, algebra becomes closed: every operation, every expression in algebra becomes meaningful; nothing escapes the system of the complex numbers.
Of course, it’s not all wonderful joy and happiness once we go from real to complex. Complex numbers aren’t ordered: there is no “<” comparison for complex numbers that’s consistent with their arithmetic. The ability to do meaningful inequalities evaporates when complex numbers enter the system in a real way.
What *i* means
------------------
But what do complex numbers *mean* in the real world? Do they really represent real phenomena? Or are they just a mathematical abstraction?
They’re very real. There’s one standard example that everyone uses: and the reason that we all use it is because it’s such a perfect example. Take the electrical outlet that’s powering your computer. It’s providing alternating current. What does that mean?
Well, the *voltage* – which (to oversimplify) can be viewed as the amount of force pushing the current – is complex. In fact, if you’ve got a voltage of 110 volts AC at 60 Hz (the standard in the US), what that means is that the voltage is a number of magnitude “110”. If you were to plot the “real” voltage on a graph with time on the X axis and voltage on the Y, you’d see a sine wave:

sinewave.jpg

But that’s not really accurate. If you grabbed the wire when the voltage is supposedly zero on that graph, *you’d still get a shock*! Take the moment marked “t1” on the graph above. The voltage at time t1 on the complex plane is a point at “110” on the real axis. At time t2, the voltage on the “real” axis is zero – but on the imaginary axis it’s 110. In fact, the *magnitude* of the voltage is *constant*: it’s always 110 volts. But the vector representing that voltage *is rotating* through the complex plane.

voltage.jpg
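Here’s the same rotating-vector picture as a few lines of Python (toy values, and only four samples per cycle, purely for illustration): the plotted “real” voltage swings up and down, but the magnitude of the complex value never changes.

```python
import cmath

# A 60 Hz "voltage" of magnitude 110 as a rotating vector in the complex plane,
# sampled every 1/240 of a second (four samples per cycle).
for t in range(5):
    v = 110 * cmath.exp(2j * cmath.pi * 60 * t / 240)
    print(f"t{t}: real = {v.real:8.2f}, imaginary = {v.imag:8.2f}, magnitude = {abs(v):.2f}")
```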

You also see it in the Fourier transform: when we analyze sound using a computer, one of the tricks we use is decomposing a complex waveform (like a human voice speaking) into a collection of basic sine waves, where the sine waves added up equal the wave at a given point in time. The process by which we
do that decomposition is intimately tied to complex numbers: the Fourier transform, and all of the analyses and transformations built on it, are dependent on the reality of complex numbers (and in particular on the magnificent Euler’s equation up above).

Bad, bad, bad math! AiG and Information Theory

While taking a break from some puzzling debugging, I decided to hit one of my favorite comedy sites, Answers in Genesis. I can pretty much always find something sufficiently stupid to amuse me on their site. Today, I came across a gem called [“Information, science and biology”][gitt], by the all too appropriately named “Werner Gitt”. It’s yet another attempt by a creationist twit to find some way to use information theory to prove that life must have been created by god.
It looks like the Gitt hasn’t actually *read* any real information theory, but has rather just read Dembski’s wretched mischaracterizations, and then regurgitated and expanded upon them. Dembski was bad enough; building on an incomplete understanding of Dembski’s misrepresentations and errors is just astonishing.
Anyway, after butchering an introduction to Shannon theory, he moves onward.
>The highest information density known to us is that of the DNA
>(deoxyribonucleic acid) molecules of living cells. This chemical storage medium
>is 2 nm in diameter and has a 3.4 nm helix pitch (see Figure 1). This results
>in a volume of 10.68 x 10^-21 cm^3 per spiral. Each spiral contains ten chemical
>letters (nucleotides), resulting in a volumetric information density of 0.94 x
>10^21 letters/cm^3. In the genetic alphabet, the DNA molecules contain only the
>four nucleotide bases, that is, adenine, thymine, guanine and cytosine. The
>information content of such a letter is 2 bits/nucleotide. Thus, the
>statistical information density is 1.88 x 10^21 bits/cm^3.
This is, of course, utter gibberish. DNA is *not* the “highest information density known”. In fact, the concept of *information density* is not well-defined *at all*. How do you compare the “information density” of a DNA molecule with the information density of an electromagnetic wave emitted by a pulsar? It’s meaningless to compare. But even if we only count information that’s physically encoded in matter, consider the information density of a crystal, like a diamond. A diamond is an incredibly compact crystal of carbon atoms. There are no perfect diamonds: all crystals contain irregularities and impurities. Consider how dense the information of that crystal is: the position of every flaw, every impurity, the positions of the subset of carbon atoms in the crystal that are carbon-14 as opposed to carbon-12. Considerably denser than DNA, huh?
After this is where it *really* starts to get silly. Our Gitt claims that Shannon theory is incomplete, because after all, it’s got a strictly *quantitative* measure of information: it doesn’t care about what the message *means*. So he sets out to “fix” that problem. He proposes five levels of information: statistics, syntax, semantics, pragmatics, and apobetics. He claims that Shannon theory (and in fact information theory *as a whole*) only concerns itself with the first; because it doesn’t differentiate between syntactically valid and invalid information.
Let’s take a quick run through the five, before I start mocking them.
1. Statistics. This is what information theory refers to as information content, expressed in terms of an event sequence (as I said, he’s following Dembski); so we’re looking at a series of events, each of which is receiving a character of a message, and the information added by each event is how surprising that event was. That’s why he calls it statistical. (There’s a tiny numeric example of this measure just below the list.)
2. Syntax. The structure of the language encoded by the message. At this level, it is assumed that every message is written in a *code*; you can distinguish between “valid” and “invalid” messages by checking whether they are valid strings of characters for the given code.
3. Semantics. What the message *means*.
4. Pragmatics. The *primitive intention* of the transmitter of the message; the specific events/actions that the transmitter wanted to occur as a result of sending the message.
5. Apobetics: The *purpose* of the message.
According to him, level 5 is the most important one.
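Just to make the “statistics” level concrete, here’s the actual Shannon measure in a couple of lines of Python: the information carried by an event is -log2 of its probability. (The “2 bits/nucleotide” figure in the quote above is exactly this, assuming four equally likely bases; the function name `surprisal` is just my label for it.)

```python
import math

def surprisal(p):
    """Shannon's measure of the information carried by an event of probability p."""
    return -math.log2(p)

print(surprisal(1/4))      # 2.0 bits: one of four equally likely DNA bases
print(surprisal(1/1000))   # ≈ 9.97 bits: rarer events are more surprising
```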
Throughout the article, he constantly writes “theorems”. He clearly doesn’t understand what the word “theorem” means, because these things are just statements that he would *like* to be true, but which are unproven, and often unprovable. A few examples?
For example, if we look at the section about “syntax”, we find the following as theorems:
>Theorem 4: A code is an absolutely necessary condition for the representation
>of information.
>
>Theorem 5: The assignment of the symbol set is based on convention and
>constitutes a mental process.
>
>Theorem 6: Once the code has been freely defined by convention, this definition
>must be strictly observed thereafter.
>
>Theorem 7: The code used must be known both to the transmitter and receiver if
>the information is to be understood.
>
>Theorem 8: Only those structures that are based on a code can represent
>information (because of Theorem 4). This is a necessary, but still inadequate,
>condition for the existence of information.
>
>These theorems already allow fundamental statements to be made at the level of
>the code. If, for example, a basic code is found in any system, it can be
>concluded that the system originates from a mental concept.
How do we conclude that a code is a necessary condition for the representation of information? We just assert it. Worse, how do we conclude that *only* things that are based on a code represent information? Again, just an assertion – but an *incredibly* strong one. He is asserting that *nothing* without a
structured encoding is information. And this is also the absolute crux of his argument: information only exists as a part of a code *designed by an intelligent process*.
Despite the fact that he claims to be completing Shannon theory, there is *nothing* to do with math in the rest of this article. It’s all words. Theorems like the ones quoted above, but becoming progressively more outrageous and unjustified.
For example, his theorem 11:
>The apobetic aspect of information is the most important, because it embraces
>the objective of the transmitter. The entire effort involved in the four lower
>levels is necessary only as a means to an end in order to achieve this
>objective.
After this, we get to his conclusion, which is quite a prize.
>On the basis of Shannon’s information theory, which can now be regarded as
>being mathematically complete, we have extended the concept of information as
>far as the fifth level. The most important empirical principles relating to the
>concept of information have been defined in the form of theorems.
See, to him, a theorem is nothing but a “form”: a syntactic structure. And this whole article, to him, is mathematically complete.
>The Bible has long made it clear that the creation of the original groups of
>fully operational living creatures, programmed to transmit their information to
>their descendants, was the deliberate act of the mind and the will of the
>Creator, the great Logos Jesus Christ.
>
>We have already shown that life is overwhelmingly loaded with information; it
>should be clear that a rigorous application of the science of information is
>devastating to materialistic philosophy in the guise of evolution, and strongly
>supportive of Genesis creation.
That’s where he wanted to go all through this train-wreck. DNA is the highest-possible density information source. It’s a message originated by god, and transmitted by each generation to its children.
And as usual for the twits (or Gitts) that write this stuff, they’re pretending to put together logical/scientific/mathematical arguments for god; but they can only do it by specifically including the necessity of god as a premise. In this case, he asserts that DNA is a message, and a message must have an intelligent agent creating it. Since living things cannot be the original creators of the message – the DNA had to be created before us – there must be a god.
Same old shit.
[gitt]: http://www.answersingenesis.org/tj/v10/i2/information.asp

The Categorical Model of Linear Logic

Today we’ll finally get to building the categories that provide the model for
the multiplicative linear logic. Before we jump into that, I want to explain why it is that we separate out the multiplicative part.
Remember from the simply typed lambda calculus, that [we showed that the type system was precisely a subset of intuitionistic logic][lambda-type], and that programs in the lambda calculus corresponded to proofs in the type system. In particular, it was propositional intuitionistic logic using only the “→” operation. Similarly, if you build up a more interesting typed lambda calculus like [System-F][systemf], the type system is intuitionistic logic *with* the “∀” quantifier and the “∧” operator, but without “∨”, negation, or “∃”. Why do we eliminate the existentials and friends? Because if we left them in, then the *type system* becomes Turing complete and undecidable. If we’re careful, we can analyze a program using universal types, logical and (composition in lambda), and implication, and infer the types if it’s a valid program. (Actually, even System-F is undecidable without any type annotations, but the Hindley-Milner subset *is* decidable. NP-hard, but decidable.)
The main reason that people like me really care about linear logic is because there is a variant of lambda calculus that includes *linear types*. Linear types are types where referencing the value corresponding to the type *consumes* it. For dealing with the semantics of real programming languages, there are many things that make sense as linear types. For example, the values of an input stream behave in a linear way: read something from an input stream, and it’s not there anymore: unless you copied it, to put it someplace safe, it’s gone.
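To make “referencing the value consumes it” concrete, here’s a tiny Python toy that enforces use-once behavior at runtime. A real linear type system enforces this *statically*, at compile time; this class, and its names, are purely my own illustration.

```python
class Linear:
    """A use-once ("linear") value, checked at runtime purely for illustration."""
    def __init__(self, value):
        self._value = value
        self._used = False

    def take(self):
        if self._used:
            raise RuntimeError("linear value already consumed")
        self._used = True
        return self._value

packet = Linear("next item from the input stream")
print(packet.take())        # fine: the one allowed use
try:
    packet.take()           # a second reference: the value is already gone
except RuntimeError as err:
    print("error:", err)
```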
For typing linear lambda calculus, we use *intuitionistic linear logic* with the *multiplicative* operators. They give us a *decidable* type system with the capability to express linear type constraints.
Enough side-tracking; back to the categories.
We can now finally define a *linear category*. A category C is a linear category if it is both a [cartesian category][cartesian] and a [monoidal][monoidal] [closed category][monclose] with a *dualizing functor*. A dualizing functor is a *contravariant* closed functor defining properties of the categorical exponential, written (-)* : C → C. (That should be read with “-” as a placeholder for an object, and * as a placeholder for an exponent.) (-)* has the property that there is a natural isomorphism d : Id ≅ (-)** (that is, d is a natural isomorphism from the identity functor to the functor you get by applying (-)* twice) such that the following commutes:

linear-dual.jpg

So, what does this mean? Take the linear implication operator, mix it with the categorical exponent, and what you get *still* behaves *linearly*: that is, if “X -o Y” – one X can be consumed to produce one Y – then 2 Xs can be consumed to produce 2 Ys, and N Xs can be consumed to produce N Ys.
So a linear category behaves cleanly with exponents; it has a linear implication; it has an *eval* operator (from the fact that it’s a cartesian category) to perform the linear implications; it has tensor for producing groups of resources. That’s it; that’s everything we need from categories for a model of linear logic.
Now, there’s just one more question: are there any real linear categories that make sense as a model for linear logic? And obviously, the answer is yes. There are two of them, called **CPO** and **Lin**.
**CPO** is the category with complete partial orders as objects, and monotonic functions as morphisms. Think about that, and it makes a whole lot of sense for linear logic; the equivalence classes of the partial order are resources of the same “value” (that is, that can be exchanged for one another); and you can’t climb *upwards* in the partial order without adding something.
The other one, **Lin**, is a lot more complicated. I’m not going to explain it in detail; that would lead into another two-month-long series! But to put it briefly, **Lin** is the category with *coherent domains* as objects, and linear maps as morphisms. To make that make sense, I would have to explain domain theory, which (to put it mildly) is complicated; the short version is that a domain is a kind of CPO. But the reason we care about **Lin** is that programming language semantics are generally described using [*denotational semantics*][denote]; and denotational semantics are built using a kind of domain, called a Scott domain. So this gives us a domain that we can use in programming language semantics with exactly the properties we need to explain how linear types work.
[lambda-type]: http://goodmath.blogspot.com/2006/06/finally-modeling-lambda-calculus.html
[systemf]: http://en.wikipedia.org/wiki/System_F
[cartesian]: http://scienceblogs.com/goodmath/2006/06/categories_products_exponentia_1.php
[monclose]: http://scienceblogs.com/goodmath/2006/07/towards_a_model_for_linear_log_1.php
[monoidal]: http://scienceblogs.com/goodmath/2006/07/the_category_structure_for_lin_1.php
[denote]: http://en.wikipedia.org/wiki/Denotational_semantics

Mocking a Silly Anti-Relativity Rant

I was reading an article on Slashdot the other day about a recent discovery of what might be a MECO. A [MECO][wiki-meco] is a “magnetospheric eternally collapsing object”; if this were true, it would be a big deal because according to relativity, either black holes exist and MECOs don’t, or MECOs exist and black holes don’t.
I have no intention of getting into the MECO vs. black hole argument. But a commenter there put down a link to something that he seemed to think was a [reasonable argument against relativity][nastytruth]. I took a look, and it’s just *hysterically* funny. The author of the site is a total crackpot; not only does he propose a way of totally redefining physics, but he also proposes an explanation for everything that’s wrong with modern software, and exactly how to build a real, proper AI.
One of my mantras for dealing with crackpots is: “The very worst math is no math”. This guy does a spectacular job of demonstrating that.
Just for fun, I’ve got to quote the beginning of his diatribe. There’s nothing more fun than watching a crackpot rant about how it’s the *rest* of the world that are crackpots.
>The Crackpottery
>
>We have all been taught that there is no such thing as absolute motion or
>position or that every motion and position in the universe is relative. This
>unsubstantiated belief, which I have named exclusive relativity, has been
>around for centuries, even before the advent of Albert Einstein and the theory
>of relativity. It was not until early in the twentieth century, however, that
>exclusive relativity became in vogue. Nowadays most physicists consider the
>concept of absolute motion to be no more credible than the flat earth.
>Simple Proof #1 That Exclusive Relativity Is Bogus
>If all positions are relative, then we have a self-referential system in which
>every position is ultimately relative to itself. For example, suppose we have a
>two-body universe. Body A’s position is relative to body B’s position and vice
>versa. Since both positions are relative to the other and there are no other
>bodies, each body’s position is ultimately relative to itself. Of course, it
>does not matter whether there are only two bodies or a billion.
>
>Exclusive relativity amounts to saying things like, “you are as tall as you
>are” or “this sound is as loud as itself” or “pick yourself up by your own
>bootstraps.” Of course this is silly but this is the sort of silliness we have
>to believe in if we accept exclusive relativity.
Nope.
If you have two particles and nothing else, you can define their *positions* relative to each other in terms of their *distance* from each other. It’s not circular. Distance is the important fact. In a relativistic universe, there is no special *distinguished* reference point where the “real” position of objects is defined relative to that reference. Everything is described relative to *a* reference; but that reference can be pretty much any location you choose.
This doesn’t mean that measurements or positions are meaningless. It just means that they’re *relative*.
There’s actually a whole field of mathematics that studies things like this: it’s called metric topology. Speaking *very* loosely, metric topology looks at what kinds of *shapes* a continuous surface can take, and how to measure distance in those different kinds of spaces.
For example, if we lived in a two dimensional world, we could imagine that the world was a flat plane. In that case, the distance between two points is defined in one way. And it doesn’t matter *where* you put your reference point on the plane; the distance between two objects on that surface will be the same. We could also imagine a two dimensional world that was the surface of a torus. The distance between objects would be rather different there; but still, you could measure the distance between two objects on the surface of the torus. And no matter what point of reference you choose, the torus looks the same.
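To make the point about distance not caring where you put your reference point, here’s a tiny Python sketch of distance on a flat torus (a rectangle with opposite edges glued together – a simplification of the donut-shaped surface, but it shows the idea, and the function name and dimensions are just made up for this example): shift both points by the same amount, wrapping around the edges, and the distance between them doesn’t change.

```python
def torus_distance(p, q, width=10.0, height=10.0):
    """Distance on a flat torus: along each axis, take the shorter way around."""
    dx = abs(p[0] - q[0]) % width
    dy = abs(p[1] - q[1]) % height
    dx = min(dx, width - dx)
    dy = min(dy, height - dy)
    return (dx * dx + dy * dy) ** 0.5

print(torus_distance((1, 1), (9, 9)))   # ≈ 2.83: wrapping around is the short way
print(torus_distance((4, 4), (2, 2)))   # the same two points shifted by (3, 3): same distance
```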
But if you’re a clueless twit who doesn’t understand what “relative position” means, then you can end up with the argument that this guy just presented.
>Simple Proof #2 That Exclusive Relativity Is Bogus
>
>Suppose there is a force acting on a particle so as to accelerate it. The
>particle has as many relative velocities as there are possible frames of
>reference, an infinite number in fact. Which of the myriads of relative
>velocities does the force change? How does the accelerating agent know about
>them so as to change them all? Answer, it does not. Only one velocity is
>changed by the force because it has no access to the others. The others are
>abstract, i.e., non-physical.
Once again, nope.
One of the things that’s beautiful about relativity is that it provides a set of equations that make this all work. From one point of reference, it may appear that an object is accelerating at rate X; from another point of view, it may appear that it’s accelerating at rate Y; work out the relativity equations, and they’re *both* right. Time dilation and relativistic mass shift make it all work. (In fact, if you were around to read [my series on group theory][groups], you can see [where Blake Stacey explained in a comment][relativity] how relativity describes a lot of things as groups that are symmetric over the kinds of transformations that we’re discussing.)
The problem with the author of this piece is that *he’s not doing math*. Relativity isn’t just a theory with a bunch of words that say “position is relative”, etc. It’s a set of mathematical equations that define in a very precise way what that means, and how it works. Like I said: the worst math is no math. If he’d tried to understand the math, he’d know that there’s no problem here.
>Simple Proof #3 That Exclusive Relativity Is Bogus
>
>Let’s consider the motion of a particle. How does a particle “know” about its
>motion or rest relative to extrinsic frames of references so as to move or be
>at rest relative to them? Are particles psychic? I think not. No particle in
>the universe can make use of the relative because it has no access to it. It
>follows that the universe does not use the relative. The only properties that
>it can use are absolute ones.
Same exact problem as his “simple proof #2”. He didn’t do the math, and so he drew a really stupid invalid conclusion. The math of relativity explains how this works: the apparent velocity and acceleration of a particle in all frames of reference are equally valid; and the reason that they’re equally valid is because if you do the math for shifting the reference frame, you find that the different apparent values are really just different views of the same thing.
[nastytruth]: http://pages.sbcglobal.net/louis.savain/Crackpots/nasty.htm
[groups]: http://goodmath.blogspot.com/2006/06/group-theory-index.html
[relativity]: http://goodmath.blogspot.com/2006/04/some-applications-of-group-theory.html
[wiki-meco]: http://en.wikipedia.org/wiki/Magnetospheric_eternally_collapsing_object
[slashdot-meco]: http://science.slashdot.org/article.pl?sid=06/07/28/0543250

Friday Pathological Programming Language: Whenever

I was hoping for a bit of a vanity post for today’s pathological programming language in honor of my 40th birthday (tomorrow), but I didn’t have time to finish implementing my own little piece of insanity. So it’ll have to wait for some other occasion.
Today’s pathological programming language is a really simple monstrosity called [“Whenever”][whenever]. Whenever is a programming language where statements get executed in *random* order, and there are *no* variables. The only state, the only way of manipulating information in a Whenever program, is by manipulating the collection of executable statements itself: the program is both the code *and* the data at the same time.
The basic program model of Whenever is: you write a set of statements. The statements are inserted into a grab-bag by the interpreter. The interpreter then repeatedly picks a statement out of the bag *at random* and executes it. It keeps doing that until there are no more statements in the bag.
Everything in Whenever is done by manipulating the statement bag.
Each statement is labeled by a number. The number has no direct meaning; it’s just an identifier that will be used to reference the statement. The numbers assigned to lines don’t have to match the order in which the lines were written in the source file; and they have no effect on the order by which statements are pulled out of the bag.
So how do you do anything in this crazy language?
There’s a print statement for printing things out. So, for example,
23 print("Hello world\n");
is the hello world program. Since there’s only one statement, it’ll get grabbed from the statement bag, and executed.
There’s also a read statement, which reads a number from standard input, and then acts as if the statement were the number that it read.
The simplest statement is just a number. A number statement *adds* a copy of the line identifier by that number to the bag. If the number is negative, then it *removes* a copy of the statement from the bag. So, for example:
23 print("Hello world\n");
11 23;
would print “Hello world” twice. If 23 were executed first, it would print “Hello world”, and then only 11 would be in the bag, so it would execute 11, which would add 23 to the bag. If 11 went first, then it would add 23 to the bag, and there would be two 23s in the bag, which would get executed.
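Here’s the grab-bag model as a few lines of Python – just the random-selection loop, with print statements and add/remove statements, and no defer/again/N() – enough to see why both orderings above end up printing twice. (`run_whenever` and the dict encoding are my own invention for this sketch, not anything from the real interpreter.)

```python
import random

def run_whenever(program, bag):
    """Toy model of Whenever's statement bag: strings print themselves; lists of
    numbers add (positive) or remove (negative) copies of lines from the bag."""
    bag = list(bag)
    while bag:
        line = random.choice(bag)        # pick a random statement...
        bag.remove(line)                 # ...and take it out of the bag
        action = program[line]
        if isinstance(action, str):
            print(action)
        else:
            for n in action:
                if n > 0:
                    bag.append(n)
                elif -n in bag:
                    bag.remove(-n)

# the two-statement program above: prints "Hello world" twice, in either order
run_whenever({23: "Hello world", 11: [23]}, bag=[23, 11])
```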
You can add multiple copies of a line to the bag: 5#3 adds 3 copies of statement 5 to the bag. And you can add multiple statements to the bag at once, by separating them with a comma. So:
17 21, -2, 3#4, -5#2;
Would insert one copy of statement 21 and four copies of statement 3 to the bag; and remove one copy of statement 2, and two copies of statement 5.
You can also do any normal arithmetic operation on numbers. The result of the arithmetic operation is interpreted as a line number.
There are also two kinds of conditionals, “defer” and “again”. Defer takes a parameter which is evaluated to a line number; if there are any copies of that line number in the bag, then the statement reinserts itself, and doesn’t do anything else. If there are no copies of the parameter in the bag, then the rest of the statement is executed.
Again, an example:
1 print(“Hello”);
2 defer(1) print("World\n");
is a more complicated version of hello world.
The “again” statement is very similar to the “defer” statement; but if its argument is true (i.e., evaluates to a statement that is present in the bag), then it adds a copy of itself to the bag; whether the parameter is true or false, it then executes the rest of the statement.
There’s one helpful built-in function: N(x) returns the number of copies of statement x in the bag.
So, a couple of interesting example programs:
1 defer (4 || N(1)<N(2) && N(2)<N(3)) print(N(1)+" bottles of beer on the wall, "+N(1)+" bottles of beer,");
2 defer (4 || N(1)==N(2)) print("Take one down and pass it around,");
3 defer (4 || N(2)==N(3)) print(N(1)+" bottles of beer on the wall.");
4 1#98,2#98,3#98;
This first ensures that statement four runs first: statements 1, 2, and 3 will all defer until 4 has been executed. Once four is run, there are 99 copies of statements 1, 2, and 3 in the bag. The rest of the defer statement makes sure that 1 executes before 2, and 2 before 3; so it cycles through 1, 2, 3 99 times. Pretty simple, right?
How about this?
1 again (1) defer (3 || N(1)<=N(2) || N(7)>99) 2#N(1),3,7;
2 again (2) defer (3 || N(2)<=N(1) || N(7)>99) 1#N(2),3,7;
3 defer (5) print(N(1)+N(2));
4 defer (5) print(“1”);
5 4,-3,7;
6 defer (4) 3;
7 7;
8 defer (N(7)<100) -1#N(1),-2#N(2),-7#100;
9 defer (3 || 6) 1,3;
If you look carefully, you’ll see that it generates the first 100 Fibonacci numbers.
It's an incredibly simple language. Simple, but quite twisted, and seriously mind-warping to try to program: you need to always keep track of the fact that the statements that represent your data could get selected for execution, which will modify the data unless you used a "defer" as a guard, but then you need to make sure that the guard gets preserved correctly… It's quite devious.
[whenever]: http://www.dangermouse.net/esoteric/whenever.html