Turing Machines – what they are, what they aren't.

It’s the anniversary of the birth of Alan Turing. So there’s a ton of people writing commemorative stories. And naturally, a ton of those stories get it wrong. And they get it wrong in a very sad way.

Of course, when you talk about Turing, you talk about Turing machines. Despite the fact that Turing did lots of stuff besides just that machine, it’s always the machine that people focus on. Partly that’s laziness – the machine has his name after all, so it’s one of the first things you find if you Google Turing. It’s also the easiest thing to write about.

What do they say about the Turing machine? It’s “the simplest computing device”. It’s the “basis for modern computers”. It’s “the theoretical model for the microchips in your laptop”. It’s the “mathematical description of your computer”. None of those things are true. And they all both over- and under-state what Turing really did. In terms of modern computers, the Turing machine’s contribution to the design of real computers is negligible, if not non-existent. But his contribution to our understanding of what computers can do, and to our understanding of how mathematics really works – those are far, far more significant than the architecture of any single machine.

The Turing machine isn’t a model of real computers. The computer that you’re using to read this has absolutely nothing to do with the Turing machine. As a real device, the Turing machine is absolutely terrible.

The Turing machine is a mathematical model not of computers, but of computation. That’s a really important distinction. The Turing machine is an easy-to-understand model of a computing device. It’s definitely not the simplest model. There are simpler computing devices (for example, I think that the rule 110 cellular automaton is simpler) – but their simplicity makes them harder to understand.

The reason that the Turing machine is so important comes down to two important facts. First, there’s a limit to what a mechanical device can do: there are lots of machines out there, but ultimately, no machine can go past that limit. Second, which machine you use to talk about computation doesn’t matter: any machine that can reach that limit is, for the purposes of the theory of computation, pretty much the same. When we talk about studying computation, what we’re talking about is the set of things that can be done by a machine – not by a specific machine, but by any machine. The specific choice of machine isn’t important. And that’s the point: computation is computation. That’s what Turing figured out.

The Turing machine is a brilliant creation. It’s a simple machine. It’s really easy to understand. And it’s easy to tweak – that is, it’s easy to do experiments where you can modify the machine, and see what effect it has.

So let’s take a step back, and see: what is a Turing machine?

The Turing machine is a very simple kind of theoretical computing device. In fact, it’s almost downright trivial. But according to everything that we know and understand about computation, this trivial device is capable of any computation that can be performed by any other computing device.

The basic idea of the Turing machine is very simple. It’s a machine that runs on top of a tape, which is made up of a long series of little cells, each of which has a single character written on it. The machine is a read/write head that moves over the tape, and which can store a little bit of information. At each step, the machine looks at the symbol on the cell under the tape head, and based on what it sees there, and on whatever little bit of information it has stored, it decides what to do. The things that it can do are: change the information it has stored, write a new symbol onto the current tape cell, and move one cell left or right.

That’s really it. People who like to make computing sound impressive often have very complicated explanations of it – but really, that’s all there is to it. The point of it was to be simple – and simple it certainly is. And yet, it can do anything that’s computable.

To really understand how that trivial machine can do computations, it helps to be a bit formal. In formal terms, we talk about a Turing machine as a tuple: (S, s0, A, T), where:

  • S is a finite list of possible states that the machine can be in. The state is the information that the machine can store in the head to make decisions. It’s a very limited amount of information; you can think of it as either a specific list of states, or a small set of small numbers. For most Turing machines, we’ll use it as a specific list of states. (You’ll see what I mean in a minute.)
  • s0 ∈ S is the initial state – the state that the machine will be in when it starts a computation.
  • A is the machine’s alphabet, which is the set of symbols which can appear on the machine’s tape.
  • T is the machine’s transition function. This is the real heart of the machine, where the computation is defined. It’s a function from the machine state and the alphabet character on the current tape cell to the action that the machine should take. The action is a triple consisting of a new value for the machine’s state, a character to write onto the current tape cell, and a direction to move the tape head – either left or right.

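If it helps, here’s that same definition sketched as Haskell types. This is just an illustration (the names are mine), but it’s essentially the shape used by the full interpreter at the end of this post:

-- A Turing machine as a tuple (S, s0, A, T), sketched as Haskell types.
type State = String

data Dir = MoveLeft | MoveRight

-- The transition function T: from (current state, symbol under the head)
-- to (new state, symbol to write, direction to move).
type TransFn = (State, Char) -> (State, Char, Dir)

-- S (the states), s0 (the initial state), A (the alphabet), and T.
data TM = TM [State] State [Char] TransFn
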
So, for example, let’s look at a simple machine. This is one of the classic examples: a Turing machine that does subtraction using unary numbers. A unary number “N” is written as a series of N 1s. For the program, we’ll give the machine an input in the format “N-M=” written onto the tape; after running the machine, the tape will contain the value of M subtracted from N. So, for example, we could use “1111-11=” as an input; the output would be “11”.

The alphabet is the characters “1”, “ ” (blank space), “-” (minus sign), and “=” (equal sign). The machine has four states: {“scanright”, “eraseone”, “subone”, “skip”}. It starts in the state “scanright”. Its transition function is given by the following table:

FromState   Symbol   ToState     WriteChar   Dir
scanright   space    scanright   space       right
scanright   1        scanright   1           right
scanright   minus    scanright   minus       right
scanright   equal    eraseone    space       left
eraseone    1        subone      equal       left
eraseone    minus    HALT        space       left
subone      1        subone      1           left
subone      minus    skip        minus       left
skip        space    skip        space       left
skip        1        scanright   space       right

What this machine does is move to the right until it sees the equal sign; it erases the equal sign and moves to the left, erases one digit off of the second number, and replaces it with the equal sign (so the second number is reduced by one, and the equal sign is moved over one position). Then it scans back to the minus sign (which separates the two numbers), erases one digit of the first number, and switches back to scanning to the right for the equal sign. So, one at a time, it erases one digit from each of the two numbers. When it reaches the equal sign, starts going back to erase a digit from the second number, and hits the “-” before it finds a digit, that means it’s done, so it stops. Let’s look at a simple execution trace. In the trace, I’ll write the machine state, followed by a colon, followed by the tape contents surrounded by “[]”, with the current tape cell surrounded by “{}”.

    scanright:  [ {1}1111111-111= ]
    scanright:  [ 1{1}111111-111= ]
    scanright:  [ 11{1}11111-111= ]
    scanright:  [ 111{1}1111-111= ]
    scanright:  [ 1111{1}111-111= ]
    scanright:  [ 11111{1}11-111= ]
    scanright:  [ 111111{1}1-111= ]
    scanright:  [ 1111111{1}-111= ]
    scanright:  [ 11111111{-}111= ]
    scanright:  [ 11111111-{1}11= ]
    scanright:  [ 11111111-1{1}1= ]
    scanright:  [ 11111111-11{1}= ]
    scanright:  [ 11111111-111{=} ]
    eraseone :  [ 11111111-11{1}  ]
    subone   :  [ 11111111-1{1}=  ]
    subone   :  [ 11111111-{1}1=  ]
    subone   :  [ 11111111{-}11=  ]
    skip     :  [ 1111111{1}-11=  ]
    scanright:  [ 1111111 {-}11=  ]
    scanright:  [ 1111111 -{1}1=  ]
    scanright:  [ 1111111 -1{1}=  ]
    scanright:  [ 1111111 -11{=}  ]
    eraseone :  [ 1111111 -1{1}   ]
    subone   :  [ 1111111 -{1}=   ]
    subone   :  [ 1111111 {-}1=   ]
    skip     :  [ 1111111{ }-1=   ]
    skip     :  [ 111111{1} -1=   ]
    scanright:  [ 111111 { }-1=   ]
    scanright:  [ 111111  {-}1=   ]
    scanright:  [ 111111  -{1}=   ]
    scanright:  [ 111111  -1{=}   ]
    eraseone :  [ 111111  -{1}    ]
    subone   :  [ 111111  {-}=    ]
    skip     :  [ 111111 { }-=    ]
    skip     :  [ 111111{ } -=    ]
    skip     :  [ 11111{1}  -=    ]
    scanright:  [ 11111 { } -=    ]
    scanright:  [ 11111  { }-=    ]
    scanright:  [ 11111   {-}=    ]
    scanright:  [ 11111   -{=}    ]
    eraseone :  [ 11111   {-}     ]
    Halt:       [ 11111  { }-     ]

See, it works!

One really important thing to understand here is that we do not have a program. What we just did was define a Turing machine that does subtraction. The machine does not take any instructions: the states and the transition function are an intrinsic part of the machine. So the only thing this machine can do is to subtract.

The real genius of Turing wasn’t just defining this simple computing machine. It was realizing the concept of the program: you could design a Turing machine whose input tape contained a description of a Turing machine – that is, a program – followed by an input to the program. This single machine – the Universal Turing machine – could simulate any other Turing machine: one machine which could perform any computation.

Turing machines are a lot of fun to play with. Figuring out how to do things can be fascinating. Figuring out how to define a universal Turing machine’s transition function is an amazing thing to do; it’s astonishing how simple the universal machine can be!

As I said earlier – you can’t make a mechanical computing device that does anything that a Turing machine can’t do. One of the beauties of the Turing machine is that it lets you explore that. You can try adding and removing features to the basic machine, and see what happens.

For example: if you can do lots of great stuff with a Turing machine with one tape, what if you had a two-tape Turing machine? That is, take the basic Turing machine, and say that it’s got two tapes, each with a read/write head. Each state transition rule on this machine depends on the pair of values found on the two tapes. For now, we’ll say that the tapes move together – that is, the transition rule says “move the heads right” or “move the heads left”.

Seems like this should represent a real increase in power, right? No. Take a single-tape Turing machine. Take the alphabet for tape one, and call it A1; take the alphabet for tape 2, and call it A2. We can create a single-tape Turing machine whose alphabet is the cross-product of A1 and A2. Now each symbol on the tape is the equivalent of a pair: one symbol from tape 1 and one symbol from tape 2. So we’ve got a single-tape machine which is equivalent to the two-tape machine. Bingo.
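
Here’s a tiny sketch of that construction in Haskell (the names are hypothetical, just to make the cross-product concrete):

-- Cross-product alphabet: each cell of the single tape holds a pair,
-- one symbol from each of the two original tapes.
crossAlphabet :: [Char] -> [Char] -> [(Char, Char)]
crossAlphabet a1 a2 = [ (x, y) | x <- a1, y <- a2 ]

-- crossAlphabet "01" "ab" ==> [('0','a'),('0','b'),('1','a'),('1','b')]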

We can lift the restriction on the heads moving together, but it’s a lot more work. A two-tape machine can do things a lot faster than a one-tape machine, and the simulation will necessarily be slower. But it’s entirely doable. How about a two-dimensional tape? We can simulate that pretty easily with a two-tape machine, which means we can do it with a one-tape machine. For a two-tape machine, what we do is map the two-D tape onto the one-D tape by enumerating its cells in a fixed order – so that cell 0 on the one-D tape corresponds to cell (0,0) on the two-D; cell (0,1) on the two-D corresponds to cell 1 on the one-D; cell (1,1) on the two-D is cell 2 on the one-D; etc. Then we use the second tape for the book-keeping necessary to do the equivalent of two-D tape moves. And we’ve got a two-D Turing machine simulated with a two-tape one-D; and we know that we can simulate a two-tape one-D with a one-tape one-D.
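
If you want to play with that mapping yourself, one standard enumeration – a slightly different order than the one described above, but the same idea – is the Cantor pairing function, which walks the grid diagonal by diagonal:

-- Cantor pairing: a bijection from pairs of naturals to naturals.
cantorPair :: Integer -> Integer -> Integer
cantorPair x y = (x + y) * (x + y + 1) `div` 2 + y

-- cantorPair 0 0 == 0, cantorPair 1 0 == 1, cantorPair 0 1 == 2, ...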

This is, to me, the most beautiful thing about the Turing machine. It’s not just a fundamental theoretical construction of a computing device; it’s a simple construction of a computing device that’s really easy to experiment with. Consider, for a moment, lambda calculus. It’s more useful than a Turing machine for lots of purposes – we write real programs in lambda calculus, when no one would build a real application using a Turing machine program. But imagine trying things like the alternate constructions above in lambda calculus: it’s a whole lot harder to build experiments like those. Likewise for Minsky machines, Markov machines, etc.

For your enjoyment, I’ve implemented a Turing machine programming language. You feed it a Turing machine description, and an input string, and it will give you a trace of the machine’s execution like the one above. Here’s the specification of the subtraction machine written in my little Turing language:

states { "scanright" "eraseone" "subtractOneFromResult" "skipblanks" } initial "scanright"
alphabet { '1' ' ' '=' '-' } blank ' '
trans from "scanright" to "scanright" on (' ') write ' ' move right
trans from "scanright" to "scanright" on ('1') write '1' move right
trans from "scanright" to "scanright" on ('-') write '-' move right
trans from "scanright" to "eraseone" on ('=') write ' ' move left
trans from "eraseone" to "subtractOneFromResult" on ('1') write '=' move left
trans from "eraseone" to "Halt" on ('-') write ' ' move left
trans from "subtractOneFromResult" to "subtractOneFromResult" on ('1') write '1' move left
trans from "subtractOneFromResult" to "skipblanks" on ('-') write '-' move left
trans from "skipblanks" to "skipblanks" on (' ') write ' ' move left
trans from "skipblanks" to "scanright" on ('1') write ' ' move right

I think it’s pretty clear as a syntax, but it still needs explanation.

  • The first line declares the possible states of the machine, and what state it starts in. This machine has four possible states – “scanright”, “eraseone”, “subtractOneFromResult”, and “skipblanks”. When the machine starts, it will be in the state “scanright”.
  • The second line declares the set of symbols that can appear on the tape, including what symbol will initially appear on any tape cell whose value wasn’t specified by the input. For this machine the symbols are the digit 1, a blank space, the equal sign, and the subtraction symbol; the blank symbol is on any tape cell whose initial value wasn’t specified.
  • After that is a series of transition declarations. Each declaration specifies what the machine will do for a given pair of initial state and tape symbol. So, for example, if the machine is in state “scanright”, and the current tape cell contains the equal sign, then the machine will change to state “eraseone”, write a blank onto the tape cell (erasing the “=” sign), and move the tape head one position to the left.

Finally, the code. You’ll need GHCI to run it; at the moment, it won’t work in Hugs (I don’t have the parsing library in my version of Hugs), and I haven’t worked out the linkage stuff to make it work under the GHC compiled mode.

--
-- A Simple Turing machine interpreter
-- Copyright 2007 by Mark C. Chu-Carroll
--    markcc@gmail.com
--   http://scienceblogs.com/goodmath
--
-- You can do anything you want with this code so long as you
-- leave this copyright notice intact.
--
module Turing where
import Text.ParserCombinators.Parsec
import qualified Text.ParserCombinators.Parsec.Token as P
import Text.ParserCombinators.Parsec.Language
import System.Environment

data Motion = MoveLeft  | MoveRight deriving (Show)

-- Transition from to on move write
data Transition = Transition String String [Char] Motion  Char deriving (Show)

-- TuringMachine states initial alphabet blank transitions
data TuringMachine = Machine [String] String   [Char]    Char    [Transition] deriving (Show)

data MachineState = TMState [Char] Char [Char] String deriving (Show)
--                  tape left of head (reversed), current cell, tape right of head, current state

getBlankSym :: TuringMachine -> Char
getBlankSym (Machine _ _ _ bl _) = bl

findMatchingTransition :: [Transition] -> String -> Char -> Maybe Transition
findMatchingTransition [] _ _ =  Nothing
findMatchingTransition translist state c =
     let matches = filter (transitionMatches state c) translist
     in case matches of
          [] -> Nothing
          t:[] -> Just t
          _ -> error "Ambiguous transition"
       where transitionMatches state c (Transition from to on move write) =
                        (from == state) && (elem c on)

runTransition :: TuringMachine -> [Char] -> Char -> [Char] -> String -> Transition -> MachineState
-- Moving left off the end of the tape: extend it with a blank cell.
runTransition m [] c right state (Transition from tostate on MoveLeft write) =
   TMState [] (getBlankSym m) (write:right) tostate
runTransition m (l:left) c right state (Transition from tostate on MoveLeft write) =
   TMState left l (write:right) tostate
-- Moving right off the end of the tape: likewise.
runTransition m left c [] state (Transition from tostate on MoveRight write) =
   TMState (write:left) (getBlankSym m) [] tostate
runTransition m left c (r:right) state (Transition from tostate on MoveRight write) =
   TMState (write:left) r right tostate

stepMachine :: TuringMachine -> MachineState -> MachineState
stepMachine machine@(Machine _ _ _ _ translist) st@(TMState tapeleft c taperight state) =
       let trans = findMatchingTransition translist state c
       in case trans of
          (Just t) -> runTransition machine tapeleft c taperight state t
          Nothing -> error "No applicable transition from state"

getMachineStateString (TMState left c right state) =
	(state ++ ":[" ++ (reverse left) ++ "{" ++ [c] ++ "}" ++ right ++ "]")

runTracedMachine :: TuringMachine -> [Char] -> [String]
runTracedMachine tm@(Machine states initial alphabet blank transitions) (t:tape) =
    runTracedMachineSteps tm (TMState [] t tape initial) where
        runTracedMachineSteps machine state =
           let newstate@(TMState left c right st) = stepMachine machine state
           in if st == "Halt"
               then [getMachineStateString newstate]
               else let trace=runTracedMachineSteps machine newstate
                    in ((getMachineStateString newstate):trace)

runMachine :: TuringMachine -> [Char] -> [Char]
runMachine tm@(Machine states initial alphabet blank transitions) (t:tape) =
    runMachineSteps tm (TMState [] t tape initial) where
        runMachineSteps machine state =
           let newstate@(TMState left c right st) = stepMachine machine state
           in if st == "Halt"
               then (concat [(reverse left), [c], right])
               else runMachineSteps machine newstate

---------------------------------------------------------------------------
-- Parsing code - implemented using the Parsec monadic parser library.
---------------------------------------------------------------------------

-- Basic setup stuff - use a standard haskell style lexer; set up the reserved words
-- and symbols in the lexer.
lexer :: P.TokenParser ()
lexer = P.makeTokenParser (haskellDef
                        { P.reservedNames = ["states","alphabet","trans","from","to","on","write","move","left","right","initial","blank"] })

reserved = P.reserved lexer
symbol = P.symbol lexer
braces = P.braces lexer
parens = P.parens lexer
charlit = P.charLiteral lexer
strlit = P.stringLiteral lexer
ident = P.identifier lexer
whitespace = P.whiteSpace lexer

states = reserved "states"
alphabet = reserved "alphabet"
trans = reserved "trans"
from = reserved "from"
to = reserved "to"
on = reserved "on"
write = reserved "write"
move = reserved "move"
initial = reserved "initial"
blank = reserved "blank"

left = do { reserved "left"
          ; return MoveLeft
          }

right = do { reserved "right"
           ; return MoveRight
           }

-- statesDecl ::= "states" "{"  strlit+  "}" "initial" strlit
statesDecl = do { states
                ; strlst <- braces (many1 strlit)
                ; initial
                ; i <- strlit
                ; return (i,strlst)
                }

-- alphaDecl ::= "alphabet" "{" charlit+  "}" "blank" charlit
alphaDecl = do { alphabet
               ; charlst <- braces (many1 charlit)
               ; blank
               ; bl <- charlit
               ; return (bl, charlst)
               }

-- transDecl ::= "trans" "from" strlit "to" strlit "on" "(" charlit+ ")" "write" charlit "move" ("left" | "right")
transDecl = do { trans
               ; from
               ; fromState <- strlit
               ; to
               ; targetState <- strlit
               ; on
               ; chrs <- parens (many1 charlit)
               ; write
               ; wrchar <- charlit
               ; move
               ; dir <- ( left <|> right)
      	   ; return (Transition fromState targetState chrs dir wrchar)
              }

-- machine ::= statesDecl alphaDecl transDecl+
machine = do { (i,sts) <- statesDecl
             ; (bl,alpha) <- alphaDecl
             ; trans <- many1 transDecl
             ; return (Machine sts i alpha bl trans)
             }

run :: (Show a) => Parser a -> String -> IO a
run p input
	= case (parse p "" input) of
		Left err -> do{ putStr "parse error at "
		; print err
		; error "Parse error"
		}
		Right x -> return x

runTParser ::  String -> IO TuringMachine
runTParser input =
	run (do { whitespace
	        ; x <- machine
	        ; eof
	        ; return x })  input

--------------------------------------------------------------------------
-- A sample Turing program - first in the comment, and then coded into
-- a string literal, to make it easy to run tests. This sample program
-- is a basic Turing subtract - it takes a unary equation "111111-111=",
-- and leaves the result of subtracting the second from the first.
--
-- states { "one" "two" "three" "four" } initial "one"
-- alphabet { '1' ' ' '=' '-' } blank ' '
-- trans from "one" to "one" on (' ') write ' ' move right
-- trans from "one" to "one" on ('1') write '1' move right
-- trans from "one" to "one" on ('-') write '-' move right
-- trans from "one" to "two" on ('=') write ' ' move left
-- trans from "two" to "three" on ('1') write '=' move left
-- trans from "two" to "Halt" on ('-') write '-' move left
-- trans from "three" to "three" on ('1') write '1' move left
-- trans from "three" to "four" on ('-') write '-' move left
-- trans from "four" to "four" on (' ') write ' ' move left
-- trans from "four" to "one" on ('1') write ' ' move right

sampleMachine = concat ["states { \"one\" \"two\" \"three\" \"four\" } initial \"one\"\n ",
                        " alphabet { '1' ' ' '=' '-' } blank ' '\n ",
                        "trans from \"one\" to \"one\" on (' ') write ' ' move right\n ",
                        "trans from \"one\" to \"one\" on ('1') write '1' move right\n ",
                        "trans from \"one\" to \"one\" on ('-') write '-' move right\n ",
                        "trans from \"one\" to \"two\" on ('=') write ' ' move left\n ",
                        "trans from \"two\" to \"three\" on ('1') write '=' move left\n ",
                        "trans from \"two\" to \"Halt\" on ('-') write '-' move left\n ",
                        "trans from \"three\" to \"three\" on ('1') write '1' move left\n ",
                        "trans from \"three\" to \"four\" on ('-') write '-' move left\n ",
                        "trans from \"four\" to \"four\" on (' ') write ' ' move left\n ",
                        "trans from \"four\" to \"one\" on ('1') write ' ' move right"  ]

runTracedMachineOnString :: String -> String -> IO ([String])
runTracedMachineOnString m str =
	do
		tm <- runTParser m
		return (runTracedMachine tm str)

runMachineOnString :: String -> String -> IO String
runMachineOnString m str =
    do
	    tm <- runTParser m
	    return (runMachine tm str)

sampleInput = " 11111111-111= "

------------------------------------------------------------------------
-- Main program execution scaffolding
-- main still needs a bit of work so that ghc will link correctly;
-- runs fine in GHCi, but linkage errors in GHC. For now, just load
-- this file, and then execute "runFromFile" from the GHCi prompt.
------------------------------------------------------------------------
main =
    do
       [file] <- getArgs
       m <- parseFromFile (do { whitespace
                              ; x <- machine
                              ; eof
                              ; return x }) file
       case m of
          Right machine -> do
             print "Enter input for parser:"
             s <- getLine
             result <- return (runMachine machine s)
             print (concat ["Result:[", result, "]"])
          Left x -> do
	        print (concat ["Parse error"])

runFromFile :: String -> IO ()
runFromFile filename =
    do
       m <- parseFromFile (do { whitespace
                              ; x <- machine
                              ; eof
                              ; return x }) filename
       case m of
          Right machine -> do
             print "Enter input for parser:"
             s <- getLine
             result <- return (runMachine machine s)
             print (concat ["Result:[", result, "]"])
          Left x -> do
	        print (concat ["Parse error"])


A Neat Trick with Partial Evaluators

This is something I was talking about with a coworker today, and I couldn’t quickly find a friendly writeup of it online, so I decided to write my own. (Yes, we do have a lot of people who enjoy conversations like this at foursquare, which is one of the many reasons that it’s such a great place to work. And we are hiring engineers! Feel free to get in touch with me if you’re interested, or just check our website!)

Anyway – what we were discussing was something called partial evaluation (PE). The idea of PE is almost like an eager form of currying. Basically, you have a two-parameter function, f(x, y). You know that you’re going to invoke it a bunch of times with the same value of x, but different values of y. So what you want to do is create a new function, fx(y), which evaluates as much of f as possible using the known parameter.

It’s often clearer to see it in programmatic form. Since these days I pretty much live in Scala, I’ll use Scala syntax. We can abstractly represent a program as an object which can be run with a list of arguments, and which returns some value as its result.

trait Program {
  def run(inputs: List[String]): String
}

If that’s a program, then a partial evaluator would be:

object PartialEvaluator {
  def specialize(prog: Program, knownInputs: List[String]): Program = {
     ...
  }
}

What it does is take a program and a partial input, and return a new program – the original program, specialized for the partial inputs supplied to the partial evaluator. So, for example, imagine that you have a program like:

object Silly extends Program {
  def run(inputs: List[String]): String = {
    val x: Int = inputs(0).toInt
    val y: Int = inputs(1).toInt
    if (x % 2 == 0) {
      (y * y).toString
    } else {
      ((y + 1) * (y + 1)).toString
    }
  }
}

This is obviously a stupid program, but it’s good for an example, because it’s so simple. If we’re going to run Silly a bunch of times with 1 as the first parameter, but with different second parameters, then we could generate a specialized version of Silly:

   val sillySpecialOne = PartialEvaluator.specialize(Silly, List("1"))

Here’s where the interesting part comes, where it’s really different from
currying. A partial evaluator evaluates everything that it
possibly can, given the inputs that it’s supplied. So the value produced by the specializer would be:

object Silly_1 extends Program {
  def run(inputs: List[String]): String = {
    val y: Int = inputs(0).toInt
    ((y + 1) * (y + 1)).toString
  }
}

This can be a really useful thing to do. It can turn into a huge optimization. (Which is, of course, why people are interested in it.) In compiler terms, it’s sort of like an uber-advanced form of constant propagation.

But the cool thing that we were talking about is going a bit meta. What happens if you run a partial evaluator on a programming language interpreter?

object MyInterpreter extends Program {
  def run(inputs: List[String]): String = {
    val code = inputs(0)
    val inputsToCode = inputs.tail
    interpretProgram(code, inputsToCode)
  }

  def interpretProgram(code: String, inputs: List[String]): String = {
     ...
  }
}

We can run our partial evaluator to generate a version of the interpreter that’s specialized for the code that the interpreter is supposed to execute:

  1. We have an interpreter for language M, written in language L.
  2. We have a partial evaluator for language L.
  3. We run the partial evaluator on the interpreter, with a program
    written in M that we want to interpret:
    PartialEvaluator.specialize(M_Interpreter, M_Code).
  4. The partial evaluator produces a program written in L.

So the partial evaluator took a program written in M, and transformed it into a program written in L!

We can go a step further – and generate an M to L compiler. How? By running the partial evaluator on itself. That is, run the partial
evaluator like this: PartialEvaluator.specialize(PartialEvaluator, M_Interpreter). The result is a program which takes an M program as
input, and generates an L program: that is, it’s an M to L compiler!

We can go yet another step, and turn the partial evaluator into a compiler generator: PartialEvaluator.specialize(PartialEvaluator, List(source(PartialEvaluator))). What we get is a program which takes an interpreter written in L, and generates a compiler from it.

It’s possible to actually generate useful tools this way. People have actually implemented Lisp compilers this way! For example, you can see someone do a simple version here.

Numeric Pareidolia and Vortex Math

Update: as of 8/8/2012, the youtube video has been pulled at the request of the TED organization. They’ve also asked me to help them figure out how to keep crackpots like this out of their conferences. I turned them down: if they want help, they’ve got the money to hire a legitimate professional, someone who actually knows what they’re doing – i.e., not me.

I’m not a big fan of the TED phenomenon. In my opinion, it’s basically an elaborate scheme to let a bunch of self-important rich guys show off their importance by getting people to come give them speeches that, ultimately, serve to reinforce their self-importance.

But there’s another reason that I dislike it. In addition to the self-importance of its audience, as it’s grown, it’s also turned into a forum where other self-important twits can pretend that they’re actual scientists presenting important scientific work.

A great example of this is something called “Vortex Math”. An idiot by the name of “Marko Rodin” came up with this ridiculous idea, wrote up a website talking about how wonderful it is and how brilliant he is, and then wrangled an invitation to talk at one of the TEDx conferences, which he’s since used to further promote himself – after all, he must be a genuine genius in order to have been allowed to present to such an important group of people!

Vortex math is an example of what I call numeric pareidolia. For those who haven’t heard the term, pareidolia is seeing patterns in randomness. A common example is seeing the image of Jesus in a piece of toast, or in a mildew stain, or… We find these images, and then believe that they’re not just an illusion, but that they’re a real, deliberate, meaningful reality.

Pareidolia isn’t limited to seeing images. We humans are natural pattern seekers. We can find patterns in dirt, in words, and in numbers. Numeric pareidolia is finding patterns in numbers. The most common version of that is numerology, where you assign meanings to numbers, and then find more meanings by combining those numbers with arithmetic, and reading significance into the results.

In Vortex math, Mr. Rodin has done something interesting, for some definition of that word. He’s found a numeric pattern, and he believes that not only is that a pattern, but it is the pattern, the fundamental structure of the universe. And what is this oh-so-awesome pattern, this vortex that defines the entire nature of the universe?

It’s a sequence of 1-digit, base-10 numbers. To get it, you start, obviously, with 1. So take the number 1. Double it. You get 2. Double it again, and you get 4. Again, 8. Again, 16. But 16 is two digits – so add them together: 7. Double 16 again, and you get 32, 3+2=5, so 5 is next. Double 32, and you get 64. 6+4=10, 1+0=1. Etc. You get a repeating cycle.
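
If you want to check the cycle yourself, here’s a quick Haskell sketch. (It doubles the reduced value rather than the full number, which gives the same sequence, since the digital root of a number is just its remainder mod 9, with 9 in place of 0.)

-- Reduce a number to a single digit by repeatedly summing its digits.
digitalRoot :: Integer -> Integer
digitalRoot n
  | n < 10    = n
  | otherwise = digitalRoot (sum (digits n))
  where digits 0 = []
        digits m = m `mod` 10 : digits (m `div` 10)

-- Rodin's "vortex": keep doubling, reducing each result to one digit.
vortex :: [Integer]
vortex = iterate (digitalRoot . (* 2)) 1

-- take 12 vortex ==> [1,2,4,8,7,5,1,2,4,8,7,5]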

That’s it.

Of course, this only works in base-10. It’s a result of the interaction of doubling with the base. So you won’t get the same pattern in any other number system! According to Rodin, because of the significance of this pattern, that means that base-10 really is the only correct number system.

And what is the significance of this base-10 pattern? Let’s let Rodin and his supporters explain, shall we?

Marko Rodin has discovered the source of the non-decaying spin of the electron. Although scientists know that all electrons in the universe spin, they have never discovered the source of this spin. Rodin has. He has discovered the underpinning geometry of the universe, the fabric of time itself. He has done this by reducing all higher mathematics – calculus, geometry, scalar math – to discrete-number mathematics.

With the introduction of Vortex-Based Mathematics you will be able to see how energy is expressing itself mathematically. This math has no anomalies and shows the dimensional shape and function of the universe as being a toroid or donut-shaped black hole. This is the template for the universe and it is all within our base ten decimal system!

The potential scope and breadth of the Rodin Solution is staggering; it is universally applicable in mathematics, science, biology, medicine, genetics, astronomy, chemistry, physics and computer science. The Rodin Solution will revolutionize computer hardware by creating a crucial gap space, or equi-potential major groove, in processors. This gap space generates underpinning nested vortices resulting in far higher efficiency with no heat build-up. The Rodin Solution replaces the binary code with a new code called the binary triplet which will revolutionize computer operating systems. It will transform physics and astrophysics by finally answering how black holes and pulsars work. Space travel will be revolutionized by reactionless drives that are unaffected by the weight they pull, making the present day combustion engine obsolete. The revolution brought on by reactionless drives will far surpass the societal changes wrought by the shift from steam engines to the present day combustion engine. The Rodin Solution can even be applied to ending pollution and drought by creating an inexhaustible, nonpolluting energy source. Because Rodin´s Vortex-Based Mathematics enables him to condense a trillion-fold calculation to only a few integer steps and because he is able to solve all the mathematical enigmas, the Rodin Solution will revolutionize computer information compression.

Pretty impressive, eh?

And what would crackpottery be without at least a bit of conspiracy? See, the government knows all about it, and they’re actually secretly using it to protect us:

Rudimentary versions of the Rodin Coil, or Rodin Torus, have been created and tested by leading scientists and are presently being used by the U.S. Government in antennas that protect the four corners of the continental U.S.. Life-saving medical devices based on crude approximations of the Rodin Coil Torus are being manufactured and used in the treatment of cancer patients. Microsoft´s former senior researcher is using the Rodin Coil to research, develop and patent new computer information-compression schemes.

Nifty!

Alas, it’s all bullshit. It’s not worth spending too much time on this, but I’ll grab a couple of the claims that are close to my interests, and briefly explain why he’s so full of shit.

One of the claims in the passage above is how he’ll revolutionize computer operating systems:

The Rodin Solution replaces the binary code with a new code called the binary triplet which will revolutionize computer operating systems

Suppose, for a moment, that we replaced binary in our computers with a different underlying representation – any underlying representation. Ternary, quaternary, decimal, or his “binary triplets”, whatever they are. How much difference would that make?

None at all. We’ve had the capacity to create ternary computers for a long time – there’s just no reason to. We have built decimal computers. For many years, IBM had computers for financial systems that used a representation called BCD – binary coded decimal. BCD can be useful in financials, because it’s easier to control rounding errors. Floating point math is a bit weird, because numbers that should be precise don’t necessarily have precise binary floating point representations, so you can get some odd rounding errors if you’re not careful. You don’t need BCD to do this – you can use a variety of notations, so long as you’re doing fixed point instead of floating point, but using a decimal fixed point representation makes it all easier.
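
To see the kind of rounding weirdness that motivates decimal representations, here’s a two-line Haskell illustration, using exact rationals to stand in for fixed-point decimal:

import Data.Ratio ((%))

main :: IO ()
main = do
  -- 1/10 has no finite binary expansion, so this prints False:
  print (0.1 + 0.2 == (0.3 :: Double))
  -- With an exact representation, the same sum prints True:
  print (1 % 10 + 2 % 10 == (3 % 10 :: Rational))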

The thing is, you can’t do anything with different representations that you can’t do with binary. It doesn’t matter. So we don’t build hardware using different representations. We don’t use binary because we don’t know how to build anything else; we use binary because it’s easiest to build binary hardware, and there’s no benefit to making the hardware more complex.

More importantly, one of the beautiful things about computers is that computers don’t really do binary numbers. Computers use binary to represent things. Numbers are one example of something we can represent. But we can represent anything we want. When I was in college, one of my assignments in a CS class was to implement ternary arithmetic. It’s a simple enough thing that it makes an easy assignment for an undergrad introductory CS class! We can build any representation that we want, and use it. We do this routinely. We’re constantly building new representations for particular purposes. Some of them are so useful that they’ve been enshrined in hardware. For example, computers used to come with only integer hardware – that is, the only mathematical operations implemented in the hardware were operations on integers. The computers still did floating point math – you just needed to implement the representation in software. It was so useful that we added it to hardware in order to make it faster. But it’s not fundamentally different. And if there were a new representation that worked better than simple binary, we could implement it using a standard binary computer.
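
For a flavor of how easy that kind of assignment is, here’s a minimal sketch of ternary digits built on perfectly ordinary binary hardware:

-- Ternary digits ("trits") as an ordinary data type - which, underneath,
-- the machine represents in binary, like everything else.
data Trit = T0 | T1 | T2 deriving (Show, Eq, Enum)

-- Add two trits, returning (result digit, carry digit).
addTrit :: Trit -> Trit -> (Trit, Trit)
addTrit a b = let s = fromEnum a + fromEnum b
              in (toEnum (s `mod` 3), toEnum (s `div` 3))

-- addTrit T2 T2 ==> (T1, T1), i.e. 2 + 2 = 11 in base 3.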

So Rodin’s magic vortex binary-triple computer? There’s just nothing special about it. It’s not going to revolutionize computers.

Another example is compression:

Because Rodin´s Vortex-Based Mathematics enables him to condense a trillion-fold calculation to only a few integer steps and because he is able to solve all the mathematical enigmas, the Rodin Solution will revolutionize computer information compression.

Again, it’s stupid. The problem with compression isn’t that it’s too hard to compute. The problem is more fundamental than that. We can’t compress everything – it’s impossible. (I described more about the reason why it’s generally impossible to do universal compression in this post.) The science and math of data compression are based on the fact that we don’t actually want to compress arbitrary things; we want to compress specific types of things: text, images, video. For each of those, common representations contain a lot of redundancies, and compression tries to remove those redundancies. So, for example, by finding regions in successive frames of a video that don’t change, we can reduce the size of a video file. But that technique won’t do anything for a still image or a text file. We exploit the specific properties of the medium to find an effective way of compressing that specific medium.
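
The counting argument behind that impossibility is short enough to sketch here: there are 2^n bit strings of length n, but only 2^n - 1 strings of length less than n, so no lossless compressor can shrink every input.

-- Pigeonhole check: the number of bit strings shorter than n is
-- 2^0 + 2^1 + ... + 2^(n-1), which is always one less than 2^n.
shorterThan :: Integer -> Integer
shorterThan n = sum [ 2 ^ k | k <- [0 .. n - 1] ]

-- shorterThan 8 == 255, while 2^8 == 256: one string is always left over.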

In fact, we can do better at specific kinds of media with customized hardware. People build custom hardware for things like mp4 compression all the time. But that’s for a specific medium. It’s got nothing to do with general compression. General compression remains impossible, vortex math or no.