{"id":100,"date":"2006-08-04T08:30:00","date_gmt":"2006-08-04T08:30:00","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2006\/08\/04\/friday-pathological-programming-programming-through-grammars-in-thue\/"},"modified":"2006-08-04T08:30:00","modified_gmt":"2006-08-04T08:30:00","slug":"friday-pathological-programming-programming-through-grammars-in-thue","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2006\/08\/04\/friday-pathological-programming-programming-through-grammars-in-thue\/","title":{"rendered":"Friday Pathological Programming: Programming Through Grammars in Thue"},"content":{"rendered":"<p>Today&#8217;s treat: [Thue][thue], pronounced two-ay, after a mathematician named Axel Thue.<br \/>\nThue is a language based on [semi-Thue][semi-thue] grammars. semi-Thue grammars are equivalent to Chosmky-0 grammars, except that they&#8217;re even more confusing: a semi-Thue grammar doesn&#8217;t distinguish between terminal and non-terminal symbols. (Since Chomsky-0 is equivalent to a turing machine, we know that Thue is turing complete.) The basic idea of it is that a program is a set of rewrite rules that tell you how to convert one string into another.  The language itself is incredibly simple; programs written in it are positively mind-warping.<br \/>\nThere are two parts to a Thue program. First is a set of grammar rules, which are written in what looks like pretty standard BNF:  &#8220;*&lt;lhs&gt; ::= &lt;rhs&gt;*&#8221;. After that, there is a section which is the initial input string for the program. The two parts are separated by the &#8220;::=&#8221; symbol by itself on a line.<br \/>\nAs I said, the rules are pretty much basic BNF: whatever is on the left-hand side can be replaced with whatever is on the right-hand side of the rule, whenever a matching substring is found. There are two exceptions, for input and output. Any rule whose right hand side is &#8220;:::&#8221; reads a line of input from the user, and replaces its left-hand-side with whatever it read from the input. And any rule that has &#8220;~&#8221; after the &#8220;::=&#8221; prints whatever follows the &#8220;~&#8221; to standard out.<br \/>\nSo, a hello world program:<br \/>\nx ::=~ Hello world<br \/>\n::=<br \/>\nx<br \/>\nPretty simple: the input string is &#8220;x&#8221;; there&#8217;s a replacement for &#8220;x&#8221; which replaces &#8220;x&#8221; with nothing, and prints out &#8220;Hello world&#8221;.<br \/>\nHow about something much more interesting. Adding one in binary. This assumes that the binary number starts off surrounded by underscore characters to mark the beginning and end of a line.<br \/>\n1_::=1++<br \/>\n0_::=1<br \/>\n01++::=10<br \/>\n11++::=1++0<br \/>\n_0::=_<br \/>\n_1++::=10<br \/>\n\/\/::=:::<br \/>\n::=<br \/>\n_\/\/_<br \/>\nLet&#8217;s tease that apart a bit.<br \/>\n1. Start at the right hand side:<br \/>\n1. &#8220;1_::=1++&#8221; &#8211; If you see a &#8220;1&#8221; on the right-hand end of the number,<br \/>\nreplace it with &#8220;1++&#8221;, removing the &#8220;_&#8221; on the end of the line. The rest<br \/>\nof the computation is going to use the &#8220;++&#8221; to mark the point in<br \/>\nstring where the increment is going to alter the string next.<br \/>\n2. &#8220;0_::=1&#8221; &#8211; if you see a &#8220;0&#8221; on the right-hand-end, then replace it with 1. This doesn&#8217;t insert a &#8220;++&#8221;, so none of the other rules are going to do anything: it&#8217;s going to stop. That&#8217;s what you want: if the string ended in a zero, then adding one to it just changes that 0 to a 1.<br \/>\n2. Now, we get to something interesting: doing the addition:<br \/>\n1. If the string by the &#8220;++&#8221; is &#8220;01&#8221;, then adding one will change it to &#8220;10&#8221;, and that&#8217;s the end of the addition. So we just change it to 10, and get rid of the &#8220;++. &#8221;<br \/>\n2. If the string by the &#8220;++&#8221; is &#8220;11&#8221;, then we change the last digit to &#8220;0&#8221;, and move the &#8220;++&#8221; over to the left &#8211; that&#8217;s basically like adding 1 to 9 in addition: write down a zero, carry the one to the left.<br \/>\n3. Finally, some termination cleanup. If we get to the left edge, and there&#8217;s a zero, after the &#8220;_&#8221;, we can get rid of it. If we get to the left edge, and there&#8217;s a &#8220;1++&#8221;, then we replace it with 10 &#8211; basically adding a new digit to the far left; and get rid of the &#8220;++&#8221;, because we&#8217;re done.<br \/>\n3. Finally, inputting the number &#8220;\/\/::=:::&#8221; says &#8220;If you see &#8220;\/\/&#8221;, then read some input from the user, and replace the &#8220;\/\/&#8221; with whatever you read.<br \/>\n4. Now we have the marker for the end of the rules section, and the string to start the program is &#8220;_\/\/_&#8221;. So that triggers the input rule, which inputs a binary number, and puts it between the &#8220;_&#8221;s.<br \/>\nLet&#8217;s trace this on a couple of smallish numbers.<br \/>\n* Input: &#8220;111&#8221;<br \/>\n1. &#8220;_111_&#8221; &rarr; &#8220;_111++&#8221;<br \/>\n2. &#8220;_111++_&#8221; &rarr; &#8220;_11++0&#8221;<br \/>\n3. &#8220;_11++0&#8221; &rarr; &#8220;_1++00&#8221;<br \/>\n4. &#8220;_1++00&#8221; &rarr; &#8220;1000&#8221;<br \/>\n* Input: &#8220;1011&#8221;<br \/>\n1. &#8220;_1011_&#8221; &rarr; &#8220;_1011++&#8221;<br \/>\n2. &#8220;_1011++&#8221; &rarr; &#8220;_101++0&#8221;<br \/>\n3. &#8220;_101++0&#8221; &rarr; &#8220;_1100&#8221;.<br \/>\nNot so hard, eh?<br \/>\nDecrement is the same sort of thing; here&#8217;s decrement with an argument hardcoded instead of reading from input:<br \/>\n0_::=0&#8211;<br \/>\n1_::=0<br \/>\n10&#8211;::=01<br \/>\n00&#8211;::=0&#8211;1<br \/>\n_1&#8211;::=@<br \/>\n_0&#8211;::=1<br \/>\n_0::=<br \/>\n::=<br \/>\n_1000000000000_<br \/>\nNow, something more complicated. From [here][vogel], a very nicely documented program that computes fibonacci numbers. It&#8217;s an interesting mess &#8211; it converts things into a unary form, does the computation in unary, then converts back. This program uses a fairly neat trick to get comments. Thue doesn&#8217;t have any comment syntax. But they use &#8220;#&#8221; on the left hand side as a comment marker, and then make sure that it never appears anywhere else &#8211; so none of the comment rules can ever be invoked.<br \/>\n&#8212;-<br \/>\n#::= fibnth.t &#8211; Print the nth fibonacci number, n input in decimal<br \/>\n#::= (C) 2003 Laurent Vogel<br \/>\n#::= GPL version 2 or later (http:\/\/www.gnu.org\/copyleft\/gpl.html)<br \/>\n#::= Thue info at: http:\/\/www.catseye.mb.ca\/esoteric\/thue\/<br \/>\n#::= This is modified thue where only foo::=~ outputs a newline.<br \/>\n#::= &#8216;n&#8217; converts from decimal to unary-coded decimal (UCD).<br \/>\nn9::=*********,n<br \/>\nn8::=********,n<br \/>\nn7::=*******,n<br \/>\nn6::=******,n<br \/>\nn5::=*****,n<br \/>\nn4::=****,n<br \/>\nn3::=***,n<br \/>\nn2::=**,n<br \/>\nn1::=*,n<br \/>\nn0::=,n<br \/>\nn-::=-<br \/>\n#::= &#8216;.&#8217; prints an UCD number and eats it.<br \/>\n.*********,::=.~9<br \/>\n.********,::=.~8<br \/>\n.*******,::=.~7<br \/>\n.******,::=.~6<br \/>\n.*****,::=.~5<br \/>\n.****,::=.~4<br \/>\n.***,::=.~3<br \/>\n.**,::=.~2<br \/>\n.*,::=.~1<br \/>\n.,::=.~0<br \/>\n~9::=~9<br \/>\n~8::=~8<br \/>\n~7::=~7<br \/>\n~6::=~6<br \/>\n~5::=~5<br \/>\n~4::=~4<br \/>\n~3::=~3<br \/>\n~2::=~2<br \/>\n~1::=~1<br \/>\n~0::=~0<br \/>\n#::= messages moving over &#8216;,&#8217;, &#8216;*&#8217;, &#8216;|&#8217;<br \/>\n*&lt;::=&lt;*<br \/>\n,&lt;::=&lt;,<br \/>\n|&lt;::=*::=*0&gt;<br \/>\n0&gt;,::=,0&gt;<br \/>\n0&gt;|::=|0&gt;<br \/>\n1&gt;*::=*1&gt;<br \/>\n1&gt;,::=,1&gt;<br \/>\n1&gt;|::=|1&gt;<br \/>\n2&gt;*::=*2&gt;<br \/>\n2&gt;,::=,2&gt;<br \/>\n2&gt;|::=|2&gt;<br \/>\n#::= Decrement n. if n is &gt;= 0, send message &#8216;2&gt;&#8217;;<br \/>\n#::= when n becomes negative, we delete the garbage using &#8216;z&#8217; and<br \/>\n#::= print the left number using &#8216;.&#8217;.<br \/>\n*,-::=,2&gt;<br \/>\n,,-::=,-*********,<br \/>\n|,-::=z<br \/>\nz*::=z<br \/>\nz,::=z<br \/>\nz|::=.<br \/>\n#::= move the left number to the right, reversing it (0 becomes 9, &#8230;)<br \/>\n#::= reversal is performed to help detect the need for carry. As the<br \/>\n#::= order of rule triggering is undetermined in Thue, a rule matching<br \/>\n#::= nine consecutive * would not work.<br \/>\n*c&lt;::=c<br \/>\n,c&lt;::=c<br \/>\n|c<br \/>\n0&gt;d*::=d<br \/>\n1&gt;d::=d*********,<br \/>\n#::= when the copy is done, &#8216;d&#8217; is at the right place to begin the sum.<br \/>\n2&gt;d::=f&lt;|<br \/>\n#::= add left to right. &#039;f&#039; moves left char by char when prompted by a &#039;&lt;&#039;.<br \/>\n#::= it tells &#039;g&#039; to its right what the next char is.<br \/>\n*f<br \/>\n,f<br \/>\n#::= when done for this sum, decrement nth.<br \/>\n|f&#8217; &#8216;g&#8217; drops an &#8216;i&#8217; that increments the current digit.<br \/>\n0&gt;g::=ig<br \/>\n*i::=&lt;<br \/>\n,i::=i,*********<br \/>\n|i::=&#8217; tells &#8216;g&#8217; to move left to the next decimal position,<br \/>\n#::= adding digit 0 if needed (i.e. nine &#8216;*&#8217; in reversed UCD)<br \/>\n*1&gt;g::=1&gt;g*<br \/>\n,1&gt;g::=g::=|&#8217; tells &#8216;g&#8217; that the sum is done. We then prepare for copy (moving<br \/>\n#::= a copy cursor &#8216;c&#8217; to the correct place at the beginning of the left<br \/>\n#::= number) and reverse (using &#8216;j&#8217;) the right number back.<br \/>\n*2&gt;g::=2&gt;g*<br \/>\n,2&gt;g::=2&gt;g,<br \/>\n|2&gt;g::=c|*********j<br \/>\n#::= &#8216;j&#8217; reverses the right number back, then leaves a copy receiver &#8216;d&#8217;<br \/>\n#::= and behind it a sum receiver &#8216;g&#8217;.<br \/>\n*j*::=j<br \/>\nj,*::=,********j<br \/>\nj,,::=,*********j,<br \/>\nj,|::=&#8217; sent by &#8216;-&#8216;<br \/>\n#::= will start when hitting &#8216;g&#8217; the first swap + sum cycle<br \/>\n#::= for numbers 0 (left) and 1 (reversed, right).<br \/>\n::=<br \/>\n|n?-|,|********g,|<br \/>\n&#8212;&#8212;<br \/>\nOk. Now, here&#8217;s where we go *really* pathological. Here, in all it&#8217;s wonderful glory, is a [Brainfuck][brainfuck] interpreter written in Thue, written by Mr. Frederic van der Plancke. (You can get the original version, with documentation and example programs, along with a Thue interpreter written in Python from [here][vdp-thue])<br \/>\n&#8212;&#8212;<br \/>\n#::=# BRAINFUNCT interpreter in THUE<br \/>\n#::=# by Frederic van der Plancke; released to the Public Domain.<br \/>\no0::=0o<br \/>\ni0::=0i<br \/>\no1::=1o<br \/>\ni1::=1i<br \/>\no]::=]o<br \/>\ni]::=]i<br \/>\no[::=[o<br \/>\ni[::=[i<br \/>\noz::=zo<br \/>\noZ::=Zo<br \/>\niz::=zi<br \/>\niZ::=Zi<br \/>\n0z::=z0<br \/>\n1z::=z1<br \/>\n]z::=z]<br \/>\n[z::=z[<br \/>\n_z::=z_<br \/>\n0Z::=Z0<br \/>\n1Z::=Z1<br \/>\n]Z::=Z]<br \/>\n[Z::=Z[<br \/>\n_Z::=Z_<br \/>\n}}]::=}]}<br \/>\n&amp;}]::=]&amp;}<br \/>\n&amp;]::=]^<br \/>\n}[::=[{<br \/>\n&amp;[::=[0::=0}<br \/>\n{1::=1{<br \/>\n?Z[::=[&amp;<br \/>\n?z[::=[^<br \/>\n?z]::=]<br \/>\n?Z]::=]^<br \/>\n[{{::={[{<br \/>\n[{::=[<br \/>\n[::=[^<br \/>\n]{::={]<br \/>\n]::={]<br \/>\n0::=<br \/>\n1::=1<br \/>\n0{::={0<br \/>\n1{::={1<br \/>\n^[::=?[iii<br \/>\n^]::=?]iii<br \/>\n}|::=}]|WARN]WARN<br \/>\n&amp;|::=&amp;]|WARN]WARN<br \/>\nWARN]WARN::=~ERROR: missing &#8216;]&#8217; in brainfunct program; inserted at end<br \/>\n^000::=000^ooo<br \/>\n^001::=001^ooi<br \/>\n^010::=010^oio<br \/>\n^011::=011^oii<br \/>\n^100::=100^ioo<br \/>\n^101::=101^ioi<br \/>\nooo|::=|ooo<br \/>\nooi|::=|ooi<br \/>\noio|::=iio|oio<br \/>\noii|::=|oii<br \/>\nioo|::=|ioo<br \/>\nioi|::=|ioi<br \/>\niii|::=|iii<br \/>\niio|!::=|<br \/>\n|z::=z|<br \/>\n|Z::=Z|<br \/>\no_::=_o<br \/>\ni_::=_i<br \/>\n0!::=!0<br \/>\n1!::=!1<br \/>\n_!::=!_<br \/>\n\/!::=!\/<br \/>\noooX!::=Xooo<br \/>\noooY::=$Y<br \/>\n0$::=!1<br \/>\n1$::=$0<br \/>\nX$::=X!1<br \/>\nooiX!::=Xooi<br \/>\nooiY::=@1Y<br \/>\n1@1::=@00<br \/>\n0@1::=@11<br \/>\n1@0::=@01<br \/>\n0@0::=@00<br \/>\nX@1::=X!WARNDEC0WARN<br \/>\nWARNDEC0WARN::=~WARNING: attempt at decrementing zero (result is still zero)<br \/>\nX@00::=X@0<br \/>\nX@01::=X!1<br \/>\nX@0Y::=X!0Y<br \/>\noioX!::=LLYLR<br \/>\n0LL::=LL0<br \/>\n1LL::=LL1<br \/>\n_LL::=!X!<br \/>\n|LL::=|!0X!<br \/>\nLR0::=0LR<br \/>\nLR1::=1LR<br \/>\nLRY::=_<br \/>\noiiX!::=_oii<br \/>\noiiY::=X!%<br \/>\n%0::=0%<br \/>\n%1::=1%<br \/>\n%_0::=Y0<br \/>\n%_1::=Y1<br \/>\n%_\/::=Y0_\/<br \/>\niooX!::=X(in)<br \/>\n(in)0::=(in)<br \/>\n(in)1::=(in)<br \/>\n(in)Y::=Yi<br \/>\ni\/0::=Zi\/<br \/>\ni\/1::=zi\/<br \/>\ni\/_::=!\/<br \/>\nXY!::=X!0Y<br \/>\n0Y!::=0!Y<br \/>\n1Y!::=1!Y<br \/>\nYZ::=0Y<br \/>\nYz::=1Y<br \/>\nioiX!::=X(out)<br \/>\n(out)0::=0(out)O0<br \/>\n(out)1::=1(out)O1<br \/>\n(out)Y::=OO!Y<br \/>\nO0::=~0<br \/>\nO1::=~1<br \/>\nOO::=~_<br \/>\niiiX!0::=ZX!0<br \/>\niiiX!1::=zX!1<br \/>\n+::=000<br \/>\n-::=001<br \/>\n::=011<br \/>\n,::=100<br \/>\n.::=101<br \/>\n:::=|X!0Y0_\/<br \/>\n::=<br \/>\n^*<br \/>\n[vdp-thue]: http:\/\/fvdp.homestead.com\/files\/eso_index.html#BFThue<br \/>\n[thue]: http:\/\/esoteric.voxelperfect.net\/wiki\/Thue<br \/>\n[semi-thue]: http:\/\/en.wikipedia.org\/wiki\/Semi-Thue_grammar<br \/>\n[brainfuck]: http:\/\/scienceblogs.com\/goodmath\/2006\/07\/gmbm_friday_pathological_progr.php<br \/>\n[vogel]: http:\/\/lvogel.free.fr\/thue.htm<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Today&#8217;s treat: [Thue][thue], pronounced two-ay, after a mathematician named Axel Thue. Thue is a language based on [semi-Thue][semi-thue] grammars. semi-Thue grammars are equivalent to Chosmky-0 grammars, except that they&#8217;re even more confusing: a semi-Thue grammar doesn&#8217;t distinguish between terminal and non-terminal symbols. (Since Chomsky-0 is equivalent to a turing machine, we know that Thue is [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[92],"tags":[],"class_list":["post-100","post","type-post","status-publish","format-standard","hentry","category-pathological-programming"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-1C","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/100","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=100"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/100\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=100"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=100"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=100"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}