{"id":221,"date":"2006-11-24T12:03:53","date_gmt":"2006-11-24T12:03:53","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2006\/11\/24\/pathological-programming-by-string-rewriting\/"},"modified":"2006-11-24T12:03:53","modified_gmt":"2006-11-24T12:03:53","slug":"pathological-programming-by-string-rewriting","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2006\/11\/24\/pathological-programming-by-string-rewriting\/","title":{"rendered":"Pathological Programming by String Rewriting"},"content":{"rendered":"<p>Todays tidbit of torture is a simple little language called [Leszek][leszek], with an implementation available [here][leszek-impl]. Leszek is based on the idea of *iterative string rewriting*, which is actually a useful and valuable concept. Of course, Leszek takes it to an extreme of insanity which takes a perfectly good idea, and turns it into a horror. But thats what makes it fun!<br \/>\n[leszek]: http:\/\/www.esolangs.org\/wiki\/Leszek<br \/>\n[leszek-impl]: http:\/\/students.mimuw.edu.pl\/~js248325\/zsyp\/leszek.tar.gz<br \/>\nIn Lezsek, there are no variables; no loops; no state. A program is just a string with<br \/>\na collection of embedded rewriting operators. The way that things work is, the interpreter looks<br \/>\nat the string. It finds all of the rewriting commands in the string, and executes them, concatenating the *results* of those commands. When it&#8217;s done all of<br \/>\nthe rewrites in the entire program, it takes the *resulting* program string, and executes *that*. It keeps going until there is no possibility of any more rewrites generating output, and then it halts.<\/p>\n<p><!--more--><br \/>\nSo&#8230; The guts of the language is, obviously, the set of rewrite operators. So let&#8217;s dive in, and take a look:<br \/>\n### Cut<br \/>\n&#8220;`C`&#8221;, the &#8220;cut&#8221; command. This takes two parameters: the first is an escaped character *c*, and<br \/>\nthe second is an integer *i*. The integer parameter is written as a sequence of digits followed by a period. What is does is search for the *i*th occurence of the character<br \/>\n*c*, and return the string from the *i*th occurence through the *i+1*th occurence.<br \/>\nFor example, if  you had a program: &#8220;`fooCb2.flobbiebletch`&#8221;, the result of the &#8220;`Cb2.`&#8221; operation is the string between the second and third non-escaped &#8220;b&#8221; character, that is the string &#8220;ie&#8221;.<br \/>\n### Length<br \/>\n&#8220;`L`&#8221; takes the same parameters as &#8220;`C`&#8221;, but instead of returning the string, it returns the *length* of the string, in period-terminated integer form.  So for the same example as the &#8220;`C`&#8221;,  &#8220;`fooLb2.foobiebletch`&#8221; the result would be &#8220;2.&#8221;.<br \/>\n### Concatenation<br \/>\nThe simple concatenation operator &#8220;`T`&#8221; takes two rewrite expressions, and returns the concatenation of them. So for example, in &#8220;`foobarfiebletchTCb1.Lb1.`&#8221;, the &#8220;`T`&#8221; command evaluates &#8220;`Cb1.`&#8221;, getting &#8220;arfie&#8221;, and then evaluates &#8220;`Lb1.`&#8221;, getting &#8220;5.&#8221;. So the overall result is &#8220;arfie5.&#8221;<br \/>\nThere&#8217;s also a group concatenation operator, &#8220;`G`&#8221;, which takes an integer parameter *n*, followed by *n* other expressions; it evaluates all of the expressions, and then concatenations them. So &#8220;`T`&#8221; is equivalent to &#8220;`G2.`&#8221;.<br \/>\n### Selection<br \/>\n&#8220;`A`&#8221; and &#8220;`D`&#8221; each take two expressions as parameters and evaluates them. &#8220;`A`&#8221; returns the result of the first one. &#8220;`D`&#8221; returns the result of the second one.<br \/>\n&#8220;`E`&#8221; is a conditional selection operator. It takes three expressions as parameters and evaluates them. If the first expression returns a nonempty string, it returns the second one; otherwise, it returns the third.<br \/>\nFor use in &#8220;`E`&#8221;, there&#8217;s a null command, &#8220;`N`&#8221; which takes no parameters, and always evaluates to an empty string.<br \/>\n### Comparisons and Logic<br \/>\n&#8220;`=`&#8221; takes two expressions. If the two evaluate to the same string, then it returns &#8220;1&#8221;; otherwise, it returns an empty string.<br \/>\n&#8220;`&amp;`&#8221;, &#8220;`|`&#8221;, and &#8220;`!`&#8221; are logical operators: and, or, and not.<br \/>\n### Input and Output<br \/>\n&#8220;`I`&#8221; reads a character from input, and returns it. &#8220;`M`&#8221; reads an integer from input, and returns it in the period-terminated form.<br \/>\n&#8220;`O`&#8221; takes an expression, evaluates it, and prints the result to the standard output.<br \/>\n### Arithmetic<br \/>\nArithmetic operators take two integers as parameters, and return the result of the appropriate arithmetic operation: &#8220;`+`&#8221; (add), &#8220;`-`&#8221; (subtract), , &#8220;`*`&#8221; (multiply), &#8220;`V`&#8221; (integer divide) and &#8220;`%`&#8221; (modulo).<br \/>\nLeszek Programs<br \/>\n&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<br \/>\nAs usual, we start with hello world:<br \/>\nOC.1.Hello World!.<br \/>\n&#8220;`O`&#8221; outputs the result of its parameter, &#8220;`C.1.`&#8221;. The &#8220;`C`&#8221; is a bit tricky: it copies<br \/>\nfrom the first non-escaped &#8220;.&#8221; to the next &#8220;.&#8221;; the first non-escaped period is the one that terminates the &#8220;`C`s&#8221; integer parameter. So the copy command returns the string &#8220;Hello World!&#8221;,<br \/>\nthe output prints it, and then the program terminates.<br \/>\nAnd an alternative version which is more interesting:<br \/>\nE=%C|1.NTC|.C%3.E=%C|1.NOD|Hello World!%|<br \/>\nLet&#8217;s tease that apart, to make it more comprehensible. What it&#8217;s going to try to do is print<br \/>\nout the characters between the &#8220;|&#8221;s, one at a time, and it&#8217;s going to use the &#8220;`%`&#8221; after the `&#8221;Hello World!&#8221;` string to recognize when it&#8217;s done.<br \/>\n* It starts is with the &#8220;E&#8221; conditional. We can rewrite that into a slightly more conventional way:<br \/>\nif (=%C|1. != &#8220;&#8221;) then N<br \/>\nelse TC|.C%3.<br \/>\n* If the contents of the string between the &#8220;`|`&#8221;s is just the &#8220;%&#8221; character, it&#8217;s done. So the conditional will just return an empty string if it&#8217;s done.<br \/>\n* If there are more characters between the &#8220;`|`&#8221;s than just &#8220;%&#8221;, then it does the &#8220;`TC|.C%3.`&#8221;. This is a simple concatenation of &#8220;`C|.`&#8221; and &#8220;`C%3.`&#8221;.<br \/>\n* &#8220;`C|.`&#8221; copies from the 0th to the first &#8220;|&#8221;. The 0th mention of any character is always treated as the character *before* the first character of the string. So it copies the full program<br \/>\nup to the character before the first &#8220;|&#8221;: &#8220;`E=%C`&#8221;<br \/>\n* &#8220;`C%3.`&#8221; copies from the third to the fourth &#8220;%&#8221; character (&#8220;`C|1.NOD|Hello World!`&#8221;). That&#8217;s quite a clever little bit, which basically finishes quineing the the beginning of the program.<br \/>\n* So the result of the first conditional is either nothing, or it reproduces the program for the next iteration.<br \/>\n* Next is another conditional, which is very much like the original. If there&#8217;s nothing<br \/>\nbetween the &#8220;`|`&#8221;s then it returns nothing; otherwise, it goes on to the rest.<br \/>\n* &#8220;`OD|Hello World!%|`&#8221;. This a clever bit. It&#8217;s *really* &#8220;`OD|H`&#8221;, followed by &#8220;`ello World!%`&#8221;. &#8220;`OD|H`&#8221; prints the *second* of the two expressions that follow it. The first is the &#8220;`|`&#8221; that&#8217;s quoting the string; the second is the first character of the string. So it prints &#8220;H&#8221;. The rest of the program has no rewrite commands, so it just returns *itself* as a string.<br \/>\n* So in this first iteration, it&#8217;s output the first character of the string, and *very nearly* quines itself. In fact, the almost-quine is missing one character: the *first* character of the quoted string.<br \/>\nFrom there, you should be able to see what&#8217;s going on. It&#8217;s going to repeat the process, only the second time, the first character of the quoted string is &#8220;e&#8221;. And so on.<br \/>\nI&#8217;ll close with another clever Leszek program, which you can work  your way through by yourself. It&#8217;s a fibonacci generator, which inputs a number *N*, and outputs the *N*th fibonacci number.<br \/>\nE=C|6.C|7.OC|4.C|.C%1.C|1.C%1.D<br \/>\n+C|2.C|4.|C|3.|1.|DG+2.L|4.|0.|C|5.C%1.+T1A.|.|OG14.Input number:<br \/>\nM|%|<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Todays tidbit of torture is a simple little language called [Leszek][leszek], with an implementation available [here][leszek-impl]. Leszek is based on the idea of *iterative string rewriting*, which is actually a useful and valuable concept. Of course, Leszek takes it to an extreme of insanity which takes a perfectly good idea, and turns it into a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[92],"tags":[],"class_list":["post-221","post","type-post","status-publish","format-standard","hentry","category-pathological-programming"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-3z","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/221","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=221"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/221\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=221"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=221"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=221"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}