{"id":170,"date":"2006-09-29T12:51:22","date_gmt":"2006-09-29T12:51:22","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2006\/09\/29\/programming-without-control-smith\/"},"modified":"2006-09-29T12:51:22","modified_gmt":"2006-09-29T12:51:22","slug":"programming-without-control-smith","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2006\/09\/29\/programming-without-control-smith\/","title":{"rendered":"Programming without Control: Smith"},"content":{"rendered":"<p>Joy of joys, [Cat&#8217;s Eye Technologies](http:\/\/catseye.mine.nu:8080\/projects\/), the home of Chris Pressey, one of the most prolific designers of thoroughly bizarre languages is back up! And so, today, we&#8217;re going to take a look at one of his masterpieces, the language *Smith*. This is one of my personal all-time favorite pathological languages; it&#8217;s positively brilliantly twisted.<br \/>\nSmith is, mostly, a pretty typical memory-oriented machine language. So what makes it pathological? It&#8217;s got *no* jumps, *no* branches, *no* control flow whatsoever. The program counter starts at zero, and every step, it increments by one *and only one*.<br \/>\nSo, you might quite sensibly ask, how on earth could this beast do anything useful, much less be Turing complete? The answer is: self-modifying code; in particular, the program can *copy* its own instructions.<\/p>\n<p><!--more--><br \/>\nWant to do a loop? No problem: it&#8217;s just a set of instructions that repeats, right? So the last instruction *copies* the loop *including* the copy instruction. But how do you terminate the loop? There&#8217;s no compare instruction!<br \/>\nYou *compute* the number of instructions to copy. So you cleverly arrange to have a register holding the length of the loop *without* the copy instruction. And then you do a computation that winds up either adding 0 or 1 to that length.<br \/>\nLet&#8217;s take a quick look at the instructions, so that we can show examples to make that a bit clearer. The SMITH machine is a pretty typical memory-based machine with the conditional branch and comparison instructions removed. Most operations work on memory locations, which SMITH calls registers. (They behave like memory, not like normal machine registers.) The basic values in the Smith language are literals (written as signed numbers); locations (written R*N*, where *N* is the numeric address); and the values stored in locations (written R[R*N*]);<br \/>\nThe instructions are:<br \/>\n1. <code>MOV target, value<\/code>. Copy the value to the target. The target is either a register or TTY. If it is a register, it can be either a direct or indirect (RN or R[RN]); if it is TTY, then the value is copied to standard out. The value can be a literal, a register, an indirect register (N, RN, R[RN]), PC (for the value of the program counter), TTY (which reads a value from standard in), or &#8220;*&#8221; which is the current line number in the source file.<br \/>\n2. <code>MOV target, stringlit<\/code>. A special form of MOV; stores the characters of stringlit in sequence at (target, target+1, target+2, etc.)<br \/>\n3. <code>SUB target, value<\/code>. Subtract the value from the target, storing the result in the target. Target is a register, either direct or indirect; value is either a register or a literal.<br \/>\n4. <code>MUL target, value<\/code>: multiply the target by the value, and store the result in the target. The parameters are the same as in &#8220;SUB&#8221;.<br \/>\n5. <code>NOT target<\/code>. Target is a register. This is the closest thing to a conditional in SMITH. If the value of target is 0, then this makes it 1; if the target is non-zero, then this makes it 0. So this basically converts a value to a boolean and reverses it.<br \/>\n6. <code>COR target, start, length<\/code>. Copy a sequence of &#8220;length&#8221; instructions, starting with the instruction located &#8220;start&#8221; away from the PC to an address starting at an offset of &#8220;target&#8221; from the current PC. Start, target, and length can all be either literals or locations.<br \/>\n7. <code>NOP<\/code>. A no-op; does nothing. This gives us one of our two ways of doing something sort-of conditional; we can copy a <code>NOP<\/code> over another copy instruction.<br \/>\n9. <code>STOP<\/code>. End the execution of the program.<br \/>\n10. <code>BLA target, NOP, length<\/code>.  Similar to the &#8220;COR&#8221; instruction, except that this just copies a sequence of &#8220;length&#8221; <code>NOP<\/code> instructions to the target offset.<br \/>\nThere&#8217;s also one pseudo-instruction, &#8220;<code>REP<\/code>&#8220;. <code>REP int, instruction<\/code> inserts &#8220;int&#8221; copies of &#8220;instruction&#8221; at the current line of the source file. So &#8220;<code>REP 3, NOP<\/code>&#8221; is equivalent to three lines, each of which contains a single &#8220;<code>NOP<\/code>&#8220;.<br \/>\nNow that we&#8217;ve seen the instructions, let&#8217;s look at a classic &#8220;Hello world&#8221; program:<br \/>\n; Hello, world in SMITH &#8211; version 2 (loop)<br \/>\n; R0 -&gt; index into string (starts at R10)<br \/>\n; R2 -&gt; -1<br \/>\nMOV R0, 10<br \/>\nMOV R2, 0<br \/>\nSUB R2, 1<br \/>\nMOV R[R0], &#8220;Hello, world!&#8221;<br \/>\nMOV TTY, R[R0]<br \/>\nSUB R0, R2<br \/>\nMOV R1, R0<br \/>\nSUB R1, 23<br \/>\nNOT R1<br \/>\nNOT R1<br \/>\nMUL R1, 8<br \/>\nCOR +1, -7, R1<br \/>\n* <code>MOV,MOV,SUB<\/code>L This starts by putting the value &#8220;10&#8221; into R0, and -1 into R2. 10 is the address where the &#8220;hello world&#8221; string is going to start; and the &#8220;-1&#8221; is going to be used for incrementing a pointer to the current index in the string. Since there is no &#8220;ADD&#8221; instruction, we have to stick with subtracting -1.<br \/>\n* <code>MOV R[R0],string<\/code>. Store &#8220;Hello world&#8221; starting at location 10.<br \/>\n* <code>MOV TTY, R[R0]<\/code>. Output the value stored at the address contained by location 0, which is initially 10 &#8211; that&#8217;s the first character of the string.<br \/>\n* <code> SUB R0, R2<\/code>. Add one to R0, so that it points at the next character of the string.<br \/>\n* <code>MOV R1, R0<\/code>. Put the value of the pointer into the string into R1.<br \/>\n* <code>SUB R1, 23<\/code>. This is where it gets clever. 22 is the location of the last character of the string. So if R1 is now 23, then we&#8217;ve already output the last character. So by doing this subtraction, we end up with &#8220;0&#8221; in R1 if we&#8217;ve output the last character, or some negative value if we haven&#8217;t.<br \/>\n* <code>NOT R1\/NOT R1<\/code>. Converts whatever is in R1 into 0 (if we&#8217;ve output the last character already) or 1 (if we haven&#8217;t).<br \/>\n* <code>MUL R1, 8<\/code>. There are 8 instructions to copy if we want to do another iteration of the loop. So multiple R1 by 8. If there&#8217;s more iterations to be performed, R1 is now 8; if not, then R1 is now 0.<br \/>\n* <code>COR +1, -7, R1<\/code>. Copy R1 instructions to the *next* instruction address, starting from 7 instructions before here. So if we&#8217;ve already output all of the characters, this instruction copies 0 instructions, and we&#8217;re at the  end of the program. Otherwise, it copies 8 instructions &#8211; which is the loop.<br \/>\nNow, is that brilliantly evil, or what?<br \/>\nHere&#8217;s another example &#8211; really quite a remarkably clever one. It does brace matching on  an input file.<br \/>\n; Check input file for matching brackets in SMITH!<br \/>\n; Prints nothing if there was an error or &#8216;OK&#8217; if brackets match.<br \/>\n; R0 -&gt; stack pointer (starts at R32)<br \/>\n; R1 -&gt; work register<br \/>\n; R2 -&gt; -1<br \/>\n; R3 -&gt; input char<br \/>\n; R4 -&gt; scratch temp copy of PC<br \/>\n; R5 -&gt; scratch temp copy of *<br \/>\n; R6 -&gt; scratch for PC\/* arithmetic<br \/>\n; R7 -&gt; 2<br \/>\n; R8 -&gt; 9<br \/>\n; R9 -&gt; scratch char<br \/>\n; R10 -&gt; 15<br \/>\n; R11 -&gt; 1<br \/>\n; set up stack<br \/>\nMOV R0, 32<br \/>\nMOV R2, 0<br \/>\nSUB R2, 1<br \/>\nMOV R8, 9<br \/>\nMOV R7, 2<br \/>\nMOV R10, 15<br \/>\nMOV R11, 1<br \/>\n; push &#8216;S&#8217;<br \/>\nMOV R[R0], &#8220;S&#8221;<br \/>\nSUB R0, R2<br \/>\n; LABEL MainLoop<br \/>\n; read char<br \/>\nMOV R3, TTY<br \/>\n; if char == &#8216;{&#8216; push<br \/>\nMOV R1, R3<br \/>\nMOV R[R8], &#8220;{&#8221;<br \/>\nSUB R1, R9<br \/>\nNOT R1<br \/>\nMUL R1, 2      ; LENGTH PushLeftBrace<br \/>\nMOV R4, PC     ; R4 = PC<br \/>\nMOV R5, *<br \/>\nMOV R6, 0<br \/>\nSUB R6, 10048; bytes between here and PushLeftBrace<br \/>\nSUB R5, R6     ; R5 = PushLeftBrace<br \/>\nSUB R5, R4     ; R5 = PushLeftBrace &#8211; PC<br \/>\nBLA +2, NOP, R7<br \/>\nCOR +1, R5, R1<br \/>\nNOP<br \/>\nNOP<br \/>\n; if char == &#8216;}&#8217; pop<br \/>\nMOV R1, R3<br \/>\nMOV R[R8], &#8220;}&#8221;<br \/>\nSUB R1, R9<br \/>\nNOT R1<br \/>\nMUL R1, 15     ; LENGTH PopLeftBrace<br \/>\nMOV R4, PC     ; R4 = PC<br \/>\nMOV R5, *<br \/>\nMOV R6, 0<br \/>\nSUB R6, 10035; bytes between here and PopLeftBrace<br \/>\nSUB R5, R6     ; R5 = PopLeftBrace<br \/>\nSUB R5, R4     ; R5 = PopLeftBrace &#8211; PC<br \/>\nBLA +2, NOP, R10<br \/>\nCOR +1, R5, R1<br \/>\nREP 15 NOP<br \/>\n; if char != 0 goto MainLoop<br \/>\nMOV R1, R3<br \/>\nNOT R1<br \/>\nNOT R1<br \/>\nMUL R1, 49     ; LENGTH MainLoop + 1<br \/>\n; LENGTH MainLoop<br \/>\nCOR +1, -48, R1<br \/>\nREP 10000 NOP                ; This space intentionally left blank<br \/>\n; x = pop<br \/>\nSUB R0, 1<br \/>\nMOV R1, R[R0]<br \/>\n; if x != &#8216;S&#8217; stop<br \/>\nMOV R[R8], &#8220;S&#8221;<br \/>\nSUB R1, R9<br \/>\nNOT R1<br \/>\nNOT R1<br \/>\nCOR +1, +6, R1<br \/>\nNOP<br \/>\n; print &#8216;OK&#8217;<br \/>\nMOV R[R8], &#8220;O&#8221;<br \/>\nMOV TTY, R9<br \/>\nMOV R[R8], &#8220;K&#8221;<br \/>\nMOV TTY, R9<br \/>\nSTOP<br \/>\n; LABEL PushLeftBrace<br \/>\n; push &#8216;{&#8216;<br \/>\nMOV R[R0], &#8220;{&#8221;<br \/>\nSUB R0, R2<br \/>\n; LENGTH PushLeftBrace == 2<br \/>\n; LABEL PopLeftBrace<br \/>\n; x = pop;<br \/>\nSUB R0, 1<br \/>\nMOV R1, R[R0]<br \/>\n; if x != &#8220;{&#8221; stop<br \/>\nMOV R[R8], &#8220;{&#8221;<br \/>\nSUB R1, R9<br \/>\nNOT R1<br \/>\nNOT R1<br \/>\nMOV R4, PC     ; R4 = PC<br \/>\nMOV R5, *<br \/>\nMOV R6, 0<br \/>\nSUB R6, 1      ; bytes between here and Halt<br \/>\nSUB R5, R6     ; R5 = Halt<br \/>\nSUB R5, R4     ; R5 = Halt &#8211; PC<br \/>\nBLA +2, NOP, R11<br \/>\nCOR +1, R5, R1<br \/>\nNOP<br \/>\n; LENGTH PopLeftBrace == 15<br \/>\n; LABEL Halt<br \/>\nSTOP<br \/>\n; LENGTH Halt == 1<br \/>\nBecause I&#8217;m a thoroughly insane person, I&#8217;m actually working on a brainfuck interpreter in SMITH. If I ever manage to get it working, I&#8217;ll be sure to post it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Joy of joys, [Cat&#8217;s Eye Technologies](http:\/\/catseye.mine.nu:8080\/projects\/), the home of Chris Pressey, one of the most prolific designers of thoroughly bizarre languages is back up! And so, today, we&#8217;re going to take a look at one of his masterpieces, the language *Smith*. This is one of my personal all-time favorite pathological languages; it&#8217;s positively brilliantly twisted. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[92],"tags":[],"class_list":["post-170","post","type-post","status-publish","format-standard","hentry","category-pathological-programming"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-2K","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/170","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=170"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/170\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=170"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=170"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=170"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}