{"id":138,"date":"2006-08-31T09:12:17","date_gmt":"2006-08-31T09:12:17","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2006\/08\/31\/why-oh-why-y\/"},"modified":"2006-08-31T09:12:17","modified_gmt":"2006-08-31T09:12:17","slug":"why-oh-why-y","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2006\/08\/31\/why-oh-why-y\/","title":{"rendered":"Why oh why Y?"},"content":{"rendered":"<p>So in the last few posts, I&#8217;ve been building up the bits and pieces that turn lambda calculus into a useful system. We&#8217;ve got numbers, booleans, and choice operators. The only thing we&#8217;re lacking is some kind of repetition or iteration.<br \/>\nIn lambda calculus, all iteration is done by recursion. In fact, recursion is a pretty natural way of expressing iteration. It takes a bit of getting used to, but anyone who&#8217;s programmed in a language like Scheme, ML, or Haskell for a while gets very used to idea, and feels frustrated coming back to a language like Java, where you need to write loops.<br \/>\nIt can be a bit difficult if you&#8217;re not used to thinking recursively. I wrote [an explanation of recursion][recursion] which you can go read if you&#8217;re not used to what recursion is or how it works.<br \/>\nbe found <a href=\"\">here<\/a>). But since functions in lambda calculus don&#8217;t have names, that means that we resort to something tricky. It&#8217;s called the Y combinator, aka the lambda fixed point operator.<br \/>\nLet&#8217;s start by looking at a simple recursive function outside of the lambda calculus. The factorial function, n!, is the standard example:<br \/>\n<code><br \/>\nfactorial(n) = 1 if n = 0,<br \/>\nor factorial(n) = n*factorial(n-1) if n &gt; 0<br \/>\n<\/code><br \/>\nIf we want to start trying to write that in lambda calculus, we&#8217;d need a couple of tools&#8230; We need a test for equality to zero, and we need a way of  multiplying numbers; and we need a way of subtracting one.<br \/>\nFor testing equality to zero, we&#8217;ll use a function named *IsZero*, which takes three parameters: a number, and two values. If the number is 0, it returns the first value; if it&#8217;s not 0, then it returns the second value.<br \/>\nFor multiplication &#8211; multiplication is an iterative algorithm, so we can&#8217;t write multiplication until we work out recursion. But we&#8217;ll just handwave that for now, and have a function *Mult x y*.<br \/>\nAnd finally, for subtracting one, we&#8217;ll use *Pred x* for the predecessor of x &#8211; that is, x &#8211; 1.<br \/>\nSo &#8211; a first stab at factorial, written with the recursive call left with a blank in it, would be:<\/p>\n<p>*&lambda; n . IsZero n 1 (Mult n (**something** (Pred n)))*<\/p>\n<p>Now, the question is, what kind of &#8220;something&#8221; can we plug in to there? What we&#8217;re really like to do is plug in a copy of the function itself:<\/p>\n<p>*Fact &equiv; &lambda; n . IsZero n 1 (Mult n (Fact (Pred n)))*<\/p>\n<p>How can we do that? Well, the usual way of plugging something in to a lambda calculus function is through a parameter:<\/p>\n<p>*Fact &equiv; (&lambda; f n . IsZero n 1 (Mult n (f (Pred n)))) Fact*<\/p>\n<p>Of course, we can&#8217;t plug in a copy of the function as its own parameter that way: the name *Fact* doesn&#8217;t exist in the expression in which we&#8217;re trying to use it. You can&#8217;t use an undefined name &#8211; and in lambda calculus, the *only* way to bind a name is by passing it as a parameter to a &lambda; expression. So what can we do?<br \/>\nThe answer is to use something called a *combinator*. A combinator is a special kind of *higher order function* which can be defined without reference to anything but function applications. (A higher order function is a function which takes functions as parameters and returns functions as results). The Y combinator is the special, almost magical function that makes recursion possible. Here&#8217;s what it looks like:<\/p>\n<p>Y &equiv; &lambda; y . (&lambda; x . y (x x)) (&lambda; x . y (x x))<\/p>\n<p>If you look at it, the reason for calling it Y is because it is *shaped* like a Y. To show you that more clearly, sometimes we write lambda calculus using trees. Here&#8217;s the tree for the Y combinator:<br \/>\n<img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/homepage.mac.com\/markcc\/lambda\/Y.jpg?w=625\" \/><br \/>\nWhy is the Y combinator an answer to our problem in defining the factorial function? The Y combinator is something called a *fixed point* combinator. What makes it special is the fact that for any function *f*, *Y f* evaluates to *f Y f*; which evaluates to *f (f Y f)*; which evaluates to *f (f (f Y f))*. See why it&#8217;s called Y?<br \/>\nLet&#8217;s try walking through &#8220;*Y f*&#8221;:<br \/>\n1. Expand Y: &#8220;*(&lambda; y . (&lambda; x . y (x x)) (&lambda; x . y (x x))) f*&#8221;<br \/>\n2. &beta;: &#8220;*(&lambda; x . f (x x)) (&lambda; x . f (x x))*<br \/>\n3. &beta; again: &#8220;*f  (&lambda; x . f (x x)) (&lambda; x . f (x x))*&#8221;<br \/>\n4. Since &#8220;*Y f = (&lambda; x . f (x x)) (&lambda; x . f (x x))*&#8221;, what we just got in step three is &#8220;f Y f&#8221;.<br \/>\nSee, there&#8217;s the magic of &#8220;Y&#8221;. No matter what you do, you can&#8217;t make it consume itself. Evaluating &#8220;*Y f*&#8221; will produce another copy of *f*, and leave the &#8220;Y f&#8221; part untouched.<br \/>\nSo how do we use this crazy thing?<br \/>\nRemember our last attempt at defining the factorial function? Let&#8217;s look at it again:<\/p>\n<p>*Fact &equiv; (&lambda; f n . IsZero n 1 (Mult n (f (Pred n)))) Fact*<\/p>\n<p>Let&#8217;s rewrite it just a tiny bit to make it easier to talk about:<\/p>\n<p>*Metafact &equiv; (&lambda; f n . IsZero n 1 (Mult n (f (Pred n))))*<br \/>\n*Fact &equiv; Metafact Fact*<\/p>\n<p>Now &#8211; the trick is, &#8220;Fact&#8221; is not an identifier defined inside of &#8220;Fact&#8221;. How do we let &#8220;Fact&#8221; reference &#8220;Fact&#8221;? Well, we did a lambda abstraction to let us pass the &#8220;Fact&#8221; function as a parameter; so what we needed to do is to find a way to write &#8220;Fact&#8221; that lets us pass it to itself as a parameter.<br \/>\nWhat does &#8220;Y f&#8221; do? It expands into a call to &#8220;f&#8221; with &#8220;Y f&#8221; as its first parameter. And &#8220;Y f&#8221; will expand into &#8220;f Y f&#8221; &#8211; that is, a call to &#8220;f&#8221; with &#8220;Y f&#8221; as its parameter. In other words, Y f turns &#8220;f&#8221; into a recursive function with *itself* as its first parameter.<br \/>\nSo the factorial function is:<\/p>\n<p>*Fact &equiv; Y Metafact*<\/p>\n<p>*(Y metafact)* is the parameter value of &#8220;f&#8221; in the metafact lambda; when we do &beta; on the function, if n is zero, then it just returns 1. If it&#8217;s not zero, then we get the call to *f (Pred n)*.  *f* betas to *Y metafact*. Which does that funky magic copying thing, giving us *metafact (Y metafact) (Pred n)*.<br \/>\nVoila, recursion. s<br \/>\nI learned about the Y combinator back in my undergrad days, which would place it around 1989 &#8211; and I still find it rather mystifying. I do understand it now, but I can&#8217;t imagine how on earth anyone ever figured it out!<br \/>\nIf you&#8217;re interested in this, then I <em>highly<\/em> recommend getting a copy of the book [The Little Schemer][schemer]. It&#8217;s a wonderful little book &#8211; set up like a childrens&#8217; book, where the front of each page is a question; and the back of each page is the answer. It&#8217;s written in a delightfully playful style, it&#8217;s very fun and engaging, and it will not only teach you to program in Scheme.<br \/>\nAs an important side-note there are actually a couple of different versions of the Y combinator. There are different ways of evaluating lambda calculus: given an expression like *(&lambda; x y . x * y) 3 ((&lambda; z. z * z) 4)*&#8221;<br \/>\nwe can do it in two different orders: we can first do the beta on &#8220;*(&lambda; x y . x * y)*&#8221;,which would give us: &#8220;*3 * ((&lambda; z . z * z) 4)*&#8221;.<br \/>\nOr, we could beta &#8220;*((&lambda; z . z * z) 4)*&#8221; first: &#8220;*(&lambda; x y . x * y) 3 (4 * 4)*&#8221;. Nn this case, the two orders end up with the same result; but that&#8217;s not always the case. Sometimes the order of evaluation matters &#8211; and the way that the Y combinator works is one of those times. One order will result in expanding the Y infinitely; the other will result in a clean recursive function.<br \/>\nThe first order is what we call *lazy evaluation*:  don&#8217;t evaluate the parameters to a function until they&#8217;re needed. (This is also pretty much the same thing as what we sometime call *by name* parameter passing.) The second is called *eager evaluation* : always evaluate parameters *before* the functions that they&#8217;re passed to. (In real programming languages, Lisp, Scheme, and ML are lambda-calculus based languages that use eager evaluation; Haskell and Miranda are lambda calculus based languages that use lazy evaluation.) The Y combinator I described above is the Y for *lazy* evaluation. If we used eager evaluation, then Y combinator above wouldn&#8217;t work &#8211; in fact, it would copy Ys forever. There is another version of Y which works for eager evaluation &#8211; in fact, it&#8217;s the one described in &#8220;The Little Schemer&#8221;, and they explain it so much better than I do that I&#8217;ll just recommend again that you head for whatever bookstore you prefer, and buy yourself a copy.<br \/>\n[recursion]: http:\/\/goodmath.blogspot.com\/2006\/03\/clarifying-recursion.html<br \/>\n[schemer]: http:\/\/www.amazon.com\/gp\/redirect.html?link_code=ur2&amp;tag=goodmathbadma-20&amp;camp=1789&amp;creative=9325&amp;location=\/gp\/search%3F%26index=books%26keywords=little%20schemer%26_encoding=UTF8<\/p>\n","protected":false},"excerpt":{"rendered":"<p>So in the last few posts, I&#8217;ve been building up the bits and pieces that turn lambda calculus into a useful system. We&#8217;ve got numbers, booleans, and choice operators. The only thing we&#8217;re lacking is some kind of repetition or iteration. In lambda calculus, all iteration is done by recursion. In fact, recursion is a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[32],"tags":[],"class_list":["post-138","post","type-post","status-publish","format-standard","hentry","category-lambda-calculus"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-2e","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/138","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=138"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/138\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=138"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=138"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=138"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}