{"id":718,"date":"2008-12-19T11:49:44","date_gmt":"2008-12-19T11:49:44","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2008\/12\/19\/fitness-landscapes-evolution-and-smuggling-information\/"},"modified":"2008-12-19T11:49:44","modified_gmt":"2008-12-19T11:49:44","slug":"fitness-landscapes-evolution-and-smuggling-information","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2008\/12\/19\/fitness-landscapes-evolution-and-smuggling-information\/","title":{"rendered":"Fitness Landscapes, Evolution, and Smuggling Information"},"content":{"rendered":"<p> Since my post a couple of weeks ago about NASA and the antenna evolution experiment,<br \/>\nI&#8217;ve been meaning to write a followup. In both comments and private emails, I&#8217;ve gotten<br \/>\na number of interesting questions about the idea of fitness landscapes, and some of the things<br \/>\nI mentioned in brief throwaway comments. I&#8217;m finally finding the time to write about<br \/>\nit now.<\/p>\n<p><!--more--><\/p>\n<h3>Fitness Landscapes<\/h3>\n<p> First, let me review what a fitness landscape is.<\/p>\n<p> Suppose you&#8217;re searching for something. There are N parameters that define the thing you&#8217;re searching for. Then we can say that you&#8217;re doing a search for a target <em>within N dimensions<\/em>. The N-dimensional space containing all of the possible values for all of the dimensions is called the <em>search space<\/em>.<\/p>\n<p> You can model that search as an iterative function, F, which takes<br \/>\nas parameters a tuple of N values defining a current position in the search space,<br \/>\nand returns a new position in the search space.<\/p>\n<p> So you start at some arbitrary position in your search space, and that&#8217;s<br \/>\nthe first argument to F. Then each time you call F, it returns a new position in<br \/>\nthe search space, which is hopefully closer to the target of the search.<\/p>\n<p> How do you evaluate the quality of a position in the search space?<\/p>\n<p> You define an evaluation function, E. E takes a position in the search space,<br \/>\nand returns a real number which describes the quality of the position in terms<br \/>\nof the particular search. The higher the value returned by E for a point, the<br \/>\nbetter the quality of the target described by the position in the search space. Expressed using the evaluation, the goal is to find the position p within the space that<br \/>\nmaximizes E(p).<\/p>\n<p> Now, suppose you want a way of visualizing the search. It&#8217;s easy: define<br \/>\na new space with N+1 dimensions, where the N+1th has the value of<br \/>\ncalling E on the first N dimensions. Call this new, N+1th dimension &#8220;height&#8221;. Now,<br \/>\nthe surface produced by doing this is the evaluation landscape. If your search is<br \/>\nan evolutionary process, the evaluation landscape is called a <em>fitness landscape<\/em><\/p>\n<p> As usual, an example makes this easier. Suppose that you&#8217;re searching<br \/>\nin two dimensions. Then your position at any time is just a position in<br \/>\na two-dimensional cartesian graph. To visualize the search space, you create<br \/>\na three dimensional graph, where the quality of the search result at (x,y) is<br \/>\nthe value of the Z coordinate. <\/p>\n<p> For example, imagine that the fitness function is something nice and simple, with<br \/>\na single obvious maximum: E(x,y) = sin(x) + cos(y). The maximum will naturally be 2, and<br \/>\none of the places where it will occur is at (&pi;\/2,0). Then your fitness landscape ends up looking like the graph below.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" alt=\"periodic-landscape.gif\" src=\"https:\/\/i0.wp.com\/scientopia.org\/img-archive\/goodmath\/img_346.gif?resize=366%2C172\" width=\"366\" height=\"172\" \/><\/p>\n<p> We can also get some much wilder landscapes, like the one below, which is defined by E(x,y)=sin(xy)cos(xy) + (3-xy)\/((xy)<sup>2<\/sup> + 1). It&#8217;s still structured,<br \/>\nbut it&#8217;s a lot more complicated.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" alt=\"complex-landscape.gif\" src=\"https:\/\/i0.wp.com\/scientopia.org\/img-archive\/goodmath\/img_347.gif?resize=411%2C322\" width=\"411\" height=\"322\" \/><\/p>\n<p> And we can get totally unstructured landscapes, where each point is complely random, like this one:<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" alt=\"random-landscape.gif\" src=\"https:\/\/i0.wp.com\/scientopia.org\/img-archive\/goodmath\/img_348.gif?resize=357%2C262\" width=\"357\" height=\"262\" \/><\/p>\n<h3> Static Versus Dynamic Landscapes and Adding Dimensions<\/h3>\n<p> I&#8217;ve frequently criticized creationists for talking about a <em>static<\/em> fitness landscape. In a throwaway comment, I mentioned that you <em>can<\/em> use a static landscape by adding dimensions. This comment really confused a lot of people.<\/p>\n<p> Suppose you&#8217;ve got a search of a two-dimensional space. Now, we&#8217;re going to make it more<br \/>\ncomplex. Instead of the search function always being the same, we&#8217;re going to actually have each<br \/>\ninvocation of the search function return both a point in the search space, <em>and<\/em> a new<br \/>\nfitness function. So we start with an initial search function, f<sub>0<\/sub> and some initial<br \/>\npoint (x<sub>0<\/sub>, y<sub>0<\/sub>) in the search space. The first search step evaluates<br \/>\nf<sub>0<\/sub>(x<sub>0<\/sub>, y<sub>0<\/sub>), getting (x<sub>1<\/sub>,y<sub>1<\/sub>) and<br \/>\nf<sub>1<\/sub>. The second search step evaluates f<sub>1<\/sub>(x<sub>1<\/sub>, y<sub>1<\/sub>),<br \/>\ngetting (x<sub>2<\/sub>,y<sub>2<\/sub>) and f<sub>2<\/sub>. So the search function is<br \/>\n<em>changing<\/em> over time.<\/p>\n<p> Assuming we&#8217;re in the world of computable functions, every possible value for<br \/>\nf<sub>i<\/sub> can be represented by an integer, F<sub>i<\/sub>. So we can re-model this as three-dimensional search, where instead of evaluating f<sub>i<\/sub>(x<sub>i<\/sub>, y<sub>i<\/sub>) getting f<sub>i+1<\/sub> and (x<sub>i+1<\/sub>, y<sub>i+1<\/sub>), we&#8217;ll evaluate<br \/>\na new three-dimensional search function f'(x<sub>i<\/sub>, y<sub>i<\/sub>, F<sub>i<\/sub>)<br \/>\ngetting back (x<sub>i+1<\/sub>, y<sub>i+1<\/sub>, F<sub>i+1<\/sub>). So we&#8217;ve taken the<br \/>\nway that the search function itself can change, and incorporated it into the<br \/>\nsearch space, at the cost of making the search space itself more complex.<\/p>\n<p> For any way that a search space or search process changes over time, we can<br \/>\nincorporate that change into the search space by adding dimensions. This can give us<br \/>\na fixed fitness landscape &#8211; but it does so at the cost of adding lots of dimensions.<\/p>\n<p> So now we&#8217;ve got a measure of the quality of the search result at any particular step. How<br \/>\ncan we define the <em>performance<\/em> of a search over time? There are several ways of doing it; the easiest to understand is as the limit of a function P, such that at time T, P(T) = (&sigma;<sub>i=0,T<\/sub>E(x<sub>t<\/sub>, p<sub>t<\/sub>))\/T; that is, the mean of<br \/>\nthe evaluation result at each step, divided by the number of steps. So take P(T), and<br \/>\ncompute its limit as T approaches infinity &#8211; that&#8217;s one reasonable way of describing the<br \/>\nperformance of a search. What it&#8217;s really doing is computing the<br \/>\naverage fitness result of the search. There are others measures that also take<br \/>\nthe rate of convergence on a solution into account, but for simplicity, the simple measure is<br \/>\nadequate.<\/p>\n<h3>No Free Lunch<\/h3>\n<p> Now, to move on to a bit of NFL-related stuff.<\/p>\n<p> The main fitness-landscape argument used by creationists against evolution is based<br \/>\non Dembski&#8217;s No Free Lunch work. NFL is a family of theorems which (stated informally) says:<br \/>\naveraged over all possible fitness landscapes, no search algorithm can possibly do<br \/>\nbetter than a random walk.<\/p>\n<p> Given a search function and a landscape, you can compute a performance measure. Given a set of multiple landscapes, you can compute the performance of your search function on each landscape separately. Then you can describe the average performance of your search on your set of landscapes. If the set of landscapes is enumerable, then you can (theoretically) compute the average performance of your search function over all possible landscapes. (Even if the the set of landscapes isn&#8217;t enumerable, you can still talk theoretically about the average performance over all landscapes; you just can&#8217;t compute it.)<\/p>\n<p> So &#8211; given any reasonable performance metric, no search algorithm can possible do better than random guessing over the set of all possible fitness landscapes.<\/p>\n<p> There are places where this theorem is actually interesting. But they&#8217;re not the places<br \/>\nthat creationists like Dembski continually invoke it.  (The main interesting application<br \/>\nif it is in game theory, where you can show things like &#8220;There&#8217;s no universal<br \/>\ngame winning algorithm.&#8221;)  In something like evolution, it&#8217;s silly for one simple reason: the &#8220;all possible landscapes&#8221; clause.<\/p>\n<p> Evolution doen&#8217;t work over all possible landscapes, and no sane person would argue<br \/>\nthat it could.<\/p>\n<p> When we talk about fitness landscapes, we tend to think of nice, smooth, continuous<br \/>\nsurfaces, with clearly defined maxima and minima (hilltops and valley bottoms). But all<br \/>\npossible landscapes doesn&#8217;t just include smooth landscapes. It includes random ones. In<br \/>\nfact, if you work out the set of all possible landscapes, there are a <em>lot<\/em> more<br \/>\ndiscontinuous random ones than there are structured, smooth ones.<\/p>\n<p> So in the NFL scenario, you need to be able to work in a nice smooth landscape, like<br \/>\nthis one:<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" alt=\"2d-simple-landscape.gif\" src=\"https:\/\/i0.wp.com\/scientopia.org\/img-archive\/goodmath\/img_349.gif?resize=360%2C228\" width=\"360\" height=\"228\" \/><\/p>\n<p> It&#8217;s easy to see how a search function could find the maximum in that one-dimensional<br \/>\nseach space &#8211; it&#8217;s got a very clean fitness landscape. On the other hand, NFL also requires<br \/>\nyou to be able to find the maximum in a totally random landscape &#8211; like the one below:<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" alt=\"2d-random-landscape.gif\" src=\"https:\/\/i0.wp.com\/scientopia.org\/img-archive\/goodmath\/img_350.gif?resize=360%2C228\" width=\"360\" height=\"228\" \/><\/p>\n<p> It should be obvious that there&#8217;s absolutely no meaningful way to search for the maximum value in that seach space. There&#8217;s nothing in the fitness landscape that can provide the slightest clue of how to identify a maximum.<\/p>\n<p> Evolution works in landscapes with structure. Another way of putting that is that<br \/>\nevolution works in landscapes where the result of a search step provides feedback about the<br \/>\nstructure of the landscape. But the key takeaway here is that NFL doesn&#8217;t provide any meaningful rebuttal to information, because we don&#8217;t expect evolutionary search to<br \/>\nwork in all possible landscapes!<\/p>\n<p> In terms of evolution, the average performance over all possible landscapes is a strange<br \/>\nidea. And it&#8217;s obvious on an intuitive level why there&#8217;s no way that a single search function<br \/>\ncan possibly perform well on all possible landscapes. On some landscapes, it will do really<br \/>\nwell. On some (the random ones, which make up the majority of all possible landscapes), it<br \/>\nwill be effectively the same as a random walk. And on others, it will perform worse than<br \/>\nrandom, because the adversarial landscape is perfectly structured to make the search strategy<br \/>\ntravel towards the worst possible places.<\/p>\n<h3>Smuggling Information<\/h3>\n<p> Lately, Dembski and friends have been taking a new tack, which involves talking about<br \/>\n&#8220;smuggling information&#8221;. They&#8217;ve been using the NFL argument for years, but they&#8217;ve<br \/>\nrun into a rather serious problem: evolution works.<\/p>\n<p> As evolutionary algorithms have become well known, and have had some astonishing<br \/>\nsuccesses, it&#8217;s become difficult to make the argument that an evolutionary process<br \/>\ncan&#8217;t possibly succeed. So it became necessary to come up with an excuse for why<br \/>\nevolutionary algorithms work so amazingly well, even though the creationists<br \/>\nmathematical argument shows that it should be impossible.<\/p>\n<p> Their excuse is to basically accuse the experiments that use evolutionary<br \/>\nalgorithms of cheating.  They argue that the reason that the evolutionary mechanism can<br \/>\nsucceed is because information was &#8220;smuggled&#8221; into the system &#8211; that, essentially, the<br \/>\nsearch algorithm encodes information about the landscape which allows it to succeed. They<br \/>\nargue that this is cheating &#8211; that the evolutionary process really has nothing do with the<br \/>\nsuccess of the search, because the &#8220;solution&#8221; is really encoded into the structure<br \/>\nof the problem via the smuggled information.<\/p>\n<p> There are two responses to this.<\/p>\n<p> The first one is, in a word, &#8220;Duh!&#8221;. That is, <em>of course<\/em> there&#8217;s information<br \/>\nabout the landscape in the system. As I discussed above, there&#8217;s no such thing as a search<br \/>\nalgorithm that works on all landscapes, but for landscapes with particular properties, there<br \/>\nare search algorithms that are highly successful. If you look at it from an<br \/>\ninformation-theoretic viewpoint, <em>any<\/em> search algorithm which can successfully operate<br \/>\nin a particular search space encodes information about the space into its structure. From<br \/>\nthe viewpoint of math, this is just totally, blindingly obvious.<\/p>\n<p> And it&#8217;s not a problem for biological evolution. Biological evolution is based<br \/>\non mutation, reproduction, and differential success. That is, it&#8217;s a process<br \/>\nwhere you have a population of reproducing individuals, where the children are<br \/>\nslightly different from the parents. Some of the children survive and reproduce,<br \/>\nand some don&#8217;t. This process clearly only works within a particular kind of search space;<br \/>\nthat is, a search space where the survival to reproduction of a subset of the population<br \/>\nindicates that that subset of the population has a higher fitness value. <\/p>\n<p> Evolution, modelled as a search, requires certain properties in its search space<br \/>\nfor it to work. The information smuggling argument basically claims that that<br \/>\nmeans that it can&#8217;t work. But <em>every<\/em> successfull search algorithm<br \/>\nhas certain requirements for he search-space in which it operates. By the<br \/>\narguments of Demski and friends, there is no such thing as a successful search<br \/>\nthat doesn&#8217;t cheat.<\/p>\n<p> The second response to the argument is a bit more subtle.<\/p>\n<p> If you look at the evolutionary process, it&#8217;s most like the iterative search process<br \/>\ndescribed towards the beginning of this post. The &#8220;search function&#8221; isn&#8217;t really static over<br \/>\ntime; it&#8217;s constantly changing. At any point in time, you can loosely think of the search<br \/>\nfunction for a given species as exploring some set of mutations, and selecting the ones that<br \/>\nallow them to survive. After a step, you&#8217;ve got a new population, which is going to have new<br \/>\nmutations to explore. So the search function itself is changing. And how is it changing?<br \/>\nModelled mathematically, it&#8217;s changing by <em>incorporating information about the<br \/>\nlandscape<\/em>.<\/p>\n<p> There&#8217;s a really fascinating area of computer science called inductive learning machine<br \/>\ntheory. It was invented by a guy named John Case, who&#8217;s a friend of mine; he was a member of<br \/>\nmy PhD in grad school. John is interested in understanding how learning works on a<br \/>\nmechanistic level. What he does is model it as an inductive process.<\/p>\n<p> You&#8217;ve got a target function, T. The goal of the learning machine is<br \/>\nto find a program for T. The way that it does that is by continually refining its<br \/>\nprogram. So you give it T(0), and it outputs a guess at a program for T. Then you give<br \/>\nit T(1), and it outputs a better guess (it should <em>at least<\/em> be correct for<br \/>\nT(0) and T(1)!). Then you give it T(2), and it outputs another guess. Each step, it<br \/>\ngenerates a new program guessing how to compute T.<\/p>\n<p> According to Dembski, the process by which the learning machine improves its guesses is<br \/>\n<em>cheating<\/em>.<\/p>\n<p> Evolution is really doing exactly the same thing as John&#8217;s inductive learning machines:<br \/>\neach step, it&#8217;s gaining information from the environment. You can say that the evolutionary<br \/>\nsearch at step N has <em>incorporated the results<\/em> of steps 0..N-1. Information about the<br \/>\nenvironment has naturally been incorporated. In fact, if you think about it, it&#8217;s<br \/>\n<em>obvious<\/em> that this is the case &#8211; and that this is exactly <em>why<\/em>, on a<br \/>\nmathematical level, evolution works.<\/p>\n<p> Go back to the NASA experiment for a moment. The search space of possible antennas is<br \/>\nhuge &#8211; but it&#8217;s structured and continuous. How does evolution find the best antenna? It tries<br \/>\nlots of options &#8211; and based on the results of those options, it discovers something about the<br \/>\nstructure of the search space &#8211; it discovers that <em>some set<\/em> of solutions work better<br \/>\nthan others, and it tries to search forward from those. By doing that, it&#8217;s pruned out huge<br \/>\nareas of the search space &#8211; the areas rooted at the least successful tests. Eliminating a<br \/>\nlarge part of the search space <em>is a way of incorporating information about the search<br \/>\nspace<\/em>: it&#8217;s adding the information that &#8220;the solution isn&#8217;t over there&#8221;.<\/p>\n<p> Pointing out that an evolutionary process works in a particular set of landscapes, and<br \/>\nthat it&#8217;s got a way of acquiring and integrating information about the landscape that it&#8217;s<br \/>\nsearching isn&#8217;t a critique of evolution &#8211; it&#8217;s an explanation of how it works. Far from being<br \/>\na good argument against evolution, it&#8217;s an argument <em>for<\/em> evolution, because it shows<br \/>\nthe mechanism by which evolution is capable of finding solutions.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Since my post a couple of weeks ago about NASA and the antenna evolution experiment, I&#8217;ve been meaning to write a followup. In both comments and private emails, I&#8217;ve gotten a number of interesting questions about the idea of fitness landscapes, and some of the things I mentioned in brief throwaway comments. I&#8217;m finally finding [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[30,31],"tags":[],"class_list":["post-718","post","type-post","status-publish","format-standard","hentry","category-information-theory","category-intelligent-design"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-bA","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/718","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=718"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/718\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=718"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=718"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=718"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}