The Problem with NFL: Breadth or Depth?

Despite having written about it before, I still get a lot of questions about William Dembski’s “No Free Lunch”
(NFL) theorems. One message recently contained the question in a particularly interesting form, so I thought I’d take
the opportunity to answer it with a post.

Here’s the question I received:

  1. Is the NFL theorem itself bad math?
  2. If the theorem itself is sound, what’s wrong with how it’s being applied? Is it a breadth issue
    or a depth issue?

The theorems themselves are actually good math. They’re valid given their premises; they’re even deep
and interesting theorems in their way. The problem isn’t the theorem: it’s the way that the theorem is abused to apply it to evolution. But what I found interesting is the idea of characterizing the problem as breadth or depth.

Before I get into the depth of the answer, let me quickly explain what the NFL theorems say.

Suppose you’ve got a multivariable function, f(x1,…,xn), called a landscape function. You want to use a search function to find the maximum value of f. NFL asks: if you know nothing about f, and all you can do is ask for the value of f applied to a specific set of parameters, can you write a search
function which can find the maximum of X?

NFL says no. If you know nothing about the landscape, no search function that you can write is guaranteed to do any
better at finding the optimum than a random walk through the landscape. More precisely, it says that given any
particular search function, the performance of that search function averaged over all possible landscapes is
no better than a random search.

Interestingly, you can view NFL as something pretty close to a variant of the halting problem. You can model
the halting problem is a landscape where each position in the landscape corresponds to a sequence of instructions, and the landscape value l at that point is a measure of the probability of the program halting after that sequence of instructions. (Of course, you can’t really construct that landscape; but you can construct one where you use a heuristic to approximate the halting probability, and only get it exactly right at places where the program
halts in an observed execution.) Then solving the halting problem is equivalent to searching that landscape for a point where l is either 0 or 1. In that case, NFL is saying, basically, that there’s no search function for that space
that does any better than randomly executing the program for a certain duration and seeing if it halts.

Hopefully, that gives you some idea of why I say there’s some actual depth to NFL. It is saying something about a
general problem: there is no universal search. If you don’t know anything about the properties of the landscape in
advance, then you can’t pick a good search function for that landscape. You need to understand something about what the
landscape looks like in order to design or select an effective search algorithm for it.

Back to the question: is the problem with NFL as an anti-evolution argument breadth or depth?

The answer is both.

The breadth problem is simple: the NFL formulation is too broad. NFL says that averaged over all fitness
landscapes, any given search function is no better than a random walk. Based in this, they say that if you model
evolution as a search function over the fitness landscape of survival, it can’t possibly do better than total randomness. That’s nonsense: evolution doesn’t need to work over all possible fitness landscapes. Biology and survival, modeled as a fitness landscape, isn’t all possible landscapes. It isn’t even a single randomly selected
landscape. It’s a highly constrained landscape. Evolution doesn’t need to be able to search any landscape – it just needs to be able to search one particular kind of landscape. If you constrain the properties of the landscape, then you absolutely can pick effective search functions.

For example: newtons method of root finding is a landscape search. It uses the slope at a point to guide it towards
the root of a function. It’s optimizing for the minimum distance from the true root: the landscape at any point
is the distance from a root. As anyone who’s taken college calculus knows that Newton’s method works extremely well for a large number of continuous functions. Dembski’s argument that NFL shows that evolution can’t possibly produce fit organisms could be applied, virtually without modification, to show that Newton’s method can’t possibly find roots. And it would be equally true: applied to arbitrary functions, Newton’s method will fail most of the time. Sometimes it
will work, if the function happens to have the right properties – but most functions don’t. Newton’s method requires functions to be continuous and differentiable – most functions aren’t even differentiable. And Newton’s method fails even on most differentiable functions.

Because NFL plays overly broadly with landscapes, its conclusions don’t make any sense applied to the real
phenomenon of life. You can’t use facts about the set of all possible landscapes to draw conclusions about
individual landscapes.

Now, let’s move on to the depth problem. If you look at how NFL models evolution in depth, you find that it’s
a terrible model. I’ve discussed the problems with modeling evolution as a fitness landscape before, so I won’t go in depth. But the entire idea of using a fitness landscape is a train wreck. Landscape search is based on static landscapes; life isn’t a static landscape. That, right there, is enough to utterly wreck the entire argument. Evolution isn’t landscape search. Fitness landscapes are a useful tool for modeling certain short term aspects of evolution; but as a general model, they do a terrible job. Looking at the problem in depth, and
you find that the model of biology and of evolution used in the NFL arguments are such a poor model of evolution that
there are no valid conclusions that can be drawn from it about the process as a whole.

0 thoughts on “The Problem with NFL: Breadth or Depth?

  1. Blake Stacey

    Before I get into the depth of the answer, let me quickly explain what the NFL theorems say.

    I was thinking you’d go for a pun here. You know, like this:

    Before I get into the depth of the answer, let me quickly explain the breadth of the NFL theorems.

    But hey, that’s just me.

    Reply
  2. Chris Harrison

    Interesting post Mark.
    Olle Häggström(http://www.math.chalmers.se/~olleh/) has published a paper called “Intelligent design and the NFL theorems”, part of the intro reads:
    After giving the necessary background on the mathematics of optimization theory and the NFL theorems in Sections ‘A few mathematical preliminaries’ and ‘Optimization and the NFL theorems,’ I will outline Dembski’s use of the latter in Section ‘Dembski’s application to evolution.’ Then, in Sections ‘A probabilistic interpretation of NFL’ and ‘Dembski’s error,’ I will demonstrate the main error in his argument and the irrelevance of NFL to evolutionary biology.
    If you are interested in Olle’s paper, I’d be happy to email it to you.

    Reply
  3. Steve P.

    Any mathematician would easily understand that the breadth of the theorem completely invalidates his claims about evolution? Therefore, he’s not just really bad at math, he’s lying and intentionally fooling people without some sort of mathematical education. I know that’s nothing new, just thought it needed to be pointed out again.
    Good article, Mark.

    Reply
  4. dave

    “Newton’s method requires functions to be continuous and differentiable – most functions aren’t even differentiable. And Newton’s method fails even on most differentiable functions.”
    It looks to me like all but the first “differentiable” there should have been “continuous”.

    Reply
  5. ToniPetrina

    Richard Wein gave an excellent explanation of NFL theorems and their misapplication in his debunking of Dembski’s book No Free Lunch.

    Reply
  6. Tyler DiPietro

    “Dembski’s argument that NFL shows that evolution can’t possibly produce fit organisms could be applied, virtually without modification, to show that Newton’s method can’t possibly find roots.”

    Not only that, but if you use Dembski’s logic to conclude that the entire field of Optimization Theory is useless. If Dembski’s position is that the NFL theorems prove that evolution can’t operate over their own constrained conditions, then you can use that same argument to show that no method for solving any constrained optimization problem (a TSP, a SAT, and NLP, etc.) will perform better than randomness. Why are we wasting our time?

    Reply
  7. Richard Gay

    Mark, I had you at “Suppose you’ve got a multivariable function…” I’m a reasonably intelligent engineer with only a few courses outside of the usual engineering requirements, but as soon as I read that paragraph it was clear where the fault lay. As you so eloquently clarify, NFL is true in the case of an *arbitrary* unknown function, but a fitness landscape function is anything but arbitrary. I’d venture to say that the only arbitrary functions are ones which we postulate in theory. Well done.

    Reply
  8. Walt

    The proof of the NFL theorems are themselves instructive. Assume that I know what search method you are using. Then the NFL argument is that I can construct search problems so that you do as badly as possible. So the theological implication is that in theory God could have created a world in which evolution could not work. Why God would do that is left unexplained.

    Reply
  9. Mark C. Chu-Carroll

    Steve:
    I tend to go back and forth between whether Dembski is a liar, or a raging incompetent. I look at the NFL stuff, and wonder how someone who could write that could be so clueless that they would not know that the anti-ev argument was nonsense.
    But then I look at some of the other stuff he’s written – like his specified complexity stuff – and it’s so dumb, with such egregious and pointless errors that do nothing to support his arguments – and I’m forced to question his competence. I know he’s dishonest enough to lie when it helps his argument. But when his errors do nothing to help his argument, but they make him look stupid, then I’m forced to think maybe he’s stupid. In which case, I want to know who wrote the math in NFL?

    Reply
  10. Daryl McCullough

    Mark,
    To me, the biggest obstacle with applying NFL (or any other optimization problem) to evolution is this: There is no reason to think that evolution ever discovers optimal solutions to anything. So proving that an optimization problem is impossible has no relevance.

    Reply
  11. Doug

    I did not expect NFL as “no free lunch”.
    ‘id’ is lower case bad math.
    Now NFL as “National Footbal League” does have interesting math. Does the quarterback rating system really define the value of that position. This may deserve analysis.
    Touchdown passes appear to be overrated when one considers that they are also included within completion percentage and yards per attempt.

    Reply
  12. Chris Noble

    To me, the biggest obstacle with applying NFL (or any other optimization problem) to evolution is this: There is no reason to think that evolution ever discovers optimal solutions to anything. So proving that an optimization problem is impossible has no relevance.

    That’s what strikes me immediately. Getting stuck in local minima is a problem for optimisation but who is to stay that there is anything wrong evolution getting stuck (at least for a while) in a local minimum. Does Dembski expect only one global solution with one single species?

    Reply
  13. Science Avenger

    NFL QB rating:
    (passing yards + 20 * completions + 80 * touchdowns – 100 * interceptions) / attempts
    + 0.5
    all multiplied by 25/6.
    No, I’m not kidding. That isn’t exactly how they list it, but if you algebraically simplify it, that’s it, assuming they haven’t changed it in the last 10 years.
    Oh yeah, Mark, nice dissection of Dembski, as usual.

    Reply
  14. Steve P.

    Mark,
    I haven’t read anything about NFL, but hasn’t something similar already been proposed in computer science? I’m recalling some proofs in my distributed systems course (I’m not a computer scientist), and they seem very similar to this theorem, especially in construction.
    Science Avenger,
    Your formula worked perfectly for Peyton Manning’s 2006 season. Based on the intersection with this single sample point, it’s probably right.

    Reply
  15. scottb

    Chris:
    > Does Dembski expect only one global solution with one single species?
    Of course he does. He’s not trying to do science, here. He’s trying to show that the god of the bible is the one true creator of everything and that humanity are his special project and that Christians are superior.

    Reply
  16. Torbjörn Larsson, OM

    As always when Divine Wind Dembski struts his stuff, lots of fun is had by all.
    Chris Harrison:
    As I remembered it Häggström has his papers posted on his site, and indeed his now published revision is here.
    Btw, since you made me look around, I noticed that he was among the signers from the Humanisterna (the Swedish Humanist Association) of an article in swedish press arguing for removing the law on freedom for religion. [Since the law on free speech is enough, and the current law only serves to grant religion special status.] I didn’t recognize him at the time.
    Thanks for making me look closer!

    Reply
  17. Torbjörn Larsson, OM

    ToniPetrina:
    Thanks. I was aware of Wein’s shorter critique of NFL on TalkReason, but I had missed the reference to his lengthier and mathier critique of the book.

    No Free Lunch consists of a collection of tired old antievolutionist arguments: god-of-the-gaps, irreducible complexity, tornado in a junkyard, and cosmological fine-tuning. Dembski attempts to give these old arguments a new lease of life by concealing them behind veils of confusing terminology and unnecessary mathematical notation. The standard of scholarship is abysmally low, and the book is best regarded as pseudoscientific rhetoric aimed at an unwary public which may mistake Dembski’s mathematical mumbo jumbo for academic erudition.

    Since Wesley Elsberry weighed in we should note that Wein thanks him.
    But also Jeffrey Shallit, the computer scientist and number theorist (Wikipedia) who famously characterized Dembski’s products pseudomathematics, as well as Erik Tellgren who IIRC published a paper indirectly showing that NFL is perhaps not applicable for coevolution even in a short term approximation (due to the dynamics that Mark notes). Good company all.

    Reply
  18. Torbjörn Larsson, OM

    Does Dembski expect only one global solution with one single species?

    Or as ERV points out, inspired by Mark’s latest post on Behe’s misuse of fitness landscapes, we have also quasispecies such as viruses that explores multiple maximum simultaneously, which may be another situation NFL doesn’t cover.
    Btw, in that situation the whole population cloud is never optimally globally fit but probably locally potentially maximally fit at any given time.

    Reply
  19. Noel Grandin

    You’ve said this before, but I don’t buy it, because, at any one point in time, the evolution fitness landscape is obviously static.
    Which means that the search function is chasing a constantly changing landscape.
    I admit that I couldn’t work this out myself, but I can’t obviously see how this would invalidate the NFL theorems.

    Reply
  20. SteveM

    #22 – I think the answer is the relative rates of change. If the landscape is changing faster than the search function, then, yes, you are doomed to random walk. I believe, though, that even though the landscape it is changing, the search function is faster and so is working with a relatively static landscape.
    original article: I can’t help thinking that the original question “is it a problem of breadth or depth” refers to common maze searching algorithms rather than referring to and “overly broad application” and a “not very deep understanding of the problem”. I’m probably wrong, but that’s just how I associate the set of words: “search”, “depth”, and “breadth”.

    Reply
  21. Rich

    Re #22.
    The Earth is changing due to tectonics / erosion, etc.. But I still think finding the highest points aren’t that hard to find. It would seem the biggest extinction events occurred after cataclysmic environmental changes (and I’m not talking Noah’s flood)…

    Reply
  22. franky172

    I always thought this was an interesting point:

    Sometimes it will work, if the function happens to have the right properties – but most functions don’t. Newton’s method requires functions to be continuous and differentiable – most functions aren’t even differentiable. And Newton’s method fails even on most differentiable functions.

    In fact, (AFAIK), the vast majority (i.e. with probability 1) any random function you pick from the set of all possible functions is a) nowhere differentiable, and b) nowhere continuous! So the vast majority of functions that the NFL theorem averages over are actually insanely complicated (imagine the following function: y = rand; that’s what most functions look like), and it’s easy to see that on these functions “hill climbing” is no better than “hill descending” for finding maxima!

    Reply
  23. Mark C. Chu-Carroll

    Noel, in #22:
    The point of the changing landscape is that modeling evolution as a search is a fundamentally flawed approach. It’s a bad model of evolution – and bad models give bad results.
    Search optimizations techniques, like what Dembski is proposing for a model of evolution, are based on a fixed terrain, which is independent of the properties of the searcher. But evolution, if you think of it as a search, is a system where the terrain doesn’t just change, but it changes in response to changes in the searcher.
    Fitness landscapes are a useful model for evolution in the short term – if you’re looking at mutations in an HIV virus in a pool of patients over a space of a couple of years, modeling it as an search over a fitness landscape is pretty useful. But that’s because you’ve constrained things in time enough that the changes to the landscape are effectively eliminated, because there isn’t enough time for the landscape to change significantly.
    Dembski wants to model evolution for the entire history of life on earth as traversal of a static fitness landscape. That’s just nonsense. Not because the basic idea
    of “over all possible landscapes, nothing is better than random walk” is necessarily wrong, but because the entire model is wrong, and conclusions from an invalid model are worthless.
    To give you a sort of metaphor: look at the expanding earth stuff by Neal Adams. It’s rubbish. But it does provide a supposed mechanism for gravity. If I draw conclusions about
    how gravity works from Adam’s model, even if my math is perfect, my conclusions will be rubbish – they won’t tell me anything about the real world. The basic model of how things work that they’re based on is incorrect, and so regardless of how hard I work, or how perfect my reasoning, how elegant my math – I can’t produce any meaningful result from that basis.
    NFL is perfectly good math. But if you try to use it as a model for something that doesn’t match its assumptions, it’s going to generate rubbish for results.

    Reply
  24. Brendan S

    #24:
    I just need to butt in here that finding the highest point on Earth is basically just a random search. You visually inspect until you find a local maxima, and then you measure the local maxima vs. the global maxima found everywhere. It’s just that for the vast majority of the landscape, the test is trivial.

    Reply
  25. Xanthir, FCD

    #27: That’s not correct. If it was a random search, you’d just be choosing random points. But when you use ‘visual inspection’ you aren’t choosing randomly, you are using a mental heuristic based on how things look to determine the highest point in view. Visual inspection, in this case, is a hill-climber with a bit of extra abilities bolted on that let it jump out of local maxima if a suspected higher maxima is within view, or to commit the current point to memory and make a random jump a short distance away.

    Reply
  26. franky172

    Rich –
    I was never officially banned; although some months ago my password at UD mysteriously changed, and I can no longer log in to UD. Oh well.

    Reply
  27. Coin

    I’d like at this point to link this interesting take on Dembski’s NFL work. It says mostly things that have shown up in this discussion already, but there is one little contextual thing that makes this particular critique interesting: It is written by David Wolpert, as in the Wolpert of Wolpert and MacReady, the original authors of the NFL theorem!
    Wolpert’s chief complaint here is incidentally not just that Dembski is using the NFL theorem improperly, but that Dembski is not using the NFL theorem at all. Dembski uses the NFL theorems in a general philosophical sense: he treats them as general principles. But this is not what they actually are; they are mathematical theorems. This means that they guarantee consequences given a fixed set of assumptions– if the assumptions do not hold, the theorem tells us nothing. Dembski, says Wolpert, does not do the prerequisite technical work to show that the concept he is describing is something to which the NFL theorem applies. Therefore, points out Wolpert, mathematically speaking, Dembski hasn’t said anything at all.
    The criticism here is basically a modified form of something that MarkCC likes to say frequently– that the worst math is no math at all.

    Reply
  28. Torbjörn Larsson, OM

    Noel Grandin:

    at any one point in time, the evolution fitness landscape is obviously static.

    As several pointed out, it is a matter of relative rates. Selection is felt on an individuals generation basis (procreation). This means both the environment and other parts of the population may change which may change selection pressures on an individual, for example under competition. If you follow the link to ERV’s post she describes a case of observable change (viral medication and its effect).
    One common and simple type of change in response to changes in a population is coevolution. It can be as simple as a prey-predator relationship (faster prey puts more pressure on the predator, and vice versa) but it can also be between genes in a species genome (for example a change in a protein coding gene may put pressures on its regulation, and vice versa).
    It is also a misunderstanding to think that rates are roughly the same for different genes or genomes (species), or that species doesn’t go extinct due to changes in selection pressures. AFAIK > 99 % of species have gone extinct during Earth history.
    New speciation takes up the slack. Currently the extinction rate is greater than the speciation rate since we live in what promise to be the largest mass extinction this Earth has seen, caused by human activities on top of a long series of rapid ice ages that broke the back of the old megafauna. Hopefully we will turn that around before we waste the majority of biological genomic capital – IIRC the earlier largest extinction seems to have taken 3 Myr to recover from.
    One common cause for extinction is when populations lose all abilities for sexual procreation (for example bacterias have at least 3 different methods of sexual procreation) and the faster evolution rate that brings. AFAIK such populations are thought by biologists to always be headed towards extinction.
    Biologists discuss evolution of evolvability, ie things such as mutation modes and rates may be adjusted by evolution itself. But it seems to be true that beyond some (contingent) rate the changes in the genome doesn’t have time to get fixated. (Spread and equilibrated through the population.)
    I think this is what happens at bottlenecks, either the population becomes temporarily small or the environmental changes become temporarily large. Afterwards the population genome is both starved for variation and not equilibrated, so it will evolve away to a new fixation equilibrium and slowly recover variation.

    Reply
  29. Jud

    Noel Grandin wrote: “[A]t any one point in time, the evolution fitness landscape is obviously static.”
    SteveM wrote: “I think the answer is the relative rates of change.”
    Or to put it another way: At any one point in time, a bullet is static. But I wouldn’t model my behavior around bullets based on that fact.

    Reply
  30. Erik 12345

    While I agree that Dembski has not in any way succeeded in undermining the theory of evolution, I disagree with much of the blog entry above.
    Firstly, in the NFL scenario, random search is neither extremely efficient nor extremely inefficient. It has a fairly OK performance, especially considering that the performance is independent of the size of the search space. Make the search space 1 gazillion times bigger and the NFL-averaged performance will change… not one bit. That’s how benign the NFL scenario is. For this reason it is wrong to say that “NFL says no.” in reply to the question “if you know nothing about f, […], can you write a search function which can find the maximum of X?”.
    Secondly, I don’t understand the sketch of how it relates to the halting problem. In Wolpert & Macready’s formulation, the search space is finite so that any search algorithm will find a global maximum in a finite number of computational steps, unlike the halting problem which cannot in general be solved in a finite number of steps. The blog entry seems to envision a function f(x) which is “a measure of the probability of the program halting after that sequence of instructions”. Is f a representation of a computer program and x its input here? And how could such a probability be anything other than either 0 or 1? Continuing, the blog entry says that the halting problem is equivalent to searching for an x such that either f(x) = 0 or f(x) = 1, but the it seems that the halting problem is instead the problem of, for *any* f (or any f that corresponds to a program), determining whether or not there exists an x such that f(x) = 0.
    Erik

    Reply
  31. Erik 12345

    Noel Grandin wrote:
    ———————————————
    “You’ve said this before, but I don’t buy it, because, at any one point in time, the evolution fitness landscape is obviously static.
    Which means that the search function is chasing a constantly changing landscape.
    I admit that I couldn’t work this out myself, but I can’t obviously see how this would invalidate the NFL theorems.”
    ———————————————-
    Wolpert & Macready’s paper contains (i) an NFL theorem for the general case of time-dependent fitness function, (ii) a note about a special kind of time-dependent fitness function (which is updated by a static update rule) for which the NFL result does not hold. In general, whether or not an NFL result holds for a scenario with a changing fitness function depends a bit on
    (a) exactly how the fitness function changes. The NFL result essentially relies on the problem of induction, i.e. we cannot use what we know about already sampled points to infer something about other, as-yet unexplored points. The crucial thing is not whether or not the fitness function changes but whether or not the fitness function has some form of statistical regularities that enable learning.
    (b) how we measure success. For example, Wolpert & Macready have a more recent paper where they analyze coevolution and for the particular way in which they measure success in a model of biological coevolution they conclude that an NFL result does hold. One can also imagine other ways of measuring success for which the NFL result would not hold.

    Reply
  32. Torbjörn Larsson, OM

    Wolpert & Macready have a more recent paper where they analyze coevolution and for the particular way in which they measure success in a model of biological coevolution they conclude that an NFL result does hold.

    Hmm. Maybe I misremembered W&M for Tellgren. I couldn’t find the reference at the time.
    Here and here are references to their works.

    Reply
  33. Xanthir, FCD

    Firstly, in the NFL scenario, random search is neither extremely efficient nor extremely inefficient. It has a fairly OK performance, especially considering that the performance is independent of the size of the search space. Make the search space 1 gazillion times bigger and the NFL-averaged performance will change… not one bit. That’s how benign the NFL scenario is. For this reason it is wrong to say that “NFL says no.” in reply to the question “if you know nothing about f, […], can you write a search function which can find the maximum of X?”.

    Are you kidding me? Random search is the worst possible ‘search’ you can write, ignoring functions which obviously aren’t actually designed to search for the answer (like a hill-descender). It just randomly chooses a point and declares it to be the highest – it’s completely blind to the search space.
    That’s why the NFL theorems say what they do. Given an arbitrary fitness function, you can’t rely on the previously visited points for information about future points. All functions are rendered equally blind, which puts them exactly equivalent to random search.
    The whole point of Mark’s post is that the fitness function that evolution operates under is *far* from arbitrary – it has structure which can be exploited by search functions to do better than random search. Thus, the NFL theorems say absolutely nothing about evolution, because they require the fitness function to be arbitrary.
    You seem to understand what NFL is about (or can at least talk like you do) which really makes me wonder how you can think that NFL has any relevance at all to evolution.

    Reply
  34. Ron

    A trivial, but instructive variation on the search problem arises in my work doing someting as simple as graphing y=f(x). Handheld graphing calculators can be easily fooled by functions as simple as sin(k*x) for judicious choices of frequency k at which the simple algorithm of evaluating f at some equally spaced points and connecting dots presents a simple, but wrong picture which doesn’t represent what happens between the sample points. For functions given as explicit formulas made of elementary functions, one can use interval arithmetic to compute rigorous bounds. This works for optimization algorithms, as well, relying on knowing a lot more about the function than just the ability to compute it’s value at discrete points. If that’s all you can do, it’s easy to imagine “adversery” functions constructed with fore-knowledge of the search or graphing algorithm specifically to fool them by evaluating to 0 everywhere sampled and behaving arbitrarily erratically everywhere else.

    Reply
  35. Mohammed

    One question about NFL: If it’s impossible to come up with an optimization scheme that does better (on all landscapes) than another, then how are able to come up with schemes that do just that?
    I.e., consider a meta-search scheme where for certain problems, like finding roots, it applies some of the methods we already know work well such as Newton’s method, but in all other cases it simply does a random search. Wouldn’t this meta-search scheme work better than a search that is random in all cases? Yes there’s a bit of an issue with identifying the right algorithm for the right problem, but a trivial case would be to simply hardcode the problems that we know can be trivially solved.
    Any thoughts? Also just a peeve. The way the first paragraph is worded makes it sound like Dembski came up with NFL!

    Reply
  36. Xanthir, FCD

    Mohammed:
    True, a search that is random unless it can tell that it is searching a known class of functions would do better. The problem is that any given class of functions that *can* be searched over is miniscule compared to the number of *all* functions, to the point where there is almost no (that’s a strictly defined mathematical term) chance of choosing it.
    As well, there are infinitely many functions that *look* like a given class of functions for finitely many points, but are otherwise arbitrary and structureless. *Just* these cases would dwarf the number of true functions that your algorithm will work on in the same way that all functions will – you have a 0% chance of choosing the actual functions that you can work on from the set of all functions that look like the chosen class of functions.
    So, mathematically speaking, your random-except-in-certain-cases search strategy doesn’t do any better than random, because the cases where it *does* do better occur with 0% chance.

    Reply
  37. Erik 12345

    Xanthir wrote:
    “Are you kidding me? Random search is the worst possible ‘search’ you can write, ignoring functions which obviously aren’t actually designed to search for the answer (like a hill-descender). It just randomly chooses a point and declares it to be the highest – it’s completely blind to the search space.”
    One must not confuse experience with real-world experience with the NFL scenario. Most of what you have learned from experience with difficult real-world cases is not relevant to the NFL scenario, because the latter is heavily biased towards functions that are actually fairly easy to optimize. Random search is indeed a poor strategy for difficult real-world cases, but in the NFL scenario it is a fairly OK strategy and certainly not a prohibitively inefficient one.
    T. M. English “Optimization Is Easy and Learning Is Hard In the Typical Function”, Proceedings of the 2000 Congress on Evolutionary Computation CEC00
    http://citeseer.ist.psu.edu/english00optimization.html
    “That’s why the NFL theorems say what they do. Given an arbitrary fitness function, you can’t rely on the previously visited points for information about future points. All functions are rendered equally blind, which puts them exactly equivalent to random search.”
    The NFL results say that all ways of sampling the search space perform equally well, a statement about relative performance. They do not say that this performance is prohibitively bad in an absolute sense, in fact the performance is fairly OK, see English’s paper above or do the calculation yourself. Very significantly, the performance is independent of the size of the search space, quite unlike the difficult optimization problems you have experienced in the real world. What makes most optimization problems hard is that they scale unfavourably with the size of the search space so that making the search space bigger makes the problem harder, but that does not happen in the benign NFL scenario which has constant scaling!
    The whole point of Mark’s post is that the fitness function that evolution operates under is *far* from arbitrary – it has structure which can be exploited by search functions to do better than random search. Thus, the NFL theorems say absolutely nothing about evolution, because they require the fitness function to be arbitrary.
    I agree with this.
    “You seem to understand what NFL is about (or can at least talk like you do) which really makes me wonder how you can think that NFL has any relevance at all to evolution.”
    I have not said that the NFL results have any relevance to biological evolution. My point is rather that the original blog entry seems to be wrong on some points. Can you explain why it implies that random sampling has bad absolute performance in the NFL scenario or how the NFL result can be related to the Halting problem?

    Reply
  38. Xanthir, FCD

    My point is rather that the original blog entry seems to be wrong on some points. Can you explain why it implies that random sampling has bad absolute performance in the NFL scenario or how the NFL result can be related to the Halting problem?

    After a rereading, I’m not seeing the former being expressed anywhere in the article. The latter, I have no idea – I didn’t really understand that paragraph of Mark’s, and I wouldn’t be willing to defend it.

    The NFL results say that all ways of sampling the search space perform equally well, a statement about relative performance. They do not say that this performance is prohibitively bad in an absolute sense, in fact the performance is fairly OK, see English’s paper above or do the calculation yourself. Very significantly, the performance is independent of the size of the search space, quite unlike the difficult optimization problems you have experienced in the real world. What makes most optimization problems hard is that they scale unfavourably with the size of the search space so that making the search space bigger makes the problem harder, but that does not happen in the benign NFL scenario which has constant scaling!

    Well, yeah. Random search is constant in time because it only ever performs a single operation – it generates a random point in the space. Thus, it’s probably the most efficient possible function to search an arbitrary landscape, barring implementation details. That seems trivial, and unimportant once you’ve established the base result that all function are blinded in an arbitrary space.
    As I said, random search is the worst ‘search’ that can be called one – no function that is actually intended to find a point will do worse on a particular space. Efficiency-wise, sure, it’s damned fast. That just doesn’t really matter here in the scheme of things.
    That’s why I’m not seeing your point. It’s either trivial (if we’re talking about an arbitrary landscape) or wrong (in the case of a structured landscape).

    Reply
  39. Anonymous

    Xanthir wrote:
    “After a rereading, I’m not seeing the former being expressed anywhere in the article.”
    See the context around the sentence “NFL says no.” in the blog entry.
    “Random search is constant in time because it only ever performs a single operation – it generates a random point in the space.”
    Then you have misunderstood the NFL results. The basic scenario is that a function f is chosen and that N points are sampled based on some sampling distribution (“search algorithm”) p(x_next | x1,f(x1),…,xK,f(xK)). “Random search” in this context means repeated i.i.d. sampling, not sampling a single point. The NFL result is that when performance is measured by some function which only depends on the sampled function values, then the NFL-averaged performance is independent of the sampling distribution. Thus, the comparison is made based on equally large samples produced by different sampling distributions.

    Reply

Leave a Reply to Brendan S Cancel reply