Messing with big numbers: using probability badly
-------------------------------------------------

After yesterday's post about the sloppy probability from Ann Coulter's chat site, I thought it would be good to bring back one of the earliest posts on Good Math/Bad Math, back when it was on Blogger. As usual with reposts, I've revised it somewhat, but the basic meat of it is still the same.

There are a lot of really bad arguments out there, written by anti-evolutionists, based on incompetent use of probability. A typical example is [this one][crapcrap]. That article is a great example of the mistakes that commonly get made in probability-based arguments, because it makes so many of them. (In fact, it makes every single category of error that I list below!)

Tearing down probabilistic arguments takes a bit more time than tearing down the information-theory arguments. 99% of the time, the IT arguments are built around the same fundamental mistake: an invalid definition of information. But since they explicitly link it to mathematical information theory, all you really need to do is show why their definition is wrong, and the whole thing falls apart.

The probabilistic arguments are different. There isn't one mistake that runs through all the arguments.
There are many possible mistakes, and each argument typically stacks up several of them.

For the sake of clarity, I've put together a taxonomy of the basic probabilistic errors that you typically see in creationist screeds.

Big Numbers
-----------

This is the easiest one. It consists of exploiting our difficulty in really comprehending how huge numbers work in order to claim that, beyond a certain point, improbable things become impossible. You can always identify these arguments by the phrase "the probability is effectively zero."

You typically see people claiming things like "anything with a probability of less than 1 in 10^60 is effectively impossible". It's often conflated with some other numbers to push the idea of "too improbable to ever happen". For example, they'll throw in something like "the number of particles in the entire universe is estimated to be 3×10^78, and the probability of blah happening is 1 in 10^100, so blah can't happen".

It's easy to refute. Take two distinguishable decks of cards and shuffle them together. Look at the ordering of the cards: it's a list of 104 elements. What's the probability of *that particular ordering* of those 104 elements?

The likelihood of the resulting shuffled deck having the particular ordering you just produced is roughly 1 in 10^166. There are more possible unique shuffles of two decks of cards than there are particles in the entire universe.

Looked at intuitively, it *seems* like something whose probability is nearly 90 orders of magnitude worse than the odds of picking out one specific particle in the entire observable universe *should* be impossible. Our intuition says that any probability with a number that big in its denominator just can't happen.
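That 1-in-10^166 figure is easy to check; here's a quick sketch, using the rough particle-count estimate quoted above:

```python
import math

# Number of distinct orderings of 104 distinguishable cards.
orderings = math.factorial(104)

# 104! has 167 digits, so any one specific ordering has probability ~1 in 10^166.
print(f"number of orderings ~ 10^{len(str(orderings)) - 1}")

# Rough estimate of particles in the observable universe, as quoted above.
particles = 3 * 10**78
print(orderings > particles)  # True: far more shuffles than particles
```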
Our intuition is wrong, because we're quite bad at really grasping the meaning of big numbers.

Perspective Errors
------------------

A perspective error is a relative of the big-numbers error. It's part of an argument that tries to say the probability of something happening is just too small to be possible. The perspective error consists of taking the outcome of a random process (like the shuffling of cards that I mentioned above), looking at that outcome *after* the fact, and calculating the likelihood of it happening.

Random processes typically have a huge number of possible outcomes, and any time you run one, you have to wind up with *some* outcome. There may be a mind-boggling number of possibilities, and the probability of getting any specific one of them may be infinitesimally small, but you *will* end up with one of them. The probability of getting an outcome is 100%. The probability of being able to predict *which* outcome is terribly small.

The error here is taking the outcome of a random process which has already happened, and treating it as if you were predicting it in advance.

The way this comes up in creationist screeds is that they do probabilistic analyses of evolution built on the assumption that *the observed result is the only possible result*. You can view something like evolution as a search of a huge space; at any point in that space, there are *many* possible paths. In the history of life on earth, there are enough paths to utterly dwarf numbers like the card-shuffling one above.

By selecting the observed outcome *after the fact*, and then doing an *a priori* analysis of the probability of getting *that specific outcome*, you create a false impression that something impossible happened. Returning to the card-shuffling example: shuffling a deck of cards is *not* a magical activity.
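To see the perspective error concretely, here's a small sketch: every run of the shuffle is guaranteed to produce *some* ordering, even though the after-the-fact probability of the specific ordering you observed is a vanishingly small 1/52!:

```python
import math
import random

deck = list(range(52))   # a single 52-card deck
random.shuffle(deck)     # run the random process: some outcome is guaranteed

# After the fact, the a priori probability of this exact ordering:
p = 1 / math.factorial(52)
print(p)  # ~1.2e-68: "impossibly" small, yet an outcome this unlikely just occurred
```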
Getting a result from shuffling a deck of cards is *not* improbable. But if you take the result of the shuffle *after the fact*, and try to compute the a priori probability of getting that result, you can make it look like something inexplicable happened.

Bad Combinations
----------------

Combining the probabilities of events can be very tricky and easy to mess up; the right combination is often not what you would expect. You can make things seem a lot less likely than they really are by making an easy-to-miss mistake.

The classic example is one that almost every first-semester probability instructor tries in class. In a class of 20 people, what's the probability of two people having the same birthday? Most of the time, someone will say that the probability of any two people having the same birthday is 1/365^2, so the probability of a match in a group of 20 is the number of possible pairs over 365^2, or 400/365^2: about 1/3 of 1 percent.

That's the wrong way to derive it. There's more than one error in that reasoning, but I've seen three introductory probability classes where it was the first guess. The correct answer is about 41 percent, over a hundred times larger.

Fake Numbers
------------

To figure out the probability of some complex event or sequence of events, you need correct numbers for the basic events that you're using as building blocks. If you get those numbers wrong, then no matter how meticulous the rest of the probability calculation is, the result is garbage.

For example, suppose I'm analyzing the odds in a game of craps. (Craps is a casino game played with six-sided dice.) If I say that, in rolling a fair die, the odds of rolling a six are 1/6th the odds of rolling a one, then any probabilistic prediction that I make is going to be wrong.
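The correct derivation of the birthday probability works through the complement: the chance that all 20 birthdays are *distinct*. A sketch (ignoring leap years, as the classic version of the puzzle does):

```python
def p_shared_birthday(n):
    # P(at least one shared birthday) = 1 - P(all n birthdays distinct).
    # The k-th person must avoid the k birthdays already taken.
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (365 - k) / 365
    return 1 - p_distinct

print(round(p_shared_birthday(20), 3))  # ~0.411
print(round(p_shared_birthday(23), 3))  # ~0.507: the crossover past 50%
```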
It doesn't matter that, from that point on, I do all of the analysis exactly right; I'll get the wrong results, because I started with the wrong numbers.

This one is incredibly common in evolution arguments: the initial probability numbers are just pulled out of thin air, with no justification.

Misshapen Search Space
----------------------

One way of modeling a random process is as a random walk over a search space. Just as with the fake-numbers error, if your model of the search space has a different shape than the thing you're modeling, you're not going to get correct results. This is an astoundingly common error in anti-evolution arguments; in fact, it's the basis of Dembski's NFL arguments.

Let's look at an example to see why it's wrong. Our search space is a table, and we've got a marble that we're going to roll across it. We want to know the probability of the marble winding up in a specific position.

That obviously depends on the surface of the table. If the surface is concave, the marble is going to wind up in nearly the same spot every time we try it: the lowest point of the concavity. If the surface is bumpy, it's probably going to wind up in a concavity between bumps; it's *not* going to wind up balanced on the tip of one of the bumps.

If we want to model the probability of the marble stopping in a particular position, we need to take the shape of the table's surface into account. If the table is actually a smooth concave surface, but we build our probabilistic model on the assumption that it's a flat surface covered with a large number of uniformly distributed bumps, then our probabilistic model *can't* generate valid results.
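The marble example can be sketched as a toy simulation; the one-dimensional "table" and its surface shape here are made up purely for illustration. On a concave surface the marble's final position is essentially deterministic, while a flat-table model would wrongly predict a uniform spread:

```python
import random

def height(x):
    # A smooth concave surface over positions 0..100, lowest at x = 50.
    return (x - 50) ** 2

def roll_marble():
    x = random.randrange(101)  # the marble lands somewhere at random...
    while True:                # ...then rolls downhill until it can't descend further
        neighbors = [n for n in (x - 1, x + 1) if 0 <= n <= 100]
        best = min(neighbors, key=height)
        if height(best) >= height(x):
            return x
        x = best

# Every trial ends at the bottom of the concavity, not uniformly at random.
results = {roll_marble() for _ in range(1000)}
print(results)  # {50}

# A "flat table" model would instead assign probability 1/101 to every position.
```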
The model of the search space does not reflect the properties of the actual search space.

Anti-evolution arguments that talk about search are almost always built on invalid models of the search space. Dembski's NFL argument, for instance, is based on averaging the success rates of searches over *all possible* search spaces.

False Independence
------------------

If you want to make something appear less likely than it really is, or you're just not being careful, a common mistake is to treat events as independent when they're not. If two events with probabilities p1 and p2 are independent, then the probability of both occurring is p1×p2. But if they're *not* independent, that product gives you the wrong answer.

For example, take all of the spades from a deck of cards. Shuffle them, and then lay them out. What are the odds that you laid them out in numeric order? It's 1/13! = 1/6,227,020,800. That's a pretty ugly number. But if you wanted to make it look even worse, you could "forget" that the sequential draws are dependent, in which case the odds would be 1/13^13, or about 1/(3×10^14): roughly 50,000 times worse.

[crapcrap]: http://www.parentcompany.com/creation_essays/essay44.htm
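The spades example is easy to verify: the dependent calculation (13 draws without replacement, so 1/13!) against the falsely independent one (1/13^13):

```python
import math

# Draws without replacement: 13 cards, one valid ordering out of 13!.
dependent = 1 / math.factorial(13)

# Falsely treating each of the 13 draws as an independent 1-in-13 event.
independent = 1 / 13**13

print(f"1/13! = 1/{math.factorial(13):,}")            # 1/6,227,020,800
print(f"ratio ~ {13**13 / math.factorial(13):,.0f}")  # ~48,639, i.e. about 50,000x worse
```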