{"id":268,"date":"2007-01-09T16:16:40","date_gmt":"2007-01-09T16:16:40","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2007\/01\/09\/probabilistic-complexity\/"},"modified":"2007-01-09T16:16:40","modified_gmt":"2007-01-09T16:16:40","slug":"probabilistic-complexity","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2007\/01\/09\/probabilistic-complexity\/","title":{"rendered":"Probabilistic Complexity"},"content":{"rendered":"<p> As I&#8217;ve mentioned in the past, complexity theory isn&#8217;t really one of my favorite topics. As a professional computer scientist, knowing and loving complexity up to the level of NP-completeness<br \/>\nis practically a requirement. But once you start to get beyond P and NP, there are thousands of complexity classes that have been proposed for one reason or another, and really understanding all of them can get remarkably difficult, and what&#8217;s worse, can feel like an utterly pointless exercise,<br \/>\nparticularly if the writer\/teacher you&#8217;re learning from isn&#8217;t very good.<\/p>\n<p> I&#8217;m not going to write a long series of posts on the more esoteric complexity classes. But there<br \/>\nare a few that are interesting, and might even have some relevance to actual practical problems in<br \/>\nselecting algorithms for a software system.<\/p>\n<p><!--more--><\/p>\n<p> Today I&#8217;m going to talk about one of those practical classes, called BPP. BPP is the class of<br \/>\nproblems that can be solved on a probabilistic machine in P-time with a <em>bounded probability<\/em> of<br \/>\nless than 1\/2 of generating an incorrect result. (In general, we use 1\/3 as the probability of failure,<br \/>\nbut that&#8217;s just for convenience; you can pick <em>any<\/em> bound b you want from the range 0 &lt; b<br \/>\n&lt; 1\/2, and you&#8217;ll wind up with the <em>same<\/em> set of problems in BPP.) <\/p>\n<p> This is going to sound like a strange idea, but it&#8217;s actually quite useful. A probabilistic machine<br \/>\nis one which can, whenever it needs to during the execution of its program, flip a coin to select one<br \/>\nof two paths. If you can write a program which uses these random branches, and which will on average<br \/>\ngenerate correct results for better than 2 out of 3 executions, then the program is considered a<br \/>\n<em>correct<\/em> probabilistic solution to the problem, and the problem that it solves in in BPP.<\/p>\n<p> Why is this useful?<\/p>\n<p> Randomized algorithms are something we use in the real world. It&#8217;s rare that we can accurately<br \/>\ncharacterize the precise probability of generating a successful result, but BPP provides a model that<br \/>\nallows us to describe the capabilities of algorithms that use randomness. And even in cases where we<br \/>\ndon&#8217;t really use randomness, we often approach exponential-time problems using heuristics &#8211; essentially<br \/>\nguesses about how to find a solution that work <em>most<\/em> of the time. To get some sense of what<br \/>\nbenefit we can get from heuristics, we can model them as probabilistic branches &#8211; so again, BPP gives<br \/>\nus a way of exploring what the heuristics can do to the worst-case performance for a problem.<\/p>\n<p> The most common kind of probabilistic algorithm is called a <em>Monte Carlo<\/em> algorithm. The<br \/>\nidea of a Monte Carlo algorithm is pretty simple. Imagine that you&#8217;ve got a barn with a target painted<br \/>\non it, and your goal is to have a blind man put a bullet through the target. A Monte Carlo approach is<br \/>\nbasically handing him a rapid-fire machine gun, pointing him in the right general direction, and having<br \/>\nhim pull the trigger while moving the gun barrel back and forth until he&#8217;s fired a couple of hundred<br \/>\nshots. There&#8217;s a pretty good chance that at least one of those shots will wind up in the target. In<br \/>\na Monte Carlo algorithm, you use some form of making a series of random guesses, each of which eliminates a group of possible solutions. After enough tries, whatever&#8217;s left is probably right.<\/p>\n<p> The easiest example of this is one of the common algorithms for rapidly determining whether a<br \/>\nnumber is prime. Given an integer N which you think might be prime, randomly pick 10000 integers<br \/>\nbetween 2 and N<sup>1\/2<\/sup>. For each one, check if it divides N evenly. If none of them divide it,<br \/>\nthen there&#8217;s a high probability that it&#8217;s prime. <\/p>\n<p> A slightly harder example of a probabilistic algorithm is one of the classic NP <em>hard<\/em> problems<br \/>\nis called the traveling salesman problem (TSP). In the TSP, you are a salesman who is given a route<br \/>\nincluding a set of N cities (nodes). You know the distance between any two cities in the list. The question is: what is the shortest path that allows you to visit all of the cities? <\/p>\n<p> There&#8217;s a P-time probabilistic algorithm for the TSP based on a randomized heuristic which<br \/>\ngenerates an optimal solution with high probability. Basically, generate a random path. Then randomly pick a group of four nodes that are close together, and randomly rearrange them. If the<br \/>\nresult is a shorter path, then that becomes the new path. If after some number *N* tries, the path doesn&#8217;t improve, then stop. (There are more details than that, but that&#8217;s the basic idea.) This algorithm can run in P-time, and generates optimal results with extremely high probability for problem sizes in the <em>millions<\/em> of nodes &#8211; problem sizes which would be absolutely intractable using<br \/>\nany conventional deterministic algorithm. It&#8217;s not guaranteed to generate a correct result; and in fact, I&#8217;ve<br \/>\nnever seen an analysis that manages to specifically pin down its rate of success. But in practice,<br \/>\nits results are astonishingly good; I don&#8217;t think that for any test on hundreds on thousands of nodes where we know the correct answer that this algorithm didn&#8217;t find it.<\/p>\n<p><em>(Note: The above explanation around the TSP was originally wrong. I managed to make<br \/>\na major error in saying that the TSP was NP-complete. The TSP is <b>not<\/b> NP-complete;<br \/>\nit is NP-hard which is quite a lot worse. The TSP <b>decision problem<\/b> in NP-complete. The decision problem is: Given a path-length P, <em>is there<\/em> a path whose length is less than or equal to P? Just that decision part is NP-complete. Generating a path as a solution is quite a lot harder; well into EXP territory, even if it turns out that P=NP.)<\/em><\/p>\n<p> There are also a couple of other interesting complexity classes that are related to BPP:<\/p>\n<ul>\n<li><b>RP<\/b>: randomized polynomial time. This is a computational class for decision problems<br \/>\n&#8211; that is, problems which take an input, and return &#8220;Yes&#8221; or &#8220;No&#8221;. A problem is in RP if<br \/>\nthere&#8217;s a probabilistic machine that <em>always<\/em> answers &#8220;No&#8221; if the correct answer<br \/>\nis &#8220;No&#8221;, and returns &#8220;Yes&#8221; more than 50% of the time if the correct answer is &#8220;Yes&#8221;. (What<br \/>\nthis means is sort of the opposite of what it sounds like: if it answers &#8220;Yes&#8221;,<br \/>\nthen it&#8217;s always right. If it answers &#8220;No&#8221;, then there&#8217;s some chance that it was<br \/>\nwrong.) We know that P &sube; RP, and  RP &sube; BPP.  <em>(There was originally a type in this paragraph, where I wrote &#8220;A problem is in NP&#8221; where it correctly reads &#8220;A problem is NP&#8221;. Thanks to commenter Coin for the catch!)<\/em><\/li>\n<li><b>PP<\/b>: probabilistic polynomial time. This is also a computational class for<br \/>\ndecision problems. For a problem in PP, there is<br \/>\na polynomial-time probabilistic algorithm where  if the correct answer is &#8220;Yes&#8221;, the<br \/>\nalgorithm will answer &#8220;Yes&#8221; <em>more than<\/em> half the time; if the correct answer<br \/>\nis &#8220;No&#8221;, it will answer &#8220;Yes&#8221; <em>less than<\/em> half the time. BPP &sube; PP; PP<br \/>\nis a much weaker (in the sense of the strength of results required) class &#8211; an<br \/>\nalgorithm in PP can end up arbitrarily close to being correct as often as<br \/>\na coin-flip, no matter how many times you repeat the computation; whereas in BPP,<br \/>\nthere is a probability bound, and by running the computation multiple times, you<br \/>\ncan reduce the bound to get arbitrarily close to always being correct. In fact,<br \/>\nit turns out that NP &sube; PP.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>As I&#8217;ve mentioned in the past, complexity theory isn&#8217;t really one of my favorite topics. As a professional computer scientist, knowing and loving complexity up to the level of NP-completeness is practically a requirement. But once you start to get beyond P and NP, there are thousands of complexity classes that have been proposed for [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[80],"tags":[],"class_list":["post-268","post","type-post","status-publish","format-standard","hentry","category-computational-complexity"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-4k","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/268","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=268"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/268\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=268"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=268"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=268"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}