{"id":267,"date":"2007-01-08T15:45:04","date_gmt":"2007-01-08T15:45:04","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2007\/01\/08\/basic-complexity-classes-p-and-np\/"},"modified":"2007-01-08T15:45:04","modified_gmt":"2007-01-08T15:45:04","slug":"basic-complexity-classes-p-and-np","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2007\/01\/08\/basic-complexity-classes-p-and-np\/","title":{"rendered":"Basic Complexity Classes: P and NP"},"content":{"rendered":"<p> Now that we&#8217;ve gone through a very basic introduction to computational complexity, we&#8217;re ready<br \/>\nto take a high-level glimpse at some of the more interesting things that arise from it. The one<br \/>\nthat you&#8217;ll hear about most often is &#8220;P vs NP&#8221;, which is what I&#8217;m going to write about today.<\/p>\n<p><!--more--><\/p>\n<p> So, what are P and NP? <\/p>\n<p> P and NP are two very broad classes of computational problems. When we&#8217;re talking about P and NP, we&#8217;re talking about the intrinsic complexity of a <em>problem<\/em> &#8211; that is, a minimum complexity bound on the growth rate of the worst case performance of <em>any<\/em> algorithm that solves the problem.<\/p>\n<p> P is the class of problems that can be solved in polynomial time on a deterministic effective<br \/>\ncomputing system (ECS). Loosely speaking, all computing machines that exist in the real world are<br \/>\ndeterministic ECSs. So P is the class of things that can be computed in polynomial time on real computers.<\/p>\n<p> NP is the class of problems that can be solved in polynomial time on <em>non-deterministic<\/em><br \/>\nECSs.  
A non-deterministic machine is a machine that can execute programs in such a way that whenever there are multiple choices, rather than iterate through them one at a time, it can follow <em>all choices\/all paths<\/em> at the same time, and the computation succeeds if <em>any<\/em> of those<br \/>\npaths succeeds. If multiple paths lead to success, one of them is selected by some unspecified mechanism; we usually say that the machine picks the first path to reach a successful result.<\/p>\n<p> The idea of non-determinism in effective\/Turing-equivalent computing systems is a tough<br \/>\nthing to grasp. There are two ways of thinking about it that help most people understand what it<br \/>\nmeans: there&#8217;s the low-level explanation, and the high-level explanation.<\/p>\n<p> The low-level explanation is based on the basic concept of the Turing machine. (If you need<br \/>\na reminder of how a Turing machine works, I wrote an explanation <a href=\"http:\/\/goodmath.blogspot.com\/2006\/03\/playing-with-mathematical-machines.html\">here<\/a>.) In a Turing machine, you&#8217;ve got a tape full of data cells, with a head that can read and\/or write data on<br \/>\nthe cell under it, and you&#8217;ve got a <em>state<\/em>. At each step of the computation, the machine looks at the contents of the tape cell under the head, and uses that data together with its current state to decide what (if anything) to write onto the current tape cell, what state to adopt, and which direction to move the head on the tape. In a deterministic Turing machine, for a given pair (cell,state) of tape cell data and machine state, there can be only one possible (write,state,move) action for the machine. In a non-deterministic machine, there can be <em>multiple<\/em> actions for a given (cell,state) pair. 
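The low-level picture can be sketched in code. Here is a minimal simulator (Python; the machine, state names, and transition-table encoding are my own inventions for illustration, not from any real system) that handles multiple actions per (cell,state) pair by tracking every configuration the machine could be in at once:

```python
from collections import deque

def run_ntm(delta, tape, start, accept, max_steps=1000):
    """Breadth-first simulation of a non-deterministic Turing machine.

    delta maps (state, symbol) -> a LIST of (write, new_state, move) actions;
    more than one action for a (state, symbol) pair is the non-determinism.
    The machine accepts if ANY branch reaches the accept state.
    """
    # A configuration is (state, head position, tape contents); the frontier
    # holds every configuration some branch of the machine is in right now.
    frontier = deque([(start, 0, tuple(tape))])
    for _ in range(max_steps):
        if not frontier:
            return False                      # every branch died out: reject
        next_frontier = deque()
        while frontier:
            state, pos, cells = frontier.popleft()
            if state == accept:
                return True                   # one successful path is enough
            symbol = cells[pos] if 0 <= pos < len(cells) else '_'
            for write, new_state, move in delta.get((state, symbol), []):
                new_cells = list(cells)
                if 0 <= pos < len(new_cells):
                    new_cells[pos] = write
                step = 1 if move == 'R' else -1
                next_frontier.append((new_state, pos + step, tuple(new_cells)))
        frontier = next_frontier
    return False

# Toy machine: on '0' it can either keep the bit or flip it (two possible
# actions, so the simulation branches); it accepts when it walks off the tape.
delta = {
    ('q0', '0'): [('0', 'q0', 'R'), ('1', 'q0', 'R')],
    ('q0', '1'): [('1', 'q0', 'R')],
    ('q0', '_'): [('_', 'qa', 'R')],
}
print(run_ntm(delta, list('0101'), 'q0', 'qa'))   # True: some branch accepts
```

The "splitting into copies" described below is exactly the growth of the frontier: each extra action in the transition table spawns another configuration to follow.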
When there are multiple choices, the machine picks <em>all of them<\/em>: you can think of it as the machine splitting itself into multiple copies, each of which picks one of the choices, and whichever copy succeeds first becomes the result of the<br \/>\ncomputation.<\/p>\n<p> The high-level explanation is a simpler idea, and it&#8217;s closer to what we use in practice for<br \/>\nspecifying when some problem is in NP, but it sounds pretty weird. A problem whose solution can<br \/>\nbe computed in polynomial time on a non-deterministic machine is equivalent to a problem where<br \/>\na proposed solution can be verified correct in polynomial time on a deterministic machine. You can<br \/>\nspecify a non-deterministic machine that guesses all possible solutions, following all of those paths at once, and returning a result whenever it finds a solution that it can verify is correct. So if you<br \/>\ncan check a solution deterministically in polynomial time (not produce a solution, but just verify that the solution is correct), then the problem is in NP.<\/p>\n<p> The distinction can become much clearer with an example. There&#8217;s a classic problem called the<br \/>\nsubset-sum problem (SS). In the SS problem, you&#8217;re given an arbitrary set of integers. The question is,<br \/>\ncan you find any non-empty subset of values in the set whose sum is 0? It should be pretty obvious that<br \/>\n<em>checking<\/em> a solution is in P: a solution is a list of integers whose maximum length is the size<br \/>\nof the entire set; to check a potential solution, sum the values in the solution, and see if the result<br \/>\nis 0. That&#8217;s <b>O<\/b>(n) where n is the number of values in the set. But <em>finding<\/em> a solution is<br \/>\nhard. The solution could be <em>any<\/em> subset of any size larger than 0; for a set of n elements,<br \/>\nthat&#8217;s 2<sup>n<\/sup>-1 possible subsets. 
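That asymmetry between checking and finding is easy to see in code. A tiny sketch (Python, purely illustrative): the verification is one linear pass over the proposed subset, while a deterministic search has to grind through up to 2^n - 1 non-empty subsets in the worst case:

```python
from itertools import combinations

def verify(subset):
    """Deterministic P-time check: is this a valid subset-sum witness?
    One pass over at most n numbers -- O(n)."""
    return len(subset) > 0 and sum(subset) == 0

def find_zero_subset(values):
    """Brute-force deterministic search: in the worst case it examines
    all 2^n - 1 non-empty subsets -- exponential in n."""
    for size in range(1, len(values) + 1):
        for subset in combinations(values, size):
            if verify(subset):
                return subset
    return None

values = [3, -7, 1, 9, 6]
print(find_zero_subset(values))   # (-7, 1, 6), since -7 + 1 + 6 == 0
```

A non-deterministic machine would, in effect, run `verify` on every subset simultaneously; the deterministic loop has to take the subsets one at a time.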
Even if you use clever tricks to reduce the number of possible<br \/>\nsolutions, you&#8217;re still well into exponential territory in the worst case. So you can non-deterministically guess a solution and test it in linear time; but no one has found any way<br \/>\nof producing a correct solution deterministically in polynomial time.<\/p>\n<p> One of the great unsolved problems in theoretical computer science is: does P = NP? That is, is<br \/>\nthe set of problems that <em>can<\/em> be solved in polynomial time on a non-deterministic machine the same as the set of problems that can be solved in polynomial time on a deterministic machine? We know for certain that P &sube; NP &#8211; that is, that all problems that can be solved in polynomial time on a deterministic machine can also be solved in polynomial time on a non-deterministic machine. But whether the reverse holds &#8211; whether every problem in NP is also in P &#8211; we simply do not know.<\/p>\n<p> For computing devices simpler than Turing-equivalent, we know the relationships between the deterministic and non-deterministic problem families. For example, for finite state machines,<br \/>\nnon-deterministic FSMs are no faster or more capable than deterministic ones (so DFA = NFA); for pushdown automata, the deterministic versions are strictly weaker than the non-deterministic ones (D-PDA &sub; N-PDA). But what about Turing machines, or any other Turing-equivalent ECS? Not a clue. I know people who study computational complexity who think that P=NP, and that we will find a polynomial-time deterministic solution to one of the hardest problems in NP any year now. Personally, I suspect that P &ne; NP,<br \/>\nbased on intuition from the PDA case; as computing devices get more complex, we start to see more<br \/>\ndifferences between the deterministic and non-deterministic machines, and I suspect that pattern<br \/>\nwill hold true all the way up to full ECSs. 
But I have no proof to back that up, just intuition.<\/p>\n<p> Within NP, there is a set of particularly interesting problems which are called <em>NP-complete<\/em>. An NP-complete problem is one where we can prove that if there were a<br \/>\nP-time computation that solved that problem, it would mean that there was a P-time solution<br \/>\nfor <em>every<\/em> problem in NP, and thus that P=NP. We&#8217;ve identified hundreds (thousands?) of<br \/>\nNP-complete problems, but no one has ever found a P-time solution for any of them.<\/p>\n<p> The SS problem mentioned above is NP-complete. So are a lot of other interesting problems:<br \/>\n<a href=\"http:\/\/web.cs.ualberta.ca\/~joe\/Coloring\/\">graph coloring<\/a>, <a href=\"http:\/\/members.pcug.org.au\/~dakin\/tsp.htm\">the traveling salesman<\/a>, <a href=\"http:\/\/www.cs.utk.edu\/~yzhang\/ce\/\">clique detection in graphs<\/a>, and many more.<\/p>\n<p> So how do we show that something is NP-complete? It sounds like a very difficult thing to do. And doing it from first principles <em>is<\/em> extremely difficult. Fortunately, once it&#8217;s been done for one problem, we can piggyback on that solution. The way to prove NP-completeness is based on<br \/>\nthe idea of <em>problem reduction<\/em>. If we have two problems S and T, and we can show that<br \/>\nany instance of S can be transformed into an instance of T in polynomial time, then we say that S is polynomial-time reducible to T. If T is NP-complete, then <em>every<\/em> problem in NP is polynomial-time reducible to T.<\/p>\n<p> The trick is: once we know a problem T which is NP-complete, then for any other problem U, if we can show that T is polynomial-time reducible to U, and that U is itself in NP, then U must be NP-complete as well. If we can reduce T to U, but we can&#8217;t show that U is in NP, then we don&#8217;t say that U is NP-complete; we say that it&#8217;s NP-hard. 
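To make "transform any instance of S into an instance of T" concrete, here's a sketch of the smallest textbook reduction I know of (my choice of example, not one from the post): finding an independent set of size k in a graph G is exactly finding a clique of size k in the complement of G, and building the complement is obviously polynomial-time. So independent-set reduces to clique detection, the problem linked above:

```python
from itertools import combinations

def complement(n, edges):
    """P-time transformation: all possible edges on n vertices, minus
    the ones present in the original graph."""
    return {frozenset(e) for e in combinations(range(n), 2)} - edges

def has_clique(n, edges, k):
    """Brute-force check (exponential): do any k vertices form a clique,
    i.e. are they all pairwise connected?"""
    return any(all(frozenset(p) in edges for p in combinations(cand, 2))
               for cand in combinations(range(n), k))

def has_independent_set(n, edges, k):
    """The reduction: k vertices are mutually NON-adjacent in G exactly
    when they are mutually adjacent in G's complement, so we just solve
    the transformed clique instance."""
    return has_clique(n, complement(n, edges), k)

# 4-vertex path 0-1-2-3: {0, 2} is independent, but no 3 vertices are.
edges = {frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})}
print(has_independent_set(4, edges, 2), has_independent_set(4, edges, 3))
```

Real NP-completeness reductions (SAT to clique, say) are built from the same ingredients, just with much more elaborate instance transformations.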
Many people think that understanding the exact relationship between the NP-hard and the NP-complete problems would also tell us whether P=NP; others disagree.<\/p>\n<p> Anyway, there are a lot of problems which have been proven NP-complete. So given a new problem,<br \/>\nthere are a lot of different NP-complete problems that you can use as a springboard for proving that<br \/>\nyour new problem is also NP-complete.<\/p>\n<p> If you trace back, you&#8217;ll find that most NP-completeness proofs are ultimately built on top of the NP-completeness proof of one fundamental problem, whose nature makes it particularly appropriate as<br \/>\na universal reduction target &#8211; and it&#8217;s also <a href=\"http:\/\/portal.acm.org\/citation.cfm?coll=GUIDE&amp;dl=GUIDE&amp;id=805047\">the first problem that was proven to be NP-complete.<\/a> It&#8217;s called the propositional satisfiability problem (aka SAT), which was shown to be NP-complete via a rather difficult proof. For any other problem, if we can show that we can translate any instance of a SAT problem <em>to<\/em> an instance of some other problem <em>in polynomial time<\/em>, then that other problem must also be NP-complete (provided it&#8217;s also in NP). And SAT (or one of its simpler variations, 3-SAT) is particularly easy to work with when showing how to translate instances of SAT into instances of other problems.<\/p>\n<p> The SAT problem is a neat one. Take a set of <em>n<\/em> boolean variables,<br \/>\nb<sub>1<\/sub>,&#8230;,b<sub>n<\/sub>, and their complements, not(b<sub>1<\/sub>),&#8230;,not(b<sub>n<\/sub>);<br \/>\nthese are called <em>atoms<\/em>. Now, form an arbitrary logical statement using atoms joined by logical ANDs and ORs. The problem is: can you find an assignment of the boolean variables to true and<br \/>\nfalse so that the statement is true (satisfied)? <\/p>\n<p> The only problem with SAT is that it&#8217;s so general. 
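SAT fits the guess-and-verify view of NP exactly: checking one truth assignment against a formula is fast, but a deterministic search may have to try all 2^n assignments. A minimal sketch (Python; for brevity I'm using the common clause-list encoding, with +i standing for b_i and -i for not(b_i), rather than arbitrary AND\/OR formulas):

```python
from itertools import product

def satisfied(clauses, assignment):
    """Deterministic P-time verification: every clause (a list of literals)
    must contain at least one literal made true by the assignment."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

def brute_force_sat(clauses, n):
    """Deterministic search: up to 2**n truth assignments in the worst case.
    A non-deterministic machine would 'guess' an assignment and just verify."""
    for bits in product([False, True], repeat=n):
        assignment = dict(enumerate(bits, start=1))   # variable i -> value
        if satisfied(clauses, assignment):
            return assignment
    return None

# (b1 or not b2) and (not b1 or b2) and (b1 or b2)
clauses = [[1, -2], [-1, 2], [1, 2]]
print(brute_force_sat(clauses, 2))   # {1: True, 2: True}
```

Each call to `satisfied` is polynomial in the size of the formula; it's only the outer loop over assignments that blows up exponentially.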
The set of all logical equations consisting of<br \/>\nan arbitrary set of atoms joined together in any syntactically valid form &#8211; it&#8217;s an enormous set; but worse, it&#8217;s got a huge amount of structural variation. To work with it, and show a P-time reduction from SAT to some other problem, and be absolutely sure that you&#8217;ve covered every possible case &#8211; that&#8217;s a tough job. Too tough; it&#8217;s awfully easy to miss just one little case, and missing one case can<br \/>\ngive you a reduction that looks valid but silently fails on the cases you missed.<\/p>\n<p> What we typically do, then, is use a constrained version of SAT. Any arbitrary SAT equation can be<br \/>\nP-time transformed into an equivalent equation in a particular form consisting of groups of 3 atoms<br \/>\n(triples), where the members of a triple are joined by logical &#8220;OR&#8221;s, and triples are joined by logical<br \/>\n&#8220;AND&#8221;s. Solving the SAT problem for equations in the constrained triple form is called 3-SAT. 3-SAT is NP-complete, just like the general SAT problem; and because it is so constrained, 3-SAT is a much easier basis for constructing reduction-based NP-completeness proofs.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Now that we&#8217;ve gone through a very basic introduction to computational complexity, we&#8217;re ready to take a high-level glimpse at some of the more interesting things that arise from it. 
The one that you&#8217;ll hear about most often is &#8220;P vs NP&#8221;, which is what I&#8217;m going to write about today.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[80],"tags":[],"class_list":["post-267","post","type-post","status-publish","format-standard","hentry","category-computational-complexity"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-4j","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/267","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=267"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/267\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=267"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=267"},{"taxonomy":"post_tag","embedd
able":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=267"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}