{"id":535,"date":"2007-10-24T09:59:33","date_gmt":"2007-10-24T09:59:33","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2007\/10\/24\/making-graph-algorithms-fast-using-strongly-connected-components\/"},"modified":"2007-10-24T09:59:33","modified_gmt":"2007-10-24T09:59:33","slug":"making-graph-algorithms-fast-using-strongly-connected-components","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2007\/10\/24\/making-graph-algorithms-fast-using-strongly-connected-components\/","title":{"rendered":"Making Graph Algorithms Fast, using Strongly Connected Components"},"content":{"rendered":"<p> One of the problems with many of the interesting graph algorithms is that they&#8217;re complex. As graphs get large, computations over them can become extremely slow. For example, graph coloring is NP-complete &#8211; so the time to run a true optimal graph coloring algorithm on an arbitrary graph grows exponentially in the size of the graph!<\/p>\n<p> So, quite naturally, we look for ways to make it faster. We&#8217;ve already talked about using<br \/>\nheuristics to get fast approximations. But what if we really need the true optimum? In that case, the other approach to making it faster is parallelism: finding some way to subdivide<br \/>\nthe problem into smaller pieces which can be executed at the same time. This is obviously a limited<br \/>\ntechnique &#8211; you can&#8217;t stop the exponential growth in execution time by adding a specific finite<br \/>\nnumber of processors. But for particular problems, parallelism can make the difference between<br \/>\nbeing able to do something fast enough to be useful, and not being able to. For a non-graph<br \/>\ntheoretic but compelling example, there&#8217;s been work in something called <em>microforecasting<\/em>, which is precise weather forecasting for extremely small areas. 
It&#8217;s useful for predicting things like severe lightning storms in local areas during events where there are large numbers of people. (For example, it was used at the Atlanta Olympics.) Microforecasting takes as input a huge amount of information about local conditions &#8211; temperature, air pressure, wind direction, wind velocity, humidity, and such &#8211; and computes a short-term forecast for that area. Microforecasting is completely useless if it takes longer to compute the forecast than it takes for the actual weather to develop. If you can find a way to split the computation among a hundred processors, and get a forecast in 5 minutes, you&#8217;ve got something really useful. If you have to run it on a single processor, and it takes 2 hours &#8211; well, by then, any storm<br \/>\nthat the forecast predicts is already over.<\/p>\n<p> For graphs, there&#8217;s a very natural way of decomposing problems for parallelization. Many complex graphs for real problems can be divided into a set of subgraphs, which can be handled in parallel. In<br \/>\nfact, this can happen on multiple levels: graphs can be divided into subgraphs, which can be divided<br \/>\ninto further subgraphs. But for now, we&#8217;ll mainly focus on doing it once &#8211; one decomposition of<br \/>\nthe graph into subgraphs for parallelization.<\/p>\n<p><!--more--><\/p>\n<p> For any graph, directed or undirected, we have a notion of <em>connectedness<\/em>. A graph is<br \/>\nconnected if, for any two vertices A and B in the graph, there&#8217;s a path between them. For directed<br \/>\ngraphs, we can define a closely related concept: strong connection. Two vertices v<sub>1<\/sub> and v<sub>2<\/sub> in a digraph are <em>strongly connected<\/em> if and only if there&#8217;s a path between them in each direction &#8211; that is,<br \/>\nif there&#8217;s a path from v<sub>1<\/sub> to v<sub>2<\/sub>, <em>and<\/em> there&#8217;s also a path<br \/>\nfrom v<sub>2<\/sub> to v<sub>1<\/sub>. 
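<\/p>
<p> As a concrete (and deliberately naive) illustration, here&#8217;s a small Python sketch that tests whether two vertices are strongly connected by checking reachability in both directions. The dict-of-successor-lists representation is just a choice for this example:<\/p>

```python
from collections import deque

def reachable(graph, src, dst):
    # Breadth-first search along directed edges.
    # `graph` maps each vertex to a list of its successors.
    seen, queue = {src}, deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            return True
        for w in graph.get(v, []):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return False

def strongly_connected(graph, v1, v2):
    # v1 and v2 are strongly connected iff each can reach the other.
    return reachable(graph, v1, v2) and reachable(graph, v2, v1)

# A 3-cycle with one dangling vertex: a, b, c are pairwise strongly
# connected, while d is reachable from c but has no path back.
g = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}
```

<p> 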
The digraph is <em>strongly connected<\/em> if and only if every pair of vertices in it is strongly connected.<\/p>\n<p> Now, suppose we have a digraph which is connected, but <em>not<\/em> strongly connected. Then<br \/>\none of the things we can do with that graph is look at ways of partitioning the graph into<br \/>\nstrongly-connected subgraphs. There are a bunch of reasons why doing that is useful: for example, there are many algorithms that can be implemented in a divide-and-conquer fashion by identifying the<br \/>\nstrongly connected components of the graph, performing the algorithm on those components, and then merging the results.<\/p>\n<p> The way that works is pretty simple. You find the strongly connected components, and contract each of<br \/>\nthem to a single vertex. You end up with a new graph. You can compute properties of the graph by<br \/>\ndividing it into the strongly connected components &#8211; computing them for each of the components, and then merging the results by using the contracted graph. If the graph is large enough, you can divide that contracted graph into its own subgraphs in the same way.<\/p>\n<p> For example, you can use this for graph coloring. Find the strongly connected components &#8211; and color those. Then you can merge the coloring results by doing a join of the strongly connected components, adding colors when necessary at the joints. <\/p>\n<p> For example, look at the graph below. 
We can divide it into its strongly connected components, which I&#8217;ve marked off by dotted lines.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" alt=\"connected-components-pre.png\" src=\"https:\/\/i0.wp.com\/scientopia.org\/img-archive\/goodmath\/img_287.png?resize=470%2C308\" width=\"470\" height=\"308\" \/><\/p>\n<p> Now, we can color each of the strongly connected components separately, producing the graph below.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" alt=\"particolor.png\" src=\"https:\/\/i0.wp.com\/scientopia.org\/img-archive\/goodmath\/img_288.png?resize=469%2C293\" width=\"469\" height=\"293\" \/><\/p>\n<p> Then we go back to the full graph &#8211; but we look only at the edges that join<br \/>\nthe strongly connected components. And we re-shuffle colors to avoid conflicts where we&#8217;re joining subgraphs. When we can&#8217;t avoid a conflict by re-shuffling, we add a color &#8211; but that isn&#8217;t necessary in this case. The end-result is shown below. <\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" alt=\"fully-colored.png\" src=\"https:\/\/i0.wp.com\/scientopia.org\/img-archive\/goodmath\/img_289.png?resize=469%2C293\" width=\"469\" height=\"293\" \/><\/p>\n<p> Doing it this way has two advantages. One is the fact that it&#8217;s so easily parallelizable: each of the strongly connected components can be colored simultaneously, and then<br \/>\nthe results merged. In terms of the common current models of parallel programming, this &#8211; like many other graph algorithms built using subdivision into strongly connected components &#8211; is a perfect candidate for implementation as a <a href=\"http:\/\/en.wikipedia.org\/wiki\/MapReduce\">map-reduce<\/a>.<\/p>\n<p> The other advantage is that it&#8217;s actually a potential complexity reducer. 
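<\/p>
<p> The color-the-components-then-merge scheme can be sketched in Python. This is my own illustrative version, not the exact procedure used for the pictures above: it assumes the components have already been found (finding them is the subject of the next post), uses a simple greedy coloring within each component, and re-shuffles at the joining edges by recoloring one endpoint, adding a new color only when forced:<\/p>

```python
def greedy_color(vertices, adj):
    # Give each vertex the smallest color not used by an
    # already-colored neighbor (edges treated as undirected).
    color = {}
    for v in vertices:
        used = {color[w] for w in adj.get(v, set()) if w in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

def color_via_components(edges, components):
    # Undirected adjacency map built from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Step 1: color each component using only its internal edges.
    color = {}
    for comp in components:
        members = set(comp)
        internal = {v: adj.get(v, set()) & members for v in members}
        color.update(greedy_color(sorted(members), internal))
    # Step 2: merge. At each edge joining two components whose
    # endpoints clash, recolor one endpoint with the smallest color
    # unused by *any* of its neighbors; a brand-new color appears
    # only if every existing color occurs in that neighborhood.
    which = {v: i for i, comp in enumerate(components) for v in comp}
    for u, v in edges:
        if which[u] != which[v] and color[u] == color[v]:
            used = {color[w] for w in adj[v]}
            c = 0
            while c in used:
                c += 1
            color[v] = c
    return color
```

<p> Because a recolored endpoint is checked against <em>all<\/em> of its neighbors, a single merge pass leaves every edge properly colored &#8211; though, like any greedy scheme, it may use more colors than the true optimum.<\/p>
<p> 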
That is,<br \/>\ndoing this sequentially &#8211; finding a coloring for the strongly connected components and then merging &#8211; is an application of a useful heuristic that can, in general, speed up<br \/>\nthe graph coloring. In some cases, the merging of component colorings can be expensive enough to outweigh the savings of doing a group of smaller colorings, but<br \/>\nfor most graphs that appear in real problems, the component coloring and merge is a faster way of finding the optimal coloring.<\/p>\n<p> Next post, I&#8217;ll talk about algorithms for finding the strongly connected components: after all, using SCCs as a tool in this way only makes sense if we can find the SCCs quickly!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the problems with many of the interesting graph algorithms is that they&#8217;re complex. As graphs get large, computations over them can become extremely slow. For example, graph coloring is NP-complete &#8211; so the time to run a true optimal graph coloring algorithm on an arbitrary graph grows exponentially in the size of the 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[54],"tags":[],"class_list":["post-535","post","type-post","status-publish","format-standard","hentry","category-programming"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-8D","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/535","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=535"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/535\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=535"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=535"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=535"}],"curies":[{"name":"wp","href":"https:\/\/a
pi.w.org\/{rel}","templated":true}]}}