{"id":502,"date":"2007-09-03T20:53:24","date_gmt":"2007-09-03T20:53:24","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2007\/09\/03\/amortized-complexity-a-tool-for-graph-algorithms-among-others\/"},"modified":"2007-09-03T20:53:24","modified_gmt":"2007-09-03T20:53:24","slug":"amortized-complexity-a-tool-for-graph-algorithms-among-others","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2007\/09\/03\/amortized-complexity-a-tool-for-graph-algorithms-among-others\/","title":{"rendered":"Amortized Complexity &#8211; a Tool for Graph Algorithms (among others)"},"content":{"rendered":"<p> There are a lot of very cool problems in computer science that can be solved by using<br \/>\nan appropriate data structure; and the data structures are often easiest to describe in terms<br \/>\nof graphs. And of those data structures, one thing that often comes up is <em>amortized algorithmic complexity. <\/em> Amortized complexity is something which has been occupying my thoughts lately,<br \/>\nbecause it&#8217;s come up in several real problems, so I&#8217;m in the mood to write about it, and it&#8217;ll be<br \/>\nuseful later.<\/p>\n<p> The idea of amortized complexity is that for some structures, the worst case complexity<br \/>\ncost of a <em>series<\/em> of operations is different from the worst-case complexity of a single operation. In amortized complexity, you consider cases where some operation is inexpensive most of the time &#8211; but to keep it inexpensive most of the time, you need to periodically do something expensive.<\/p>\n<p><!--more--><\/p>\n<p> As usual, the idea is clearer by example. Suppose you want to have a list of things. You want to be able to access any element by position quickly. You also want to be able to quickly add new things at the end of the list.<\/p>\n<p> If you didn&#8217;t want to be able to add things to the end of the list &#8211; that is, you knew the size of the list in advance &#8211; then it would be easy. You could just use an array, a sequence of values stored in adjacent memory locations. Then accessing an element by position is simple &#8211; just take the location of the beginning of the array, and add to it the position of the element you want. Updating is the same thing &#8211; just store things in the position. For example, in C, for a list of integers:<\/p>\n<pre>\ntypedef int* int_list;\nint_list create_list(int size) {\nreturn (int*) malloc(size*sizeof(int));\n}\nint get_from_list(int_list list, int position) {\nreturn *(list + position);\n}\nvoid set_in_list(int_list list, int position, int value) {\n*(list + position) = value;\n}\n<\/pre>\n<p> But if you want to be able to extend the list, then you can&#8217;t just use a simple array. An array gives you fixed length. So what do you do? You pull out the old classic hacking aphorism: &#8220;When all else fails, add indirection&#8221;. Instead of using an array, you use a structure containing a pointer to an array,<br \/>\nand a length indicator. That way, you can replace the array with a longer one when you need to. Done<br \/>\nnaively, you get something like:<\/p>\n<pre>\ntypedef struct s_list {\nint length;\nint *array;\n} *list;\nlist createList(int size) {\nlist l = (l) malloc(sizeof(struct s_list));\nl.length = size;\nl.array = (int *)malloc(size * sizeof(int));\n}\nint get_from_list(list l, int pos) {\nreturn *(l-&gt;array + pos);\n}\nvoid add_to_list(list l, int val) {\n\/* allocate a new array, and copy the old volues into it. *\/\nint *new_array = (int *)malloc((size + 1) * sizeof(int));\nfor (int i=0; i &lt; l-&gt;length; i++) {\n*(new_array + i) = *(l-&gt;array + i);\n}\n*(new_array + size) = val;\n\/* replace the old array *\/\nl-&gt;array = new_array;\nl-&gt;length++;\n}\n<\/pre>\n<p> Now, this obviously stinks. Every time you add an element to the list, you need<br \/>\nto copy the entire list. So adding an element to a list has complexity O(N) (where N is the size of the list); starting from an empty list and adding N elements is O(N<sup>2<\/sup>). That&#8217;s not good.<\/p>\n<p> The problem is, how can you do better? If you stop using an array, then you&#8217;ll lose the<br \/>\nfast retrieval and update by position. If you keep using an array, and you don&#8217;t know its<br \/>\nmaximum size, you&#8217;re stuck &#8211; you&#8217;re eventually going to need to copy the array, and copying<br \/>\nthe array is O(n) &#8211; so the worst case performance of adding to the array is O(n).<\/p>\n<p>\t The trick is to consider a <em>series<\/em> of operations. Suppose that you use the following<br \/>\ntrick:<\/p>\n<ul>\n<li> Keep the real length of the array, and the apparent length of the list separately;<\/li>\n<li> Each time you add an element, if the real length of the array is larger than<br \/>\nthe length of the list, add the value in an unused part of the array, and<br \/>\nadd one to the apparent length of the list.<\/li>\n<li> If you want to add an element, and the size of the array is the same as the size of the<br \/>\nlist, then re-allocate the array, making it <em>double<\/em> the size of the old array,<br \/>\nand then copy the elements to the new array.<\/li>\n<\/ul>\n<pre>\ntypedef struct s_elist {\nint list_length;\nint array_size;\nint *array;\n} *elist;\nvoid add_to_elist(elist el, int val) {\nif (el-&gt;list_length &lt; el-&gt;array_size - 1) {\n*(el-&gt;array + el-&gt;list_length) = val;\nel-&gt;list_length++;\n} else {\nint new_size = el-&gt;array_size * 2;\nint *new_array = (int *)malloc(new_size * sizeof(int));\nfor (int i=0; i &lt; el-&gt;list_length; i++) {\n*(new_array + i) = *(el-&gt;array + i);\n}\n*(new_array + el-&gt;array_size) = val;\nel-&gt;array = new_array;\nel-&gt;list_length = new_size;\nel-&gt;list_length++;\n}\n}\n<\/pre>\n<p> What&#8217;s the complexity of that? Well, worst case is O(N), where N is the current size of the list. But suppose you actually added N elements to the list. The complexity is <em>not<\/em><br \/>\nO(N<sup>2<\/sup>). It&#8217;s just O(N). From the time you reallocate the array from size C to size 2&times;C,<br \/>\nyou get to do C insertions with cost 1; and then you do one insert of cost C. So if you&#8217;re looking at the performance of the <em>series<\/em>, you can <em>amortize<\/em> the cost over the operations, and say that<br \/>\nthe amortized cost of each insert is 2; the amortized cost of C inserts is thus 2&times;C &#8211; that is, exactly the actual cost of C inserts.<\/p>\n<p> There are two different ways of working out amortized costs: credit and debt based approaches.<br \/>\nIn a credit based approach, you <em>charge<\/em> a cost to perform an expensive operation; and each time you do the cheap operation, you charge it extra time, <em>earning<\/em> some quantity of time as credit. If you can show that you&#8217;ll never perform the expensive operation without having enough credit to<br \/>\npay for it, then that cost you charged to the cheap operation is the amortized cost.<\/p>\n<p> The credit based approach is similar &#8211; but you do the expensive operation first. So, if you don&#8217;t owe<br \/>\nany time, you can perform the expensive operations. But when you do, you acquire a debt, and while you&#8217;re<br \/>\nin debt, you can&#8217;t perform the expensive operation. Each time you perform the cheap operation, you charge<br \/>\nit more than it actually costs to perform, and pay off part of the debt. If you can show that you never<br \/>\nneed to perform the expensive operation while in debt, then the cost you assigned to the cheap operation<br \/>\nis the amortized cost.<\/p>\n<p> Let&#8217;s look at an example of pre-payment. We start with a list of size N, and start adding. Each<br \/>\ninexpensive operation really costs 1, but we&#8217;ll charge it as 2. So each time we do a cheap add, we get 1<br \/>\nunit of credit. After N additions, we&#8217;ll have accumulated N units of credit. Then we need to reallocate<br \/>\nthe list &#8211; which costs N. So we&#8217;re in the clear. The amortized cost of inserts is 2 &#8211; or O(1).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There are a lot of very cool problems in computer science that can be solved by using an appropriate data structure; and the data structures are often easiest to describe in terms of graphs. And of those data structures, one thing that often comes up is amortized algorithmic complexity. Amortized complexity is something which has [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[80],"tags":[],"class_list":["post-502","post","type-post","status-publish","format-standard","hentry","category-computational-complexity"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-86","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/502","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=502"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/502\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=502"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=502"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=502"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}