{"id":234,"date":"2006-12-06T17:20:01","date_gmt":"2006-12-06T17:20:01","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2006\/12\/06\/a-tree-grows-up-in-haskell-building-a-dictionary-type\/"},"modified":"2006-12-06T17:20:01","modified_gmt":"2006-12-06T17:20:01","slug":"a-tree-grows-up-in-haskell-building-a-dictionary-type","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2006\/12\/06\/a-tree-grows-up-in-haskell-building-a-dictionary-type\/","title":{"rendered":"A Tree Grows Up in Haskell: Building a Dictionary Type"},"content":{"rendered":"<p> Last time around, I walked through the implementation of<br \/>\na very simple binary search tree. The implementation that I showed<br \/>\nwas reasonable, if not great, but it had two major limitations. First,<br \/>\nit uses equality testing for the search, so that the implementation is only really suitable for use as a set; and second, because it&#8217;s such<br \/>\na trivial tree implementation, it&#8217;s very easy for the tree to become<br \/>\nhighly unbalanced, resulting in poor performance.<\/p>\n<p> Today, we&#8217;ll look at how to extend the implementation so that our BST<br \/>\nbecomes useful as a key\/value dictionary type. We&#8217;ll take two different<br \/>\napproaches to that, each of which demonstrates some common Haskell<br \/>\ntechniques: one based on on using higher order functions, and one based on<br \/>\nusing tuples. Balancing the tree will be a task for another day.<\/p>\n<p> As an aside: in a real Haskell program,  you would probably never write a simple type like this. Haskell has a great library, and it&#8217;s got plenty of library types that do a much better job than this. This<br \/>\nis really only for tutorial purposes.<\/p>\n<p><!--more--><\/p>\n<p> Here&#8217;s what we&#8217;d like to do. Instead of just being able to answer the<br \/>\nquestion &#8220;Is the value X in this tree?&#8221;, we want to be able to ask &#8220;Is<br \/>\nthere a value in this tree is associated with X, and if so, what is it?&#8221;.\n<\/p>\n<p> The way that we&#8217;ll do that is to use <em>second-order functions<\/em>.<br \/>\nInstead of using the built in equality operator for comparing search keys<br \/>\nand nodes, we&#8217;ll use a function, which we&#8217;ll pass as a parameter to the<br \/>\nsearch operations.<\/p>\n<p> First, we need to define a type for the comparison function. Given two<br \/>\nvalues &#8211; a key, and a value &#8211; we need to be able to determine whether the<br \/>\nkey is <em>greater than<\/em>, <em>equal to<\/em>, or <em>less than<\/em> the<br \/>\nvalue. Second, given a value stored in the tree, we need to be able to<br \/>\nextract a key from it.<\/p>\n<pre>\n&gt;data Comparison = Greater | Equal | Less\n&gt;\n&gt;type CompareFunc a = a -&gt; a -&gt; Comparison\n&gt;\n&gt;type KeyExtractor a b = a -&gt; b\n<\/pre>\n<p> Now, we can build a new version of the search tree. We&#8217;ll start<br \/>\nwith a type declaration, and the implementation of the insert operation.<\/p>\n<pre>\n&gt;data KeyedSearchTree a = KSTNode a (KeyedSearchTree a) (KeyedSearchTree a)\n&gt;                       | KSTEmpty deriving (Show)\n&gt;\n&gt;kstInsert :: Ord b =&gt; (KeyExtractor a b) -&gt; CompareFunc b -&gt; (KeyedSearchTree a) -&gt; a -&gt;  (KeyedSearchTree a)\n&gt;kstInsert kextract comp (KSTNode k left right) v  =\n&gt;           case (comp (kextract k) (kextract v)) of\n&gt;               Greater -&gt; KSTNode k (kstInsert kextract comp left v) right\n&gt;               Equal -&gt; KSTNode v left right\n&gt;               Less -&gt; KSTNode k left (kstInsert kextract comp right v)\n&gt;kstInsert kextract comp KSTEmpty v = KSTNode v KSTEmpty KSTEmpty\n&gt;\n<\/pre>\n<p> There are two main differences between this and our original version. <\/p>\n<ol>\n<li> In this version, instead of using the builtin &#8220;&gt;&#8221; comparison operator, we use a comparison function passed as a parameter to the<br \/>\ninsert.<\/li>\n<li> Instead of performing comparisons using the <em>entire<\/em><br \/>\nvalues, we perform comparison on <em>keys<\/em> using an extraction function passed as a parameter to the insert.<\/li>\n<\/ol>\n<p> For searching the tree, we want a different kind of operator than<br \/>\nwe used in the generic BST. We don&#8217;t just want to know if there&#8217;s<br \/>\na value with a particular key in the tree; we want to <em>get<\/em><br \/>\nthat value. And that leads us to a problem: What if there is no value<br \/>\nwith the specified key in the tree? What do we return?<\/p>\n<p> As it happens, Haskell has a helpful solution for that. There&#8217;s a type<br \/>\ncalled <code>Maybe<\/code> designed for exactly this purpose: it&#8217;s used for<br \/>\nboth parameters and return values where a value is optional. You can think<br \/>\nof it as being something like a metatype that allows you to distinguish<br \/>\nbetween a type that *must* have a value, and a type that allows null<br \/>\nvalues. A value of type <code>Maybe a<\/code> can be either <code>Just<br \/>\nx<\/code>, where x is of type <code>a<\/code>, or <code>Nothing<\/code>. <code>Maybe<\/code> also happens to be a monad, which gives it some really neat properties that we&#8217;ll see later.<\/p>\n<p> We&#8217;re going to put the two function parameters <em>first<\/em> in the implementation; you&#8217;ll see why in a moment.<\/p>\n<pre>\n&gt;kstSearch :: (KeyExtractor a b) -&gt; (CompareFunc b) -&gt; (KeyedSearchTree a) -&gt; b -&gt; Maybe a\n&gt;kstSearch  keyextract compare (KSTNode v left right) key =\n&gt;         case (compare (keyextract v) key) of\n&gt;             Greater -&gt; kstSearch keyextract compare left key\n&gt;             Equal -&gt; Just v\n&gt;             Less -&gt; kstSearch  keyextract compare right key\n&gt;kstSearch _ _ KSTEmpty _ = Nothing\n&gt;\n<\/pre>\n<p> Before moving on, there&#8217;s one new thing in that implementation: the <code>_<\/code> pattern. In Haskell, that&#8217;s a pattern that means &#8220;I don&#8217;t care&#8221;. It matches any value, but doesn&#8217;t bind it to anything. So you can&#8217;t access the value of a parameter matched by an &#8220;_&#8221;, which makes sense, since by using the &#8220;_&#8221; you&#8217;ve said that you <em>don&#8217;t care<\/em> what value<br \/>\nis used there. Since searching an empty tree will always fail, we don&#8217;t<br \/>\ncare what functions are provided for doing compares, and we don&#8217;t care what key is being searched for. We know the result of the function without<br \/>\never looking at them: nothing can be fund in an empty tree.<\/p>\n<p> To be able to use this new tree, we need to provide a type for its<br \/>\nelements, and the comparison and key-extraction functions for that type.<br \/>\nWe&#8217;ll start with a tuple implementation. In Haskell, tuples are a kind of<br \/>\nfixed-length sequence where the tuple type specifies a distinct type for each of values in the sequence. For example, <code>(1, 3.14, \"foo\")<\/code> is a tuple with three elements (a <em>3-tuple<\/em>), and the first element has type <code>Integer<\/code>, the second has type <code>Float<\/code>, and the third element has type <code>String<\/code>. Tuple types are written in the same way as tuples, but with types instead of values, so our example tuple as a whole has the type <code>(Integer,Float,String)<\/code>.<\/p>\n<p> For a search tree that looks like a typical dictionary, we can use an element type that&#8217;s a simple 2-tuple; the first element of the tuple will be a key, and the second will be the value. We don&#8217;t need to completely specify the types of the two elements of the tuple; we&#8217;ll have a more flexible implementation if we use type variables. For the search tree to work, the key type will need to be an ordered type, so we&#8217;ll constrain it with a type-class clause. <\/p>\n<pre>\n&gt;trivialCompare a b | a &gt; b = Greater\n&gt;            | a <b>            | otherwise = Equal\n&gt;\n&gt;tupleGetkey :: Ord k =&gt; (k,v) -&gt; k\n&gt;tupleGetkey (k,v) = k\n<\/pre>\n<p> Now we get to the reason for putting the function parameters first. Remember how we talked about currying the other day? Here&#8217;s a great example of how you might use it. We&#8217;ll create two new custom insert and search functions for our tuple based binary search tree, by providing<br \/>\nonly the first two parameters (the key extractor and compare functions)  to the generic keyed search tree insert and search functions; the result of doing that will be to return a two-parameter function with the key extract and compare functions internally bound, so that users of our tree don&#8217;t need to worry about the details. To them, the search tree <em>just knows<\/em> how to work with tuples. <\/p>\n<pre>\n&gt;\n&gt;tupleKSTInsert :: Ord k =&gt; KeyedSearchTree (k,v) -&gt; (k,v) -&gt; KeyedSearchTree (k,v)\n&gt;tupleKSTInsert = kstInsert tupleGetkey trivialCompare\n&gt;\n&gt;tupleKSTSearch :: Ord k =&gt; KeyedSearchTree (k,v) -&gt; k -&gt; Maybe (k,v)\n&gt;tupleKSTSearch = kstSearch tupleGetkey trivialCompare\n<\/pre>\n<p> We can get even a bit fancier: we can use another form of type definition to define a tuple tree type, and hide the fact that the<br \/>\nimplementation is even based on our <code>KeyedSearchTree<\/code>. In fact, we can even mask out the fact that it&#8217;s tuples!<\/p>\n<pre>\n&gt;type DictionaryTree k v = KeyedSearchTree (k,v)\n&gt;\n&gt;dtInsert :: (Ord k) =&gt; DictionaryTree k v -&gt; k -&gt; v -&gt; DictionaryTree k v\n&gt;dtInsert tree k v = tupleKSTInsert tree (k,v)\n&gt;\n&gt;dtSearch :: (Ord k) =&gt; DictionaryTree k v -&gt; k -&gt; Maybe v\n&gt;dtSearch tree key =\n&gt;          case (tupleKSTSearch tree key) of\n&gt;             Just (k,v) -&gt; Just v\n&gt;             Nothing -&gt; Nothing\n&gt;\n&gt;dtEmpty = KSTEmpty\n<\/pre>\n<p> Just for the sake of being twisted, here&#8217;s a little bugger to populate<br \/>\na <code>DictionaryTree<\/code> using a list of key\/value tuples. Since a user of a <code>DictionaryTree<\/code> doesn&#8217;t actually know that it&#8217;s got tuples inside, this actually decomposes the tuples from the list in order to call <code>dtInsert<\/code>. (Hey, I <em>said<\/em> it was twisted!)<\/p>\n<pre>\n&gt;populateDict :: Ord k =&gt; [(k,v)] -&gt; DictionaryTree k v\n&gt;populateDict pairs = foldr ( (k,v) t -&gt; dtInsert t k v) dtEmpty pairs\n&gt;\n&gt;exampleDict = populateDict  [(\"a\",4),(\"b\",17),(\"c\",21), (\"abc\",13), (\"bd\", 18)]\n<\/pre>\n<p> And here&#8217;s a quick demonstration of using it in GHCI:<\/p>\n<pre>\nkokopelli:~\/Documents\/Blog markcc$ ghci tree2.lhs\n___         ___ _\n\/ _  \/  \/\/ __(_)\n\/ \/_\/\/ \/_\/ \/ \/  | |      GHC Interactive, version 6.6, for Haskell 98.\n\/ \/_\\\/ __  \/ \/___| |      http:\/\/www.haskell.org\/ghc\/\n____\/\/ \/_\/____\/|_|      Type :? for help.\nLoading package base ... linking ... done.\n[1 of 1] Compiling Main             ( tree2.lhs, interpreted )\nOk, modules loaded: Main.\n*Main&gt;\n*Main&gt; exampleDict\nKSTNode (\"bd\",18) (KSTNode (\"abc\",13) (KSTNode (\"a\",4) KSTEmpty KSTEmpty) (KSTNode (\"b\",17) KSTEmpty KSTEmpty)) (KSTNode (\"c\",21) KSTEmpty KSTEmpty)\n*Main&gt; dtSearch exampleDict \"c\"\nJust 21\n*Main&gt; dtSearch exampleDict \"q\"\nNothing\n*Main&gt;\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Last time around, I walked through the implementation of a very simple binary search tree. The implementation that I showed was reasonable, if not great, but it had two major limitations. First, it uses equality testing for the search, so that the implementation is only really suitable for use as a set; and second, because [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[89],"tags":[],"class_list":["post-234","post","type-post","status-publish","format-standard","hentry","category-haskell"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-3M","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/234","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=234"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/234\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=234"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=234"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=234"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}