Why Haskell?

Before diving in and starting to explain Haskell, I thought it would be good to take a moment and answer the most important question before we start:

**Why should you want to learn Haskell?**

It’s always surprised me how many people *don’t* ask questions like that.
Haskell is a valuable language for a lot of different reasons, but the most important one is that *it
changes the way that you think about programming*. Haskell does things in a very different way from the imperative languages that most of us are familiar with. And it’s an extremely well-designed language, so there isn’t a ton of crap standing between you and the concepts that you can learn from it.

Before I get into what’s so interesting about Haskell, I want to get one thing out of the way. One of the most commonly cited reasons for why Haskell is good is, in my opinion, silly.
The number one thing that you’ll often hear from functional programming afficionados is that Haskell
is *referentially transparent*, which makes it easy to do things like reason about programs, and write correctness proofs of programs. Writing a correctness proof of a C program (or even an Objective CaML program) ranges from difficult to effectively impossible – the presence of side effects in the code is *very* hard to cope with in formal reasoning, and when programs get beyond trivial simplicity, they rapidly become intractable for proofs.
Referential transparency is a good thing. But the reason why it’s good has very little to do with
correctness proofs. The whole “correctness proof” line is silly because *no one writes correctness proofs*. Real programs – programs that solve real problems – are generally large and complex enough that even if you’re using a language like Haskell, writing a correctness proof *still* isn’t practical. Sure, you can easily write a correctness proof for an implementation of a binary search tree in Haskell. You *can* write a correctness proof for that in, say, Java. But the Haskell proof will be easier and cleaner, and you’ll be able to trust the connection between the code and the proof much more than you could in Java. But really, a good set of tests – which you *should* be writing, whether you’re programming in Java, C, or Haskell – does as good a job of verifying the correctness of something simple like that. And for anything significantly more complicated than that, people just *don’t* write proofs. I’ve certainly *never* seen a correctness proof for a Haskell program, except in papers arguing about how provable correctness is in Haskell. (Does GHC have correctness proofs? I doubt it. It’s certainly had quite its share of bugs, which means that either it’s correctness isn’t proven, or the proof is wrong!)
So what makes Haskell so wonderful? Or, to ask the question in a slightly better way: what is so great about the pure functional programming model as exemplified by Haskell?
The answer is simple: *glue*.
Languages like Haskell have absolutely *amazing* support for modular development. You build the basic semantics of your system in terms of primitive functions, and then the language provides tons of very elegant *glue* for putting the pieces together. In an imperative language, the fact that the order of execution is highly constrained by side effects means that the ways of connecting components are limited by the fact that you could break things if the glue doesn’t work quite right.
But the basic properties of Haskell make it much easier to build generic glue. Lazy evaluation
combined with no side effects means that many of the problems that make generic glue difficult in
imperative languages just simple don’t exist in Haskell. You don’t need to worry about whether the
iteration glue will correctly ensure that array element *n* is updated before array element *n+1*; lazy evaluation will take care of making sure that the data dependency order is respected. And the fact that building new functions, and currying existing ones is a lightweight, syntactically trivial operation means that the glue can be written clearly and concisely.
The Monad concept, which I’ll get to in a later post, is an amazing example of this. Monads are a basic tool for *sequencing*: when things need to be done in sequence, the Monad constructs allow you to make that happen. But there’s not just *one* monad that does sequencing. There’s a basic *concept* of what a monad is, and *anything* you implement that fits that concept can use *all* of the monad tools provided by a language. So I can use the same constructs for writing sequencing code for list processing, for performing IO, for writing numeric code that does updates to mutable arrays, for combining non-deterministic computations, and more.
A very typical example of that elegance of glue in Haskell is this implementation of quicksort:
qsort [] = []
qsort (x:xs) = qsort smaller ++ [x] ++ qsort bigger
where smaller = filter (=x) xs
You can probably understand that code even *without* knowing anything about Haskell. What it says is:
1. The result of sorting an empty list is an empty list.
2. To sort anything bigger than an empty list, do a divide-and-conquer:
1. Get the first element of the list, `x`, and use it to divide the list into two sublists: things smaller than `x`, and things bigger than `x`.
2. The sorted list is the result of concatenating the results of sorting the list of smaller elements, `x`, and the results of sorting the list of bigger elements.
3. To get the list of smaller elements, select the list of elements from the list that are less than `x`; and to get the list of bigger elements, select the elements from the list that are greater than or equal to `x`.
This uses a generic concatenation function (“`++`”), a generic filter function for selecting the sublists (“`filter`”), and a simple currying operation to generate the comparison functions for using filter to create the sublists. (“`(<x)`" is Haskell notation for the currying of "`<`"; it's equivalent to "((λ x . (λ y . y < x) x)" in lambda calculus. For the purposes of what I've been discussing, the concatenation and filter operations are typical Haskell glue, and the
"`where`" clause is a typical Haskell idiom for splitting things up to make the glue
easier to read.
There's a lot more about Haskell that's great like the strong type system, the Monad constructs,
user definable operators, type classes, and more. But in my own humble opinion, the thing that makes it all really come together is the way that the language makes it so easy to build big complex things by gluing together small simple pieces.

0 thoughts on “Why Haskell?

  1. Mark C. Chu-Carroll

    I took the initiative of fixing an HTML error in your comment; greater than and less than signs need to be written as HTML entities.
    Anyway – no, (< x) isn’t quite the same as (λ y . y < x); in (λ y . y < x), the “x” is free. But in (< x), “x” is bound. To make “x” bound in the function without introducing a lot of funny semantics about environments (which aren’t necessary in a language without assignment) is to do the expansion that I showed.

  2. Sean

    I see a lot of introductory-type articles like this for good but lesser-known languages such as Haskell. A lot of them go on, as you do, about how language X is awesome when building large systems and then proceed to show some tiny function (like Quicksort in this case) as evidence. It’d be nice to see an actual large scale example rather just blindly accepting the assertion that it’s all peaches and cream. An example where there’s a comparable program written in, say, C++ or Java or something, would be nice because then we who haven’t used the language can take a look at the two versions side-by-side and see where things are different and for what reasons.
    In the above case, I’m not very familiar with Haskell, but something that frequently comes up in real life is the need to sort by some alternate criteria than, say, less-than. I’d like to see how Haskell handles such a situation using the above quicksort implementation and modifying it or whatever is necessary in order to support sorting by some arbitrary rule, or a specific property of an object, or something like that. (I release that reverse sorting could simply be done by reversing the resulting list – but I’m more interested in how you might sort by a property of an object or a custom function or something along those lines.)

  3. Mark C. Chu-Carroll

    Patience please! I haven’t even really started to introduce the language yet. I promise that I will get to the point of showing some real, complicated examples of glue for non-trivial applications. But I need to teach the basics of the language first!

  4. Dan Glick

    Question from an imperative programmer: what memory allocations actually happen when this qsort algorithm is executed? At each recursive step, are we allocating three new lists (one for smaller, one for bigger, and one for the result of the concatenation)?
    I assume that’s dependent on the specific Haskell implementation; but for real-world problems, this would seem to make a significant difference.

  5. Mark C. Chu-Carroll

    As you suggest, it really depends on the Haskell compiler.
    Most Haskell compilers (maybe all?) won’t transform it to an in-place sort, but the good ones should be capable of recognizing that once the partition is over, the original list is never re-used, and therefore using a two-copy strategy for generating code: allocate two lists equal in length to list to be sorted; first partition copies from L1 to L2; next L2 to L1, and so on.

  6. John Armstrong

    Just to see if I’m on the right track: “Monad” in Haskell talk is supposed to be the same as (or at least a close cousin of) “monad” in topoi? The thing they used to call a “triple”?

  7. Tomas Studva

    that are myths. In reality it is hard to write applications in functional languages, but it is possible to write small programs(or critical sections) and I agree that it is suitable for proofs(many proofing and auto proofing systems are based on logic or functional languages).
    First, I don’t think functionality is more closer to human thinking (functional = recursive, imperative = structured).
    Second, I haven’t seen good functional language, I don’t know haskell, but I’ve learned Lisp and CL. Another famous are ML based, for example F# from Microsoft. Problems in standard functional language: they aren’t object oriented. They are not typed, there is very few programming security(private, public, …).
    But I think it is possible to use such languages, but it requires to be language developed in more practical style.

  8. Mark C. Chu-Carroll

    I like Hugs for its interactivity, but I’m using the EclipseFP plugin, which uses Hugs for interactive mode, and GHC for compilation, so every example I show should definitely work in both.

  9. Mark C. Chu-Carroll

    Pardon me if I sound rude, but you definitely are not very-well clued when it comes to functional languages. Talking about shortcomings of Lisp isn’t relevant to a discussion of modern functional languages.
    Most people haven’t considered Lisp a functional language for years, because it’s so dependent on non-functional behavior. The real functional languages for about the last 20 years have been *very* strongly typed. Much more strongly typed than Java, C++, C#, or anything in that vein.
    Functional languages are *not* object-oriented – precisely because functional programming is an *alternative* to OO. The main things that object-orientation gives you are modules, state encapsulation, and polymorphism. Haskell (and most other functional languages) provide strong module systems with excellent (and flexible) security; Haskell is designed to avoid the use of state, and encapsulates it via monads when necessary; and Haskell provides a really beautiful mechanism called type classes, which provides a great mechanism for polymorphism.
    Haskell and similar languages *do* provide support for modular security. One of the fundamental constructs of the language is the module – and module-based security is *better* than object-based; instead of “public/private/protected”, you can provide *as many security modes as makes sense for your module.*

  10. Doub

    “((λ x . (λ y . y < x) x)”
    There are more opening parentheses than closing ones. If that’s a language feature it seems quite confusing. If its a missing parentheses, I guess it comes just before the last x.
    ((λ a . (λ b . b < a)) x) would be clearer to novices (like me) than ((λ x . (λ y . y < x)) x).

  11. u221e

    Sorry Mark, but I just don’t dig the lazy evaluation that Haskell uses. Then there are the monads…
    About CL not being OO, I’m surprised that you haven’t heard of CLOS (Common Lisp Object System). It does OO programming a hell of a lot better then C++ that’s for sure (but what doesn’t?).

  12. Pseudonym

    First off, Alexei is right. (<x) is translated to y -> y < x, where y is a variable which does not occur in x. (This is how you write λy. y < x in Haskell.) Mark’s translation finesses the problem of name capture, but the fact is that a Haskell compiler can always manufacture a new name.
    It gets confusing because infix operators are themselves shorthand for function application in Haskell. I won’t preempt Mark, because I’m sure he’ll mention this in due time.
    Tomas: As Mark said, functional programming is an alternative to OO programming in that they solve the same problem. The reason why large amounts of C code can get quite unmaintainable is global state. Pure functional languages like Haskell solve the problem by having no mutable state at all, but providing the tools to simulate it if it turns out you really need it. Object-oriented languages like C++ solve this by using lots of local states instead. (Erlang, while not object-oriented, has a similar approach, using threads each of which has a local state. Erlang takes this to an extreme by having no global heap, only thread-local heaps with inter-thread communication primitives. I digress.)
    Haskell does have a very expressive form of OO programming via typeclasses. Again, I won’t preempt Mark too much, but I’ll just say that in C++ or Java, the keyword class actually means three separate things: It declares a set (“class”) of types, it declares a concrete type, and it declares that the concrete type is a member of that set of types. In Haskell, these are three separate steps.
    The lack of traditional OO-ness hasn’t proved a problem for Haskell for the most part, as there are plenty of big-enough applications which do the job. Component systems, such as CORBA or DCOM, can be done well enough using typeclasses. There’s really only one thing that OO does better and that’s modern GUI programming.
    I don’t think it’s a coincidence that modern GUIs and OO languages grew up together, because they’re a very natural fit. Having said that, there are a number of interesting approaches, such as Clean’s Object-IO (based on linear types) and Haskell’s Fudgets (based on Freyd categories, also known in Haskell as “arrows”). You certainly can get GUI stuff done in Haskell, but it’s fair to say that the “right” solution is still a research problem.
    I occasionally hear the claim that OO programming is “closer to human thinking” than other approaches, but I’m yet to hear any data to back that up. I think that OO’s main strength is that it comes with a design technique, OOAD. The fact is, though, that all nontrivial OO programs go outside OO at some point, using other paradigms such as generics (e.g. C++ templates) or relational modelling (in the form of calling out to SQL-based databases).
    I think you’ll like Haskell if you try it. I used to be one of those people who flamed C++ until I actually wrote a real program in it. Then I was hooked. Many people report the same of Haskell.
    And if nothing else, learning Haskell will make your other-language programs better.

  13. Pseudonym

    Oh, one more thing.
    The qsort algorithm given above is quite inefficient in both transient memory allocations and run-time complexity. Quick sort is not designed for linked lists, it’s designed for in-place mutable data structures. If you wanted to write an efficient sort in Haskell, you’d be more likely to use something like merge sort.
    Even if you were tempted to write quick sort on lists, a seasoned Haskell programmer would not write it that way. More on this when the time comes…

  14. Mark C. Chu-Carroll

    I didn’t say Lisp didn’t include OO; what I said what that Lisp isn’t considered a functional language by most people. I’m actually a big fan of CLOS, although I tend to prefer
    CLOS-like implementations in Scheme; CL gets on my nerves something awful.

  15. Mark C. Chu-Carroll

    Pseudonym is quite right that the quicksort I showed as an example is not really how a good Haskell program would write it – the algorithm is a poor match for the data structure. But
    it’s a good as a very simple example of how you glue things together in haskell. It’s almost identical to a standard textbook pseudo-code description of quicksort, and that’s what I find compelling about it as an example. (And about a lot of Haskell code; even very sophisticated code for big systems generally reads very smoothly – moreso than for any other language I’ve seen.)
    Haskell’s not my favorite language, but I’ve got to admit that there’s something very amazing and compelling about
    its kind of expressiveness. It makes decomposing a system into small parts, and then gluing those parts together into such an easy, natural process.

  16. Koray

    MarkCC, since you wrote on lambda calculus before, I hope you get into the (officially unspecified) operational semantics of Haskell, something like starting with the main function and tracing how things are evaluated.

  17. u221e

    The CLOS comment was aimed at Tomas, sorry about the confusion.
    Even though I’m not into Haskell anymore I will say that it’s a beautifully designed language, just not quite in the way I prefer.

  18. mgsloan

    (in reply to “sort by some alternate criteria than, say, less-than. I’d like to see how Haskell handles such a situation”)
    I’d use the base libs sortBy function in reality, however just for you:
    qsortBy _ [] = []
    qsortBy f (x:xs) = qsortBy f smaller ++ [x] ++ qsortBy f bigger
    where smaller = filter (y -> (f y x) == LT) xs
    bigger = filter (y -> (f y x) == GT || (f y x) == EQ) xs
    This will indeed handle any sorting order you can imagine, on any type.
    Mark C(author):
    Nice article, definitely glue is what makes functional programming languages, well, functional. I’d give some props to John Hughes’s paper “Why Functional Programming Matters”, a popular paper which makes basically the same point about glue (even calls it glue), but in more detail.
    That is also a pretty bad example of glue – it only uses one other function. Also, it’s already implemented in Data.List. Haskellers make a point of actually using libs. Sometimes you might write a rather general function, do a search (http://haskell.org/hoogle/
    ) on its type, and find the function you just wrote (though usually better, and better tested).
    Also, your lambda calculus expression has an extra opening paren, and serves no purpose but to confuse, and make you look smart (well, until you notice the extra paren). Just express it in regular haskell, (y -> y

  19. mgsloan

    Hmm, looks like the crappy blog software botched my haskell lamdba. lets try again, with xml literals.
    (y -> y &lt x)

  20. Mark C. Chu-Carroll

    Rehashing a *good* statement isn’t necessarily a bad thing. I haven’t seen Hughes’ paper (if I had, I would have linked it),
    but the glue property of functional languages is really fundamental – if I’d missed that in my understanding of Haskell, it would just mean that I didn’t get Haskell. And making things accessible is my major goal on this blog – I’m not pretending that anything I’m writing here is new, profound, or even particularly original. What I try to do is make stuff that I think is cool and exciting accessible to other people, so that they can see why it’s cool and exciting.
    My point in the original article with the lambda expression was to try to avoid one confusion that bit me once upon a time. What confused me at the time was that functional language folks talk about referential transparency and state independence and all that – but then all of the anonymous functions *capture* a kind of state – their lexical environment – and then carry it with them as they’re passed around. That bugged me – I didn’t get it, until someone (I wish I remember who!) pointed out that in terms of primitive lambda calculus, the
    lexical scope capture was really a shorthand for a new lambda-binding implicit in the function definition. In writing this article, I thought it was clearer to make the way that the lexical scope captures the surrounding bindngs explicit.
    And finally – it’s not a terribly deep example of glue, but it’s *something* that demonstrates the basic idea of what I was saying, and shows a taste of Haskell makes code clear. Given that I haven’t really introduced *any* Haskell syntax or semantics yet, it’s pretty hard to show a more interesting example of glue that can be comprehended by people who don’t already know Haskell. Showing something like parser combinators, which are an amazing example of building interesting things with glue, just wouldn’t be appropriate until I’ve covered enough stuff so that all of the people reading the blog could understand it.

  21. mgsloan

    Fair enough. The reason I figured the lambda calc stuff might be confusing is that few know it, and very few know it yet don’t know haskell (or some related language). It’s also not really required for understanding haskell to a deep level.
    On the glue, yes, I suppose it is difficult to have profound examples of glue when the audience is assumed to not yet be knowledgeable. I was actually thinking of more interesting list actions, those using multiple prelude/data.list commands, however I suppose qsort is good because most already know it. I should have read more carefully and caught the fact that it’s the ‘first of many’ haskell articles.
    Good luck with the rest!

  22. Keith

    For those interested in a bigger haskell app,
    have a look at darcs.net
    it’s a cool source control system, plus it’s an open source haskell app

  23. Flaky

    I haven’t had the time to learn Haskell, but I’m quite fascinated with the concept of having no mutable state, so I’m looking forward to Mark’s future articles.
    As it seems that parallel programming will become the norm rather than the exception in the near future, I’d like to know what you think of the possibility of using a language with this no mutable state property in multithreaded programming. It seems to me that it could solve many of the problems due to faulty synchronization between threads, since read-access-only does not need any.

  24. Tomas Studva

    I am pleased 🙂 with the discussion. What is discussion if there is no other side opinion. Thanks for replies.
    I must something explain. I’ve used Scheme lisp R5RS standard.
    The CL stands for Clausal Language and is academic logic programming language and proofing system, based on natural numbers and pair function. Suitable for learning and proofing.
    I know lisp has object oriented implementation, but that is not truly lisp for me.
    I don’t like c++, awful. I like Java and even more C#.
    Yes, new declarative/functional programming languages have modern features. And declarative programming is in great demand, even in .Net and Java, but not as an alternative to Java.
    So, if Haskel is so excellent, why is not so popular? OK, gui is problem, but what anything other? Maybe the OOAM as said Pseudonym? I don’t know. I know that objects are very good to meaningfully divide responsibility. But something is faked in OO, for example logging, that must be every where. Solution = AOP = Aspect Oriented Programming, practical, but nothing nice.
    At the end, I must admit, I read very less about Haskell, but Haskell was on my learning list, but today it is not possible. So words “that are myths” were too strong.
    Regards Tomas

  25. Tomas Studva

    One more comment. Why to use Haskell (I’ve taken some tutorial)? Because there you make less bugs and code is written fast. If this is true, it rocks. That was the main thing, I’ve found in Java, migrating from C++.

  26. Mikael Johansson

    John Armstrong: You already got the long answer. The short one is that the Monad type class is supposed to follow a set of axioms that are equivalent to the classical triple/monad/triad/whateveryoumaycallit category theory axioms.

  27. Harald Korneliussen

    What, mergesort is easier on lists than quicksort? I don’t quite see how… in mergesort, you have to divide the list/array in the middle, and for linked lists that means counting forwards to it. With quicksort, you don’t have to do that, so, shouldn’t it be faster in practice?
    I like Haskell, I’m trying to learn Haskell, but… As I see it, Haskell’s system of type classes is a way of structuring your application around your data, and as such it is a form of object-orientation. Implementation inheritance is neither needed nor sufficient for OO, it’s interface inheritance that really matters. And Java supports those, and Haskell too, in what is probably a more extensible fashion (from what I’ve seen so far).
    Also, higher order functions have been around a long time, and the revolution hasn’t come yet. I can think of a couple of more or less plausible reasons for that:
    1. Efficiency. There is always some loss of efficiency, just look at the example above: filter is called twice, once it gets out (=x), the second time it does it the other way around. Now laziness can fix inefficiency in some instances (or so they say) as can a truly clever compiler, which is probably more likely to materialize for haskell, but you’re outsorcing a lot of the efficiency to the language implementors.
    2. Side effects make things complex. Higher order functions with side effects can make things pretty hard to follow, debug and prove (Ha! as if anyone did that. Interesting to see that just about all commercial work on provably correct programs is done in subsets of imperative languages, like SPARK Ada). This should not be a problem for Haskell.
    3. Higher order functions quickly become so abstract that they are hard to read and use. Sure, map is easy, filter is easy, which is illustrated by the fact that it is possible to give good english names to them. But foldl and foldr? well, we learn what they mean after a while, at least they are standard across several functional languages. But what do you call a function which takes a predicate, two functions and a list, which is so that for every element in the list, the first function is run on it if the predicate is true, and the second if it is false? And what if you wanted to give the functions the integer position of the list as an argument as well? Should you make another function then, or just zip the list with [1..] (and adapt the passed functions accordingly)?
    A problem with 3, perhaps the problem, is that it seems that functional programmers give up finding good names for their higher order functions, and take the “mathemathical” shortcut of just adding a little tick, or a subscript, or some random letter, or just making up an identifier that is as short as possible (has anyone come across `ap` in haskell recently?). Now, when the object-oriented community ran across a similar problem with handling high levels of abstraction, the gang of four came to the rescue: The important thing they did was not so much describing design patterns as _naming_ them. Now non-OO programmers may laugh at names like SingletonStrategyFactory, but they actually convey quite a lot about what the objects are.
    There are very many ways to generalise, and choices will have to be made about which ones are most useful, practical and understandable (and alas, “powerful”). Also, perhaps more important than generalisation is respecialisation, so that people can write product l instead of foldl * 1 l.
    It seems to me that the paths that abstraction take in functional programming is dictated very much in a top-down manner. Arrows in Haskell, for instance, were developed because they are a nice mathemathical generalisation of monads, not because they filled a need in the day-to-day life of developers (correct me if I am wrong).
    I realise that I should perhaps have made some choices about the direction this post should have taken as well, there’s just too much to say about all that’s wonderful and all that’s problematic about functional programming…

  28. Don Stewart

    On the topic of parallel programming in purely functional languages, like Haskell, there has been a lot of research. Its finally coming to fruition now though, with the wide availability of multicore SMP machines. GHC 6.6 supports smp parallel threads right now, and you can take advantage of at least 3 kinds of parallel programming: explicit threads with ‘forkIO’, implicit threads with ‘par’, and, newly, support for data parallel arrays (i.e transparently parallel arrays).
    GHC Haskell has become something of a parallelism playground, with new smp/multicore-supporting libraries and applications appearing weekly.

  29. daniel

    this is cool. I’ve been interested in functional langauges for a while, though have yet to really grasp.
    what are the prerequisites to get this though? Do I need to understand (for instance) what “currying” is to have any kind of a chance at getting Haskell?

  30. Flaky

    Daniel: Seeing Mark’s past style, I’d say that the prerequisites aren’t all that high (which is good). But, a good way to get in the right set of mind for functional programming is to look at lambda-calculus. Mark had a fairly good introduction to that as well on his old blog, which you can probably find easily enough. I personally was hugely helped by a basic knowledge of lambda-calculus when I stumbled upon Standard ML and O’Caml.

  31. Mark C. Chu-Carroll

    I’ll try to get to some of your other points in more detail later, but I have time to answer the quicksort/mergesort question now.
    Quicksort, as an algorithm, is well-suited towards arrays, where values can be updated in-place. Written on linked lists, as in the Haskell code, it causes a lot of wasteful allocation
    and copying of data. (That’s not really a statement about Haskell, but rather a statement about what kind of data structures the quicksort algorithm is suitable for.) If you’re working on data which is stored in a linked list, mergesort is a better choice.
    If we were writing a Haskell sort for its equivalent of an array, then we’d prefer quicksort – mergesort isn’t particularly well-suited towards array.

  32. Peter Rosenbeck

    When I started my company in 1985, I was mainly influenced by the Lisp machine and the incredibly elegant code which went with it (the OS and the windowing system were written in Lisp and Flavors, the latter being one of the early OO extensions of Lisp). I wanted to use PCs as a platform and we first experimented with a windowing system similar to the one I knew from the Lips machine which was done in Scheme (which at that time was distributed by Texas Instruments). But when Digitalk released its Smalltalk implementation around 1987 this was he language we started to work with and have not stopped using as our main development tool since then. It had a GUI, windows and mouse support long before MS Windows was created. This helped us win our first big contracts. It still gives us a huge producticvity advantage when compared to other tools and languages. OO, when taken seriously, has its rewards in money, because the productivity of a serious Smalltalker is many times that of any other developer (and this is also true for Java, as we had to learn).
    Why am I telling you this? If I had not known this cool stuff (Lisp machines) at that time, I’d never dreamed of doing what I did. I’d probably started with Pascal (as so many did), then migrated to C, from there to some 4GL and then to C++ and Java. I’d thrown away and rewritten everything at least four times! Most of the companies which started at the same time as I did have now gone out of business and this was certainly one of the reasons.
    In my opinion and experience, it pays to be ahead of the main stream. You need to know the crazy stuff of your time to make the right decisions. If I’d had to start now, I surely would not consider Java (or even Smalltalk), but I would have a very close look at Haskell…

  33. ahmad

    I made the choice a couple of weeks ago to study Ocaml instead of Haskell as my first functional language, and I am really enjoying it.
    I heard that one of the main difference between the two is that Haskell uses Monads. From what I understand Monads are used to take care of the GUI and other imperative stuff. My question is, if Ocaml doesn’t use Monads how does it handle GUI ? and is it poorer for missing it? (A general comparsion between the two would be more helpful, but I guess I can find tons of that stuff on the web)
    Also, from the above comments I can see an OO VS functional debate raging on. From what I understand Ocaml does have support for objects, isn’t this the case? ( I havne’t reached the OO support in Ocaml part in my study so I don’t know yet.)
    Anyway I am looking forward for your upcomming posts on the subject, keep it up!

  34. Mark C. Chu-Carroll

    OCaml is not a pure functional language; it’s actually a hybrid. It’s got mutable state that you use when you need to (ref types and mutable fields in record and object types). With the refs and mutable fields, you don’t need something like monads to encapsulate state.
    Most modern GUI frameworks are object based; so in OCaml, you just implement them using objects.
    In Haskell, you don’t have objects, which can make working with a lot of modern GUI kits rather awkward. I don’t think that there’s anything *in principle* that makes Haskell poorly suited to writing GUIs; but it’s a bit of a rough slog with a lot of current toolkits. The general approach in a language like Haskell would be to provide callback functions that handle GUI actions; those callbacks would be Monad based to allow for GUI state.

  35. Pupeno

    Even when the quicksort wasn’t really efficient, when reading that implementation in Haskell I’ve understood quicksort completely for the first time in my life.

  36. gary lardo

    What are you guys smoking?
    There’s just no way that Haskell will ever become mainstream. You couldn’t discuss it without pulling out “monads”, (without a clear explanation of them) which people have been trying to explain on the LtU site for years now, unsuccessfully.
    As for it being a good _glue_ language: is Haskell as good a _glue_ as Perl and the CPAN libraries? I think you’re grasping for some way, any way, to describe Haskell to make it appealing. But it isn’t. It’s an academic language and will remain one until it dies. You can’t fool anyone with this language: it’s too committed to it’s academic origins.

  37. cleek

    is there a real-life non-mathematical/non-programming analogue for ‘functional’ programming ?
    imperative programming is analogous to writing a recipe, or giving someone step-by-step instructions on how to build a birdhouse. it’s a familiar, intuitive way of communication. but, is there a real-life activity that is typically (not cleverly) described in recursive terms ?

  38. Mark C. Chu-Carroll

    I do not intend to let this degenerate into a language war. I’m not saying that Haskell should become the new language that everyone uses for developing software. The only thing that I’m saying is that it’s worth the effort of learning. Period.
    No one is forcing you to read these articles on my blog.
    If all you’re going to do is gripe about how Haskell is worthless compared to some other language X, then please just don’t waste your time reading the tutorial.

  39. Mark C. Chu-Carroll

    Here’s the best metaphor I can come up with.
    One of the things that I like to do for fun is cook. In cooking, if you’re trying to tell another chef how to make a dish, there’s two ways of doing it:
    1. Give the cook the recipe. (Put duck break into hot pan skin side down for 5 minutes; turn and cook for two minutes. Meanwhile, simmer an ancho pepper in two cups of stock, and then puree in a blender until smooth, mixing in 1tsp salt, 2 tbs honey, and 1Tbs fresh cilantro. Slice the duck, and spoon sauce over it.)
    2. Describe the dish to the cook. (It’s a rare seared duck breast, and it’s served in an sweet ancho chili sauce.)
    The first says *how to do it*; the second describes *what the result is*. Both involve some amount of instruction (sear the duck), and some amount of description (sauce should be smooth). But the recipe is mainly focused on step-by-step procedure, and the second is mainly focused on what the end-result should look like.
    The functional vs imperative is a lot like that: imperative is step-by-step instructions of *exactly* how to do everything; the functional is more like the description, where it tells you how to do something when it’s important, but it focuses mainly on telling you what the end result is supposed to be.

  40. Mark C. Chu-Carroll

    My usual question about automatic verification: What does it verify?
    I have no doubt that many common bugs can be detected by a good analysis system; in fact, my work project uses a system that does that. But for *correctness* proofs of programs, you need some kind of specification of just what the program is supposed to do. And even Haskell programmers, in general, won’t write those.
    I’ve spent some serious time working on formal specifications of software systems. For example, I was the lead of the open-source Stellation project. When I started Stellation, one of the first things I did was write a complete specification of the semantics I planned for the SCM system in Z. It was a very useful exercise, but there was no way to connect it to the code. Later in the project, when I was having trouble figuring out the correct algorithm for directory merges, I tried out Dan Jackson’s Alloy tool. Again, a *very* useful exercise (we went through 6 cycles of attempted over a space of at least two weeks, missing something every time; three days working things out with Alloy, and we were absolutely sure that we knew what every possible case was, and the fix that followed fixed the problem.
    But writing a formal spec is *as hard* as writing the program. And it’s as likely to to have bugs of its own! The original Z spec for stellation had numerous bugs, which I only found when tests failed in the implementation. (Alloy was better on that count; it gives you a way of asking if certain invariants hold in your specification, and visualizing counter-examples. So the Alloy spec was, I believe, complete and correct – but I spent at least a full day of debugging time in the Alloy visualizer figuring out why invariants weren’t holding.)

  41. Flaky

    And if writing a specification is hard, proving that a program corresponds to one is even harder. In fact, it’s generally not computable. While some languages might make such efforts easier, there are no miracle solutions. I’ve some ideas/vision of how something like that might be made workable, but Haskell is not it (and I can fairly confidently say that without even knowing the language.)

  42. cleek

    The first says *how to do it*; the second describes *what the result is*.
    doesn’t describing the result, but not describing the details of how to achieve the result, imply that the listener needs some external knowledge (ex. familiarity with searing duck breast in general) ?

  43. Pseudonym

    Merge sort, conceptually has two phases. The first phase is to come up with some initial sorted “runs”, and the second is to recursively merge the runs, either top-down or bottom-up.
    Here’s the key: There is more than one way to perform the initial phase. One common way to sort a large file, for example, is to read in a bunch of records, sort them in memory, then spit that out as a “run”. When you’ve processed the whole file, merge the runs.
    Here’s one simple way to do it in Haskell:

    initialRuns :: [a] -> [[a]]
    initialRuns = map (:[])

    Pulling that definition apart is an interesting exercise, because it demonstrates currying, operator sections and higher-order programming. Basically, this function takes a list (say, [1,2,3]) and turns each element into a singleton list ([[1],[2],[3]]). There are your initial runs. Now do a bottom-up polyphase merge pass and you’re done.
    The function which turns an element into a singleton list (1 -> [1]) is (:[]), commonly known as “box”, but also occasionally referred to as the “robot ninja monkey operator”. (Think of it as an emoticon and you’ll see what I mean.) It’s an operator section, and it expands to x -> x : [].
    There are even more elegant and efficient ways to produce an initial set of runs in Haskell, but I’m going to hold off, because the coolest ones requires some understanding of lazy evaluation and continuation-passing style.
    One final comment: Some people are saying that “Haskell will never become mainstream”. Whether or not that’s true, it’s completely beside the point, as Mark says.
    Learning other programming languages is good for you, especially if they’re really different from your current languages. Take C++, for example: Template-based metaprograms are pure functional programs, so learning Haskell will help you “get” the style of programming required. STL/Boost-style iterators are lazy streams, so learning Haskell will give you new ideas about how to use them. On a personal note, many is the time that I’ve prototyped a difficult structure-hacking algorithm in Haskell before I re-implemented it in C++ for my paid job. It saved a lot of time overall.
    It has been said that every programmer should learn a new programming language every year. Not because you’ll ever use it again, but because it’ll help you understand your craft better.

  44. Harald Hanche-Olsen

    This brings back memories. I messed around a bit with Haskell in 2000, and enjoyed it until I got stuck on monads. Moreover, I think it made me a better Lisp programmer, as I naturally tended to look for functional ways to do stuff in Lisp. (There is a point beyond which that becomes counterproductive, though.)
    I just wanted to share this little gem, from Tim Peters at comp.lang.python. It returns all the primes:

    primes = sieve [2..]
    where sieve (p:rest) = p : sieve [n | n <- rest, mod n p /= 0]

    Interestingly, however, my own much clunkier version of the sieve ran much more efficiently on hugs. So writing elegant code and writing efficient code are not the same.
    PS. There seems to be a bug in the preview/post mechanism: When I write &lt; and hit the Preview button it becomes < in the edit field, so I have to change it back to &lt; before hitting Post. (With this PS, my post has so many entities in it I am going to take a chance and post it without previewing it.)

  45. JimDesu

    Looking forward to the rest of your explication of Haskell. I’ve learned enough of it to respect it if not want to use it. Sadly, they seem more excited about stuff that’d be real helpful for implementing your own cosmos (like GADTs), but last I saw still didn’t have decent record support! Maybe you’ve found the secret handshake for actually getting something done in the language? 🙂

  46. joe

    Flaky: you’re exactly right. Haskell people are working on something they call Software Transactional Memory, which is designed for concurrency. Another language without mutable state is Erlang, which is designed from the ground up for massive concurrency. Ericcson built it for telephone switching applications, then opensourced it. It uses lightweight threads with message-passing, can easily scale up to tens of thousands of threads, and is great for distributed processing. Ericcson claims to have an application using it that only gets 30ms downtime per year…99.9999999% availability. Now people are starting to apply it to other applications; there’s an online poker server, a Jabber server, and people working on web app stuff.

  47. Coin

    Basically here’s my concern with that quicksort implementation, with the a:b and a++b++c things. That all looks well and good, but what if you are not using the default haskell lists? What if you’re using some alternate list-like datatype, or structure, or something really exotic like a packed binary string or a parsed xml tree structure which just happens to offer the possibility of sequential data item access and some kind of item comparison? Can you reuse that fancy-looking 3 line quicksort implementation? Can you even do anything remotely resembling the same thing? Can haskell be taught ++ and : and

  48. Coin

    Messed up by the html entity bug. Assume my first paragraph in the post above was supposed to end with:
    < can mean different things in different contexts or on different datatypes, allowing you to use the same qsort() implementation for lists as you do for alternate structures?

  49. Mark C. Chu-Carroll

    In Haskell, pretty much all functions are overloadable, and overloadable in a rigourous way, using the type-class mechanism. The quicksort up there is specific to lists, because I was trying to present the simplest example that I possibly could. Normally, you’d write it a little bit different, and that difference would be enough to make it work.
    Basically, the sort operates on a sequence, and relies on two basic operations:
    1. Separate the first element of a sequence from the rest of
    the it.
    2. Concatenate two sequences.
    That’s easy to express in Haskell using typeclasses. I’ll show how when we get to that point; for now, I’d have to explain too much in the way of preliminaries.
    2. Concatenate two

  50. Coin

    Great, thanks.
    I’ll be looking forward to the rest of this series. I tried to learn Haskell once and couldn’t wrap my head around it, but that was a really, really long time ago and I didn’t really understand functional programming at the time. I’ve been wanting to give it a second shot.

  51. kawazu

    Had to deal with Haskell some years earlier while still at university, now getting back to this is a comfortable “flashback”. Looking forward to reading the rest of your series. Right now, what makes me think the most is how to include languages like Haskell (or Prolog, for that matters) into software systems built using Java (or, more notably, J(2)EE). I really would like to leverage the power of functional programming when it comes to all sorts of data processing, but I don’t want to do, say, web-based things or remote communication this way. I’d “just” like to embed a “functional core” into my JEE applications and feel content it does what it is supposed to. 🙂

  52. James

    Haskell is just a language. One very important difference with Java is the number of free libraries/frameworks available to do almost anything. That’s the real value of Java and the reason I’m not using some lesser known languages.

  53. Harald Korneliussen

    Pseudonym, thanks for the explanation! Since I heard the comment that mergesort could be faster on linked lists I started thinking about different ways to do it, and I did realise that if we could do it “bottom up” that would remove the need to split the list in the middle, as I had been taught to do (as an imperative programmer).
    That quicksort should be slower still suprises me a bit, though… I realise there is a lot of consing and subsequent garbage generation going on, but I’ve been taught (as a schemer) to not worry too much about those 🙂 Is there no way to use tail recursion to get at least some of it to be in-place?

  54. david1947

    For the past 15 years or so I have been working on a system which has demonstrated to me (at least) that the most productive way to solve the language wars is to build a system which allows the use of whatever language feels most convenient for any and every sub-problem, i.e. the final system contains code fragments from multiple languages co-executing, each fragment written in a language of local convenience.
    I have not yet found a language that handles UI and program structure+infrastructure well, so I and some friends wrote VNOS (Visual Network Operating System) which does the structure bit, and facilitates building programs out of simple comnputational objects graphically interconnected using a dataflow paradigm. This has allowed hard-core programmer-phobes to write useful programs, so I have to conclude that programming is most accessible using visual dataflow methods, and that any text based system, especially those (most) that require the whole program including IO to be in the same compile units, can only be accessible to those few people capable of effectively “running” the program in their mind. Source code is not user-friendly. However English-like it appears.
    Within this framework, it would be interesting to wrap the Haskell engine as a VNOS drop-in. Is there an embeddable build?

  55. sebulba

    i have had some experience with stateless-programming, but i’ve always had trouble to see how stateful/mutable things (OS-resources, hardware, etc.) are handled.
    i’d love to hear more about those monads.

  56. Pseudonym

    Harald, the only reason why quicksort is unreasonably slow is because it’s on linked lists. Quicksort is designed for mutable in-place data structures (e.g. arrays).
    Yes, as a Schemer (and has a Haskeller!) you don’t worry about things like that, at least at first. One of the strengths of declarative languages is that it’s easy to make your code modular and reusable. So write a simple quicksort. If it turns out that’s the bottleneck, replace it with a better sort algorithm, safe in the knowledge that so long as your replacement is correct, you won’t have broken anything.

  57. Anonymous

    Funny you should start out by emphasizing “referential transparency” … because it is completely meaningless. Perhaps this first post should be categorized under “bad math”? This oft-repeated description of Haskell has no meaning. The closest you can come to a definition is ” you may replace equals by equals”, but then every single language enjoys this property—of course you can replace equals by equals!
    Another common description of Haskell is that it is “declarative”, but this again has no meaning, despite being oft-repeated. Of course the language has an operational semantics (ie, a definition of how to execute programs in it), otherwise it wouldn’t be a programming language. Haskell is no more or less “declarative” than any other language.
    Yet another misapprehension is that Haskell is a “pure” language. This, too, is false. Haskell is, in fact, an imperative programming language. It has state, I/O, exceptions, concurrency (in some forms), in short all the crufty imperativeness that you could want. Moreover, you cannot write serious programs without them. Consequently most Haskell code “lives in the IO monad”, which means that it is just imperative programming. Haskell is no different from Standard ML or O’Caml in this respect.
    Yet another error is to emphasize Haskell’s use of monads for I/O and state. It is true that Haskell uses monads for this purpose. But one could just as well do this in O’Caml or Standard ML—it is not a language design decision, but a library design decision. A monad is just a particularly simple-minded use of modules. One could structure the libraries for I/O and state in O’Caml to use a monad. It would look just like Haskell. Why don’t people do it? Because monads are both a good and a bad idea: they make some things nice, and they royally screw up other things. Neither approach dominates, but in ML one can pick and choose whether to use monads, whereas Haskell forces them on you.
    Finally, Haskell’s type class mechanism is just a use of modules. That is, if you had modules, you wouldn’t need type classes, and you would also have a lot more. Plus recent work shows that one can easily integrate Haskell-style classes with modules to give the best of both worlds. But classes alone don’t cut it, you need modules.
    The bottom line is that Haskell is a cool language, but it suffers from the anomaly that practically everything that is commonly said about it is either meaningless or wrong!

  58. Tony Morris

    “Consequently most Haskell code “lives in the IO monad”, which means that it is just imperative programming.”
    “…practically everything that is commonly said about it is either meaningless or wrong!”
    I find that funny 🙂
    I wonder if you took a breath in-between? 🙂


Leave a Reply