Why am I doing this Pi-Calculus Language Thing?

Since my post on datatypes for my π-calculus language, I’ve gotten a bunch of questions from people who (I guess) picked up on the series after the original post where I said that the idea of the series was to see if I could create a programming language based on it. The questions are all variations on “Why design another programming language? Do you really think anyone will ever use it?”

I’ll answer the second question first. Of course I realize that the chances that anyone but me, and maybe a couple of readers, will ever use it are close to zero. But that’s not the point.

Which brings us back to the first question. And for that, silly (but true) answer, and the serious (but also true) answer.

The silly answer is: because creating a programming language is interesting and fun! Even if it doesn’t end up getting used, it’s still an interesting thing which I’ll enjoy doing, and which I think a lot of readers will enjoying reading and participating in.

The serious reason is that there’s a problem that needs solving, and I think that the right way to solve it involves a different kind of programming language. Modern software is pretty much inevitably running in an environment with concurrency and communication. Networks and multiple core processors are an inevitable part of our computing environment, and we need to program for them. But our current languages absolutely suck for that kind of programming.

Why do I think that a language based on π-calculus will be better? Well, that goes back to something about my philosophy on programming languages. One of the things that I believe is a very big issue in software engineering is the fact that most programming languages force programmers to work out detailed knowledge about their systems in their minds, then write their programs omitting most of that detailed knowledge, and then make the compiler/development tools attempt to recompute the information that they knew in the first place.

For example, when I was in grad school, one of the very hot research areas was compilation of scientific code for massively parallel execution. The main focus area was doing work on numerical arrays in Fortran – looking at Fortran code, analyzing it to determine the data dependencies, and then using the dependency information to figure out how to generate the fastest possible parallel code that respected those dependencies.

The crazy thing about this was that the people who wrote scientific code for those compilers usually knew what the dependencies were. They generally knew how they wanted the code to execute. And so they were spending tons of times learning how the compiler was going to analyze it, so that they could write their code in a way that would result in the compiler generating the code they wanted it to. So they were starting with the knowledge of how they wanted the code to be parallelized. But they couldn’t write that down in the program explicitly. They had to figure out how to implicitly encode it to get the results they wanted, because the language didn’t let them express their knowledge.

I see the same kind of thing happening all over the place, but particularly, I see it becoming a huge issue is in multi-threaded programming. Like I said, with the increasing use of multi-core processors, and distributed systems, we’re all writing code that involves concurrency, threading, and communication. But the programming languages that we use are terrible at it. Look at Java or C++: you see languages that are first and foremost sequential languages. Almost everything about them is built around the good old fashioned single-threaded vonNeumann model of computing. Then threading is added on top, primarily through libraries. So almost all of the code is written in, at best, a threading/concurrency-unaware style.

What that means is that when you go to write multi-threaded code, you’re really screwed. Sure, you know how you want to set up the threading. But you can’t really tell that to the compiler. So all of those functions in the libraries you want to use… Can you call them? Are they compatible with your threading strategy? Who knows? Even if the author of a library wrote his library in a thread-aware fashion, there’s no particularly good way for him to make that explicit in the code. And so the multithreaded system has bugs, deadlocks, improper synchs, race conditions, etc.

I’m not saying that a π-calculus based language is going to miraculously get rid of deadlocks, race conditions, etc. But I do believe that the problems of multi-threaded programming aren’t as difficult as many people think they are. The problem is that the way we program and the languages we use are particularly ill-suited towards solving problems that involve concurrency and threading, because (1) they let us write most of our code without thinking about how it will behave in a multi-threaded environment, and (2) even when we do think about how the code should behave in a threaded environment, the languages don’t let us express enough of our knowledge about their concurrency behaviors.

So on a deep level, the point of building this language is to see if a π-calculus based approach to describing concurrency, wired deep into the language semantics, can make concurrent/multi-threaded programming easier.

0 thoughts on “Why am I doing this Pi-Calculus Language Thing?”

  1. I think there really is a need for an end-user style language which supports concurrent programming at a really fundamental level, so fundamental that the programmer doesn’t even have to be specifically aware they’re doing it. At some point the hardware is just going to start demanding it. For awhile now a lot of the cutting edge stuff in microchips has been about parallelization, vectorization, etc, and this is something which (1) programmers are not good at and (2) current programming languages are even worse at facilitating compilers that can do these things automatically…
    Which is basically what Mark said, but I mostly just want to add one detail: Specifically, have you looked at the Cell chip? Yeesh. And we’re supposed to believe that within five years average game programmers will be able to make use of this thing on a regular basis…

  2. I am database researcher, and my group is looking at the same issues (multicore processing) that you are. Databases are an clear example where the programming language can be highly optimized and parallelized by the compiler. In fact, some would go as far as saying that declarative languages are the only real success story for parallel processing in comercial systems.

  3. Mark, could you recommend a good introduction to parallel programming, something that would cover the issues that the Pi Calculus is trying to solve?

  4. I find it fascinating, way above my skill level but interesting. I like learning about it and your explanations combined with the comments make things clear.
    Learning for learnings sake; I’ll never make practical use of it but it helps keep the remaining brain cells active.

  5. I respectfully dissent.
    Unlike dressed-up lambda calculus, dressed-up pi calculus is difficult to write useful programs in. The reason, I suspect, relates to the fact that there’s apparently no logic that it’s isomorphic to, a la Curry-Howard/Girard-Reynolds/whatever.
    I don’t think we’ve solved the problem in general yet, but the best thing I can see so far is software transactional memory. This is partly because not only is it easy to program in, it also works the way that hardware could work. CPUs already have caches, and providing a transaction layer on top of them is feasable.

  6. I’m with Walker. Pi-caclulus is fascinating, but it won’t make an easy programming language. It might be useful, but not easy. Maybe if you compiled a declarative language into a pi-calculus language.

  7. I have to disagree with Pseudonym regarding the use of hardware transactional memory.
    I was at a conference 2 weeks ago, and in one session, a group presented hardware transactional memory. They had built it onto a real test platform to see how it would work. And while it received some nice speedups, there were some programs it could not handle at all (for instance a hashtable), which nonetheless have been made lockfree (I remember an article on lockfree hashtables). Not only that, it does’t scale as you’re still serializing on your commit blocks. As for the “CPUs already have caches, …” that was exactly their reasoning, they built the thing on top of L1 data cache. First of all, they made the cache hit latency go up from 2 cycles to 13 cycles, secondly, every commit would force a cacheflush, and thirdly, your cache-coherency problem is now ‘solved’ by constantly flushing, hardly an appealing result. You can read more about it at my blog (observations on DATE).
    Anyways, I think the work that Mark is doing is definitely interesting, for there is a dire need for a parallel programming model.

  8. Like Geoff, I’m interested to know what you think about Erlang.
    Is this not already a (production strength) language and runtime that addresses your concerns? Or if not that, isn’t it likely a good place to start from?

  9. Maybe if you compiled a declarative language into a pi-calculus language.

    I wasn’t trying to claim that declarative languages are the only way to go. They have the problem that they have no side effects. You can convert a purely functional language to a declarative one, but many would claim that is too limiting. I think Mark wants to do something more here…
    With that said, declarative languages are often overlooked in places where they would be useful for parallel processing. They make sense for anything in which you need to alternate functional computations and updates, such as simulations. {shameless plug]I have a paper in this year’s SIGMOD on using them to script AI in computer games[/shameless plug].

  10. The silly answer is: because creating a programming language is interesting and fun! Even if it doesn’t end up getting used, it’s still an interesting thing which I’ll enjoy doing, and which I think a lot of readers will enjoying reading and participating in.

    What makes this answer “silly”? And why on earth would you need another answer other than “I think it might be fun”?
    I mean, I can see needing one at work, even for your 20% project, but here, on your blog?
    Incidentally, could you nudge the scienceblogs.com site admins toward consideration that they are vulnerable to this XSS vulnerability?

  11. for me all it sounds more like adding more syntactic sugar to pi-calculus and creating ascii notation for it. probably i’ve missed something, but why pi-calculus, not css, csp, acp, lotos πŸ™‚ ? do you really need mobility in a parallel programming language?

  12. Tomas:
    Yes, I think we really do need mobility for a concurrent programming language. I think that the parallel/concurrent distinction is an important one. In a parallel system, you’re just taking a single workload, and breaking it into parts for efficiency. For that constrained case of concurrency, fixed communication structures are fine. But for general concurrent applications, I think that dynamic communication structures are essential. You need, on some level, to be able to talk about entities joining and going away, about communication channels being created and rearranged, etc. CCS/CSP based systems simply don’t capture that kind of semantics well.
    Even with the requirement for some kind of dynamicism/mobility, there are plenty of possibilities. I just happen to have a fondness for π-calculus.

  13. Geoff, Keith:
    My opinion on Erlang is very mixed.
    In some ways, it’s a remarkably elegant language. It’s got a nice semantics, and it does a good job of capturing the ideas of concurrency in a useful way. And it’s *extremely* impressive performance-wise – the things that they’ve pulled off with Erlang are really amazing.
    On the other hand, there’s a lot of ad-hoc feeling about it. It’s got this functional part, without types, and you can write pretty typical functional programs in that part. And then there’s the message passing stuff, which feels sort of grafted on.
    Really, it just sort of shows its heritage in unfortunate ways. It was developed over time to fulfill the specific needs of the switch developers at Ericson. And you can see that: when they needed to solve a problem, they solved it. But they didn’t start with a consistent idea of how it was suppose to work – they tried to backfit consistency, but finding solutions to the problems they faced, and then trying to fit them into the language.
    But it’s still an amazing language. Despite its problems, I think it’s a huge step in the right direction. I’m definitely looking at it as an inspiration for what I’m doing; but I also want to try to see if I can do something more consistent.
    I still haven’t really decided what language I’m going to use for prototyping my little pi monstrosity; Erlang is one of the front-runners.

  14. I hope you’re not going to make your language too “prolog-ish”, with everything necessarily having to be reduced into ever simpler messages until a “base case” is reached.
    Are you going to allow the programmer to break the cycle at some point and descend into imperative code? The chips are ultimately imperative anyway aren’t they, so in the interests of scalability perhaps it would be useful to allow this?
    I’m not a programmer so my terminology might not be correct here but essentially I mean allowing atomic subroutines to execute and return without passing any messages at all. For example, imagine some code to (naively) compute exp(x) using its taylor series. Will your language insist that the series be evaluated by sending a new message to evaluate each term in the series(up to a limit), or will you allow some kind of for loop block?

  15. On the other hand, there’s a lot of ad-hoc feeling about it. It’s got this functional part, without types, and you can write pretty typical functional programs in that part. And then there’s the message passing stuff, which feels sort of grafted on.

    I see where you are trying to go with this, but I disagree that the message passing is “hacked on”. If what you are interested is coordinating several concurrent programs (and not a single workload), this language strikes me as a natural embodiment of the dependency graph. The functional language is nice because we can extract the dependency graph from the AST. The message passing structure allows us to extract it from the message graph.
    It seems to me that the real reason you believe it is hacked on because it is a two-tier approach. It does the functional, and then it does the message passing. A “nice” language would allow me to do some functional, pass some messages in between, and then do more functions on top of that structure, and so on…
    Except that, in that case, I think your distinction between concurrent and parallel falls apart. The only way that I can make sense of putting a functional layer on top of the message network of several subprograms is if I am programming for a single workload. And if all you have is one workload, then other solutions may be more effective.

  16. What about something based on Petri Nets?
    I always thought of it as the only “friendly” way of modeling concurrency.
    Although it may lack the mobility dynamism of pi-calculus that you’re looking for…

  17. What about something based on Petri Nets?
    I always thought of it as the only “friendly” way of modeling concurrency.

    That is essentially a representation of the message passing graph between processes. They don’t give a real formal specification for what happens inside each process. Which is important, because I may want to spread a single logical process over several cores/CPUs.

  18. Daniel:
    I’ve actually never liked Petri nets. They’re useful for things like describing networking protocols, but for describing high-level computations, I’ve never seen anyone make them comprehensible.
    I see them as being something like the program dependency graph for conventional computation: a useful tool for describing and analyzing one aspect of the system, but a terrible tool for understanding or expressing the totality of the system.

  19. I’ve been programming using an explicit pi-calculus language (Occam-pi/KROC) for quite a while, and I agree with MarkCC’s reasoning that it opens up new ways to be more explicit and expressive about distributed systems. The design process largely focuses on deriving the dataflow for a use case, and a protocol for that dataflow. The nodes that implement the actions on the protocol are the easy/familiar bit, since they are more like sequential code hanging off a select. However you have to be careful with output as its easy to deadlock.
    One nice feature of KROC is that if the code deadlocks, it prints out a dump of all the threads and which line of code they were blocked at.
    I wrote a few thousand lines of code in occam-pi last year, and built a very efficient simulator for a large scale p2p network.

  20. Coloured Petri Nets (where tokens can hold different values, and therefore transactions may be seen as mathematical functions) can be a simple way of expressing complex systems.
    I never actually used them in “real-life”, but I think they’re turing-complete, so they can model anything that is computable.

  21. I never actually used them in “real-life”, but I think they’re turing-complete, so they can model anything that is computable.

    Turing completeness should not be your only metric when you evaluate the fitness of a programming language. A lot of things that are Turing complete use unusual coding tricks in order to gain this expressivess. Their computation model is rarely efficient. In the case with Petri nets, the computation would be all in network message passing, with the local cores/CPUs functioning as no more than finite state machines. Given issues with bus/network latency, this is hardly high performance computation.
    When you evaluate a programming language, you care about its expressiveness, but you also care about how easy it is to implement the underlying semantics. If the efficient means of implementation deviates too much from the semantic model, then compilation and optimization becomes very difficult.
    Personally, I don’t believe that all languages need to be Turing complete (or in the case of declarative languages, first order complete). People are often willing to accept less expressive languages for special purpose situations. Pub/sub is the classic example of this. It is orders of magnitude weaker than SQL, but is remarkably more efficient on streaming data, and so people use it all the time.

  22. Daniel:
    Walker beat me to the main point of my response, but I’ll reiterate just because I’m an annoying wordy guy πŸ™‚
    Turing completeness isn’t enough for a real useful programming language. Just think about where the term comes from. Turing machines are, of course, Turing complete – but any programming language based on Turing machines is going to be an agonizing wreck to actually write real programs in.
    Petri nets strike me as being similar to Turing machines in some ways. When you’re talking about simple algorithms, or theoretical concepts of computing, Turing machines can be great – simple, clear, and easy to understand. But they’re a total train wreck for describing complex systems – try writing a variable length string quicksort for a Turing machine!
    For some relatively simple things, Petri nets can be fantastic. They’re very simple, and they’ve got a natural graphical way of expressing things, so that you can see the structure of a system in the shape of its net. It’s really wonderful for describing basic distributed algorithms. But when you start looking at the petri net of a real system – like, for example, the Petri net that describes the TCP/IP protocol stack, you get a horribly tangled incomprehensible map. Back in grad school, I tried to trace out the Petri net system desribed by the Esterelle specification of IPv4 (Esterelle is a variant of the Estelle protocol specification language which tried to fix some of Estelle’s screwy semantics by replacing them with something based on Petrinets.) The printout of it was something like 60 pages, and it covered up half of the floor of my room in my apartment. I don’t think that anyone could actually figure out a complex system like that from its Petrinet.

  23. Actually, Mark, we were talking about two different difficulties here in response to Daniel. One is performance (which I focused on, because the trade-off of performance and expressiveness is my area of research). The other is ease of design, which is really what you are focusing on in your response (and which, from your posting and later comments, appears to be your main concern). Both are very important in the choice of a programming language. It is just that I know how to measure peformance. πŸ˜‰ Ease of design? Not so much.

  24. I agree – Turing completeness might be a “necessary” condition, but it’s not nearly “sufficient”. Conway’s Game of Life is Turing Complete, but if you think a Turing machine is hard to program, have a look at this…
    http://rendell-attic.org/gol/tm.htm
    It’s a Turing machine coded in a Life array.

  25. The world’s most massively scalable and performant parallel codes are primarily written in C, C++, and Fortran (of various flavors) and are designed for doing large-scale PDE and ODE simulation. They’re all all based on a message-passing paradigm of one sort or another. Most use MPI. The rest generally have their own, home-grown communications layer that can sit directly atop native high-performance networking APIs, and MPI-1 and MPI-2, and things like sockets. These hand-created layers are there for increased portability and to exploit features of the networking hardware (like RDMA) unavailable, or inconveniently expressed, in MPI-1 and MPI-2.
    Were these programs easy to write? Some of them were, and some of them weren’t. Were they hampered by their language? Some of them were and some weren’t. Is their parallelism/concurrency trivail? Absolutely not. Were _any_ of them written in a functional language? To my knowledge, not a one.
    To the designers’ credit, Erlang seems to be the only language with a functional style that sees any large-scale use for parallel or concurrent programming. I’m not clear on the details of how the phone switches that the Erlang programs run on are designed, but I believe folks when they tell me they are bona fide parallel computers.
    I put the challenge to Marc and the other afficionados here that I’ve been putting to folks on /. recently when this “functional languages kick ass for parallel computing” meme has come up: Show me a parallel weather forecasting code like WRF, or a global ocean modelling code, like HYCOM or POP, or a molecular dynamics code like NAMD, or a ground wave propagation code (i.e. an earthquake modeler) like TeraShake, etc. that’s written in an existing functional language, and I’ll start paying attention.
    At my employer, we’re getting ready to deploy a parallel computer of truly remarkable scale. It will cost ~$50k/day for its 4-year lifespan when all costs are taken into account. That means that we have a strong interest in seeing as many cycles as possible go to productive use. If there’s a compelling language out there that will drastically reduce the time to solution, increase efficiency, and lead to more maintainable and more true to the mathematics being coded, we’ll sit up and take notice.
    Psueduo-disclaimer: The above is my more or less “put up or shut up” rant. I don’t mean it quite as emphatically as it may seem, and I certainly don’t mean to be rude. I see the value in functions as first-class things (I’ll skip the contentious word “objects” for now) in a language. Heck, we use functional concepts in our parallel CFD code (also quite scalable, but not yet freely available like most or all of the above). But the proof of a language, and of language paradigms, is in the implementation of real programs.
    My area is in PDE and ODE simulation, so I use these examples. However, I also put them out there because I believe them to be examples of where parallel computing is hardest. The amount of code is not large, but it’s importance to society is probably immeasurable. On the technical side, there are also a few issues: The latencies of the communication paths matter tremendously because the coupling between distributed entities tends to be tight. At the coarse level, there are very few or no independent tasks that can be done simultaneously to hide the latencies. Bandwidth to main memory is at a premium. And floating point operations dominate.
    Do we need better languages for day-to-day parallelism? Absolutely. Are those languages likely to be functional? I wouldn’t bet on it. Will those be the languages of massive parallel simulation codes? Probably not, IMO. I’d be much more likely to bet on the future being entirely message-passing based.
    After having said all that, threads sharing memory in concurrent/parallel programs is probably the biggest mistake of in modern thinking on parallelism. It’s too big a crutch, and leads to too many of the traps that Marc describes.
    Missed item: Now that I’ve written that, I’ve realized that I’ve left out Google. I understand that some of their programs should be included among the ranks of massively parallel, scalable programs, though they throw the tight coupling out the window altogether, so I don’t feel so bad for having skipped them.

  26. Christophe Poucet:
    I take your point (and a very interesting blog you have, BTW), but I think the problem here is that we don’t quite know what the right software/hardware boundary for TM is.
    The traditional “pthreads”-style synchronisation primitves (mutexes, semaphores, condition variables etc) are achieved pretty efficiently on hardware which only supports some very limited primitives, like atomic-test-and-set, atomic-compare-and-swap or an unreliable form of load-linked/store-conditional, usually in conjunction with some memory barrier instructions. I don’t see chip manufacturers rushing to implement mutexes directly in hardware, because it’s unnecessary.
    I think that we don’t know what the “right” set of hardware primitives for TM are yet.

  27. Mark, what about Mozart/Oz, isn’t it another more or less produciton ready concurrent stuff ? What’s your opinion on that ?

  28. billb:
    First: I’ve *never* made the argument that functional languages like Haskell are better for massively parallel code. So you’re arguing with the wrong person there.
    Second: I’m not looking at massively parallel code as the domain for this experiment. There’s a lot more to distributed/concurrent systems than massively parallel simulations. Distributed systems development is a big issue in a lot of different domains, and there isn’t one solution that’s going to work in all of them. The right solution for massively parallel sim code isn’t going to be the same as the right solution for massively replicated multithreaded programming of web-app servers, which isn’t going to be the same as the the right solution for taking advantage of a half-dozen cores in a desktop PC.
    Third: A lot of your argument comes down to the old popularity versus quality argument. Almost all of the code we see in large non-parallel systems today is written in C or C++. Therefore C/C++ must be the best programming language for it, right?
    It doesn’t work that way. Decisions about programming languages aren’t always (or even often) made solely on the basis of technical quality. If most of the people who you can hire to write simulations code know Fortran, then most businesses are going to have their systems written in Fortran – which becomes a self-perpetuating phenomenon: all of the best sim programmers use Fortran, so companies require sim programmers to use Fortran, so the best experienced sim programmers are all Fortran programmers, so…
    That’s exactly the argument that AT&T/Lucent used to use about developing switch software. The best switch programmers were C hackers. So they hired C hackers to write switch code. And yet, Ericson has been able to document faster development, lower bug rates, and better systems performance using Erlang.
    And finally: ever heard of Sisal? It was an experimental language from, I think, Los Alamos. It was an array-oriented functional language for massive numerical simulations code. Sisal was able to consistently outperform Fortran for massively parallel code. It pretty much disappeared because the domain experts who wrote their simulations code knew Fortran, and didn’t want to learn a new language, even if it had documented advantages.

  29. Wrong. I never used the word “best” in my post, and I don’t believe C++ is the best language for anything. Please don’t attribute to me arugments I did not make.
    I really, really want to be able write better code in less time with fewer bugs. I don’t really give a rat’s ass about which language is better than another or which is best for anything. I do give a rat’s ass about being able to get the best performance out of large-scale parallel computing systems, and right now the way to do that is with a message-passing API and C, C++, or Fortran. I really want someone to design a language that supports the scientific simulation community’s needs in this area. I’m even willing to help.
    And, yes, I’ve heard of Sisal. I even support the idea behind it. However, it was limited to SMP and vector systems, which were massive for the time, but pale in comparison to what’s out there now. Additionally, the future seems to be in distributed memory, at least at the largest scales. A good language for parallel simulation has to be able to handle that explicitly. Sisal is actually a pretty neat language, BTW. It has functions as first-class thingys and it has for-loops and if-blocks that looking highly similar to procedural code. It was, at the very least, approachable. It withered and died because it’s proponents gave up. DARPA’s HPCS program is trying to revive an interest in high-productivity and high-performance languages with things like Chapel, Fortress, and X10. We’ll see how it goes. I think the funding agencies are starting to learn that it’s OK to make funding for writing scientific programs contingent on doing it in a reasonable language.
    For the record, my PDE simulation code gets shorter, faster, and does more every time we undertake a major rewrite or refactoring. It also gets a little more functional. The first code was C and Fortran, the second straight C, then came the conversion to C++ (without going over the top, remember, TMTOWTDI), and the current version, also in C++, introduced a compiled input language that drives the actual equations to be solved (the compiler for this very simple language is written in Perl). The next version will replace the vast majority of the driver of the main code with Lua. All that being said, the vast, vast majority of the time is spent in tight floating-point loops written in C-ish C++, or taken up in calls to hand-tuned BLAS routines coded by the system vendor or third-parties in C and assembly. The code that we wrote totals about 25kloc of C++ and Perl, and it can solve a vast array of parabolic and elliptic PDE problems at reasonble scale. I haven’t been working on it much lately, though, so it’s stagnated a bit in comparison to the state of the art.
    Anyway, I’m not wedded to a language. I’m willing to learn a new one. But someone has to show me one worth learning. And in that respect, I stand behind my challenge.

Leave a Reply