Moving on: π-calculus and common programming idioms

Before diving off into real content, the name of my little π-calculus monstrosity has been chosen. As several people recommended in the comments, I’m going to go with Pica.

So, today, we’re going to take the dive, and start looking at some interesting semantics of the language. The goal of this is still to work on Pica’s type system: I want to work towards a first draft of what a process type will look like. But to figure out what a process type should look like, we need to think about how process abstractions will be defined and used in Pica, what control structures are going to look like. So that’s what I’m going to do in this post: ramble on a bit about what processes are really going to end up looking like. Just a warning: this post is definitely on the half-baked side – it’s much more of a train-of-thought/brainstorming thing than the things I usually post.

The overriding principle of Pica is making communication and synchronization explicit. After all, the basic problem that I’m hoping to address is the weakness of languages for dealing with the issues of distribution and concurrency, so those issues need to be brought up front and center. So the point of process typing is to allow us to express something about how a process is going to interact with the world around it in terms of communication.

Obviously, since Pica is based on π-calculus, the primary abstraction is going to be the process. Just about everything is ultimately going to reduce to process-based abstractions. What does this mean, in terms of some of the common programming language constructs?

  • Procedure/method/function calls are not primitives. In almost every language that most of us are familiar with, the idea of the synchronous function call is a basic concept. But for Pica, it’s a derived abstraction. A function call with type (A×B)→C is going to become a process in Pica, which exports a channel of type (A×B×Channel[C]). To call the function, you’ll send a message to that exported channel, which will include the result channel. Instead of returning a result, the function will send the result to the final channel parameter. So function calls in Pica end up looking a lot like function calls in continuation passing style code in a functional language. This isn’t really changing anything about how function calls really work: normal function calls push the parameters plus a return address onto a stack. This is equivalent to that, expressed in terms of massages, and making the expected synchronization behavior around the call and return explicit. With calls expressed this way, we can still provide syntactic sugar to make it easy to do synchronous calls if we want to – but we can also make any call into an asynchronous operation, or separate the invoker of a call from the receiver of the result.
  • Data types/classes aren’t really primitive. Most modern languages have a concept of object, which is basically a chunk of data (the object state) with associated code for accessing and manipulating the data (the object methods). In Pica, an object is just a process. The object state is the process state; the object methods will be defined in terms of the messages that can be sent to the process.
  • Basic computational primitives are relatively unaffected. Things like conditionals, comparisons, and arithmetic are just below the level where the message-passing semantics has a huge impact.
  • Control structure have an interesting dual-nature in a language like Pica. There’s obviously a need for basic controls like conditionals, loops, etc., which exist below the level of message passing – but there’s also obviously a need for controls to interact with message passing. The way I’m going to work around that is by separating the low-level in-process controls from the higher level inter-process controls.

That dual-level control structure is an interesting thing. On the low level, we want to be able to do things like simple conditionals and iteration without getting too involved in communication semantics. So we definitely want things like a basic “in/then/else” construct, and probably something like a “for” loop.

And there’s also an analogue of the basic control structures at the process level. But they need to fit into the basic concept and patterns of the process and messaging system. There are two basic constructs we need to capture at the process level: conditionals, and iteration.

What’s a conditional in terms of a process calculus like π? Suppose I’ve got a process, which has a port where it can receive a message. A conditional is something that allows the process to take different actions depending on the contents of the message. The natural way to do that given the structure we’ve been building is to steal another idea from functional languages: pattern matching. So we’ll allow message receive clauses to specify a pattern for the values to be received: a message receive clause can only receive a message whose payload matches the clause’s pattern.

Iteration is even easier. We’ve already seen the solution there: recursion, exactly the way that I demonstrated in some of the examples in earlier articles.

0 thoughts on “Moving on: π-calculus and common programming idioms”

  1. Pattern matching against a received message sounds a _lot_ like erlang’s recieve syntax. It looks a lot like a Haskell case expression of... construct, but blocks the process until receiving a message that matches one of it’s expressions. There’s more too it of course (timeouts, &c.) but that’s the basic idea.

  2. I think you meant “if/then/else”, not “in/then/else”.
    What also sounds like erlang is the dual life of control structures above and below the message passing layer. foreach is a standard library function, but pmap performs a very similar function above the messaging level.
    I imagine it would be very useful to have this duality acknowledged in the language design.

  3. Alright! Pica! Some rambling thoughts as I try to comprehend.
    This is looking more and more like some extension of a state machine, with processes for states and any number of transitions (channels … well messages) happening at the same time – and allowable operations including adding and removing states and channels as well as the usual internal state changes. That is kind of the picture for me.
    But the channel thing is still a little unclear to me. You have named channels and named values on channels. Is the name of the value just a placeholder (macro say) in the language? Or is it a “real” tag that a process could interogate? Otherwise you get into pattern matching like event queues?
    Are channels automatically synchronized? That is, If process me sends 2 messages to a named channel and process you are set to receive them do you get them in an order? Also, relatedly, messages just sit in a channel until some process decides it is looking for them or do they fall off the world?
    Just trying to get concrete metaphors, channels as you’ve defined them seem to be one-to-one process wise. I would think that you must have multiplexing and/or broadcasting of some type … I could very well have just missed it. This also reminds me somewhat of Informatica transformations if you have ever seen that ETL product.
    This may (probably is) all be implicit in the definitions or reduction rules, but I am long gone from that kind of stuff.
    I can definitely see how you can garbage collect stuff if you start off with one process and then grow – once a process gets “cut off” you could kill it – or it could kill itself really – functional self-destructors.

  4. “This is equivalent to that, expressed in terms of massages”
    Ah, the famous massage passing style. You rub my neck, I’ll rub yours.

  5. Interesting thoughts, for reference the core constructs in Occam-pi are (sorry for mangled formatting with extra lines)

    SEQ
    sequential
    stuff

    PAR
    proc1
    proc2

    where all procs are started in parallel and have to exit for the block to complete – i.e. implicit barrier synch

    ALT
    guard1 & input1 ?
    handle input 1
    guard2 & input2 ?
    handle input 2
    guard3 & TIME AFTER then
    handle timeout

    This works like a select system call on the input channels, where boolean conditionals can be used to toggle inputs. There are also PRI PAR and PRI ALT that provide a priority ordering on the subsequent blocks.
    Finally there is a generic replicator that is applied anywhere in the language, so

    PAR i = 1 FOR 1000
    proc(i)

    will create 1000 processes in parallel, and

    SEQ i = 1 FOR 1000
    proc(i)

    is exactly the same as a for loop around proc(i)
    Also, there is a lot of theory and deep thinking around safe program transformation which gives rise to a “big rule” about banning side effects from the language. i.e. you can’t pass a value by reference that can get changed by another process unless you explicitly declare it shared and claim it before writing it. The same construct works for shared channels, which allow an array of procs to use one channel to fan into a single proc by claiming it before writes.

  6. Mark, why do you think that you need two kinds of control structures? Couldn’t the if/then/else be implemented as syntactic sugar over the pattern matching message reception? Something like: new x. !x(v). {?x(true).T|?x(false).E}
    The x of course will be some new, otherwise unused variable.
    Scoping ensures that the branch not taken will be left hanging.
    How would this language be implemented in practice? A simple prototype would obviously dynamically schedule all of the processes and use a GC to get rid of dead processes. But what about a practical, medium to high performance implementation? Processes would likely be scheduled, to a degree, statically by the compiler, but how exactly? I.e. what sort of transformations should be done to produce efficient target code?

  7. Have you ever looked at fortress? (some of) the control structures are defined to be in parallel, for example:

    for x <- 1:10
    print x
    end

    would work (potentially) in parallel, but

    for x <- seq(1:10)
    print x
    end

    would always be in sequential order.

  8. Markk:
    π-calulus is rendezvous based. So there’s no notion of a message queue, or of ordering in message delivery. So any matching pair of a process waiting to recieve on a particular channel, and a a process waiting to send on that same channel can be selected, and the matched send and recieve will be executed simultaneously. So both sends and recieves block until they’re selected for a rendezvous.
    So a process can’t send two messages the way you suggested. The first message blocks and doesn’t get sent until there’s a rendezvous with a reciever. There’s no way for a process to have two outstanding sequentialized sends.

  9. Flaky:
    The two kinds of control structures *could* be merged; but I think that there’s a worthwile distinction between local process-internal control flow, and externally visible inter-process control flow. Message passing always involves synchronization behaviors; I don’t think that tangling synch behaviors into simple local control flow is a good idea.

  10. MaxPolun:
    The parallel control structures of languages like Fortress, X10, and friends are built for expressing massively parallel computation. (I’ll mostly mention X10 from here on; X10 and Fortress were developed under the same research grant program, but I’m more familiar with X10, because I used to work with the people who were building it.) That’s not what I’m trying to do with Pica. I’m trying to look at smaller scale concurrency distribution.
    So the focus of X10 type languages is really expressing what things can be done at the same time, and leaving the synchronization and communication behaviors to be computed by the compiler in order to maximize the specified parallelism; the focus of Pica is to specify the synchronization and communication behaviors of small systems.
    The result of those different foci is that things like parallel “for” loops are just about the most important construct for X10 type things. But for Pica, message passing and syncronization are the most important things. For Pica, it’s OK to run a bit slower if it makes the synchronization and message passing behaviors more clear and less error-prone; for X10, it’s OK to make the synchronization and message passing behaviors arbritrarily ugly if that makes the program run a little bit faster.
    I don’t really think that the X10 parallel loops really make sense for Pica.

  11. I’ve been working with parallel processing systems for only about 6 years, and I’ve not used Erlang, but I have used MPI, wrote an MPI work-alike for Python, drooled over C-Linda, and have written a Linda work-alike for Python.
    The mechanisms for Pica you are describing sound very much like a Linda tuple space; which is fine, tuple spaces are very convenient. Though what I found when writing a Linda work-alike in Python is that when actually using it, one usually ends up disambiguating functions that take identical arguments; like int,int->int for any two-argument function on integers (add, subtract, multiply, divide, exponentiate, etc.) by prefixing everything with an intended operation like (‘add’, int, int) for the pattern to match, (‘add’, 4, 5) for the actual ‘call’, and (‘addresult’, 9) for a result.
    What I realized is that what I had ended up doing with a tuple space system is basically create a large set of queues defined on a tree for a minimally-specified RPC. Which made me realize that tuple spaces, at least as far as I could make use of and conceive of them, are a lazy RPC dispatch mechanism. You toss arguments to a function into a queue somewhere, later someone pulls those arguments, executes the function, and maybe puts some “results” back into the space.
    In the last couple months, I have been turned onto the Processing package for Python by its author, Richard Oudkerk. While I haven’t had the chance to use it for anything yet, it seeks to offer the same kind of abstraction over processes as the Threading module has offered over threads. Interprocess communication is handled with data structures synchronized by a managing process, dict, list, Queue, and locking primitives among them. While it’s not as pure as a tuple space, or your Pica, it seems to have all of the primitives necessary for many distributed processing and message passing tasks. You may want to look at it.

  12. I have limited knowledge on this topic, but one thing that jumped out when reading your description was that Pattern Matching over the entire message would likely be a huge bottleneck. What about having the message itself, then having a separate field that’s an Action field that would be passed through the Pattern Matching of each Proc. In general, following a “Here’s the data, and here’s what to do with it” mentality.

  13. MarkCC, I don’t quite see your point on the control structures. If I’ve understood your proposed language correctly, then, at least, the if/then/else structure implemented the way I suggested would not change the semantics of the language, would it? The syncing would occur in local processes in a way that would never be directly visible to the user. Since function calls are going to be implemented as syntactic sugar, why not these control structures? Would there be some fundamental difference? Implementing them as syntactic sugar would spare fingers and make the semantics easier to analyze, would it not?

  14. Iteration is even easier. We’ve already seen the solution there: recursion, exactly the way that I demonstrated in some of the examples in earlier articles.

    Is it wise to rigidly enforce this? For example, will iteration over any length n array necessarily entail calling n processes? This would be a serious drawback for any numerical parallel code.

  15. I strongly support Mark’s observation about this performance questions.
    I’ve seen many posts concerning performance issues, talking about CPUs, memory, etc… But what I think is really important is a language capable of well handling and expressing the semantics of concurrency and distribution.
    Not that performance isn’t an important issue, but I think we should leave that for the hardware guys for now and focus on the semantic issues of this new language.

  16. OMF:
    You’re giving me horrible flashbacks of my grad school days. I did work on non-numerical applications of parallelism for general purpose code. *Every* paper that I submitted always came back with comments from reviewers saying “But that won’t work for numerical arrays!”. It got to the point where I think the last paper I submitted included something resembling the wording “while this clearly will not be appropriate for parralel computation over numerical arrays, that is not our intended application domain” no fewer than 8 times in a 12 page paper. And that paper still came back with a rejection because “This work is clearly not applicable to the paralellization of numerical arrays”.
    Pica will not be a good language for numerical parallel code. I’m not trying to design something that will work well for parallelizing numerical arrays. I’m trying to design something that’s good for basic everyday multithreaded applications like the ones that I use and write for my Mac. For those applications, the question isn’t “how much parallelism can I squeeze out of this system?”, but “How do the pieces of this system interact with other things?”
    High performance parallel numerical code has a very specific set of requirements, and a good language for HPP numerical code will have to be designed with those requirements in mind. And those requirements are very different (and in some places incompatible with) the goals that I have in mind for Pica.

Leave a Reply