Interesting Parallels: The Leader Election Problem and Notch Receptors

Yesterday at Pharyngula, PZ posted a description of his favorite signaling pathway in developmental biology, the [Notch system.][notch] Notch is
a cellular system for selecting one cell from a collection of essentially indistinguishable cells, so that that one cell can take on some different role.
What I found striking was that the problem that Notch solves is extremely similar to one of the classic problems of distributed systems that we study in computer science, called the leader-election problem; and that
the mechanism used by Notch is remarkably similar to one of the classic
leader-election algorithms, called the [Itai-Rodeh algorithm][ir].
[notch]: http://scienceblogs.com/pharyngula/2006/12/notch.php
[ir]: http://citeseer.ist.psu.edu/itai81symmetry.html
Before I go into detail, I think it’s worth throwing a bit of a dig at
some of the IDist type bozos. This is very much the kind of thing
that IDists look for; they try to find natural systems that strongly
resemble designed systems, so that they can assert that the natural
system must have been designed. But the people allegedly doing ID
research *don’t bother to study the fundamental science*. No one in
ID-land is studying developmental biology, to really understand
the complex signaling pathways, and what I would call the algorithmic
way that they operate. I think it’s quite damning to their arguments
that they don’t study this; but I also think I know *why*. If you
read PZs article, and you starting looking in depth into developmental
pathways, and understanding how they fit together, and how very small
changes in a process can produce drastically different results – well, it
really sort of pulls the carpet out from under the ID argument. You can really see just how these systems evolved in terms of gradual changes. Just look at Notch – how simple the behavior is, how widely it’s used to produce very *different* kinds of differentiation, and how easily the
process can be perturbed to produce different results.


Anyway, the point of this post is to explain the leader election
problem, and show you the Itai-Rodeh solution. The basic problem is
pretty simple. Suppose that you have a bunch of processes running on a network. You need one of the processes to take on a special role: to become a leader. Not only do you need to figure out which process is the leader, but you need all of the processes to agree about who the leader is. The problem is that the processes don’t have any distinguishing
characteristics which can be used by the the processes themselves to decide who should be the leader. How can we elect a single leader that
all of the processes will agree on?
The Itai-Rodeh solution assumes that the processes are arranged in a ring, so that each process can talk to two others: it can *receive* messages from the process before it; and it can *send* messages to the process *after* it. Using nothing but that one-way communication channel around the ring, how can we elect a single leader, so that all of the participants will simultaneously agree on who the leader is?
The IR solution works by performing an election in *rounds*. In the first round, all of the processes are in active contention for the leadership. In each round, we will eliminate some of the processes from contention, turning them into *passive* participants in the process.
At the beginning of each round, every *active* process randomly picks an ID. The processes with *largest* identifier will win the round. But since the IDs are random, there might be more than one process with a given identifier – so the largest identifier, the winner, may be the identifier of *more than one* process. So at the end of a round, if there is more than one process with the winning identifier, then another round is started, with *only* the winners competing.
Eventually, this *should* produce a winner. There’s no guarantee that it
will – you could wind up with an infinite series of elections ending in
ties. Unfortunately, that’s the best we can do: it’s provable that there
is no election procedure for indistinguishable processes that is
guaranteed to terminate with a single winner. Fortunately, we know that
probabilistically, this is likely to wind up with a unique leader in
a very small number of rounds; we just know that we need to be careful
about how we choose a random identifier-generation algorithm, in order to be certain that we don’t wind up with two processes generating identical streams of semi-random numbers for their identifiers.
With the intuitive explanation out of the way, let’s look at how
IR leader election works in detail. We’ve got a set of *n* processes, P1 through Pn. Each process can be either active (meaning it’s still in competition to become the leader), or passive (meaning it’s already lost). The processes are going to select a leader by sending a bunch of simple messages containing four fields to their neighbors: *(id, round, distance, unique)*, where *id* is a numeric identifier; *round* is the number of the round in which the message was generated; *distance* is the number of times the message has been transmitted before being viewed by a receiver; and *unique* is a flag that will be used to tell if there is more than one process with a particular winning identifier in the current round.
At the beginning of each round *r*, each active process Pi randomly selects an identifier, idi, and sends a message *(idi, r, 1, true)*.
When a message is received by a process, the way it’s handled depends on
whether the receiver is active or passive:
* If the receiver Pi is passive, it increments *distance* in the message, and sends the message to the next process.
* If the receiver is active:
* If *distance*=*n* and *unique*=true, then the receiver has won
the round, and there is no other process with the same identifier,
so it is the leader.
* If *distance*=*n* and *unique*=false, then the receiver has won
the round, but there is another process with the same identifier,
so there are at least two winners of this round, so it starts
a new round.
* If *distance*<n and the *id* in the message is the same as
the receivers ID, then another process in the ring has the same
identifier. So it sets the *unique* field of the message to false,
increments the *distance* field, and sends the message to the next process.
* if *distance*<n and the *id* in the message is *smaller* than
the receivers ID, it increments the *distance* field and passes the message on.
* if *distance*<n and the *id* in the message is *larger* than the
receiver’s ID, then the receiver loses. It becomes a passive process, increments the *distance* field, and passes the message to the
next process.
Let’s look at a quick example. Suppose we’ve got five processes arranged in a ring. The following diagram shows one full round:
process-leader.jpg
* Round 1: each process generates a random identifier. P1=3, P2=4, P3=2, P4=1, P5=4. They send messages to their neighbors.
* Distance=1:
* P1 receives (4,1,1,true). Since 4>3, P1 becomes passive, and sends the message (4,1,2,true) for the distance=2 step.
* P2 receives (3,1,1,true). Since 32, P3 becomes passive, and sends (4,1,2,true).
* P4 receives (2,1,1,true). Since 2>1, P4 becomes passive, and sends
(2,1,2,true).
* P5 receives (1,1,1,true). Since 1<4, P5 remains active, and
sends (1,1,2,true).
* Distance=2:
* P1, P3, and P4 are all passive, so they will just send on messages, incrementing the distance.
* P2 receives (4,1,2,true). Since 4=4, it sets unique=false.
* P5 receives (2,1,2,true), and increments the distance and passes it on.
* Distance=3:
* P1, P3, and P4 are passive.
* P2 receives (1,1,3,true), and passes it on.
* P5 received (4,1,3,true). Since 4=4, it changes unique to false, and passes it on.
* Distance=4: everyone just passes messages on.
* Distance=5: P2 and P5 both receive messages (4,1,5,false). So they
start a new round.
* Round 2: Only P2 and P5 are still active. They generate random identifiers, P2=3, P5=1. The messages hop around the loop in the same way as in round on, resulting in P5 becoming passive, and P2 becomes the leader.
Compare this to the biological system that PZ described. It's not the same, but the similarities are interesting. It's really not
that different at all – the active/passive state is very roughly equivalent to the production of delta; the ID magnitude is very roughly
equivalent to the rate of delta production. In the algorithm,
progressively more processes become inactive, shut down by some other process with a larger identifier; in the biological system, progressively more cells stop producing delta, shut down by some other cell which is a more aggressive delta producer. It's really a fascinating parallel.

0 thoughts on “Interesting Parallels: The Leader Election Problem and Notch Receptors

  1. John Armstrong

    The one thing that still nags at me about this is that some global information is required: the number of nodes in the loop. Talking only to your neighbords is pretty local, but there’s still this annoying global information requirement. Is there a modification so that the nodes don’t need to know how many others there are in the loop?
    Also, let’s say I’m the winner of the whole thing. Now I know I’m the leader because I got my own number back with “unique” set to true. How do the other nodes know I’m the leader?

    Reply
  2. Mark C. Chu-Carroll

    John:
    There are some algorithms that don’t need to know the number of nodes, but it’s harder – because it’s possible for a node to accidentally get ahead, into the second round before all participants had actually finished the first round. It gets messy. ūüôā
    The other nodes know who the leader is because by the time the leader receives the message that lets it know it’s won, the message with the maximum ID has passed through *every* node in the ring, and each node knows the distance to the winner because of the distance value in the message containing the winners ID. So everyone knows who won.

    Reply
  3. Coin

    Is the “ring” really how the communication pathways in a distributed computing environment tend to work? Does this algorithm generalize to more complicated network topologies, so to speak?

    Reply
  4. Mark C. Chu-Carroll

    Coin:
    There are many cases where this leader election algorithm (or one closely related to it) it used. It depends, as many of these things do, on the topology of the network.
    For example, the token ring network used something very much like this. The token ring made more effective use of network bandwidth than most current networking systems. The way that
    it worked was that the network was arranged in rings; within a ring, only one system had permission to transmit data; it was the system that held the *token*. When passing a message on from one system to the next, a system could add a bit specifying that it wanted the token. The token holder would pass the token on when it was done with it. On startup, or when the ring topology changed (either a ring split, or machines exited or entered) they’d run a leader election around the ring to decide where the start the token.
    Some of the high performance fiber networks use something like this today. (Ethernet, which most of us use, is based on something called an Aloha protocol, which doesn’t have any token, and doesn’t need to do anything like leader elections, but aloha style networks rarely achieve better than 1/2 utilization of the theoretical potential bandwidth of the network.)

    Reply
  5. billb

    John: Since there are a number of O(n) operations already occurring in this algorithm, you can determine n at the cost of only one extra round (at least proabalistically). Since this algorithm is only terminates at each round with some probability, there’s no obvious harm in determining n to some confidence. That being said, I can’t think of a case of a network where the members are truly indistinguishable, but maybe Mark’s answer to my question below will give us a clue as to why this algorithm would ever be necessary.
    Coin: It depends largely on what you mean by “distributed computing envrionment[s]”. There appear to be low-level networks were leader-election is important. Mark’s TokenRing example is canoncial. On the other hand, at a much higher level, modern distributed memory supercomputers (read: large clusters) are largely programmed with MPI (the underlying hardware being abstracted away) where leader election isn’t required. Every participating process of a group in an MPI-based program is distinguishable from the others because MPI assigns every process a rank at start up. The ordering is arbitrary, and the processes are, generally speaking, otherwise indistinguishable, but MPI does rank them. In such a case, if one process does need to lead some task, all processors can make a unique choice of who is to do it (e.g. the processor with least rank, etc.).
    MarkCC: I don’t know a lot about leader-election in practice. Can you give some pointers to some more modern examples?

    Reply
  6. JST

    Wow. This post is really great. I’d never heard of any of this stuff and now I feel like I’ve at least grasped the basics of a Brand New Thing. Thank you, Mark. And bravo for the excellent questions John Armstrong and Coin.

    Reply
  7. charlie wagner

    Reposted from Pharyngula:
    “The simple flaw in your thinking and in Chu-Carroll’s thinking is that ALL algorithms are the product of intelligent input. Whether they are designed, or whether they “evolved” an intelligence was required.
    What you need to demonstrate is an algorithm that has emerged WITHOUT the benefit of intelligent input. Now that would be something!

    Reply
  8. Torbjörn Larsson

    EVERYBODY expects the Wagnerian Inquisition! Our chief weapon is repetition…

    Star formation. Water cycle. Evolution. Snow flakes. … Look, as always this gets boring fast.

    Reply
  9. charlie wagner

    Alon Itai (http://www.cs.technion.ac.il/~itai/)
    and
    Michael Rodeh (http://www.haifa.il.ibm.com/rodeh.html)
    “Given a ring of n processors it is required to design the processors such that they will be able to choose a leader (a uniquely designated processor) by sending messages along the ring. If the processors are indistinguishable then there exists no deterministic algorithm to solve the problem. To overcome this difficulty, probabilistic algorithms* are proposed. The algorithms may run forever but they terminate within finite time on the average.”
    *A randomized algorithm or probabilistic algorithm is an algorithm which employs a degree of randomness as part of its logic. In common practice, this means that the machine implementing the algorithm has access to a pseudo-random number generator. The algorithm typically uses the random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the “average case”.
    How is this different from “assign each processer a random number and then randomly choose one of these numbered processors to be the leader”?

    Reply
  10. charlie wagner

    Alon Itai (www.cs.technion.ac.il/~itai/)
    and
    Michael Rodeh (www.haifa.il.ibm.com/rodeh.html)
    “Given a ring of n processors it is required to design the processors such that they will be able to choose a leader (a uniquely designated processor) by sending messages along the ring. If the processors are indistinguishable then there exists no deterministic algorithm to solve the problem. To overcome this difficulty, probabilistic algorithms* are proposed. The algorithms may run forever but they terminate within finite time on the average.”
    *A randomized algorithm or probabilistic algorithm is an algorithm which employs a degree of randomness as part of its logic. In common practice, this means that the machine implementing the algorithm has access to a pseudo-random number generator. The algorithm typically uses the random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the “average case”.
    How is this different from “assign each processer a random number and then randomly choose one of these numbered processors to be the leader”?

    Reply
  11. Curt Sampson

    Oh, dear; this myth is still being propagated, even after being busted almost twenty years ago.
    Theoretically, Token Ring has only a very slight advantage over well-designed CSMA/CD protocols such as Ethernet, and in practice it’s often worse. (Measured as what percentage of the nominal capacity of the link could actually be used. In real systems, Token Ring of any particular technology generation was always much, much slower than Ethernet of that same generation due to a lower raw data transfer rate.)
    Anyway, here’s the paper, from 1988: http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-88-4.pdf

    Reply
  12. Coin

    How is this different from “assign each processer a random number and then randomly choose one of these numbered processors to be the leader”?

    That’s essentially an accurate summary of what the algorithm does, yes. The only reasons why the algorithm appears any more complicated is that (1) “randomly choose one of these numbered processors” is not entirely straightforward in the distributed environment and (2) we have to do that random choosing in such a way that after the leader has been randomly chosen, the other processors all know who the chosen processor was and how to find it.

    Reply
  13. Damaru

    Fibre Channel Arbitrated loop systems use something simillar to this algorithm to select a node to be the loop master.

    Reply
  14. AJS

    Token Ring works by everybody speaking in turn, but only when they have The Conch (see Lord of the Flies); and even if you don’t want to say anything you still have to collect The Conch and pass it to the next person. So, even when everyone is silent, there is an overhead due to passing The Conch around.
    Ethernet works more like the way people talk at a dinner party. Any party can just say something anytime; and if they can’t hear their own voice clearly because somebody is talking over the top of it, both speaking parties shut up for a random period of time before speaking again.
    There’s no overhead due to Conch-passing. The more people are in the room, the less time anybody has to talk, in theory; but a modern router can store up packets and slip them into periods of silence.

    Reply
  15. Jonathan Vos Post

    When a computer built and programmed by humans beats a human at chess, who is the actual winner? How about if the hardware and software evolved in a design computer, long before the game?
    Your statement, above, will probably lead you to heresy and (so far as you probably believe in it) Hell.
    I say this as the first person to have used the Evolutionary Algorithm (when I was beta-testing John Holland’s famous book, while in grad school 1973-1977) to evolve an answer to a previously published unsolved problem in science.
    If God created, by Intelligent Design, a cosmic system that allowed Darwinian (Neodarwinian) evolution wherever the thermodynamic environment was favorable, then God in your system becomes not a Creator but a Meta-Creator. The cosmic system that allows Darwinian activity is what was called “a demiurge” in theology.
    That’s one step away from believing that God has delegated mere terrestrial events to Satan, the highest-ranking bureaucrat/coprocessor. That’s a version of the Manichaen Heresy.
    You’d better start praying (or programming) for forgiveness.
    [This Homecosmos Security message brought to you by the Department for Theomathematics]

    Reply
  16. Jonathan Vos Post

    Let us not forget Artificial Selection.
    “… More than a century later, Charles Darwin wrote that, unlike livestock, humans had never been forcibly bred for select characteristics, “except in the well-known case of the Prussian grenadiers….”
    Little Big Man
    Getting over our obsession with height.
    By David Wallace-Wells

    Wasn’t the whole Noah’s Ark story not only interesting naval architecture, but also proof that God performed Artificial Selection, thus implicitly acknowledging the mechanism of Natural Selection?
    Having the animals march in “two by two” was an early demonstration of the construction of “2” by categorical means, natural transformations, and isomorphism.
    Larry Niven wrote about humans selectively breeding for Luck, and Robert A. Heinlein wrote about selectively breeding humans for longevity [cf. Methuselah].

    Reply
  17. Troublesome Frog

    Mark,
    I know it’s a bit off topic, but when I read the subject of your post, I started thinking about the seemingly endless supply of “instant runoff” style election algorithms. When I started looking into the concept of IRV, I thought that it seemed like a simple enough problem to solve. Doing a little research, I found out that it’s fraught with interesting algorithmic gotchas like Condorcet’s paradox, gaming the system through tactical voting, and failure of monotonicity. Your title got me thinking about it again, and I was wondering if you, as an algorithms guy had any thoughts on the topic. It seems like a far richer and more interesting subject than the causal observer might think.

    Reply
  18. Jonathan Vos Post

    “I’d like the Hanging Chad, in garlic butter, Broiled Arrow’s Theorem, and a bottle of the ’99 Condorcet.”
    When Voting Theory professors and postdocs have a dinner out at a local restaurant-bar (perhaps “The Rotating Vector” venue from Gravity’s Rainbow), how do they vote on who has to pay the check?

    Reply
  19. Jonathan Vos Post

    I urge that you mention the very mathematical notion in developmental biology: the EIGENGENE.
    One place to start is:
    http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=27718
    Proc Natl Acad Sci U S A. 2000 August 29; 97(18): 10101-10106.
    Copyright © 2000, The National Academy of Sciences
    Genetics
    Singular value decomposition for genome-wide expression data processing and modeling
    Orly Alter,*† Patrick O. Brown,‡ and David Botstein*
    Departments of *Genetics and ‡Biochemistry, Stanford University, Stanford, CA 94305
    †To whom reprint requests should be addressed. E-mail: orly@genome.stanford.edu.
    Abstract:
    We describe the use of singular value decomposition in transforming genome-wide expression data from genes √ó arrays space to reduced diagonalized “eigengenes” √ó “eigenarrays” space, where the eigengenes (or eigenarrays) are unique orthonormal superpositions of the genes (or arrays). Normalizing the data by filtering out the eigengenes (and eigenarrays) that are inferred to represent noise or experimental artifacts enables meaningful comparison of the expression of different genes across different arrays in different experiments. Sorting the data according to the eigengenes and eigenarrays gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype, respectively. After normalization and sorting, the significant eigengenes and eigenarrays can be associated with observed genome-wide effects of regulators, or with measured samples, in which these regulators are overactive or underactive, respectively.
    =====
    This year’s Proceedings of the (USA) National Academy of Science give papers that, to me, confirm the existence of and importance of eigengene analysis. One survey article in PNAS compares the state of biology (with gene chips and FFT) to that of Astronomy (at the time of the invention of the telescope).
    As a mathematical biologist by undergrad and grad school research, before being led astray into industry, I am very excited by this. Pure Math, and Computer Science, advancing us asymptoticaly towards “the secret of Life!”

    Reply
  20. arensb

    There are some algorithms that don’t need to know the number of nodes, but it’s harder

    I was going to say that in practice, you may not know the number of nodes (e.g., you have a bunch of fileservers with replicated data; a power outage takes out a certain number of them; the remaining ones need to elect one of their number to become the new master fileserver from which the others will get their data). But in practice, the nodes aren’t identical: there’s always something like the IP address, host identifier, MAC address, process ID, etc., so it’s probably easier in that case to use a rule like “the candidate with the lowest IP number wins”.

    Reply

Leave a Reply