I promised my friend Toby that I’d link to one of his blog posts. Toby is one of the SREs that I work with at twitter, and let me tell you, you always want to stay on the good side of the SREs!
And to introduce it, I actually get a chance to talk about what I really do!
Since last July, I’ve been working for twitter. I haven’t yet gotten to talk about what it is that I do at twitter. For obvious reasons, I think it’s absolutely fascinating. And since we just recently released it as open-source, there’s no reason to keep it secret anymore. I work on something called Aurora, which is a framework for Mesos.
Let’s start with Mesos.
When you run a web application like Twitter, you can’t run your application on a single computer. There’s a bunch of reasons for that, but the biggest one is that there’s not a computer in the world that’s big enough to do it; and if there was, depending on one epically massive computer would be incredibly unreliable. Instead, what we do is set up a data center, and fill it up with thousands upon thousands of cheap machines. Then we just use all of those thousands of little computers. These days, everyone does things that way. Twitter, Google, Facebook, Amazon, Microsoft, Apple – we all run on gigantic clusters of cheap machines in a datacenter.
But there’s a problem there. How do you use a thousand computers? Much less 10,000 or a million, or more?
What you do is build a substrate, called a cluster management system. You write a program, and you say that you want to run 1,000 copies of it, and hand it to the cluster manager substrate. The substrate takes care of figuring out where to put those processes, and telling the specific individual machines that it selects what to do to run your program.
One of the hardest problems in a cluster management system is figuring out how to assign programs to machines. Depending on what a particular program needs to do, its requirements can vary enormously. Running a quick map-reduce has one set of requirements. Running a reliable service – a service that can survive if part of a datacenter loses electricity – has a different set of requirements.
One approach – the one used by Google when I worked there – was to define a constraint language that ultimately had hundreds of possible parameters, and then treat it as a classical optimization problem, trying to optimize over all of it. The good side of that is that it meant that every program at Google could express its requirements exactly. The bad side of it was that the constraint system was incredibly complicated, and almost no one could predict what results a given set of constraints would produce. Configuration turned into a sort of game, where you’d make a guess, look at what the borg gave you, and then modify your constraint – repeat until you got something satisfactory.
At Twitter, Mesos is our cluster management system. It takes a completely different approach. Basically, it takes the borg-like approach and turns it upside down. Mesos itself doesn’t try to do constraint satisfaction at all. What it does is talk to components, called frameworks, and it offers them resources. It basically says “Here’s all of the possible ways I can give you a fair share of resources in the clusters. Tell me which ones you want.”.
Each framework, in turn, figures out how to allocate the resources offered to it by Mesos to individual jobs. In some ways, that might seem like it’s just deferring the problem: don’t the frameworks end up needing to do the same allocation nightmare? But in fact, what you do is write a framework for a particular kind of job. A framework needs to work out some constraint satisfaction – but it’s a much simpler problem, because instead of needing to satisfy every possible customer with one way of expressing constraints, each framework can decide, on its own, how to allocate resources to one particular kind of job. Map-reduce jobs get one framework that knows how to do a good job scheduling map-reduces. Services get a framework that knows how to do a good job allocating resources for services. Databases get a framework that knows how to do a good job allocating resources for storage and bandwidth intensive processes. As a result, the basic Mesos cluster manager is dramatically simpler than a one-size-fits-all scheduler, and each of the component frameworks is also much simpler.
I don’t work on Mesos.
I work on Aurora. Aurora is a framework that runs under Mesos. It’s specialized for running services. Services are basically little server processes that run a lot of identical copiesfor a long time. Things like web-servers – where you need a ton of them to support millions of users. Or things like common processes that live behind the web-server answering requests for particular kinds of information.
At Twitter, all of the services that make up our system run in our datacenters on top of Mesos. They’re scheduled by Aurora.
With Aurora, running a service is incredibly easy. You write a configuration:
my_server_task = SequentialTask( processes = [ Process(name='stage_binary', cmdline='hadoopfs-copy /glorp/foo/hello_service .'), Process(name='run_service', cmdline='./myservice') ], resources = Resources(cpu=1.0, ram=128*MB, disk=128*MB)) jobs = [ Service(task = my_server_task, cluster = 'mycluster', role='markcc_service', environment = 'prod', name = 'hello', instances=300)]
The configuration says how to install a program and then run it on a machine, once Aurora assigns a machine to it. Aurora will find 300 machines, each of which can dedicate one full CPU, 128MB of memory, and 128MB of disk to the process, and it will run the service on those 300 machines.
Both Aurora and Mesos are open-source, under the Apache license. We’ve got a test environment in the distribution, so that you could be running tests in a virtual Mesos/Aurora cluster one hour from now! And my good friend Toby wrote an excellent blog post on how to set up a working Mesos/Aurora cluster.
I don’t want to toot my own horn too much, but I’m incredibly proud that Twitter open-sourced this stuff. Most companies consider their cluster management systems to be super-proprietary secrets. But at Twitter, we decided that this is a problem that no one should have to solve from scratch. It’s something we all know how to do, and it’s time that people stop being forced to waste time reinventing the same old wheel. So we’re giving it away, to anyone who wants to use it. That’s pretty damned awesome, and I’m proud to be a part of it!