# The Fedex Problem

Over the weekend on twitter, someone tweeted a link to a really fun-looking paper on arXiv, called The Fedex Problem. It’s a really interesting paper, because it takes a real-world problem, analyzes it mathematically, and shows some of the fascinating complexity that lies under the surface of something that seems simple. (I can’t find the original tweet, so I don’t know who first shared it – sorry!)

The problem that they looked at involves the logistics of rapid delivery by services like Federal Express.

When Fedex was founded, they built their overnight delivery business on a new model of how to handle deliveries, called a hub system.

What most delivery businesses traditionally did was take packages in at a local warehouse, and sort them into groups depending on the warehouse that serviced their destination. Each group was then shipped separately, from the source warehouse to the destination warehouse. To do overnight delivery, each warehouse would need a full package-sorting system, and would need to send a shipment to every other warehouse every single day. For $N$ warehouses, that means they needed $N$ full sorting systems, and roughly $N^2$ shipments every day. The costs of the redundant sorting systems and the massive number of shipments are huge, and completely non-scalable. Imagine a system with just 100 local warehouses – that’s only 2 for each state! – and you’d need 10,000 shipments per day. In practice, the number of FedEx warehouses just in the US is considerably larger than 100: it simply can’t work.

What Fedex chose to do instead was build a hub system. Every package that arrives at a local warehouse gets shipped to the same place, called the hub. The only sorting system is at the hub – individual local warehouses don’t sort, they just ship to the hub. The hub sorts according to destination warehouse, and then ships the packages to the appropriate warehouse for delivery.

The Fedex hub system solved both of the problems of the traditional approach. The individual local shipping points don’t need to do sorting: they just ship to the hub, which takes care of all of it. And since every package goes to the hub, instead of needing $N^2$ shipments, they only needed $2N$ shipments per day: $N$ shipments of unsorted packages from local warehouses to the hub, and $N$ shipments of sorted packages from the hub to the local warehouses for local deliveries.
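The arithmetic above can be sketched quickly. This is my own illustration (the function names are mine, not the paper’s); note that the point-to-point count is exactly $N(N-1)$ ordered pairs, which the $N^2$ figure rounds up:

```python
# Compare daily shipment counts for point-to-point vs. hub routing.

def point_to_point_shipments(n: int) -> int:
    # Every warehouse ships to every other warehouse: N * (N - 1)
    return n * (n - 1)

def hub_shipments(n: int) -> int:
    # Each warehouse ships to the hub and receives from it: 2 * N
    return 2 * n

for n in (10, 100, 1000):
    print(n, point_to_point_shipments(n), hub_shipments(n))
```

At 100 warehouses the hub system already needs fifty times fewer shipments, and the gap widens linearly as the network grows.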

How well this system works is very dependent on the proper location of the hub. To demonstrate why, consider the worst possible hub location for American shipments: Hawaii. Now, every package from anywhere in the mainland to anywhere else in the mainland needs to travel at least an extra 2500 miles to the hub, and an extra 2500 miles back! The cost of shipping is dominated by fuel costs – adding any extra distance to the average shipment costs fuel, and thus money. So the location of the hub has a dramatic effect on shipping costs.

FedEx located their hub in Memphis, Tennessee, because according to its founders, Memphis was close to the “center” of the shipping region for its original target market cities, and thus minimized the total shipping distances. (That wasn’t their sole concern; another factor was that Memphis is in an area that is rarely incapacitated by weather or other natural disasters – but the travel distance was the biggest factor.)

What this paper does is take the Fedex problem, and generalize it. What, exactly, does it mean for a given point to be an ideal hub? Given a network of points in a Euclidean space, how can you compute the ideal hub? And finally, if we look at the actual Federal Express distribution system, using great-circle routing between points, how close is Memphis to being the ideal hub?

If you look at the problem naively, it seems like it should be simple. The ideal hub should just be the “average” location. That’s basically what the FedEx founders said – they picked the geographical center of their original markets. Unfortunately, reality is rarely kind enough to make things fit our intuitions, and that’s not quite right.

To see why, let’s first look at the definition of the ideal hub. What we want to do is to minimize the total distance from local shipping points to the hub. To keep the notation simple, we’ll write the problem down as if we were looking at the one-dimensional version of the problem. Each location is a single value, and we’ll write the distance between two points $a$ and $b$ as the absolute value of $a-b$: $\left|a-b\right|$.

If the $n$ points in the network are called $x_1 \dots x_n$, then the total distance value for a hub at $x$ is given by the function:

$f(x) = \sum_{i=1}^{n} 2\left| x - x_i \right|$

Since the 2 is a constant factor that affects all $x$ values equally, we can drop it for the purposes of simplicity. So the problem of selecting an ideal hub ultimately reduces to an optimization problem: find the value $x$ where $f(x)$ is a minimum.

Again, naively, it seems as though the average is best. But we can see that it’s not with a simple example. Let’s just look at a one-dimensional case – a list of locations scattered along a line: {1, 2, 3, 5, 13, 14, 60}. The point that’s at the average is 14, with a total one-way travel distance of 13 + 12 + 11 + 9 + 1 + 0 + 46 = 92. But by doing some brute-force, we can see that the total distance for $x=5$ is just 4 + 3 + 2 + 0 + 8 + 9 + 55 = 81. So the mean isn’t the optimal hub location. In fact, you can prove that in the one-dimensional case, the optimal hub is the median location, not the mean.
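The one-dimensional example is easy to check by brute force; here’s a quick sketch (my own code, not from the paper):

```python
# Brute-force check that the median, not the mean, minimizes total
# one-way travel distance in the one-dimensional hub problem.

def total_distance(hub, points):
    return sum(abs(hub - p) for p in points)

points = [1, 2, 3, 5, 13, 14, 60]

mean = sum(points) / len(points)           # 14.0
median = sorted(points)[len(points) // 2]  # 5

# Try every network point as a candidate hub, keep the best.
best = min(points, key=lambda x: total_distance(x, points))

print(total_distance(mean, points))    # 92.0
print(total_distance(median, points))  # 81
print(best)                            # 5
```

The candidate minimizing total distance is the median, 5, exactly as the example claims.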

In this example, the average isn’t far off – but even in a quick off-the-cuff example, you can see that average isn’t the answer: the mean can be quite different from the median, particularly when you’ve got a lot of clustering, or a couple of outliers. When you extend to more than one dimension, then things can get bad pretty quickly.

The general problem has been looked at in many forms. It’s known, variously, as the Fermat-Torricelli problem, the Weber problem, or the single facility location problem.

Unfortunately, getting an answer to the problem is a lot harder than just naming it. According to the authors of this paper, even the existence of a unique ideal hub isn’t a trivial fact. Despite all of the people who’ve worked on the problem, the authors couldn’t find a single complete proof of it! So they go ahead and provide one. They prove that:

For any finite set of non-collinear points in any Euclidean space $R^N$,
there is an ideal hub, located within the convex hull of the set of points.

The convex hull qualifier there deserves a bit of explanation. It’s an interesting idea that ends up coming up in a bunch of different mathematical contexts. The easiest way to explain it is by the easiest method of finding the convex hull in two dimensions: suppose you’ve got a bunch of points on a plane. Draw the plane on graph paper, and put a nail at the location of each point. Take a rubber band, and stretch it so that every nail on the graph is inside of the rubber band. Let go. The rubber band will end up touching some of the nails; some will be inside, but untouched. The rubber band is the convex hull of the points. It’s basically a general notion of the smallest blob that contains all of the points in the set. The authors present a short, clean proof that the ideal hub location will always be within the “blob” of the points.
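The rubber-band picture translates directly into code. Here’s a minimal two-dimensional convex hull (Andrew’s monotone chain algorithm – my choice of method, not the paper’s):

```python
# Convex hull of 2D points: the "rubber band" around a set of nails.
# Hull vertices are returned in counter-clockwise order; points strictly
# inside the band are dropped.

def cross(o, a, b):
    # z-component of (a - o) x (b - o); > 0 means a left turn
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:  # build the lower chain left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):  # build the upper chain right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

# The interior nail (2, 1) stays inside the rubber band:
print(convex_hull([(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]))
```

The four corners come back as the hull; the interior point never touches the band.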

The paper goes into some depth in discussing some simple cases of the ideal hub, and how they can be calculated geometrically. But for the general problem, the geometric approaches don’t work. To get the ideal hub for a large collection of points, they’re left with brute force. In the brute force method, you just take every point $x$ in the network, and compute its $f(x)$ value. Since it’s a finite network, you’ll eventually find the ideal hub. (You can, of course, apply some nice heuristics to simplify the problem: the set of candidates that might be optimal hubs is going to be a lot smaller than the set of all points in the network!) But still, computing the optimal hub is quite expensive – $O(N^2)$ in the number of points in the network.
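A minimal sketch of the brute-force approach (my own code, with made-up sample points): evaluating $f$ at one candidate sums $N$ distances, and doing that for all $N$ candidates gives the $O(N^2)$ cost.

```python
# Brute-force hub selection: evaluate the total-distance function f(x)
# at every candidate point in the network and keep the minimum.
import math

def f(hub, points):
    # total distance from every network point to the candidate hub
    return sum(math.dist(hub, p) for p in points)

def best_hub(points):
    # O(N^2): N candidates, each requiring N distance computations
    return min(points, key=lambda x: f(x, points))

network = [(0, 0), (1, 0), (0, 1), (5, 5), (1, 1)]
print(best_hub(network))  # (1, 1)
```

For the real computation the paper uses great-circle distances between census tracts rather than plane distances, but the search structure is the same.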

After all of that work, where is the ideal hub? How different is it from the hub that Federal Express actually uses?

As it turns out, Fedex isn’t too far off, but it’s definitely non-optimal. Using US census data, combined with great-circle distances between the census tracts, the ideal hub is in central Indiana. FedEx’s hub is 315 miles off, to the south-south-west of the ideal location.

Interestingly, when UPS (probably FedEx’s biggest rival) set up a hub system, they put their hub in Louisville, Kentucky – just 85 miles away from the ideal hub.

To me, the most interesting thing about the paper comes after they’ve found the ideal hub by brute force. If you’ve got more than four points – even just five! – in a two-dimensional space, there’s no simple geometric solution to find the ideal hub. It seems like it should be simple, but it isn’t. The proof of that is fascinatingly simple: it comes down to group theory and symmetry. Read the paper for the details. A geometric solution is limited by the mismatch between the dimensionality of a particular one-dimensional rotational symmetry group and the number of points in the network. As a result, for 4 or fewer non-collinear points, there’s a way of finding a single unique geometric configuration that can be exploited to characterize the ideal hub. For more than 4 points, there isn’t – there’s no single configuration anymore, which blows away the usefulness of geometric solutions.

If you find this at all interesting, then it’s worth downloading and reading the paper. There’s a whole lot there that I haven’t covered – more than I have covered, in fact. But I hope this little taste is enough to pique your curiosity.

# Clarifying Groupoids and Groups

This post started out as a response to a question in the comments of my last post on groupoids. Answering those questions, and thinking more about the answers while sitting on the train during my commute, I realized that I left out some important things that were clear to me from thinking about this stuff as I did the research to write the article, but which I never made clear in my explanations. I’ll try to remedy that with this post.

# More Groupoids and Groups

In my introduction to groupoids, I mentioned that if you have a groupoid, you can find
groups within it. Given a groupoid in categorical form, if you take any object in the
groupoid, and collect up the paths through morphisms from that object back to itself, then
that collection will form a group. Today, I’m going to explore a bit more of the relationship
between groupoids and groups.
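Before going further, here’s a tiny concrete sketch of the construction I described (my own illustration – the encoding is mine, not standard notation): take the groupoid whose objects are finite sets and whose morphisms are bijections. The loops at the set {0, 1, 2} are exactly its permutations, and brute-force checking confirms they form a group (in fact $S_3$).

```python
# Loops at one object of a groupoid form a group: the morphisms from
# the set A = {0, 1, 2} back to itself are permutations, and we verify
# the group axioms by brute force.
from itertools import permutations

A = (0, 1, 2)

# Every loop at A is a bijection A -> A, represented as a dict.
loops = [dict(zip(A, p)) for p in permutations(A)]

def compose(g, f):
    # path composition: first follow f, then g
    return {x: g[f[x]] for x in f}

identity = {x: x for x in A}

closed = all(compose(g, f) in loops for f in loops for g in loops)
has_inverses = all(
    any(compose(g, f) == identity for g in loops) for f in loops
)
print(len(loops), closed, has_inverses)  # 6 True True
```

Six loops, closed under composition, every loop invertible: the symmetric group on three elements.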

Before I get into it, I’d like to do two things. First, a mea culpa: this stuff is out on the edge of what I really understand. My category-theory-foo isn’t great, and I’m definitely
on thin ice here. I think that I’ve worked things out enough to get this right, but I’m
not sure. So category-savvy commenters, please let me know if you see any major problems, and I’ll do my best to fix them quickly; other folks, be warned that I might have blown some of the details.

Second, I’d like to point you at Wikipedia’s page on groupoids as a
reference. That article is quite good. I often look at the articles in Wikipedia and
MathWorld when I’m writing posts, and while Wikipedia’s articles are rarely bad, they’re also
often not particularly good. That is, they cover the material, but often in a
somewhat disorganized, hard-to-follow fashion. In the case of groupoids, I think Wikipedia’s
article is the best general explanation of groupoids that I’ve seen – better than most
textbooks, and better than any other web-source that I’ve found. So if you’re interested in
finding out more than I’m going to write about here, that’s a good starting point.

# Capturing More Symmetry using Categories: Groupoids

Today’s entry is short, but sweet. I wanted to write something longer, but I’m very busy at work, so this is what you get. I think it’s worth posting despite its brevity.

When we look at groups, one of the problems that we can notice is that there are things
that seem to be symmetric, but which don’t work as groups. What that means is that despite the
claim that group theory defines symmetry, that’s not really entirely true. My favorite example of this is the fifteen puzzle.

The fifteen puzzle is a four-by-four grid filled with 15 tiles, numbered from 1 to 15, and one empty space. You can make a move in the puzzle by sliding a tile adjacent to the empty
space into the empty space. In the puzzle, you scramble up the tiles, and then try to move them back so that they’re in numerical order. The puzzle, in its initial configuration, is shown to the right.

If you look at the 15 puzzle in terms of configurations – that is, assignments of the pieces to different positions in the grid – so that each element describes a single tile-move in a configuration, you can see some very clear symmetries. For example, the moves that are possible when the empty space is in any corner are equivalent to the moves that are possible when it’s in any other corner. The possible moves when the space is in any given position are the same except for the labeling of the tiles around them. There’s definitely a kind of symmetry there. There are also loops – sequences of moves which end in exactly the same state as the one in which they began. Those are clearly symmetries.

But it’s not a group. In a group, the group operation must be total – given any pair of values x and y in the group, it must be possible to combine x and y via x+y. But with the 15 puzzle, there are moves that can’t be combined with other moves. If x = “move the ‘3’ tile from square 2 to square 6”, and y = “move the ‘7’ tile from square 10 to square 11”, then there’s no meaningful value for “x+y”; the two moves can’t be combined.
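The partiality is easy to see in code. Here’s a sketch using my own encoding (a state is a 16-tuple with 0 for the blank; a move names the tile to slide): a move applies only when its tile sits next to the blank, so composing two arbitrary moves is not always defined.

```python
# A 15-puzzle move is a *partial* operation: it applies only to states
# where the named tile is adjacent to the blank. That's why the moves
# don't form a group.

def apply_move(state, tile):
    """Slide `tile` into the blank (0); return None if the move is illegal."""
    s = list(state)
    t, b = s.index(tile), s.index(0)
    r_t, c_t, r_b, c_b = t // 4, t % 4, b // 4, b % 4
    if abs(r_t - r_b) + abs(c_t - c_b) != 1:  # tile not adjacent to blank
        return None
    s[t], s[b] = s[b], s[t]
    return tuple(s)

solved = tuple(list(range(1, 16)) + [0])  # blank in the bottom-right corner

print(apply_move(solved, 15) is not None)  # True: 15 is next to the blank
print(apply_move(solved, 3))               # None: 3 isn't, so no such move
```

Since some move pairs simply can’t follow one another, the “combine any two elements” requirement of a group fails – which is exactly the gap groupoids fill.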

# Before Groups from Categories: a Category Refresher

So far, I’ve spent some time talking about groups and what they mean. I’ve also given a
brief look at the structures that can be built by adding properties and operations to groups –
specifically rings and fields.

Now, I’m going to start over, looking at things using category theory. Today, I’ll start
with a very quick refresher on category theory, and then I’ll give you a category theoretic
presentation of group theory. I did a whole series of articles about category theory right after I moved GM/BM to ScienceBlogs; if you want to read more about category theory than this brief introduction, you can look at the category theory archives.

Like set theory, category theory is another one of those attempts to form a fundamental
abstraction with which you can build essentially any mathematical abstraction. But where sets
treat the idea of grouping things together as the fundamental abstraction, category
theory treats the idea of mappings between things as the fundamental abstraction.

# Fields, Characteristics, and Prime Numbers

When we start looking at fields, there are a collection
of properties that are interesting. The simplest one – and
the one which explains the property of the nimbers that
makes them so strange – is called the
characteristic of the field. (In fact, the
characteristic isn’t just defined for fields – it’s defined
for rings as well.)

Given a field $F$, where $0_F$ is the additive identity, and $1_F$ is the multiplicative identity, the characteristic of the field is 0 if and only if no sequence of adding $1_F$ to itself will ever result in $0_F$; otherwise, the characteristic is the number of $1_F$s you need to add together to get $0_F$.

That sounds confusing – but it really isn’t. It’s just
hard to write in natural language. A couple of examples will
make it clear.
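As a quick computational illustration (my own sketch – the function and its cap on the search are mine), you can find the characteristic by literally adding the multiplicative identity to itself until you hit the additive identity:

```python
# Compute the characteristic by repeatedly adding 1 to itself.
# For the integers mod p this terminates at p; for the rationals
# (simulated with Fraction) it never reaches 0, so we cap the search
# and report characteristic 0.
from fractions import Fraction

def characteristic(one, zero, add, limit=1000):
    total = one
    for n in range(1, limit + 1):
        if total == zero:
            return n
        total = add(total, one)
    return 0  # never hit zero within the cap: treat as characteristic 0

# GF(7), the integers mod 7: 1+1+1+1+1+1+1 = 0, so characteristic 7
print(characteristic(1, 0, lambda a, b: (a + b) % 7))  # 7
# The rationals: adding 1 forever never yields 0, so characteristic 0
print(characteristic(Fraction(1), Fraction(0), lambda a, b: a + b))  # 0
```

The cap is of course a cheat – a real proof that the rationals have characteristic 0 is a one-line argument, not a search – but it makes the definition concrete.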

# Abstract Real Numbers: Fields

When I learned abstract algebra, we very nearly skipped over rings. Basically, we
spent a ton of time talking about groups; then we talked about rings pretty much as a
stepping stone to fields. Since then, I’ve learned more about rings, in the context of
category theory. I’m going to follow the order in which I learned things, and move on
to fields. From fields, I’ll jump back a bit into some category theory, and look at
the category theoretic views of the structures of abstract algebra.

My reasoning is that I find that you need to acquire some understanding of what
the basic objects and morphisms mean, before the category theoretic view makes any
sense. Once you understand the basic concepts of abstract algebra, category theory
becomes very useful for understanding how things fit together. Many complicated things become clear in terms of category theoretic descriptions and structures – but first, you need to understand what the elements of those structures mean.

# Building up more: from Groups to Rings

If you’re looking at groups, you’re looking at an abstraction of the idea of numbers, to try to reduce it to minimal properties. As I’ve already explained, a group is a set of values with one operation, and which satisfies several simple properties. From that simple structure comes the
basic mathematical concept of symmetry.

Once you understand some of the basics of groups and symmetry, you can move in two directions. You can ask “What happens if I add something?”; or you can ask “What happens if I remove something?”.

You can either add operations – which can lead you to a two-operation structure called a ring; or you can add properties – in which the simplest step leads you to something called an abelian group. When it comes to removing, you can remove properties, which leads you to a simpler structure called a groupoid. Eventually, I’m going to follow both the upward and the downward paths. For now, we’ll start with the upward path, since it’s easier.

Building up from groups, we can progress to rings. A group captures one simple property of
a set of number-like objects. A ring brings us closer to capturing the structure of the system
of numbers. The way that it does this is by adding a second operation. A group has one operation
with symmetric properties; a ring adds a second symmetric operation, with a well-defined relationship
between the two operations.
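That “well-defined relationship” is the distributive law. Here’s a brute-force sketch (my own example, using the integers mod 6) checking that the second operation really does mesh with the first:

```python
# Integers mod 6 as a ring: an abelian group under +, plus a second
# operation * that distributes over it. Verify the laws by brute force.
n = 6
elems = range(n)

add = lambda a, b: (a + b) % n
mul = lambda a, b: (a * b) % n

distributes = all(
    mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
    for a in elems for b in elems for c in elems
)
associative_mul = all(
    mul(mul(a, b), c) == mul(a, mul(b, c))
    for a in elems for b in elems for c in elems
)
print(distributes, associative_mul)  # True True
```

Note that mod 6 we get a ring but not a field – 2 and 3 have no multiplicative inverses – which is part of why rings sit below fields in the tower of structures.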

# The Meaning of Division: Quotient Groups

After that nasty diversion into economics and politics, we now return to your
regularly scheduled math blogging. And what a relief! In celebration, today I’ll give
you something short, sweet, and beautiful: quotient groups. To me, this is a shining
example of the beauty of abstract algebra. We’ve abstracted away from numbers to these
crazy group things, and one reward is that we can see what division really means. It’s
more than just a simple bit of arithmetic: division is a way of describing a fundamental