# The Fedex Problem

Over the weekend on twitter, someone tweeted a link to a really fun paper looking paper on arXiv, called The Fedex Problem. It’s a really interesting paper, because it takes a real-world problem, analyzes mathematically, and shows some of the fascinating complexity that lies under the surface of something that seems simple. (I can’t find the original tweet, so I don’t know who first shared it – sorry!

The problem that they looked at involves the logistics of rapid delivery by services like Federal Expression.

When Fedex was founded, they built their overnight delivery business on a new model of how to handle deliveries, called a hub system.

What most delivery businesses traditionally did was take packages in at a local warehouse, and sort them into groups depending on the warehouse that serviced their destination. Each group was then shipped separately, from the source warehouse to the destination warehouse. To do overnight delivery, each warehouse, then, would need to have a full package-sorting system, and they would need to to send a shipment to every other warehouse every single day. For $N$ warehouses, that means that they needed $N$ full sorting systems, and $N^2$ shipments every day. The costs of the redundant shipping systems and the massive number of shipments are huge, and completely non-scalable. Imagine a system with just 100 local warehouses – that’s only 2 for each state! – you’d need to have 10,000 shipments per day. In practice, the number of fedex warehouses just in the US is considerably larger than just 100: it simply can’t work.

What Fedex chose to do instead was build a hub system. Every package that arrives at a local warehouse gets shipped to the same place, called the hub. The only sorting system is at the hub – individual local warehouses don’t sort, they just ship to the hub. The hub sorts according to destination warehouse, and then ships the packages to the appropriate warehouse for delivery.

The Fedex hub system solved both of the problems of the traditional approach. The individual local shipping points don’t need to do sorting: they just ship to the hub, which takes care of all of it. And since every package goes to the hub, instead of needing $N^2$ shipments, they only needed $2N$ shipments per day: $N$ shipments of unsorted packages from local warehouses to the hub, and $N$ shipments of sorted packages from the hub to the local warehouses for local deliveries.

How well this system works is very dependent on the proper location of the hub. To demonstrate why, consider the worst possible hub location for American shipments: Hawaii. Now, every package from anywhere in the mainland to anywhere else in the mainland needs to travel at least an extra 2500 miles to the hub, and an extra 2500 miles back! The cost of shipping it dominated by fuel costs – adding any extra distance to the average shipment costs fuel, and thus money. So the location of hub has a dramatic effect on the shipment costs.

FedEx located their hub in Memphis, Tennessee, because according to its founders, Memphis was close to the “center” of the shipping region for its original target market cities, and thus minimized the total shipping distances. (That wasn’t their sole concern; another factor was that Memphis is in an area that is rarely incapacitated by weather or other natural disasters – but the travel distance was the biggest factor.)

What this paper does is take the Fedex problem, and generalize it. What, exactly, does it mean for a given point to be an ideal hub? Given a network of points in a Euclidean space, how can you compute the ideal hub? And finally, if we look at the actual Federal Express distribution system, using great-circle routing between points, how close is Memphis to being the ideal hub?

If you look at the problem naively, it seems like it should be simple. The ideal hub should just be the “average” location. That’s basically what the FedEx founders said – they picked the geographical center of their original markets. Unfortunately, reality is rarely kind enough to make things fit our intuitions, and that’s not quite right.

To see why, let’s first look at the definition of the ideal hub. What we want to do is to minimize the total distance from local shipping points to the hub. To keep the notation simple, we’ll write the problem down as if we were looking at the one-dimensional version of the problem. Each location is a single value, and we’ll write the distance between two points $a$ and $b$ as the absolute value of $a-b$: $\left|a-b\right|$.

If the set of $i$ points in the network are called $x_1 \dots x_i$, then the total distance value for a hub at $x$ is given by the function:

$f(x) = \Sigma_i 2\left| x - x_i \right|$

Since the 2 is a constant factor that affects all x-values easily, for the purposes of simplicity, we can drop it out. So the problem of selecting an ideal hub ultimately reduces to an optimization problem: find the value $x$ where $f(x)$ is a minimum.

Again, naively, it seems as though the average is best. But we can see that it’s not with a simple example. Let’s just look at a one-dimensional case – a list of locations scattered along a line: {1, 2, 3, 5, 13, 14, 60}. The point that’s at the average is 14, with a total one-way travel distance of 14 + 12 + 11 + 9 + 0 + 1 + 46 = 93. But by doing some brute-force, we can see that the total distance for $x=5$ is just 4 + 3 + 2 + 0 + 8 + 9 + 55 = 81. So the mean isn’t the optimal hub location. In fact, you can prove that in the one-dimensional case, the optimal hub is the median location, not the mean.

In this example, the average isn’t far off – but even in a quick off-the-cuff example, you can see that average isn’t the answer: the mean can be quite different from the median, particularly when you’ve got a lot of clustering, or a couple of outliers. When you extend to more than one dimension, then things can get bad pretty quickly.

The general problem has been looked at in many forms. It’s known, variously, as the Fermat-Torricelli problem, the Weber problem, or the single facility location problem.

Unfortunately, getting an answer to the problem is a lot harder than just naming it. According to the authors of this paper, even the fact of the existence of a unique ideal hub isn’t trivial. Despite all of the people who’ve worked on the problem, they couldn’t find a single complete proof of it! They go ahead and provide one. They prove that:

For any finite set of non-collinear points in any euclidean space $R^N$,
there is an idea hub, located within the convex hull of the set of points.

The convex hull qualifier there deserves a bit of explanation. It’s an interesting idea that ends up coming up in a bunch of different mathematical contexts. The easiest way to explain it is by the easiest method of finding the convex hull in two dimensions: suppose you’ve got a bunch of points on a plane. Draw the plane on graph paper, and put a nail at the location of each point. Take a rubber band, and stretch it so that every nail on the graph is inside of the rubber band. Let go. The rubber band will end up touching some of the nails; some will be inside, but untouched. The rubber band is the convex hull of the points. It’s basically a general notion of the smallest blob that contains all of the points in the set. The authors present a short, clean proof that the ideal hub location will always be within the “blob” of the points.

The paper goes into some depth in discussing some simple cases of the ideal hub, and how they can be calculated geometrically. But for the general problem, the geometric approaches don’t work. To get the ideal hub for a large collection of points, they’re left with brute force. In the brute force method, you just take every point $x$ in the network, and compute its $f(x)$ value. Since it’s a finite network, you’ll eventually find the ideal hub. (You can, of course, apply
some nice heuristics to simplify the problem: the set of candidates that might be optimal hubs is going to be a lot smaller than the set of all ponits in the network!) But still, computing the optimal hub is quite complicated – $O(N^2)$ in the number of points in the network.

After all of that work, where is the ideal hub? How different is it from the hub that Federal Express actually uses?

As it turns out, Fedex isn’t too far off, but it’s definitely non-optimal. Using US census data, combined with great-circle distances between the census tracts, the ideal hub is in central Indiana. FedEx’s hub is 315 miles off, to the south-south-west of the ideal location.

Interestingly, when UPS (probably FedEx’s biggest rival) set up a hub system, their hub is in Indianapolis – just 85 miles away from the ideal hub.

To me, the most interesting thing about the paper comes after they showed the ideal hub by brute force. If you’ve got more than four points – even just five! – in a two-dimensional space, there’s no simple geometric solution to find the ideal hub. It seems like it should be simple, but it isn’t. The proof of that is really fascinatingly simple: it comes down to group theory and symmetry. Read the paper for the details. A geometric solution is limited by the difference in dimensionality between a particular one-dimensional rotational symmetry group, and the number of objects in the network. As a result, for 4 or less non-collinear points, there’s a way of finding a single unique geometric configuration that can be exploited to characterize the ideal hub. For more than 4 points, there isn’t – there’s no single configuration anymore, which blows away the usefulness of geometric solutions.

If you find this at all interesting, then it’s worth downloading and reading the paper. There’s a whole lot their that I haven’t covered – more than what I have. But I hope this little taste is enough to pique your curiosity.

## 15 thoughts on “The Fedex Problem”

1. Flooey

It’s interesting that, even after all this, FedEx’s real problem is actually even more complicated. The value of having low-distance shipping to an area isn’t actually proportional to the population there, but to the shipping volume of that area, which is presumably correlated with population but isn’t identical. Washington DC, for instance, likely generates an outsized number of packages due to shipments to and from government agencies. The info on shipping volumes would be both harder to determine and probably more frequently shifting than the population is.

1. Carl Witty

Interesting. When I started reading the sentence “Washington DC, for instance, likely generates…”, I was assuming it would end something like “… less than the average number of packages, because it doesn’t have much in the way of manufacturing”. I wonder which of us has guessed right about Washington DC packages-per-population?

1. John Armstrong

FedEx is pretty much the sole allowed carrier for U.S. government classified material not hand-carried by courier. The DC area generates a LOT of that.

2. Nigel Childs

Yes, the problem is much harder than simply the shipping distance – but actually the distance simply represents the cost of a particular route. Whatever the best algorithm to obtain a “cost per time period” you can use it in place of distances and solve the resulting problem in exactly the same fashion.

So in reality, it’s no harder than finding the appropriate cost of each route.

1. markcc Post author

No – multiple hubs makes the problem even harder. Eventually they become essential – at some point, a single hub gets overwhelmed – but multiple hubs starts you down the O(N^2) trap that the hubs were supposed to avoid.

The paper does include a brief discussion of multiple hubs – take a look!

2. Ingvar Mattsson

It vaguely feels like hierarchical hubs with “I know my downstream hubs/warehouses” and “everything else is routed upwards” would reult in more sorting machinery, but still keep shipment distance to a manageable number.

But I am approaching this from a “I ran somewhat large networks for a decade” perspective (“Border gateway Packageprotocol”?) and I can’t even approach to tell you how to calculate where to put said hubs (apart from “run stats on packages that go to a nearby location” and then run the same “find THE hub” algorithm on a smaller set of points).

1. Spencer

I also expected the hierarchical approach to be optimal. If you used a binary tree, then each node would have to sort incoming packages between just two locations (upstream vs. the child not originating the package). That would mean only O(N) shipments need to be made between facilities. Some sorting happens at each node, but the sorting is proportionate to the shipments passing through the node.

Finding the optimal locations could be done by considering all equal divisions of the points into two halves with equal shipments and then recursively finding the best hub for each half. Not a great running time, but its massively parallelizable and probably feasible to compute.

1. Spencer

I calculate O(n^log n) for the running time, btw, but it’s possible I neglected something in the recurrence relation.

1. markcc Post author

The problem with the heirarchical approach is time: each cycle of sort/ship takes time.

With a single hub, you need to run one flight from each warehouse to the hub in the afternoon; and then one flight from the hub back to the warehouses during the night.

If you do the hierarchical thing, then on average you need multiple legs. For just 8 warehouses, you’re looking at three steps of travel. Realistically, you’d have on the order of 100 warehouses – which means at worst 7 travel hops, which is too many: it’s hard to come up with a schedule that allows that many hops while still making overnight delivery.

3. Ryan

He left a bit of a cliff-hanger at the end, re: the 2-hub problem on [0,1]:

“””
Explicitly finding solutions, however, appears to be virtually impossible… Numerical optimization gives the optimal locations as approximately 0.29 and 0.71. For these hubs the average distance a package travels is 0.39, a significant decrease from the one hub average, which is 1/2, and not too more [sic] than the average distance between any two points, which is 1/3.
“””

I went ahead and figured out the exact numbers: 1-1/sqrt(2) (β0.29), 1/sqrt(2) (β0.71), (4-2sqrt(2))/3 (β0.39), but this comment box is too narrow to contain the proof π