Another really fun and fundamental weighted graph problem is the shortest path problem. The
question in: given a connected weighted graph G, what is the shortest path between two vertices
v and w in G?
To be precise, the weight of a path is just the sum of the weight of the edges that make up
the path. The shortest path between two vertices is the path with minimum weight.
There are a ton of solutions to this problem. I’m going to show one of the easiest to understand ones, which is Djikstra’s algorithm. Djikstra’s algorithm for the shortest path combines two very interesting
algorithmic techniques: greediness, and relaxation. We’ve already talked about greediness in the context
of the minimum spanning tree: a greedy algorithm is an algorithm which works by aggressively picking
optimum sub-values, and building the result from them. Relaxation is an algorithmic technique where
you start with a maximum value, and attempt to reduce it. The easiest way to understand relaxation
is by metaphor: suppose you want to find the optimal arrangement of a bunch of pegs connected by springs. A relaxation technique basically moves one peg in each step in a way that reduces the tension in at least one spring without increasing the tension in any of the others. So you’re continually relaxing the tension in the springs, until you can’t relax it any more.
In Djikstra’s algorithm, you have a graph G=(V,E,W), and you want to find the shortest path
between two vertices a and b. You create a table dist(x) of the minimum distance from a to every vertex in V so that dist(x) is the distance from a to x, and you set the initial value of dist(a) to 0,
the initial value of dist(x) for all other vertices in E to infinity; and you create a priority list of vertices, ordered by their distance from a.
Then, in each step, you look for the vertex n in E that has the smallest value of
dist(n), and you remove that vertex from the priority list. Then, for each vertex m that is adjacent to n, if the path from a to m through n is shorter
than dist(m), set dist(m) = dist(n) + W(n,m). (This is called relaxing the path from a to m.)
When you get to the point that the next vertex selected in the priority list is b, then you’re done – you know the shortest path.
def shortestPath(G,a,b): dist = Map from vertices(G) to distance prev = Map from vertices(G) to vertices(G) for x in vertices(G): if x == a: dist[x] = 0 else: dist[x] = ∞ end priorityList = vertices(G) ordered by distance from a for closest = priorityList.nextVertex() != b: for v adjacent to closest: if dist[v] < dist(closest) + weight(closest,v): dist[v] = dist(closest) + weight(closest,v) prev[v] = closest end end end
When the algorithm terminates, you just trace back through prev to get the shortest path from a to b.
So what’s the complexity of this?
- In the worst case, b could be the last vertex extracted from the priority list, in which case
we would end up running the outer loop once for each vertex. So there’s a maximum of N iterations
of that loop.
- In the inner loop, we need to look at every edge from the selected vertex. That’s O(E).
- To re-order the priority list after removing an element from it takes, at most, N steps.
So, we’re looking at O(|V|2|E|).
That’s not the best we can do; by adopting some hairy data structures for the priority queue,
we can reduce the running time to O(|V|log|V||E|). But the structures to do that have a pretty high
constants, so they only really pay off for large, sparse graphs.
Let’s trace through an example. Suppose we want to find the shortest path from A to M in the following graph.
First, we set all weights to infinity, except A to A, all predecessors to empty, and we’ll compute
the weights for the priority list. The following shows the initial state of all of these.
Now, we pick something off the priority list. The highest priority is 0, for A itself. So we pick A, and
from A, we can give distances and priorities to B and D.
Now, we greedily pick the node with the lowest priority value – B and process its edges.
Next lowest priority is D, with 5.
Next is C, with priority 12. C gives us a new path to F, but it’s a path of length 14, which is longer than the path we already
Next is F, with priority 13. Then there’s a tie, so we pick one randomly, and do E. Then G. After those steps, the table looks like:
It continues on in the same way, until we get the result ADFGJKM with a total length of 26.