RAND CORPORATION · SANTA MONICA · 1955
Ted Harris and Frank Ross are staring at a map of the Soviet rail network. The question on the desk is not abstract. If war comes, which rail lines do we bomb first?
The mathematical answer to their question was being invented one floor away, by two men writing a paper called Maximal Flow Through a Network.
Part of Graph Theory: four standalone stories tracing the mathematics of connection.
See all four →Forty-four vertices, one hundred and five edges, and a classified stamp on the cover.
In late 1954, the United States Air Force asked the RAND Corporation to evaluate the capacity of the Soviet railway network to move supplies from the Soviet interior to Eastern Europe. The study was assigned to Theodore Harris, a mathematician at RAND, and Frank Ross, a retired Air Force general. They finished their report in October 1955 under a title that gave nothing away: Fundamentals of a Method for Evaluating Rail Net Capacities.
Inside was a graph. Forty-four vertices, each a major rail junction in the Soviet Union and its satellites. One hundred and five edges, each a section of track. Every edge had a number attached to it, the number of freight trains it could move per day, computed from track count, signalling capacity, grade, and the locomotives available to the district.
Harris and Ross asked two questions. First: what is the maximum tonnage per day that can move from the Soviet interior to the Eastern European border? Second, and this was the operational question: which small set of rail segments, if destroyed, would reduce that capacity the most?
The map is now public.
The Harris and Ross report was declassified in 1999. The map of the Soviet rail network they used, and the labelled "bottleneck" the study identified, are in the public domain. If you want to see the original, search for Harris Ross 1955 Soviet rail; the RAND archive hosts a PDF. The bottleneck they found ran through a corridor near the present-day Ukraine-Poland border.
Directed edges, capacities, a source, a sink, and a conservation law at every other node.
A flow network is a directed graph in which every edge carries a non-negative number called its capacity. Two distinguished nodes are the source and the sink . A flow is an assignment of a number to every edge such that no edge exceeds its capacity and, at every node other than and , what flows in equals what flows out.
The maximum flow problem asks for the flow that pushes as many units as possible from to . An s-t cut is a partition of the nodes into two sets and with and . The capacity of a cut is the sum of the capacities of edges going from to . The minimum cut is the cut of smallest capacity.
Max flow is the question Harris and Ross asked first. Min cut is the question they cared about second, and more. The two questions turn out to have the same answer.
Ten cities, a westward flow, and a single question: which rails matter most?
Below is a stylised version of the Harris-Ross network. Moscow is
the source, a notional Western Frontier is the sink, and supply
flows westward through ten cities. Every edge shows
flow / capacity. Gold
edges carry some of the maximum flow; red edges are the min cut.
Click an edge label to raise its capacity by five; shift-click to
lower it by five. Watch the red edges move as you work.
Gold edges carry flow. Red edges are on the min cut. Click an edge label to add 5 units of capacity; shift-click to remove 5.
Maximum flow
85
Moscow to Frontier, summed across all paths.
Min cut (3 edges)
Two things worth trying. Raise the capacity of a red edge; the red set almost always shifts somewhere else, because cutting a bottleneck only helps until a new one becomes binding. Set every edge out of Moscow to zero in turn; the max flow collapses to zero. The min cut is never larger than the source's total outgoing capacity, because that is itself a cut.
One of the cleanest theorems in combinatorial optimisation.
First, the easy direction. Any flow from to has to cross every s-t cut, so its value is at most the capacity of any cut:
Read that inequality aloud. On the left, is the total value of whatever flow you happened to construct: how many units per day you successfully pushed from source to sink. On the right, is the total capacity of every pipe that crosses some wall you drew splitting the source side from the sink side. The claim is the obvious one: every single unit of flow you sent across the network had to, at some point, cross that wall, because the source is on one side and the sink is on the other. So your flow cannot possibly exceed the total carrying capacity of the pipes that cross it.
Different walls give different bounds. Draw a loose wall, one that crosses a wide section of the network where pipes are generous, and you get a loose, unhelpful upper bound on the flow. Draw a tight wall, one that slices through a narrow bottleneck, and you get a tight bound. Every wall you can draw is an upper bound on every flow you can push. Every flow you push is, by symmetry, a lower bound on the capacity of every possible wall. The two sides pin each other.
The miracle, sitting in the amber box below, is that the upper bounds and lower bounds actually meet. The tightest wall you can draw has the same capacity as the fattest flow you can push. Not close. Exactly the same.
Max-flow / min-cut theorem (Ford & Fulkerson, 1956)
In any flow network, the value of the maximum flow from to equals the capacity of the minimum - cut.
Why this matters: a question about shipping (how much can flow?) is answered by a question about sabotage (what is the cheapest way to break the network?). For Harris and Ross this was more than a curiosity. Their operational question was the dual of the shipping question they had been asked. The same computation answered both.
The proof is constructive. If you have a maximum flow, the algorithm below gives you a witness min cut for free: the set of nodes still reachable from in the residual graph, and the edges crossing from to its complement. That is exactly how the red edges in the demo are computed.
Push flow along a path. Repeat until there is no path left to push along.
The key construct is the residual graph. Start with a copy of the network. Whenever you push units along an edge , decrease its residual capacity by and increase the residual capacity of the reverse edge by the same amount. The reverse edge is what lets the algorithm "take back" a bad decision: a later augmenting path can route flow back through , effectively cancelling part of an earlier choice.
The reverse-edge trick is the subtle part, so here is the intuition. Suppose at step three you pushed ten units through Kiev because it looked locally like a good idea. At step seven you realise that those ten units would have been better spent going through Warsaw instead. A naive algorithm would be stuck: it has already committed. The residual graph's reverse edges say: fine, push ten units backwards out of Kiev on this new path, which cancels the earlier commitment, and now you are free to route those same ten units the better way. You never actually run the rails backwards; the reverse edge is bookkeeping that lets the algorithm undo its own mistakes without restarting.
The Ford-Fulkerson method is then: find any path from to in the residual graph, push as much flow as the bottleneck edge allows, repeat. When no augmenting path exists, you are done.
Why BFS, not DFS
Ford and Fulkerson did not specify how to find the augmenting path. A depth-first search can, on networks with irrational capacities, fail to terminate; a famous three-edge example by Zwick converges to a flow value that is only part of the true maximum. The fix, from Jack Edmonds and Richard Karp in 1972, is to choose the shortest augmenting path each time, which a BFS gives you for free. The resulting Edmonds-Karp algorithm terminates in at most iterations, independent of the capacity values.
Step through it on the Soviet-rail graph. Each blue path is the current BFS augmenting path. The label on each edge shows the cumulative flow. The number on the right is the total value of the flow so far. When the frontier stops expanding to the sink, the algorithm halts, and that number is the maximum.
Blue highlights the current augmenting path. Gold edges already carry flow. Each edge shows flow / capacity.
Step 0 of 5
Initial state. No flow yet. BFS will look for the shortest augmenting path from Moscow to Frontier.
Total flow
0
Cumulative after this augmentation.
Notice that the paths BFS finds are the ones with the fewest edges, not the ones with the most capacity. That is fine: every augmentation increases the flow by a positive integer (given integer capacities), and no augmentation ever decreases it, so the algorithm converges in a bounded number of steps.
A flow network in fifteen lines. Edmonds-Karp in thirty.
A nested dictionary
_cap[u][v] stores the
capacity of each directed edge. Negative capacities and
self-loops are rejected at insert time.
from collections import defaultdict
from dataclasses import dataclass, field
@dataclass
class FlowNetwork:
"""Directed graph with non-negative edge capacities."""
_cap: dict[str, dict[str, float]] = field(
default_factory=lambda: defaultdict(dict)
)
def add_edge(self, u: str, v: str, capacity: float) -> None:
if capacity < 0:
raise ValueError(f"Negative capacity {capacity} not allowed.")
if u == v:
raise ValueError(f"Self-loop on {u} is not a flow edge.")
self._cap[u][v] = self._cap[u].get(v, 0.0) + capacity
def capacity(self, u: str, v: str) -> float:
return self._cap.get(u, {}).get(v, 0.0)
@property
def nodes(self) -> list[str]:
return list(self._cap.keys()) Build a residual graph, BFS for an augmenting path, push the bottleneck along it, repeat. The reverse-edge update is the one line that makes the whole thing correct.
from collections import deque
def max_flow(net: FlowNetwork, source: str, sink: str):
"""Edmonds-Karp: Ford-Fulkerson with a BFS augmenting path."""
residual = {u: {} for u in net.nodes}
for u in net.nodes:
for v, c in net._cap[u].items():
residual[u][v] = residual[u].get(v, 0.0) + c
residual[v].setdefault(u, 0.0) # reverse edge, cap 0
def bfs():
prev = {source: None}
q = deque([source])
while q:
u = q.popleft()
if u == sink:
break
for v, cap in residual[u].items():
if cap > 0 and v not in prev:
prev[v] = u
q.append(v)
if sink not in prev:
return None
path, node = [], sink
while node is not None:
path.append(node)
node = prev[node]
return list(reversed(path))
value = 0.0
while (path := bfs()) is not None:
bottleneck = min(residual[path[i]][path[i + 1]]
for i in range(len(path) - 1))
for i in range(len(path) - 1):
u, v = path[i], path[i + 1]
residual[u][v] -= bottleneck
residual[v][u] += bottleneck
value += bottleneck
return value Notebook 08: Max flow and min cut
Build a toy flow network. Compute its max flow and min cut. Verify they match. Then build the Soviet-rail graph and see which edges the algorithm picks out as bottlenecks.
notebooks/08_graph_theory_flow.ipynb → The theorem was open from the start. The answer it computed was not.
Ford and Fulkerson's 1956 paper is a public document, published in the Canadian Journal of Mathematics. Anyone with a library card could read the theorem and the algorithm. The mathematics was never secret.
What was secret was the input. The Harris-Ross report carried a SECRET classification because it named specific Soviet rail junctions, attached specific capacities to them, and then pointed at which ones mattered. The report's central finding, the bottleneck in the Soviet supply chain, was the same object that Ford-Fulkerson would call a min cut. But knowing the cut existed in the abstract was harmless. Knowing where the cut was on a real map was a targeting recommendation.
The report was declassified in 1999, forty-four years after it was written, long after its operational relevance had expired. What remains is a beautiful historical document: a real-world problem, stated in the language of graphs, solved with an algorithm that was being invented in the same building at the same time.
The theorem that answers shipping questions ended up everywhere shipping is not.
Max-flow / min-cut did become the workhorse of transportation modelling. But the more striking afterlife was as a primitive in problems that have nothing obviously to do with flow. Image segmentation, where pixels are nodes and a min cut separates foreground from background. Bipartite matching, where max flow counts the largest set of compatible pairs. Network reliability, where a min cut is the smallest set of edges whose failure disconnects a service from its users. The theorem keeps showing up because it is the combinatorial form of linear-programming duality, and duality is everywhere.
Flow is deterministic: you allocate exact capacities and compute exact values. The next story asks what happens when we let go of that determinism, and think of the graph as a process that moves something around probabilistically. The shift from flow to random walk is the shift from a single planned route to an expected distribution over routes, and the mathematics that falls out of it is the reason a word-probability model, trained on enough text, can be coaxed into naming citations.
Next story
Random walks
PageRank built Google this way. Modern LLMs retrieve knowledge this way.
Back to
Flows & cuts
Image segmentation, bipartite matching, network reliability, and why LP duality keeps showing up.
All topics
Four algorithms
One insight from 1736 that made all of them thinkable.
One paper from 1956, one declassified report from 1955, and a history book that connects them.
L. R. Ford and D. R. Fulkerson, Maximal Flow Through a Network (Canadian Journal of Mathematics, 1956)
The original paper. Introduces the max-flow / min-cut theorem and the augmenting-path method. Short, readable, and worth reading in the original notation to see how close the 1956 statement is to the modern one.
T. E. Harris and F. S. Ross, Fundamentals of a Method for Evaluating Rail Net Capacities (RAND RM-1573, 1955; declassified 1999)
The study that motivated the theorem. Includes the 44-vertex rail network, the capacity calculations, and the identified bottleneck. The RAND archive hosts a scanned PDF. A genuine primary source, and a rare case where a classified document was released after the mathematics it inspired had become textbook material.
Alexander Schrijver, On the History of the Transportation and Maximum Flow Problems (Mathematical Programming, 2002)
The definitive history. Schrijver traces the intertwined lines from Kantorovich and Hitchcock through Dantzig, Koopmans, Harris-Ross, and Ford-Fulkerson. Essential reading if you want to understand how the field actually developed, as opposed to how it is usually taught.
Cormen, Leiserson, Rivest and Stein, Introduction to Algorithms (MIT Press, 4th ed. 2022)
Chapter 24 on maximum flow. Clean presentation of Ford-Fulkerson, Edmonds-Karp, and the max-flow / min-cut theorem with full proofs. Also covers push-relabel, the faster alternative that matters once the network gets large.
Alexander Schrijver, Combinatorial Optimization: Polyhedra and Efficiency (Springer, 2003)
Three volumes, two thousand pages, and the most complete treatment of flow and cut problems in print. Not bedside reading, but the reference you reach for when a flow question turns out to be subtler than it looks.
One report asked which rail lines to bomb. One paper asked how much could flow. The answer to both was the same theorem, and the two groups of authors walked past each other in the halls of the same building for months before anyone noticed.
Forty-four years passed before the report was public. The theorem was already in every textbook.