Why does the Gomory-Hu algorithm work? i.e. why does the tree that it outputs have the property that you can do min-cut by removing its smallest edge and returning the connected components? (My hunch is that Fig. 4 has the answer if I stare at it long enough)
Every edge of this tree (green) represents a cut (in particular, the min-cut between the two nodes that it connects). But why must the min-cut between non-adjacent nodes in the tree (say, 2 hops away) be either one of the two min-cuts corresponding to the two edges of the tree? Where is the
optimal substructure?
Answer: Let the min-cut-tree be A-B-C. Then any cut between A and C will cut either A-B or B-C and thus have a value at least as big as the smallest of them. However, A-B (or B-C) by itself is a cut of A-C.
The closest thing to a non-confusing explanation of the steps is
this, but it's a bit high-level and glosses over details like which two nodes to pick as source/target (does it matter?), and has a very silly use of color on the last figure.
There's also supposed to be a simpler algorithm by Gusfield, but I wasn't able to find any substantive information about it. The paper isn't available online.
To Do:
(1) implement min-cut
(2) implement min-cut-tree (Gomory-Hu or Gusfield or other)
(3) implement the neat idea of doing graph clustering by min-cut-trees, as explained in Gary Flake's appropriately titled paper. In typical CS Theory fashion, this paper explains that all nice things are NP-Hard, but goes on to give formal guarantees that the proposed method is almost-optimal.