Randomization in Graph Optimization Problems

David Karger
MIT
http://theory.lcs.mit.edu/~karger

Randomized Algorithms
- Flip coins to decide what to do next
- Avoid hard work of making the “right” choice
- Often faster and simpler than deterministic algorithms
- Different from average-case analysis:
  - input is worst case
  - algorithm adds randomness

Methods
- Random selection
  - if most candidate choices are “good”, then a random choice is probably good
- Monte Carlo simulation
  - simulations estimate event likelihoods
- Random sampling
  - generate a small random subproblem
  - solve, extrapolate to the whole problem
- Randomized rounding for approximation

Cuts in Graphs
- Focus on undirected graphs
- A cut is a vertex partition
- Value is the number (or total weight) of crossing edges

Optimization with Cuts
- Cut values determine the solution of many graph optimization problems:
  - min-cut / max-flow
  - multicommodity flow (sort of)
  - bisection / separator
  - network reliability
  - network design
- Randomization helps solve these problems

Presentation Assumption
- For the entire presentation, we consider unweighted graphs (all edges have weight/capacity one)
- All results apply unchanged to arbitrarily weighted graphs:
  - integer weights = parallel edges
  - rational weights scale to integers
  - analysis unaffected
  - some implementation details change

Basic Probability
- Conditional probability:
  - Pr[A ∩ B] = Pr[A] × Pr[B | A]
- Independent events multiply:
  - Pr[A ∩ B] = Pr[A] × Pr[B]
- Linearity of expectation:
  - E[X + Y] = E[X] + E[Y]
- Union bound:
  - Pr[X ∪ Y] ≤ Pr[X] + Pr[Y]

Random Selection for Minimum Cuts

Random choices are good when problems are rare

Minimum Cut
- Smallest cut of the graph
- Cheapest way to separate it into 2 parts
- Various applications:
  - network reliability (small cuts are weakest)
  - subtour elimination constraints for TSP
  - separation oracle for network design
- Not s-t min-cut

Max-flow/Min-cut
- s-t flow: edge-disjoint packing of s-t paths
- s-t cut: a cut separating s and t
- [FF]: s-t max-flow = s-t min-cut
  - max-flow saturates all s-t min-cuts
  - most efficient way to find s-t min-cuts
- [GH]: min-cut is the “all-pairs” s-t min-cut
  - find using n flow computations

Flow Algorithms
- Push-relabel [GT]:
  - push “excess” around the graph till it’s gone
  - max-flow in O*(mn) (note: O* hides logs)
  - recent O*(m^(3/2)) [GR]
  - min-cut in O*(mn^2) --- “harder” than flow
- Pipelining [HO]:
  - save push/relabel data between flows
  - min-cut in O*(mn) --- “as easy” as flow

Contraction
- Find an edge that doesn’t cross the min-cut
- Contract (merge) its endpoints into 1 vertex

Contraction Algorithm
- Repeat n - 2 times:
  - find a non-min-cut edge
  - contract it (keep parallel edges)
- Each contraction decrements the number of vertices
- At the end, 2 vertices are left
  - unique cut
  - corresponds to the min-cut of the starting graph

Picking an Edge
- Must contract non-min-cut edges
- [NI]: O(m)-time algorithm to pick an edge
  - n contractions: O(mn) time for min-cut
  - slightly faster than flows
- If only we could find an edge faster....

Randomize
- Repeat until 2 vertices remain:
  - pick a random edge
  - contract it
- (keep fingers crossed)

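A minimal Python sketch of one trial, assuming a connected multigraph given as a self-loop-free edge list over vertices 0..n-1 (the names contraction_trial, live, etc. are illustrative; this lazy self-loop filtering costs O(mn) per trial rather than the O(m) array implementation mentioned later):

```python
import random

def contraction_trial(edges, n):
    """One run of the contraction algorithm on a connected multigraph.

    edges: list of (u, v) pairs over vertices 0..n-1; parallel edges
    allowed, no self-loops. Returns the number of edges crossing the
    final 2-vertex cut, which equals the min-cut with probability
    at least 2/(n(n-1)).
    """
    parent = list(range(n))                # union-find forest over vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    live = list(edges)
    vertices = n
    while vertices > 2:
        u, v = random.choice(live)         # uniformly random surviving edge
        parent[find(u)] = find(v)          # contract: merge the endpoints
        vertices -= 1
        # drop edges that became self-loops; keep parallel edges
        live = [(a, b) for (a, b) in live if find(a) != find(b)]
    return len(live)
```
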
Analysis I
- Min-cut is small---few edges
  - suppose the graph has min-cut c
  - then minimum degree is at least c
  - thus at least nc/2 edges
- Random edge is probably safe:
  - Pr[min-cut edge] ≤ c/(nc/2) = 2/n
- (easy generalization to the capacitated case)

Analysis II
- Algorithm succeeds if it never accidentally contracts a min-cut edge
- Contracts #vertices from n down to 2
- When k vertices remain, chance of error is 2/k
  - thus, chance of being right is 1 - 2/k
- Pr[always right] is the product of the probabilities of being right each time

Analysis III
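The original slide shows this product as a picture; written out, it telescopes:

```latex
\Pr[\text{success}] \;=\; \prod_{k=3}^{n}\left(1-\frac{2}{k}\right)
\;=\; \prod_{k=3}^{n}\frac{k-2}{k}
\;=\; \frac{1\cdot 2}{(n-1)\,n}
\;=\; \frac{2}{n(n-1)} \;\ge\; \frac{2}{n^{2}} .
```

So one trial succeeds with probability at least 2/n^2, which is what the next slide amplifies.
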
Repetition
- Repetition amplifies success probability
  - basic failure probability 1 - 2/n^2
  - so repeat 7n^2 times

How fast?
- Easy to perform 1 trial in O(m) time
  - just use an array of edges, no data structures
- But need n^2 trials: O(mn^2) time (wrapper sketched below)
- Simpler than flows, but slower

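A sketch of the repetition wrapper, reusing the hypothetical contraction_trial from the earlier sketch; 7n^2 trials push the failure probability to roughly e^(-14) < 10^-6:

```python
def min_cut_monte_carlo(edges, n):
    # (1 - 2/n^2)^(7 n^2) ~ e^(-14) < 1e-6 failure probability
    return min(contraction_trial(edges, n) for _ in range(7 * n * n))
```
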
An improvement [KS]
- When k vertices remain, error probability is 2/k
  - big when k is small
- Idea: once k is small, change the algorithm
  - algorithm needs to be safer
  - but can afford to be slower
- Amplify by repetition!
  - repeat the base algorithm many times

Recursive Algorithm

Algorithm RCA(G, n):
  {G has n vertices}
  repeat twice:
    randomly contract G to n/√2 vertices
    RCA(G, n/√2)

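A runnable sketch of the recursion under the same edge-list conventions as before (contract_to, the relabeling, and the base-case threshold are illustrative choices, not the paper's exact constants):

```python
import math
import random

def contract_to(edges, n, t):
    """Randomly contract a connected multigraph from n vertices down to t;
    returns the surviving edges relabeled over vertices 0..t-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    live, k = list(edges), n
    while k > t:
        u, v = random.choice(live)      # self-loops are filtered out below
        parent[find(u)] = find(v)       # contract the chosen edge
        k -= 1
        live = [(a, b) for (a, b) in live if find(a) != find(b)]

    names = {}                          # relabel super-vertices as 0..t-1

    def label(x):
        r = find(x)
        if r not in names:
            names[r] = len(names)
        return names[r]

    return [(label(a), label(b)) for (a, b) in live]

def rca(edges, n):
    """Recursive contraction: finds the min-cut with prob. Omega(1/log n)."""
    if n <= 6:                          # small base case: finish by trials
        return min(len(contract_to(edges, n, 2)) for _ in range(n * n))
    t = int(math.ceil(1 + n / math.sqrt(2)))   # contract to ~n/sqrt(2)
    return min(rca(contract_to(edges, n, t), t) for _ in range(2))
```
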
Main Theorem
- On any capacitated, undirected graph, Algorithm RCA
  - runs in O*(n^2) time with simple structures
  - finds the min-cut with probability ≥ 1/log n
- Thus, O(log n) repetitions suffice to find the minimum cut (failure probability 10^-6) in O(n^2 log^2 n) time.

Proof Outline
- Graph has O(n^2) (capacitated) edges
- So O(n^2) work to contract, then two subproblems of size n/√2
  - T(n) = 2 T(n/√2) + O(n^2) = O(n^2 log n)
- Algorithm fails if both iterations fail
  - an iteration succeeds if its contractions and recursion succeed
  - P(n) = 1 - [1 - P(n/√2)/2]^2 = Ω(1/log n)

Failure Modes
- Monte Carlo algorithms always run fast and probably give you the right answer
- Las Vegas algorithms probably run fast and always give you the right answer
- To make a Monte Carlo algorithm Las Vegas, need a way to check the answer
  - repeat till the answer is right
- No fast min-cut check known (flow is slow!)

How do we verify a minimum cut?

Enumerating Cuts

The probabilistic method, backwards

Cut Counting
- The original contraction algorithm finds any given min-cut with probability at least 2/n(n-1)
- Only one cut is found per run
  - disjoint events, so the probabilities add
- So at most n(n-1)/2 min-cuts
  - otherwise the probabilities would sum to more than one
- Tight
  - a cycle has exactly this many min-cuts

Enumeration
- RCA as stated has constant probability of finding any given min-cut
- If run O(log n) times, the probability of missing a given min-cut drops to 1/n^3
- But there are only n^2 min-cuts
  - so the probability of missing any is at most 1/n
- So, with probability 1 - 1/n, find all
- O(n^2 log^3 n) time

Generalization
- If G has min-cut c, a cut of value ≤ αc is an α-mincut
- Lemma: the contraction algorithm finds any given α-mincut with probability Ω(n^(-2α))
  - proof: just add a factor to the basic analysis
- Corollary: at most O(n^(2α)) α-mincuts
- Corollary: can find all of them in O*(n^(2α)) time
  - just change the contraction factor in RCA

Summary
- A simple, fast min-cut algorithm
  - random selection avoids rare problems
- Generalization to near-minimum cuts
- Bound on the number of small cuts
  - probabilistic method, backwards

Network Reliability

The Problem
- Input:
  - graph G with n vertices
  - edge failure probabilities
  - for simplicity, fix a single p
- Output:
  - FAIL(p): probability G is disconnected by edge failures

Approximation Algorithms
- Computing FAIL(p) is #P-complete [V]
  - an exact algorithm seems unlikely
- Approximation scheme:
  - given G, p, ε, outputs an ε-approximation
  - may be randomized: succeed with high probability
  - fully polynomial (FPRAS) if runtime is polynomial in n, 1/ε

Monte Carlo Simulation
- Flip a coin for each edge, test the graph
- k failures in t trials ⇒ FAIL(p) ≈ k/t
  - E[k/t] = FAIL(p)
- How many trials are needed for confidence?
  - “bad luck” on trials can yield a bad estimate
  - clearly need at least 1/FAIL(p)
- Chernoff bound: O*(1/ε^2 FAIL(p)) trials suffice to give probable accuracy within ε
- Time O*(m/ε^2 FAIL(p))

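A sketch of one simulation trial in Python (edge_list, estimate_fail, etc. are illustrative names; connectivity is tested with union-find):

```python
import random

def trial_fails(edge_list, n, p):
    """One trial: fail each edge independently with probability p, then
    check whether the surviving graph is disconnected."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for u, v in edge_list:
        if random.random() >= p:            # edge survives with prob 1 - p
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
    return components > 1                   # disconnected => network failed

def estimate_fail(edge_list, n, p, t):
    """Naive Monte Carlo: FAIL(p) ~ (# failed trials) / t."""
    return sum(trial_fails(edge_list, n, p) for _ in range(t)) / t
```
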
Chernoff Bound
- Random variables X_i ∈ [0,1]
- Sum X = Σ X_i
- Bound deviation from expectation:
  - Pr[ |X - E[X]| ≥ ε E[X] ] < exp(-ε^2 E[X]/4)
- If E[X] ≥ 4(log n)/ε^2, “tight concentration”:
  - deviation by ε has probability < 1/n
- No one variable is a big part of E[X]

Application
- Let X_i = 1 if trial i is a failure, else 0
- Let X = X_1 + ... + X_t
- Then E[X] = t FAIL(p)
- Chernoff says X is within relative error ε of E[X] with probability 1 - exp(-ε^2 t FAIL(p)/4)
- So choose t to cancel the other terms
  - “high probability”: t = O(log n / ε^2 FAIL(p))
  - deviation by ε has probability < 1/n

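Solving for the trial count (a worked step):

```latex
\exp\!\left(-\frac{\varepsilon^{2}\, t\,\mathrm{FAIL}(p)}{4}\right) \le \frac{1}{n}
\quad\Longleftrightarrow\quad
t \;\ge\; \frac{4\ln n}{\varepsilon^{2}\,\mathrm{FAIL}(p)} .
```
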
Review
- Contraction Algorithm
  - O(n^(2α)) α-mincuts
  - enumerate them in O*(n^(2α)) time

Network reliability problem
- Random edge failures
  - estimate FAIL(p) = Pr[graph disconnects]
- Naïve Monte Carlo simulation
  - Chernoff bound---“tight concentration”: Pr[ |X - E[X]| ≥ ε E[X] ] < exp(-ε^2 E[X]/4)
  - O(log n / ε^2 FAIL(p)) trials expect O(log n / ε^2) network failures---good for Chernoff
- So estimate within ε in O*(m/ε^2 FAIL(p)) time

Rare Events
- When FAIL(p) is too small, it takes too long to collect sufficient statistics
- Solution: skew trials to make the interesting event more likely
  - but in a way that lets you recover the original probability

DNF Counting
- Given a DNF formula (OR of ANDs):
  - (e1 ∧ e2 ∧ e3) ∨ (e1 ∧ e4) ∨ (e2 ∧ e6)
- Each variable is set true with probability p
- Estimate Pr[formula true]
  - #P-complete
- [KL, KLM] FPRAS
  - skew to make true outcomes “common”
  - time linear in formula size

Rewrite problem
- Assume p = 1/2
  - count satisfying assignments
- “Satisfaction matrix”:
  - S_ij = 1 if the ith assignment satisfies the jth clause
- We want the number of nonzero rows
- Randomly sampling rows won’t work
  - might be too few nonzeros

New sample space
- So normalize every nonzero row to sum to one (divide by its number of nonzeros)
- Now the sum of the normalized nonzeros is the desired value
- So it is sufficient to estimate the average nonzero

Sampling Nonzeros
- We know the number of nonzeros per column:
  - to satisfy a given clause, all variables in the clause must be true
  - all other variables are unconstrained
- Estimate the average by random sampling:
  - know the number of nonzeros per column
  - so can pick a random column (proportional to its nonzeros)
  - then pick a random true-for-column assignment

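A compact sketch of this estimator, assuming p = 1/2 and each clause given as the list of variable indices it forces true (names are illustrative):

```python
import random

def dnf_count_estimate(clauses, num_vars, samples):
    """Karger-Luby style estimator for the number of satisfying
    assignments of a DNF formula (p = 1/2 case)."""
    # Nonzeros in column j: assignments satisfying clause j; the clause's
    # variables are forced true, the remaining variables are free.
    sizes = [2 ** (num_vars - len(c)) for c in clauses]
    total = sum(sizes)                  # total nonzeros in the matrix

    acc = 0.0
    for _ in range(samples):
        # pick a column proportional to its number of nonzeros
        j = random.choices(range(len(clauses)), weights=sizes)[0]
        # pick a random assignment satisfying clause j
        forced = set(clauses[j])
        assign = [v in forced or random.random() < 0.5
                  for v in range(num_vars)]
        # normalized row entry: 1 / (# clauses this assignment satisfies)
        sat = sum(all(assign[v] for v in c) for c in clauses)
        acc += 1.0 / sat                # sat >= 1, since clause j is satisfied
    # E[sample] = (# nonzero rows) / total, so rescale;
    # divide the result by 2**num_vars to get Pr[formula true].
    return total * acc / samples
```
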
Few Samples Needed
- Suppose there are k clauses
- Then E[sample] > 1/k:
  - 1 ≤ satisfied clauses ≤ k
  - 1 ≥ sample value ≥ 1/k
- Adding O(k log n / ε^2) samples gives a “large” mean
- So Chernoff says the sample mean is probably a good estimate

Reliability Connection
- Reliability as DNF counting:
  - a variable per edge, true if the edge fails
  - a cut fails if all its edges do (AND of edge variables)
  - the graph fails if some cut does (OR of cuts)
- FAIL(p) = Pr[formula true]

Focus on Small Cuts
- Fact: FAIL(p) ≥ p^c
- Theorem: if p^c = 1/n^(2+δ), then Pr[some cut of value > αc fails] < n^(-αδ)
- Corollary: FAIL(p) ≈ Pr[some cut of value ≤ αc fails], where α = 1 + 2/δ
- Recall: O(n^(2α)) α-mincuts
  - enumerate with RCA, run DNF counting

Proof of Theorem
- Given p^c = 1/n^(2+δ)
- At most n^(2α) cuts have value αc
- Each fails with probability p^(αc) = 1/n^(α(2+δ))
- Pr[any cut of value αc fails] = O(n^(-αδ))
- Sum over all α > 1

Algorithm
- RCA can enumerate all α-minimum cuts with high probability in O(n^(2α)) time
- Given the α-minimum cuts, can ε-estimate the probability one fails via Monte Carlo simulation for DNF counting (formula size O(n^(2α)))
- Corollary: when FAIL(p) < n^(-(2+δ)), can ε-approximate it in O(c n^(2+4/δ)) time

Combine
- For large FAIL(p), naïve Monte Carlo
- For small FAIL(p), RCA/DNF counting
- Balance: ε-approximation in O(mn^3.5/ε^2) time
- Implementations show this is practical for hundreds of nodes
- Again, no way to verify correctness

Summary
- Naïve Monte Carlo simulation works well for common events
- Need to adapt for rare events
- Cut structure and DNF counting let us do this for network reliability

Random Sampling
- General tool for faster algorithms:
  - pick a small, representative sample
  - analyze it quickly (small)
  - extrapolate to the original (representative)
- Speed-accuracy tradeoff:
  - a smaller sample means less time
  - but also less accuracy

Min-cut Duality
- [Edmonds]: min-cut = max tree packing
  - convert to a directed graph
  - “source” vertex s (doesn’t matter which)
  - spanning trees directed away from s
- [Gabow] “augmenting trees”:
  - add a tree in O*(m) time
  - min-cut c (via max packing) in O*(mc)
- Great if m and c are small...

Sampling for Approximation

Random Sampling
- [Gabow] scheme is great if m, c are small
- Random sampling:
  - reduces m, c
  - scales cut values (in expectation)
  - if we pick half the edges, we get half of each cut
- So find tree packings, cuts in samples

Sampling Theorem
- Given graph G, build a sample G(p) by including each edge with probability p
- A cut of value v in G has expected value pv in G(p)
- Definition: “constant” ρ = 8(ln n)/ε^2
- Theorem: with high probability, all cuts in G(ρ/c) have (1 ± ε) times their expected values.

A Simple Application
- [Gabow] packs trees in O*(mc) time
- Build G(ρ/c):
  - minimum expected cut ρ
  - by the theorem, the min-cut is probably near ρ
  - find the min-cut in time O*(ρm) using [Gabow]
  - it corresponds to a near-min-cut in G
- Result: (1 + ε) times min-cut in O*(ρm) time

Proof of Sampling: Idea
- The Chernoff bound says the probability of a large deviation in a cut’s value is small
- Problem: exponentially many cuts. Perhaps some deviate a great deal
- Solution: we showed there are few small cuts
  - only small cuts are likely to deviate much
  - but they are few, so the Chernoff bound applies

Proof of Sampling
- Sampled with probability ρ/c:
  - a cut of value αc has mean αρ
  - [Chernoff]: it deviates from its expected size by more than ε with probability at most n^(-3α)
- At most n^(2α) cuts have value αc
- Pr[any cut of value αc deviates] = O(n^(-α))
- Sum over all α ≥ 1

Las Vegas Algorithms

Finding Good Certificates

Approximate Tree Packing
- Break the edges into c/ρ random groups
- Each looks like a sample at rate ρ/c:
  - O*(ρm/c) edges
  - each has min expected cut ρ
  - so the theorem says min-cut (1 - ε)ρ
- So each group has a packing of size (1 - ε)ρ
  - [Gabow] finds it in time O*(ρ^2 m/c) per group
  - so the overall time is c × O*(ρ^2 m/c) = O*(ρ^2 m)

Las Vegas Algorithm
- The packing algorithm is Monte Carlo
- Previously found an approximate cut (faster)
- If they are close, each “certifies” the other:
  - the cut exceeds the optimum cut
  - the packing is below the optimum cut
- If not, re-run both
- Result: Las Vegas, expected time O*(ρ^2 m)

Exact Algorithm
- Randomly partition the edges into two groups
  - each is like a 1/2-sample: ε = O*(c^(-1/2))
- Recursively pack trees in each half
  - c/2 - O*(√c) trees
- Merge the packings
  - gives a packing of size c - O*(√c)
  - augment to a maximum packing: O*(m√c)
- T(m, c) = 2 T(m/2, c/2) + O*(m√c) = O*(m√c)

Nearly Linear Time

Analyze Trees
- Recall: [G] packs c (directed-)edge-disjoint spanning trees
- Corollary: in such a packing, some tree crosses the min-cut only twice
- To find the min-cut:
  - find a tree packing
  - find the smallest cut with 2 tree edges crossing
- Problem: packing takes O*(mc) time

Constraint trees
- Min-cut c:
  - c directed trees
  - 2c directed min-cut edges
  - on average, two min-cut edges per tree
- Definition: a tree 2-crosses a cut if exactly two of its edges cross it

Finding the Cut
- From the crossing tree edges, deduce the cut
- Remove the tree edges

Sampling
- Solution: use G(ρ/c) with ε = 1/8
  - pack O*(ρ) trees in O*(m) time
  - the original min-cut has (1 + ε)ρ edges in G(ρ/c)
  - some tree 2-crosses it in G(ρ/c)
  - ...and thus 2-crosses it in G
- Analyze the O*(ρ) trees in G
  - time O*(m) per tree
- Monte Carlo

Simplify
- Discuss the case where one tree edge crosses the min-cut

Analyzing a tree
- Root the tree, so a cut is a subtree
- Use a dynamic program up from the leaves to determine subtree cuts efficiently
- Given the cuts at the children of a node, compute the cut at the parent
- Definitions:
  - v↓ are the nodes below v
  - C(v↓) is the value of the cut at subtree v↓

The Dynamic Program
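The slide shows the recurrence as a picture; one way to write it (a reconstruction, with lca(e) denoting an edge's least common ancestor in the rooted tree):

```latex
C(v^{\downarrow}) \;=\; \sum_{u \in v^{\downarrow}} d(u)
\;-\; 2\,\bigl|\{\, e : \mathrm{lca}(e) \in v^{\downarrow} \,\}\bigr| ,
```

since the edges with both endpoints inside v↓ are exactly those whose LCA lies in v↓, and each such edge is counted twice in the degree sum.
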
Algorithm: 1-Crossing Trees
- Compute the edges’ LCAs: O(m)
- Compute “cuts” at leaves:
  - cut values = degrees
  - each edge is incident on at most two leaves
  - total time O(m)
- Dynamic program upwards: O(n)
- Total: O(m + n)

2-Crossing Trees
- The cut corresponds to two subtrees:

Linear Time
- Bottleneck is the C(v↓, w↓) computations
- Avoid: find the right “twin” w for each v

How do we verify a minimum cut?

Network Design

Problem Statement
- Given vertices, and a cost c_vw to buy an edge from v to w, find the minimum-cost purchase that creates a graph with the desired connectivity properties
- Example: minimum-cost k-connected graph
- Generally NP-hard
- Recent approximation algorithms [GW], [JV]

Integer Linear Program
- Variable x_vw = 1 if we buy edge vw
- Solution cost Σ x_vw c_vw
- Constraint: for every cut, Σ x_vw ≥ k
- Relaxing integrality gives a tractable LP
  - exponentially many cuts
  - but separation oracles exist (e.g. min-cut)
- What is the integrality gap?

Randomized Rounding
- Given LP solution values x_vw
- Build a graph where vw is present with probability x_vw
- Expected cost is at most opt: Σ x_vw c_vw
- Expected number of edges crossing any cut satisfies the constraint
- If the expected number is large for every cut, the sampling theorem applies

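A minimal sketch of the rounding step, assuming x maps each candidate edge to its fractional LP value (illustrative names):

```python
import random

def randomized_rounding(x):
    """Include edge (v, w) independently with probability x[(v, w)].
    E[cost] equals the LP cost, and every cut's expected crossing
    count equals its LP value."""
    return [edge for edge, frac in x.items() if random.random() < frac]
```
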
k-connected subgraph
- The fractional solution is k-connected
- So every cut has (in expectation) k edges crossing in the rounded solution
- The sampling theorem says every cut has at least k - (k log n)^(1/2) edges
- Close approximation for large k
- Can often repair: e.g., get a k-connected subgraph at cost 1 + ((log n)/k)^(1/2) times the minimum

Nonuniform Sampling

Concentrate on the important things

[Benczur-Karger, Karger, Karger-Levine]

s-t Min-Cuts
- Recall: if G has min-cut c, then in G(ρ/c) all cuts approximate their expected values to within ε.
- Applications:

The Problem
- Cut sampling relied on the Chernoff bound
- Chernoff bounds required that no one edge is a large fraction of the expectation of a cut it crosses
- If the sample rate is ≪ 1/c, each edge across a min-cut is too significant

Biased Sampling
- The original sampling theorem is weak when:
  - m is large
  - c is small
- But if m is large:
  - then G has dense regions
  - where c must be large
  - where we can sample more sparsely

Problem                Old Time      New Time
Approx. s-t min-cut    O*(mn)        O*(n^2/ε^2)
Approx. s-t max-flow   O*(m^(3/2))   O*(mn^(1/2)/ε)
Flow of value v        O*(mv)        O*(n^(11/9) v)
Approx. bisection      O*(m^2)       O*(n^2/ε^2)

m ⇒ n/ε^2 in weighted, undirected graphs

Strong Components
- Definition: a k-strong component is a maximal vertex-induced subgraph with min-cut k.

Nonuniform Sampling
- Definition: an edge is k-strong if its endpoints are in the same k-strong component.
  - stricter than k-connected endpoints
- Definition: the strong connectivity c_e for edge e is the largest k for which e is k-strong.
- Plan: sample dense regions lightly

Nonuniform Sampling
- Idea: if an edge is k-strong, then it is in a k-connected graph
  - so it is “safe” to sample with probability 1/k
- Problem: if we sample edges with different probabilities, E[cut value] gets messy
- Solution: if we sample e with probability p_e, give it weight 1/p_e
  - then E[cut value] = original cut value

Compression Theorem
- Definition: given compression probabilities p_e, the compressed graph G[p_e]
  - includes edge e with probability p_e, and
  - gives it weight 1/p_e if included
- Note E[G[p_e]] = G
- Theorem: G[ρ/c_e]
  - approximates all cuts by ε
  - has O(ρn) edges

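A direct sketch of this construction in Python (strength[i] standing in for c_e; illustrative names):

```python
import random

def compress(edges, strength, rho):
    """Build G[rho/c_e]: keep edge i with probability
    p_i = min(1, rho/strength[i]) and weight 1/p_i if kept, so every
    cut keeps its expected value. Expected edge count is
    sum(rho/c_e) <= rho * n by the strength lemma below."""
    compressed = []
    for (u, v), c_e in zip(edges, strength):
        p = min(1.0, rho / c_e)
        if random.random() < p:
            compressed.append((u, v, 1.0 / p))   # (endpoint, endpoint, weight)
    return compressed
```
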
Proof (approximation)
- Basic idea: in a k-strong component, edges get sampled with probability ρ/k
  - the original sampling theorem works
- Problem: some edges may be in stronger components, sampled less
- Induct up from the strongest components:
  - apply the original sampling theorem inside
  - then “freeze” so they don’t affect the weaker parts

Strength Lemma
- Lemma: Σ 1/c_e ≤ n
- Consider a connected component C of G
  - suppose C has min-cut k
  - then every edge e in C has c_e ≥ k
  - so the k edges crossing C’s min-cut have Σ 1/c_e ≤ Σ 1/k = k(1/k) = 1
- Delete these edges (“cost” 1)
- Repeat n - 1 times: no more edges!

Proof (edge count)
- Edge e is included with probability ρ/c_e
- So the expected number of edges is Σ ρ/c_e
- We saw Σ 1/c_e ≤ n
- So the expected number is at most ρn

Construction
- To sample, must find the edge strengths
  - can’t exactly, but approximations suffice
- Sparse certificates identify weak edges:
  - construct in linear time [NI]
  - contain all edges crossing cuts ≤ k
  - iterate until strong components emerge
- Iterate for 2^i-strong edges, all i
  - tricks make it strongly polynomial

Certificate Algorithm
- Repeat k times:
  - find a spanning forest
  - delete it
- Each iteration deletes one edge from every cut (the forest is spanning)
- So at the end, any edge crossing a cut of size ≤ k has been deleted
- [NI] merge all iterations into O(m) time

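A simplified sketch of the k rounds of spanning-forest peeling (not the merged O(m) [NI] implementation; names are illustrative):

```python
def sparse_certificate(edges, n, k):
    """k rounds of spanning-forest peeling. Returns the union of the k
    forests; it contains every edge crossing any cut of value <= k."""
    remaining = list(edges)
    certificate = []
    for _ in range(k):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        kept = []
        for u, v in remaining:
            ru, rv = find(u), find(v)
            if ru != rv:                 # edge joins two forest components
                parent[ru] = rv
                certificate.append((u, v))
            else:
                kept.append((u, v))
        remaining = kept
    return certificate
```
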
Flows
- Uniform sampling led to flow algorithms:
  - randomly partition the edges
  - merge the flows from each partition element
- Compression is problematic:
  - edge capacities changed
  - so flow path capacities are distorted
  - a flow in the compressed graph doesn’t fit in the original graph

Smoothing
- If an edge has strength c_e, divide it into βρ/c_e edges of capacity c_e/βρ
- Creates βρ Σ 1/c_e ≤ βρn edges
- Now each edge is only a 1/βρ fraction of any cut of its strong component
- So sampling a 1/β fraction works
- So dividing into β groups works
- Yields a (1 - ε) max-flow

Cleanup
- Approximate max-flow can be made exact by augmenting paths
- Integrality problems:
  - augmenting paths are fast for small integer flows
  - but the breakup by smoothing ruins integrality
- Surmountable:
  - flows in the dense and sparse parts are separable
- Result: max-flow in O*(n^(11/9) v) time

Proof By Picture
(figure sequence)
- Compress dense regions
- Solve sparse flow
- Replace dense parts (keep flow in sparse bits)
- “Fill in” dense parts

Conclusions

Conclusion
- Randomization is a crucial tool for algorithm design
- Often yields algorithms that are faster or simpler than traditional counterparts
- In particular, gives significant improvements for core problems in graph algorithms

Randomized Methods
- Random selection
  - if most candidate choices are “good”, then a random choice is probably good
- Monte Carlo simulation
  - simulations estimate event likelihoods
- Random sampling
  - generate a small random subproblem
  - solve, extrapolate to the whole problem
- Randomized rounding for approximation

Random Selection
- When most choices are good, pick one at random
- Recursive contraction algorithm for minimum cuts
  - extremely simple (also to implement)
  - fast in theory and in practice [CGKLS]

Monte Carlo
- To estimate an event’s likelihood, run trials
  - slow for very rare events
- Bias the samples to reveal the rare event
  - FPRAS for network reliability

Random Sampling
- Generate a representative subproblem
  - use it to estimate the solution to the whole
  - gives an approximate solution
  - may be quickly repaired to an exact solution
- Bias the sample toward “important” or “sensitive” parts of the problem
  - new max-flow and min-cut algorithms

Randomized Rounding
- Convert fractional to integral solutions
- Get approximation algorithms for integer programs
- “Sampling” from a well-designed sample space of feasible solutions
- Good approximations for network design

Generalization
- Our techniques work because undirected graphs are matroids
- All our results extend to, or are special cases of:
  - packing bases
  - finding minimum “quotients”
  - matroid optimization (MST)

Directed Graphs?
- Directed graphs are not matroids
- Directed graphs can have lots of minimum cuts
- Sampling doesn’t appear to work
- Residual graphs for flows are directed
  - precludes obvious recursive solutions to flow problems

Open problems
- Flow in O(nv) time (complete m ⇒ n)
- Eliminate the v dependence
  - apply to weighted graphs with large flows
  - flow in O(m) time?
- Las Vegas algorithms
  - finding good certificates
- Deterministic algorithms
  - deterministic construction of “samples”
  - deterministically compress a graph

Randomization in Graph Optimization Problems

David Karger
MIT
http://theory.lcs.mit.edu/~karger
karger@mit.edu