Algorithms and Complexity Seminar
Monday, December 11, 2006, 4-5:15pm in 32-G575.
Testing k-wise and Almost k-wise Independence
Ning Xie (MIT)
In this work, we consider the problems of testing
whether a distribution over $\{0,1\}^n$ is $k$-wise or $(\eps,k)$-wise
independent using samples drawn from that distribution.
To distinguish $k$-wise independent distributions from those that are
$\delta$-far in statistical distance, we upper bound the number of
required samples by $\tilde{O}(n^k/\delta^{2})$ and lower bound it by
$\Omega(n^{\frac{k-1}{2}}/\delta)$ (these bounds hold for constant
$k$, and essentially the same bounds hold for general $k$). To
achieve these bounds, we use Fourier analysis to relate a
distribution's distance from $k$-wise independence to its
\emph{biases} (a measure of the parity imbalance it induces on a set
of variables). The relationships we derive are tighter than previously
known, and are of independent interest.
To distinguish $(\eps,k)$-wise independent distributions from those
that are $\delta$-far in statistical distance, we upper bound the
number of required samples by $O\left({k \log n \over
\delta2\epsilon2}\right)$
and lower bound it by
$\Omega\left({\sqrt{k\log n} \over (\eps+\delta)\sqrt{\log
1/(\eps+\delta)}}\right)$. Although these bounds are an
exponential
improvement (in terms of $n$ and $k$) over the corresponding bounds
for testing $k$-wise independence, we show that the {\em time}
complexity of testing $(\eps,k)$-wise independence is unlikely to be
$\poly(n,1/\eps,1/\delta)$ for $k=\Theta(\log n)$,
since this would disprove a plausible conjecture about the hardness of
finding hidden cliques in random graphs. Under the conjecture, our
result implies that for, say, $k =\log n$ and $\eps = 1 / n^{0.99}$,
there is a set of $(\eps,k)$-wise independent distributions, and a set
of distributions at distance $\delta=1/n^{0.51}$ from $(\eps,k)$-wise
independence, which are indistinguishable by polynomial time
algorithms.
Joint work with Noga Alon, Alexandr Andoni, Tali Kaufman, Kevin
Matulef, and Ronitt Rubinfeld.
Host: Madhu Sudan
(The Algorithms
and Complexity Seminar series talks are usually held Mondays from
4-5:15pm in 32-G575.)