Algorithms and Complexity Seminar
Thursday, March 16, 2006, 4-5:15pm in 32-G575.
Sublinear Algorithms for String Compressibility and the
Distribution Support Size
Sofya Raskhodnikova (Weizmann Institute)
Imagine having to choose between a few compression schemes to compress
a very long file. Before deciding on the scheme, you might want to
obtain a rough estimate on how well each scheme performs on your file.
We consider the question of approximating compressibility of a string
with respect to a fixed compression scheme, in sublinear time.
In the talk, we will concentrate on the run-length encoding and a
version of Lempel-Ziv as our example compression schemes. We present
algorithms and lower bounds for approximating compressibility with
respect to both schemes. We show that compressibility with respect to
Lempel-Ziv is related to approximating the support size of a
distribution. This problem has been considered under different guises
in the literature. We prove a lower bound for it, at the heart of which
is a construction of two positive integer random variables, X and Y,
with very different expectations and the following condition on the
moments up to k:
E[X]/E[Y] = E[X^2]/E[Y^2] = ... = E[X^k]/E[Y^k].
Joint work with Dana Ron, Ronitt Rubinfeld, Amir Shpilka and Adam Smith.
Host: Ronitt Rubinfeld