Algorithms and Complexity Seminar
Thursday, March 6, 2008, 4:00-5:15pm in 32-G575.
Overcoming the L1 non-embeddability barrier: choose your host space wisely
Alexandr Andoni (MIT)
We propose a new technique for solving computational problems
over "hard" distance metrics, such as designing nearest neighbor data
structure for the edit distance (aka Levenshtein distance). The usual
approach to such problems is through embeddings, which map a "hard"
space into an "easy" host space, most notably L1. Although embedding into
L1 has proved successful for some spaces, it has inherent limitations:
it would provably require high approximation for some important metrics,
including the edit distance.
Our new technique overcomes this "L1 non-embeddability barrier" for a
variant of the edit distance, Ulam metric, which is edit distance on
non-repetitive strings (formally, permutations). The Ulam metric captures
the hardness caused by the misalignments in the edit distance, and in fact
all known lower bounds for the edit distance hold for Ulam metric to
almost the same degree. For example, embedding Ulam into L1 requires
~(log d) distortion (approximation), where d is the strings' length.
We achieve a nearest neighbor data structure for Ulam with
exponentially better approximation, namely O(log log d). We do so by
constructing a constant approximation mapping of the Ulam metric into
an alternative, richer host space, which is an iterated product of
standard spaces like L1 and L_infinity. Then we show that, despite
the added richness, the new host space admits efficient nearest
neighbor data structure.
This is joint work with Piotr Indyk and Robert Krauthgamer.
Host: Piotr Indyk
(The Algorithms
and Complexity Seminar series talks are usually held Thursdays 4:00-5:15pm in 32-G575.)