Abstract: A technique introduced by Indyk and Woodruff (STOC 2005) has inspired several recent advances in data-stream algorithms. We show that a number of these results follow easily from the application of a single probabilistic method called Precision Sampling. Using this method, we obtain simple data-stream algorithms that maintain a randomized sketch of an input vector $x = (x_1, x_2, \ldots, x_n)$, which is useful for the following applications:
For all these applications the algorithm is essentially the same: scale the vector x entry-wise by a well-chosen random vector, and run a heavy-hitter estimation algorithm on the resulting vector. Our sketch is a linear function of x, thereby allowing general updates to the vector x.
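The "scale, then run heavy hitters" recipe can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the class name `ScaledCountSketch`, the weights $w_i = 1/u_i$ with $u_i$ uniform in $(0,1]$, and the use of a CountSketch-style table as the heavy-hitter estimator. For simplicity the hash and sign functions are stored as explicit arrays of length $n$; a space-efficient streaming implementation would presumably use limited-independence hash functions instead.

```python
import random
import statistics


class ScaledCountSketch:
    """Illustrative linear sketch of y = (w_1 * x_1, ..., w_n * x_n).

    The weights, table sizes, and explicit hash arrays are hypothetical
    choices for this example, not parameters from the paper.
    """

    def __init__(self, n, rows=5, width=64, seed=0):
        rng = random.Random(seed)
        # Assumed scaling vector: w_i = 1 / u_i with u_i uniform in (0, 1].
        self.weights = [1.0 / (1.0 - rng.random()) for _ in range(n)]
        # One bucket index and one random sign per (row, coordinate).
        self.buckets = [[rng.randrange(width) for _ in range(n)] for _ in range(rows)]
        self.signs = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(rows)]
        self.table = [[0.0] * width for _ in range(rows)]

    def update(self, i, delta):
        """Apply the stream update x_i += delta (delta may be negative)."""
        y = self.weights[i] * delta
        for r, row in enumerate(self.table):
            row[self.buckets[r][i]] += self.signs[r][i] * y

    def estimate_scaled(self, i):
        """Median-of-rows estimate of the scaled coordinate w_i * x_i."""
        return statistics.median(
            self.signs[r][i] * self.table[r][self.buckets[r][i]]
            for r in range(len(self.table))
        )


# Usage: the table is a linear function of x, so insertions and deletions
# are handled by the same update rule.
sketch = ScaledCountSketch(n=100, seed=1)
sketch.update(7, 2.5)
sketch.update(7, -1.0)
approx_x7 = sketch.estimate_scaled(7) / sketch.weights[7]
```

Because every table cell is a signed sum of scaled coordinates, two sketches built with the same seed can also be added cell by cell to obtain the sketch of the sum of their input vectors.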
Precision Sampling itself addresses the problem of estimating a sum $\sum_{i=1}^{n} a_i$ from weak estimates of each real $a_i \in [0,1]$. More precisely, the estimator first chooses a desired precision $u_i \in (0,1]$ for each $i \in [n]$, and then it receives an estimate of every $a_i$ within additive error $u_i$. Its goal is to provide a good approximation to $\sum_i a_i$ while keeping tabs on the ``approximation cost'' $\sum_i (1/u_i)$. Here we refine previous work (Andoni, Krauthgamer, and Onak, FOCS 2010), which shows that as long as $\sum_i a_i = \Omega(1)$, a good multiplicative approximation can be achieved using total precision of only $O(n \log n)$.
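To make the setting concrete, here is a simplified, single-round simulation of this estimation problem. It is a sketch under stated assumptions, not the refined estimator of the paper: the precisions are drawn uniformly from $(0,1]$, the weak estimates are simulated with random errors of magnitude at most $u_i$, and the threshold $t = 4$ and all function names are hypothetical. The idea is that for exact values, $\Pr[a_i \ge t u_i] = a_i / t$, so $t$ times the number of such indices is an unbiased estimate of the sum. With weak estimates the count is sandwiched between the exact counts at thresholds $t+1$ and $t-1$, so the expectation stays within the factors $t/(t+1)$ and $t/(t-1)$ of the sum. Concentration would require averaging several independent rounds.

```python
import random


def draw_precisions(n, rng):
    """Choose a precision u_i in (0, 1] for each index (assumed: uniform)."""
    return [1.0 - rng.random() for _ in range(n)]


def weak_estimates(a, u, rng):
    """Simulate estimates a_hat_i with |a_hat_i - a_i| <= u_i."""
    return [a_i + rng.uniform(-1.0, 1.0) * u_i for a_i, u_i in zip(a, u)]


def count_hits(values, u, t):
    """Number of indices whose value is at least t times its precision."""
    return sum(1 for v, u_i in zip(values, u) if v >= t * u_i)


def estimate_sum(a_hat, u, t=4.0):
    """Single-round estimate of sum(a): t * #{i : a_hat_i >= t * u_i}."""
    return t * count_hits(a_hat, u, t)


def approximation_cost(u):
    """The approximation cost: the sum over i of 1 / u_i."""
    return sum(1.0 / u_i for u_i in u)


# Usage: the true values a_i are hidden from the estimator, which only
# sees the precisions it chose and the weak estimates it received.
rng = random.Random(0)
a = [rng.random() for _ in range(1000)]
u = draw_precisions(len(a), rng)
a_hat = weak_estimates(a, u, rng)
estimate = estimate_sum(a_hat, u)
cost = approximation_cost(u)
```

A single round like this has high variance; it is meant only to show how the chosen precisions enter both the estimate and the cost.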