Streaming Algorithms via Precision Sampling
- Alexandr Andoni
- Robert Krauthgamer
- Krzysztof Onak
Symposium on Foundations of Computer Science (FOCS) | Published by IEEE
A technique introduced by Indyk and Woodruff (STOC 2005) has inspired several recent advances in data-stream algorithms. We show that a number of these results follow easily from the application of a single probabilistic method called Precision Sampling. Using this method, we obtain simple data-stream algorithms that maintain a randomized sketch of an input vector $x = (x_1, x_2, \ldots, x_n)$, which is useful for the following applications:
- Estimating the $F_k$-moment of $x$, for $k > 2$.
- Estimating the $\ell_p$-norm of $x$, for $p \in [1,2]$, with small update time.
- Estimating cascaded norms $\ell_p(\ell_q)$, for all $p, q > 0$.
- $\ell_1$ sampling, where the goal is to produce an element $i$ with probability (approximately) $|x_i| / \|x\|_1$. It extends to similarly defined $\ell_p$-sampling, for $p \in [1,2]$.
For all these applications the algorithm is essentially the same: scale the vector $x$ entry-wise by a well-chosen random vector, and run a heavy-hitter estimation algorithm on the resulting vector. Our sketch is a linear function of $x$, thereby allowing general updates to the vector $x$.
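As a concrete illustration of this recipe, the following is a minimal offline sketch in Python (NumPy) of the $\ell_1$-sampling application: each coordinate is scaled by the inverse of an i.i.d. uniform precision, and an index is reported when its scaled value crosses a high threshold, which happens with probability proportional to $|x_i|$. The helper name `l1_sample`, the threshold constant `c`, and the retry logic are our choices for illustration, not the paper's exact construction; a genuine streaming algorithm would not store $x$ but would maintain the scaled vector inside a linear heavy-hitter sketch such as CountSketch.

```python
import numpy as np

def l1_sample(x, rng, c=10.0):
    """One attempt at approximate l1 sampling (illustrative helper).

    Scale x entry-wise by 1/u_i for i.i.d. uniform precisions u_i, and
    report a coordinate whose scaled value crosses a high threshold t.
    Since Pr[|x_i|/u_i >= t] = |x_i|/t, conditioned on a unique crossing
    the reported index is i with probability close to |x_i| / ||x||_1.
    Returns None when the attempt fails (no crossing, or more than one).
    """
    t = c * np.abs(x).sum()               # threshold well above ||x||_1
    u = 1.0 - rng.random(size=x.shape)    # precisions u_i in (0, 1]
    hits = np.flatnonzero(np.abs(x) / u >= t)
    return int(hits[0]) if hits.size == 1 else None

rng = np.random.default_rng(0)
x = np.array([4.0, -2.0, 1.0, 1.0])       # |x_i| / ||x||_1 = [.5, .25, .125, .125]
counts = np.zeros(len(x))
while counts.sum() < 20_000:              # retry until enough successful draws
    i = l1_sample(x, rng)
    if i is not None:
        counts[i] += 1
print(counts / counts.sum())              # empirically close to [.5, .25, .125, .125]
```

With threshold $c \, \|x\|_1$, the conditional sampling probabilities are within a $1 + O(1/c)$ factor of $|x_i| / \|x\|_1$, which is why the output distribution above is only approximately proportional to the entries.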
Precision Sampling itself addresses the problem of estimating a sum $\sum_{i=1}^{n} a_i$ from weak estimates of each real $a_i \in [0,1]$. More precisely, the estimator first chooses a desired precision $u_i \in (0,1]$ for each $i \in [n]$, and then it receives an estimate of every $a_i$ within additive error $u_i$. Its goal is to provide a good approximation to $\sum_i a_i$ while keeping tabs on the “approximation cost” $\sum_i 1/u_i$. Here we refine previous work (Andoni, Krauthgamer, and Onak, FOCS 2010), which shows that as long as $\sum_i a_i = \Omega(1)$, a good multiplicative approximation can be achieved using total precision of only $O(n \log n)$.
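For intuition, here is a toy instantiation in Python (NumPy) of one simple way such an estimator can work; it is a simplification under our own assumptions, not the paper's actual Precision Sampling Lemma. The key observation is that for a uniform precision $u_i \in (0,1]$ we have $\Pr[u_i \le a_i] = a_i$, so summing the indicators $[\hat a_i \ge u_i]$ gives a nearly unbiased estimate of $\sum_i a_i$; an estimate $\hat a_i$ that is off by at most a small multiple of $u_i$ can flip an indicator only when $u_i$ lands close to $a_i$, costing only a small relative bias. The noise model and the constant `eps` below are our choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
a = rng.random(n) * rng.random(n)     # some values a_i in [0, 1]; true sum ~ n/4
eps = 0.25                            # assumed accuracy of the weak estimates

u = 1.0 - rng.random(n)               # i.i.d. uniform precisions u_i in (0, 1]
a_hat = a + rng.uniform(-eps, eps, n) * u   # weak estimates, |a_hat_i - a_i| <= eps * u_i

# Pr[u_i <= a_i] = a_i, so these indicators sum to a nearly unbiased
# estimate of sum_i a_i; an eps * u_i error can flip an indicator only
# when u_i is within a (1 +/- eps) factor of a_i, an O(eps) relative bias.
estimate = np.count_nonzero(a_hat >= u)
print(estimate, a.sum())              # the estimate tracks the true sum

# The total precision sum_i 1/u_i is heavy-tailed, but its typical value
# is on the order of n log n, matching the bound stated above.
print((1.0 / u).sum() / (n * np.log(n)))
```

Repeating the experiment and taking a median sharpens the guarantee; the paper's lemma additionally controls the tail of the total precision rather than only its typical value.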