Abstract:
In turnstile streaming heavy hitters, the goal is to identify high frequency items in a stream of item insertions and deletions. Such sketches can also be used to obtain good “for each” sparse recovery schemes in compressed sensing. The CountMin sketch of Cormode and Muthukrishnan provides a solution using optimal O(lg n) words of space with O(lg n) update time and very slow O(n lg n) query, where n is the universe size. The algorithm succeeds whp 1-1/poly(n). The query time can be brought down to polylg n using the “dyadic trick”, at the expense of increasing both space and update time to a suboptimal lg^2 n.
We show this sacrifice is unnecessary. We obtain lg n space, lg n update time, and polylg n query time simultaneously. Our main technique is a reduction from this problem to a certain graph partitioning problem, which we solve using spectral methods.
Joint work with Kasper Green Larsen, Huy Le Nguyen, and Mikkel Thorup.
About the speaker:
Jelani Nelson is a member of the computer science faculty at Harvard. He is interested in algorithms, especially for large and high-dimensional datasets.