Distributional Private Information Retrieval

Ryan Lehmkuhl, Alexandra Henzinger, and Henry Corrigan-Gibbs

USENIX Security Symposium
August 13-15, 2025, Seattle, Washington

Abstract

A private-information-retrieval (PIR) scheme lets a client fetch a record from a remote database without revealing which record it fetched. Classic PIR schemes treat all database records the same but, in practice, some database records are much more popular (i.e., commonly fetched) than others. We introduce distributional private information retrieval, a new type of PIR that can run faster than classic PIR—both asymptotically and concretely—when the popularity distribution is heavily skewed. Distributional PIR provides exactly the same cryptographic privacy as classic PIR. The speedup comes from a relaxed form of correctness: distributional PIR guarantees that in-distribution queries succeed with good probability, while out-of-distribution queries succeed with lower probability.

We construct a distributional-PIR scheme that makes black-box use of classic PIR protocols, and prove that its server runtime is asymptotically optimal (up to log factors) among a large class of schemes. On two popularity distributions built from real-world data, distributional PIR reduces compute costs by 5-77× compared to existing techniques. Finally, we build CrowdSurf, an end-to-end system for privately fetching tweets, and show that distributional PIR reduces the end-to-end server cost by 8×.
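
To give a feel for the trade-off between correctness and server cost under a skewed popularity distribution, the following toy calculation may help. It is not the construction from the paper. It assumes a hypothetical server that answers classic PIR queries over only the k most popular records, so that a query succeeds exactly when the desired record is among those k. The Zipf-shaped distribution, the function names, and the parameters are all illustrative assumptions, and privacy is not modeled at all.

```python
# Toy model (an assumption for illustration, not the paper's scheme):
# the server runs PIR over only the k most popular of n records.

def zipf_popularity(n, s=1.0):
    """Hypothetical popularity distribution: rank r gets weight 1 / r**s."""
    weights = [1.0 / (r ** s) for r in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def success_probability(popularity, k):
    """Probability that a query drawn from `popularity` targets one of
    the k most popular records."""
    return sum(sorted(popularity, reverse=True)[:k])

def smallest_prefix(popularity, target):
    """Smallest k such that the k most popular records cover at least
    `target` of the total query probability."""
    covered = 0.0
    for k, p in enumerate(sorted(popularity, reverse=True), start=1):
        covered += p
        if covered >= target:
            return k
    return len(popularity)

if __name__ == "__main__":
    n = 10_000
    popularity = zipf_popularity(n)
    for k in (10, 100, 1_000, n):
        p = success_probability(popularity, k)
        print(f"top {k:>6} of {n} records: success probability {p:.3f}")
```

In this model, the more skewed the distribution, the smaller the prefix needed to reach a given success probability, which is the intuition behind trading a relaxed correctness guarantee for lower server work.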