Algorithms and Complexity Seminar
Pan-Private Streaming Algorithms and Continual Observation
Guy Rothblum (Princeton University)
Research on privacy-preserving statistical data analysis has been
flourishing recently. Following the introduction of the rigorous
differential privacy guarantee, sanitizing algorithms matching the
definition have been proposed for many problems. Most of this
private data analysis literature assumes that the sensitive data to be
analyzed is (i) held by a trusted curator and (ii) is in a fixed and
static database. The talk will consider and address concerns
with both of these assumptions:
1. Even well-intentioned collectors of confidential data, such as
governmental agencies, hospitals, or search engine providers, can be
pressured to permit data to be used for purposes other than that for
which they were collected. To support the data curators, we initiate
a study of "pan-private" algorithms; roughly speaking, these
algorithms collect statistical information without ever storing
sensitive information about individuals. Privacy is protected even if
the algorithm's internal state becomes visible to an adversary.
2. Many applications of data analysis involve analyzing data that is
not in a static database, but rather changes dynamically. This can be
either because the entire goal is one of monitoring, e.g. search
trends, traffic conditions, or incidence of influenza, or because the
goal is one of adaptive optimization, e.g., aggregating expert advice
or placement of data to minimize access costs.
In particular, we will present pan-private algorithms for continually
monitoring the values of useful statistics, such as the number of
distinct elements in the input, and applications.
Based on joint works with Cynthia Dwork, Moni Naor, Toni Pitassi, and
Sergey Yekhani