Date of Award

2017

Document Type

Thesis

Degree Name

Bachelor of Science

First Advisor

Dr. Sara Krehbiel

Abstract

Differential privacy [DMNS06] is a strong definition of database privacy that provides indi- viduals in a database with the guarantee that any particular person’s information has very little effect on the output of any analysis of the overall database. In order for this type of analysis to be practical, it must simultaneously preserve privacy and utility, where utility refers to how well the analysis describes the contents of the database.

An analyst may additionally wish to evaluate how a database’s composition changes over time. Consider a company, for example, that accumulates data from a growing base of customers. This company may want to analyze how its customer base evolves over time. Despite the practical need to conduct private analysis on growing databases, relatively little is known about differential privacy in this setting.

In this work, we seek to expand the scope of a differentially private mechanism called the median mechanism [RR10]. The median mechanism’s strength lies in its ability to answer many queries interactively, satisfying both privacy and utility constraints. We examine how these privacy and utility guarantees change in a growing database setting. First, we analyze the median mechanism when run multiple times independently as the database size increases. This approach is called sequential composition, and we show how to adjust parameters so that the privacy guarantee suffers logarithmically with the number of runs of the mechanism without any loss of utility. Having established this as a benchmark, we propose a new algorithm called the memory mechanism. In contrast to sequential composition, the memory approach preserves the history of the mechanism’s responses to earlier queries as the size increases. We show that the memory mechanism’s worst case performance matches that of the sequential composition, and we conjecture that the utility guarantee can be improved with natural constraints on the queries asked in each phase and on the distribution of the data. Proving such a conjecture to establish the benefit of the memory mechanism is left for future work.

Share

COinS