Summary: | Statistical agencies routinely publish aggregate data in the form of contingency tables.
In this paper, we consider the problem of releasing private contingency tables so that
the privacy of individual respondents in the table is preserved. We first uncover fundamental
problems with existing cell suppression algorithms that are used for this purpose.
We then present a rigorous definition of privacy and a generic algorithmic framework
for cell suppression given this definition. Using this framework, we build a complete
cell suppression solution for the special case of Boolean private attributes. We study
the utility of our approach both theoretically and experimentally. Along the way, we
demonstrate a connection to the query auditing problem in statistical databases and
make a foundational contribution to this problem as well. In particular, we analyze an
unexamined assumption from the literature regarding the prior knowledge of attackers.
|