Privacy-Preserving Datamining on Vertically Partitioned Databases

Cynthia Dwork, Kobbi Nissim

24th Annual International Cryptology Conference (CRYPTO 2004) |

Published by Springer Verlag

In a recent paper Dinur and Nissim considered a statistical database in which a trusted database administrator monitors queries and introduces noise to the responses with the goal of maintaining data privacy. Under a rigorous definition of breach of privacy, Dinur and Nissim proved that unless the total number of queries is sub-linear in the size of the database, a substantial amount of noise is required to avoid a breach, rendering the database almost useless.

As databases grow increasingly large, the possibility of being able to query only a sub-linear number of times becomes realistic. We further investigate this situation, generalizing the previous work in two important directions: multi-attribute databases (previous work dealt only with single-attribute databases) and vertically partitioned databases, in which different subsets of attributes are stored in different databases. In addition, we show how to use our techniques for datamining on published noisy statistics.