The need

How do we create machine learning models that preserve the privacy of individuals while drawing on the broadest possible data? Patients can opt out of sharing their data, but those opt-outs can leave datasets more imbalanced or less accurate. Anonymizing data may strip out elements critical to answering research questions. Researchers need a way to use all of the available data while still protecting the anonymity of individuals.

The idea

Differential privacy lets researchers and analysts extract useful insights from datasets containing personal information while offering stronger privacy protections than anonymization alone. It achieves this by introducing “statistical noise” into query results. The noise is significant enough to mask the contribution of any single individual, yet small enough that it does not meaningfully affect the accuracy of the aggregate answers researchers and analysts extract.
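A minimal sketch of that idea, assuming the standard Laplace mechanism; the dataset, function name, and epsilon value here are illustrative, not part of any particular toolkit:

```python
import numpy as np

def noisy_count(records, predicate, epsilon, rng=None):
    """Release a count with Laplace noise calibrated to the query's
    sensitivity: adding or removing one person changes a count by at
    most 1, so the noise scale is 1 / epsilon."""
    rng = rng or np.random.default_rng()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Hypothetical query: how many patients in the cohort are over 65?
patients = [{"age": 70}, {"age": 34}, {"age": 81}, {"age": 66}, {"age": 59}]
print(noisy_count(patients, lambda p: p["age"] > 65, epsilon=0.5))
```

The epsilon parameter controls the trade-off: a smaller epsilon adds more noise and gives stronger privacy, while a larger epsilon adds less noise and yields more accurate answers.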

The solution

We are working with our partners to build open toolkits that make differential privacy easier to adopt. Combined with other security services, such as Confidential Compute, differential privacy can help researchers find answers to their questions while protecting the privacy of each individual.
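As a rough illustration of why this works at scale, the sketch below (synthetic data, illustrative epsilon) shows that on a large cohort the injected noise dwarfs any one person's contribution yet is negligible next to the aggregate answer:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
ages = rng.integers(20, 90, size=10_000)        # synthetic cohort

epsilon = 0.5
true_count = int(np.sum(ages > 65))             # exact answer, never released
release = true_count + rng.laplace(scale=1.0 / epsilon)

# Noise std. dev. is sqrt(2)/epsilon (about 2.8 here): larger than any one
# person's contribution of 1, but tiny next to a count in the thousands.
print(f"true: {true_count}, released: {release:.1f}, "
      f"relative error: {abs(release - true_count) / true_count:.2%}")
```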
