{"id":879525,"date":"2022-09-21T10:11:15","date_gmt":"2022-09-21T17:11:15","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/"},"modified":"2023-02-03T11:01:58","modified_gmt":"2023-02-03T19:01:58","slug":"distribution-inference-risks-identifying-and-mitigating-sources-of-leakage","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/distribution-inference-risks-identifying-and-mitigating-sources-of-leakage\/","title":{"rendered":"Distribution Inference Risks: Identifying and Mitigating Sources of Leakage"},"content":{"rendered":"<p>A large body of work shows that machine learning (ML) models can leak sensitive or confidential information about their training data. Recently, leakage due to distribution inference (or property inference) attacks is gaining attention. In this attack, the goal of an adversary is to infer distributional information about the training data. So far, research on distribution inference has focused on demonstrating successful attacks, with little attention given to identifying the potential causes of the leakage and to proposing mitigations. To bridge this gap, as our main contribution, we theoretically and empirically analyze the sources of information leakage that allows an adversary to perpetrate distribution inference attacks. We identify three sources of leakage: (1) memorizing specific information about the <span id=\"MathJax-Element-1-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-1\" class=\"math\"><span id=\"MathJax-Span-2\" class=\"mrow\"><span id=\"MathJax-Span-3\" class=\"texatom\"><span id=\"MathJax-Span-4\" class=\"mrow\"><span id=\"MathJax-Span-5\" class=\"mi\">E<\/span><\/span><\/span><span id=\"MathJax-Span-6\" class=\"mo\">[<\/span><span id=\"MathJax-Span-7\" class=\"mi\">Y<\/span><span id=\"MathJax-Span-8\" class=\"texatom\"><span id=\"MathJax-Span-9\" class=\"mrow\"><span id=\"MathJax-Span-10\" class=\"mo\">|<\/span><\/span><\/span><span id=\"MathJax-Span-11\" class=\"mi\">X<\/span><span id=\"MathJax-Span-12\" class=\"mo\">]<\/span><\/span><\/span><\/span> (expected label given the feature values) of interest to the adversary, (2) wrong inductive bias of the model, and (3) finiteness of the training data. Next, based on our analysis, we propose principled mitigation techniques against distribution inference attacks. Specifically, we demonstrate that causal learning techniques are more resilient to a particular type of distribution inference risk termed distributional membership inference than associative learning methods. And lastly, we present a formalization of distribution inference that allows for reasoning about more general adversaries than was previously possible.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A large body of work shows that machine learning (ML) models can leak sensitive or confidential information about their training data. Recently, leakage due to distribution inference (or property inference) attacks is gaining attention. In this attack, the goal of an adversary is to infer distributional information about the training data. 
Conference: IEEE Conference on Secure and Trustworthy Machine Learning (SaTML)
Published: February 2023
Authors: Valentin Hartmann, Leo Meynent, Maxime Peyrard, Dimitrios Dimitriadis, Shruti Tople, Robert West
Paper: https://arxiv.org/pdf/2209.08541.pdf
Code: https://github.com/epfl-dlab/property-inference-attacks

Related projects:
- Privacy-preserving Deep Learning (https://www.microsoft.com/en-us/research/project/privacy-preserving-deep-learning/): large machine learning models can memorize their training data, which poses a privacy risk; the project develops approaches for applying differential privacy to large deep neural networks and reports state-of-the-art results for private learning.
- Confidential AI (https://www.microsoft.com/en-us/research/project/confidential-ai/): aims to make Azure the most trustworthy cloud platform for AI, offering confidentiality and integrity against privileged attackers, performance close to that of GPUs, and programmability of state-of-the-art ML frameworks, so that multiple entities can collaboratively train and serve accurate models on sensitive data.