{"id":786691,"date":"2021-10-20T09:42:23","date_gmt":"2021-10-20T16:42:23","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&#038;p=786691"},"modified":"2021-10-20T09:48:41","modified_gmt":"2021-10-20T16:48:41","slug":"tackling-dynamics-in-federated-incremental-learning-with-variational-embedding-rehearsal","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/tackling-dynamics-in-federated-incremental-learning-with-variational-embedding-rehearsal\/","title":{"rendered":"Tackling Dynamics in Federated Incremental Learning with Variational Embedding Rehearsal"},"content":{"rendered":"<p>Federated Learning is a fast growing area of ML where the training datasets are extremely distributed, all while dynamically changing over time. Models need to be trained on clients\u2019 devices without any guarantees for either homogeneity or stationarity of the local private data. The need for continual training has also risen, due to the ever-increasing production of in-task data. However, pursuing both directions at the same time is challenging, since client data privacy is a major constraint, especially for rehearsal methods. Herein, we propose a novel algorithm to address the incremental learning process in an FL scenario, based on realistic client enrollment scenarios where clients can drop in or out dynamically. We first propose using deep Variational Embeddings that secure the privacy of the client data. Second, we propose a server-side training method that enables a model to rehearse the previously learnt knowledge. Finally, we investigate the performance of federated incremental learning in dynamic client enrollment scenarios. The proposed method shows parity with offline training on domain-incremental learning, addressing challenges in both the dynamic enrollment of clients and the domain shifting of client data.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Federated Learning is a fast growing area of ML where the training datasets are extremely distributed, all while dynamically changing over time. Models need to be trained on clients\u2019 devices without any guarantees for either homogeneity or stationarity of the local private data.
The need for continual training has also risen, due to the ever-increasing [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"arxiv","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2021-10-19","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13561,13556],"msr-publication-type":[193724],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[257863,253024],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-786691","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-algori
thms","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-field-of-study-continual-learning","msr-field-of-study-federated-learning"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2021-10-19","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"arxiv","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/arxiv.org\/pdf\/2110.09695.pdf","label_id":"243109","label":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[],"msr-author-ordering":[{"type":"text","value":"Tae Jin Park","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Kenichi Kumatani","user_id":39321,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Kenichi Kumatani"},{"type":"user_nicename","value":"Dimitrios Dimitriadis","user_id":37521,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Dimitrios 
Dimitriadis"}],"msr_impact_theme":[],"msr_research_lab":[199565],"msr_event":[],"msr_group":[702211,756487,761911],"msr_project":[658488],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"miscellaneous","related_content":{"projects":[{"ID":658488,"post_title":"Project FLUTE","post_name":"project-flute","post_type":"msr-project","post_date":"2020-05-12 17:58:21","post_modified":"2022-05-12 10:36:15","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-flute\/","post_excerpt":"A novel framework for training models in a Federated Learning fashion. One of the novelties of the project is the first attempt to introduce Federated Learning in Speech Recognition tasks. Besides the novelty of the task, the paper describes an easily generalizable FL platform and some of the design decisions used for this task. Among the novel algorithms introduced are a new hierarchical optimization scheme, a gradient selection algorithm, and self-supervised training 
algorithms.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/658488"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/786691","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/786691\/revisions"}],"predecessor-version":[{"id":786697,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/786691\/revisions\/786697"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=786691"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=786691"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=786691"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=786691"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=786691"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=786691"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=786691"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=786691"},{"taxonomy":"msr-field-of-study",
"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=786691"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=786691"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=786691"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=786691"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=786691"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}