{"id":788831,"date":"2021-10-26T23:11:58","date_gmt":"2021-10-27T06:11:58","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=788831"},"modified":"2021-11-25T19:01:12","modified_gmt":"2021-11-26T03:01:12","slug":"generalization-in-deep-learning","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/generalization-in-deep-learning\/","title":{"rendered":"Generalization in Deep Learning"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background bg-gray-200 has-background- card-background--full-bleed\">\n\t\t\t\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 align-self-center\">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 id=\"generalization-in-deep-learning\">Generalization in Deep Learning<\/h1>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<p class=\"wp-block-paragraph\">Understanding the generalization of deep learning is an important but challenging problem. 
We establish generalization error bounds for SGD by characterizing the algorithmic stability in terms of the population risk at initialization and from an information-theoretic perspective.<\/p>\n\n\n\n\n\n<ul class=\"wp-block-list\"><li>Bohan Wang, Qi Meng, Wei Chen, Tie-Yan Liu, On the Implicit Regularization for Adaptive Optimization Algorithms on Homogeneous Neural Networks.&nbsp;&nbsp;<em>In the Thirty-eighth International Conference on Machine Learning&nbsp;<\/em>(ICML)<em>, 2021&nbsp;<\/em><\/li><\/ul>\n\n\n\n<ul class=\"wp-block-list\"><li>Shuxin Zheng, Qi Meng, Huishuai Zhang, Wei Chen, and Tie-Yan Liu,&nbsp;Capacity Control of ReLU Neural Networks by Basis-path Norm, In&nbsp;<em>Proceedings of the 33rd International Association for the Advancement of Artificial Intelligence Conference (<\/em>AAAI), 2019.<\/li><\/ul>\n\n\n\n<ul class=\"wp-block-list\"><li>Qi Meng, Shiqi Gong, Weitao Du, Huishuai Zhang, Wei Chen, Zhiming Ma, Tie-Yan Liu, Dynamic of Stochastic Gradient Descent with State-dependent Noise. 
<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/2006.13719\">arXiv:2006.13719<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li><\/ul>\n\n\n\n<ul class=\"wp-block-list\"><li>Mingyang Yi,&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/huzhang\/\">Huishuai Zhang<\/a>,&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/wche\/\">Wei Chen<\/a>, Zhiming Ma and&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tyliu\/\">Tie-Yan Liu<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/bn-invariant-sharpness-regularizes-the-training-model-to-better-generalization-2\/\">BN-invariant Sharpness Regularizes the Training Model to Better Generalization<\/a>, <em>International Joint Conferences on Artificial Intelligence (IJCAI), 2019<\/em><\/li><\/ul>\n\n\n\n<ul class=\"wp-block-list\"><li>Yi Zhou,&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/huzhang\/\">Huishuai Zhang<\/a> and Yingbin Liang, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/understanding-generalization-error-of-sgd-in-nonconvex-optimization\/\">Understanding Generalization Error of SGD in Nonconvex Optimization<\/a>, International Conference on Acoustics, Speech, and Signal Processing (ICASSP)<em>, 2019<\/em><\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n","protected":false},"excerpt":{"rendered":"<p>Understanding the generalization of deep learning is an important but challenging problem. We establish generalization error bounds for SGD by characterizing the algorithmic stability in terms of the population risk at initialization and from an information-theoretic perspective. 
Bohan Wang, Qi Meng, Wei Chen, Tie-Yan Liu, On the Implicit Regularization for Adaptive Optimization Algorithms on Homogeneous [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-788831","msr-project","type-msr-project","status-publish","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"related-researchers":[],"msr_research_lab":[199560],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/788831","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":3,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/788831\/revisions"}],"predecessor-version":[{"id":799924,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/788831\/revisions\/799924"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=788831"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=788831"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=788831"},{"taxonomy":"msr-impact-theme"
,"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=788831"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=788831"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}