{"id":578,"date":"2014-12-23T12:59:49","date_gmt":"2014-12-23T12:59:49","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/inside_microsoft_research\/2014\/12\/23\/new-research-brings-precision-to-sampling-methods-used-in-statistics-and-machine-learning\/"},"modified":"2016-07-20T07:29:34","modified_gmt":"2016-07-20T14:29:34","slug":"new-research-brings-precision-to-sampling-methods-used-in-statistics-and-machine-learning","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/new-research-brings-precision-to-sampling-methods-used-in-statistics-and-machine-learning\/","title":{"rendered":"New Research Brings Precision to Sampling Methods Used in Statistics and Machine Learning"},"content":{"rendered":"<p class=\"posted-by\">Posted by <span class=\"author\">George Thomas Jr.<\/span><\/p>\n<p>Addressing one of the core problems in statistics and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/about\/our-research\/machine-learning.aspx\" title=\"Machine Learning at Microsoft Research\" target=\"_blank\">machine learning<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Microsoft researchers have developed a new, more efficient algorithm that enables exact sampling.<\/p>\n<\/p>\n<table border=\"0\" cellpadding=\"3\" align=\"right\" style=\"width:305px;margin:8px\" cellspacing=\"1\">\n<tbody>\n<tr>\n<td><img decoding=\"async\" src=\"https:\/\/msdnshared.blob.core.windows.net\/media\/TNBlogsFS\/prod.evol.blogs.technet.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/00\/90\/35\/daniel-tarlow-tom-minka-300.png\" alt=\"Daniel Tarlow and Tom Minka\" title=\"Daniel Tarlow and Tom Minka\" \/><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-size:x-small\"><strong>Daniel Tarlow and Tom Minka<\/strong><\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Researchers <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/people\/dtarlow\/default.aspx\" title=\"Daniel Tarlow\" target=\"_blank\">Daniel Tarlow<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/um\/people\/minka\/\" title=\"Tom Minka\" target=\"_blank\">Tom Minka<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, and former Microsoft intern <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.cs.toronto.edu\/~cmaddis\/\" title=\"Chris Maddison\" target=\"_blank\">Chris Maddison<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>&nbsp;introduced the algorithm in their paper, <em><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/papers.nips.cc\/paper\/5449-a-sampling\" title=\"A* Sampling\" target=\"_blank\">A* Sampling<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/em>, one of only two of the 1,700 submitted that received an Outstanding Paper Award at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/nips.cc\/Conferences\/2014\/\" title=\"Neural Information Processing Systems Conference\" target=\"_blank\">NIPS 2014<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, the renowned machine learning conference of the Neural Information Processing Systems Foundation.<\/p>\n<p>&ldquo;This research makes a very significant advance in the efficiency of sampling, which is a core component of probabilistic modelling and reasoning systems,&rdquo; said <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/um\/people\/ablake\/\" title=\"Andrew Blake\" target=\"_blank\">Andrew Blake<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Distinguished Scientist and Laboratory Director of <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/labs\/cambridge\/\" title=\"Microsoft Research Cambridge\" target=\"_blank\">Microsoft Research Cambridge<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<p>In their paper, the authors present a new construction of the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/en.wikipedia.org\/wiki\/Gumbel_distribution\" title=\"Gumbel distribution on Wikipedia\" target=\"_blank\">Gumbel<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> process and A* sampling &#8212; an algorithm that searches for the maximum of a Gumbel process using A* search &#8212; and explain how their approach, from a different perspective, enables exact <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/en.wikipedia.org\/wiki\/Sampling_(statistics)\" title=\"Sampling on Wikipedia\" target=\"_blank\">sampling<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, whereas previous solutions were forced to resort to approximate sampling.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/msdnshared.blob.core.windows.net\/media\/TNBlogsFS\/prod.evol.blogs.technet.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/00\/90\/35\/a-star-sampling-illustration.jpg\" alt=\"Illustration of A* sampling\" style=\"margin-left:auto;margin-right:auto;vertical-align:middle\" title=\"Illustration of A* sampling\" \/><\/p>\n<p align=\"center\"><span style=\"font-size:x-small\"><b>Illustration of A* sampling<\/b><\/span><\/p>\n<p>Citing inspiration from an algorithm for sampling from a discrete distribution known as the &ldquo;Gumbel-Max trick,&rdquo; the authors note how an exact sample results by adding independent Gumbel perturbations to each configuration of a discrete negative energy function and returning the argmax configuration of the perturbed negative energy function.<\/p>\n<p>&ldquo;Our first key observation,&rdquo; the authors write, &ldquo;is that we can apply the Gumbel-Max trick without instantiating all of the (possibly exponentially many) Gumbel perturbations. The same basic idea then allows us to extend the Gumbel-Max trick to continuous spaces where there will be infinitely many independent perturbations.<\/p>\n<p>&ldquo;Intuitively, for any given random energy function, there are many perturbation values that are irrelevant to determining the argmax so long as we have an upper bound on their values. We will show how to instantiate the relevant ones and bound the irrelevant ones, allowing us to find the argmax &#8212; and thus an exact sample.&rdquo;<\/p>\n<p>Tarlow&rsquo;s hope for the future is these findings will lead to <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/en.wikipedia.org\/wiki\/Probabilistic_logic\" title=\"Probabilistic logic on Wikipedia\" target=\"_blank\">probabilistic reasoning<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> systems that are more powerful and easier to use than current systems. &ldquo;When we can provide stronger guarantees about the quality of outputs from our inference algorithms, it becomes easier to use these algorithms inside larger systems and to build tools that can be used reliably by non-experts,&rdquo; he said.<\/p>\n<p>Machine learning is a key focus of Microsoft Research (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"https:\/\/x.com\/msftresearch\" title=\"Follow Microsoft Research on Twitter\" target=\"_blank\">@MSFTResearch<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>) and has led to <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/about\/techtransfer\/\" title=\"product contributions and technology transfer\" target=\"_blank\">numerous product contributions<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> that include <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/news\/features\/officelens-031714.aspx\" title=\"Office Lens for Microsoft Office\" target=\"_blank\">Microsoft Office<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a href=\"\/b\/inside_microsoft_research\/archive\/2014\/03\/18\/helping-sql-server-rev-up-to-full-speed.aspx\" title=\"Making SQL Server faster\">SQL Server<\/a>, <a href=\"\/b\/inside_microsoft_research\/archive\/2013\/10\/16\/audio-advances-help-xbox-one-determine-signal-from-noise.aspx\" title=\"Audio advances for Xbox One\">Xbox One<\/a>, <a href=\"\/b\/inside_microsoft_research\/archive\/2014\/07\/14\/making-cortana-the-researcher-s-dream-assistant.aspx\" title=\"Cortana speech recognition\">Cortana speech recognition<\/a>, and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/news\/features\/translator-052714.aspx\" title=\"Skype near real-time translation\" target=\"_blank\">Skype Translator<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<p>The Neural Information Processing Systems Foundation is a non-profit corporation whose purpose is to foster the exchange of research on neural information processing systems in their biological, technological, mathematical, and theoretical aspects.<\/p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Posted by George Thomas Jr. Addressing one of the core problems in statistics and machine learning, Microsoft researchers have developed a new, more efficient algorithm that enables exact sampling. Daniel Tarlow and Tom Minka Researchers Daniel Tarlow, Tom Minka, and former Microsoft intern Chris Maddison&nbsp;introduced the algorithm in their paper, A* Sampling, one of only [&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[],"msr_hide_image_in_river":0,"footnotes":""},"categories":[1],"tags":[200195,200897,201195,201781,186418,196435,203063,203387,204297],"research-area":[],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-578","post","type-post","status-publish","format-standard","hentry","category-research-blog","tag-a-sampling","tag-chris-maddison","tag-daniel-tarlow","tag-gumbel-max-trick","tag-machine-learning","tag-microsoft-research-cambridge","tag-nips-2014","tag-probabilistic-reasoning","tag-tom-minka","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"December 23, 2014","formattedExcerpt":"Posted by George Thomas Jr. Addressing one of the core problems in statistics and machine learning, Microsoft researchers have developed a new, more efficient algorithm that enables exact sampling. Daniel Tarlow and Tom Minka Researchers Daniel Tarlow, Tom Minka, and former Microsoft intern Chris Maddison&nbsp;introduced&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/578","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=578"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/578\/revisions"}],"predecessor-version":[{"id":260883,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/578\/revisions\/260883"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=578"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=578"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=578"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=578"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=578"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=578"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=578"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=578"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=578"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=578"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=578"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}