{"id":187384,"date":"2012-02-21T00:00:00","date_gmt":"2012-02-25T10:05:22","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/inference-and-learning-in-structured-output-models-for-computer-vision\/"},"modified":"2016-08-02T06:11:24","modified_gmt":"2016-08-02T13:11:24","slug":"inference-and-learning-in-structured-output-models-for-computer-vision","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/inference-and-learning-in-structured-output-models-for-computer-vision\/","title":{"rendered":"Inference and Learning in Structured-Output Models for Computer Vision"},"content":{"rendered":"<div class=\"asset-content\">\n<p>A large number of problems in computer vision involve predictions over exponentially (or infinitely) large structured-output spaces, e.g. the space of segmentations of an image, the space of all object-part hierarchies in a context-free grammar, the space of all pixel-level depth-predictions, etc.<\/p>\n<p>In order to build intelligent vision systems that are able to reason about these tasks, we must address the challenges of 1) representation: how do we store and represent beliefs over exponentially and infinitely large output-spaces? 2) learning: how do we learn these beliefs from data? 3) inference: how do we predict under these beliefs? and 4) their interactions: the richer the model, the more difficult it is to learn and infer under. In this talk, I will present a sampling of my recent work that addresses some of these challenges.<\/p>\n<p>While a lot of progress has been made on the &#8220;static&#8221; version of the MAP inference problem, a number of situations require dynamic inference algorithms that must adapt and reorder computation to focus on &#8220;important&#8221; parts of the problem. I will present a novel measure for identifying such important parts of the problem and demonstrate how it is useful in speeding up inference algorithms in a variety of settings.<\/p>\n<p>Next, I will talk about our recent work on the M-Best-Mode problem, which involves extracting not just the most probable solution, but also a \/diverse\/ set of top M most probable solutions in discrete graphical models (like MRFs\/CRFs). Extracting the top M modes of the distribution allows us to better exploit the beliefs that our model holds.<\/p>\n<p>Joint work with Pushmeet Kohli (MSRC), Vladimir Kolmogorov (IST), Sebastian Nowozin (MSRC), Greg Shakhnarovich (TTIC), Ashutosh Saxena (Cornell), Daniel Tarlow (UToronto) and Payman Yadollahpour (TTIC).<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A large number of problems in computer vision involve predictions over exponentially (or infinitely) large structured-output spaces, e.g. the space of segmentations of an image, the space of all object-part hierarchies in a context-free grammar, the space of all pixel-level depth-predictions, etc. In order to build intelligent vision systems that are able to reason about [&hellip;]<\/p>\n","protected":false},"featured_media":196666,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":0,"footnotes":""},"research-area":[],"msr-video-type":[206954],"msr-locale":[268875],"msr-post-option":[],"msr-session-type":[],"msr-impact-theme":[],"msr-pillar":[],"msr-episode":[],"msr-research-theme":[],"class_list":["post-187384","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-video-type-microsoft-research-talks","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/MLAPFjWGDIY","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/187384","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":0,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/187384\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/196666"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=187384"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=187384"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=187384"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=187384"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=187384"},{"taxonomy":"msr-session-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-session-type?post=187384"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=187384"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=187384"},{"taxonomy":"msr-episode","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-episode?post=187384"},{"taxonomy":"msr-research-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-theme?post=187384"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}