{"id":184050,"date":"2005-08-15T00:00:00","date_gmt":"2009-10-31T13:19:50","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/estimating-geometric-scene-context-from-a-single-image\/"},"modified":"2016-09-09T09:52:14","modified_gmt":"2016-09-09T16:52:14","slug":"estimating-geometric-scene-context-from-a-single-image","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/estimating-geometric-scene-context-from-a-single-image\/","title":{"rendered":"Estimating Geometric Scene Context from a Single Image"},"content":{"rendered":"<div class=\"asset-content\">\n<p>Humans have an amazing ability to instantly grasp the overall 3D structure of a scene \u2013 ground orientation, relative positions of major landmarks, etc \u2013 even from a single image.  This ability is completely missing in most popular recognition algorithms, which pretend that the world is flat and\/or view it through a patch-sized peephole.  Yet it seems very likely that having a grasp of this &#8220;geometric context&#8221; of a scene should be of great assistance for many tasks, including recognition, navigation, and novel view synthesis.<\/p>\n<p>In this talk, I will describe our first steps toward the goal of estimating a 3D scene context from a single image. We propose to estimate the coarse geometric properties of a scene by learning appearance-based models of <b>geometric<\/b> classes. Geometric classes describe the 3D orientation of an image region with respect to the camera.  We provide a multiple-hypothesis segmentation framework for robustly estimating scene structure from a single image and obtaining confidences for each geometric label.  These confidences can then (hopefully) be used to improve the performance of many other applications.  We provide a quantitative evaluation of our algorithm on a dataset of challenging outdoor images. We also demonstrate its usefulness in two applications: 1) improving object detection (preliminary results), and 2) automatic qualitative single-view reconstruction (&#8220;Automatic Photo Pop-up&#8221;, SIGGRAPH&#8217;05).<\/p>\n<p>Joint work with Derek Hoiem and Martial Hebert at CMU.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Humans have an amazing ability to instantly grasp the overall 3D structure of a scene \u2013 ground orientation, relative positions of major landmarks, etc \u2013 even from a single image. This ability is completely missing in most popular recognition algorithms, which pretend that the world is flat and\/or view it through a patch-sized peephole. Yet [&hellip;]<\/p>\n","protected":false},"featured_media":195297,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":0,"footnotes":""},"research-area":[],"msr-video-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-session-type":[],"msr-impact-theme":[],"msr-pillar":[],"msr-episode":[],"msr-research-theme":[],"class_list":["post-184050","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/lTsR8c7bkjc","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/184050","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":0,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/184050\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/195297"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=184050"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=184050"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=184050"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=184050"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=184050"},{"taxonomy":"msr-session-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-session-type?post=184050"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=184050"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=184050"},{"taxonomy":"msr-episode","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-episode?post=184050"},{"taxonomy":"msr-research-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-theme?post=184050"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}