{"id":186077,"date":"2011-03-17T00:00:00","date_gmt":"2011-03-26T15:42:30","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/beyond-naming-image-understanding-via-physical-functional-and-causal-relationships\/"},"modified":"2016-08-22T11:32:44","modified_gmt":"2016-08-22T18:32:44","slug":"beyond-naming-image-understanding-via-physical-functional-and-causal-relationships","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/beyond-naming-image-understanding-via-physical-functional-and-causal-relationships\/","title":{"rendered":"Beyond Naming: Image Understanding via Physical, Functional and Causal Relationships"},"content":{"rendered":"<div class=\"asset-content\">\n<p>What does it mean to &#8220;understand&#8221; an image? One popular answer is simply naming the objects seen in the image. During the last decade most computer vision researchers have focused on this &#8220;object naming&#8221; problem. While there has been great progress in detecting things like &#8220;cars&#8221; and &#8220;people&#8221;, such a level of understanding still cannot answer even basic questions about an image such as &#8220;What is the geometric structure of the scene?&#8221;, &#8220;Where in the image can I walk?&#8221; or &#8220;What is going to happen next?&#8221;. In this talk, I will show that it is beneficial to go beyond mere object naming and harness relationships between objects for image understanding. These relationships can provide crucial high-level constraints to help construct a globally-consistent model of the scene, as well as allow for powerful ways of understanding and interpreting the underlying image. 
Specifically, I will present image and video understanding systems that incorporate: (1) physical relationships between objects via a qualitative 3D volumetric representation; (2) functional relationships between objects and actions via data-driven physical interactions; and (3) causal relationships between actions via a storyline representation. I will demonstrate the importance of these relationships on a diverse set of real-world images and videos.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>What does it mean to &#8220;understand&#8221; an image? One popular answer is simply naming the objects seen in the image. During the last decade most computer vision researchers have focused on this &#8220;object naming&#8221; problem. While there has been great progress in detecting things like &#8220;cars&#8221; and &#8220;people&#8221;, such a level of understanding still cannot [&hellip;]<\/p>\n","protected":false},"featured_media":196059,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":0,"footnotes":""},"research-area":[],"msr-video-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-session-type":[],"msr-impact-theme":[],"msr-pillar":[],"msr-episode":[],"msr-research-theme":[],"class_list":["post-186077","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/y4_rnmgKmRo","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/186077","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-hi
story":[{"count":0,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/186077\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/196059"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=186077"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=186077"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=186077"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=186077"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=186077"},{"taxonomy":"msr-session-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-session-type?post=186077"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=186077"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=186077"},{"taxonomy":"msr-episode","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-episode?post=186077"},{"taxonomy":"msr-research-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-theme?post=186077"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}