{"id":575712,"date":"2019-02-20T13:45:07","date_gmt":"2019-02-20T21:45:07","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&#038;p=575712"},"modified":"2019-03-25T12:45:02","modified_gmt":"2019-03-25T19:45:02","slug":"adversarial-benchmarks-for-commonsense-reasoning","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/adversarial-benchmarks-for-commonsense-reasoning\/","title":{"rendered":"Adversarial Benchmarks for Commonsense Reasoning"},"content":{"rendered":"<p>Human intelligence involves comprehending new situations through a rich model of the world. Given a single image from a movie, or a paragraph from a novel, we can easily infer people\u2019s intentions, mental states, and actions. However, enabling machines to perform this kind of commonsense reasoning remains elusive. Beyond the inherent difficulty of building models that reason, we lack robust benchmarks that evaluate AI reasoning ability.<\/p>\n<p>In this talk, I will present two new large-scale benchmark datasets for commonsense reasoning, covering text (SWAG, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/rowanzellers.com\/swag\" target=\"_blank\" rel=\"noopener noreferrer\">rowanzellers.com\/swag<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>) and vision (VCR; <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/visualcommonsense.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">visualcommonsense.com<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>). These datasets pose new types of reasoning challenges: machines must abstract away from text and images and understand the entire situation, and then explain their predictions. 
Equally important is what these datasets don\u2019t contain: they are adversarially constructed using a suite of new techniques, making them resistant to biases. In addition, I will introduce models for these datasets and discuss where the field might go next toward human-level commonsense reasoning.<\/p>\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/42802_Adversarial_Benchmarks_for_Commonsense_Reasoning.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">[Slides]<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Human intelligence involves comprehending new situations through a rich model of the world. Given a single image from a movie, or a paragraph from a novel, we can easily infer people\u2019s intentions, mental states, and actions. However, enabling machines to perform this kind of commonsense reasoning remains elusive. Beyond the inherent difficulty of building models [&hellip;]<\/p>\n","protected":false},"featured_media":575721,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":0,"footnotes":""},"research-area":[13561,13556],"msr-video-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-session-type":[],"msr-impact-theme":[],"msr-pillar":[],"msr-episode":[],"msr-research-theme":[],"class_list":["post-575712","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-research-area-algorithms","msr-research-area-artificial-intelligence","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/nl6IsjfWKms","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/575712","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"h
ttps:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/575712\/revisions"}],"predecessor-version":[{"id":575724,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/575712\/revisions\/575724"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/575721"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=575712"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=575712"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=575712"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=575712"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=575712"},{"taxonomy":"msr-session-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-session-type?post=575712"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=575712"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=575712"},{"taxonomy":"msr-episode","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-episode?post=575712"},{"taxonomy":"msr-research-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-theme?post=575712"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templa
ted":true}]}}