{"version":"1.0","provider_name":"Microsoft Research","provider_url":"https:\/\/www.microsoft.com\/en-us\/research","author_name":"Jianlong Fu","author_url":"https:\/\/www.microsoft.com\/en-us\/research\/people\/jianf\/","title":"Vision and Language - Microsoft Research","type":"rich","width":600,"height":338,"html":"<blockquote class=\"wp-embedded-content\" data-secret=\"JV7UTqkJvu\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/video-and-language\/\">Vision and Language<\/a><\/blockquote><iframe sandbox=\"allow-scripts\" security=\"restricted\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/video-and-language\/embed\/#?secret=JV7UTqkJvu\" width=\"600\" height=\"338\" title=\"&#8220;Vision and Language&#8221; &#8212; Microsoft Research\" data-secret=\"JV7UTqkJvu\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" class=\"wp-embedded-content\"><\/iframe><script type=\"text\/javascript\">\n\/* <![CDATA[ *\/\n\/*! This file is auto-generated *\/\n!function(d,l){\"use strict\";l.querySelector&&d.addEventListener&&\"undefined\"!=typeof URL&&(d.wp=d.wp||{},d.wp.receiveEmbedMessage||(d.wp.receiveEmbedMessage=function(e){var t=e.data;if((t||t.secret||t.message||t.value)&&!\/[^a-zA-Z0-9]\/.test(t.secret)){for(var s,r,n,a=l.querySelectorAll('iframe[data-secret=\"'+t.secret+'\"]'),o=l.querySelectorAll('blockquote[data-secret=\"'+t.secret+'\"]'),c=new RegExp(\"^https?:$\",\"i\"),i=0;i<o.length;i++)o[i].style.display=\"none\";for(i=0;i<a.length;i++)s=a[i],e.source===s.contentWindow&&(s.removeAttribute(\"style\"),\"height\"===t.message?(1e3<(r=parseInt(t.value,10))?r=1e3:~~r<200&&(r=200),s.height=r):\"link\"===t.message&&(r=new URL(s.getAttribute(\"src\")),n=new URL(t.value),c.test(n.protocol))&&n.host===r.host&&l.activeElement===s&&(d.top.location.href=t.value))}},d.addEventListener(\"message\",d.wp.receiveEmbedMessage,!1),l.addEventListener(\"DOMContentLoaded\",function(){for(var e,t,s=l.querySelectorAll(\"iframe.wp-embedded-content\"),r=0;r<s.length;r++)(t=(e=s[r]).getAttribute(\"data-secret\"))||(t=Math.random().toString(36).substring(2,12),e.src+=\"#?secret=\"+t,e.setAttribute(\"data-secret\",t)),e.contentWindow.postMessage({message:\"ready\",secret:t},\"*\")},!1)))}(window,document);\n\/\/# sourceURL=https:\/\/www.microsoft.com\/en-us\/research\/wp-includes\/js\/wp-embed.min.js\n\/* ]]> *\/\n<\/script>\n","description":"Automatically describing visual content with natural language is a fundamental challenge of computer vision and multimedia. Sequence learning (e.g., Recurrent Neural Networks), attention mechanism, memory networks,\u00a0etc.,\u00a0have attracted increasing attention on visual interpretation. In this project, we are focusing on the following topics related to the emerging topic of &#8220;vision and language&#8221;: Image and video captioning, [&hellip;]","thumbnail_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/01\/video_and_language_sentence-1024x435.png"}