{"version":"1.0","provider_name":"Microsoft Research","provider_url":"https:\/\/www.microsoft.com\/en-us\/research","author_name":"Jeff Running","author_url":"https:\/\/www.microsoft.com\/en-us\/research\/people\/jeffrunn\/","title":"Exploiting comparable corpora - Microsoft Research","type":"rich","width":600,"height":338,"html":"<blockquote class=\"wp-embedded-content\" data-secret=\"KrAOvqQgDW\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/video\/exploiting-comparable-corpora\/\">Exploiting comparable corpora<\/a><\/blockquote><iframe sandbox=\"allow-scripts\" security=\"restricted\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/video\/exploiting-comparable-corpora\/embed\/#?secret=KrAOvqQgDW\" width=\"600\" height=\"338\" title=\"&#8220;Exploiting comparable corpora&#8221; &#8212; Microsoft Research\" data-secret=\"KrAOvqQgDW\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" class=\"wp-embedded-content\"><\/iframe><script type=\"text\/javascript\">\n\/* <![CDATA[ *\/\n\/*! This file is auto-generated *\/\n!function(d,l){\"use strict\";l.querySelector&&d.addEventListener&&\"undefined\"!=typeof URL&&(d.wp=d.wp||{},d.wp.receiveEmbedMessage||(d.wp.receiveEmbedMessage=function(e){var t=e.data;if((t||t.secret||t.message||t.value)&&!\/[^a-zA-Z0-9]\/.test(t.secret)){for(var s,r,n,a=l.querySelectorAll('iframe[data-secret=\"'+t.secret+'\"]'),o=l.querySelectorAll('blockquote[data-secret=\"'+t.secret+'\"]'),c=new RegExp(\"^https?:$\",\"i\"),i=0;i<o.length;i++)o[i].style.display=\"none\";for(i=0;i<a.length;i++)s=a[i],e.source===s.contentWindow&&(s.removeAttribute(\"style\"),\"height\"===t.message?(1e3<(r=parseInt(t.value,10))?r=1e3:~~r<200&&(r=200),s.height=r):\"link\"===t.message&&(r=new URL(s.getAttribute(\"src\")),n=new URL(t.value),c.test(n.protocol))&&n.host===r.host&&l.activeElement===s&&(d.top.location.href=t.value))}},d.addEventListener(\"message\",d.wp.receiveEmbedMessage,!1),l.addEventListener(\"DOMContentLoaded\",function(){for(var e,t,s=l.querySelectorAll(\"iframe.wp-embedded-content\"),r=0;r<s.length;r++)(t=(e=s[r]).getAttribute(\"data-secret\"))||(t=Math.random().toString(36).substring(2,12),e.src+=\"#?secret=\"+t,e.setAttribute(\"data-secret\",t)),e.contentWindow.postMessage({message:\"ready\",secret:t},\"*\")},!1)))}(window,document);\n\/\/# sourceURL=https:\/\/www.microsoft.com\/en-us\/research\/wp-includes\/js\/wp-embed.min.js\n\/* ]]> *\/\n<\/script>\n","thumbnail_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/exploiting-comparable-corpora-1.jpg","thumbnail_width":320,"thumbnail_height":240,"description":"One of the major bottlenecks in the development of Statistical Machine Translation systems for most language pairs is the lack of bilingual parallel training data. Currently available parallel corpora span relatively few language pairs and very few domains; building new ones of sufficiently large size and high quality is time-consuming and expensive. In this talk, [&hellip;]"}