{"id":1149202,"date":"2025-09-22T14:26:49","date_gmt":"2025-09-22T21:26:49","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-academic-program&#038;p=1149202"},"modified":"2026-01-26T08:28:28","modified_gmt":"2026-01-26T16:28:28","slug":"lingua-expanding-europes-voices-in-ai","status":"publish","type":"msr-academic-program","link":"https:\/\/www.microsoft.com\/en-us\/research\/academic-program\/lingua-expanding-europes-voices-in-ai\/","title":{"rendered":"LINGUA: Expanding Europe\u2019s Voices in AI"},"content":{"rendered":"\n\n<p><\/p>\n\n\n\n\n\n\n<div class=\"wp-block-media-text has-vertical-margin-small  has-vertical-padding-none  is-stacked-on-mobile is-vertically-aligned-top\" style=\"grid-template-columns:35% auto\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-walking_1400x788-1024x576.jpg\" alt=\"LINGUA Open Call | overhead view of people rushing through an open space\" class=\"wp-image-1150202 size-full\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-walking_1400x788-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-walking_1400x788-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-walking_1400x788-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-walking_1400x788-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-walking_1400x788-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-walking_1400x788-240x135.jpg 240w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-walking_1400x788-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-walking_1400x788-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-walking_1400x788-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-walking_1400x788.jpg 1400w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure><div class=\"wp-block-media-text__content\">\n<h2 class=\"wp-block-heading\" id=\"lingua-announcing-the-awardees-from-microsoft-s-ai-for-good-lab-open-call\">LINGUA: Announcing the awardees from Microsoft\u2019s AI for Good Lab Open Call<\/h2>\n\n\n\n<p>On the European Day of Languages celebrating Europe\u2019s rich linguistic and cultural diversity, we released the LINGUA Open Call. The call invited proposals that advanced digital inclusion for Europe\u2019s low-resource languages. These are languages with limited online content and datasets, leading to underrepresentation in AI technologies compared to high-resource counterparts such as English, Spanish, French, or German. While many vulnerable and endangered languages fall into this category, the call was open to any European language that lacks the digital foundations required for fair representation and participation in the AI era.<\/p>\n\n\n\n<p>LINGUA aims to address this gap by supporting innovative projects that collect high-quality speech and text datasets for Europe\u2019s underrepresented languages. It is part of Microsoft\u2019s commitment to digital sovereignty and linguistic diversity in Europe, ensuring that every language has the opportunity to be represented in the future of AI. 
<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/blogs.microsoft.com\/on-the-issues\/2025\/07\/20\/eudigitalunlock\/\" target=\"_blank\" rel=\"noopener noreferrer\">Read more about the initiative<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-media-text has-vertical-margin-small  has-vertical-padding-none  has-media-on-the-right is-stacked-on-mobile\" style=\"grid-template-columns:auto 35%\"><div class=\"wp-block-media-text__content\">\n<h3 class=\"wp-block-heading\" id=\"our-commitment\">Our commitment<\/h3>\n\n\n\n<p>At the Microsoft AI for Good Lab, we are deepening our commitment to Europe\u2019s digital future by supporting linguistic diversity, digital sovereignty, and inclusive innovation. The LINGUA Open Call is part of the EU Digital Unlock initiative, which aims to make Europe\u2019s languages and cultures more open and accessible in the digital era. We are proud to collaborate with nonprofits, universities, research institutes, startups, and cultural organizations to enhance resources for low-resource languages, close digital gaps, and maximize impact through shared knowledge and collective action.<\/p>\n\n\n\n<p>We are excited to launch this initiative in close coordination with the APERTUS project led by EPFL and ETH Zurich, and in consultation with the Council of Europe. Together, we are building data resources for European languages, expanding the supply of multilingual datasets, and improving the performance of LLMs for low-resource languages. 
Our goal is to ensure that Europe\u2019s rich linguistic and cultural heritage is fully represented in the next generation of AI models (e.g., <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/ethz.ch\/en\/news-and-events\/eth-news\/news\/2025\/09\/press-release-apertus-a-fully-open-transparent-multilingual-language-model.html\" target=\"_blank\" rel=\"noopener noreferrer\">Apertus<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, EuroLLM, SmolLM3) by empowering communities, fostering innovation, and recognizing the people and organizations that make Europe a hub of creativity and inclusion.<\/p>\n<\/div><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_digital-ribbons_1400x788-1024x576.jpg\" alt=\"LINGUA Open Call | abstract fan of blue and purple digital ribbons\" class=\"wp-image-1150203 size-full\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_digital-ribbons_1400x788-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_digital-ribbons_1400x788-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_digital-ribbons_1400x788-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_digital-ribbons_1400x788-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_digital-ribbons_1400x788-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_digital-ribbons_1400x788-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_digital-ribbons_1400x788-640x360.jpg 
640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_digital-ribbons_1400x788-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_digital-ribbons_1400x788-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_digital-ribbons_1400x788.jpg 1400w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"lingua-open-call-awardees\">LINGUA Open Call awardees&nbsp;&nbsp;<\/h3>\n\n\n\n<p>The selected projects span <strong>16 languages and dialects across 10 countries<\/strong>, reflecting a diverse mix of low-resource, vulnerable, and underrepresented linguistic communities.<\/p>\n\n\n\n<p>Based on applicant estimates, they collectively cover languages spoken by <strong>over 65 million people, <\/strong>including Icelandic, Luxembourgish, Basque, Maltese, Ladino, Romansh, Ladin, Ukrainian, Romani (and Greco-Romani), several Balkan languages (Serbian, Turkish, Bosnian), and Italian dialects (Neapolitan, Sicilian, Roman), alongside multi-language work.<\/p>\n\n\n\n<p>The awardees bring together universities, nonprofits, a government language center, and a public broadcaster, with efforts focused on open dataset creation and digitization, heritage language preservation, and new evaluation resources (including safety benchmarks) to strengthen multilingual AI and help safeguard Europe\u2019s linguistic diversity.<\/p>\n\n\n\n<p>We\u2019re grateful to MILA Quebec, Mozilla, and EPFL for their close collaboration and support throughout the evaluation and selection process.<\/p>\n\n\n\n<p>We\u2019re pleased to announce the selected projects for the LINGUA Open Call:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>BUDOVA: Building Ukrainian Domain-Specific, Open Voice & Text Archives<\/strong> 
\u2014 Kyiv National University of Construction and Architecture (Ukraine) \u2014 <em>Ukrainian<\/em><\/li>\n\n\n\n<li><strong>Collection and Digitization of Romani Language Data in Greece: Laying the Foundations for Representation in Artificial Intelligence<\/strong> \u2014 ARSIS \u2013 Association for the Social Support of Youth (Greece) \u2014 <em>Romani, Greco-Romani<\/em><\/li>\n\n\n\n<li><strong>Icelandic AI Safety Benchmarks: Creating Open Evaluation Datasets for LLM Safety in a Low-Resource Language<\/strong> \u2014 University of Iceland (Iceland) \u2014 <em>Icelandic<\/em><\/li>\n\n\n\n<li><strong>LuxVLD: Luxembourgish Vision-Language Dataset for Education and Digital Inclusion<\/strong> \u2014 SnT, University of Luxembourg (Luxembourg) \u2014 <em>Luxembourgish<\/em><\/li>\n\n\n\n<li><strong>PARLA CHIARO (Speak Clearly) \u2013 Protecting Italian Dialect Speakers from AI-generated Health Misinformation<\/strong> \u2014 University of Naples Federico II (Italy) \u2014 <em>Neapolitan, Sicilian, Roman<\/em><\/li>\n\n\n\n<li><strong>Protecting Kosovo&#8217;s languages through responsible AI<\/strong> \u2014 Radio Television of Kosovo &#8211; RTK (Kosovo) \u2014 <em>Serbian, Turkish, Bosnian, and Romani<\/em><\/li>\n\n\n\n<li><strong>RhaetoChat: LLM Fine-Tuning Data for Rhaeto-Romance Languages<\/strong> \u2014 Department of Computational Linguistics, University of Zurich (Switzerland) \u2014 <em>Romansh and Ladin<\/em><\/li>\n\n\n\n<li><strong>SaqWI: Korpus Malti ta\u2019 Mistoqsijiet u Twe\u0121ibiet \/ SaqWI: A Maltese Corpus of Qs & As (SaqWI-QA)<\/strong> \u2014 \u010aentru tal-Ilsien Malti (CIM) (Malta) \u2014 <em>Maltese<\/em><\/li>\n\n\n\n<li><strong>Scaling Finweb2-HQ: Multi-Signal Extraction and Quality Enhancement for European Language Models and Beyond<\/strong> \u2014 EPFL (Switzerland) \u2014 <em>Multi-language<\/em><\/li>\n\n\n\n<li><strong>Speaking Ladino: Open Speech and Text Datasets for AI-Powered Language Preservation<\/strong> \u2014 Inalco 
Paris (France) \u2014 <em>Ladino<\/em><\/li>\n\n\n\n<li><strong>Wikispeech for All: Basque Edition<\/strong> \u2014 Wikimedia Sverige (Sweden) \u2014 <em>Basque<\/em><\/li>\n<\/ul>\n\n\n\n<p>In addition to the selected awardees, further projects will receive support through Azure compute credits.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p><a id=\"_msocom_1\"><\/a><\/p>\n\n\n\n<div class=\"wp-block-media-text has-vertical-margin-small  has-vertical-padding-none  is-stacked-on-mobile\" style=\"grid-template-columns:35% auto\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-talking_1400x788-1024x576.jpg\" alt=\"LINGUA Open Call | photo of three people talking with a fourth in the background\" class=\"wp-image-1150201 size-full\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-talking_1400x788-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-talking_1400x788-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-talking_1400x788-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-talking_1400x788-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-talking_1400x788-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-talking_1400x788-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-talking_1400x788-640x360.jpg 640w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-talking_1400x788-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-talking_1400x788-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_people-talking_1400x788.jpg 1400w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure><div class=\"wp-block-media-text__content\">\n<h3 class=\"wp-block-heading\" id=\"our-commitment\">Eligibility<\/h3>\n\n\n\n<p>We encouraged proposals from nonprofits, NGOs, universities, research institutions, social enterprises, cultural organizations, and startups. Proposals with multiple collaborators were welcome, particularly from those committed to the public good and able to demonstrate strong community engagement and ethical data practices.<\/p>\n\n\n\n<p>To be eligible, applicants were required to demonstrate a commitment to producing fully open\u2011licensed datasets for text\u2011to\u2011text, speech\u2011to\u2011text, and text\u2011to\u2011speech applications. 
These efforts laid critical groundwork for the inclusion of low\u2011resource languages in open language and speech models.<\/p>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\"><\/div>\n\n\n\n<div style=\"padding-bottom:64px; padding-top:64px\" class=\"wp-block-msr-immersive-section alignfull row has-background has-black-background-color has-text-color has-white-color wp-block-msr-immersive-section\">\n\t\n\t<div class=\"container\">\n\t\t<div class=\"wp-block-msr-immersive-section__wrapper col-lg-11 col-xl-9 px-0 m-auto\">\n\t\t\t<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\"><\/div>\t\t<\/div>\n\t<\/div>\n\n\t<img loading=\"lazy\" decoding=\"async\" width=\"1600\" height=\"600\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/01\/AI-Good-One-Future-Grant_Apply_1600x600.jpg\" class=\"wp-block-msr-immersive-section__background-image\" alt=\"AI for Good - abstract background with dark blue and black wavy lines\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/01\/AI-Good-One-Future-Grant_Apply_1600x600.jpg 1600w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/01\/AI-Good-One-Future-Grant_Apply_1600x600-300x113.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/01\/AI-Good-One-Future-Grant_Apply_1600x600-1024x384.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/01\/AI-Good-One-Future-Grant_Apply_1600x600-768x288.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/01\/AI-Good-One-Future-Grant_Apply_1600x600-1536x576.jpg 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/01\/AI-Good-One-Future-Grant_Apply_1600x600-240x90.jpg 240w\" sizes=\"auto, (max-width: 1600px) 100vw, 1600px\" 
\/><\/div>\n\n\n","protected":false},"featured_media":1149976,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":null,"footnotes":""},"msr-opportunity-type":[155533],"msr-region":[239178],"msr-locale":[268875],"msr-program-audience":[],"msr-post-option":[269148,269142],"msr-impact-theme":[],"class_list":["post-1149202","msr-academic-program","type-msr-academic-program","status-publish","has-post-thumbnail","hentry","msr-opportunity-type-grants-and-fellowships","msr-region-europe","msr-locale-en_us","msr-post-option-approved-for-river","msr-post-option-include-in-river"],"msr_description":"","msr_social_media":[],"related-researchers":[],"tab-content":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-academic-program\/1149202","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-academic-program"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-academic-program"}],"version-history":[{"count":37,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-academic-program\/1149202\/revisions"}],"predecessor-version":[{"id":1160893,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-academic-program\/1149202\/revisions\/1160893"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1149976"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1149202"}],"wp:term":[{"taxonomy":"msr-opportunity-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-opportunity-type?post=1149202"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft
.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1149202"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1149202"},{"taxonomy":"msr-program-audience","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-program-audience?post=1149202"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1149202"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1149202"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}