{"id":1160590,"date":"2026-01-20T21:38:57","date_gmt":"2026-01-21T05:38:57","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=1160590"},"modified":"2026-01-28T13:25:24","modified_gmt":"2026-01-28T21:25:24","slug":"ai-for-low-resource-languages","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/ai-for-low-resource-languages\/","title":{"rendered":"AI for Low-Resource Languages"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background- card-background--full-bleed\">\n\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1920\" height=\"720\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/low-resource-languages_header_1920x720.jpg\" class=\"attachment-full size-full\" alt=\"AI for low resource languages - Inuktitut Syllabics\" style=\"\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/low-resource-languages_header_1920x720.jpg 1920w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/low-resource-languages_header_1920x720-300x113.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/low-resource-languages_header_1920x720-1024x384.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/low-resource-languages_header_1920x720-768x288.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/low-resource-languages_header_1920x720-1536x576.jpg 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/low-resource-languages_header_1920x720-1600x600.jpg 1600w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/low-resource-languages_header_1920x720-240x90.jpg 240w\" sizes=\"auto, (max-width: 1920px) 100vw, 1920px\" \/>\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 \">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 class=\"wp-block-heading\" id=\"bridging-the-language-gap\">Bridging the language gap<\/h1>\n\n\n\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/group\/ai-for-good-research-lab\/\">< AI for Good Lab <\/a><\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<div class=\"wp-block-media-text has-vertical-margin-small  has-vertical-padding-none  has-media-on-the-right is-stacked-on-mobile is-style-border\" style=\"grid-template-columns:auto 40%\"><div class=\"wp-block-media-text__content\">\n<h2 class=\"wp-block-heading\" id=\"giving-every-language-a-place-in-the-digital-world\">Giving every language a place in the digital world<\/h2>\n\n\n\n<p>Language is more than communication\u2014it sustains <strong>culture, identity, and opportunity<\/strong>. Yet most of the world\u2019s languages remain underrepresented in today\u2019s AI systems because they have limited digital content, few high-quality datasets, and little benchmark data to measure progress.<\/p>\n\n\n\n<p>Our work focuses on practical, partner-driven pathways to make modern AI usable and safer in low-resource settings\u2014combining data stewardship, evaluation benchmarks, translation tools, and adaptable training workflows that can be reused across languages and contexts.<\/p>\n<\/div><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-1024x576.jpg\" alt=\"diagram, schematic for languages available on the internet\" class=\"wp-image-1160610 size-full\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-1536x864.jpg 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-2048x1152.jpg 2048w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/Underrepresented-Languages-1920x1080.jpg 1920w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n\n\n\n<div class=\"wp-block-media-text has-vertical-margin-small  has-vertical-padding-none  is-stacked-on-mobile\" style=\"grid-template-columns:40% auto\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_hero-1400x788-1-1024x576.jpg\" alt=\"LINGUA Open Call | abstract pattern of chat bubbles\" class=\"wp-image-1149976 size-full\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_hero-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_hero-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_hero-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_hero-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_hero-1400x788-1-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_hero-1400x788-1-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_hero-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_hero-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_hero-1400x788-1-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/LINGUA-program_hero-1400x788-1.jpg 1400w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure><div class=\"wp-block-media-text__content\">\n<h2 class=\"wp-block-heading\" id=\"microsoft-ai-for-good-lab-lingua-awardees-announced\">Microsoft AI for Good Lab LINGUA awardees announced<\/h2>\n\n\n\n<p>The Microsoft AI for Good Lab has announced the awardees of <strong>LINGUA: Expanding Europe\u2019s Voices in AI<\/strong>, an open call supporting ethical, open dataset creation for European languages underrepresented in digital spaces and AI systems.<\/p>\n\n\n\n<p>The selected projects span 16 languages and dialects across 10 countries, representing a diverse mix of low\u2011resource, vulnerable, and underrepresented linguistic communities. Led by universities, nonprofits, a government language center and public broadcaster, the awardees are advancing multilingual AI by expanding access to speech and text data and strengthening Europe\u2019s linguistic diversity.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/academic-program\/lingua-expanding-europes-voices-in-ai\/\">Program overview & awardees<\/a><\/div>\n<\/div>\n<\/div><\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<div class=\"wp-block-columns are-vertically-aligned-top is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-top is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<div class=\"yt-consent-placeholder\" role=\"region\" aria-label=\"Video playback requires cookie consent\" data-video-id=\"vdwaVHcJKEQ\" data-poster=\"https:\/\/img.youtube.com\/vi\/vdwaVHcJKEQ\/maxresdefault.jpg\"><iframe aria-hidden=\"true\" tabindex=\"-1\" title=\"Supporting\u00a0endangered languages\" width=\"500\" height=\"281\" data-src=\"https:\/\/www.youtube-nocookie.com\/embed\/vdwaVHcJKEQ?start=14&feature=oembed&rel=0&enablejsapi=1\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><div class=\"yt-consent-placeholder__overlay\"><button class=\"yt-consent-placeholder__play\"><svg width=\"42\" height=\"42\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" aria-hidden=\"true\" focusable=\"false\"><g fill=\"none\" fill-rule=\"evenodd\"><circle fill=\"#000\" opacity=\".556\" cx=\"21\" cy=\"21\" r=\"21\"\/><path stroke=\"#FFF\" d=\"M27.5 22l-12 8.5v-17z\"\/><\/g><\/svg><span class=\"yt-consent-placeholder__label\">Video playback requires cookie consent<\/span><\/button><\/div><\/div>\n<\/div><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"over-2-000-languages-are-at-risk-of-disappearing-1\">Over 2,000 languages are at risk of disappearing<\/h3>\n\n\n\n<p>In the age of AI, the inclusion of all languages is essential for communities and culture. Learn how AI is being used to help preserve and expand access to low resource languages like Inuktitut.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/unlocked.microsoft.com\/llms-nunavut\/\" target=\"_blank\" rel=\"noreferrer noopener\">Read about Inuktitut<\/a><\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-top is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"942\" height=\"530\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/BYOL_language-scatter-chart_1280x720.png\" alt=\"Bring your own language (BYOL) scatter chart showing large language resource allocation\" class=\"wp-image-1160931\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/BYOL_language-scatter-chart_1280x720.png 942w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/BYOL_language-scatter-chart_1280x720-300x169.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/BYOL_language-scatter-chart_1280x720-768x432.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/BYOL_language-scatter-chart_1280x720-655x368.png 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/BYOL_language-scatter-chart_1280x720-240x135.png 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2026\/01\/BYOL_language-scatter-chart_1280x720-640x360.png 640w\" sizes=\"auto, (max-width: 942px) 100vw, 942px\" \/><\/figure>\n\n\n\n<div style=\"height:30px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"byol-bring-your-own-language-into-llms\">BYOL: Bring Your Own Language into LLMs<\/h3>\n\n\n\n<p>For low\u2011resource languages, the Bring Your Own Language (BYOL) framework uses data refinement, synthetic generation, and fine\u2011tuning to build language\u2011specific models that outperform multilingual baselines.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/arxiv.org\/pdf\/2601.10804\">Read the paper<\/a><\/div>\n\n\n\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/github.com\/microsoft\/byol\" target=\"_blank\" rel=\"noreferrer noopener\">Access the code<\/a><\/div>\n<\/div>\n<\/div>\n<\/div>\n\n\n","protected":false},"excerpt":{"rendered":"<p>< AI for Good Lab Language is more than communication\u2014it sustains culture, identity, and opportunity. Yet most of the world\u2019s languages remain underrepresented in today\u2019s AI systems because they have limited digital content, few high-quality datasets, and little benchmark data to measure progress. Our work focuses on practical, partner-driven pathways to make modern AI usable [&hellip;]\n<\/p>\n","protected":false},"featured_media":1161055,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556,13545],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-1160590","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-research-area-human-language-technologies","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"user_nicename","display_name":"Ursula Hardy","user_id":42708,"people_section":"Section name 0","alias":"hardyurs"}],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1160590","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":33,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1160590\/revisions"}],"predecessor-version":[{"id":1161057,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1160590\/revisions\/1161057"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1161055"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1160590"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1160590"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1160590"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1160590"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1160590"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}