AI for low-resource languages - Inuktitut Syllabics

Bridging the language gap

Giving every language a place in the digital world

Language is more than communication—it sustains culture, identity, and opportunity. Yet most of the world’s languages remain underrepresented in today’s AI systems because they have limited digital content, few high-quality datasets, and little benchmark data to measure progress.

Our work focuses on practical, partner-driven pathways to make modern AI more usable and safer in low-resource settings. We combine data stewardship, evaluation benchmarks, translation tools, and adaptable training workflows that can be reused across languages and contexts.

Microsoft AI for Good Lab LINGUA awardees announced

The Microsoft AI for Good Lab has announced the awardees of LINGUA: Expanding Europe’s Voices in AI, an open call supporting ethical, open dataset creation for European languages underrepresented in digital spaces and AI systems.

The selected projects span 16 languages and dialects across 10 countries, representing a diverse mix of low‑resource, vulnerable, and underrepresented linguistic communities. Led by universities, nonprofits, a government language center, and a public broadcaster, the awardees are advancing multilingual AI by expanding access to speech and text data and strengthening Europe’s linguistic diversity.


Over 2,000 languages are at risk of disappearing

In the age of AI, including every language is essential for communities and their cultures. Learn how AI is being used to help preserve and expand access to low-resource languages like Inuktitut.

BYOL: Bring Your Own Language into LLMs

For low‑resource languages, the Bring Your Own Language (BYOL) framework uses data refinement, synthetic generation, and fine‑tuning to build language‑specific models that outperform multilingual baselines.