Microsoft Research Lab – Africa, Nairobi

February Seminar

Beyond Swahili: Designing Inclusive AI for Bantu Languages

Join us on Wednesday, February 18, at 3 pm EAT for a talk by Alfred Malengo Kondoro.

Swahili has become one of the most consistently represented African languages in modern AI benchmarks, spanning machine translation, language modeling, and multilingual evaluation suites, far exceeding the coverage of any other Bantu language. This prominence reflects its scale, standardization, and regional reach, but it also exposes the structural challenges of building AI for Bantu languages, including rich morphology, pervasive code-switching, and highly uneven data availability.

In this talk, Alfred will outline how these factors have shaped Swahili’s development within contemporary AI systems, showing why direct transfer from dominant global languages often fails to capture Bantu linguistic structure. Drawing on work in benchmarking, dataset creation, and cross-lingual modeling, he will illustrate how Swahili provides a technically viable bridge for Bantu languages in machine translation, representation learning, and multilingual evaluation, an approach that is less tractable through non-Bantu pivot languages. The talk will conclude with a discussion of how Swahili can be used responsibly as a bridge rather than a proxy, enabling scalable cross-language transfer while avoiding the erasure of linguistic diversity across the Bantu language family.

Past seminars