News & features
Paza: Introducing automatic speech recognition benchmarks and models for low resource languages
Mercy Muchai, Kevin Chege, Nick Mumero, and Stephanie Nyairo
Microsoft Research unveils Paza, a human-centered speech pipeline, and PazaBench, the first automatic speech recognition leaderboard for low-resource languages. PazaBench covers 39 African languages and 52 models, and Paza is tested with communities in real settings.
MMCTAgent: Enabling multimodal reasoning over large video and image collections
Akshay Nambi, Kavyansh Chourasia, and Tanuja Ganu
MMCTAgent enables dynamic multimodal reasoning with iterative planning and reflection. Built on Microsoft’s AutoGen framework, it integrates language, vision, and temporal understanding for complex tasks like long video and image analysis.
In the news | Microsoft Research Story
Advancing AI to meet needs of the global majority
AI tools can perform poorly in non-Western languages and lack critical cultural context for many populations. Project Gecko uses small language models to bring vital expertise to farmers in underserved areas, using local languages and multimodal content.