News and features
Paza: Introducing automatic speech recognition benchmarks and models for low resource languages
| Mercy Muchai, Kevin Chege, Nick Mumero, and Stephanie Nyairo
Microsoft Research unveils Paza, a human-centered speech pipeline, and PazaBench, the first leaderboard for low-resource languages. It covers 39 African languages and 52 models and is tested with communities in real settings.
MMCTAgent: Enabling multimodal reasoning over large video and image collections
| Akshay Nambi, Kavyansh Chourasia, and Tanuja Ganu
MMCTAgent enables dynamic multimodal reasoning with iterative planning and reflection. Built on Microsoft’s AutoGen framework, it integrates language, vision, and temporal understanding for complex tasks like long video and image analysis.
In the news | Microsoft Research Story
Advancing AI to meet needs of the global majority
AI tools can perform poorly in non-Western languages and lack critical cultural context for many populations. Project Gecko uses small language models to bring vital expertise to farmers in underserved areas using local languages and multi-modal content.