News & features
Paza: Introducing automatic speech recognition benchmarks and models for low resource languages
| Mercy Muchai, Kevin Chege, Nick Mumero, and Stephanie Nyairo
Microsoft Research unveils Paza, a human-centered speech pipeline, and PazaBench, the first leaderboard for low-resource languages. It covers 39 African languages and 52 models and is tested with communities in real settings.
MMCTAgent: Enabling multimodal reasoning over large video and image collections
| Akshay Nambi, Kavyansh Chourasia, and Tanuja Ganu
MMCTAgent enables dynamic multimodal reasoning with iterative planning and reflection. Built on Microsoft’s AutoGen framework, it integrates language, vision, and temporal understanding for complex tasks like long video and image analysis.
In the news | Microsoft Research Story
Advancing AI to meet needs of the global majority
AI tools can perform poorly in non-Western languages and lack critical cultural context for many populations. Project Gecko uses small language models to bring vital expertise to farmers in underserved areas using local languages and multi-modal content.