{"id":1119384,"date":"2025-02-20T05:34:34","date_gmt":"2025-02-20T13:34:34","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=1119384"},"modified":"2026-03-27T00:39:20","modified_gmt":"2026-03-27T07:39:20","slug":"project-gecko","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-gecko\/","title":{"rendered":"Project Gecko"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background-gable-green card-background--full-bleed\">\n\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"2048\" height=\"1365\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/DSC05804.jpg\" class=\"attachment-full size-full\" alt=\"Image of peppers growing on a vine.\" style=\"\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/DSC05804.jpg 2048w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/DSC05804-300x200.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/DSC05804-1024x683.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/DSC05804-768x512.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/DSC05804-1536x1024.jpg 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/DSC05804-240x160.jpg 240w\" sizes=\"auto, (max-width: 2048px) 100vw, 2048px\" \/>\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 \">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body 
\">\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 class=\"wp-block-heading\" id=\"project-gecko\">Project Gecko<\/h1>\n\n\n\n<p><\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<p>As generative AI transforms productivity and access to knowledge, its benefits must extend beyond English-speaking and Western-centric contexts. Historically, each industrial revolution has introduced equity gaps that profoundly deepened societal divides. In this fourth industrial revolution driven by AI, there is a valid concern that these disparities could widen further. The challenge lies in how this technology navigates the complexities of non-English languages and cultures. However, at Microsoft Research we are committed to building AI models and copilots that are equitable by design, capable of serving diverse linguistic and cultural contexts, and focused on closing, not widening, global opportunity gaps.<\/p>\n\n\n\n<p>Project Gecko is a cross-lab Microsoft Research initiative advancing the Equitable Generative AI Project by developing equitable models and trusted multilingual copilots\u2014designed for population-scale use, capable of serving entire communities.<\/p>\n\n\n\n<p>Co-led by the Microsoft Research Accelerator, Microsoft Research India, and Microsoft Research Africa, Nairobi, Gecko explores how AI can be adapted and deployed widely in low-resource settings. The project integrates research in multilingual small and large language models, multilingual speech models, synthetic data creation for low-resource settings, HCI, and grounded evaluation. 
To do this, we deploy a blend of NLP, ethnographic design, and machine learning methods to create new multidisciplinary paradigms for building human-centered AI \u2014 with real-world deployments spanning agriculture and education.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"why-this-matters\">Why this matters:<\/h2>\n\n\n\n<p>Large language models have the potential to transform how people access information and services. Yet, most are trained on high-resource languages and reflect dominant cultural contexts. Gecko seeks to reverse this dynamic \u2014 building AI systems from the ground up, shaped by the knowledge, languages, and modalities of the global majority. Achieving population-scale impact requires a fundamental rethinking of how AI is localized, evaluated, and deployed.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"core-research-areas\">Core research areas:<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Multimodal grounding using trusted community videos<\/li>\n\n\n\n<li>Speech-first interfaces for oral-first language users<\/li>\n\n\n\n<li>Local adaptation of LLMs, SLMs, and speech models to support a range of languages (including Kiswahili, Hindi, Kikuyu) and cultural contexts in India and East Africa<\/li>\n\n\n\n<li>Synthetic data creation for low-resource settings<\/li>\n\n\n\n<li>Human-centered evaluation frameworks for trust, relevance, and cultural alignment<\/li>\n\n\n\n<li>Ethnomethodologically-informed research and design to co-create AI experiences with local users<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"initial-focus-1\">Initial focus:<\/h2>\n\n\n\n<p>The initial phase of Project Gecko centers on agriculture in East Africa and South Asia, where we are investigating how trusted, multilingual Copilots can operate at population scale. 
In partnership with Microsoft Research Africa, Nairobi, Microsoft Research India, and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/digitalgreen.org\/\">Digital Green<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> \u2014 a global development organization that partners with governments and grassroots organizations to build community-driven digital infrastructure for agriculture \u2014 we are exploring the infrastructure, modeling strategies (particularly around small language models), and evaluation protocols necessary to enable scaled deployment.<\/p>\n\n\n\n<p>This milestone involves <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/farmerchat.digitalgreen.org\/\">Farmer.Chat<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, an AI-powered web app assistant developed by Digital Green that enables smallholder farmers to engage with community-contributed agricultural video content via a speech-first interface tailored for use in Kiswahili and Kikuyu.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-outline is-style-outline--1\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/story\/advancing-ai-to-meet-needs-of-the-global-majority\/\">Read the story<\/a><\/div>\n<\/div>\n\n\n\n<p>Initial pilots in Kenya demonstrate measurable improvements in response quality, usability, and user trust\u2014offering early signals for how community-grounded, multilingual Copilots might perform in similar contexts.<\/p>\n\n\n\n<p>Through this effort, we are also evaluating how the VeLLM platform can serve as a replicable playbook for grounded Copilot development. 
By analyzing what works in this agricultural context, we aim to identify generalizable design patterns, tools, and infrastructure that can be extended to future domains such as education and health \u2014 helping to enable scalable, locally relevant Copilots across a wide range of linguistic and cultural settings.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"our-mission\">Our mission:<\/h2>\n\n\n\n<p>To accelerate generative AI adoption in regions where low-resource languages, oral knowledge, and community media are central to daily life.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"platform-foundation\">Platform foundation:<\/h2>\n\n\n\n<p>Project Gecko is built on <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-vellm\/\">VeLLM<\/a> (uniVersal Empowerment with LLMs)\u2014a platform developed by Microsoft Research India that supports:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Multilingual and multimodal Copilot development<\/li>\n\n\n\n<li>Grounding in culturally contextual and community-contributed data<\/li>\n\n\n\n<li>Principled evaluation of trust, utility, and equity<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"key-highlights\">Key highlights:<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Target Audience:<\/strong>&nbsp;Project Gecko focuses on enabling population-scale Copilots for the global majority \u2014 prioritizing support for low-resource languages and content grounded in oral and video-based knowledge.<\/li>\n\n\n\n<li><strong>Core Platform:<\/strong> Project Gecko leverages VeLLM, a platform developed by MSR India, to support multilingual, multimodal Copilot creation grounded in culturally relevant data. 
VeLLM is designed to be a replicable foundation for Copilots across domains and geographies.<\/li>\n\n\n\n<li><strong>Initial Phase:<\/strong> The first milestone is centered in East Africa and South Asia, in collaboration with Microsoft Research Africa, Microsoft Research India, and Digital Green. The pilot explores agricultural Copilots using community video, small language models, and speech-first interfaces via the farmer.chat app.<\/li>\n\n\n\n<li><strong>Research Focus:<\/strong> Spans applied research in multimodal retrieval, small language models, grounding and trust evaluation, and participatory design. The goal is to identify scalable patterns for Copilot development across underserved contexts.<\/li>\n\n\n\n<li><strong>Looking Ahead: <\/strong>Project Gecko reflects a broader commitment within Microsoft Research: to ensure that the next generation of AI is not only powerful \u2014 but globally inclusive, culturally relevant, and shaped by the communities it aims to serve.<\/li>\n<\/ul>\n\n\n\n<div style=\"height:30px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n\n\n<div style=\"height:40px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading has-text-align-center\" id=\"toolkits-for-building-more-equitable-ai-systems\">Toolkits for Building More Equitable AI Systems<\/h2>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"has-text-align-center\" id=\"explore-our-practical-guides-learnings-and-frameworks-for-deploying-human-centred-ai-solutions-across-cultures-building-robust-speech-models-and-creating-multi-lingual-and-multi-cultural-ai-systems\">Explore our practical guides, learnings, and frameworks for deploying human-centred AI solutions across cultures, building robust speech models, and creating multi-lingual and multi-cultural AI systems.<\/p>\n\n\n\n<div style=\"height:40px\" aria-hidden=\"true\" 
class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<div class=\"wp-block-cover\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"640\" class=\"wp-block-cover__image-background wp-image-1162073 size-large\" alt=\"shape, rectangle\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Atlas-Background-1024x640.png\" data-object-fit=\"cover\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Atlas-Background-1024x640.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Atlas-Background-300x188.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Atlas-Background-768x480.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Atlas-Background-240x150.png 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Atlas-Background.png 1440w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><span aria-hidden=\"true\" class=\"wp-block-cover__background has-background-dim\" style=\"background-color:#9f1459\"><\/span><div class=\"wp-block-cover__inner-container is-layout-constrained wp-block-cover-is-layout-constrained\">\n<h4 class=\"wp-block-heading has-text-align-center\" id=\"atlas\">Atlas<\/h4>\n\n\n\n<p class=\"has-text-align-center\">Deploy human-centred AI tools across cultures<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link has-text-align-center wp-element-button\" href=\"https:\/\/microsoft.github.io\/AtlasPlaybook\/\" 
target=\"_blank\" rel=\"noreferrer noopener\">View Atlas<\/a><\/div>\n<\/div>\n<\/div><\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<div class=\"wp-block-cover\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"640\" class=\"wp-block-cover__image-background wp-image-1162077 size-large\" alt=\"shape, rectangle\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Paza-Background-1024x640.png\" data-object-fit=\"cover\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Paza-Background-1024x640.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Paza-Background-300x188.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Paza-Background-768x480.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Paza-Background-240x150.png 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Paza-Background.png 1440w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><span aria-hidden=\"true\" class=\"wp-block-cover__background has-background-dim\" style=\"background-color:#0e5a5d\"><\/span><div class=\"wp-block-cover__inner-container is-layout-constrained wp-block-cover-is-layout-constrained\">\n<h4 class=\"wp-block-heading has-text-align-center\" id=\"paza\">Paza<\/h4>\n\n\n\n<p class=\"has-text-align-center\">Build robust speech models<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/microsoft.github.io\/Paza\/\" target=\"_blank\" rel=\"noreferrer noopener\">View Paza<\/a><\/div>\n<\/div>\n<\/div><\/div>\n<\/div>\n\n\n\n<div 
class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<div class=\"wp-block-cover\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"640\" class=\"wp-block-cover__image-background wp-image-1162079 size-large\" alt=\"shape\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Vibhasha-Background-1024x640.png\" data-object-fit=\"cover\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Vibhasha-Background-1024x640.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Vibhasha-Background-300x188.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Vibhasha-Background-768x480.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Vibhasha-Background-240x150.png 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/02\/Vibhasha-Background.png 1440w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><span aria-hidden=\"true\" class=\"wp-block-cover__background has-background-dim\" style=\"background-color:#312a9a\"><\/span><div class=\"wp-block-cover__inner-container is-layout-constrained wp-block-cover-is-layout-constrained\">\n<h4 class=\"wp-block-heading has-text-align-center\" id=\"vibhasha\">Vibhasha<\/h4>\n\n\n\n<p class=\"has-text-align-center\">Build multi-lingual and multi-cultural AI systems<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-fill\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"http:\/\/aka.ms\/vibhasha\" target=\"_blank\" rel=\"noreferrer noopener\">View Vibhasha<\/a><\/div>\n<\/div>\n<\/div><\/div>\n<\/div>\n<\/div>\n\n\n\n\n\n<h2 class=\"wp-block-heading\" 
id=\"mmctagent-enabling-multimodal-reasoning-over-large-video-and-image-collections\">MMCTAgent: Enabling multimodal reasoning over large video and image collections<\/h2>\n\n\n\n<p>Modern multimodal AI models can recognize objects, describe scenes, and answer questions about images and short video clips, but they struggle with long-form and large-scale visual data, where real-world reasoning requires moving beyond object recognition and short-clip analysis. Existing models typically perform single-pass inference, producing one-shot answers. This limits their ability to handle tasks that require temporal reasoning, cross-modal grounding, and iterative refinement.<\/p>\n\n\n\n<p>To meet these challenges, we developed the&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/mmctagent-multi-modal-critical-thinking-agent-framework-for-complex-visual-reasoning\/?msockid=153992cb7df169482b9487167c0968e9\" target=\"_blank\" rel=\"noreferrer noopener\">Multi-modal Critical Thinking Agent<\/a>, or MMCTAgent, for structured reasoning over long-form video and image data, available on&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/MMCTAgent\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>&nbsp;and featured on&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/labs.ai.azure.com\/projects\/mmct-agent\/\" target=\"_blank\" rel=\"noopener noreferrer\">Azure AI Foundry Labs<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n\n\n\n<p>Built on&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/autogen\" target=\"_blank\" rel=\"noreferrer noopener\">AutoGen<\/a>, Microsoft\u2019s open-source multi-agent system, MMCTAgent provides multimodal question-answering with a Planner\u2013Critic architecture. 
This design enables planning, reflection, and tool-based reasoning, bridging perception and deliberation in multimodal tasks. It links language, vision, and temporal understanding, transforming static multimodal tasks into dynamic reasoning workflows.<\/p>\n\n\n\n<p>Unlike conventional models that produce one-shot answers, MMCTAgent uses modality-specific agents, such as ImageAgent and VideoAgent, each equipped with tools like get_relevant_query_frames() and object_detection_tool(). These agents perform deliberate, iterative reasoning\u2014selecting the right tools for each modality, evaluating intermediate results, and refining conclusions through a Critic loop. This enables MMCTAgent to analyze complex queries across long videos and large image libraries with explainability, extensibility, and scalability.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/labs.ai.azure.com\/projects\/mmct-agent\/\" target=\"_blank\" rel=\"noreferrer noopener\">MMCTAgent on Azure AI Foundry Labs<\/a><\/div>\n<\/div>\n\n\n\n<div style=\"height:32px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n\n\n<h2 class=\"wp-block-heading\" id=\"pazabench-the-first-asr-leaderboard-for-low-resource-languages-1\">PazaBench: The first ASR leaderboard for low-resource languages<\/h2>\n\n\n\n<p>PazaBench is the first automatic speech recognition (ASR) leaderboard dedicated to low\u2011resource languages. It launches with initial coverage for 39 African languages and benchmarks 52 state\u2011of\u2011the\u2011art ASR and language models, including newly released Paza ASR models for six Kenyan languages. 
The platform aggregates leading public and community datasets\u2014spanning conversational, scripted read-aloud, unscripted, broadcast news, and domain-specific speech\u2014into one easy\u2011to\u2011explore leaderboard per language. This makes it easier for researchers, developers, and product teams to assess which models perform best across underserved languages and regions, understand trade-offs between speed and accuracy, and identify where gaps persist.<\/p>\n\n\n\n<div style=\"height:32px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:66.66%\">\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"PazaBench Walkthrough\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube-nocookie.com\/embed\/jAuuh0saMUI?feature=oembed&rel=0\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:33.33%\">\n<div style=\"height:32px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p><strong>PazaBench tracks three core metrics:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Character Error Rate (CER)<\/strong>, which is especially important for languages with rich word forms: because meaning is built by combining word parts, errors at the character level can significantly change meaning<\/li>\n\n\n\n<li><strong>Word Error Rate (WER)<\/strong>, which measures word-level transcript accuracy<\/li>\n\n\n\n<li><strong>RTFx (Inverse Real\u2011Time Factor)<\/strong>, which measures how fast transcription runs relative to real\u2011time audio duration.<\/li>\n<\/ol>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/huggingface.co\/spaces\/microsoft\/paza-bench\" target=\"_blank\" rel=\"noreferrer noopener\">Explore PazaBench<\/a><\/div>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"paza-asr-models-built-with-and-for-kenyan-languages\">Paza ASR Models: Built with and for Kenyan languages<\/h2>\n\n\n\n<p>The Paza collection comprises three ASR models fine-tuned on top of state\u2011of\u2011the\u2011art architectures. Each model targets <em>Swahili<\/em>, a mid-resource language, and five low\u2011resource Kenyan languages: <em>Dholuo, Kalenjin, Kikuyu, Maasai, and Somali<\/em>. The models are fine-tuned on public and curated proprietary datasets.<\/p>\n\n\n\n<p>Fine\u2011tuning the three models allowed us to explore complementary approaches toward a shared goal: building speech recognition systems that are usable in local contexts, starting with the six Kenyan languages, and bridging the gaps in multilingual, multimodal video question answering through the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/labs.ai.azure.com\/projects\/mmct-agent\/\" target=\"_blank\" rel=\"noopener noreferrer\">MMCTAgent<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n\n\n\n<p>Early versions of two models in Kikuyu and Swahili were deployed on mobile devices and tested directly with farmers in real\u2011world settings, 
enabling the team to observe how the models performed in everyday use. This feedback loop directly informed subsequent fine\u2011tuning, ensuring model improvements were driven not only by benchmark scores, but by the needs and expectations of the communities they are intended to serve.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-fill\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/huggingface.co\/collections\/microsoft\/paza\" target=\"_blank\" rel=\"noreferrer noopener\">Explore Paza Model Collection<\/a><\/div>\n<\/div>\n\n\n\n<p><\/p>\n\n\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>As generative AI transforms productivity and access to knowledge, its benefits must extend beyond English-speaking and Western-centric contexts. Historically, each industrial revolution has introduced equity gaps profoundly impacting societal divides. In this fourth industrial revolution driven by AI, there is a valid concern that these disparities could widen further. 
The challenge lies in how this [&hellip;]<\/p>\n","protected":false},"featured_media":1136473,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556,13545,13554,13559,13568],"msr-locale":[268875],"msr-impact-theme":[261667],"msr-pillar":[],"class_list":["post-1119384","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-research-area-human-language-technologies","msr-research-area-human-computer-interaction","msr-research-area-social-sciences","msr-research-area-technology-for-emerging-markets","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"2024-07-01","related-publications":[963351,1083309,1151247],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[1140561],"related-opportunities":[],"related-posts":[1153693,1160691],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"user_nicename","display_name":"Ade Famoti","user_id":43005,"people_section":"Section name 0","alias":"adfamoti"},{"type":"user_nicename","display_name":"Tanuja Ganu","user_id":38883,"people_section":"Section name 0","alias":"taganu"},{"type":"user_nicename","display_name":"Kalika Bali","user_id":32477,"people_section":"Section name 0","alias":"kalikab"},{"type":"user_nicename","display_name":"Jacki O&#039;Neill","user_id":32172,"people_section":"Section name 0","alias":"jaoneil"},{"type":"user_nicename","display_name":"Sunayana Sitaram","user_id":37287,"people_section":"Section name 0","alias":"susitara"},{"type":"user_nicename","display_name":"Akshay Nambi","user_id":38169,"people_section":"Section name 0","alias":"akshayn"},{"type":"user_nicename","display_name":"Ogbemi Ekwejunor-Etchie","user_id":43797,"people_section":"Section name 
0","alias":"ogbemie"},{"type":"user_nicename","display_name":"Mercy Muchai","user_id":40846,"people_section":"Section name 0","alias":"mercymuchai"},{"type":"user_nicename","display_name":"Samuel Chege Maina","user_id":40321,"people_section":"Section name 0","alias":"samuelmaina"},{"type":"user_nicename","display_name":"Stephanie Nyairo","user_id":40282,"people_section":"Section name 0","alias":"snyairo"},{"type":"guest","display_name":"Kevin Chege","user_id":1162212,"people_section":"Section name 0","alias":""},{"type":"guest","display_name":"Nick Mumero","user_id":1162214,"people_section":"Section name 0","alias":""},{"type":"guest","display_name":"Prashant Kodali","user_id":1162218,"people_section":"Section name 0","alias":""},{"type":"guest","display_name":"Liz  Ankrah","user_id":1162216,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Millicent Ochieng","user_id":40678,"people_section":"Section name 0","alias":"mochieng"},{"type":"user_nicename","display_name":"Kavyansh Chourasia","user_id":43029,"people_section":"Section name 0","alias":"kchourasia"},{"type":"user_nicename","display_name":"Najeeb G. 
Abdulhamid","user_id":40894,"people_section":"Section name 0","alias":"nabdulhamid"},{"type":"user_nicename","display_name":"Amber Tingle","user_id":42681,"people_section":"Section name 0","alias":"ambertingle"}],"msr_research_lab":[199562,199565,1021599],"msr_impact_theme":["Empowerment"],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1119384","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":38,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1119384\/revisions"}],"predecessor-version":[{"id":1166907,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1119384\/revisions\/1166907"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1136473"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1119384"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1119384"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1119384"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1119384"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1119384"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}