ELaTE
Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like. ELaTE is a zero-shot text-to-speech (TTS) system that can generate natural laughing speech from any speaker based on a speaker prompt to mimic the voice characteristic, a…
An Interactive Agent Foundation Model
Orca-2-13B
Orca 2 is a finetuned version of LLAMA-2. It is built for research purposes only and provides a single-turn response in tasks such as reasoning over user-given data, reading comprehension, math problem solving…
Orca-2-7B
Orca 2 is a finetuned version of LLAMA-2. It is built for research purposes only and provides a single-turn response in tasks such as reasoning over user-given data, reading comprehension, math problem solving…
Research Focus: Week of January 22, 2024
Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft. Join Microsoft Research Forum…
Deep Learning Acoustic Model in Microsoft Cortana Voice Assistant
Deep learning acoustic modeling has been widely deployed in real-world speech recognition products and services that benefit millions of users. In this talk, I will first briefly describe selected developments and investigations at Microsoft to…
TaskWeaver: A code-first agent framework for efficient data analytics and domain adaptation
AI-backed virtual assistants face challenges in handling complex data structures. TaskWeaver helps users build assistants that understand diverse domain questions, follow examples, and efficiently execute customizable algorithms on complex data structures.
Research Focus: Week of January 8, 2024
Mixture-of-linear-experts for long-term time series forecasting; Weakly-supervised streaming multilingual speech model with truly zero-shot capability; KBFormer: Diffusion model for structured entity completion; Identifying risks of AI-mediated data access: