Articles

VALL-E 2: Substantially improving the robustness and naturalness of large speech models

September 10, 2024

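Author: Shujie Liu. Editor's note: Text-to-Speech (TTS) synthesis is a technology that converts written text into natural speech, and it plays an important role in improving accessibility and enhancing cross-language communication. Microsoft Research Asia previously introduced VALL-E, the first large speech model based on discrete codes, and has since upgraded it to VALL-E 2 using repetition-aware sampling and grouped code modeling. The new version pushes the boundaries of speech robustness, naturalness, and speaker similarity, bringing zero-shot TTS performance on Li...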

Microsoft Research Blog

MedFuzz: Exploring the robustness of LLMs on medical challenge problems

September 10, 2024 | Robert Osazuwa Ness

MedFuzz tests LLMs by breaking benchmark assumptions, exposing vulnerabilities to bolster real-world accuracy.

Articles

VALL-E 2: Enhancing the robustness and naturalness of text-to-speech models

September 10, 2024

Author: Shujie Liu. In recent years, the rapid advancement of AI has continually expanded the capabilities of Text-to-Speech (TTS) technology. Ongoing optimizations and innovations in TTS have enriched and simplified voice interaction experiences. These research developments hold significant potential across…

Microsoft Research Blog

GraphRAG auto-tuning provides rapid adaptation to new domains

September 9, 2024 | Alonso Guevara Fernández, Katy Smith, Joshua Bradley, Darren Edge, Ha Trinh, Sarah Smith, Ben Cutler, Steven Truitt, and Jonathan Larson

GraphRAG uses LLM-generated knowledge graphs to substantially improve complex Q&A over retrieval-augmented generation (RAG). Discover automatic tuning of GraphRAG for new datasets, making it more accurate and relevant.

Microsoft Research Podcast

Collaborators: Silica in space with Richard Black and Dexter Greene

September 5, 2024 | Gretchen Huizinga, Richard Black, and Dexter Greene

College freshman Dexter Greene and Microsoft research manager Richard Black discuss how technology that stores data in glass is supporting students as they expand earlier efforts to communicate what it means to be human to extraterrestrials.

In the news | GeekWire

Microsoft joins with students to document humanity with a ‘Golden Record’ of glass

September 5, 2024

Forty-seven years after NASA sent a “Golden Record” into deep space to document humanity’s view of the world, Microsoft’s Project Silica is teaming up with a citizen-science effort to lay the groundwork — or, more aptly, the glasswork — for…

In the news | GZERO

AI’s evolving role in society

September 4, 2024

In a world where humanity put a man on the moon before adding wheels to luggage, the rapid advancements in AI seem almost paradoxical. Microsoft’s chief data scientist Juan Lavista, in a recent Global Stage conversation with Tony Maciulis, highlighted this contrast…

Articles

Crossing modality boundaries: Exploring native multimodal large language models

September 3, 2024

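Editor's note: Current multimodal models fall roughly into two categories. One is specialized multimodal models, such as text-to-image and text-to-video generation. The other is general-purpose multimodal large language models, which aim to give AI the ability to understand and generate natural language, recognize images, and interact through speech and video. Recently, Microsoft Research Asia offered a new option: native multimodal large language models. These models can understand the physical world more deeply and perform multimodal reasoning and cross-modal transfer, and new capabilities have emerged as they learn from data in different modalities. As AI technology continues to...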

Research Forum | Episode 4 Talk 5 | Mihaela Vorvoreanu

Articles

Fostering appropriate reliance on AI

September 3, 2024

Because of their probabilistic nature, all AI systems will make mistakes. One of the main challenges in human-AI interaction is to foster appropriate reliance on AI and empower users of AI systems to determine when to accept or not accept…