作者:刘树杰 编者按:文本到语音合成(Text-to-Speech,TTS)是一种将书面文字转化为自然语音的技术,在提高无障碍性、增强跨语言交流等方面发挥着重要作用。微软亚洲研究院此前推出了第一个离散编码的语音大模型 VALL-E,并在此基础上通过重复感知采样和分组编码建模技术将其升级为 VALL-E 2 版本。新版本突破了语音稳健性、自然度和说话人相似度方面的界限,让零样本 TTS 性能在 Li...
| Robert Osazuwa Ness
Medfuzz tests LLMs by breaking benchmark assumptions, exposing vulnerabilities to bolster real-world accuracy.
Author: Shujie Liu In recent years, the rapid advancement of AI has continually expanded the capabilities of Text-to-Speech (TTS) technology. Ongoing optimizations and innovations in TTS have enriched and simplified voice interaction experiences. These research developments hold significant potential across…
| Alonso Guevara Fernández, Katy Smith, Joshua Bradley, Darren Edge, Ha Trinh, Sarah Smith, Ben Cutler, Steven Truitt, and Jonathan Larson
GraphRAG uses LLM-generated knowledge graphs to substantially improve complex Q&A over retrieval-augmented generation (RAG). Discover automatic tuning of GraphRAG for new datasets, making it more accurate and relevant.
| Gretchen Huizinga, Richard Black, and Dexter Greene
College freshman Dexter Greene and Microsoft research manager Richard Black discuss how technology that stores data in glass is supporting students as they expand earlier efforts to communicate what it means to be human to extraterrestrials.
In the news | GeekWire
Forty-seven years after NASA sent a “Golden Record” into deep space to document humanity’s view of the world, Microsoft’s Project Silica is teaming up with a citizen-science effort to lay the groundwork — or, more aptly, the glasswork — for…
In the news | GZERO
In a world where humanity put a man on the moon before adding wheels to luggage, the rapid advancements in AI seem almost paradoxical. Microsoft’s chief data scientist Juan Lavista, in a recent Global Stage conversation with Tony Maciulis, highlighted this contrast…
编者按:当前多模态模型大致分为两类,一类是专用多模态模型,如文本生成图像、文本生成视频等;另一类则是通用型多模态大语言模型,这类模型的目标是让人工智能具备自然语言理解和生成、图像识别,以及语音和视频的交互能力。近日,微软亚洲研究院又提供了一个新的选择——原生多模态大语言模型。它能够更深入地理解物理世界并执行多模态推理和跨模态迁移,其在不同模态的数据学习中还涌现出了新的能力。 随着人工智能技术的持续...
Because of their probabilistic nature, all AI systems will make mistakes. One of the main challenges in human-AI interaction is to foster appropriate reliance on AI and empower users of AI systems to determine when to accept or not accept…