Multimodal large language models
Articles

Crossing modality boundaries: Exploring native multimodal large language models

September 3, 2024

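Editor's note: Today's multimodal models fall roughly into two categories. One is specialized multimodal models, such as text-to-image and text-to-video generation. The other is general-purpose multimodal large language models, which aim to give AI natural language understanding and generation, image recognition, and speech and video interaction capabilities. Recently, Microsoft Research Asia has offered a new option: native multimodal large language models. These models can understand the physical world more deeply and perform multimodal reasoning and cross-modal transfer, and new capabilities have emerged as they learn from data in different modalities. As artificial intelligence technology continues to...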

Research Forum | Episode 4 Talk 5 | Mihaela Vorvoreanu
Articles

Fostering appropriate reliance on AI 

September 3, 2024

Because of their probabilistic nature, all AI systems will make mistakes. One of the main challenges in human-AI interaction is to foster appropriate reliance on AI and empower users of AI systems to determine when to accept or not accept…

Research Forum | Episode 4 Talk 4 | Kevin Yang
Articles

A generative model of biology for in-silico experimentation and discovery 

September 3, 2024

This talk discusses how deep learning is enabling us to generate novel and useful biomolecules, allowing researchers and practitioners to better understand biology.

Research Forum | Episode 4 Talk 3 | Megan Stanley
Articles

Project Aurora: The first large-scale foundation model of the atmosphere 

September 3, 2024

This talk discusses Aurora, a cutting-edge foundation model that offers a new approach to weather forecasting that could transform our ability to predict and mitigate the impacts of extreme events, air pollution, and the changing climate.

Research Forum | Episode 4 Talk 2 | Corby Rosset
Articles

Direct Nash Optimization: Teaching language models to self-improve with general preferences 

September 3, 2024

This talk discusses teaching language models to self-improve using a preference oracle like GPT-4, framing it as a two-player game to find an optimal policy at a Nash equilibrium, and achieving state-of-the-art win rates against GPT-4 Turbo on benchmarks such…

Research Forum | Episode 4 Talk 1 | Francesca Parmigiani and Jiaqi Chu
Articles

Analog optical computing for sustainable AI and beyond 

September 3, 2024

This talk discusses a new kind of computer—an analog optical computer—that has the potential to accelerate AI inference and hard optimization workloads by 100x, leveraging hardware-software co-design to improve the efficiency and sustainability of real-world applications.

Research Forum | Episode 4 Panel | John Langford, Hoifung Poon, Katja Hofmann, Jianwei Yang
Articles

Panel Discussion: Beyond Language: The future of multimodal models in healthcare, gaming, and AI 

September 3, 2024

Microsoft researchers John Langford, Hoifung Poon, Katja Hofmann, and Jianwei Yang share their thoughts on future directions, bridging gaps, and fostering synergies within the field. 

Research Forum | Episode 4 Keynote | Jianfeng Gao
Articles

Keynote: Phi-3-Vision: A highly capable and “small” language vision model 

September 3, 2024

This talk introduces Phi-3-Vision, an advanced and economical open-source multimodal model. As a member of the Phi-3 model family, Phi-3-Vision enhances language models by integrating multisensory skills, seamlessly combining language and vision capabilities.

Research Forum | Episode 4
Stories

Research Forum Brief | September 2024 

September 3, 2024

In this episode, learn about the latest multimodal AI models, advanced benchmarks for AI evaluation and model self-improvement, and an entirely new kind of computer for AI inference and hard optimization. Discover how these research breakthroughs and more can help…
