Microsoft Research Blog
Breaking cross-modal boundaries in multimodal AI: Introducing CoDi, composable diffusion for any-to-any generation
Imagine an AI model that can seamlessly generate high-quality content across text, images, video, and audio, all at once. Such a model would more accurately capture the multimodal nature of the world and human comprehension,…
Project
Project VeLLM
uniVersal Empowerment with LLMs
The technology landscape is being rapidly transformed by Large Language Models (LLMs), allowing users to address real-world applications in various domains. However, a digital divide exists that may exclude large populations…