Microsoft Research Blog

  1. Eureka: Evaluating and understanding progress in AI 

    A summary of insights extracted using the Eureka framework, shown via two radar charts for multimodal (left) and language (right) capabilities. The radar charts show the best and worst performance observed for each capability.

    September 17, 2024

    How can we rigorously evaluate and understand state-of-the-art progress in AI? Eureka is an open-source framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. Learn more about the extended findings.

  2. AMEGO: Active Memory from long EGOcentric videos 

September 17, 2024 | Gabriele Goletto, Tushar Nagarajan, Giuseppe Averta, and Dima Damen

Egocentric videos provide a unique perspective into individuals' daily experiences, yet their unstructured nature presents challenges for perception. In this paper, we introduce AMEGO, a novel approach aimed at enhancing the comprehension of very long egocentric videos. Inspired by humans' ability to maintain information from…

  3. Principal Type Inference under a Prefix (TR) 

    September 17, 2024 | Daan Leijen and Wenjia Ye

At the heart of the Damas-Hindley-Milner (HM) type system lies the abstraction rule, which derives a function type for a lambda expression. This rule allows the type of the parameter to be "guessed", so functions like the identity function admit multiple possible types (see the sketch after this list).…

  4. EUREKA: Evaluating and Understanding Large Foundation Models 

    September 17, 2024

Rigorous and reproducible evaluation of large foundation models is critical for assessing the state of the art, informing next steps in model improvement, and guiding scientific advances in Artificial Intelligence (AI). Evaluation is also important for informing the increasing number of application developers that…

  5. RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval 

    September 16, 2024

Transformer-based Large Language Models (LLMs) have become increasingly important. However, due to the quadratic time complexity of attention computation, scaling LLMs to longer contexts makes inference extremely slow and incurs high GPU memory consumption for caching key-value (KV) vectors (a minimal sketch of the retrieval idea appears after this list). This paper proposes RetrievalAttention, a training-free…

  6. Research Focus: Week of September 9, 2024 

    September 12, 2024

Investigating vulnerabilities in LLMs; a novel total-duration-aware (TDA) duration model for text-to-speech (TTS); a generative expert metric system through iterative prompt priming; integrity protection in 5G fronthaul networks.

  7. Physiological feedback for predictive models 

    September 12, 2024

US Patent App. 18/118,849. This document relates to employing biosignals to evaluate predictions made by predictive models. For example, user attention can be inferred from a signal such as gaze. When the user directs attention to a prediction output by a given predictive…
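
For item 3 above, the standard HM abstraction rule gives the flavor of the "guessing" involved. This is a sketch of the textbook rule, not the paper's principal-type-under-a-prefix formulation:

    \[
    \frac{\Gamma,\; x : \tau_1 \;\vdash\; e : \tau_2}
         {\Gamma \;\vdash\; \lambda x.\, e \;:\; \tau_1 \to \tau_2}
    \qquad (\textsc{Abs})
    \]

Nothing in the rule constrains \(\tau_1\), so the identity function \(\lambda x.\, x\) can be derived at \(\mathsf{int} \to \mathsf{int}\), \(\mathsf{bool} \to \mathsf{bool}\), and so on. HM resolves this ambiguity by inferring the principal type \(\forall \alpha.\, \alpha \to \alpha\), of which every other derivable type is an instance.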
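
For item 5, here is a minimal sketch of the general idea behind retrieval-based attention, not the paper's actual algorithm: rather than attending over all n cached keys, score only the k keys most similar to the query and run the softmax over that subset. The function retrieval_attention and the exhaustive scoring below are illustrative assumptions; a real system would replace the exact scan with an approximate nearest-neighbor index over the KV cache.

    import numpy as np

    def retrieval_attention(q, K, V, k=8):
        # q: (d,) query; K: (n, d) cached keys; V: (n, d) cached values.
        # Keys are scored exhaustively here for clarity; the point of the
        # technique is that only the k highest-scoring KV pairs enter the
        # softmax, so per-query work shrinks once an index supplies them.
        scores = K @ q / np.sqrt(K.shape[1])         # scaled dot-product relevance
        top = np.argpartition(scores, -k)[-k:]       # indices of the k largest scores
        w = np.exp(scores[top] - scores[top].max())  # numerically stable softmax
        w /= w.sum()
        return w @ V[top]                            # weighted sum of the top-k values

    # Toy usage: 10,000 cached tokens, only 8 of which contribute to the output.
    rng = np.random.default_rng(0)
    K = rng.standard_normal((10_000, 64))
    V = rng.standard_normal((10_000, 64))
    out = retrieval_attention(rng.standard_normal(64), K, V, k=8)

The approximation is close whenever attention mass concentrates on a few keys, which is the sparsity assumption retrieval-based methods rely on.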