Tools
OmniParser V2
October 2024
OmniParser is an advanced vision-based screen parsing module that converts user interface (UI) screenshots into structured elements, allowing agents to execute actions across various applications using visual data. By harnessing large vision-language model capabilities, OmniParser improves both efficiency and…
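To make "structured elements" concrete, here is a minimal sketch of the kind of output a screen parser like this might produce: each detected UI element carries a type, a caption, and a bounding box an agent can target. The field names and values are invented for illustration, not OmniParser's actual schema.

```python
# Hypothetical structured-element representation for a parsed UI screenshot.
from dataclasses import dataclass

@dataclass
class UIElement:
    kind: str                                 # e.g. "button", "text", "icon"
    caption: str                              # short functional description
    bbox: tuple[float, float, float, float]   # normalized (x1, y1, x2, y2)

    def center(self) -> tuple[float, float]:
        """A natural click target for an agent: the element's center point."""
        x1, y1, x2, y2 = self.bbox
        return ((x1 + x2) / 2, (y1 + y2) / 2)

submit = UIElement("button", "Submit form", (0.40, 0.80, 0.60, 0.88))
print(submit.center())
```

An agent consuming such elements can then choose an action (e.g. "click the element captioned 'Submit form'") and resolve it to screen coordinates via `center()`.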
Eureka ML Insights
September 2024
This repository contains the code for Eureka ML Insights, a framework for standardizing evaluations of large foundation models beyond single-score reporting and rankings. The framework is designed to help researchers and practitioners run reproducible evaluations of generative models using a…
Trace
July 2024
Trace is a new AutoDiff-like tool for training AI systems end-to-end with general feedback (such as numerical rewards or losses, natural language text, or compiler errors). Trace generalizes the back-propagation algorithm by capturing and propagating an AI system's execution trace. Trace…
Phi-3
April 2024
The Phi-3-Mini-128K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high-quality and reasoning-dense properties. The model belongs to…
KITAB Dataset
February 2024
🕮 KITAB is a challenging dataset and a dynamic data-collection approach for testing the abilities of Large Language Models (LLMs) to answer information-retrieval queries with constraint filters. A filtering query with constraints can be of the form “List all books…
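The idea of a constraint-filtered retrieval query can be sketched in a few lines: given an author's book list, answer "List all books by X whose title satisfies some condition" by applying a predicate. The data and helper below are invented for illustration and are not part of the KITAB dataset itself.

```python
# Toy constraint-filtered query, KITAB-style: author -> books, filtered
# by a constraint predicate on the title. All names here are hypothetical.
books_by_author = {
    "Jane Doe": ["Autumn Light", "Borrowed Time", "A Quiet Harbor"],
}

def filter_books(author: str, constraint) -> list[str]:
    """Return the author's books whose titles satisfy the constraint."""
    return [b for b in books_by_author.get(author, []) if constraint(b)]

# Constraint: titles starting with the letter "A".
starts_with_a = lambda title: title.startswith("A")
print(filter_books("Jane Doe", starts_with_a))
```

The benchmark's difficulty comes from asking an LLM to produce exactly this filtered set from its own knowledge, where both hallucinated titles and missed titles count against it.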
HoloAssist
January 2024
A large-scale egocentric human-interaction dataset in which two people collaboratively complete physical manipulation tasks.
Orca-2-13B
January 2024
Orca 2 is a fine-tuned version of LLAMA-2. It is built for research purposes only and provides a single-turn response in tasks such as reasoning over user-given data, reading comprehension, math problem solving, and text summarization. The model…
Orca-2-7B
January 2024
Orca 2 is a fine-tuned version of LLAMA-2. It is built for research purposes only and provides a single-turn response in tasks such as reasoning over user-given data, reading comprehension, math problem solving, and text summarization. The model…
LLF-Bench
January 2024
LLF-Bench is a benchmark for evaluating learning agents that provides a diverse collection of interactive learning problems in which the agent receives language feedback instead of rewards (as in RL) or action feedback (as in imitation learning).
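The distinction from reward-based RL can be illustrated with a toy interaction loop in which the environment's response is a sentence rather than a scalar. This is a hypothetical sketch of the idea, not the actual LLF-Bench API.

```python
# Toy language-feedback environment: a number-guessing task whose step
# function returns (done, feedback_text) instead of a numeric reward.
def step(guess: int, target: int = 7) -> tuple[bool, str]:
    if guess == target:
        return True, "Correct!"
    if guess < target:
        return False, "Too low; try a larger number."
    return False, "Too high; try a smaller number."

def run(guess: int = 0, max_steps: int = 20) -> int:
    """A simple agent that acts on the feedback text, not a reward signal."""
    for _ in range(max_steps):
        done, feedback = step(guess)
        if done:
            return guess
        guess += 1 if "larger" in feedback else -1
    return guess
```

The point of the benchmark is that the agent must interpret and exploit such textual feedback, which is richer but less directly optimizable than a scalar reward.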