Research Tools: code, datasets, & models

Tool

Learning to Detect Scene Landmarks for Camera Localization

Source code and data for the CVPR 2022 paper “Learning to Detect Scene Landmarks for Camera Localization”.

Tool

Data for Society catalog

Microsoft is working to make data that is relevant to important social problems as open as possible, including by contributing open data ourselves. The Data for Society resource center provides access to Microsoft’s open datasets,…

GitHub

Tool

Admin-Torch

Here, we provide a plug-in-and-play implementation of Admin, which stabilizes previously diverged Transformer training and achieves better performance without introducing additional hyper-parameters. The design of Admin is half-precision friendly and can be reparameterized into the original…

GitHub

Tool

XtremeDistil

XtremeDistil is a framework for distilling/compressing massive multilingual neural network models into tiny and efficient models for AI at scale.

GitHub Publication Publication

Tool

LoRA

This repo contains the source code of the Python package loralib and several examples of how to integrate it with PyTorch models, such as those in HuggingFace. We only support PyTorch for now. See our…

GitHub Publication Publication

Tool

MoLeR: A Model for Molecule Generation

Implementation of MoLeR: a generative model of molecular graphs which supports scaffold-constrained generation. This open-source code accompanies our paper “Learning to Extend Molecular Scaffolds with Structural Motifs”, which has been accepted at the ICLR 2022…

GitHub Publication

Tool

Project Iris

GitHub link to Iris – pretrained summarization models for structured datasets and cardinality estimation.

Access

Tool

AMOS

Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators. This repository contains the scripts for fine-tuning AMOS pretrained models on GLUE and SQuAD 2.0 benchmarks. Accompanying paper accepted to ICLR 2022: “Pretraining Text Encoders…

GitHub Publication

Tool

Ekya: Continuous Learning on the Edge

Ekya is a system that enables continuous learning on resource-constrained devices. Given a set of video streams and pre-trained models, Ekya can continuously fine-tune the models to maximize accuracy by intelligently allocating resources between…

GitHub Publication

Tool

KID: Knowledge Infused Decoding

Knowledge Infused Decoding (KID) is a decoding algorithm that infuses knowledge (from Wikipedia) into each decoding step of text generation.

GitHub Publication