LoRA
This repo contains the source code of the Python package loralib and several examples of how to integrate it with PyTorch models, such as those in HuggingFace. We only support PyTorch for now. See our…
Below is an index of datasets, SDKs, APIs and open-source tools developed by Microsoft researchers and shared with the global academic community. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
Implementation of MoLeR: a generative model of molecular graphs which supports scaffold-constrained generation. This open-source code accompanies our paper “Learning to Extend Molecular Scaffolds with Structural Motifs”, which has been accepted at the ICLR 2022…
GitHub link to Iris – pretrained summarization models for structured datasets and cardinality estimation.
Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators. This repository contains the scripts for fine-tuning AMOS pretrained models on GLUE and SQuAD 2.0 benchmarks. Accompanying paper accepted to ICLR 2022: “Pretraining Text Encoders…
Ekya is a system that enables continuous learning on resource-constrained devices. Given a set of video streams and pre-trained models, Ekya can continuously fine-tune the models to maximize accuracy by intelligently allocating resources between…
Knowledge Infused Decoding (KID) is a decoding algorithm that infuses knowledge (from Wikipedia) into each decoding step of text generation.
Jigsaw Dataset: Natural language to Python Pandas code. Two datasets (PandasEval1 and PandasEval2) described in our paper, “Jigsaw: Large Language Models meet Program Synthesis”.
Microsoft Collective Communication Library (MSCCL) is a platform to execute custom collective communication algorithms for multiple accelerators supported by Microsoft Azure.
The goal of this project is to use audio recordings and corresponding annotations to build an automatic classifier for calls from four different species of blue whales, and to estimate the total number of calls…
Maximal Update Parametrization (μP) and Hyperparameter Transfer (μTransfer), in association with the paper: Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer