Differentially Private Set Union
This repository contains the code and dataset for the following paper: Differentially Private Set Union with Applications to Vocabulary Generation
Discover an index of datasets, SDKs, APIs and open-source tools developed by Microsoft researchers and shared with the global academic community below. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
This repository contains a representative subset of the first-party DNN training workloads on Microsoft’s internal Philly clusters. The trace is a sanitized subset of the workload described in “Analysis of Large-Scale Multi-Tenant GPU Clusters for DNN…
DeBERTa (Decoding-enhanced BERT with disentangled attention) improves the BERT and RoBERTa models using two novel techniques. The first is the disentangled attention mechanism, where each word is represented using two vectors that encode its content…
The VQA-Introspect dataset consists of 238K new perception questions that serve as sub-questions corresponding to the set of perceptual tasks needed to effectively answer the complex reasoning questions in the Reasoning split of the…
This repository contains the source code necessary to reproduce the results presented in the paper Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks. We propose a new cross-modal pre-training method, Oscar (Object-Semantics Aligned Pre-training). It leverages object…
This repository contains the code for reproducing the quantitative experiments in our publication “Independent Subspace Analysis for Unsupervised Learning of Disentangled Representations.”
This data was collected for and used in our ACL 2020 paper that demonstrates the potential to effectively combine explanations and demonstrations to learn web-based procedures. This data consists of 520 explanations and corresponding demonstrations…
SPLASH is a dataset for the task of semantic parse correction with natural language feedback. The task and dataset, along with baseline results, are presented in: Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback Ahmed…
Using machine learning to detect beluga whale calls in hydrophone recordings. Of the five populations of beluga whales in Alaska, the Cook Inlet population is the smallest and has declined by about seventy-five percent since…