Reproducibility with Microsoft Research Open Data

  • Vani Mandava

Invited talk at The AAAI 2020 Workshop on Reproducible AI - RAI2020, NYC

Access to repositories with open data sources is critical for reproducibility of research. Microsoft Research Open Data is a unique initiative that combines the features of a traditional data repository with easy access to compute resources. Its main aim is to increase the reproducibility of research outcomes published by Microsoft Research (MSR). We accomplish this by making datasets associated with research papers published by MSR available on the cloud. These datasets are hosted along with relevant metadata to make it easier to discover related assets that aid reproducibility. In addition, the repository allows cloud compute resources to be instantiated directly, alleviating the need to move data across networks subject to bandwidth constraints.