Below is an index of datasets, SDKs, APIs, and open-source tools developed by Microsoft researchers and shared with the global academic community. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
Scalable Hyperlink Store
The Scalable Hyperlink Store (SHS) is a specialized “database” for the web graph. SHS maintains the web graph in main memory, distributed over many machines.
Tweet Entity Linking Data: IE-driven and IR-driven sets
This dataset provides labeled data for evaluating and comparing entity linking systems on tweets.
The use of Melodic Scales in Bollywood Music: An Empirical Study
Hindi film music, which is commonly referred to as Bollywood music, is one of the most popular forms of music in the world today. One of the reasons for its popularity has been the willingness…
Image Cropping Dataset
The Image Cropping Dataset contains the cropping parameters for 1000 images that were manually cropped by an experienced photographer. The cropping parameters indicate the coordinates of the upper-left and bottom-right corners of the crop box.…
MSR Identity Toolbox (With Binaries)
The MSR Identity Toolbox is a MATLAB toolbox for speaker-recognition research. It contains a collection of MATLAB tools and routines that can be used for research and development in speaker recognition. Version 1.0…