Data for Society
Using open data to accelerate the development of solutions to society’s most pressing challenges.

US broadband usage percentages
Broadband internet access is critical to providing communities with education, employment, and telecare. The broadband usage percentages dataset shows broadband access at the US county level to help address gaps in service availability.

Solar farms mapping in India
The solar farms mapping data can help researchers identify factors driving land suitability for solar projects and help public agencies better plan siting of solar energy development in India.

HKH glacier mapping
Glacier mapping is key to ecological monitoring in the Hindu Kush Himalaya (HKH) region, where climate change poses a risk to those who depend on the health of glacier ecosystems. The HKH glacier mapping dataset includes imagery with the locations of glaciers.

Chesapeake land cover
The Chesapeake Conservancy created a land cover dataset for conservation efforts. The same data, which contains high-resolution aerial imagery and land cover labels, can be used to train ML models to map land cover over an even wider area.

BankNote-Net
Worldwide, millions of people have low or no vision. BankNote-Net was created as an open dataset for assistive universal currency recognition to help with daily tasks.

MS-ASL American Sign Language (ASL) dataset
In the US, over 500,000 people use ASL for communication. This ASL dataset of over 25,000 annotated videos with sign and action recognition can help researchers build machine learning models to advance sign language recognition.

Tagged hands dataset
Development of a rich hand-gesture-based interface is currently a tedious process. This dataset of 3,500 labeled depth frames of various hand poses and 140 gesture clips helps enable easy development of a gesture-based interface.

Generative Neural Visual Artist (GeNeVA)
Intelligent systems can generate images and video for a range of applications, from education to accessibility. This dataset has sequences of images, associated instructions and linguistic feedback, and a modified version of the Compositional Language and Elementary Visual Reasoning (CLEVR) dataset.

Learning from analog pen use to improve digital ink experiences
To help researchers understand the gaps between analog and digital pens and improve digital experiences, this dataset contains 493 entries from a diary study with 26 participants using analog pens and 178 entries from 30 participants using digital pens.

Microsoft Machine Reading Comprehension (MS MARCO)
AI and automated assistants need strong machine reading comprehension (MRC) and question answering (QA) capabilities to understand real-world dialog. This dataset contains 1,010,916 questions and 182,669 answers to improve QA and MRC.

Digital Civility Gender Equality Dataset
Microsoft recognizes the importance of advocating for and advancing the release of gender-disaggregated data to realize gender equality and to close the data divide. This dataset can be leveraged by researchers and organizations to advance better gender data policies and solutions.

Microsoft Nonprofit Innovation Hub
The Nonprofit Innovation Hub is an open-source GitHub repository from Microsoft Tech for Social Impact with lightweight solutions including sample code, templates, and other assets from across the Microsoft ecosystem that enable nonprofits to innovate.

Open Data Campaign
The Microsoft Open Data Campaign aims to close the data divide and help organizations of all sizes to realize the benefits of data and the new technologies it powers.

Explore open data in action
Explore Microsoft’s data collaborations for examples of using open data and data sharing models to solve societal challenges.
Data collaborations
Make data sharing easier
Data sharing agreements can take months to draw up, oftentimes deterring organizations from sharing data at all. To help, Microsoft has drafted agreements that govern the sharing of data, particularly in the context of training AI models.
Data sharing agreement templates
Inspire leaders to put data to work
A tool organizational leaders can use to further understand how best to put data to work to solve important societal challenges.
Open Data for Social Impact Framework
Differential privacy
Differential privacy introduces statistical noise (slight alterations) to mask datasets and protect the privacy of individuals.
Learn about differential privacy
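To make the idea of statistical noise concrete, here is a minimal sketch of one common approach, the Laplace mechanism, applied to a simple counting query. It is an illustration under stated assumptions, not a description of how any Microsoft tool implements differential privacy: the function names `laplace_noise` and `private_count` and the `epsilon` privacy parameter are chosen for this example only.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a zero-centred Laplace distribution.

    The difference of two independent exponential draws, each with
    mean `scale`, follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng=None) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    `epsilon` is the privacy parameter (an assumption of this sketch):
    smaller values add more noise and give stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1,
    # so the sensitivity of a counting query is 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Made-up illustrative values, not taken from any real dataset.
    ages = [23, 35, 41, 52, 67, 70]
    print(private_count(ages, lambda age: age >= 65, epsilon=0.5))
```

Each call returns a slightly different value, so no single published count reveals whether a particular individual is in the data. Production work would normally rely on a vetted differential privacy library instead of hand-rolled sampling.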
Azure confidential computing
Confidential computing helps to protect sensitive data in the cloud by offering security through data-in-use encryption, which adds protection for your data while it's being processed.
Read about Azure confidential computing
Azure open datasets
A curated collection of publicly available datasets that are ready to use in machine learning workflows and easy to access from Azure services.
Review the Azure open datasets
Microsoft Research Open Data
A collection of free datasets from Microsoft Research to advance state-of-the-art research in areas such as natural language processing, computer vision, and domain specific sciences.
Explore Microsoft Research Open Data
About the Data for Society resource center
Microsoft is working to make data that is relevant to important social problems as open as possible, including by contributing open data ourselves.
The Data for Society resource center provides access to Microsoft’s open datasets, resources, and tools to make data sharing, research, and collaboration easier.
