Advancing open data

Microsoft aims to close the data divide and help organizations of all sizes to realize the benefits of data and the new technologies it powers.

Data commons

Microsoft supports the creation of data commons: diverse, high-quality datasets that are responsibly governed and made available for reuse in the public interest. Data commons encompass a diverse range of data access and sustainability models that meet the needs of the community and allow the commons to be maintained over time.

Open data initiatives for all builders of AI

We are supporting the Institutional Data Initiative and CORE to expand access to high-quality data for AI innovation.

Learn more about these initiatives

Emerging funding and business models

The Open Data Policy Lab examines emerging funding and business models for data commons that support equitable, sustainable reuse of data for public-interest AI.

Learn more about the emerging models

New Commons Challenge awards

The New Commons Challenge recognized two projects that demonstrate innovation in data commons – foundational initiatives that are using AI to understand data and create a bridge to humanitarian and community support.

Learn more about the awardees

Language and cultural diversity in the AI era

The Microsoft Open Innovation Center (MOIC), based in Strasbourg, France, is established to support the development of public interest AI that strengthens European linguistic diversity, cultural sovereignty, and societal resilience. Our approach is to enable open innovation, including more open and accessible data, open science, and open technologies.

Learn about our European commitments
Learn more about LINGUA

Open Science for AI-driven discovery

AI is transforming research, but its impact depends on access to high‑quality, open, machine‑readable data. Microsoft advances open science by sharing research, building open-enabling platforms, partnering globally, and advocating for policies that expand access to, and the usability of, scientific data.

Read the Fostering Innovation Network report
Explore COAR resources for repository managers

Open Government Data

For governments around the world, opening data can help lead to more representative AI and help citizens and government employees leverage AI to improve government responsiveness and efficiency. Our technical guide provides considerations for governments to establish high-quality, useful, and beneficial data commons.

Access the guide

Additional resources

Open Data for Social Impact Framework

A tool leaders can use to put data to work to solve important societal issues.

Read the framework

The Open Data Opportunity

The importance of data sharing, explained.

AI for Good Lab Open Source Database

Making open datasets, research code, and tools freely available for a global community of problem solvers.

Visit the database

Capabilities

Learn more about the tools and resources available to support innovation with data.

Microsoft Discovery

An AI‑powered research platform that enables researchers to integrate and reuse data, code, and models from across the open scientific ecosystem.

Learn about transforming R&D with agentic AI

Azure Data Factory

A fully managed, serverless data integration service to ingest, transform, and orchestrate data from a wide range of sources.

Learn about simplifying hybrid data integration

Microsoft Foundry

Microsoft's unified AI platform for building, deploying, and governing AI agents at enterprise scale.

Review the models

Researcher tools

Explore a collection of datasets, code, and models from Microsoft Research for the broader academic community to advance state-of-the-art research across all disciplines.

Explore researcher tools

Legal frameworks

Data sharing agreements can take months to draw up, oftentimes deterring organizations from sharing data at all. As a first step toward building better processes and tools, we're sharing a set of data agreements to govern the sharing of data, particularly in the context of training AI models.

CDLA Permissive 2.0

The Community Data License Agreement (CDLA) Permissive 2.0 is an open data agreement designed to make it easier to share and collaborate with open data.

Read the CDLA
Get more details

C-UDA 1.0

The Computational Use of Data Agreement (C-UDA) 1.0 is intended for use with datasets that may include material not owned by the data provider but that may have been lawfully assembled from publicly accessible sources.

Read the C-UDA
See the annotated agreement
Find the agreement on GitHub

DUA-OAI

The Data Use Agreement for Open AI Model Development (DUA-OAI) provides terms for one organization to share data with another so that the receiving organization can use the data to train an AI model, where the trained model is open sourced.

Read the DUA-OAI
Find the annotated agreement
Get the details

DUA-DC

The Data Use Agreement for Data Commons (DUA-DC) can be used by multiple parties who want to share data through a common, Application Programming Interface (API)-enabled database.

Read the DUA-DC
Get the annotated agreement
Find out more