Citizen Science

Citizen science is about individual volunteers, many with no specific scientific training, contributing to research efforts. Many citizen science projects harness the CPU power of volunteers’ idle personal computers or servers to perform large-scale distributed calculations for scientific research. Such projects range from climate modeling to searching for pulsars to predicting folded conformations of proteins. They help researchers overcome the challenges of conducting data-intensive research—a mode of scientific discovery that Jim Gray described as the fourth paradigm. Microsoft Research has been actively involved in many citizen-science projects; here are three examples.

Galaxy Zoo

One of the most successful citizen-science efforts, Galaxy Zoo enlists individuals worldwide to assist in the classification of galaxies. It represents the world’s largest astronomical collaboration, bringing professional astronomers together with hundreds of thousands of volunteers. Independent assessments show that classifications provided by Galaxy Zoo volunteers are as accurate as those from professional astronomers, and the results have:

  • Informed research into the formation of elliptical galaxies
  • Provided a sample of merging galaxies of unprecedented breadth and fidelity
  • Highlighted the dominance in some environments of red spirals, in which star formation has been rapidly and mysteriously extinguished

The influence of Galaxy Zoo stretches beyond the band of astronomers who seek to understand the evolution of the universe. The Zooniverse, which grew out of Galaxy Zoo, now hosts 10 projects inviting volunteers to do everything from transcribing ancient papyri to searching for planets around other stars. Meanwhile, the rich datasets that citizen science of this sort provides also inspire new approaches to machine learning—something that will be essential for automated and human classifiers to cope with the next generation of surveys, which will produce terabytes of data every night.

weatherathome.net

Evidence of global climate change is now unequivocal, but what does that mean for the weather where you live? Will damaging weather events increase, or might your weather become less extreme? To answer these questions, climateprediction.net has partnered with the Met Office (the United Kingdom’s national weather service) to create weatherathome.net, an international project that is open to anyone with a computer and Internet access.

With support from Microsoft Research, weatherathome.net will enable anyone in the world to download and run a regional climate model on their home PC. The model is initially available for three target regions: Europe, the western United States, and southern Africa. Participants produce simulations that will enable scientists to estimate how often heat waves, floods, and hurricanes will strike in the next few decades. The initiative will also indicate how much of the blame for these events can be attributed to greenhouse gas emissions caused by humans.
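As a rough illustration of how an ensemble of volunteer-run simulations might be turned into such estimates, the sketch below counts how often a simulated value exceeds a threshold in two sets of runs, one with and one without human greenhouse gas emissions. The function names, the threshold, and the sample numbers are all hypothetical, and the "attributable fraction" shown is one measure sometimes used in attribution studies; the source does not say which measure weatherathome.net uses.

```python
from typing import Sequence


def event_probability(peak_values: Sequence[float], threshold: float) -> float:
    """Fraction of ensemble members whose peak value exceeds the threshold."""
    if not peak_values:
        raise ValueError("ensemble must contain at least one simulation")
    exceed = sum(1 for value in peak_values if value > threshold)
    return exceed / len(peak_values)


def attributable_fraction(p_with: float, p_without: float) -> float:
    """Share of the event probability attributed to emissions (assumed measure).

    Computed as 1 - p_without / p_with; undefined when p_with is zero.
    """
    if p_with <= 0:
        raise ValueError("event never occurs in the 'with emissions' ensemble")
    return 1.0 - p_without / p_with


# Hypothetical peak summer temperatures (degrees C), one per simulation.
with_emissions = [31.0, 36.0, 38.0, 29.0, 40.0]
without_emissions = [30.0, 33.0, 36.0, 28.0, 31.0]

p_with = event_probability(with_emissions, threshold=35.0)
p_without = event_probability(without_emissions, threshold=35.0)
print(p_with, p_without, attributable_fraction(p_with, p_without))
```

Real ensembles contain far more than five members; the more simulations volunteers return, the more reliably rare events can be counted.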

The model has been developed by the Met Office, and results from different regions are being used directly by scientists who specialize in the climates of those regions. The European region is being analyzed by the Met Office and by Oxford, Edinburgh, and Leeds universities; southern Africa by the University of Cape Town; and the western United States by Oregon State University. Results are also made available to scientists who are interested in climate impacts in the various regions.

The first results from the weatherathome.net experiment were recently published in the journal Geophysical Research Letters (February 2012).

Rosetta@Home

Although most citizen-science experiments allow volunteers to observe the work that their computer is performing on behalf of the project, there has been no mechanism to visually present the overall results and the aggregate contributions of individual volunteers or teams. Thanks to a first-of-its-kind feedback system collaboratively developed by Microsoft Research and the University of Washington (UW), that’s no longer the case.

The system was created for the UW’s Rosetta@Home project, which uses the distributed computing power of citizen participants to help predict and design the three-dimensional structures of natural and synthetic proteins by using minimum-energy calculations. In Rosetta@Home, an individual computational run generates a folded protein conformation, along with two key metrics: Energy (the Rosetta score of the computed protein structure, which is analogous to free energy) and RMSD (the Cα root-mean-square deviation from the experimentally determined structure). A typical experiment consists of generating a large number of conformations in an effort to find the lowest-energy structure, which, ideally, should also have the lowest RMSD.
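The two metrics can be illustrated with a minimal sketch. It assumes each conformation is stored as a score plus a list of Cα coordinates that have already been superimposed on the experimental structure (a real RMSD calculation would first find the optimal alignment). The `Conformation` type and the function names are hypothetical and are not part of the Rosetta software.

```python
import math
from typing import List, NamedTuple, Sequence, Tuple

Point = Tuple[float, float, float]


class Conformation(NamedTuple):
    energy: float         # score analogous to free energy; lower is better
    coords: List[Point]   # C-alpha coordinates, assumed pre-aligned to the reference


def rmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two equal-length coordinate lists."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(total / len(a))


def lowest_energy(conformations: Sequence[Conformation]) -> Conformation:
    """Return the conformation with the lowest energy score."""
    return min(conformations, key=lambda c: c.energy)


# Toy data: a three-atom "experimental" structure and two computed conformations.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
decoys = [
    Conformation(-10.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]),
    Conformation(-5.0, [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]),
]

best = lowest_energy(decoys)
print(best.energy, rmsd(best.coords, reference))
```

In this toy case the lowest-energy conformation also has the lowest RMSD, which is the ideal outcome the experiment is looking for.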

By using Microsoft SQL Server, the researchers created an Internet-based feedback report that illustrates the contributions of individuals or teams, displayed in the context of overall results. A SQL Server relational database tracks individual results and processes reporting queries on demand, and its associated Reporting Services render plots from query results, producing graphics that can be fully integrated with the project’s public website. The feedback reports have been extremely popular with participants, who find it motivating to see the actual impact of their contributions. The reporting system has also proven useful to investigators inside the UW lab, allowing them to monitor the progress of experiments and view results quickly.
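The kind of reporting query described above can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the `results` table, its columns, and the sample rows are hypothetical rather than the project's actual schema.

```python
import sqlite3
from typing import Iterable, List, Tuple


def build_report(rows: Iterable[Tuple[str, float, float]]) -> List[tuple]:
    """Summarize each volunteer's runs alongside the overall best energy."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (volunteer TEXT, energy REAL, rmsd REAL)")
    conn.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)
    report = conn.execute(
        """
        SELECT volunteer,
               COUNT(*)    AS runs,
               MIN(energy) AS best_energy,
               (SELECT MIN(energy) FROM results) AS overall_best
        FROM results
        GROUP BY volunteer
        ORDER BY best_energy
        """
    ).fetchall()
    conn.close()
    return report


# Hypothetical results: (volunteer, energy, rmsd) for each computational run.
sample_rows = [
    ("alice", -120.5, 2.1),
    ("alice", -98.0, 4.7),
    ("bob", -110.0, 3.0),
]

for row in build_report(sample_rows):
    print(row)
```

Each output row places one volunteer's run count and best result next to the best result found by anyone, which is the "contribution in the context of overall results" idea; a plotting layer would then turn such rows into graphics.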

Since user engagement is critical to the success of community computing projects like Rosetta@Home, there’s every reason to believe that this reporting solution can encourage participation in a variety of citizen-science projects.

Primary Researchers

Arfon Smith

Arfon Smith is the director of Citizen Science at the Adler Planetarium in Chicago and technical lead of the Zooniverse. Before joining the Zooniverse, he earned a Ph.D. in astrochemistry from The University of Nottingham in 2006 and then worked as a senior developer in the production software group at the Wellcome Trust Sanger Institute in Cambridge.


Chris Lintott

Chris Lintott is a researcher in the Department of Physics at the University of Oxford, where he leads a team of scientists, educators and developers in the development of citizen science projects. Passionately committed to scientific outreach, he is best known as co-presenter of the BBC’s long-running Sky at Night series.


Suzanne Rosier

Suzanne Rosier is a research scientist and joint coordinator of climateprediction.net. She is investigating the results of climateprediction.net in the European experiment and planning for research in the Australia/New Zealand regions. She is also the coordinating author of the associated climateeducation.net project. Rosier was closely involved in launching climateprediction.net's first regional modeling initiative, “weatherathome,” which is modeling limited areas of the world in sufficient detail to enable scientists to deduce future changes in extreme weather patterns.

Philip W. Mote

Philip W. Mote is a professor in the College of Oceanic and Atmospheric Sciences at Oregon State University, director of the Oregon Climate Change Research Institute for the Oregon University System, and director of Oregon Climate Service (state climate office). Dr. Mote's current research interests include scenario development, regional climate modeling with a superensemble generated by volunteers' personal computers, and adaptation to climate change.

David Baker

David Baker is a professor at the University of Washington, an investigator of the Howard Hughes Medical Institute, and a member of the National Academy of Sciences. Baker is a world leader in protein structure prediction and design. He leads the Baker Laboratory, whose research focuses on the prediction and design of protein structures and protein-protein interactions.

Stuart Ozer

Stuart Ozer currently leads a team in the Microsoft Server and Tools business, working with the largest-scale and most demanding data-intensive workloads on SQL Server and Windows Azure. He is a veteran of Microsoft Research’s eScience team, where he worked on academic collaboration efforts to apply modern database-centered technologies and workflows to diverse problems in biological sequence analysis, protein structure, and environmental sensor networks.