Citizen science relies on individual volunteers, many with no specific scientific training, who contribute to research efforts. Many citizen-science projects harness the CPU power of volunteers’ idle personal computers or servers to perform large-scale distributed calculations for scientific research. Such projects range from climate modeling to searching for pulsars to predicting folded conformations of proteins. They help researchers overcome the challenges of conducting data-intensive research, a mode of scientific discovery that Jim Gray described as the fourth paradigm. Microsoft Research has been actively involved in many citizen-science projects; here are three examples.
One of the most successful citizen-science efforts, Galaxy Zoo enlists individuals worldwide to assist in the classification of galaxies. It represents the world’s largest astronomical collaboration, bringing professional astronomers together with hundreds of thousands of volunteers. Independent assessments show that classifications provided by Galaxy Zoo volunteers are as accurate as those from professional astronomers, and the results have:
- Informed research into the formation of elliptical galaxies
- Provided a sample of merging galaxies of unprecedented breadth and fidelity
- Highlighted the dominance in some environments of red spirals, in which star formation has been rapidly and mysteriously extinguished
The influence of Galaxy Zoo stretches beyond the band of astronomers who seek to understand the evolution of the universe. The Zooniverse, which grew out of Galaxy Zoo, now hosts 10 projects inviting volunteers to do everything from transcribing ancient papyri to searching for planets around other stars. Meanwhile, the rich datasets that citizen science of this sort provides also inspire new approaches to machine learning—something that will be essential for automated and human classifiers to cope with the next generation of surveys, which will produce terabytes of data every night.
Evidence of global climate change is now unequivocal, but what does that mean for the weather where you live? Will damaging weather events increase, or might your weather become less extreme? To answer these questions, climateprediction.net has partnered with the Met Office (the United Kingdom’s national weather service) to create weatherathome.net, an international project that is open to anyone with a computer and Internet access.
With support from Microsoft Research, weatherathome.net will enable anyone in the world to download and run a regional climate model on their home PC. The model is initially available for three target regions: Europe, the western United States, and southern Africa. Participants produce simulations that will enable scientists to estimate how often heat waves, floods, and hurricanes will strike in the next few decades. The initiative will also indicate how much of the blame for these events can be attributed to greenhouse gas emissions caused by humans.
The model has been developed by the Met Office, and results from different regions are being used directly by scientists who specialize in the climates of those regions. The European region is being analyzed by the Met Office and by Oxford, Edinburgh, and Leeds universities; southern Africa by the University of Cape Town; and the western United States by Oregon State University. Results are also made available to scientists who are interested in climate impacts in the various regions.
The first results from the weatherathome.net experiment were recently published in the journal Geophysical Research Letters (February 2012).
Although most citizen-science experiments allow volunteers to observe the work that their computer is performing on behalf of the project, there has been no mechanism to visually present the overall results and the aggregate contributions of individual volunteers or teams. Thanks to a first-of-its-kind feedback system collaboratively developed by Microsoft Research and the University of Washington (UW), that’s no longer the case.
The system was created for the UW’s Rosetta@Home project, which uses the distributed computing power of citizen participants to help predict and design the three-dimensional structures of natural and synthetic proteins by using minimum-energy calculations. In Rosetta@Home, an individual computational run generates a folded protein conformation, along with two key metrics: Energy (the Rosetta score of the computed protein structure, which is analogous to free energy) and RMSD (the Cα root-mean-square deviation from the experimentally determined structure). A typical experiment consists of generating a large number of conformations in an effort to find the lowest-energy structure, which, ideally, should also have the lowest RMSD.
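To make the two metrics concrete, here is a minimal Python sketch of how an RMSD value could be computed and how the lowest-energy run could be picked from a batch. It is an illustration only, not Rosetta@Home code: the function names, the dictionary layout of a run, and the assumption that both structures are already superimposed and listed atom-for-atom in the same order are all hypothetical.

```python
import math

def ca_rmsd(model, reference):
    """Root-mean-square deviation between two lists of (x, y, z) coordinates.

    Assumption: the two structures are already superimposed and contain
    the same atoms in the same order (no alignment is done here).
    """
    if not model or len(model) != len(reference):
        raise ValueError("structures must be non-empty and the same length")
    # Sum of squared distances between corresponding atoms.
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))

def lowest_energy_run(runs):
    """Return the run with the lowest energy score.

    Assumption: each run is a dict with hypothetical keys
    "energy" and "rmsd".
    """
    return min(runs, key=lambda run: run["energy"])

# Toy data: every atom in the model is displaced by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = ca_rmsd(model, reference)

runs = [
    {"energy": -98.0, "rmsd": 4.3},
    {"energy": -120.5, "rmsd": 2.1},
    {"energy": -110.0, "rmsd": 3.0},
]
best = lowest_energy_run(runs)
```

In this toy batch the lowest-energy run also happens to have the lowest RMSD, which is the ideal outcome the text describes; in a real experiment the two need not coincide.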
By using Microsoft SQL Server, the researchers created an Internet-based feedback report that illustrates the contributions of individuals or teams, displayed in the context of overall results. A SQL Server relational database tracks individual results and processes reporting queries on demand, and SQL Server Reporting Services renders plots from the query results, producing graphics that can be fully integrated with the project’s public website. The feedback reports have been extremely popular with participants, who find it motivating to see the actual impact of their contribution. The reporting system has also proven useful to investigators inside the UW lab, allowing them to monitor the progress of experiments and view results quickly.
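The kind of on-demand reporting query described here can be sketched in a few lines of Python. This is an assumption-laden illustration, not the project's actual system: it uses Python's built-in SQLite module as a stand-in for SQL Server, and the `results` table, its columns, and the volunteer names are all hypothetical.

```python
import sqlite3

def contribution_report(rows):
    """Summarize each volunteer's runs: run count and best (lowest) energy.

    Assumption: `rows` is a list of (volunteer, energy, rmsd) tuples;
    the schema below is hypothetical and SQLite stands in for SQL Server.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE results (volunteer TEXT, energy REAL, rmsd REAL)"
        )
        conn.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)
        # One row per volunteer, ordered so the best energy comes first.
        return conn.execute(
            "SELECT volunteer, COUNT(*), MIN(energy) "
            "FROM results GROUP BY volunteer ORDER BY MIN(energy)"
        ).fetchall()
    finally:
        conn.close()

sample_rows = [
    ("alice", -120.5, 2.1),
    ("bob", -98.0, 4.3),
    ("alice", -110.0, 3.0),
]
report = contribution_report(sample_rows)
```

A plotting layer could then draw each volunteer's points against the full set of results, which is the role the article attributes to the reporting service.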
Since user engagement is critical to the success of community computing projects like Rosetta@Home, there’s every reason to believe that this reporting solution can encourage participation in a variety of citizen-science projects.