Cloud computing charts a path to scientific discovery
With the rapid rise of data-intensive scientific research—across disciplines and around the globe—scientists in Asia, as elsewhere, face massive computing needs and challenges.
Mindful of our role in helping scientists turn big data into big discoveries, Beijing-based Microsoft Research Asia has collaborated closely with domestic and international researchers on a wide range of topics, including the environment, data modeling, biological computing, climate change, and urban computing.
As part of these collaborative efforts, we have worked to help researchers apply Microsoft Azure, the company’s cloud-computing platform, to data-intensive scientific research. As Eric Chang, senior director of technology strategy at Microsoft Research Asia, observes, “In this era of big data, cloud computing offers scientists a platform for dealing with massive amounts of data and the growing requirements of distributed, multidisciplinary collaborations to drive new discoveries.” Here, then, are five examples of our collaborative efforts to harness the power of the cloud for scientific research.
Understanding ecological and hydrologic processes and their interactions in large watersheds is important to a society in need of sustainable freshwater supplies. As part of a major new research program, Professor Chunmiao Zheng and Researcher Guoliang Cao of Peking University are using Microsoft Azure to support comprehensive data processing and numerical modeling of the hydrologic cycle of the Heihe River Basin, and to continue developing cloud computing as a cost-effective solution to large-scale integrated eco-hydrologic modeling.
Numerical modeling of eco-hydrological processes in the Heihe River Basin using Microsoft Azure
Facilitating the analysis of climate data: Sea ice is an important component of the Earth’s climate system, and coupled climate models are indispensable tools in its study. The Coupled Model Intercomparison Project (CMIP) provides a set of coordinated climate model experiments for use by climate-modeling groups. By intercomparing the resulting model outputs, CMIP can assess the mechanisms responsible for model differences, determining why similar models produce a range of responses. A total of 1.5 petabytes of model output data, including sea-ice data, was produced by more than 30 modeling groups around the world during CMIP5 (the project’s fifth phase). Unfortunately, CMIP’s current web-based data dissemination system supports only data search and download. All other necessary data processing functions must be performed by researchers in their local facilities. Professor Yuqi Bai of Tsinghua University led his group to establish an integrated research environment for archiving, searching, analyzing, and intercomparing CMIP5 data with the CMIP5 Sea Ice Data Portal. This pilot project clearly demonstrates Microsoft Azure’s value in enabling a web-based, data-intensive computing environment.
An integrated research environment for archiving, searching, analyzing, and intercomparing climate model output data with CMIP5 Sea Ice Data Portal
Studying terrestrial ecosystems: Terrestrial ecosystems influence climate through a complex system of bio-geophysical feedback, including carbon and water exchange with the atmosphere. Honglin He, Fan Li, and Xiaoli Ren of the Chinese Academy of Sciences have been working to build a carbon-water flux data storage system for the Qinghai-Tibet Plateau ecosystem. Their system would enable model simulation and provide a platform for uncertainty analysis. The researchers based their system on Microsoft Azure’s virtually unlimited storage capacity and its data-intensive computing architecture, which can handle enormous amounts of multisource heterogeneous data.
Improving healthcare: Professor Yan Xu of Beihang University has been conducting research on the value of using large-scale histopathology image analysis to detect colon cancer, a common and potentially deadly disease that has a huge impact on public health. Such images provide an excellent tool for detecting early-stage colon cancer, but they are very large: a digitized histopathological image at 40x magnification is roughly 15,000 x 15,000 pixels, which makes analysis computationally demanding. Microsoft Research Asia is applying Microsoft Azure to histopathology classification, segmentation, and clustering, a project that will help physicians improve the accuracy of their diagnoses, thereby helping to reduce costs and save lives.
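To give a feel for the scale involved, here is a rough back-of-the-envelope sketch in Python. Only the 15,000 x 15,000 pixel figure comes from the project description; the three bytes per pixel (uncompressed 8-bit RGB) and the 512-pixel tile size are illustrative assumptions, not details of the team's actual pipeline.

```python
import math

def tile_grid(width, height, tile):
    """Return (columns, rows) of square tiles needed to cover the image."""
    return math.ceil(width / tile), math.ceil(height / tile)

WIDTH = HEIGHT = 15_000   # approximate image size quoted in the article
BYTES_PER_PIXEL = 3       # assumption: uncompressed 8-bit RGB
TILE = 512                # assumption: hypothetical tile edge in pixels

pixels = WIDTH * HEIGHT
raw_megabytes = pixels * BYTES_PER_PIXEL / 1e6
cols, rows = tile_grid(WIDTH, HEIGHT, TILE)

print(f"{pixels:,} pixels, about {raw_megabytes:.0f} MB uncompressed")
print(f"{cols} x {rows} = {cols * rows} tiles of {TILE} px")
```

Under these assumptions, a single image holds 225 million pixels and splits into several hundred tiles that can be processed independently, which is the kind of workload that benefits from being spread across many cloud machines.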
Cataloguing biodiversity: The Biodiversity Heritage Library is an international cooperative project that has scanned and openly shares more than 100,000 volumes, totaling some 43 million pages and 97 million species records. Zheping Xu of the Chinese Academy of Sciences is leading a project that will extract information from the library’s vast store of biodiversity literature, unearthing buried information about the distribution of species. The project will generate different thematic maps, enabling researchers to extract information on species distribution according to time or region, as well as to use file formats from Bing Maps and other online mapping products to display multiple types of geographic information in new ways. This information can be used to further conservation and wildlife management efforts.
These five pioneering projects demonstrate the immense value of using Microsoft Azure in scientific research. Moreover, these early efforts strengthen our determination to bring “cloud power” to researchers from diverse disciplines.
To learn more about Microsoft Research’s efforts to help scientific researchers accelerate their discoveries through the computational and collaborative power of Microsoft Azure, visit Microsoft Azure for Research.
—Xin Ma, Senior Research Program Manager, Microsoft Research Asia