2012 Jim Gray eScience Award Presentation
At the Microsoft eScience Workshop 2012, Microsoft Research Connections Vice President Tony Hey introduces the Jim Gray eScience Award and announces this year’s winner, Antony John Williams, who delivers the following presentation.
The Possibilities and Pitfalls of Internet-Based Chemical Data
In less than a decade, the Internet has provided us access to enormous quantities of chemistry data. Chemists have embraced the web as a rich source of data and knowledge. However, all that glitters is not gold: while online searches can now provide us access to information associated with many tens of millions of chemicals, and can allow us to traverse patents, publications, and public domain databases, the promise of high-quality data on the web needs to be tempered with caution.
In recent years, the crowdsourcing approach to developing curated content has been growing. Can such approaches allow us to bring to bear the collective wisdom of the crowd to validate and enhance the availability of trusted chemistry data online, or are algorithms likely to be more powerful in terms of validating data? While it is now possible to search the web using a form of query natural to chemists, that of “structure searching the web,” scientists are increasingly likely to have to accept joint responsibility for the quality of data online for the foreseeable future. Their participation is likely to come through engaging in open science, providing data under open licenses, and offering their skills to the community.
This presentation provides an overview of the present state of chemistry data online, the challenges and risks of managing and accessing data in the wild, and how an Internet for chemistry continues to expand in scope and possibilities.