Human-Powered Data Management


February 28, 2013


Fully automated algorithms are inadequate for a number of data analysis tasks, especially those involving images, video, or text. Thus, there is a need to combine “human computation” (or crowdsourcing) with traditional computation to improve the process of understanding and analyzing data. My thesis addresses several topics in the general area of human-powered data management. I design algorithms and systems that combine human and traditional computation for: (a) data processing, e.g., using humans to help sort, cluster, or clean data; (b) data extraction, e.g., having humans help create structured data from information in unstructured web pages; and (c) data gathering, i.e., asking humans to provide data that they know about or can locate, but that would be difficult to gather automatically. My focus in all of these areas is to find solutions that expend as few resources as possible (e.g., waiting time, human effort, or money), while still providing high-quality results.


Aditya Parameswaran

Aditya Parameswaran is a Ph.D. student in the InfoLab at Stanford University, advised by Prof. Hector Garcia-Molina. He is broadly interested in data management, with research results in human computation, information extraction, and recommendation systems. Aditya is a recipient of the Key Scientific Challenges Award from Yahoo! Research (2010), two best-of-conference citations (VLDB 2010 and KDD 2012), the Terry Groswith graduate fellowship at Stanford University (2007), and the Gold Medal in Computer Science at IIT Bombay (2007).