Synonym Resolution on the Web


March 2, 2007


Alexander Yates


University of Washington


The Web is a vast resource of information on practically anything one can think of. Unfortunately, most of that information is in unstructured text, making it difficult for machines to process. This talk presents new methods for identifying synonymous objects and relations on the Web, built on top of an information extraction system. New techniques developed for this problem include a novel probabilistic model for synonym extraction and a highly scalable clustering algorithm. The results have been integrated into an application that allows searching over a large set of relations extracted from the Web, and they hold promise for improved search technology.


Alexander Yates

Alex Yates is a senior Ph.D. student at the University of Washington. He expects to graduate in June 2007 (or so he is told). His research interests span many areas of artificial intelligence and computational linguistics, including information extraction, natural language interfaces, parsing, data mining, and probabilistic methods. He is especially interested in building unsupervised and minimally supervised models from large corpora.