Spelling correction as an iterative process that exploits the collective knowledge of web users

Silviu Cucerzan, Eric Brill

Proceedings of EMNLP 2004

Logs of user queries to an internet search engine provide a large amount of implicit and explicit information about language. In this paper, we investigate their use in spelling correction of search queries, a task that poses many challenges beyond those of traditional spelling correction. We present an approach that iteratively transforms the input query string into other strings that correspond to progressively more likely queries, according to statistics extracted from internet search query logs.
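
The iterative idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the model, so the sketch assumes a simple per-token scheme in which each token is replaced by a more frequent token from the query log within a small edit distance, and the process repeats until nothing changes. The names `edit_distance`, `correct_token`, `iterative_correct`, and the toy `log_counts` table are hypothetical and exist only for this example.

```python
from collections import Counter


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete from a
                            curr[j - 1] + 1,             # insert into a
                            prev[j - 1] + (ca != cb)))   # substitute
        prev = curr
    return prev[-1]


def correct_token(token, freq, max_dist=1):
    """Return the most frequent logged token within max_dist of `token`,
    or `token` itself if no nearby token is strictly more frequent."""
    best, best_count = token, freq.get(token, 0)
    for cand, count in freq.items():
        if count > best_count and edit_distance(token, cand) <= max_dist:
            best, best_count = cand, count
    return best


def iterative_correct(query, freq, max_dist=1, max_iters=10):
    """Apply correct_token to every token until a fixed point is reached."""
    tokens = query.lower().split()
    for _ in range(max_iters):
        new_tokens = [correct_token(t, freq, max_dist) for t in tokens]
        if new_tokens == tokens:
            break
        tokens = new_tokens
    return " ".join(tokens)


# Toy token counts standing in for real query-log statistics (made up).
log_counts = Counter({"amazon": 1000, "amzon": 20, "books": 500})

print(iterative_correct("amzoon books", log_counts))
```

In this toy log, "amzoon" is two edits away from "amazon", so a single pass with `max_dist=1` only reaches the less common misspelling "amzon"; the second pass then moves from "amzon" to "amazon". This is the sense in which repeated small steps toward more frequent strings can recover corrections that a single bounded-distance step would miss.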