Abstract

Lexicons are important resources for semantic tagging. However,
commonly used lexicons collected from entity databases suffer
from multiple problems, such as ambiguity, limited coverage, and
no indication of the relative importance of their elements. In
this work, we present a lexicon
modeling technique that automatically expands the lexicon and
assigns weights to its elements. For lexicon expansion, we use a
generative model to extract patterns from query logs using known
lexicon seeds, and discover new lexicon elements using the learned
patterns. For lexicon weighting, we propose two approaches based
on generative and discriminative models to learn the relative
importance of lexicon elements from user click statistics.
Experiments on text queries in multiple domains show that our
lexicon modeling technique can significantly improve semantic
tagging performance.
