Spurious words usually have no counterpart
in other languages, and are therefore a
headache in machine translation. In this
paper, we propose a novel framework,
skeleton-enhanced translation, in which a
conventional SMT decoder can boost itself
by considering the skeleton of the source
input and the translation of that skeleton.
The skeleton of a sentence is the sentence
with its spurious words removed.
We introduce two models for
identifying spurious words: a context-insensitive
model, which removes all
tokens of certain words, and a context-sensitive
model, which makes a separate
decision for each word token. We also
describe two methods for improving a translation
decoder using skeleton translation:
skeleton-enhanced re-ranking, which
re-ranks the n-best output of a conventional
SMT decoder with respect to a translated
skeleton, and skeleton-enhanced decoding,
which re-ranks the translation hypotheses
not only of the entire sentence but also of
any span of the sentence. Our experiments
show a significant improvement (1.6 BLEU)
over state-of-the-art SMT performance.
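
As a rough illustration of the two simplest components named above
(the context-insensitive model and skeleton-enhanced re-ranking), the
following minimal sketch may help. It is not the paper's implementation:
the function names, the token-overlap similarity, the interpolation
weight, and the toy data are all assumptions made for this example, since
the abstract does not specify how an n-best hypothesis is compared with
the translated skeleton.

```python
from typing import List, Sequence, Set, Tuple

# (hypothesis tokens, decoder score); higher score is better.
Hypothesis = Tuple[List[str], float]


def extract_skeleton(tokens: Sequence[str], spurious: Set[str]) -> List[str]:
    """Context-insensitive removal: drop every token of the listed words."""
    return [t for t in tokens if t not in spurious]


def skeleton_overlap(hyp: Sequence[str], skel_translation: Sequence[str]) -> float:
    """Assumed similarity: share of skeleton-translation tokens found in hyp."""
    if not skel_translation:
        return 0.0
    hyp_vocab = set(hyp)
    hits = sum(1 for t in skel_translation if t in hyp_vocab)
    return hits / len(skel_translation)


def rerank(nbest: Sequence[Hypothesis],
           skel_translation: Sequence[str],
           weight: float = 1.0) -> List[Hypothesis]:
    """Re-rank an n-best list by decoder score plus weighted skeleton overlap.

    The linear combination and `weight` are hypothetical choices.
    """
    def combined(item: Hypothesis) -> float:
        hyp, score = item
        return score + weight * skeleton_overlap(hyp, skel_translation)

    return sorted(nbest, key=combined, reverse=True)


# Toy source sentence with one word assumed to be spurious.
source = ["ta", "de", "shu"]
skeleton = extract_skeleton(source, spurious={"de"})

# Toy n-best list and a toy translation of the skeleton.
nbest = [(["the", "book", "of", "him"], -1.0), (["his", "book"], -1.2)]
reranked = rerank(nbest, skel_translation=["his", "book"])
```

In this toy case the second hypothesis overtakes the first once the
overlap term is added. Skeleton-enhanced decoding would apply a comparable
comparison to the hypotheses of each span rather than only to the final
n-best list; that step is not sketched here.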