Abstract

Most existing approaches to error detection and correction in Chinese text rely on N-gram language models over characters, words, or POS tags. These models have two deficiencies: they employ only local language constraints, and they provide no process for unifying the different language models. This paper presents a feature-based approach to automatic error detection and correction that uses both local language features and wide-scope semantic features, with Winnow adopted in the learning step. In experiments, the method achieves an error detection recall of 85%, a precision of 41%, and an error correction rate of 51%, outperforming existing approaches based on N-gram models.
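
For readers unfamiliar with Winnow, the following is a minimal sketch of a standard Winnow-style learner over sparse binary features. The abstract does not say which Winnow variant, threshold, or update factor is used, so the class name `Winnow`, the function `train`, the default `threshold` and `alpha` values, and the feature strings in the usage note are all assumptions made for illustration, not the paper's implementation.

```python
from typing import Dict, Iterable, Set, Tuple


class Winnow:
    """Mistake-driven linear classifier with multiplicative weight updates."""

    def __init__(self, threshold: float = 2.0, alpha: float = 2.0) -> None:
        self.threshold = threshold  # decision threshold (assumed value)
        self.alpha = alpha          # promotion/demotion factor, should be > 1
        self.weights: Dict[str, float] = {}

    def score(self, active: Set[str]) -> float:
        # Features never updated so far count with the initial weight 1.0.
        return sum(self.weights.get(f, 1.0) for f in active)

    def predict(self, active: Set[str]) -> bool:
        return self.score(active) >= self.threshold

    def update(self, active: Set[str], label: bool) -> bool:
        """Update on one example; return True if the model made a mistake."""
        if self.predict(active) == label:
            return False  # weights change only on mistakes
        # Promote active features on a missed positive, demote on a false positive.
        factor = self.alpha if label else 1.0 / self.alpha
        for f in active:
            self.weights[f] = self.weights.get(f, 1.0) * factor
        return True


def train(model: Winnow,
          examples: Iterable[Tuple[Set[str], bool]],
          epochs: int = 5) -> Winnow:
    """Run several passes over the examples, updating on each mistake."""
    data = list(examples)
    for _ in range(epochs):
        for active, label in data:
            model.update(active, label)
    return model
```

In such a setup, each example would be the set of features active around a candidate character or word (for instance a hypothetical local feature like `"prev=b"` and a hypothetical wide-scope feature like `"topic=y"`), with the label saying whether the candidate is correct in that context. Because updates touch only the active features, local and wide-scope features can be mixed freely in one model.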