Jeunghyun Byun, Seung-Wook Lee, Young-In Song, Hae-Chang Rim
In this paper, we propose a new model for refining SMS text messages, in which two different kinds of grammatical errors frequently occur together. We present a two-phase approach based on the divide-and-conquer strategy: an HMM-based model corrects spacing errors in the first phase, and a rule-based correction model corrects spelling errors in the second phase. Experimental results show that the proposed approach yields better performance than the translation-based approach.
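The two-phase idea can be illustrated with a minimal sketch. It is not the authors' implementation: the two-state tag set (B for a character that starts a word, I for one that continues a word), the add-one smoothing, the toy training sentences, and the `RULES` table are all assumptions made for illustration, and the function names `train`, `viterbi`, `respace`, `correct_spelling`, and `refine` are hypothetical.

```python
import math
from collections import Counter

STATES = ("B", "I")  # assumed tag set: B = starts a word, I = continues a word


def train(sentences):
    """Count transitions and emissions from correctly spaced sentences."""
    trans, emit, total, vocab = Counter(), Counter(), Counter(), set()
    for sent in sentences:
        prev = None
        for word in sent.split():
            for i, ch in enumerate(word):
                tag = "B" if i == 0 else "I"
                if prev is not None:
                    trans[(prev, tag)] += 1
                emit[(tag, ch)] += 1
                total[tag] += 1
                vocab.add(ch)
                prev = tag
    return trans, emit, total, vocab


def viterbi(chars, model):
    """Return the most likely B/I tag sequence for a list of characters."""
    if not chars:
        return []
    trans, emit, total, vocab = model
    v = len(vocab) + 1  # one extra slot for unseen characters

    def lp_emit(tag, ch):  # add-one smoothed emission log-probability
        return math.log((emit[(tag, ch)] + 1) / (total[tag] + v))

    def lp_trans(prev, tag):  # add-one smoothed transition log-probability
        out = trans[(prev, "B")] + trans[(prev, "I")]
        return math.log((trans[(prev, tag)] + 1) / (out + 2))

    # The first character is forced to start a word.
    score = {"B": lp_emit("B", chars[0]), "I": -math.inf}
    back = []
    for ch in chars[1:]:
        new, ptr = {}, {}
        for tag in STATES:
            best = max(STATES, key=lambda p: score[p] + lp_trans(p, tag))
            new[tag] = score[best] + lp_trans(best, tag) + lp_emit(tag, ch)
            ptr[tag] = best
        score = new
        back.append(ptr)
    tag = max(STATES, key=lambda t: score[t])
    tags = [tag]
    for ptr in reversed(back):  # follow back-pointers to the first character
        tag = ptr[tag]
        tags.append(tag)
    tags.reverse()
    return tags


def respace(text, model):
    """Phase 1: discard the original spacing and re-insert spaces from the tags."""
    chars = [c for c in text if not c.isspace()]
    out = []
    for ch, tag in zip(chars, viterbi(chars, model)):
        if tag == "B" and out:
            out.append(" ")
        out.append(ch)
    return "".join(out)


# Hypothetical rewrite rules; a real system would use a curated rule set.
RULES = {"c": "see", "u": "you", "l8r": "later"}


def correct_spelling(text, rules=RULES):
    """Phase 2: token-level rule lookup on the re-spaced text."""
    return " ".join(rules.get(tok, tok) for tok in text.split())


def refine(text, model):
    """Run spacing correction first, then spelling correction."""
    return correct_spelling(respace(text, model))
```

Running the phases in this order means the spelling rules operate on tokens whose boundaries have already been repaired, which is the divide-and-conquer point of the approach. A first-order model over single characters is far too weak for real messages; it only shows the shape of the pipeline.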
Subjects: 13. Natural Language Processing
Submitted: May 4, 2008