Learning Pattern Rules for Chinese Named Entity Extraction

Tat-Seng Chua and Jimin Liu, National University of Singapore

Named entity (NE) extraction in Chinese is a very difficult task because of the flexibility of the language structure and the uncertainty in word segmentation. It is equivalent to relation and information extraction problems in English. This paper presents a hybrid rule induction approach to extract NEs in Chinese. The method induces rules for names and their context, and generalizes these rules using linguistic lexical chaining. To handle the ambiguities and other contextual problems peculiar to Chinese, we supplement the basic method with other approaches such as the default-exception tree and the decision tree. We tested our method on the MET2 test set and found that it outperforms all reported methods, with an overall F1 measure of over 91%.

This page is copyrighted by AAAI. All rights reserved.