Learning a Named Entity Tagger from Gazetteers with the Partial Perceptron

Authors

Andrew Carlson, Scott Gaffney, and Flavian Vasile

Proceedings:

No. 7: Learning by Reading and Learning to Read

Volume/Issue:

Papers from the 2009 AAAI Spring Symposium

Track:

Contents


Abstract:

While gazetteers can be used to perform named entity recognition through lookup-based methods, ambiguity and incomplete gazetteers lead to relatively low recall. A sequence model that uses more general features can achieve higher recall while maintaining reasonable precision, but typically requires expensive annotated training data. To circumvent the need for such training data, we bootstrap the learning of a sequence model with a gazetteer-driven labeling algorithm that labels only those tokens in unlabeled data that it can label confidently. We present an algorithm, called the Partial Perceptron, for discriminatively learning the parameters of a sequence model from such partially labeled data. The algorithm is easy to implement and, at equivalent performance, trains much more quickly than a state-of-the-art algorithm based on Conditional Random Fields. Experimental results show that, compared to the gazetteer-driven method, the learned model yields a substantial relative improvement in recall (77.3%) with some loss in precision (a 28.7% relative decrease).
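
The abstract does not spell out the update rule, so the following Python is only a minimal sketch of one plausible way to train a perceptron-style sequence model from partially labeled data. It is not the authors' implementation. It assumes that unlabeled tokens are marked `None`, that the model compares an unconstrained Viterbi decode against a decode constrained to agree with the known labels, and that it updates toward the constrained path whenever the two disagree on a labeled token. The label set `LABELS`, the `features` function, and all other names are hypothetical choices made for illustration.

```python
from collections import defaultdict

LABELS = ["O", "ENT"]  # hypothetical two-label tag set


def features(tokens, i, prev, cur):
    """Hypothetical features: word identity, capitalization, label transition."""
    w = tokens[i]
    return [("word", w.lower(), cur),
            ("cap", w[:1].isupper(), cur),
            ("trans", prev, cur)]


def viterbi(tokens, weights, partial=None):
    """Highest-scoring label sequence under `weights`.

    If `partial` is given, positions whose entry is not None are forced
    to take that label; positions with None are left free.
    """
    if not tokens:
        return []
    best = [{} for _ in tokens]  # best[i][label] = (score, previous label)
    for i in range(len(tokens)):
        fixed = partial[i] if partial is not None else None
        for cur in ([fixed] if fixed is not None else LABELS):
            cands = []
            for prev in (["<s>"] if i == 0 else list(best[i - 1])):
                base = 0.0 if i == 0 else best[i - 1][prev][0]
                score = base + sum(weights.get(f, 0.0)
                                   for f in features(tokens, i, prev, cur))
                cands.append((score, prev))
            best[i][cur] = max(cands, key=lambda c: c[0])
    # Follow back-pointers from the best final label.
    label = max(best[-1], key=lambda l: best[-1][l][0])
    path = [label]
    for i in range(len(tokens) - 1, 0, -1):
        label = best[i][label][1]
        path.append(label)
    return path[::-1]


def seq_features(tokens, labels):
    """Feature counts for a complete label sequence."""
    counts = defaultdict(float)
    prev = "<s>"
    for i, cur in enumerate(labels):
        for f in features(tokens, i, prev, cur):
            counts[f] += 1.0
        prev = cur
    return counts


def train(data, epochs=5):
    """Perceptron-style training on (tokens, partial_labels) pairs."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for tokens, partial in data:
            pred = viterbi(tokens, weights)
            # Skip the update if the prediction agrees on every labeled token.
            if all(p is None or p == y for p, y in zip(partial, pred)):
                continue
            # Otherwise move toward the best path consistent with the labels.
            target = viterbi(tokens, weights, partial)
            for f, v in seq_features(tokens, target).items():
                weights[f] += v
            for f, v in seq_features(tokens, pred).items():
                weights[f] -= v
    return weights
```

In this sketch, a training pair such as `(["Paris", "is", "nice"], ["ENT", "O", None])` constrains only the first two tokens, and the model is free to choose any label for the third. Averaging the weights over updates, a common perceptron refinement, is omitted for brevity.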

