A Bootstrapping Approach to Information Extraction Domain Porting

Authors

Cheng Niu, Wei Li, and Rohini K. Srihari

Abstract:

This paper presents a seed-driven, bootstrapping approach to domain porting that could be used to customize a generic information extraction (IE) capability for a specific domain. The approach taken is based on the existence of a robust, domain-independent IE engine that can continue to be enhanced, independent of any particular domain. This approach combines the strengths of parsing-based symbolic rule learning and the high-performance linear string-based Hidden Markov Model (HMM) to automatically derive a customized IE system with balanced precision and recall. The key idea is to apply precision-oriented symbolic rules learned in the first stage to a large corpus in order to construct an automatically tagged training corpus. This training corpus is then used to train an HMM to boost the recall. The experiments conducted in named entity (NE) tagging and relationship extraction show a performance close to the performance of supervised learning systems.
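
The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' system: the title-based rule, the two-tag set (PER/O), the add-one smoothing, and all function names below are assumptions made purely for illustration. Stage 1 applies a hand-written, precision-oriented rule to untagged sentences to build an automatically tagged corpus; stage 2 estimates HMM parameters from that corpus by counting and decodes new text with Viterbi.

```python
import math
import re
from collections import defaultdict

# Hypothetical high-precision rule: a capitalized token right after a title is a person.
TITLE = re.compile(r"^(Mr\.|Ms\.|Dr\.)$")
STATES = ["O", "PER"]

def rule_tag(tokens):
    """Stage 1: tag with the precision-oriented rule; everything else is O."""
    tags = ["O"] * len(tokens)
    for i in range(1, len(tokens)):
        if TITLE.match(tokens[i - 1]) and tokens[i][0].isupper():
            tags[i] = "PER"
    return tags

def train_hmm(tagged_sentences):
    """Stage 2: count transitions and emissions over the auto-tagged corpus."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for tokens, tags in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo-state
        for tok, tag in zip(tokens, tags):
            trans[prev][tag] += 1
            emit[tag][tok] += 1
            vocab.add(tok)
            prev = tag
    return trans, emit, vocab

def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log probability (an assumed smoothing choice)."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + n_outcomes))

def viterbi(tokens, trans, emit, vocab):
    """Return the most likely tag sequence under the trained HMM."""
    n_words = len(vocab) + 1  # one extra slot for unseen words
    n_states = len(STATES)
    best = {s: log_prob(trans["<s>"], s, n_states)
               + log_prob(emit[s], tokens[0], n_words) for s in STATES}
    back = []
    for tok in tokens[1:]:
        new_best, pointers = {}, {}
        for s in STATES:
            prev = max(STATES,
                       key=lambda p: best[p] + log_prob(trans[p], s, n_states))
            new_best[s] = (best[prev] + log_prob(trans[prev], s, n_states)
                           + log_prob(emit[s], tok, n_words))
            pointers[s] = prev
        best = new_best
        back.append(pointers)
    path = [max(STATES, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Toy untagged corpus, tagged automatically by the rule and then used for training.
corpus = [s.split() for s in
          ["Dr. Smith arrived", "Mr. Jones spoke", "Ms. Lee left"]]
auto_tagged = [(tokens, rule_tag(tokens)) for tokens in corpus]
trans, emit, vocab = train_hmm(auto_tagged)

test_sentence = "Yesterday Smith spoke".split()
rule_only = rule_tag(test_sentence)
hmm_tags = viterbi(test_sentence, trans, emit, vocab)
```

On the test sentence the rule alone finds nothing, because no title precedes "Smith". The HMM, trained only on rule-tagged data, can still label "Smith" from the emission and transition statistics it has learned, which is the sense in which the second stage trades some precision for recall.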
