FgER: Fine-Grained Entity Recognition

Authors

Abhishek Abhishek

Indian Institute of Technology Guwahati

Published:

2018-02-08

Proceedings:

Proceedings of the AAAI Conference on Artificial Intelligence, 32

Volume

Issue:

Thirty-Second AAAI Conference on Artificial Intelligence 2018

Track:

Doctoral Consortium

Abstract:

Fine-grained Entity Recognition (FgER) is the task of detecting and classifying entity mentions into more than 100 types. The type set can span various domains, including biomedical (e.g., disease, gene), sport (e.g., sports event, sports player), religion and mythology (e.g., religion, god), and entertainment (e.g., movies, music). Most of the existing literature on Entity Recognition (ER) focuses on coarse-grained entity recognition (CgER), i.e., recognition of entities belonging to a few types such as person, location, and organization. In the past two decades, several manually annotated datasets spanning different genres of text were created to facilitate the development and evaluation of CgER systems (Nadeau and Sekine 2007). The state-of-the-art CgER systems use supervised statistical learning models trained on manually annotated datasets (Ma and Hovy 2016). In contrast, FgER systems are yet to match the performance level of CgER systems. Two major challenges explain this gap. First, manually annotating large-scale multi-genre training data for the FgER task is expensive, time-consuming, and error-prone: a human annotator has to choose a subset of types from a large type set, and the types for the same entity may differ across sentences depending on the context. Second, supervised statistical learning models trained on automatically generated, noisy training data fit to the noise, which degrades their performance. The objective of my thesis is to create an FgER system by exploring an off-the-beaten-path approach that can eliminate the need for manually annotating a large-scale multi-genre training dataset. The path includes: (1) automatically generating a large-scale single-genre training dataset, (2) developing noise-aware learning models that learn better from noisy datasets, and (3) using knowledge transfer approaches to adapt the FgER system to different genres of text.

DOI:

10.1609/aaai.v32i1.11352

ISSN 2374-3468 (Online) ISSN 2159-5399 (Print)
