CSI: A Coarse Sense Inventory for 85% Word Sense Disambiguation

Authors

  • Caterina Lacerra, Sapienza University of Rome
  • Michele Bevilacqua, Sapienza University of Rome
  • Tommaso Pasini, Sapienza University of Rome
  • Roberto Navigli, Sapienza University of Rome

DOI:

https://doi.org/10.1609/aaai.v34i05.6324

Abstract

Word Sense Disambiguation (WSD) is the task of associating a word in context with one of its meanings. While many works in the past have focused on raising the state of the art, none has even come close to achieving an F-score in the 80% ballpark when using WordNet as its sense inventory. We contend that one of the main reasons for this failure is the excessively fine granularity of this inventory, resulting in senses that are hard to differentiate between, even for an experienced human annotator. In this paper we cope with this long-standing problem by introducing the Coarse Sense Inventory (CSI), obtained by linking WordNet concepts to a new set of 45 labels. The results show that the coarse granularity of CSI leads a WSD model to achieve 85.9% F1, while maintaining a high expressive power. Our set of labels also exhibits ease of use in tagging and a descriptiveness that other coarse inventories lack, as demonstrated in two annotation tasks which we performed. Moreover, a few-shot evaluation shows that the class-based nature of CSI allows the model to generalise over unseen or under-represented words.
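
As an illustration of why a coarser inventory can raise scores, the following minimal sketch maps fine-grained senses onto coarse labels and scores the same predictions at both levels. The sense keys, the label names, and the `accuracy` helper are hypothetical and are not taken from the CSI release or the paper's evaluation code. Plain accuracy is used here as a simple stand-in for the F1 measure reported above.

```python
# Hypothetical sense keys and coarse labels, for illustration only;
# these are NOT the actual WordNet keys or CSI labels.
FINE_TO_COARSE = {
    "bank%financial_institution": "BUSINESS",
    "bank%bank_building": "BUSINESS",
    "bank%river_side": "GEOGRAPHY",
}


def accuracy(gold, pred, mapping=None):
    """Fraction of matching labels, optionally after mapping to coarse labels."""
    if mapping is not None:
        gold = [mapping[g] for g in gold]
        pred = [mapping[p] for p in pred]
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


# Toy gold annotations and system predictions for three occurrences of "bank".
gold = ["bank%financial_institution", "bank%river_side", "bank%bank_building"]
pred = ["bank%bank_building", "bank%river_side", "bank%bank_building"]

fine_acc = accuracy(gold, pred)                    # scored on fine-grained senses
coarse_acc = accuracy(gold, pred, FINE_TO_COARSE)  # scored on coarse labels

print(f"fine-grained accuracy: {fine_acc:.2f}")
print(f"coarse accuracy:       {coarse_acc:.2f}")
```

In this toy setting, the confusion between two closely related senses counts as an error at the fine-grained level but disappears once both senses fall under the same coarse label.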

Published

2020-04-03

How to Cite

Lacerra, C., Bevilacqua, M., Pasini, T., & Navigli, R. (2020). CSI: A Coarse Sense Inventory for 85% Word Sense Disambiguation. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 8123-8130. https://doi.org/10.1609/aaai.v34i05.6324

Section

AAAI Technical Track: Natural Language Processing