Proceedings:
Book One
Volume
Issue:
Proceedings of the AAAI Conference on Artificial Intelligence, 20
Track:
Natural Language Processing and Speech Recognition
Downloads:
Abstract:
A critical problem faced by current supervised WSD systems is the lack of manually annotated training data. Tackling this data acquisition bottleneck is crucial, in order to build high accuracy and wide-coverage WSD systems. In this paper, we show that the approach of automatically gathering training examples from parallel texts is scalable to a large set of nouns. We conducted evaluation on the nouns of SENSEVAL-2 English all-words task, using fine-grained sense scoring. Our evaluation shows that training on examples gathered from 680MB of parallel texts achieves accuracy comparable to the best system of SENSEVAL-2 English all-words task, and significantly outperforms the baseline of always choosing sense 1 of WordNet.
AAAI
Proceedings of the AAAI Conference on Artificial Intelligence, 20