Proceedings:
No. 14: AAAI-21 Technical Tracks 14
Volume
Issue:
Proceedings of the AAAI Conference on Artificial Intelligence, 35
Track:
AAAI Technical Track on Speech and Natural Language Processing I
Downloads:
Abstract:
Automatic evaluation for open-ended natural language generation tasks remains a challenge. We propose a learned evaluation metric: Perception Score. It utilizes a pre-trained model and considers context information for conditional generation. Perception Score assigns a holistic score along with the uncertainty measurement. We conduct experiments on three open-ended conditional generation tasks and two open-ended unconditional generation tasks. Perception Score achieves state-of-the-art results on all the tasks consistently in terms of correlation with human evaluation scores.
DOI:
10.1609/aaai.v35i14.17526
AAAI
Proceedings of the AAAI Conference on Artificial Intelligence, 35