Jan Ulrich, Gabriel Murray, Giuseppe Carenini
Annotated email corpora are necessary for evaluating and training machine learning summarization techniques. The scarcity of such corpora has been a limiting factor for research in this field. We describe our process for creating a new annotated email thread corpus that will be made publicly available, and we present the trade-offs among the annotation methods that could be used.
Subjects: 1.10 Information Retrieval; 13.1 Discourse
Submitted: May 5, 2008