Discourse Factors in Multi-document Summarization

Ani Nenkova

The over-abundance of information today, especially online, has established the need for natural language technologies that help users find relevant information; multi-document summarization (MDS) and question answering (QA) are two examples. Because MDS and open-ended QA must produce multi-sentence answers, the output of such systems also has to be a coherent discourse. Generating appropriate referring expressions for the entities in these texts is non-trivial, because sentences are taken out of their original contexts and combined to form a new text. The new context of the summary often requires changes in the surface realization of references: either additional information must be included or redundant information removed. Such changes can be implemented by gathering a collection of possible references to an entity from the input documents and then rewriting the references in the sentences selected for inclusion in the summary. The question is then how to determine which attributes or descriptions of the referent are appropriate for the context of the summary.
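The gather-then-rewrite idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not the method of the paper: the function names `gather_references` and `rewrite_summary`, the `{entity_id}` slot notation for marking where an entity is mentioned, and the selection heuristic (longest collected reference at first mention, shortest afterwards) are all hypothetical.

```python
import re
from collections import defaultdict


def gather_references(mentions):
    """Group reference strings by entity id, dropping exact duplicates.

    `mentions` is an iterable of (entity_id, reference_text) pairs that are
    assumed to have been extracted from the input documents already.
    """
    refs = defaultdict(list)
    for entity_id, text in mentions:
        if text not in refs[entity_id]:
            refs[entity_id].append(text)
    return dict(refs)


def rewrite_summary(sentences, references):
    """Fill {entity_id} slots in the selected summary sentences.

    Illustrative heuristic only: use the longest known reference the first
    time an entity appears in the summary and the shortest one afterwards.
    """
    seen = set()

    def fill(match):
        entity_id = match.group(1)
        options = references.get(entity_id)
        if not options:
            # Unknown entity: leave the slot untouched.
            return match.group(0)
        if entity_id in seen:
            return min(options, key=len)
        seen.add(entity_id)
        return max(options, key=len)

    return [re.sub(r"\{(\w+)\}", fill, s) for s in sentences]
```

A real system would need coreference resolution to build the reference collection and a principled model of which attributes suit the summary context; the length-based rule here only stands in for that decision.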

Subjects: 13. Natural Language Processing; 13.1 Discourse

Submitted: Apr 5, 2005
