Simone Teufel, Marc Moens
Knowledge about the discourse-level structure of a scientific article is useful for flexible and sub-domain-independent automatic abstraction. We are interested in the automatic identification of content units (argumentative entities) in the source text, such as GOAL or PROBLEM STATEMENT, CONCLUSIONS and RESULTS. In this paper, we present an extension of Kupiec et al.’s (1995) methodology for trainable statistical sentence extraction. Our extension additionally classifies the extracted sentences according to their argumentative status. Because it takes only low-level properties of the sentence into account and uses no external knowledge sources other than meta-linguistic ones, it achieves robustness and partial domain-independence.
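As a rough illustration of the kind of trainable statistical classification described above, the following minimal sketch applies a naive Bayes classifier to low-level sentence features and assigns each sentence an argumentative label. It is a sketch under stated assumptions, not the implementation used in this work: the feature names (`location`, `cue`), their values, the `NONE` label for non-extracted sentences, the add-one smoothing, and the toy training data are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: (low-level sentence features, argumentative label).
# Feature names, values and the NONE label are illustrative assumptions.
TRAINING = [
    ({"location": "initial", "cue": "goal_phrase"}, "GOAL"),
    ({"location": "initial", "cue": "goal_phrase"}, "GOAL"),
    ({"location": "final", "cue": "result_phrase"}, "RESULTS"),
    ({"location": "final", "cue": "conclusion_phrase"}, "CONCLUSIONS"),
    ({"location": "medial", "cue": "none"}, "NONE"),
    ({"location": "medial", "cue": "none"}, "NONE"),
    ({"location": "medial", "cue": "none"}, "NONE"),
]


def train(examples, alpha=1.0):
    """Collect the counts needed for naive Bayes with additive smoothing."""
    label_counts = Counter(label for _, label in examples)
    value_counts = defaultdict(Counter)  # (label, feature name) -> value -> count
    values = defaultdict(set)            # feature name -> set of observed values
    for feats, label in examples:
        for name, value in feats.items():
            value_counts[(label, name)][value] += 1
            values[name].add(value)
    return {
        "labels": label_counts,
        "counts": value_counts,
        "values": values,
        "alpha": alpha,
        "n": len(examples),
    }


def classify(model, feats):
    """Return the most probable label and the log-score of every label."""
    alpha = model["alpha"]
    scores = {}
    for label, n_label in model["labels"].items():
        # log P(label) + sum over features of log P(feature value | label),
        # treating the features as conditionally independent given the label.
        score = math.log(n_label / model["n"])
        for name, value in feats.items():
            count = model["counts"][(label, name)][value]
            n_values = len(model["values"][name]) or 1
            score += math.log((count + alpha) / (n_label + alpha * n_values))
        scores[label] = score
    best = max(scores, key=scores.get)
    return best, scores


model = train(TRAINING)
label, _ = classify(model, {"location": "initial", "cue": "goal_phrase"})
print(label)
```

In this sketch, extraction and classification collapse into a single decision: a sentence labelled `NONE` is simply not extracted. A two-stage design (extract first, then label the extracted sentences) would be an equally plausible reading of the description.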