Using Layout for the Generation, Understanding, or Retrieval of Documents:
Papers from the AAAI Fall Symposium
Richard Power and Donia Scott, Cochairs
November 5-7, 1999, North Falmouth, Massachusetts
This symposium brought together academic researchers exploring computational treatments of layout as a feature of text and practitioners of information design, a field in which layout plays a major role. The participants represented a range of areas within linguistics and computer science in which layout can be studied and used.
The symposium had four main sessions, each centered on a key theme. The opening session explored the general significance of layout in written language. The program included contributions from people active in commercial document production as well as in theoretical research. They discussed practical issues of document design, the relationship between layout and genre, and research on the ways in which layout can influence text comprehension.
The focus then turned to the role of layout in natural language generation. Since layout and wording interact, it is desirable that a system that automatically generates documents should include layout specifications in its output. The symposium discussed specific interactions between graphical and linguistic features, and how these affect the architecture of the system and the nature of text planning.
The third session considered the converse problem of exploiting layout features during information extraction. Because layout often expresses semantic or rhetorical distinctions, a system that automatically extracts data from a document can be enhanced by taking account of graphical features, perhaps even superficial ones. The symposium discussed examples such as table recognition, the use of indentation, and the role of formatting in identifying key words.
In the fourth session, the symposium discussed systems that use rhetorical markup to perform automatic document formatting. It focused in particular on the important empirical problem of discovering the relationship between rhetorical structure and layout in specific genres and domains.