Hypertext Summary Extraction for Fast Document Browsing

Kavi Mahesh

This article describes an application of Natural Language Processing (NLP) techniques to enable fast browsing of on-line documents by automatically generating Hypertext summaries of one or more documents. Unlike previous work on summarization, the system described here, HyperGen, does not produce plain-text summaries and does not discard the parts of the document that are not included in the summary. HyperGen is based on the view that summarization is essentially the task of synthesizing Hypertext structure in a document so that the parts of the document "important" to the user are accessible up front while the other parts are hidden in multiple layers of increasing detail. HyperGen also generates short descriptions of the contents and rhetorical purposes of the hidden parts, and uses them to label the Hypertext links between the summary and the different layers of detail that it generates. A prototype HyperGen system has been implemented to illustrate these techniques and demonstrate their usefulness in browsing World Wide Web documents.

