WAVE: An Incremental Algorithm for Information Extraction

Jonathan H. Aseltine

This paper describes WAVE, a fully automatic, incremental induction algorithm for learning information extraction rules. Unlike traditional batch learners, WAVE learns from a stream of training instances, not a set. WAVE overcomes the inherent problems of incremental operation by maintaining a generalization hierarchy of rules. Use of a hierarchy allows similar rules to be found efficiently, provides a natural bound on generalization, enables recall/precision trade-offs without retraining, and speeds extraction, since not every rule needs to be applied to an instance. Finally, because the reliability of rule predictions is continually updated throughout storage, the hierarchy can be used for extraction at any time. Experiments show that WAVE performs as well as CRYSTAL, a related batch algorithm, in two very different extraction domains. WAVE is significantly faster in a simulated incremental application setting.
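
The abstract does not give WAVE's data structures, so the following is only a minimal sketch of how a generalization hierarchy could support the properties listed above. It assumes, purely for illustration, that a rule is a set of required features, that each child rule adds constraints to its parent, and that reliability is the fraction of firings on positive instances. The names `RuleNode`, `update`, and `extract` and the feature strings are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class RuleNode:
    """One rule in a generalization hierarchy (hypothetical structure)."""
    constraints: frozenset  # features an instance must contain for the rule to fire
    correct: int = 0        # firings on positive instances
    fired: int = 0          # firings on any instance
    children: list = field(default_factory=list)  # more specific rules

    def matches(self, features):
        return self.constraints <= features

    def reliability(self):
        return self.correct / self.fired if self.fired else 0.0


def update(node, features, is_positive):
    """Update reliability counts for every rule matching one streamed instance."""
    if not node.matches(features):
        return  # children only add constraints, so the whole subtree is skipped
    node.fired += 1
    node.correct += int(is_positive)
    for child in node.children:
        update(child, features, is_positive)


def extract(node, features, threshold):
    """Return True if some matching rule has reliability >= threshold."""
    if not node.matches(features):
        return False  # pruning: no descendant can match either
    if node.fired and node.reliability() >= threshold:
        return True
    return any(extract(child, features, threshold) for child in node.children)


# Assumed toy hierarchy: root (no constraints) -> general -> specific.
root = RuleNode(frozenset())
general = RuleNode(frozenset({"verb:diagnosed"}))
specific = RuleNode(frozenset({"verb:diagnosed", "prep:with"}))
root.children.append(general)
general.children.append(specific)

# Training instances arrive one at a time; counts are usable after each one.
stream = [
    ({"verb:diagnosed", "prep:with"}, True),
    ({"verb:diagnosed", "prep:with"}, True),
    ({"verb:diagnosed"}, False),
    ({"verb:admitted"}, False),
]
for features, label in stream:
    update(root, features, label)

# The same hierarchy serves a strict and a lenient setting without retraining.
strict = extract(root, {"verb:diagnosed"}, threshold=0.9)
lenient = extract(root, {"verb:diagnosed"}, threshold=0.6)
```

In this sketch, raising `threshold` admits only the more reliable (typically more specific) rules, which favors precision, while lowering it lets general rules fire, which favors recall. How WAVE itself builds the hierarchy, generalizes rules, and measures reliability is not specified here and may differ.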
