Active Learning for Hierarchical Wrapper Induction

Ion Muslea, Steve Minton, and Craig Knoblock, University of Southern California

As an alternative to writing extraction rules by hand, we created STALKER, a wrapper induction algorithm that learns high-accuracy extraction rules. STALKER's major novelty is hierarchical wrapper induction: the relevant data is extracted hierarchically, guided by the embedded catalog tree (ECT), a user-provided description of the information to be extracted.
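
To make the idea of hierarchical extraction concrete, here is a minimal Python sketch. It is an illustration only, not STALKER itself: the `Node` class, the `find_spans` and `extract` helpers, the sample page, and the fixed start/end marker strings are all hypothetical, and the markers merely stand in for the extraction rules that a wrapper induction algorithm would learn. The sketch shows one way a tree-shaped description of the data can drive extraction: each node is located inside the content of its parent, and its children are then located inside the node's own content.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a hypothetical ECT-like tree (illustrative, not STALKER's format)."""
    name: str
    start: str                 # marker preceding the node's content in its parent
    end: str                   # marker following the node's content (non-empty)
    is_list: bool = False      # True if the node repeats inside its parent
    children: list = field(default_factory=list)


def find_spans(text, start, end, repeat):
    """Return substrings found between start and end; only the first unless repeat."""
    if not end:
        raise ValueError("end marker must be non-empty")
    spans, pos = [], 0
    while True:
        i = text.find(start, pos)
        if i < 0:
            break
        i += len(start)
        j = text.find(end, i)
        if j < 0:
            break
        spans.append(text[i:j])
        pos = j + len(end)
        if not repeat:
            break
    return spans


def extract(node, parent_text):
    """Locate node inside its parent's content, then recurse into its children."""
    results = []
    for span in find_spans(parent_text, node.start, node.end, node.is_list):
        if node.children:
            value = {child.name: extract(child, span) for child in node.children}
        else:
            value = span.strip()
        results.append(value)
    if node.is_list:
        return results
    return results[0] if results else None


# Hypothetical page and tree: a restaurant name plus a list of addresses.
page = ("<h1>Joe's Diner</h1><ul>"
        "<li>12 Main St, <b>555-1234</b></li>"
        "<li>9 Oak Ave, <b>555-9876</b></li></ul>")

ect = [
    Node("name", "<h1>", "</h1>"),
    Node("addresses", "<li>", "</li>", is_list=True, children=[
        Node("street", "", ","),
        Node("phone", "<b>", "</b>"),
    ]),
]

record = {node.name: extract(node, page) for node in ect}
```

In this sketch the `street` and `phone` markers are only ever searched for inside a single `<li>` item, never across the whole page, which is the point of working down the tree: each rule has to be correct only within its parent's content.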

