Nick Kushmerick and Brett Grace
There is much interest in systems that automatically interact with Internet information sites. Such systems are hard to build, partly because they rely on hand-crafted wrappers to extract a site’s content. We advocate wrapper induction, a technique for automatically learning wrappers. Our Wrapper Induction ENvironment (WIEN) enables users to quickly capture a set of example pages; our wrapper learning algorithm then handles the low-level details of constructing the wrapper.
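To make the idea of learning a wrapper from example pages concrete, the following is a minimal sketch, not WIEN's actual algorithm, which this abstract does not specify. It assumes a very simple wrapper consisting of one left and one right delimiter string for a single field, learned as the longest text shared immediately before and immediately after every labeled value. The function names, the example pages, and the data layout are all hypothetical.

```python
import os


def common_suffix(strings):
    """Longest string that every element of `strings` ends with."""
    reversed_prefix = os.path.commonprefix([s[::-1] for s in strings])
    return reversed_prefix[::-1]


def learn_delimiters(examples):
    """Learn a (left, right) delimiter pair from labeled pages.

    `examples` is a list of (page, values) pairs, where `values` lists
    the labeled strings in the order they occur in `page`.
    This layout is an assumption made for the sketch.
    """
    befores, afters = [], []
    for page, values in examples:
        pos = 0
        for value in values:
            start = page.index(value, pos)
            end = start + len(value)
            befores.append(page[:start])  # everything preceding the value
            afters.append(page[end:])     # everything following the value
            pos = end
    left = common_suffix(befores)
    right = os.path.commonprefix(afters)
    if not left or not right:
        raise ValueError("no consistent delimiters for these examples")
    return left, right


def extract(page, left, right):
    """Apply the learned delimiters to a page and return all matches."""
    results, pos = [], 0
    while True:
        start = page.find(left, pos)
        if start == -1:
            break
        start += len(left)
        end = page.find(right, start)
        if end == -1:
            break
        results.append(page[start:end])
        pos = end + len(right)
    return results


# Hypothetical example page with the country names labeled by a user.
training = [
    ("<html><b>Congo</b> <i>242</i><br><b>Spain</b> <i>34</i><br></html>",
     ["Congo", "Spain"]),
]
left, right = learn_delimiters(training)

new_page = "<html><b>Egypt</b> <i>20</i><br><b>Belize</b> <i>501</i><br></html>"
print(extract(new_page, left, right))  # ['Egypt', 'Belize']
```

The user supplies only the labeled examples; the delimiters themselves are derived automatically, which is the division of labor the abstract describes. A practical learner would need to handle multiple fields per record and page headers or footers that happen to contain the same delimiters, which this sketch ignores.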