Gleaning Answers from the Web

Nicholas Kushmerick

This position paper summarizes my recent and ongoing research on Web information extraction and retrieval. I describe wrapper induction and verification techniques for extracting data from structured sources; boosted wrapper induction, an extension of these techniques to handle natural text; ELIXIR, our efficient and expressive language for XML information retrieval; techniques and applications for text genre classification; and stochastic models for XML. The unifying goal of these research projects is to develop enabling technologies that facilitate the rapid development of large Web services for data access and integration.


This page is copyrighted by AAAI. All rights reserved.