Learning User Interests Across Heterogeneous Document Databases

Bruce Krulwich

This paper describes an intelligent agent that learns to identify documents of interest to particular users in a distributed, dynamic environment whose databases hold mail messages, news articles, technical articles, on-line discussions, client information, proposals, design documentation, and similar material. The agent interacts with the user to categorize each document the user likes or dislikes, applies significant-phrase extraction and inductive learning to derive recognition criteria for each category, and routinely gathers new documents that match the user’s interests. We present the models used to describe the databases and the user’s interests, and we discuss the importance of techniques for acquiring high-quality input for learning algorithms.
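
The loop outlined in the abstract (collect liked and disliked documents, extract phrases, learn recognition criteria, match new documents) can be pictured with a small sketch. This is an illustration under stated assumptions, not the agent's actual method: plain word n-grams stand in for significant-phrase extraction, and a simple "seen in several liked documents, never in a disliked one" rule stands in for inductive learning. The function names `extract_phrases`, `learn_criteria`, and `matches`, and the `max_len` and `min_support` parameters, are hypothetical.

```python
import re
from collections import Counter


def extract_phrases(text, max_len=2):
    """Return the set of lowercase word n-grams (1..max_len words) in text.

    Assumption: n-grams are a crude stand-in for significant phrases.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    phrases = set()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            phrases.add(" ".join(words[i:i + n]))
    return phrases


def learn_criteria(liked, disliked, min_support=2):
    """Keep phrases found in at least min_support liked documents
    and in no disliked document (a toy substitute for inductive learning)."""
    # Each document contributes a set, so the counter counts documents, not occurrences.
    liked_counts = Counter(p for doc in liked for p in extract_phrases(doc))
    disliked_phrases = set()
    for doc in disliked:
        disliked_phrases |= extract_phrases(doc)
    return {
        p for p, count in liked_counts.items()
        if count >= min_support and p not in disliked_phrases
    }


def matches(doc, criteria):
    """A new document matches a category if it contains any learned phrase."""
    return bool(extract_phrases(doc) & criteria)
```

In use, one set of criteria would be learned per user category, and documents arriving in any of the databases would be tested with `matches` against each set. A real system would need a stronger phrase extractor and learner than this toy rule.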

