Constructing Career Histories: A Case Study in Disentangling the Threads

Paul R. Cohen

We present an algorithm for organizing partially-ordered observations into multiple "threads," some of which may be concurrent. The algorithm is applied to the problem of constructing career histories for individual scientists from the abstracts of published papers. Because abstracts generally do not provide rich information about the contents of papers, we developed a novel relational method for judging the similarity of papers. We report four experiments that demonstrate the advantage of this method over the traditional Dice and Tanimoto coefficients, and that evaluate the quality of induced multi-thread career histories.
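For readers unfamiliar with the two baseline measures named above, the following is a minimal sketch of the standard set-based Dice and Tanimoto coefficients. It does not reproduce the relational method described in the paper. The function names and the two term sets are hypothetical, chosen only for illustration, and treating each abstract as a set of terms is an assumption.

```python
def dice(a: set, b: set) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0  # convention assumed here for two empty sets
    return 2 * len(a & b) / (len(a) + len(b))


def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient for sets: |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        return 0.0  # convention assumed here for two empty sets
    return len(a & b) / len(union)


# Hypothetical term sets standing in for two abstracts.
paper_a = {"thread", "career", "history", "abstract"}
paper_b = {"career", "history", "similarity"}

print(dice(paper_a, paper_b))      # 2 * 2 / (4 + 3)
print(tanimoto(paper_a, paper_b))  # 2 / 5
```

Both measures depend only on term overlap, which is why they have little to work with when abstracts share few terms.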

Subjects: 12. Machine Learning and Discovery; 10. Knowledge Acquisition

Submitted: Oct 16, 2006
