AAAI Publications, The Thirtieth International FLAIRS Conference

High Recall Text Classification for Public Health Systematic Review
Paul McNamee, James Mayfield, Samantha Y. Rowe, Alexander K. Rowe, Hannah L. Jackson, Megan Baker

Last modified: 2017-05-03


Some information retrieval applications demand manageable levels of precision at high levels of recall. Examples include e-discovery, patent search, and systematic review. In this paper we present a real-world case study supporting a broad-topic systematic review in the public health domain. We provide experimental results that demonstrate how retrieval performance on bibliographic citations can be materially improved. We attained an average precision of 0.57 and recall approaching 80% at a reasonable screening depth. These results represent relative gains of 18% and 23%, respectively, over a baseline classifier. We also address pragmatic issues that arise when working on “noisy” real-world data, such as coping with citation records that often have empty fields.
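
For readers unfamiliar with the measures named above, the following is a minimal sketch of how average precision, recall at a screening depth, and relative gain are conventionally computed for a ranked list of citations. It is not the authors' evaluation code. The function names and the toy relevance labels are assumptions made for illustration, and the numbers are not taken from the paper.

```python
from typing import Sequence


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Mean of the precision values at each rank holding a relevant item.

    ranked_labels is a hypothetical full ranking, best first,
    with 1 for a relevant citation and 0 for an irrelevant one.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def recall_at_depth(ranked_labels: Sequence[int], depth: int) -> float:
    """Fraction of all relevant citations found in the top `depth` results."""
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        return 0.0
    return sum(ranked_labels[:depth]) / total_relevant


def relative_gain(new_score: float, baseline_score: float) -> float:
    """Improvement over a baseline, expressed as a fraction of the baseline."""
    return (new_score - baseline_score) / baseline_score


# Toy ranking (illustrative only): 4 relevant citations among 10.
toy_ranking = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
ap = average_precision(toy_ranking)
recall_5 = recall_at_depth(toy_ranking, depth=5)
gain = relative_gain(0.59, 0.50)  # made-up scores, not the paper's
print(f"AP={ap:.3f} recall@5={recall_5:.2f} gain={gain:.0%}")
```

Screening depth here means the number of top-ranked citations a reviewer reads before stopping, so recall at that depth indicates how many relevant citations would be found for that amount of screening effort.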


Keywords: Text Classification; Systematic Review; Machine Learning
