Iterative Human-in-the-Loop Discovery of Unknown Unknowns in Image Datasets

Authors

Lei Han, Xiao Dong, Gianluca Demartini

The University of Queensland, Sun Yat-sen University, The University of Queensland

Published:

2021-11-14

Proceedings:

Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 9

Volume and Issue:

Vol. 9 (2021): Proceedings of the Ninth AAAI Conference on Human Computation and Crowdsourcing

Track:

Full Archival Papers

Abstract:

Automatic predictions (e.g., recognizing objects in images) may result in systematic errors if certain classes are not well represented by training instances (these errors are called unknowns). When a model assigns high confidence scores to these wrong predictions (this type of error is called unknown unknowns), it becomes challenging to identify them automatically. In this paper, we present the first work on leveraging human intelligence to discover unknown unknowns (UUs) in an iterative way. The proposed methodology first differentiates the feature space generated by crowd workers labelling instances (e.g., images) in an active learning fashion from the space learned by the prediction model over a batch training phase, and thus identifies the predictions most likely to be UUs. Next, we add the crowd labels collected for these discovered UUs to the training set and re-train the model with this extended dataset. This process is then repeated iteratively to discover more instances of both unknown and under-represented classes. Our experimental results show that the proposed methodology is able to (1) efficiently discover UUs, (2) significantly improve the quality of model predictions, and (3) push UUs into known unknowns (i.e., the model makes mistakes, but its classification confidence on those instances is low, so those predictions can be discarded or post-processed) for further investigation. We additionally discuss the trade-off between prediction quality improvements and the human effort required to achieve those improvements. Our results have implications for building cost-effective systems to discover UUs with humans in the loop.
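
The terminology in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it only encodes the definitions given above (a wrong prediction made with high confidence is an unknown unknown; a wrong prediction made with low confidence is a known unknown). The 0.8 confidence threshold, the function name `categorize`, and the example triples are all assumptions made for illustration.

```python
from collections import Counter

# Assumption: cut-off for "high confidence"; the abstract does not give a value.
CONFIDENCE_THRESHOLD = 0.8


def categorize(predicted, actual, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Label one prediction using the abstract's terminology."""
    if predicted == actual:
        return "correct"
    # The model is wrong: its confidence decides which kind of unknown this is.
    if confidence >= threshold:
        return "unknown unknown"
    return "known unknown"


# Hypothetical (predicted label, true label, confidence) triples.
predictions = [
    ("cat", "cat", 0.95),
    ("dog", "wolf", 0.92),
    ("dog", "fox", 0.40),
    ("cat", "lynx", 0.85),
]

counts = Counter(categorize(p, a, c) for p, a, c in predictions)
print(counts)
```

In this toy data, the "wolf" and "lynx" images are the hard cases: the model is wrong yet confident, so a confidence filter alone would not flag them. This is why the abstract turns to crowd labels to surface such instances.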

DOI:

10.1609/hcomp.v9i1.18941

HCOMP

ISBN 978-1-57735-872-5
