Because most recently proposed human detection methods already achieve sufficiently high recall within a reasonable number of proposals, this paper focuses on improving the precision of human detectors. Two main challenges stand in the way of higher precision: i) hard background instances and ii) redundant partial proposals. To address them, we propose PoseHD, a novel top-down, pose-based framework that can be built on top of an arbitrary state-of-the-art human detector. PoseHD first applies human pose estimation (in a batch manner) and classifies the resulting pose heatmaps with a convolutional neural network, eliminating hard negatives by extracting more detailed structural information. It then applies pose-based proposal clustering and reranking modules, which filter redundant partial proposals by jointly considering holistic and part information. Experimental results on multiple pedestrian benchmark datasets show that PoseHD generally improves the overall performance of recent state-of-the-art human detectors (by 2-4% in both mAP and MR metrics). Moreover, the framework can be easily extended to object detection given large-scale object part annotations. Finally, we present an extensive ablation analysis that compares our approach with traditional bottom-up pose-based models and highlights the importance of our framework design decisions.
Published Date: 2018-02-08
Registration: ISSN 2374-3468 (Online); ISSN 2159-5399 (Print)
Copyright: Published by AAAI Press, Palo Alto, California, USA. Copyright © 2018, Association for the Advancement of Artificial Intelligence. All Rights Reserved.