Temporal action detection is a challenging task due to vagueness of action boundaries. To tackle this issue, we propose an end-to-end progressive boundary refinement network (PBRNet) in this paper. PBRNet belongs to the family of one-stage detectors and is equipped with three cascaded detection modules for localizing action boundary more and more precisely. Specifically, PBRNet mainly consists of coarse pyramidal detection, refined pyramidal detection, and fine-grained detection. The first two modules build two feature pyramids to perform the anchor-based detection, and the third one explores the frame-level features to refine the boundaries of each action instance. In the fined-grained detection module, three frame-level classification branches are proposed to augment the frame-level features and update the confidence scores of action instances. Evidently, PBRNet integrates the anchor-based and frame-level methods. We experimentally evaluate the proposed PBRNet and comprehensively investigate the effect of the main components. The results show PBRNet achieves the state-of-the-art detection performances on two popular benchmarks: THUMOS'14 and ActivityNet, and meanwhile possesses a high inference speed.
Published Date: 2020-06-02
Registration: ISSN 2374-3468 (Online) ISSN 2159-5399 (Print) ISBN 978-1-57735-835-0 (10 issue set)
Copyright: Published by AAAI Press, Palo Alto, California USA Copyright © 2020, Association for the Advancement of Artificial Intelligence All Rights Reserved