Pyramidal Feature Shrinking for Salient Object Detection

Authors

Mingcan Ma

State Key Laboratory of Virtual Reality Technology and Systems, SCSE, Beihang University, Beijing, China Pengcheng Laboratory, Shenzhen, China

Changqun Xia

Pengcheng Laboratory, Shenzhen, China

Jia Li

State Key Laboratory of Virtual Reality Technology and Systems, SCSE, Beihang University, Beijing, China Pengcheng Laboratory, Shenzhen, China

Proceedings:

No. 3: AAAI-21 Technical Tracks 3

Volume

Issue:

Proceedings of the AAAI Conference on Artificial Intelligence, 35

Track:

AAAI Technical Track on Computer Vision II

Downloads:

Download PDF

Abstract:

Recently, we have witnessed the great progress of salient object detection (SOD), which benefits from the effectiveness of various feature aggregation strategies. However, existing methods usually aggregate the low-level features containing details and the high-level features containing semantics over a large span, which introduces noise into the aggregated features and generate inaccurate saliency map. To address this issue, we propose pyramidal feature shrinking network (PFSNet), which aims to aggregate adjacent feature nodes in pairs with layer-by-layer shrinkage, so that the aggregated features fuse effective details and semantics together and discard interference information. Specifically, pyramidal shrinking decoder (PSD) is proposed to aggregate adjacent features hierarchically in an asymptotic manner. Unlike other methods that aggregate features with significantly different information, this method only focuses on adjacent feature nodes in each layer and shrinks them to a final unique feature node. Besides, we propose adjacent fusion module (AFM) to perform mutual spatial enhancement between the adjacent features so as to dynamically weight the features and adaptively fuse the appropriate information. In addition, scale-aware enrichment module (SEM) based on the features extracted from backbone is utilized to obtain rich scale information and generate diverse initial features with dilated convolutions. Extensive quantitative and qualitative experiments demonstrate that the proposed intuitive framework outperforms 14 state-of-the-art approaches on 5 public datasets.

DOI:

10.1609/aaai.v35i3.16331

AAAI

Proceedings of the AAAI Conference on Artificial Intelligence, 35

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.