Proceedings:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
36
Issue:
No. 3: AAAI-22 Technical Tracks 3
Track:
AAAI Technical Track on Computer Vision III
Abstract:
Existing image semantic segmentation methods favor learning consistent representations by extracting long-range contextual features with attention, multi-scale, or graph aggregation strategies. These methods usually treat misclassified and correctly classified pixels equally, which misleads the optimization process and causes inconsistent intra-class pixel feature representations in the embedding space during learning. In this paper, we propose the auxiliary representation calibration head (RCH), which consists of image decoupling, prototype clustering, and error calibration modules together with a metric loss function, to calibrate these error-prone feature representations for better intra-class consistency and segmentation performance. RCH can be incorporated into the hidden layers, trained together with the segmentation network, and detached at the inference stage without adding parameters. Experimental results show that our method significantly boosts the performance of current segmentation methods on multiple datasets (e.g., we outperform the original HRNet and OCRNet by 1.1% and 0.9% mIoU, respectively, on the Cityscapes test set). Code is available at https://github.com/VipaiLab/RCH.
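For readers unfamiliar with the mIoU metric quoted in the abstract, the following is a minimal sketch of how mean intersection-over-union is commonly computed from flattened per-pixel labels. It is not taken from the RCH code base; the function name `mean_iou` and its arguments are hypothetical, and the sketch assumes integer class ids in `range(num_classes)` with no ignored (void) label. Benchmarks typically accumulate these counts over a whole dataset rather than per image.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target.

    pred, target: equal-length sequences of integer class ids
    (flattened per-pixel labels). Hypothetical helper for illustration.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Correct pixel: counts once toward both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # Misclassified pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Skip classes that never appear, so they do not distort the mean.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

With predictions `[0, 0, 1, 1]` and ground truth `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is their average. A reported gain of 1.1% mIoU corresponds to a difference of about 0.011 in this quantity.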
DOI:
10.1609/aaai.v36i3.20145