No. 2: AAAI-22 Technical Tracks 2
AAAI Technical Track on Computer Vision II
Iterative Contrast-Classify for Semi-supervised Temporal Action Segmentation
JPV-Net: Joint Point-Voxel Representations for Accurate 3D Object Detection
Fully Attentional Network for Semantic Segmentation
Self-Supervised Object Localization with Joint Graph Partition
Correlation Field for Boosting 3D Object Detection in Structured Scenes
Boost Supervised Pretraining for Visual Transfer Learning: Implications of Self-Supervised Contrastive Representation Learning
Dual Contrastive Learning for General Face Forgery Detection
SSAT: A Symmetric Semantic-Aware Transformer Network for Makeup Transfer and Removal
Adversarial Bone Length Attack on Action Recognition
Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?
Not All Voxels Are Equal: Semantic Scene Completion from the Point-Voxel Perspective
Transfer Learning for Color Constancy via Statistic Perspective
TVT: Three-Way Vision Transformer through Multi-Modal Hypersphere Learning for Zero-Shot Sketch-Based Image Retrieval
GuidedMix-Net: Semi-supervised Semantic Segmentation by Using Labeled Images as Reference
MTLDesc: Looking Wider to Describe Better
Active Boundary Loss for Semantic Segmentation
Online-Updated High-Order Collaborative Networks for Single Image Deraining
FCA: Learning a 3D Full-Coverage Vehicle Camouflage for Multi-View Physical Adversarial Attack
When Shift Operation Meets Vision Transformer: An Extremely Simple Alternative to Attention Mechanism
Self-Supervised Representation Learning Framework for Remote Physiological Measurement Using Spatiotemporal Augmentation Loss
Self-Supervised Category-Level 6D Object Pose Estimation with Deep Implicit Shape Representation
Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels
ReX: An Efficient Approach to Reducing Memory Cost in Image Classification
CPRAL: Collaborative Panoptic-Regional Active Learning for Semantic Segmentation
Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation
TransMEF: A Transformer-Based Multi-Exposure Image Fusion Framework Using Self-Supervised Multi-Task Learning
Deep Implicit Statistical Shape Models for 3D Medical Image Delineation
Decompose the Sounds and Pixels, Recompose the Events
Learning from Label Proportions with Prototypical Contrastive Clustering
Beyond Learning Features: Training a Fully-Functional Classifier with ZERO Instance-Level Labels
Reference-Guided Pseudo-Label Generation for Medical Semantic Segmentation
Information-Theoretic Bias Reduction via Causal View of Spurious Correlation
Improving Scene Graph Classification by Exploiting Knowledge from Texts
Reliable Inlier Evaluation for Unsupervised Point Cloud Registration
Explainable Survival Analysis with Convolution-Involved Vision Transformer
Un-mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning
On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals
Social Interpretable Tree for Pedestrian Trajectory Prediction
P^3-Net: Part Mobility Parsing from Point Cloud Sequences via Learning Explicit Point Correspondence
Improving Zero-Shot Phrase Grounding via Reasoning on External Knowledge and Spatial Relations
A Fusion-Denoising Attack on InstaHide with Data Augmentation
Deep Neural Networks Learn Meta-Structures from Noisy Labels in Semantic Segmentation
Stochastic Planner-Actor-Critic for Unsupervised Deformable Image Registration
Adaptive Poincaré Point to Set Distance for Few-Shot Classification
Generative Adaptive Convolutions for Real-World Noisy Image Denoising
REMOTE: Reinforced Motion Transformation Network for Semi-supervised 2D Pose Estimation in Videos
Learning from the Target: Dual Prototype Network for Few Shot Semantic Segmentation
MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation
Towards Bridging Sample Complexity and Model Capacity
Towards Accurate Facial Motion Retargeting with Identity-Consistent and Expression-Exclusive Constraints
Can Vision Transformers Learn without Natural Images?
Federated Learning for Face Recognition with Gradient Correction
Restorable Image Operators with Quasi-Invertible Networks
TEACh: Task-Driven Embodied Agents That Chat
Label-Efficient Hybrid-Supervised Learning for Medical Image Segmentation
Less Is More: Pay Less Attention in Vision Transformers
Unsupervised Representation for Semantic Segmentation by Implicit Cycle-Attention Contrastive Learning
Graph-Based Point Tracker for 3D Object Tracking in Point Clouds
SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory
Vision Transformers Are Robust Learners
Memory-Based Jitter: Improving Visual Recognition on Long-Tailed Data with Diversity in Memory
Debiased Batch Normalization via Gaussian Process for Generalizable Person Re-identification
Parallel and High-Fidelity Text-to-Lip Generation
SiamTrans: Zero-Shot Multi-Frame Image Restoration with Pre-trained Siamese Transformers
Single-Domain Generalization in Medical Image Segmentation via Test-Time Adaptation from Shape Dictionary
Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints
OVIS: Open-Vocabulary Visual Instance Search via Visual-Semantic Aligned Representation Learning
Feature Generation and Hypothesis Verification for Reliable Face Anti-spoofing
Image-Adaptive YOLO for Object Detection in Adverse Weather Conditions
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection
Highlighting Object Category Immunity for the Generalization of Human-Object Interaction Detection
DMN4: Few-Shot Learning via Discriminative Mutual Nearest Neighbor Neural Network
Multi-Knowledge Aggregation and Transfer for Semantic Segmentation
Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency
Task-Customized Self-Supervised Pre-training with Scalable Dynamic Routing
Pose Guided Image Generation from Misaligned Sources via Residual Flow Based Correction
PMAL: Open Set Recognition via Robust Prototype Mining
Barely-Supervised Learning: Semi-supervised Learning with Very Few Labeled Images
Learning Optical Flow with Adaptive Graph Reasoning
Deconfounding Physical Dynamics with Global Causal Relation and Confounder Transmission for Counterfactual Prediction
One More Check: Making “Fake Background” Be Tracked Again
Semantically Contrastive Learning for Low-Light Image Enhancement
Self-Supervised Spatiotemporal Representation Learning by Exploiting Video Continuity
Inharmonious Region Localization by Magnifying Domain Discrepancy
Distribution Aware VoteNet for 3D Object Detection
Contrastive Instruction-Trajectory Learning for Vision-Language Navigation
Interventional Multi-Instance Learning with Deconfounded Instance-Level Prediction
A Causal Debiasing Framework for Unsupervised Salient Object Detection
A Causal Inference Look at Unsupervised Video Anomaly Detection
Unpaired Multi-Domain Stain Transfer for Kidney Histopathological Images
Dynamic Spatial Propagation Network for Depth Completion
Local Similarity Pattern and Cost Self-Reassembling for Deep Stereo Matching Networks
FedFR: Joint Optimization Federated Framework for Generic and Personalized Face Recognition
Memory-Guided Semantic Learning Network for Temporal Sentence Grounding
Exploring Motion and Appearance Information for Temporal Sentence Grounding
Unsupervised Temporal Video Grounding with Deep Semantic Clustering
SpikeConverter: An Efficient Conversion Framework Zipping the Gap between Artificial Neural Networks and Spiking Neural Networks
Perceiving Stroke-Semantic Context: Hierarchical Contrastive Learning for Robust Scene Text Recognition
AnchorFace: Boosting TAR@FAR for Practical Face Recognition
Logit Perturbation
Neighborhood-Adaptive Structure Augmented Metric Learning
Stereo Neural Vernier Caliper
EditVAE: Unsupervised Parts-Aware Controllable 3D Point Cloud Shape Generation
Self-Training Multi-Sequence Learning with Transformer for Weakly Supervised Video Anomaly Detection
TA2N: Two-Stage Action Alignment Network for Few-Shot Action Recognition
Best-Buddy GANs for Highly Detailed Image Super-resolution
SCAN: Cross Domain Object Detection with Semantic Conditioned Adaptation
Hybrid Instance-Aware Temporal Fusion for Online Video Instance Segmentation
Close the Loop: A Unified Bottom-Up and Top-Down Paradigm for Joint Image Deraining and Segmentation
Uncertainty Estimation via Response Scaling for Pseudo-Mask Noise Mitigation in Weakly-Supervised Semantic Segmentation
Multi-Modal Perception Attention Network with Self-Supervised Learning for Audio-Visual Speaker Tracking
Defending against Model Stealing via Verifying Embedded External Features
Towards an Effective Orthogonal Dictionary Convolution Strategy
ELMA: Energy-Based Learning for Multi-Agent Activity Forecasting
Equal Bits: Enforcing Equally Distributed Binary Network Weights
SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-training for Spatial-Aware Visual Representations
Improving Human-Object Interaction Detection via Phrase Learning and Label Composition
Rethinking the Optimization of Average Precision: Only Penalizing Negative Instances before Positive Ones Is Enough
Reliability Exploration with Self-Ensemble Learning for Domain Adaptive Person Re-identification
Amplitude Spectrum Transformation for Open Compound Domain Adaptive Semantic Segmentation
Siamese Network with Interactive Transformer for Video Object Segmentation
Adversarial Attack for Asynchronous Event-Based Data
Iteratively Selecting an Easy Reference Frame Makes Unsupervised Video Object Segmentation Easier
SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation
Shrinking Temporal Attention in Transformers for Video Action Recognition
DanceFormer: Music Conditioned 3D Dance Generation with Parametric Motion Transformer
Interpretable Generative Adversarial Networks
Cross-Modal Object Tracking: Modality-Aware Representations and a Unified Benchmark
You Only Infer Once: Cross-Modal Meta-Transfer for Referring Video Object Segmentation
Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-Guided Feature Imitation
Rethinking Pseudo Labels for Semi-supervised Object Detection
Action-Aware Embedding Enhancement for Image-Text Retrieval
Retinomorphic Object Detection in Asynchronous Visual Streams
Learning from Weakly-Labeled Web Videos via Exploring Sub-concepts
Learning Universal Adversarial Perturbation by Adversarial Example