No. 3: AAAI-22 Technical Tracks 3
AAAI Technical Track on Computer Vision III
Dual Decoupling Training for Semi-supervised Object Detection with Noise-Bypass Head
SCALoss: Side and Corner Aligned Loss for Bounding Box Regression
SepFusion: Finding Optimal Fusion Structures for Visual Sound Separation
Pan-Sharpening with Customized Transformer and Invertible Neural Network
Promoting Single-Modal Optical Flow Network for Diverse Cross-Modal Flow Estimation
Edge-Aware Guidance Fusion Network for RGB–Thermal Scene Parsing
TiGAN: Text-Based Interactive Image Generation and Manipulation
Cross-Domain Empirical Risk Minimization for Unbiased Long-Tailed Classification
Deep Recurrent Neural Network with Multi-Scale Bi-directional Propagation for Video Deblurring
I Can Find You! Boundary-Guided Separated Attention Network for Camouflaged Object Detection
MoCaNet: Motion Retargeting In-the-Wild via Canonicalization Networks
Robust Depth Completion with Uncertainty-Driven Loss Functions
Efficient Model-Driven Network for Shadow Removal
Learning Disentangled Classification and Localization Representations for Temporal Action Localization
ACDNet: Adaptively Combined Dilated Convolution for Monocular Panorama Depth Estimation
Making Adversarial Examples More Transferable and Indistinguishable
Class Guided Channel Weighting Network for Fine-Grained Semantic Segmentation
Context-Based Contrastive Learning for Scene Text Recognition
Learning Network Architecture for Open-Set Recognition
An Adversarial Framework for Generating Unseen Images by Activation Maximization
Contrastive Spatio-Temporal Pretext Learning for Self-Supervised Video Representation
Pose-Invariant Face Recognition via Adaptive Angular Distillation
End-to-End Learning the Partial Permutation Matrix for Robust 3D Point Cloud Registration
PetsGAN: Rethinking Priors for Single Image Generation
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding
OA-FSUI2IT: A Novel Few-Shot Cross Domain Object Detection Framework with Object-Aware Few-Shot Unsupervised Image-to-Image Translation
Static-Dynamic Co-teaching for Class-Incremental 3D Object Detection
Local Surface Descriptor for Geometry and Feature Preserved Mesh Denoising
Boosting Generative Zero-Shot Learning by Synthesizing Diverse Features with Attribute Augmentation
Self-Supervised Pretraining for RGB-D Salient Object Detection
Adaptive Logit Adjustment Loss for Long-Tailed Visual Recognition
CADRE: A Cascade Deep Reinforcement Learning Framework for Vision-Based Autonomous Urban Driving
Learning from the Tangram to Solve Mini Visual Tasks
Handling Slice Permutations Variability in Tensor Recovery
Boosting Contrastive Learning with Relation Knowledge Distillation
Weakly Supervised Video Moment Localization with Contrastive Negative Sample Mining
Self-Labeling Framework for Novel Category Discovery over Domains
Efficient Compact Bilinear Pooling via Kronecker Product
Hybrid Graph Neural Networks for Few-Shot Learning
SOIT: Segmenting Objects with Instance-Aware Transformers
MSML: Enhancing Occlusion-Robustness by Multi-Scale Segmentation-Based Mask Learning for Face Recognition
Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics
Task-Level Self-Supervision for Cross-Domain Few-Shot Learning
Improving 360 Monocular Depth Estimation via Non-local Dense Prediction Transformer and Joint Supervised and Self-Supervised Learning
Homography Decomposition Networks for Planar Object Tracking
Patch Diffusion: A General Module for Face Manipulation Detection
Semi-supervised Object Detection with Adaptive Class-Rebalancing Self-Training
Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching
SCSNet: An Efficient Paradigm for Learning Simultaneously Image Colorization and Super-resolution
Energy-Based Generative Cooperative Saliency Prediction
Attention-Based Transformation from Latent Features to Point Clouds
Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning
LGD: Label-Guided Self-Distillation for Object Detection
Uncertainty Modeling with Second-Order Transformer for Group Re-identification
Deep Spatial Adaptive Network for Real Image Demosaicing
MAGIC: Multimodal relAtional Graph adversarIal inferenCe for Diverse and Unpaired Text-Based Image Captioning
Clinical-BERT: Vision-Language Pre-training for Radiograph Diagnosis and Reports Generation
Inferring Prototypes for Multi-Label Few-Shot Image Classification with Word Vector Guided Attention
Unsupervised Domain Adaptive Salient Object Detection through Uncertainty-Aware Pseudo-Label Learning
Transmission-Guided Bayesian Generative Model for Smoke Segmentation
Cross-Species 3D Face Morphing via Alignment-Aware Controller
Exploring Visual Context for Weakly Supervised Person Search
Cross-Modal Mutual Learning for Audio-Visual Speech Recognition and Manipulation
Mutual Contrastive Learning for Visual Representation Learning
Temporal Action Proposal Generation with Background Constraint
Cross-Modal Federated Human Activity Recognition via Modality-Agnostic and Modality-Specific Representation Learning
Polygon-to-Polygon Distance Loss for Rotated Object Detection
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
ACGNet: Action Complement Graph Network for Weakly-Supervised Temporal Action Localization
Enhancing Pseudo Label Quality for Semi-supervised Domain-Generalized Medical Image Segmentation
Image Difference Captioning with Pre-training and Contrastive Learning
Safe Distillation Box
Joint Deep Multi-Graph Matching and 3D Geometry Learning from Inhomogeneous 2D Image Collections
Content-Variant Reference Image Quality Assessment via Knowledge Distillation
Width & Depth Pruning for Vision Transformers
Anisotropic Fourier Features for Neural Image-Based Rendering and Relighting
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering
AdaptivePose: Human Parts as Adaptive Points
Learning Quality-Aware Representation for Multi-Person Pose Regression
Attribute-Based Progressive Fusion Network for RGBT Tracking
Detailed Facial Geometry Recovery from Multi-View Images by Learning an Implicit Function
FINet: Dual Branches Feature Interaction for Partial-to-Partial Point Cloud Registration
Rendering-Aware HDR Environment Map Prediction from a Single Image
Topology-Aware Convolutional Neural Network for Efficient Skeleton-Based Action Recognition
Transcoded Video Restoration by Temporal Spatial Auxiliary Network
DIRL: Domain-Invariant Representation Learning for Generalizable Semantic Segmentation
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection
Domain Disentangled Generative Adversarial Network for Zero-Shot Sketch-Based 3D Shape Retrieval
Dual Attention Networks for Few-Shot Fine-Grained Recognition
Sparse Cross-Scale Attention Network for Efficient LiDAR Panoptic Segmentation
Towards Fully Sparse Training: Information Restoration with Spatial Similarity
Hierarchical Image Generation via Transformer-Based Sequential Patch Selection
Reliable Propagation-Correction Modulation for Video Object Segmentation
Adaptive Hypergraph Neural Network for Multi-Person Pose Estimation
Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer
MobileFaceSwap: A Lightweight Framework for Video Face Swapping
Texture Reformer: Towards Fast and Universal Interactive Texture Transfer
Interact, Embed, and EnlargE: Boosting Modality-Specific Representations for Multi-Modal Person Re-identification
Can Semantic Labels Assist Self-Supervised Visual Representation Learning?
Rethinking the Two-Stage Framework for Grounded Situation Recognition
Boosting the Transferability of Video Adversarial Examples via Temporal Translation
Towards Transferable Adversarial Attacks on Vision Transformers
L-CoDe: Language-Based Colorization Using Color-Object Decoupled Conditions
Neural Interferometry: Image Reconstruction from Astronomical Interferometers Using Transformer-Conditioned Neural Fields
TDv2: A Novel Tree-Structured Decoder for Offline Mathematical Expression Recognition
Learning Token-Based Representation for Image Retrieval
Multi-Modal Answer Validation for Knowledge-Based VQA
Neighborhood Consensus Contrastive Learning for Backward-Compatible Representation
Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention
Style Mixing and Patchwise Prototypical Matching for One-Shot Unsupervised Domain Adaptive Semantic Segmentation
Multi-Centroid Representation Network for Domain Adaptive Person Re-ID
Efficient Non-local Contrastive Attention for Image Super-resolution
Coarse-to-Fine Embedded PatchMatch and Multi-Scale Dynamic Aggregation for Reference-Based Super-resolution
Cross-Domain Collaborative Normalization via Structural Knowledge
ReMoNet: Recurrent Multi-Output Network for Efficient Video Denoising
Transfer Learning from Synthetic to Real LiDAR Point Cloud for Semantic Segmentation
UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-Wise Perspective with Transformer
Renovate Yourself: Calibrating Feature Representation of Misclassified Pixels for Semantic Segmentation
Separated Contrastive Learning for Organ-at-Risk and Gross-Tumor-Volume Segmentation with Limited Annotation
Contrastive Quantization with Code Memory for Unsupervised Image Retrieval
Learning Temporally and Semantically Consistent Unpaired Video-to-Video Translation through Pseudo-Supervision from Synthetic Optical Flow
Cross-Dataset Collaborative Learning for Semantic Segmentation in Autonomous Driving
Scaled ReLU Matters for Training Vision Transformers
CQA-Face: Contrastive Quality-Aware Attentions for Face Recognition
Category-Specific Nuance Exploration Network for Fine-Grained Object Retrieval
Detail-Preserving Transformer for Light Field Image Super-resolution
One-Shot Talking Face Generation from Single-Speaker Audio-Visual Correlation Learning
Pose-Guided Feature Disentangling for Occluded Person Re-identification Based on Transformer
FFNet: Frequency Fusion Network for Semantic Scene Completion
Privacy-Preserving Face Recognition in the Frequency Domain
Anchor DETR: Query Design for Transformer-Based Detector
Panini-Net: GAN Prior Based Degradation-Aware Feature Interpolation for Face Restoration
End-to-End Transformer Based Model for Image Captioning
Learning to Detect 3D Facial Landmarks via Heatmap Regression with Graph Convolutional Network
Low-Light Image Enhancement with Normalizing Flow
Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding