ECVA

ECCV Conference Papers


Learning Depth from Focus in the Wild
Changyeon Won, Hae-Gon Jeon
[pdf]
[DOI]

Learning-Based Point Cloud Registration for 6D Object Pose Estimation in the Real World
Zheng Dang, Lizhou Wang, Yu Guo, Mathieu Salzmann
[pdf]
[DOI]

An End-to-End Transformer Model for Crowd Localization
Dingkang Liang, Wei Xu, Xiang Bai
[pdf]
[DOI]

Few-Shot Single-View 3D Reconstruction with Memory Prior Contrastive Network
Zhen Xing, Yijiang Chen, Zhixin Ling, Xiangdong Zhou, Yu Xiang
[pdf]
[DOI]

DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection
Liang Peng, Xiaopei Wu, Zheng Yang, Haifeng Liu, Deng Cai
[pdf]
[DOI]

Adaptive Co-Teaching for Unsupervised Monocular Depth Estimation
Weisong Ren, Lijun Wang, Yongri Piao, Miao Zhang, Huchuan Lu, Ting Liu
[pdf]
[DOI]

Fusing Local Similarities for Retrieval-Based 3D Orientation Estimation of Unseen Objects
Chen Zhao, Yinlin Hu, Mathieu Salzmann
[pdf]
[DOI]

Lidar Point Cloud Guided Monocular 3D Object Detection
Liang Peng, Fei Liu, Zhengxu Yu, Senbo Yan, Dan Deng, Zheng Yang, Haifeng Liu, Deng Cai
[pdf]
[DOI]

Structural Causal 3D Reconstruction
Weiyang Liu, Zhen Liu, Liam Paull, Adrian Weller, Bernhard Schölkopf
[pdf]
[DOI]

3D Human Pose Estimation Using Möbius Graph Convolutional Networks
Niloofar Azizi, Horst Possegger, Emanuele Rodolà, Horst Bischof
[pdf]
[DOI]

Learning to Train a Point Cloud Reconstruction Network without Matching
Tianxin Huang, Xuemeng Yang, Jiangning Zhang, Jinhao Cui, Hao Zou, Jun Chen, Xiangrui Zhao, Yong Liu
[pdf]
[DOI]

PanoFormer: Panorama Transformer for Indoor 360° Depth Estimation
Zhijie Shen, Chunyu Lin, Kang Liao, Lang Nie, Zishuo Zheng, Yao Zhao
[pdf]
[DOI]

Self-supervised Human Mesh Recovery with Cross-Representation Alignment
Xuan Gong, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, David Doermann, Ziyan Wu
[pdf]
[DOI]

AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction
Zerui Chen, Yana Hasson, Cordelia Schmid, Ivan Laptev
[pdf]
[DOI]

A Reliable Online Method for Joint Estimation of Focal Length and Camera Rotation
Yiming Qian, James H. Elder
[pdf]
[DOI]

PS-NeRF: Neural Inverse Rendering for Multi-View Photometric Stereo
Wenqi Yang, Guanying Chen, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong
[pdf]
[DOI]

Share with Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency
Tom Monnier, Matthew Fisher, Alexei A. Efros, Mathieu Aubry
[pdf]
[DOI]

Towards Comprehensive Representation Enhancement in Semantics-Guided Self-Supervised Monocular Depth Estimation
Jingyuan Ma, Xiangyu Lei, Nan Liu, Xian Zhao, Shiliang Pu
[pdf]
[DOI]

AvatarCap: Animatable Avatar Conditioned Monocular Human Volumetric Capture
Zhe Li, Zerong Zheng, Hongwen Zhang, Chaonan Ji, Yebin Liu
[pdf]
[DOI]

Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers
Junhyeong Cho, Kim Youwang, Tae-Hyun Oh
[pdf]
[DOI]

GeoRefine: Self-Supervised Online Depth Refinement for Accurate Dense Mapping
Pan Ji, Qingan Yan, Yuxin Ma, Yi Xu
[pdf]
[DOI]

Multi-modal Masked Pre-training for Monocular Panoramic Depth Completion
Zhiqiang Yan, Xiang Li, Kun Wang, Zhenyu Zhang, Jun Li, Jian Yang
[pdf]
[DOI]

GitNet: Geometric Prior-Based Transformation for Birds-Eye-View Segmentation
Shi Gong, Xiaoqing Ye, Xiao Tan, Jingdong Wang, Errui Ding, Yu Zhou, Xiang Bai
[pdf]
[DOI]

Learning Visibility for Robust Dense Human Body Estimation
Chun-Han Yao, Jimei Yang, Duygu Ceylan, Yi Zhou, Yang Zhou, Ming-Hsuan Yang
[pdf]
[DOI]

Towards High-Fidelity Single-View Holistic Reconstruction of Indoor Scenes
Haolin Liu, Yujian Zheng, Guanying Chen, Shuguang Cui, Xiaoguang Han
[pdf]
[DOI]

CompNVS: Novel View Synthesis with Scene Completion
Zuoyue Li, Tianxing Fan, Zhenqiang Li, Zhaopeng Cui, Yoichi Sato, Marc Pollefeys, Martin R. Oswald
[pdf]
[DOI]

SketchSampler: Sketch-Based 3D Reconstruction via View-Dependent Depth Sampling
Chenjian Gao, Qian Yu, Lu Sheng, Yi-Zhe Song, Dong Xu
[pdf]
[DOI]

LocalBins: Improving Depth Estimation by Learning Local Distributions
Shariq Farooq Bhat, Ibraheem Alhashim, Peter Wonka
[pdf]
[DOI]

2D GANs Meet Unsupervised Single-View 3D Reconstruction
Feng Liu, Xiaoming Liu
[pdf]
[DOI]

InfiniteNature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images
Zhengqi Li, Qianqian Wang, Noah Snavely, Angjoo Kanazawa
[pdf]
[DOI]

Semi-Supervised Single-View 3D Reconstruction via Prototype Shape Priors
Zhen Xing, Hengduo Li, Zuxuan Wu, Yu-Gang Jiang
[pdf]
[DOI]

Bilateral Normal Integration
Xu Cao, Hiroaki Santo, Boxin Shi, Fumio Okura, Yasuyuki Matsushita
[pdf]
[DOI]

S2Contact: Graph-Based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning
Tze Ho Elden Tse, Zhongqun Zhang, Kwang In Kim, Aleš Leonardis, Feng Zheng, Hyung Jin Chang
[pdf]
[DOI]

SC-wLS: Towards Interpretable Feed-Forward Camera Re-localization
Xin Wu, Hao Zhao, Shunkai Li, Yingdian Cao, Hongbin Zha
[pdf]
[DOI]

FloatingFusion: Depth from ToF and Image-Stabilized Stereo Cameras
Andreas Meuleman, Hakyeong Kim, James Tompkin, Min H. Kim
[pdf]
[DOI]

DELTAR: Depth Estimation from a Light-Weight ToF Sensor and RGB Image
Yijin Li, Xinyang Liu, Wenqi Dong, Han Zhou, Hujun Bao, Guofeng Zhang, Yinda Zhang, Zhaopeng Cui
[pdf]
[DOI]

3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform
Yining Zhao, Chao Wen, Zhou Xue, Yue Gao
[pdf]
[DOI]

RBP-Pose: Residual Bounding Box Projection for Category-Level Pose Estimation
Ruida Zhang, Yan Di, Zhiqiang Lou, Fabian Manhardt, Federico Tombari, Xiangyang Ji
[pdf]
[DOI]

Monocular 3D Object Reconstruction with GAN Inversion
Junzhe Zhang, Daxuan Ren, Zhongang Cai, Chai Kiat Yeo, Bo Dai, Chen Change Loy
[pdf]
[DOI]

Map-Free Visual Relocalization: Metric Pose Relative to a Single Image
Eduardo Arnold, Jamie Wynn, Sara Vicente, Guillermo Garcia-Hernando, Aron Monszpart, Victor Prisacariu, Daniyar Turmukhambetov, Eric Brachmann
[pdf]
[DOI]

Self-Distilled Feature Aggregation for Self-Supervised Monocular Depth Estimation
Zhengming Zhou, Qiulei Dong
[pdf]
[DOI]

Planes vs. Chairs: Category-Guided 3D Shape Learning without Any 3D Cues
Zixuan Huang, Stefan Stojanov, Anh Thai, Varun Jampani, James M. Rehg
[pdf]
[DOI]

MHR-Net: Multiple-Hypothesis Reconstruction of Non-rigid Shapes from 2D Views
Haitian Zeng, Xin Yu, Jiaxu Miao, Yi Yang
[pdf]
[DOI]

Depth Map Decomposition for Monocular Depth Estimation
Jinyoung Jun, Jae-Han Lee, Chul Lee, Chang-Su Kim
[pdf]
[DOI]

Monitored Distillation for Positive Congruent Depth Completion
Tian Yu Liu, Parth Agrawal, Allison Chen, Byung-Woo Hong, Alex Wong
[pdf]
[DOI]

Resolution-Free Point Cloud Sampling Network with Data Distillation
Tianxin Huang, Jiangning Zhang, Jun Chen, Yuang Liu, Yong Liu
[pdf]
[DOI]

Organic Priors in Non-rigid Structure from Motion
Suryansh Kumar, Luc Van Gool
[pdf]
[DOI]

Perspective Flow Aggregation for Data-Limited 6D Object Pose Estimation
Yinlin Hu, Pascal Fua, Mathieu Salzmann
[pdf]
[DOI]

DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks
Shih-Yang Su, Timur Bagautdinov, Helge Rhodin
[pdf]
[DOI]

CHORE: Contact, Human and Object REconstruction from a Single RGB Image
Xianghui Xie, Bharat Lal Bhatnagar, Gerard Pons-Moll
[pdf]
[DOI]

Learned Vertex Descent: A New Direction for 3D Human Model Fitting
Enric Corona, Gerard Pons-Moll, Guillem Alenyà, Francesc Moreno-Noguer
[pdf]
[DOI]

Self-Calibrating Photometric Stereo by Neural Inverse Rendering
Junxuan Li, Hongdong Li
[pdf]
[DOI]

3D Clothed Human Reconstruction in the Wild
Gyeongsik Moon, Hyeongjin Nam, Takaaki Shiratori, Kyoung Mu Lee
[pdf]
[DOI]

Directed Ray Distance Functions for 3D Scene Reconstruction
Nilesh Kulkarni, Justin Johnson, David F. Fouhey
[pdf]
[DOI]

Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation from Monocular RGB Image
Zhaoxin Fan, Zhenbo Song, Jian Xu, Zhicheng Wang, Kejian Wu, Hongyan Liu, Jun He
[pdf]
[DOI]

Uncertainty Quantification in Depth Estimation via Constrained Ordinal Regression
Dongting Hu, Liuhua Peng, Tingjin Chu, Xiaoxing Zhang, Yinian Mao, Howard Bondell, Mingming Gong
[pdf]
[DOI]

CostDCNet: Cost Volume Based Depth Completion for a Single RGB-D Image
Jaewon Kam, Jungeon Kim, Soongjin Kim, Jaesik Park, Seungyong Lee
[pdf]
[DOI]

ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization
Muhammad Zubair Irshad, Sergey Zakharov, Rareș Ambruș, Thomas Kollar, Zsolt Kira, Adrien Gaidon
[pdf]
[DOI]

3D Siamese Transformer Network for Single Object Tracking on Point Clouds
Le Hui, Lingpeng Wang, Linghua Tang, Kaihao Lan, Jin Xie, Jian Yang
[pdf]
[DOI]

Object Wake-Up: 3D Object Rigging from a Single Image
Ji Yang, Xinxin Zuo, Sen Wang, Zhenbo Yu, Xingyu Li, Bingbing Ni, Minglun Gong, Li Cheng
[pdf]
[DOI]

IntegratedPIFu: Integrated Pixel Aligned Implicit Function for Single-View Human Reconstruction
Kennard Yanting Chan, Guosheng Lin, Haiyu Zhao, Weisi Lin
[pdf]
[DOI]

Realistic One-Shot Mesh-Based Head Avatars
Taras Khakhulin, Vanessa Sklyarova, Victor Lempitsky, Egor Zakharov
[pdf]
[DOI]

A Kendall Shape Space Approach to 3D Shape Estimation from 2D Landmarks
Martha Paskin, Daniel Baum, Mason N. Dean, Christoph von Tycowicz
[pdf]
[DOI]

Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion
Zian Wang, Wenzheng Chen, David Acuna, Jan Kautz, Sanja Fidler
[pdf]
[DOI]

Perspective Phase Angle Model for Polarimetric 3D Reconstruction
Guangcheng Chen, Li He, Yisheng Guan, Hong Zhang
[pdf]
[DOI]

DeepShadow: Neural Shape from Shadow
Asaf Karnieli, Ohad Fried, Yacov Hel-Or
[pdf]
[DOI]

Camera Auto-Calibration from the Steiner Conic of the Fundamental Matrix
Yu Liu, Hui Zhang
[pdf]
[DOI]

Super-Resolution 3D Human Shape from a Single Low-Resolution Image
Marco Pesavento, Marco Volino, Adrian Hilton
[pdf]
[DOI]

Minimal Neural Atlas: Parameterizing Complex Surfaces with Minimal Charts and Distortion
Weng Fei Low, Gim Hee Lee
[pdf]
[DOI]

ExtrudeNet: Unsupervised Inverse Sketch-and-Extrude for Shape Parsing
Daxuan Ren, Jianmin Zheng, Jianfei Cai, Jiatong Li, Junzhe Zhang
[pdf]
[DOI]

CATRE: Iterative Point Clouds Alignment for Category-Level Object Pose Refinement
Xingyu Liu, Gu Wang, Yi Li, Xiangyang Ji
[pdf]
[DOI]

Optimization over Disentangled Encoding: Unsupervised Cross-Domain Point Cloud Completion via Occlusion Factor Manipulation
Jingyu Gong, Fengqi Liu, Jiachen Xu, Min Wang, Xin Tan, Zhizhong Zhang, Ran Yi, Haichuan Song, Yuan Xie, Lizhuang Ma
[pdf]
[DOI]

Unsupervised Learning of 3D Semantic Keypoints with Mutual Reconstruction
Haocheng Yuan, Chen Zhao, Shichao Fan, Jiaxi Jiang, Jiaqi Yang
[pdf]
[DOI]

MvDeCor: Multi-View Dense Correspondence Learning for Fine-Grained 3D Segmentation
Gopal Sharma, Kangxue Yin, Subhransu Maji, Evangelos Kalogerakis, Or Litany, Sanja Fidler
[pdf]
[DOI]

SUPR: A Sparse Unified Part-Based Human Representation
Ahmed A. A. Osman, Timo Bolkart, Dimitrios Tzionas, Michael J. Black
[pdf]
[DOI]

Revisiting Point Cloud Simplification: A Learnable Feature Preserving Approach
Rolandos Alexandros Potamias, Giorgos Bouritsas, Stefanos Zafeiriou
[pdf]
[DOI]

Masked Autoencoders for Point Cloud Self-Supervised Learning
Yatian Pang, Wenxiao Wang, Francis E.H. Tay, Wei Liu, Yonghong Tian, Li Yuan
[pdf]
[DOI]

Intrinsic Neural Fields: Learning Functions on Manifolds
Lukas Koestler, Daniel Grittner, Michael Moeller, Daniel Cremers, Zorah Lähner
[pdf]
[DOI]

Skeleton-Free Pose Transfer for Stylized 3D Characters
Zhouyingcheng Liao, Jimei Yang, Jun Saito, Gerard Pons-Moll, Yang Zhou
[pdf]
[DOI]

Masked Discrimination for Self-Supervised Learning on Point Clouds
Haotian Liu, Mu Cai, Yong Jae Lee
[pdf]
[DOI]

FBNet: Feedback Network for Point Cloud Completion
Xuejun Yan, Hongyu Yan, Jingjing Wang, Hang Du, Zhihong Wu, Di Xie, Shiliang Pu, Li Lu
[pdf]
[DOI]

Meta-Sampler: Almost-Universal yet Task-Oriented Sampling for Point Clouds
Ta-Ying Cheng, Qingyong Hu, Qian Xie, Niki Trigoni, Andrew Markham
[pdf]
[DOI]

A Level Set Theory for Neural Implicit Evolution under Explicit Flows
Ishit Mehta, Manmohan Chandraker, Ravi Ramamoorthi
[pdf]
[DOI]

Efficient Point Cloud Analysis Using Hilbert Curve
Wanli Chen, Xinge Zhu, Guojin Chen, Bei Yu
[pdf]
[DOI]

TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement
Keyang Zhou, Bharat Lal Bhatnagar, Jan Eric Lenssen, Gerard Pons-Moll
[pdf]
[DOI]

LaTeRF: Label and Text Driven Object Radiance Fields
Ashkan Mirzaei, Yash Kant, Jonathan Kelly, Igor Gilitschenski
[pdf]
[DOI]

MeshMAE: Masked Autoencoders for 3D Mesh Data Analysis
Yaqian Liang, Shanshan Zhao, Baosheng Yu, Jing Zhang, Fazhi He
[pdf]
[DOI]

Unsupervised Deep Multi-Shape Matching
Dongliang Cao, Florian Bernard
[pdf]
[DOI]

Texturify: Generating Textures on 3D Shape Surfaces
Yawar Siddiqui, Justus Thies, Fangchang Ma, Qi Shan, Matthias Nießner, Angela Dai
[pdf]
[DOI]

Autoregressive 3D Shape Generation via Canonical Mapping
An-Chieh Cheng, Xueting Li, Sifei Liu, Min Sun, Ming-Hsuan Yang
[pdf]
[DOI]

PointTree: Transformation-Robust Point Cloud Encoder with Relaxed K-D Trees
Jun-Kun Chen, Yu-Xiong Wang
[pdf]
[DOI]

UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation
Shenhan Qian, Jiale Xu, Ziwei Liu, Liqian Ma, Shenghua Gao
[pdf]
[DOI]

PRIF: Primary Ray-Based Implicit Function
Brandon Y. Feng, Yinda Zhang, Danhang Tang, Ruofei Du, Amitabh Varshney
[pdf]
[DOI]

Point Cloud Domain Adaptation via Masked Local 3D Structure Prediction
Hanxue Liang, Hehe Fan, Zhiwen Fan, Yi Wang, Tianlong Chen, Yu Cheng, Zhangyang Wang
[pdf]
[DOI]

CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes
Kim Youwang, Kim Ji-Yeon, Tae-Hyun Oh
[pdf]
[DOI]

PlaneFormers: From Sparse View Planes to 3D Reconstruction
Samir Agarwala, Linyi Jin, Chris Rockwell, David F. Fouhey
[pdf]
[DOI]

Learning Implicit Templates for Point-Based Clothed Human Modeling
Siyou Lin, Hongwen Zhang, Zerong Zheng, Ruizhi Shao, Yebin Liu
[pdf]
[DOI]

Exploring the Devil in Graph Spectral Domain for 3D Point Cloud Attacks
Qianjiang Hu, Daizong Liu, Wei Hu
[pdf]
[DOI]

Structure-Aware Editable Morphable Model for 3D Facial Detail Animation and Manipulation
Jingwang Ling, Zhibo Wang, Ming Lu, Quan Wang, Chen Qian, Feng Xu
[pdf]
[DOI]

MoFaNeRF: Morphable Facial Neural Radiance Field
Yiyu Zhuang, Hao Zhu, Xusen Sun, Xun Cao
[pdf]
[DOI]

PointInst3D: Segmenting 3D Instances by Points
Tong He, Wei Yin, Chunhua Shen, Anton van den Hengel
[pdf]
[DOI]

Cross-Modal 3D Shape Generation and Manipulation
Zezhou Cheng, Menglei Chai, Jian Ren, Hsin-Ying Lee, Kyle Olszewski, Zeng Huang, Subhransu Maji, Sergey Tulyakov
[pdf]
[DOI]

Latent Partition Implicit with Surface Codes for 3D Representation
Chao Chen, Yu-Shen Liu, Zhizhong Han
[pdf]
[DOI]

Implicit Field Supervision for Robust Non-rigid Shape Matching
Ramana Sundararaman, Gautam Pai, Maks Ovsjanikov
[pdf]
[DOI]

Learning Self-Prior for Mesh Denoising Using Dual Graph Convolutional Networks
Shota Hattori, Tatsuya Yatagawa, Yutaka Ohtake, Hiromasa Suzuki
[pdf]
[DOI]

diffConv: Analyzing Irregular Point Clouds with an Irregular View
Manxi Lin, Aasa Feragen
[pdf]
[DOI]

PD-Flow: A Point Cloud Denoising Framework with Normalizing Flows
Aihua Mao, Zihui Du, Yu-Hui Wen, Jun Xuan, Yong-Jin Liu
[pdf]
[DOI]

SeedFormer: Patch Seeds Based Point Cloud Completion with Upsample Transformer
Haoran Zhou, Yun Cao, Wenqing Chu, Junwei Zhu, Tong Lu, Ying Tai, Chengjie Wang
[pdf]
[DOI]

DeepMend: Learning Occupancy Functions to Represent Shape for Repair
Nikolas Lamb, Sean Banerjee, Natasha Kholgade Banerjee
[pdf]
[DOI]

A Repulsive Force Unit for Garment Collision Handling in Neural Networks
Qingyang Tan, Yi Zhou, Tuanfeng Wang, Duygu Ceylan, Xin Sun, Dinesh Manocha
[pdf]
[DOI]

Shape-Pose Disentanglement Using SE(3)-Equivariant Vector Neurons
Oren Katzir, Dani Lischinski, Daniel Cohen-Or
[pdf]
[DOI]

3D Equivariant Graph Implicit Functions
Yunlu Chen, Basura Fernando, Hakan Bilen, Matthias Nießner, Efstratios Gavves
[pdf]
[DOI]

PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation
Bo Sun, Vladimir G. Kim, Noam Aigerman, Qixing Huang, Siddhartha Chaudhuri
[pdf]
[DOI]

3D Shape Sequence of Human Comparison and Classification Using Current and Varifolds
Emery Pierson, Mohamed Daoudi, Sylvain Arguillere
[pdf]
[DOI]

Conditional-Flow NeRF: Accurate 3D Modelling with Reliable Uncertainty Quantification
Jianxiong Shen, Antonio Agudo, Francesc Moreno-Noguer, Adria Ruiz
[pdf]
[DOI]

Unsupervised Pose-Aware Part Decomposition for Man-Made Articulated Objects
Yuki Kawana, Yusuke Mukuta, Tatsuya Harada
[pdf]
[DOI]

MeshUDF: Fast and Differentiable Meshing of Unsigned Distance Field Networks
Benoît Guillard, Federico Stella, Pascal Fua
[pdf]
[DOI]

SPE-Net: Boosting Point Cloud Analysis via Rotation Robustness Enhancement
Zhaofan Qiu, Yehao Li, Yu Wang, Yingwei Pan, Ting Yao, Tao Mei
[pdf]
[DOI]

The Shape Part Slot Machine: Contact-Based Reasoning for Generating 3D Shapes from Parts
Kai Wang, Paul Guerrero, Vladimir G. Kim, Siddhartha Chaudhuri, Minhyuk Sung, Daniel Ritchie
[pdf]
[DOI]

Spatiotemporal Self-Attention Modeling with Temporal Patch Shift for Action Recognition
Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Xian-Sheng Hua, Lei Zhang
[pdf]
[DOI]

Proposal-Free Temporal Action Detection via Global Segmentation Mask Learning
Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
[pdf]
[DOI]

Semi-Supervised Temporal Action Detection with Proposal-Free Masking
Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
[pdf]
[DOI]

Zero-Shot Temporal Action Detection via Vision-Language Prompting
Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
[pdf]
[DOI]

CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video
Wei Lin, Anna Kukleva, Kunyang Sun, Horst Possegger, Hilde Kuehne, Horst Bischof
[pdf]
[DOI]

S2N: Suppression-Strengthen Network for Event-Based Recognition under Variant Illuminations
Zengyu Wan, Yang Wang, Ganchao Tan, Yang Cao, Zheng-Jun Zha
[pdf]
[DOI]

CMD: Self-Supervised 3D Action Representation Learning with Cross-Modal Mutual Distillation
Yunyao Mao, Wengang Zhou, Zhenbo Lu, Jiajun Deng, Houqiang Li
[pdf]
[DOI]

Expanding Language-Image Pretrained Models for General Video Recognition
Bolin Ni, Houwen Peng, Minghao Chen, Songyang Zhang, Gaofeng Meng, Jianlong Fu, Shiming Xiang, Haibin Ling
[pdf]
[DOI]

Hunting Group Clues with Transformers for Social Group Activity Recognition
Masato Tamura, Rahul Vishwakarma, Ravigopal Vennelakanti
[pdf]
[DOI]

Contrastive Positive Mining for Unsupervised 3D Action Representation Learning
Haoyuan Zhang, Yonghong Hou, Wenjing Zhang, Wanqing Li
[pdf]
[DOI]

Target-Absent Human Attention
Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Gregory Zelinsky, Minh Hoai, Dimitris Samaras
[pdf]
[DOI]

Uncertainty-Based Spatial-Temporal Attention for Online Action Detection
Hongji Guo, Zhou Ren, Yi Wu, Gang Hua, Qiang Ji
[pdf]
[DOI]

Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows
Danyang Tu, Xiongkuo Min, Huiyu Duan, Guodong Guo, Guangtao Zhai, Wei Shen
[pdf]
[DOI]

Rethinking Zero-Shot Action Recognition: Learning from Latent Atomic Actions
Yijun Qian, Lijun Yu, Wenhe Liu, Alexander G. Hauptmann
[pdf]
[DOI]

Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection
Xiaoqian Wu, Yong-Lu Li, Xinpeng Liu, Junyi Zhang, Yuzhe Wu, Cewu Lu
[pdf]
[DOI]

Collaborating Domain-Shared and Target-Specific Feature Clustering for Cross-Domain 3D Action Recognition
Qinying Liu, Zilei Wang
[pdf]
[DOI]

Is Appearance Free Action Recognition Possible?
Filip Ilic, Thomas Pock, Richard P. Wildes
[pdf]
[DOI]

Learning Spatial-Preserved Skeleton Representations for Few-Shot Action Recognition
Ning Ma, Hongyi Zhang, Xuhui Li, Sheng Zhou, Zhen Zhang, Jun Wen, Haifeng Li, Jingjun Gu, Jiajun Bu
[pdf]
[DOI]

Dual-Evidential Learning for Weakly-Supervised Temporal Action Localization
Mengyuan Chen, Junyu Gao, Shicai Yang, Changsheng Xu
[pdf]
[DOI]

Global-Local Motion Transformer for Unsupervised Skeleton-Based Action Learning
Boeun Kim, Hyung Jin Chang, Jungho Kim, Jin Young Choi
[pdf]
[DOI]

AdaFocusV3: On Unified Spatial-Temporal Dynamic Video Recognition
Yulin Wang, Yang Yue, Xinhong Xu, Ali Hassani, Victor Kulikov, Nikita Orlov, Shiji Song, Humphrey Shi, Gao Huang
[pdf]
[DOI]

Panoramic Human Activity Recognition
Ruize Han, Haomin Yan, Jiacheng Li, Songmiao Wang, Wei Feng, Song Wang
[pdf]
[DOI]

Delving into Details: Synopsis-to-Detail Networks for Video Recognition
Shuxian Liang, Xu Shen, Jianqiang Huang, Xian-Sheng Hua
[pdf]
[DOI]

A Generalized & Robust Framework for Timestamp Supervision in Temporal Action Segmentation
Rahul Rahaman, Dipika Singhania, Alexandre Thiery, Angela Yao
[pdf]
[DOI]

Few-Shot Action Recognition with Hierarchical Matching and Contrastive Learning
Sipeng Zheng, Shizhe Chen, Qin Jin
[pdf]
[DOI]

PrivHAR: Recognizing Human Actions from Privacy-Preserving Lens
Carlos Hinojosa, Miguel Marquez, Henry Arguello, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles
[pdf]
[DOI]

Scale-Aware Spatio-Temporal Relation Learning for Video Anomaly Detection
Guoqiu Li, Guanxiong Cai, Xingyu Zeng, Rui Zhao
[pdf]
[DOI]

Compound Prototype Matching for Few-Shot Action Recognition
Yifei Huang, Lijin Yang, Yoichi Sato
[pdf]
[DOI]

Continual 3D Convolutional Neural Networks for Real-Time Processing of Videos
Lukas Hedegaard, Alexandros Iosifidis
[pdf]
[DOI]

Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition
Tianjiao Li, Lin Geng Foo, Qiuhong Ke, Hossein Rahmani, Anran Wang, Jinghua Wang, Jun Liu
[pdf]
[DOI]

Dynamic Local Aggregation Network with Adaptive Clusterer for Anomaly Detection
Zhiwei Yang, Peng Wu, Jing Liu, Xiaotao Liu
[pdf]
[DOI]

Action Quality Assessment with Temporal Parsing Transformer
Yang Bai, Desen Zhou, Songyang Zhang, Jian Wang, Errui Ding, Yu Guan, Yang Long, Jingdong Wang
[pdf]
[DOI]

Entry-Flipped Transformer for Inference and Prediction of Participant Behavior
Bo Hu, Tat-Jen Cham
[pdf]
[DOI]

Pairwise Contrastive Learning Network for Action Quality Assessment
Mingzhe Li, Hong-Bo Zhang, Qing Lei, Zongwen Fan, Jinghua Liu, Ji-Xiang Du
[pdf]
[DOI]

Geometric Features Informed Multi-Person Human-Object Interaction Recognition in Videos
Tanqiu Qiao, Qianhui Men, Frederick W. B. Li, Yoshiki Kubotani, Shigeo Morishima, Hubert P. H. Shum
[pdf]
[DOI]

ActionFormer: Localizing Moments of Actions with Transformers
Chen-Lin Zhang, Jianxin Wu, Yin Li
[pdf]
[DOI]

SocialVAE: Human Trajectory Prediction Using Timewise Latents
Pei Xu, Jean-Bernard Hayet, Ioannis Karamouzas
[pdf]
[DOI]

Shape Matters: Deformable Patch Attack
Zhaoyu Chen, Bo Li, Shuang Wu, Jianghe Xu, Shouhong Ding, Wenqiang Zhang
[pdf]
[DOI]

Frequency Domain Model Augmentation for Adversarial Attack
Yuyang Long, Qilong Zhang, Boheng Zeng, Lianli Gao, Xianglong Liu, Jian Zhang, Jingkuan Song
[pdf]
[DOI]

Prior-Guided Adversarial Initialization for Fast Adversarial Training
Xiaojun Jia, Yong Zhang, Xingxing Wei, Baoyuan Wu, Ke Ma, Jue Wang, Xiaochun Cao
[pdf]
[DOI]

Enhanced Accuracy and Robustness via Multi-Teacher Adversarial Distillation
Shiji Zhao, Jie Yu, Zhenlong Sun, Bo Zhang, Xingxing Wei
[pdf]
[DOI]

LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity
Martin Gubri, Maxime Cordy, Mike Papadakis, Yves Le Traon, Koushik Sen
[pdf]
[DOI]

A Large-Scale Multiple-Objective Method for Black-Box Attack against Object Detection
Siyuan Liang, Longkang Li, Yanbo Fan, Xiaojun Jia, Jingzhi Li, Baoyuan Wu, Xiaochun Cao
[pdf]
[DOI]

GradAuto: Energy-Oriented Attack on Dynamic Neural Networks
Jianhong Pan, Qichen Zheng, Zhipeng Fan, Hossein Rahmani, Qiuhong Ke, Jun Liu
[pdf]
[DOI]

A Spectral View of Randomized Smoothing under Common Corruptions: Benchmarking and Improving Certified Robustness
Jiachen Sun, Akshay Mehra, Bhavya Kailkhura, Pin-Yu Chen, Dan Hendrycks, Jihun Hamm, Z. Morley Mao
[pdf]
[DOI]

Improving Adversarial Robustness of 3D Point Cloud Classification Models
Guanlin Li, Guowen Xu, Han Qiu, Ruan He, Jiwei Li, Tianwei Zhang
[pdf]
[DOI]

Learning Extremely Lightweight and Robust Model with Differentiable Constraints on Sparsity and Condition Number
Xian Wei, Yangyu Xu, Yanhui Huang, Hairong Lv, Hai Lan, Mingsong Chen, Xuan Tang
[pdf]
[DOI]

RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN
Huy Phan, Cong Shi, Yi Xie, Tianfang Zhang, Zhuohang Li, Tianming Zhao, Jian Liu, Yan Wang, Yingying Chen, Bo Yuan
[pdf]
[DOI]

Boosting Transferability of Targeted Adversarial Examples via Hierarchical Generative Networks
Xiao Yang, Yinpeng Dong, Tianyu Pang, Hang Su, Jun Zhu
[pdf]
[DOI]

Adaptive Image Transformations for Transfer-Based Adversarial Attack
Zheng Yuan, Jie Zhang, Shiguang Shan
[pdf]
[DOI]

Generative Multiplane Images: Making a 2D GAN 3D-Aware
Xiaoming Zhao, Fangchang Ma, David Güera, Zhile Ren, Alexander G. Schwing, Alex Colburn
[pdf]
[DOI]

AdvDO: Realistic Adversarial Attacks for Trajectory Prediction
Yulong Cao, Chaowei Xiao, Anima Anandkumar, Danfei Xu, Marco Pavone
[pdf]
[DOI]

Adversarial Contrastive Learning via Asymmetric InfoNCE
Qiying Yu, Jieming Lou, Xianyuan Zhan, Qizhang Li, Wangmeng Zuo, Yang Liu, Jingjing Liu
[pdf]
[DOI]

One Size Does NOT Fit All: Data-Adaptive Adversarial Training
Shuo Yang, Chang Xu
[pdf]
[DOI]

UniCR: Universally Approximated Certified Robustness via Randomized Smoothing
Hanbin Hong, Binghui Wang, Yuan Hong
[pdf]
[DOI]

Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips
Jiawang Bai, Kuofeng Gao, Dihong Gong, Shu-Tao Xia, Zhifeng Li, Wei Liu
[pdf]
[DOI]

Robust Network Architecture Search via Feature Distortion Restraining
Yaguan Qian, Shenghui Huang, Bin Wang, Xiang Ling, Xiaohui Guan, Zhaoquan Gu, Shaoning Zeng, Wujie Zhou, Haijiang Wang
[pdf]
[DOI]

SecretGen: Privacy Recovery on Pre-trained Models via Distribution Discrimination
Zhuowen Yuan, Fan Wu, Yunhui Long, Chaowei Xiao, Bo Li
[pdf]
[DOI]

Triangle Attack: A Query-Efficient Decision-Based Adversarial Attack
Xiaosen Wang, Zeliang Zhang, Kangheng Tong, Dihong Gong, Kun He, Zhifeng Li, Wei Liu
[pdf]
[DOI]

Data-Free Backdoor Removal Based on Channel Lipschitzness
Runkai Zheng, Rongjun Tang, Jianze Li, Li Liu
[pdf]
[DOI]

Black-Box Dissector: Towards Erasing-Based Hard-Label Model Stealing Attack
Yixu Wang, Jie Li, Hong Liu, Yan Wang, Yongjian Wu, Feiyue Huang, Rongrong Ji
[pdf]
[DOI]

Learning Energy-Based Models with Adversarial Training
Xuwang Yin, Shiying Li, Gustavo K. Rohde
[pdf]
[DOI]

Adversarial Label Poisoning Attack on Graph Neural Networks via Label Propagation
Ganlin Liu, Xiaowei Huang, Xinping Yi
[pdf]
[DOI]

Revisiting Outer Optimization in Adversarial Training
Ali Dabouei, Fariborz Taherkhani, Sobhan Soleymani, Nasser M. Nasrabadi
[pdf]
[DOI]

Zero-Shot Attribute Attacks on Fine-Grained Recognition Models
Nasim Shafiee, Ehsan Elhamifar
[pdf]
[DOI]

Towards Effective and Robust Neural Trojan Defenses via Input Filtering
Kien Do, Haripriya Harikumar, Hung Le, Dung Nguyen, Truyen Tran, Santu Rana, Dang Nguyen, Willy Susilo, Svetha Venkatesh
[pdf]
[DOI]

Scaling Adversarial Training to Large Perturbation Bounds
Sravanti Addepalli, Samyak Jain, Gaurang Sriramanan, R. Venkatesh Babu
[pdf]
[DOI]

Exploiting the Local Parabolic Landscapes of Adversarial Losses to Accelerate Black-Box Adversarial Attack
Hoang Tran, Dan Lu, Guannan Zhang
[pdf]
[DOI]

Generative Domain Adaptation for Face Anti-Spoofing
Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Kekai Sheng, Shouhong Ding, Lizhuang Ma
[pdf]
[DOI]

MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition
Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, Xi Li
[pdf]
[DOI]

GaitEdge: Beyond Plain End-to-End Gait Recognition for Better Practicality
Junhao Liang, Chao Fan, Saihui Hou, Chuanfu Shen, Yongzhen Huang, Shiqi Yu
[pdf]
[DOI]

UIA-ViT: Unsupervised Inconsistency-Aware Method Based on Vision Transformer for Face Forgery Detection
Wanyi Zhuang, Qi Chu, Zhentao Tan, Qiankun Liu, Haojie Yuan, Changtao Miao, Zixiang Luo, Nenghai Yu
[pdf]
[DOI]

Effective Presentation Attack Detection Driven by Face Related Task
Wentian Zhang, Haozhe Liu, Feng Liu, Raghavendra Ramachandra, Christoph Busch
[pdf]
[DOI]

PPT: Token-Pruned Pose Transformer for Monocular and Multi-View Human Pose Estimation
Haoyu Ma, Zhe Wang, Yifei Chen, Deying Kong, Liangjian Chen, Xingwei Liu, Xiangyi Yan, Hao Tang, Xiaohui Xie
[pdf]
[DOI]

AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing
Jiaxi Jiang, Paul Streli, Huajian Qiu, Andreas Fender, Larissa Laich, Patrick Snape, Christian Holz
[pdf]
[DOI]

P-STMO: Pre-trained Spatial Temporal Many-to-One Model for 3D Human Pose Estimation
Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao
[pdf]
[DOI]

D&D: Learning Human Dynamics from Dynamic Camera
Jiefeng Li, Siyuan Bian, Chao Xu, Gang Liu, Gang Yu, Cewu Lu
[pdf]
[DOI]

Explicit Occlusion Reasoning for Multi-Person 3D Human Pose Estimation
Qihao Liu, Yi Zhang, Song Bai, Alan Yuille
[pdf]
[DOI]

COUCH: Towards Controllable Human-Chair Interactions
Xiaohan Zhang, Bharat Lal Bhatnagar, Sebastian Starke, Vladimir Guzov, Gerard Pons-Moll
[pdf]
[DOI]

Identity-Aware Hand Mesh Estimation and Personalization from RGB Images
Deying Kong, Linguang Zhang, Liangjian Chen, Haoyu Ma, Xiangyi Yan, Shanlin Sun, Xingwei Liu, Kun Han, Xiaohui Xie
[pdf]
[DOI]

C3P: Cross-Domain Pose Prior Propagation for Weakly Supervised 3D Human Pose Estimation
Cunlin Wu, Yang Xiao, Boshen Zhang, Mingyang Zhang, Zhiguo Cao, Joey Tianyi Zhou
[pdf]
[DOI]

Pose-NDF: Modeling Human Pose Manifolds with Neural Distance Fields
Garvita Tiwari, Dimitrije Antić, Jan Eric Lenssen, Nikolaos Sarafianos, Tony Tung, Gerard Pons-Moll
[pdf]
[DOI]

CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation
Zhihao Li, Jianzhuang Liu, Zhensong Zhang, Songcen Xu, Youliang Yan
[pdf]
[DOI]

DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation
Ailing Zeng, Xuan Ju, Lei Yang, Ruiyuan Gao, Xizhou Zhu, Bo Dai, Qiang Xu
[pdf]
[DOI]

SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos
Ailing Zeng, Lei Yang, Xuan Ju, Jiefeng Li, Jianyi Wang, Qiang Xu
[pdf]
[DOI]

PoseTrans: A Simple yet Effective Pose Transformation Augmentation for Human Pose Estimation
Wentao Jiang, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Si Liu
[pdf]
[DOI]

Multi-Person 3D Pose and Shape Estimation via Inverse Kinematics and Refinement
Junuk Cha, Muhammad Saqlain, GeonU Kim, Mingyu Shin, Seungryul Baek
[pdf]
[DOI]

Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge for Human Motion Prediction
Xiaoning Sun, Qiongjie Cui, Huaijiang Sun, Bin Li, Weiqing Li, Jianfeng Lu
[pdf]
[DOI]

Structural Triangulation: A Closed-Form Solution to Constrained 3D Human Pose Estimation
Zhuo Chen, Xu Zhao, Xiaoyue Wan
[pdf]
[DOI]

Audio-Driven Stylized Gesture Generation with Flow-Based Model
Sheng Ye, Yu-Hui Wen, Yanan Sun, Ying He, Ziyang Zhang, Yaoyuan Wang, Weihua He, Yong-Jin Liu
[pdf]
[DOI]

Self-Constrained Inference Optimization on Structural Groups for Human Pose Estimation
Zhehan Kan, Shuoshuo Chen, Zeng Li, Zhihai He
[pdf]
[DOI]

UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture
Hiroyasu Akada, Jian Wang, Soshi Shimada, Masaki Takahashi, Christian Theobalt, Vladislav Golyanik
[pdf]
[DOI]

Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction
Maosen Li, Siheng Chen, Zijing Zhang, Lingxi Xie, Qi Tian, Ya Zhang
[pdf]
[DOI]

Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation
William McNally, Kanav Vats, Alexander Wong, John McPhee
[pdf]
[DOI]

VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data
Jiajun Su, Chunyu Wang, Xiaoxuan Ma, Wenjun Zeng, Yizhou Wang
[pdf]
[DOI]

Poseur: Direct Human Pose Regression with Transformers
Weian Mao, Yongtao Ge, Chunhua Shen, Zhi Tian, Xinlong Wang, Zhibin Wang, Anton van den Hengel
[pdf]
[DOI]

SimCC: A Simple Coordinate Classification Perspective for Human Pose Estimation
Yanjie Li, Sen Yang, Peidong Liu, Shoukui Zhang, Yunxiao Wang, Zhicheng Wang, Wankou Yang, Shu-Tao Xia
[pdf]
[DOI]

Regularizing Vector Embedding in Bottom-Up Human Pose Estimation
Haixin Wang, Lu Zhou, Yingying Chen, Ming Tang, Jinqiao Wang
[pdf]
[DOI]

A Visual Navigation Perspective for Category-Level Object Pose Estimation
Jiaxin Guo, Fangxun Zhong, Rong Xiong, Yun-Hui Liu, Yue Wang, Yiyi Liao
[pdf]
[DOI]

Faster VoxelPose: Real-Time 3D Human Pose Estimation by Orthographic Projection
Hang Ye, Wentao Zhu, Chunyu Wang, Rujie Wu, Yizhou Wang
[pdf]
[DOI]

Learning to Fit Morphable Models
Vasileios Choutas, Federica Bogo, Jingjing Shen, Julien Valentin
[pdf]
[DOI]

EgoBody: Human Body Shape and Motion of Interacting People from Head-Mounted Devices
Siwei Zhang, Qianli Ma, Yan Zhang, Zhiyin Qian, Taein Kwon, Marc Pollefeys, Federica Bogo, Siyu Tang
[pdf]
[DOI]

Grasp’D: Differentiable Contact-Rich Grasp Synthesis for Multi-Fingered Hands
Dylan Turpin, Liquan Wang, Eric Heiden, Yun-Chun Chen, Miles Macklin, Stavros Tsogkas, Sven Dickinson, Animesh Garg
[pdf]
[DOI]

AutoAvatar: Autoregressive Neural Fields for Dynamic Avatar Modeling
Ziqian Bai, Timur Bagautdinov, Javier Romero, Michael Zollhöfer, Ping Tan, Shunsuke Saito
[pdf]
[DOI]

Deep Radial Embedding for Visual Sequence Learning
Yuecong Min, Peiqi Jiao, Yanan Li, Xiaotao Wang, Lei Lei, Xiujuan Chai, Xilin Chen
[pdf]
[DOI]

SAGA: Stochastic Whole-Body Grasping with Contact
Yan Wu, Jiahao Wang, Yan Zhang, Siwei Zhang, Otmar Hilliges, Fisher Yu, Siyu Tang
[pdf]
[DOI]

Neural Capture of Animatable 3D Human from Monocular Video
Gusi Te, Xiu Li, Xiao Li, Jinglu Wang, Wei Hu, Yan Lu
[pdf]
[DOI]

General Object Pose Transformation Network from Unpaired Data
Yukun Su, Guosheng Lin, Ruizhou Sun, Qingyao Wu
[pdf]
[DOI]

Compositional Human-Scene Interaction Synthesis with Semantic Control
Kaifeng Zhao, Shaofei Wang, Yan Zhang, Thabo Beeler, Siyu Tang
[pdf]
[DOI]

PressureVision: Estimating Hand Pressure from a Single RGB Image
Patrick Grady, Chengcheng Tang, Samarth Brahmbhatt, Christopher D. Twigg, Chengde Wan, James Hays, Charles C. Kemp
[pdf]
[DOI]

PoseScript: 3D Human Poses from Natural Language
Ginger Delmas, Philippe Weinzaepfel, Thomas Lucas, Francesc Moreno-Noguer, Grégory Rogez
[pdf]
[DOI]

DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation
Jaewoo Park, Nam Ik Cho
[pdf]
[DOI]

3D Interacting Hand Pose Estimation by Hand De-Occlusion and Removal
Hao Meng, Sheng Jin, Wentao Liu, Chen Qian, Mengxiang Lin, Wanli Ouyang, Ping Luo
[pdf]
[DOI]

Pose for Everything: Towards Category-Agnostic Pose Estimation
Lumin Xu, Sheng Jin, Wang Zeng, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang
[pdf]
[DOI]

PoseGPT: Quantization-Based 3D Human Motion Generation and Forecasting
Thomas Lucas, Fabien Baradel, Philippe Weinzaepfel, Grégory Rogez
[pdf]
[DOI]

DH-AUG: DH Forward Kinematics Model Driven Augmentation for 3D Human Pose Estimation
Linzhi Huang, Jiahao Liang, Weihong Deng
[pdf]
[DOI]

Estimating Spatially-Varying Lighting in Urban Scenes with Disentangled Representation
Jiajun Tang, Yongjie Zhu, Haoyu Wang, Jun Hoong Chan, Si Li, Boxin Shi
[pdf]
[DOI]

Boosting Event Stream Super-Resolution with a Recurrent Neural Network
Wenming Weng, Yueyi Zhang, Zhiwei Xiong
[pdf]
[DOI]

Projective Parallel Single-Pixel Imaging to Overcome Global Illumination in 3D Structure Light Scanning
Yuxi Li, Huijie Zhao, Hongzhi Jiang, Xudong Li
[pdf]
[DOI]

Semantic-Sparse Colorization Network for Deep Exemplar-Based Colorization
Yunpeng Bai, Chao Dong, Zenghao Chai, Andong Wang, Zhengzhuo Xu, Chun Yuan
[pdf]
[DOI]

Practical and Scalable Desktop-Based High-Quality Facial Capture
Alexandros Lattas, Yiming Lin, Jayanth Kannan, Ekin Ozturk, Luca Filipi, Giuseppe Claudio Guarnera, Gaurav Chawla, Abhijeet Ghosh
[pdf]
[DOI]

FAST-VQA: Efficient End-to-End Video Quality Assessment with Fragment Sampling
Haoning Wu, Chaofeng Chen, Jingwen Hou, Liang Liao, Annan Wang, Wenxiu Sun, Qiong Yan, Weisi Lin
[pdf]
[DOI]

Physically-Based Editing of Indoor Scene Lighting from a Single Image
Zhengqin Li, Jia Shi, Sai Bi, Rui Zhu, Kalyan Sunkavalli, Miloš Hašan, Zexiang Xu, Ravi Ramamoorthi, Manmohan Chandraker
[pdf]
[DOI]

LEDNet: Joint Low-Light Enhancement and Deblurring in the Dark
Shangchen Zhou, Chongyi Li, Chen Change Loy
[pdf]
[DOI]

MPIB: An MPI-Based Bokeh Rendering Framework for Realistic Partial Occlusion Effects
Juewen Peng, Jianming Zhang, Xianrui Luo, Hao Lu, Ke Xian, Zhiguo Cao
[pdf]
[DOI]

Real-RawVSR: Real-World Raw Video Super-Resolution with a Benchmark Dataset
Huanjing Yue, Zhiming Zhang, Jingyu Yang
[pdf]
[DOI]

Transform Your Smartphone into a DSLR Camera: Learning the ISP in the Wild
Ardhendu Shekhar Tripathi, Martin Danelljan, Samarth Shukla, Radu Timofte, Luc Van Gool
[pdf]
[DOI]

Learning Deep Non-Blind Image Deconvolution without Ground Truths
Yuhui Quan, Zhuojie Chen, Huan Zheng, Hui Ji
[pdf]
[DOI]

NEST: Neural Event Stack for Event-Based Image Enhancement
Minggui Teng, Chu Zhou, Hanyue Lou, Boxin Shi
[pdf]
[DOI]

Editable Indoor Lighting Estimation
Henrique Weber, Mathieu Garon, Jean-François Lalonde
[pdf]
[DOI]

Fast Two-Step Blind Optical Aberration Correction
Thomas Eboli, Jean-Michel Morel, Gabriele Facciolo
[pdf]
[DOI]

Seeing Far in the Dark with Patterned Flash
Zhanghao Sun, Jian Wang, Yicheng Wu, Shree Nayar
[pdf]
[DOI]

PseudoClick: Interactive Image Segmentation with Click Imitation
Qin Liu, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, Marc Niethammer, Ziyan Wu
[pdf]
[DOI]

CT2: Colorization Transformer via Color Tokens
Shuchen Weng, Jimeng Sun, Yu Li, Si Li, Boxin Shi
[pdf]
[DOI]

Simple Baselines for Image Restoration
Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, Jian Sun
[pdf]
[DOI]

Spike Transformer: Monocular Depth Estimation for Spiking Camera
Jiyuan Zhang, Lulu Tang, Zhaofei Yu, Jiwen Lu, Tiejun Huang
[pdf]
[DOI]

Improving Image Restoration by Revisiting Global Information Aggregation
Xiaojie Chu, Liangyu Chen, Chengpeng Chen, Xin Lu
[pdf]
[DOI]

Data Association between Event Streams and Intensity Frames under Diverse Baselines
Dehao Zhang, Qiankun Ding, Peiqi Duan, Chu Zhou, Boxin Shi
[pdf]
[DOI]

D2HNet: Joint Denoising and Deblurring with Hierarchical Network for Robust Night Image Restoration
Yuzhi Zhao, Yongzhe Xu, Qiong Yan, Dingdong Yang, Xuehui Wang, Lai-Man Po
[pdf]
[DOI]

Learning Graph Neural Networks for Image Style Transfer
Yongcheng Jing, Yining Mao, Yiding Yang, Yibing Zhan, Mingli Song, Xinchao Wang, Dacheng Tao
[pdf]
[DOI]

DeepPS2: Revisiting Photometric Stereo Using Two Differently Illuminated Images
Ashish Tiwari, Shanmuganathan Raman
[pdf]
[DOI]

Instance Contour Adjustment via Structure-Driven CNN
Shuchen Weng, Yi Wei, Ming-Ching Chang, Boxin Shi
[pdf]
[DOI]

Synthesizing Light Field Video from Monocular Video
Shrisudhan Govindarajan, Prasan Shedligeri, Sarah, Kaushik Mitra
[pdf]
[DOI]

Human-Centric Image Cropping with Partition-Aware and Content-Preserving Features
Bo Zhang, Li Niu, Xing Zhao, Liqing Zhang
[pdf]
[DOI]

DeMFI: Deep Joint Deblurring and Multi-Frame Interpolation with Flow-Guided Attentive Correlation and Recursive Boosting
Jihyong Oh, Munchurl Kim
[pdf]
[DOI]

Neural Image Representations for Multi-Image Fusion and Layer Separation
Seonghyeon Nam, Marcus A. Brubaker, Michael S. Brown
[pdf]
[DOI]

Bringing Rolling Shutter Images Alive with Dual Reversed Distortion
Zhihang Zhong, Mingdeng Cao, Xiao Sun, Zhirong Wu, Zhongyi Zhou, Yinqiang Zheng, Stephen Lin, Imari Sato
[pdf]
[DOI]

FILM: Frame Interpolation for Large Motion
Fitsum Reda, Janne Kontkanen, Eric Tabellion, Deqing Sun, Caroline Pantofaru, Brian Curless
[pdf]
[DOI]

Video Interpolation by Event-Driven Anisotropic Adjustment of Optical Flow
Song Wu, Kaichao You, Weihua He, Chen Yang, Yang Tian, Yaoyuan Wang, Ziyang Zhang, Jianxing Liao
[pdf]
[DOI]

EvAC3D: From Event-Based Apparent Contours to 3D Models via Continuous Visual Hulls
Ziyun Wang, Kenneth Chaney, Kostas Daniilidis
[pdf]
[DOI]

DCCF: Deep Comprehensible Color Filter Learning Framework for High-Resolution Image Harmonization
Ben Xue, Shenghui Ran, Quan Chen, Rongfei Jia, Binqiang Zhao, Xing Tang
[pdf]
[DOI]

SelectionConv: Convolutional Neural Networks for Non-Rectilinear Image Data
David Hart, Michael Whitney, Bryan Morse
[pdf]
[DOI]

Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization
Jingtang Liang, Xiaodong Cun, Chi-Man Pun, Jue Wang
[pdf]
[DOI]

BigColor: Colorization Using a Generative Color Prior for Natural Images
Geonung Kim, Kyoungkook Kang, Seongtae Kim, Hwayoon Lee, Sehoon Kim, Jonghyun Kim, Seung-Hwan Baek, Sunghyun Cho
[pdf]
[DOI]

CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution
Cheeun Hong, Sungyong Baik, Heewon Kim, Seungjun Nah, Kyoung Mu Lee
[pdf]
[DOI]

Deep Semantic Statistics Matching (D2SM) Denoising Network
Kangfu Mei, Vishal M. Patel, Rui Huang
[pdf]
[DOI]

3D Scene Inference from Transient Histograms
Sacha Jungerman, Atul Ingle, Yin Li, Mohit Gupta
[pdf]
[DOI]

Neural Space-Filling Curves
Hanyu Wang, Kamal Gupta, Larry Davis, Abhinav Shrivastava
[pdf]
[DOI]

Exposure-Aware Dynamic Weighted Learning for Single-Shot HDR Imaging
An Gia Vien, Chul Lee
[pdf]
[DOI]

Seeing through a Black Box: Toward High-Quality Terahertz Imaging via Subspace-and-Attention Guided Restoration
Weng-Tai Su, Yi-Chun Hung, Po-Jen Yu, Shang-Hua Yang, Chia-Wen Lin
[pdf]
[DOI]

Tomography of Turbulence Strength Based on Scintillation Imaging
Nir Shaul, Yoav Y. Schechner
[pdf]
[DOI]

Realistic Blur Synthesis for Learning Image Deblurring
Jaesung Rim, Geonung Kim, Jungeon Kim, Junyong Lee, Seungyong Lee, Sunghyun Cho
[pdf]
[DOI]

Learning Phase Mask for Privacy-Preserving Passive Depth Estimation
Zaid Tasneem, Giovanni Milione, Yi-Hsuan Tsai, Xiang Yu, Ashok Veeraraghavan, Manmohan Chandraker, Francesco Pittaluga
[pdf]
[DOI]

LWGNet – Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval
Atreyee Saha, Salman S. Khan, Sagar Sehrawat, Sanjana S. Prabhu, Shanti Bhattacharya, Kaushik Mitra
[pdf]
[DOI]

PANDORA: Polarization-Aided Neural Decomposition of Radiance
Akshat Dave, Yongyi Zhao, Ashok Veeraraghavan
[pdf]
[DOI]

HuMMan: Multi-modal 4D Human Dataset for Versatile Sensing and Modeling
Zhongang Cai, Daxuan Ren, Ailing Zeng, Zhengyu Lin, Tao Yu, Wenjia Wang, Xiangyu Fan, Yang Gao, Yifan Yu, Liang Pan, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
[pdf]
[DOI]

DVS-Voltmeter: Stochastic Process-Based Event Simulator for Dynamic Vision Sensors
Songnan Lin, Ye Ma, Zhenhua Guo, Bihan Wen
[pdf]
[DOI]

Benchmarking Omni-Vision Representation through the Lens of Visual Realms
Yuanhan Zhang, Zhenfei Yin, Jing Shao, Ziwei Liu
[pdf]
[DOI]

BEAT: A Large-Scale Semantic and Emotional Multi-modal Dataset for Conversational Gestures Synthesis
Haiyang Liu, Zihao Zhu, Naoya Iwamoto, Yichen Peng, Zhengqing Li, You Zhou, Elif Bozkurt, Bo Zheng
[pdf]
[DOI]

Neuromorphic Data Augmentation for Training Spiking Neural Networks
Yuhang Li, Youngeun Kim, Hyoungseob Park, Tamar Geller, Priyadarshini Panda
[pdf]
[DOI]

CelebV-HQ: A Large-Scale Video Facial Attributes Dataset
Hao Zhu, Wayne Wu, Wentao Zhu, Liming Jiang, Siwei Tang, Li Zhang, Ziwei Liu, Chen Change Loy
[pdf]
[DOI]

MovieCuts: A New Dataset and Benchmark for Cut Type Recognition
Alejandro Pardo, Fabian Caba, Juan León Alcázar, Ali Thabet, Bernard Ghanem
[pdf]
[DOI]

LaMAR: Benchmarking Localization and Mapping for Augmented Reality
Paul-Edouard Sarlin, Mihai Dusmanu, Johannes L. Schönberger, Pablo Speciale, Lukas Gruber, Viktor Larsson, Ondrej Miksik, Marc Pollefeys
[pdf]
[DOI]

Unitail: Detecting, Reading, and Matching in Retail Scene
Fangyi Chen, Han Zhang, Zaiwang Li, Jiachen Dou, Shentong Mo, Hao Chen, Yongxin Zhang, Uzair Ahmed, Chenchen Zhu, Marios Savvides
[pdf]
[DOI]

Not Just Streaks: Towards Ground Truth for Single Image Deraining
Yunhao Ba, Howard Zhang, Ethan Yang, Akira Suzuki, Arnold Pfahnl, Chethan Chinder Chandrappa, Celso M. de Melo, Suya You, Stefano Soatto, Alex Wong, Achuta Kadambi
[pdf]
[DOI]

ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-Verified Image-Caption Associations for MS-COCO
Sanghyuk Chun, Wonjae Kim, Song Park, Minsuk Chang, Seong Joon Oh
[pdf]
[DOI]

MOTCOM: The Multi-Object Tracking Dataset Complexity Metric
Malte Pedersen, Joakim Bruslund Haurum, Patrick Dendorfer, Thomas B. Moeslund
[pdf]
[DOI]

How to Synthesize a Large-Scale and Trainable Micro-Expression Dataset?
Yuchi Liu, Zhongdao Wang, Tom Gedeon, Liang Zheng
[pdf]
[DOI]

A Real World Dataset for Multi-View 3D Reconstruction
Rakesh Shrestha, Siqi Hu, Minghao Gou, Ziyuan Liu, Ping Tan
[pdf]
[DOI]

REALY: Rethinking the Evaluation of 3D Face Reconstruction
Zenghao Chai, Haoxian Zhang, Jing Ren, Di Kang, Zhengzhuo Xu, Xuefei Zhe, Chun Yuan, Linchao Bao
[pdf]
[DOI]

Capturing, Reconstructing, and Simulating: The UrbanScene3D Dataset
Liqiang Lin, Yilin Liu, Yue Hu, Xingguang Yan, Ke Xie, Hui Huang
[pdf]
[DOI]

3D CoMPaT: Composition of Materials on Parts of 3D Things
Yuchen Li, Ujjwal Upadhyay, Habib Slim, Tezuesh Varshney, Ahmed Abdelreheem, Arpit Prajapati, Suhail Pothigara, Peter Wonka, Mohamed Elhoseiny
[pdf]
[DOI]

PartImageNet: A Large, High-Quality Dataset of Parts
Ju He, Shuo Yang, Shaokang Yang, Adam Kortylewski, Xiaoding Yuan, Jie-Neng Chen, Shuai Liu, Cheng Yang, Qihang Yu, Alan Yuille
[pdf]
[DOI]

A-OKVQA: A Benchmark for Visual Question Answering Using World Knowledge
Dustin Schwenk, Apoorv Khandelwal, Christopher Clark, Kenneth Marino, Roozbeh Mottaghi
[pdf]
[DOI]

OOD-CV: A Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images
Bingchen Zhao, Shaozuo Yu, Wufei Ma, Mingxin Yu, Shenxiao Mei, Angtian Wang, Ju He, Alan Yuille, Adam Kortylewski
[pdf]
[DOI]

Facial Depth and Normal Estimation Using Single Dual-Pixel Camera
Minjun Kang, Jaesung Choe, Hyowon Ha, Hae-Gon Jeon, Sunghoon Im, In So Kweon, Kuk-Jin Yoon
[pdf]
[DOI]

The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing
Dawit Mureja Argaw, Fabian Caba, Joon-Young Lee, Markus Woodson, In So Kweon
[pdf]
[DOI]

StyleBabel: Artistic Style Tagging and Captioning
Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, Jo Briggs, Chris Speed, Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe Lin, John Collomosse
[pdf]
[DOI]

PANDORA: A Panoramic Detection Dataset for Object with Orientation
Hang Xu, Qiang Zhao, Yike Ma, Xiaodong Li, Peng Yuan, Bailan Feng, Chenggang Yan, Feng Dai
[pdf]
[DOI]

FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context
Pinaki Nath Chowdhury, Aneeshan Sain, Ayan Kumar Bhunia, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song
[pdf]
[DOI]

Exploring Fine-Grained Audiovisual Categorization with the SSW60 Dataset
Grant Van Horn, Rui Qian, Kimberly Wilber, Hartwig Adam, Oisin Mac Aodha, Serge Belongie
[pdf]
[DOI]

The Caltech Fish Counting Dataset: A Benchmark for Multiple-Object Tracking and Counting
Justin Kay, Peter Kulits, Suzanne Stathatos, Siqi Deng, Erik Young, Sara Beery, Grant Van Horn, Pietro Perona
[pdf]
[DOI]

A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility
Andrea Burns, Deniz Arsan, Sanjna Agrawal, Ranjitha Kumar, Kate Saenko, Bryan A. Plummer
[pdf]
[DOI]

BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis
Davide Moltisanti, Jinyi Wu, Bo Dai, Chen Change Loy
[pdf]
[DOI]

Dress Code: High-Resolution Multi-Category Virtual Try-On
Davide Morelli, Matteo Fincato, Marcella Cornia, Federico Landi, Fabio Cesari, Rita Cucchiara
[pdf]
[DOI]

A Data-Centric Approach for Improving Ambiguous Labels with Combined Semi-Supervised Classification and Clustering
Lars Schmarje, Monty Santarossa, Simon-Martin Schröder, Claudius Zelenka, Rainer Kiko, Jenny Stracke, Nina Volkmann, Reinhard Koch
[pdf]
[DOI]

ClearPose: Large-Scale Transparent Object Dataset and Benchmark
Xiaotong Chen, Huijie Zhang, Zeren Yu, Anthony Opipari, Odest Chadwicke Jenkins
[pdf]
[DOI]

When Deep Classifiers Agree: Analyzing Correlations between Learning Order and Image Statistics
Iuliia Pliushch, Martin Mundt, Nicolas Lupp, Visvanathan Ramesh
[pdf]
[DOI]

AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment
Kangyeol Kim, Sunghyun Park, Jaeseong Lee, Sunghyo Chung, Junsoo Lee, Jaegul Choo
[pdf]
[DOI]

MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration
Thomas Hayes, Songyang Zhang, Xi Yin, Guan Pang, Sasha Sheng, Harry Yang, Songwei Ge, Qiyuan Hu, Devi Parikh
[pdf]
[DOI]

A Dense Material Segmentation Dataset for Indoor and Outdoor Scene Parsing
Paul Upchurch, Ransen Niu
[pdf]
[DOI]

MimicME: A Large Scale Diverse 4D Database for Facial Expression Analysis
Athanasios Papaioannou, Baris Gecer, Shiyang Cheng, Grigorios G. Chrysos, Jiankang Deng, Eftychia Fotiadou, Christos Kampouris, Dimitrios Kollias, Stylianos Moschoglou, Kritaphat Songsri-In, Stylianos Ploumpis, George Trigeorgis, Panagiotis Tzirakis, Evangelos Ververas, Yuxiang Zhou, Allan Ponniah, Anastasios Roussos, Stefanos Zafeiriou
[pdf]
[DOI]

Delving into Universal Lesion Segmentation: Method, Dataset, and Benchmark
Yu Qiu, Jing Xu
[pdf]
[DOI]

Large Scale Real-World Multi-person Tracking
Bing Shuai, Alessandro Bergamo, Uta Büchler, Andrew Berneshawi, Alyssa Boden, Joseph Tighe
[pdf]
[DOI]

D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights
Yuzhen Zhang, Wentong Wang, Weizhi Guo, Pei Lv, Mingliang Xu, Wei Chen, Dinesh Manocha
[pdf]
[DOI]

The Missing Link: Finding Label Relations across Datasets
Jasper Uijlings, Thomas Mensink, Vittorio Ferrari
[pdf]
[DOI]

Learning Omnidirectional Flow in 360° Video via Siamese Representation
Keshav Bhandari, Bin Duan, Gaowen Liu, Hugo Latapie, Ziliang Zong, Yan Yan
[pdf]
[DOI]

VizWiz-FewShot: Locating Objects in Images Taken by People with Visual Impairments
Yu-Yun Tseng, Alexander Bell, Danna Gurari
[pdf]
[DOI]

TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments
Shubham Dokania, Anbumani Subramanian, Manmohan Chandraker, C.V. Jawahar
[pdf]
[DOI]

Trapped in Texture Bias? A Large Scale Comparison of Deep Instance Segmentation
Johannes Theodoridis, Jessica Hofmann, Johannes Maucher, Andreas Schilling
[pdf]
[DOI]

Deformable Feature Aggregation for Dynamic Multi-modal 3D Object Detection
Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao
[pdf]
[DOI]

WeLSA: Learning to Predict 6D Pose from Weakly Labeled Data Using Shape Alignment
Shishir Reddy Vutukur, Ivan Shugurov, Benjamin Busam, Andreas Hutter, Slobodan Ilic
[pdf]
[DOI]

Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph
Honghui Yang, Zili Liu, Xiaopei Wu, Wenxiao Wang, Wei Qian, Xiaofei He, Deng Cai
[pdf]
[DOI]

MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection
Xuesong Chen, Shaoshuai Shi, Benjin Zhu, Ka Chun Cheung, Hang Xu, Hongsheng Li
[pdf]
[DOI]

Long-Tail Detection with Effective Class-Margins
Jang Hyun Cho, Philipp Krähenbühl
[pdf]
[DOI]

Semi-Supervised Monocular 3D Object Detection by Multi-View Consistency
Qing Lian, Yanbo Xu, Weilong Yao, Yingcong Chen, Tong Zhang
[pdf]
[DOI]

PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer towards Video Object Detection
Han Wang, Jun Tang, Xiaodong Liu, Shanyan Guan, Rong Xie, Li Song
[pdf]
[DOI]

BEVFormer: Learning Bird’s-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers
Zhiqi Li, Wenhai Wang, Hongyang Li, Enze Xie, Chonghao Sima, Tong Lu, Yu Qiao, Jifeng Dai
[pdf]
[DOI]

Category-Level 6D Object Pose and Size Estimation Using Self-Supervised Deep Prior Deformation Networks
Jiehong Lin, Zewei Wei, Changxing Ding, Kui Jia
[pdf]
[DOI]

Dense Teacher: Dense Pseudo-Labels for Semi-Supervised Object Detection
Hongyu Zhou, Zheng Ge, Songtao Liu, Weixin Mao, Zeming Li, Haiyan Yu, Jian Sun
[pdf]
[DOI]

Point-to-Box Network for Accurate Object Detection via Single Point Supervision
Pengfei Chen, Xuehui Yu, Xumeng Han, Najmul Hassan, Kai Wang, Jiachen Li, Jian Zhao, Humphrey Shi, Zhenjun Han, Qixiang Ye
[pdf]
[DOI]

Domain Adaptive Hand Keypoint and Pixel Localization in the Wild
Takehiko Ohkawa, Yu-Jhe Li, Qichen Fu, Ryosuke Furuta, Kris M. Kitani, Yoichi Sato
[pdf]
[DOI]

Towards Data-Efficient Detection Transformers
Wen Wang, Jing Zhang, Yang Cao, Yongliang Shen, Dacheng Tao
[pdf]
[DOI]

Open-Vocabulary DETR with Conditional Matching
Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
[pdf]
[DOI]

Prediction-Guided Distillation for Dense Object Detection
Chenhongyi Yang, Mateusz Ochal, Amos Storkey, Elliot J. Crowley
[pdf]
[DOI]

Multimodal Object Detection via Probabilistic Ensembling
Yi-Ting Chen, Jinghao Shi, Zelin Ye, Christoph Mertz, Deva Ramanan, Shu Kong
[pdf]
[DOI]

Exploiting Unlabeled Data with Vision and Language Models for Object Detection
Shiyu Zhao, Zhixing Zhang, Samuel Schulter, Long Zhao, Vijay Kumar B G, Anastasis Stathopoulos, Manmohan Chandraker, Dimitris N. Metaxas
[pdf]
[DOI]

CPO: Change Robust Panorama to Point Cloud Localization
Junho Kim, Hojun Jang, Changwoon Choi, Young Min Kim
[pdf]
[DOI]

INT: Towards Infinite-Frames 3D Detection with an Efficient Framework
Jianyun Xu, Zhenwei Miao, Da Zhang, Hongyu Pan, Kaixuan Liu, Peihan Hao, Jun Zhu, Zhengyang Sun, Hongmin Li, Xin Zhan
[pdf]
[DOI]

End-to-End Weakly Supervised Object Detection with Sparse Proposal Evolution
Mingxiang Liao, Fang Wan, Yuan Yao, Zhenjun Han, Jialing Zou, Yuze Wang, Bailan Feng, Peng Yuan, Qixiang Ye
[pdf]
[DOI]

Calibration-Free Multi-View Crowd Counting
Qi Zhang, Antoni B. Chan
[pdf]
[DOI]

Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training
Zhenyu Li, Zehui Chen, Ang Li, Liangji Fang, Qinhong Jiang, Xianming Liu, Junjun Jiang
[pdf]
[DOI]

SuperLine3D: Self-Supervised Line Segmentation and Description for LiDAR Point Cloud
Xiangrui Zhao, Sheng Yang, Tianxin Huang, Jun Chen, Teng Ma, Mingyang Li, Yong Liu
[pdf]
[DOI]

Exploring Plain Vision Transformer Backbones for Object Detection
Yanghao Li, Hanzi Mao, Ross Girshick, Kaiming He
[pdf]
[DOI]

Adversarially-Aware Robust Object Detector
Ziyi Dong, Pengxu Wei, Liang Lin
[pdf]
[DOI]

HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors
Luting Wang, Xiaojie Li, Yue Liao, Zeren Jiang, Jianlong Wu, Fei Wang, Chen Qian, Si Liu
[pdf]
[DOI]

You Should Look at All Objects
Zhenchao Jin, Dongdong Yu, Luchuan Song, Zehuan Yuan, Lequan Yu
[pdf]
[DOI]

Detecting Twenty-Thousand Classes Using Image-Level Supervision
Xingyi Zhou, Rohit Girdhar, Armand Joulin, Philipp Krähenbühl, Ishan Misra
[pdf]
[DOI]

DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation
Hongyang Li, Jiehong Lin, Kui Jia
[pdf]
[DOI]

Monocular 3D Object Detection with Depth from Motion
Tai Wang, Jiangmiao Pang, Dahua Lin
[pdf]
[DOI]

DISP6D: Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation
Yilin Wen, Xiangyu Li, Hao Pan, Lei Yang, Zheng Wang, Taku Komura, Wenping Wang
[pdf]
[DOI]

Distilling Object Detectors with Global Knowledge
Sanli Tang, Zhongyu Zhang, Zhanzhan Cheng, Jing Lu, Yunlu Xu, Yi Niu, Fan He
[pdf]
[DOI]

Unifying Visual Perception by Dispersible Points Learning
Jianming Liang, Guanglu Song, Biao Leng, Yu Liu
[pdf]
[DOI]

PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection
Gang Li, Xiang Li, Yujie Wang, Yichao Wu, Ding Liang, Shanshan Zhang
[pdf]
[DOI]

Exploring Resolution and Degradation Clues As Self-Supervised Signal for Low Quality Object Detection
Ziteng Cui, Yingying Zhu, Lin Gu, Guo-Jun Qi, Xiaoxiao Li, Renrui Zhang, Zenghui Zhang, Tatsuya Harada
[pdf]
[DOI]

Robust Category-Level 6D Pose Estimation with Coarse-to-Fine Rendering of Neural Features
Wufei Ma, Angtian Wang, Alan Yuille, Adam Kortylewski
[pdf]
[DOI]

Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection
Maoxun Yuan, Yinyan Wang, Xingxing Wei
[pdf]
[DOI]

RFLA: Gaussian Receptive Field Based Label Assignment for Tiny Object Detection
Chang Xu, Jinwang Wang, Wen Yang, Huai Yu, Lei Yu, Gui-Song Xia
[pdf]
[DOI]

Rethinking IoU-Based Optimization for Single-Stage 3D Object Detection
Hualian Sheng, Sijia Cai, Na Zhao, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Min-Jian Zhao, Gim Hee Lee
[pdf]
[DOI]

TD-Road: Top-Down Road Network Extraction with Holistic Graph Construction
Yang He, Ravi Garg, Amber Roy Chowdhury
[pdf]
[DOI]

Multi-faceted Distillation of Base-Novel Commonality for Few-Shot Object Detection
Shuang Wu, Wenjie Pei, Dianwen Mei, Fanglin Chen, Jiandong Tian, Guangming Lu
[pdf]
[DOI]

PointCLM: A Contrastive Learning-Based Framework for Multi-Instance Point Cloud Registration
Mingzhi Yuan, Zhihao Li, Qiuye Jin, Xinrong Chen, Manning Wang
[pdf]
[DOI]

Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration
Haotian Bai, Ruimao Zhang, Jiong Wang, Xiang Wan
[pdf]
[DOI]

MTTrans: Cross-Domain Object Detection with Mean Teacher Transformer
Jinze Yu, Jiaming Liu, Xiaobao Wei, Haoyi Zhou, Yohei Nakata, Denis Gudovskiy, Tomoyuki Okuno, Jianxin Li, Kurt Keutzer, Shanghang Zhang
[pdf]
[DOI]

Multi-Domain Multi-Definition Landmark Localization for Small Datasets
David Ferman, Gaurav Bharaj
[pdf]
[DOI]

DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection
Abhinav Kumar, Garrick Brazil, Enrique Corona, Armin Parchami, Xiaoming Liu
[pdf]
[DOI]

Label-Guided Auxiliary Training Improves 3D Object Detector
Yaomin Huang, Xinmei Liu, Yichen Zhu, Zhiyuan Xu, Chaomin Shen, Zhengping Che, Guixu Zhang, Yaxin Peng, Feifei Feng, Jian Tang
[pdf]
[DOI]

PromptDet: Towards Open-Vocabulary Detection Using Uncurated Images
Chengjian Feng, Yujie Zhong, Zequn Jie, Xiangxiang Chu, Haibing Ren, Xiaolin Wei, Weidi Xie, Lin Ma
[pdf]
[DOI]

Densely Constrained Depth Estimator for Monocular 3D Object Detection
Yingyan Li, Yuntao Chen, Jiawei He, Zhaoxiang Zhang
[pdf]
[DOI]

Polarimetric Pose Prediction
Daoyi Gao, Yitong Li, Patrick Ruhkamp, Iuliia Skobleva, Magdalena Wysocki, HyunJun Jung, Pengyuan Wang, Arturo Guridi, Benjamin Busam
[pdf]
[DOI]

DFNet: Enhance Absolute Pose Regression with Direct Feature Matching
Shuai Chen, Xinghui Li, Zirui Wang, Victor Adrian Prisacariu
[pdf]
[DOI]

Cornerformer: Purifying Instances for Corner-Based Detectors
Haoran Wei, Xin Chen, Lingxi Xie, Qi Tian
[pdf]
[DOI]

PillarNet: Real-Time and High-Performance Pillar-Based 3D Object Detection
Guangsheng Shi, Ruifeng Li, Chao Ma
[pdf]
[DOI]

Robust Object Detection with Inaccurate Bounding Boxes
Chengxin Liu, Kewei Wang, Hao Lu, Zhiguo Cao, Ziming Zhang
[pdf]
[DOI]

Efficient Decoder-Free Object Detection with Transformers
Peixian Chen, Mengdan Zhang, Yunhang Shen, Kekai Sheng, Yuting Gao, Xing Sun, Ke Li, Chunhua Shen
[pdf]
[DOI]

Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection
Yu Hong, Hang Dai, Yong Ding
[pdf]
[DOI]

ReAct: Temporal Action Detection with Relational Queries
Dingfeng Shi, Yujie Zhong, Qiong Cao, Jing Zhang, Lin Ma, Jia Li, Dacheng Tao
[pdf]
[DOI]

Towards Accurate Active Camera Localization
Qihang Fang, Yingda Yin, Qingnan Fan, Fei Xia, Siyan Dong, Sheng Wang, Jue Wang, Leonidas J. Guibas, Baoquan Chen
[pdf]
[DOI]

Camera Pose Auto-Encoders for Improving Pose Regression
Yoli Shavit, Yosi Keller
[pdf]
[DOI]

Improving the Intra-Class Long-Tail in 3D Detection via Rare Example Mining
Chiyu Max Jiang, Mahyar Najibi, Charles R. Qi, Yin Zhou, Dragomir Anguelov
[pdf]
[DOI]

Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization
Lei Zhu, Qian Chen, Lujia Jin, Yunfei You, Yanye Lu
[pdf]
[DOI]

UC-OWOD: Unknown-Classified Open World Object Detection
Zhiheng Wu, Yue Lu, Xingyu Chen, Zhengxing Wu, Liwen Kang, Junzhi Yu
[pdf]
[DOI]

RayTran: 3D Pose Estimation and Shape Reconstruction of Multiple Objects from Videos with Ray-Traced Transformers
Michał J. Tyszkiewicz, Kevis-Kokitsi Maninis, Stefan Popov, Vittorio Ferrari
[pdf]
[DOI]

GTCaR: Graph Transformer for Camera Re-Localization
Xinyi Li, Haibin Ling
[pdf]
[DOI]

3D Object Detection with a Self-Supervised Lidar Scene Flow Backbone
Emeç Erçelik, Ekim Yurtsever, Mingyu Liu, Zhijie Yang, Hanzhen Zhang, Pınar Topçam, Maximilian Listl, Yılmaz Kaan Çaylı, Alois Knoll
[pdf]
[DOI]

Open Vocabulary Object Detection with Pseudo Bounding-Box Labels
Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, Ran Xu, Wenhao Liu, Caiming Xiong
[pdf]
[DOI]

Few-Shot Object Detection by Knowledge Distillation Using Bag-of-Visual-Words Representations
Wenjie Pei, Shuang Wu, Dianwen Mei, Fanglin Chen, Jiandong Tian, Guangming Lu
[pdf]
[DOI]

SALISA: Saliency-Based Input Sampling for Efficient Video Object Detection
Babak Ehteshami Bejnordi, Amirhossein Habibian, Fatih Porikli, Amir Ghodrati
[pdf]
[DOI]

ECO-TR: Efficient Correspondences Finding via Coarse-to-Fine Refinement
Dongli Tan, Jiang-Jiang Liu, Xingyu Chen, Chao Chen, Ruixin Zhang, Yunhang Shen, Shouhong Ding, Rongrong Ji
[pdf]
[DOI]

Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting
Yangzheng Wu, Mohsen Zand, Ali Etemad, Michael Greenspan
[pdf]
[DOI]

Long-Tailed Instance Segmentation Using Gumbel Optimized Loss
Konstantinos Panagiotis Alexandridis, Jiankang Deng, Anh Nguyen, Shan Luo
[pdf]
[DOI]

DetMatch: Two Teachers Are Better than One for Joint 2D and 3D Semi-Supervised Object Detection
Jinhyung Park, Chenfeng Xu, Yiyang Zhou, Masayoshi Tomizuka, Wei Zhan
[pdf]
[DOI]

ObjectBox: From Centers to Boxes for Anchor-Free Object Detection
Mohsen Zand, Ali Etemad, Michael Greenspan
[pdf]
[DOI]

Is Geometry Enough for Matching in Visual Localization?
Qunjie Zhou, Sérgio Agostinho, Aljoša Ošep, Laura Leal-Taixé
[pdf]
[DOI]

SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds
Pei Sun, Mingxing Tan, Weiyue Wang, Chenxi Liu, Fei Xia, Zhaoqi Leng, Dragomir Anguelov
[pdf]
[DOI]

PCR-CG: Point Cloud Registration via Deep Explicit Color and Geometry
Yu Zhang, Junle Yu, Xiaolin Huang, Wenhui Zhou, Ji Hou
[pdf]
[DOI]

GLAMD: Global and Local Attention Mask Distillation for Object Detectors
Younho Jang, Wheemyung Shin, Jinbeom Kim, Simon Woo, Sung-Ho Bae
[pdf]
[DOI]

FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection
Danila Rukhovich, Anna Vorontsova, Anton Konushin
[pdf]
[DOI]

Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles
Guodong Wang, Yunhong Wang, Jie Qin, Dongming Zhang, Xiuguo Bao, Di Huang
[pdf]
[DOI]

Class-Agnostic Object Detection with Multi-modal Transformer
Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan, Rao Muhammad Anwer, Ming-Hsuan Yang
[pdf]
[DOI]

Enhancing Multi-modal Features Using Local Self-Attention for 3D Object Detection
Hao Li, Zehan Zhang, Xian Zhao, Yulong Wang, Yuxi Shen, Shiliang Pu, Hui Mao
[pdf]
[DOI]

Object Detection As Probabilistic Set Prediction
Georg Hess, Christoffer Petersson, Lennart Svensson
[pdf]
[DOI]

Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions
Zhi Li, Lu He, Huijuan Xu
[pdf]
[DOI]

Neural Correspondence Field for Object Pose Estimation
Lin Huang, Tomas Hodan, Lingni Ma, Linguang Zhang, Luan Tran, Christopher D. Twigg, Po-Chen Wu, Junsong Yuan, Cem Keskin, Robert Wang
[pdf]
[DOI]

On Label Granularity and Object Localization
Elijah Cole, Kimberly Wilber, Grant Van Horn, Xuan Yang, Marco Fornoni, Pietro Perona, Serge Belongie, Andrew Howard, Oisin Mac Aodha
[pdf]
[DOI]

OIMNet++: Prototypical Normalization and Localization-Aware Learning for Person Search
Sanghoon Lee, Youngmin Oh, Donghyeon Baek, Junghyup Lee, Bumsub Ham
[pdf]
[DOI]

Out-of-Distribution Identification: Let Detector Tell Which I Am Not Sure
Ruoqi Li, Chongyang Zhang, Hao Zhou, Chao Shi, Yan Luo
[pdf]
[DOI]

Learning with Free Object Segments for Long-Tailed Instance Segmentation
Cheng Zhang, Tai-Yu Pan, Tianle Chen, Jike Zhong, Wenjin Fu, Wei-Lun Chao
[pdf]
[DOI]

Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction
YuXuan Liu, Nikhil Mishra, Maximilian Sieb, Yide Shentu, Pieter Abbeel, Xi Chen
[pdf]
[DOI]

3D Random Occlusion and Multi-layer Projection for Deep Multi-Camera Pedestrian Localization
Rui Qiu, Ming Xu, Yuyao Yan, Jeremy S. Smith, Xi Yang
[pdf]
[DOI]

A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation
Wuyang Chen, Xianzhi Du, Fan Yang, Lucas Beyer, Xiaohua Zhai, Tsung-Yi Lin, Huizhong Chen, Jing Li, Xiaodan Song, Zhangyang Wang, Denny Zhou
[pdf]
[DOI]

Simple Open-Vocabulary Object Detection with Vision Transformers
Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, Neil Houlsby
[pdf]
[DOI]

A Simple Approach and Benchmark for 21,000-Category Object Detection
Yutong Lin, Chen Li, Yue Cao, Zheng Zhang, Jianfeng Wang, Lijuan Wang, Zicheng Liu, Han Hu
[pdf]
[DOI]

Knowledge Condensation Distillation
Chenxin Li, Mingbao Lin, Zhiyuan Ding, Nie Lin, Yihong Zhuang, Yue Huang, Xinghao Ding, Liujuan Cao
[pdf]
[DOI]

Reducing Information Loss for Spiking Neural Networks
Yufei Guo, Yuanpei Chen, Liwen Zhang, YingLei Wang, Xiaode Liu, Xinyi Tong, Yuanyuan Ou, Xuhui Huang, Zhe Ma
[pdf]
[DOI]

Masked Generative Distillation
Zhendong Yang, Zhe Li, Mingqi Shao, Dachuan Shi, Zehuan Yuan, Chun Yuan
[pdf]
[DOI]

Fine-Grained Data Distribution Alignment for Post-Training Quantization
Yunshan Zhong, Mingbao Lin, Mengzhao Chen, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji
[pdf]
[DOI]

Learning with Recoverable Forgetting
Jingwen Ye, Yifang Fu, Jie Song, Xingyi Yang, Songhua Liu, Xin Jin, Mingli Song, Xinchao Wang
[pdf]
[DOI]

Efficient One Pass Self-Distillation with Zipf’s Label Smoothing
Jiajun Liang, Linze Li, Zhaodong Bing, Borui Zhao, Yao Tang, Bo Lin, Haoqiang Fan
[pdf]
[DOI]

Prune Your Model before Distill It
Jinhyuk Park, Albert No
[pdf]
[DOI]

Deep Partial Updating: Towards Communication Efficient Updating for On-Device Inference
Zhongnan Qu, Cong Liu, Lothar Thiele
[pdf]
[DOI]

Patch Similarity Aware Data-Free Quantization for Vision Transformers
Zhikai Li, Liping Ma, Mengjuan Chen, Junrui Xiao, Qingyi Gu
[pdf]
[DOI]

L3: Accelerator-Friendly Lossless Image Format for High-Resolution, High-Throughput DNN Training
Jonghyun Bae, Woohyeon Baek, Tae Jun Ham, Jae W. Lee
[pdf]
[DOI]

Streaming Multiscale Deep Equilibrium Models
Can Ufuk Ertenli, Emre Akbas, Ramazan Gokberk Cinbis
[pdf]
[DOI]

Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park, Yeongsang Jang, Eunhyeok Park
[pdf]
[DOI]

SP-Net: Slowly Progressing Dynamic Inference Networks
Huanyu Wang, Wenhu Zhang, Shihao Su, Hui Wang, Zhenwei Miao, Xin Zhan, Xi Li
[pdf]
[DOI]

Equivariance and Invariance Inductive Bias for Learning from Insufficient Data
Tan Wang, Qianru Sun, Sugiri Pranata, Karlekar Jayashree, Hanwang Zhang
[pdf]
[DOI]

Mixed-Precision Neural Network Quantization via Learned Layer-Wise Importance
Chen Tang, Kai Ouyang, Zhi Wang, Yifei Zhu, Wen Ji, Yaowei Wang, Wenwu Zhu
[pdf]
[DOI]

Event Neural Networks
Matthew Dutson, Yin Li, Mohit Gupta
[pdf]
[DOI]

EdgeViTs: Competing Light-Weight CNNs on Mobile Devices with Vision Transformers
Junting Pan, Adrian Bulat, Fuwen Tan, Xiatian Zhu, Lukasz Dudziak, Hongsheng Li, Georgios Tzimiropoulos, Brais Martinez
[pdf]
[DOI]

PalQuant: Accelerating High-Precision Networks on Low-Precision Accelerators
Qinghao Hu, Gang Li, Qiman Wu, Jian Cheng
[pdf]
[DOI]

Disentangled Differentiable Network Pruning
Shangqian Gao, Feihu Huang, Yanfu Zhang, Heng Huang
[pdf]
[DOI]

IDa-Det: An Information Discrepancy-Aware Distillation for 1-Bit Detectors
Sheng Xu, Yanjing Li, Bohan Zeng, Teli Ma, Baochang Zhang, Xianbin Cao, Peng Gao, Jinhu Lü
[pdf]
[DOI]

Learning to Weight Samples for Dynamic Early-Exiting Networks
Yizeng Han, Yifan Pu, Zihang Lai, Chaofei Wang, Shiji Song, Junfeng Cao, Wenhui Huang, Chao Deng, Gao Huang
[pdf]
[DOI]

AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets
Zhijun Tu, Xinghao Chen, Pengju Ren, Yunhe Wang
[pdf]
[DOI]

Adaptive Token Sampling for Efficient Vision Transformers
Mohsen Fayyaz, Soroush Abbasi Koohpayegani, Farnoush Rezaei Jafari, Sunando Sengupta, Hamid Reza Vaezi Joze, Eric Sommerlade, Hamed Pirsiavash, Jürgen Gall
[pdf]
[DOI]

Weight Fixing Networks
Christopher Subia-Waud, Srinandan Dasmahapatra
[pdf]
[DOI]

Self-Slimmed Vision Transformer
Zhuofan Zong, Kunchang Li, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu
[pdf]
[DOI]

Switchable Online Knowledge Distillation
Biao Qian, Yang Wang, Hongzhi Yin, Richang Hong, Meng Wang
[pdf]
[DOI]

ℓ∞-Robustness and Beyond: Unleashing Efficient Adversarial Training
Hadi M. Dolatabadi, Sarah Erfani, Christopher Leckie
[pdf]
[DOI]

Multi-Granularity Pruning for Model Acceleration on Mobile Devices
Tianli Zhao, Xi Sheryl Zhang, Wentao Zhu, Jiaxing Wang, Sen Yang, Ji Liu, Jian Cheng
[pdf]
[DOI]

Deep Ensemble Learning by Diverse Knowledge Distillation for Fine-Grained Object Classification
Naoki Okamoto, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi
[pdf]
[DOI]

Helpful or Harmful: Inter-Task Association in Continual Learning
Hyundong Jin, Eunwoo Kim
[pdf]
[DOI]

Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies
Xingrun Xing, Yangguang Li, Wei Li, Wenrui Ding, Yalong Jiang, Yufeng Wang, Jing Shao, Chunlei Liu, Xianglong Liu
[pdf]
[DOI]

SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks
Chien-Yu Lin, Anish Prabhu, Thomas Merth, Sachin Mehta, Anurag Ranjan, Maxwell Horton, Mohammad Rastegari
[pdf]
[DOI]

Ensemble Knowledge Guided Sub-network Search and Fine-Tuning for Filter Pruning
Seunghyun Lee, Byung Cheol Song
[pdf]
[DOI]

Network Binarization via Contrastive Learning
Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan
[pdf]
[DOI]

Lipschitz Continuity Retained Binary Neural Network
Yuzhang Shang, Dan Xu, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
[pdf]
[DOI]

SPViT: Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning
Zhenglun Kong, Peiyan Dong, Xiaolong Ma, Xin Meng, Wei Niu, Mengshu Sun, Xuan Shen, Geng Yuan, Bin Ren, Hao Tang, Minghai Qin, Yanzhi Wang
[pdf]
[DOI]

Soft Masking for Cost-Constrained Channel Pruning
Ryan Humble, Maying Shen, Jorge Albericio Latorre, Eric Darve, Jose Alvarez
[pdf]
[DOI]

Non-uniform Step Size Quantization for Accurate Post-Training Quantization
Sangyun Oh, Hyeonuk Sim, Jounghyun Kim, Jongeun Lee
[pdf]
[DOI]

SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning
Haoran You, Baopu Li, Zhanyi Sun, Xu Ouyang, Yingyan Lin
[pdf]
[DOI]

Meta-GF: Training Dynamic-Depth Neural Networks Harmoniously
Yi Sun, Jian Li, Xin Xu
[pdf]
[DOI]

Towards Ultra Low Latency Spiking Neural Networks for Vision and Sequential Tasks Using Temporal Pruning
Sayeed Shafayet Chowdhury, Nitin Rathi, Kaushik Roy
[pdf]
[DOI]

Towards Accurate Network Quantization with Equivalent Smooth Regularizer
Kirill Solodskikh, Vladimir Chikin, Ruslan Aydarkhanov, Dehua Song, Irina Zhelavskaya, Jiansheng Wei
[pdf]
[DOI]

Explicit Model Size Control and Relaxation via Smooth Regularization for Mixed-Precision Quantization
Vladimir Chikin, Kirill Solodskikh, Irina Zhelavskaya
[pdf]
[DOI]

BASQ: Branch-Wise Activation-Clipping Search Quantization for Sub-4-Bit Neural Networks
Han-Byul Kim, Eunhyeok Park, Sungjoo Yoo
[pdf]
[DOI]

You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding
Geng Yuan, Sung-En Chang, Qing Jin, Alec Lu, Yanyu Li, Yushu Wu, Zhenglun Kong, Yanyue Xie, Peiyan Dong, Minghai Qin, Xiaolong Ma, Xulong Tang, Zhenman Fang, Yanzhi Wang
[pdf]
[DOI]

Real Spike: Learning Real-Valued Spikes for Spiking Neural Networks
Yufei Guo, Liwen Zhang, Yuanpei Chen, Xinyi Tong, Xiaode Liu, YingLei Wang, Xuhui Huang, Zhe Ma
[pdf]
[DOI]

FedLTN: Federated Learning for Sparse and Personalized Lottery Ticket Networks
Vaikkunth Mugunthan, Eric Lin, Vignesh Gokul, Christian Lau, Lalana Kagal, Steve Pieper
[pdf]
[DOI]

Theoretical Understanding of the Information Flow on Continual Learning Performance
Joshua Andle, Salimeh Yasaei Sekeh
[pdf]
[DOI]

Exploring Lottery Ticket Hypothesis in Spiking Neural Networks
Youngeun Kim, Yuhang Li, Hyoungseob Park, Yeshwanth Venkatesha, Ruokai Yin, Priyadarshini Panda
[pdf]
[DOI]

On the Angular Update and Hyperparameter Tuning of a Scale-Invariant Network
Juseung Yun, Janghyeon Lee, Hyounguk Shon, Eojindl Yi, Seung Hwan Kim, Junmo Kim
[pdf]
[DOI]

LANA: Latency Aware Network Acceleration
Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat
[pdf]
[DOI]

RDO-Q: Extremely Fine-Grained Channel-Wise Quantization via Rate-Distortion Optimization
Zhe Wang, Jie Lin, Xue Geng, Mohamed M. Sabry Aly, Vijay Chandrasekhar
[pdf]
[DOI]

U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search
Ahmet Caner Yüzügüler, Nikolaos Dimitriadis, Pascal Frossard
[pdf]
[DOI]

PTQ4ViT: Post-Training Quantization for Vision Transformers with Twin Uniform Quantization
Zhihang Yuan, Chenhao Xue, Yiqi Chen, Qiang Wu, Guangyu Sun
[pdf]
[DOI]

Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach
Jiseok Youn, Jaehun Song, Hyung-Sin Kim, Saewoong Bahk
[pdf]
[DOI]

Understanding the Dynamics of DNNs Using Graph Modularity
Yao Lu, Wen Yang, Yunzhe Zhang, Zuohui Chen, Jinyin Chen, Qi Xuan, Zhen Wang, Xiaoniu Yang
[pdf]
[DOI]

Latent Discriminant Deterministic Uncertainty
Gianni Franchi, Xuanlong Yu, Andrei Bursuc, Emanuel Aldea, Severine Dubuisson, David Filliat
[pdf]
[DOI]

Making Heads or Tails: Towards Semantically Consistent Visual Counterfactuals
Simon Vandenhende, Dhruv Mahajan, Filip Radenovic, Deepti Ghadiyaram
[pdf]
[DOI]

HIVE: Evaluating the Human Interpretability of Visual Explanations
Sunnie S. Y. Kim, Nicole Meister, Vikram V. Ramaswamy, Ruth Fong, Olga Russakovsky
[pdf]
[DOI]

BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks
Uddeshya Upadhyay, Shyamgopal Karthik, Yanbei Chen, Massimiliano Mancini, Zeynep Akata
[pdf]
[DOI]

SESS: Saliency Enhancing with Scaling and Sliding
Osman Tursun, Simon Denman, Sridha Sridharan, Clinton Fookes
[pdf]
[DOI]

No Token Left Behind: Explainability-Aided Image Classification and Generation
Roni Paiss, Hila Chefer, Lior Wolf
[pdf]
[DOI]

Interpretable Image Classification with Differentiable Prototypes Assignment
Dawid Rymarczyk, Łukasz Struski, Michał Górszczak, Koryna Lewandowska, Jacek Tabor, Bartosz Zieliński
[pdf]
[DOI]

Contributions of Shape, Texture, and Color in Visual Recognition
Yunhao Ge, Yao Xiao, Zhi Xu, Xingrui Wang, Laurent Itti
[pdf]
[DOI]

STEEX: Steering Counterfactual Explanations with Semantics
Paul Jacob, Éloi Zablocki, Hédi Ben-Younes, Mickaël Chen, Patrick Pérez, Matthieu Cord
[pdf]
[DOI]

Are Vision Transformers Robust to Patch Perturbations?
Jindong Gu, Volker Tresp, Yao Qin
[pdf]
[DOI]

A Dataset Generation Framework for Evaluating Megapixel Image Classifiers & Their Explanations
Gautam Machiraju, Sylvia Plevritis, Parag Mallick
[pdf]
[DOI]

Cartoon Explanations of Image Classifiers
Stefan Kolek, Duc Anh Nguyen, Ron Levie, Joan Bruna, Gitta Kutyniok
[pdf]
[DOI]

Shap-CAM: Visual Explanations for Convolutional Neural Networks Based on Shapley Value
Quan Zheng, Ziwei Wang, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

Privacy-Preserving Face Recognition with Learnable Privacy Budgets in Frequency Domain
Jiazhen Ji, Huan Wang, Yuge Huang, Jiaxiang Wu, Xingkun Xu, Shouhong Ding, ShengChuan Zhang, Liujuan Cao, Rongrong Ji
[pdf]
[DOI]

Contrast-Phys: Unsupervised Video-Based Remote Physiological Measurement via Spatiotemporal Contrast
Zhaodong Sun, Xiaobai Li
[pdf]
[DOI]

Source-Free Domain Adaptation with Contrastive Domain Alignment and Self-Supervised Exploration for Face Anti-Spoofing
Yuchen Liu, Yabo Chen, Wenrui Dai, Mengran Gou, Chun-Ting Huang, Hongkai Xiong
[pdf]
[DOI]

On Mitigating Hard Clusters for Face Clustering
Yingjie Chen, Huasong Zhong, Chong Chen, Chen Shen, Jianqiang Huang, Tao Wang, Yun Liang, Qianru Sun
[pdf]
[DOI]

OneFace: One Threshold for All
Jiaheng Liu, Zhipeng Yu, Haoyu Qin, Yichao Wu, Ding Liang, Gangming Zhao, Ke Xu
[pdf]
[DOI]

Label2Label: A Language Modeling Framework for Multi-Attribute Learning
Wanhua Li, Zhexuan Cao, Jianjiang Feng, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

AgeTransGAN for Facial Age Transformation with Rectified Performance Metrics
Gee-Sern Hsu, Rui-Cang Xie, Zhi-Ting Chen, Yu-Hong Lin
[pdf]
[DOI]

Hierarchical Contrastive Inconsistency Learning for Deepfake Video Detection
Zhihao Gu, Taiping Yao, Yang Chen, Shouhong Ding, Lizhuang Ma
[pdf]
[DOI]

Rethinking Robust Representation Learning under Fine-Grained Noisy Faces
Bingqi Ma, Guanglu Song, Boxiao Liu, Yu Liu
[pdf]
[DOI]

Teaching Where to Look: Attention Similarity Knowledge Distillation for Low Resolution Face Recognition
Sungho Shin, Joosoon Lee, Junseok Lee, Yeonguk Yu, Kyoobin Lee
[pdf]
[DOI]

Teaching with Soft Label Smoothing for Mitigating Noisy Labels in Facial Expressions
Tohar Lukov, Na Zhao, Gim Hee Lee, Ser-Nam Lim
[pdf]
[DOI]

Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis
Shuai Shen, Wanhua Li, Zheng Zhu, Yueqi Duan, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

CoupleFace: Relation Matters for Face Recognition Distillation
Jiaheng Liu, Haoyu Qin, Yichao Wu, Jinyang Guo, Ding Liang, Ke Xu
[pdf]
[DOI]

Controllable and Guided Face Synthesis for Unconstrained Face Recognition
Feng Liu, Minchul Kim, Anil Jain, Xiaoming Liu
[pdf]
[DOI]

Towards Robust Face Recognition with Comprehensive Search
Manyuan Zhang, Guanglu Song, Yu Liu, Hongsheng Li
[pdf]
[DOI]

Towards Unbiased Label Distribution Learning for Facial Pose Estimation Using Anisotropic Spherical Gaussian
Zhiwen Cao, Dongfang Liu, Qifan Wang, Yingjie Chen
[pdf]
[DOI]

AU-Aware 3D Face Reconstruction through Personalized AU-Specific Blendshape Learning
Chenyi Kuang, Zijun Cui, Jeffrey O. Kephart, Qiang Ji
[pdf]
[DOI]

BézierPalm: A Free Lunch for Palmprint Recognition
Kai Zhao, Lei Shen, Yingyi Zhang, Chuhan Zhou, Tao Wang, Ruixin Zhang, Shouhong Ding, Wei Jia, Wei Shen
[pdf]
[DOI]

Adaptive Transformers for Robust Few-Shot Cross-Domain Face Anti-Spoofing
Hsin-Ping Huang, Deqing Sun, Yaojie Liu, Wen-Sheng Chu, Taihong Xiao, Jinwei Yuan, Hartwig Adam, Ming-Hsuan Yang
[pdf]
[DOI]

Face2Faceρ: Real-Time High-Resolution One-Shot Face Reenactment
Kewei Yang, Kang Chen, Daoliang Guo, Song-Hai Zhang, Yuan-Chen Guo, Weidong Zhang
[pdf]
[DOI]

Towards Racially Unbiased Skin Tone Estimation via Scene Disambiguation
Haiwen Feng, Timo Bolkart, Joachim Tesch, Michael J. Black, Victoria Abrevaya
[pdf]
[DOI]

BoundaryFace: A Mining Framework with Noise Label Self-Correction for Face Recognition
Shijie Wu, Xun Gong
[pdf]
[DOI]

Pre-training Strategies and Datasets for Facial Representation Learning
Adrian Bulat, Shiyang Cheng, Jing Yang, Andrew Garbett, Enrique Sanchez, Georgios Tzimiropoulos
[pdf]
[DOI]

Look Both Ways: Self-Supervising Driver Gaze Estimation and Road Scene Saliency
Isaac Kasahara, Simon Stent, Hyun Soo Park
[pdf]
[DOI]

MFIM: Megapixel Facial Identity Manipulation
Sanghyeon Na
[pdf]
[DOI]

3D Face Reconstruction with Dense Landmarks
Erroll Wood, Tadas Baltrušaitis, Charlie Hewitt, Matthew Johnson, Jingjing Shen, Nikola Milosavljević, Daniel Wilde, Stephan Garbin, Toby Sharp, Ivan Stojiljković, Tom Cashman, Julien Valentin
[pdf]
[DOI]

Emotion-Aware Multi-View Contrastive Learning for Facial Emotion Recognition
Daeha Kim, Byung Cheol Song
[pdf]
[DOI]

Order Learning Using Partially Ordered Data via Chainization
Seon-Ho Lee, Chang-Su Kim
[pdf]
[DOI]

Unsupervised High-Fidelity Facial Texture Generation and Reconstruction
Ron Slossberg, Ibrahim Jubran, Ron Kimmel
[pdf]
[DOI]

Multi-Domain Learning for Updating Face Anti-Spoofing Models
Xiao Guo, Yaojie Liu, Anil Jain, Xiaoming Liu
[pdf]
[DOI]

Towards Metrical Reconstruction of Human Faces
Wojciech Zielonka, Timo Bolkart, Justus Thies
[pdf]
[DOI]

Discover and Mitigate Unknown Biases with Debiasing Alternate Networks
Zhiheng Li, Anthony Hoogs, Chenliang Xu
[pdf]
[DOI]

Unsupervised and Semi-Supervised Bias Benchmarking in Face Recognition
Alexandra Chouldechova, Siqi Deng, Yongxin Wang, Wei Xia, Pietro Perona
[pdf]
[DOI]

Towards Efficient Adversarial Training on Vision Transformers
Boxi Wu, Jindong Gu, Zhifeng Li, Deng Cai, Xiaofei He, Wei Liu
[pdf]
[DOI]

MIME: Minority Inclusion for Majority Group Enhancement of AI Performance
Pradyumna Chari, Yunhao Ba, Shreeram Athreya, Achuta Kadambi
[pdf]
[DOI]

Studying Bias in GANs through the Lens of Race
Vongani H. Maluleke, Neerja Thakkar, Tim Brooks, Ethan Weber, Trevor Darrell, Alexei A. Efros, Angjoo Kanazawa, Devin Guillory
[pdf]
[DOI]

Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness
Ailin Deng, Shen Li, Miao Xiong, Zhirui Chen, Bryan Hooi
[pdf]
[DOI]

Learning to Censor by Noisy Sampling
Ayush Chopra, Abhinav Java, Abhishek Singh, Vivek Sharma, Ramesh Raskar
[pdf]
[DOI]

An Invisible Black-Box Backdoor Attack through Frequency Domain
Tong Wang, Yuan Yao, Feng Xu, Shengwei An, Hanghang Tong, Ting Wang
[pdf]
[DOI]

FairGRAPE: Fairness-Aware GRAdient Pruning mEthod for Face Attribute Classification
Xiaofeng Lin, Seungbae Kim, Jungseock Joo
[pdf]
[DOI]

Attaining Class-Level Forgetting in Pretrained Model Using Few Samples
Pravendra Singh, Pratik Mazumder, Mohammed Asad Karim
[pdf]
[DOI]

Anti-Neuron Watermarking: Protecting Personal Data against Unauthorized Neural Networks
Zihang Zou, Boqing Gong, Liqiang Wang
[pdf]
[DOI]

An Impartial Take to the CNN vs Transformer Robustness Contest
Francesco Pinto, Philip H. S. Torr, Puneet K. Dokania
[pdf]
[DOI]

Recover Fair Deep Classification Models via Altering Pre-trained Structure
Yanfu Zhang, Shangqian Gao, Heng Huang
[pdf]
[DOI]

Decouple-and-Sample: Protecting Sensitive Information in Task Agnostic Data Release
Abhishek Singh, Ethan Garza, Ayush Chopra, Praneeth Vepakomma, Vivek Sharma, Ramesh Raskar
[pdf]
[DOI]

Privacy-Preserving Action Recognition via Motion Difference Quantization
Sudhakar Kumawat, Hajime Nagahara
[pdf]
[DOI]

Latent Space Smoothing for Individually Fair Representations
Momchil Peychev, Anian Ruoss, Mislav Balunović, Maximilian Baader, Martin Vechev
[pdf]
[DOI]

Parameterized Temperature Scaling for Boosting the Expressive Power in Post-Hoc Uncertainty Calibration
Christian Tomani, Daniel Cremers, Florian Buettner
[pdf]
[DOI]

FairStyle: Debiasing StyleGAN2 with Style Channel Manipulations
Cemre Efe Karakas, Alara Dirik, Eylül Yalçınkaya, Pinar Yanardag
[pdf]
[DOI]

Distilling the Undistillable: Learning from a Nasty Teacher
Surgan Jandial, Yash Khasbage, Arghya Pal, Vineeth N Balasubramanian, Balaji Krishnamurthy
[pdf]
[DOI]

SOS! Self-Supervised Learning over Sets of Handled Objects in Egocentric Action Recognition
Victor Escorcia, Ricardo Guerrero, Xiatian Zhu, Brais Martinez
[pdf]
[DOI]

Egocentric Activity Recognition and Localization on a 3D Map
Miao Liu, Lingni Ma, Kiran Somasundaram, Yin Li, Kristen Grauman, James M. Rehg, Chao Li
[pdf]
[DOI]

Generative Adversarial Network for Future Hand Segmentation from Egocentric Video
Wenqi Jia, Miao Liu, James M. Rehg
[pdf]
[DOI]

My View Is the Best View: Procedure Learning from Egocentric Videos
Siddhant Bansal, Chetan Arora, C.V. Jawahar
[pdf]
[DOI]

GIMO: Gaze-Informed Human Motion Prediction in Context
Yang Zheng, Yanchao Yang, Kaichun Mo, Jiaman Li, Tao Yu, Yebin Liu, Karen Liu, Leonidas J. Guibas
[pdf]
[DOI]

Image-Based CLIP-Guided Essence Transfer
Hila Chefer, Sagie Benaim, Roni Paiss, Lior Wolf
[pdf]
[DOI]

Detecting and Recovering Sequential DeepFake Manipulation
Rui Shao, Tianxing Wu, Ziwei Liu
[pdf]
[DOI]

Self-Supervised Sparse Representation for Video Anomaly Detection
Jhih-Ciang Wu, He-Yen Hsieh, Ding-Jie Chen, Chiou-Shann Fuh, Tyng-Luh Liu
[pdf]
[DOI]

Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal
Xinwei Liu, Jian Liu, Yang Bai, Jindong Gu, Tao Chen, Xiaojun Jia, Xiaochun Cao
[pdf]
[DOI]

Explaining Deepfake Detection by Analysing Image Matching
Shichao Dong, Jin Wang, Jiajun Liang, Haoqiang Fan, Renhe Ji
[pdf]
[DOI]

FrequencyLowCut Pooling – Plug & Play against Catastrophic Overfitting
Julia Grabinski, Steffen Jung, Janis Keuper, Margret Keuper
[pdf]
[DOI]

TAFIM: Targeted Adversarial Attacks against Facial Image Manipulations
Shivangi Aneja, Lev Markhasin, Matthias Nießner
[pdf]
[DOI]

FingerprintNet: Synthesized Fingerprints for Generated Image Detection
Yonghyun Jeong, Doyeon Kim, Youngmin Ro, Pyounggeon Kim, Jongwon Choi
[pdf]
[DOI]

Detecting Generated Images by Real Images
Bo Liu, Fan Yang, Xiuli Bi, Bin Xiao, Weisheng Li, Xinbo Gao
[pdf]
[DOI]

An Information Theoretic Approach for Attention-Driven Face Forgery Detection
Ke Sun, Hong Liu, Taiping Yao, Xiaoshuai Sun, Shen Chen, Shouhong Ding, Rongrong Ji
[pdf]
[DOI]

Exploring Disentangled Content Information for Face Forgery Detection
Jiahao Liang, Huafeng Shi, Weihong Deng
[pdf]
[DOI]

RepMix: Representation Mixing for Robust Attribution of Synthesized Images
Tu Bui, Ning Yu, John Collomosse
[pdf]
[DOI]

Totems: Physical Objects for Verifying Visual Integrity
Jingwei Ma, Lucy Chai, Minyoung Huh, Tongzhou Wang, Ser-Nam Lim, Phillip Isola, Antonio Torralba
[pdf]
[DOI]

Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval
Pandeng Li, Hongtao Xie, Jiannan Ge, Lei Zhang, Shaobo Min, Yongdong Zhang
[pdf]
[DOI]

PASS: Part-Aware Self-Supervised Pre-training for Person Re-identification
Kuan Zhu, Haiyun Guo, Tianyi Yan, Yousong Zhu, Jinqiao Wang, Ming Tang
[pdf]
[DOI]

Adaptive Cross-Domain Learning for Generalizable Person Re-identification
Pengyi Zhang, Huanzhang Dou, Yunlong Yu, Xi Li
[pdf]
[DOI]

Multi-Query Video Retrieval
Zeyu Wang, Yu Wu, Karthik Narasimhan, Olga Russakovsky
[pdf]
[DOI]

Hierarchical Average Precision Training for Pertinent Image Retrieval
Elias Ramzi, Nicolas Audebert, Nicolas Thome, Clément Rambour, Xavier Bitot
[pdf]
[DOI]

Learning Semantic Correspondence with Sparse Annotations
Shuaiyi Huang, Luyu Yang, Bo He, Songyang Zhang, Xuming He, Abhinav Shrivastava
[pdf]
[DOI]

Dynamically Transformed Instance Normalization Network for Generalizable Person Re-identification
Bingliang Jiao, Lingqiao Liu, Liying Gao, Guosheng Lin, Lu Yang, Shizhou Zhang, Peng Wang, Yanning Zhang
[pdf]
[DOI]

Domain Adaptive Person Search
Junjie Li, Yichao Yan, Guanshuo Wang, Fufu Yu, Qiong Jia, Shouhong Ding
[pdf]
[DOI]

TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval
Yuqi Liu, Pengfei Xiong, Luhui Xu, Shengming Cao, Qin Jin
[pdf]
[DOI]

Unstructured Feature Decoupling for Vehicle Re-identification
Wen Qian, Hao Luo, Silong Peng, Fan Wang, Chen Chen, Hao Li
[pdf]
[DOI]

Deep Hash Distillation for Image Retrieval
Young Kyun Jang, Geonmo Gu, Byungsoo Ko, Isaac Kang, Nam Ik Cho
[pdf]
[DOI]

Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification
Boqiang Xu, Jian Liang, Lingxiao He, Zhenan Sun
[pdf]
[DOI]

Granularity-Aware Adaptation for Image Retrieval over Multiple Tasks
Jon Almazán, Byungsoo Ko, Geonmo Gu, Diane Larlus, Yannis Kalantidis
[pdf]
[DOI]

Learning Audio-Video Modalities from Image Captions
Arsha Nagrani, Paul Hongsuck Seo, Bryan Seybold, Anja Hauth, Santiago Manen, Chen Sun, Cordelia Schmid
[pdf]
[DOI]

RVSL: Robust Vehicle Similarity Learning in Real Hazy Scenes Based on Semi-Supervised Learning
Wei-Ting Chen, I-Hsiang Chen, Chih-Yuan Yeh, Hao-Hsiang Yang, Hua-En Chang, Jian-Jiun Ding, Sy-Yen Kuo
[pdf]
[DOI]

Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval
Fan Hu, Aozhu Chen, Ziyue Wang, Fangming Zhou, Jianfeng Dong, Xirong Li
[pdf]
[DOI]

Modality Synergy Complement Learning with Cascaded Aggregation for Visible-Infrared Person Re-identification
Yiyuan Zhang, Sanyuan Zhao, Yuhao Kang, Jianbing Shen
[pdf]
[DOI]

Cross-Modality Transformer for Visible-Infrared Person Re-identification
Kongzhu Jiang, Tianzhu Zhang, Xiang Liu, Bingqiao Qian, Yongdong Zhang, Feng Wu
[pdf]
[DOI]

Audio-Visual Mismatch-Aware Video Retrieval via Association and Adjustment
Sangmin Lee, Sungjune Park, Yong Man Ro
[pdf]
[DOI]

Connecting Compression Spaces with Transformer for Approximate Nearest Neighbor Search
Haokui Zhang, Buzhou Tang, Wenze Hu, Xiaoyu Wang
[pdf]
[DOI]

SEMICON: A Learning-to-Hash Solution for Large-Scale Fine-Grained Image Retrieval
Yang Shen, Xuhao Sun, Xiu-Shen Wei, Qing-Yuan Jiang, Jian Yang
[pdf]
[DOI]

CAViT: Contextual Alignment Vision Transformer for Video Object Re-identification
Jinlin Wu, Lingxiao He, Wu Liu, Yang Yang, Zhen Lei, Tao Mei, Stan Z. Li
[pdf]
[DOI]

Text-Based Temporal Localization of Novel Events
Sudipta Paul, Niluthpol Chowdhury Mithun, Amit K. Roy-Chowdhury
[pdf]
[DOI]

Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval
Zhaopeng Dou, Zhongdao Wang, Weihua Chen, Yali Li, Shengjin Wang
[pdf]
[DOI]

Relighting4D: Neural Relightable Human from Videos
Zhaoxi Chen, Ziwei Liu
[pdf]
[DOI]

Real-Time Intermediate Flow Estimation for Video Frame Interpolation
Zhewei Huang, Tianyuan Zhang, Wen Heng, Boxin Shi, Shuchang Zhou
[pdf]
[DOI]

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation
Jing He, Yiyi Zhou, Qi Zhang, Jun Peng, Yunhang Shen, Xiaoshuai Sun, Chao Chen, Rongrong Ji
[pdf]
[DOI]

StyleSwap: Style-Based Generator Empowers Robust Face Swapping
Zhiliang Xu, Hang Zhou, Zhibin Hong, Ziwei Liu, Jiaming Liu, Zhizhi Guo, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang
[pdf]
[DOI]

Paint2Pix: Interactive Painting Based Progressive Image Synthesis and Editing
Jaskirat Singh, Liang Zheng, Cameron Smith, Jose Echevarria
[pdf]
[DOI]

FurryGAN: High Quality Foreground-Aware Image Synthesis
Jeongmin Bae, Mingi Kwon, Youngjung Uh
[pdf]
[DOI]

SCAM! Transferring Humans between Images with Semantic Cross Attention Modulation
Nicolas Dufour, David Picard, Vicky Kalogeiton
[pdf]
[DOI]

Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields
Yuedong Chen, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
[pdf]
[DOI]

WaveGAN: Frequency-Aware GAN for High-Fidelity Few-Shot Image Generation
Mengping Yang, Zhe Wang, Ziqiu Chi, Wenyi Feng
[pdf]
[DOI]

End-to-End Visual Editing with a Generatively Pre-trained Artist
Andrew Brown, Cheng-Yang Fu, Omkar Parkhi, Tamara L. Berg, Andrea Vedaldi
[pdf]
[DOI]

High-Fidelity GAN Inversion with Padding Space
Qingyan Bai, Yinghao Xu, Jiapeng Zhu, Weihao Xia, Yujiu Yang, Yujun Shen
[pdf]
[DOI]

Designing One Unified Framework for High-Fidelity Face Reenactment and Swapping
Chao Xu, Jiangning Zhang, Yue Han, Guanzhong Tian, Xianfang Zeng, Ying Tai, Yabiao Wang, Chengjie Wang, Yong Liu
[pdf]
[DOI]

Sobolev Training for Implicit Neural Representations with Approximated Image Derivatives
Wentao Yuan, Qingtian Zhu, Xiangyue Liu, Yikang Ding, Haotian Zhang, Chi Zhang
[pdf]
[DOI]

Make-a-Scene: Scene-Based Text-to-Image Generation with Human Priors
Oran Gafni, Adam Polyak, Oron Ashual, Shelly Sheynin, Devi Parikh, Yaniv Taigman
[pdf]
[DOI]

3D-FM GAN: Towards 3D-Controllable Face Manipulation
Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Richard Zhang, S.Y. Kung
[pdf]
[DOI]

Multi-Curve Translator for High-Resolution Photorealistic Image Translation
Yuda Song, Hui Qian, Xin Du
[pdf]
[DOI]

Deep Bayesian Video Frame Interpolation
Zhiyang Yu, Yu Zhang, Xujie Xiang, Dongqing Zou, Xijun Chen, Jimmy S. Ren
[pdf]
[DOI]

Cross Attention Based Style Distribution for Controllable Person Image Synthesis
Xinyue Zhou, Mingyu Yin, Xinyuan Chen, Li Sun, Changxin Gao, Qingli Li
[pdf]
[DOI]

KeypointNeRF: Generalizing Image-Based Volumetric Avatars Using Relative Spatial Encoding of Keypoints
Marko Mihajlovic, Aayush Bansal, Michael Zollhöfer, Siyu Tang, Shunsuke Saito
[pdf]
[DOI]

ViewFormer: NeRF-Free Neural Rendering from Few Images Using Transformers
Jonáš Kulhánek, Erik Derner, Torsten Sattler, Robert Babuška
[pdf]
[DOI]

L-Tracing: Fast Light Visibility Estimation on Neural Surfaces by Sphere Tracing
Ziyu Chen, Chenjing Ding, Jianfei Guo, Dongliang Wang, Yikang Li, Xuan Xiao, Wei Wu, Li Song
[pdf]
[DOI]

A Perceptual Quality Metric for Video Frame Interpolation
Qiqi Hou, Abhijay Ghildyal, Feng Liu
[pdf]
[DOI]

Adaptive Feature Interpolation for Low-Shot Image Generation
Mengyu Dai, Haibin Hang, Xiaoyang Guo
[pdf]
[DOI]

PalGAN: Image Colorization with Palette Generative Adversarial Networks
Yi Wang, Menghan Xia, Lu Qi, Jing Shao, Yu Qiao
[pdf]
[DOI]

Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis
Long Zhuo, Guangcong Wang, Shikai Li, Wayne Wu, Ziwei Liu
[pdf]
[DOI]

Learning Prior Feature and Attention Enhanced Image Inpainting
Chenjie Cao, Qiaole Dong, Yanwei Fu
[pdf]
[DOI]

Temporal-MPI: Enabling Multi-Plane Images for Dynamic Scene Modelling via Temporal Basis Learning
Wenpeng Xing, Jie Chen
[pdf]
[DOI]

3D-Aware Semantic-Guided Generative Model for Human Synthesis
Jichao Zhang, Enver Sangineto, Hao Tang, Aliaksandr Siarohin, Zhun Zhong, Nicu Sebe, Wei Wang
[pdf]
[DOI]

Temporally Consistent Semantic Video Editing
Yiran Xu, Badour AlBahar, Jia-Bin Huang
[pdf]
[DOI]

Error Compensation Framework for Flow-Guided Video Inpainting
Jaeyeon Kang, Seoung Wug Oh, Seon Joo Kim
[pdf]
[DOI]

Scraping Textures from Natural Images for Synthesis and Editing
Xueting Li, Xiaolong Wang, Ming-Hsuan Yang, Alexei A. Efros, Sifei Liu
[pdf]
[DOI]

Single Stage Virtual Try-On via Deformable Attention Flows
Shuai Bai, Huiling Zhou, Zhikang Li, Chang Zhou, Hongxia Yang
[pdf]
[DOI]

Improving GANs for Long-Tailed Data through Group Spectral Regularization
Harsh Rangwani, Naman Jaswani, Tejan Karmali, Varun Jampani, R. Venkatesh Babu
[pdf]
[DOI]

Hierarchical Semantic Regularization of Latent Spaces in StyleGANs
Tejan Karmali, Rishubh Parihar, Susmit Agrawal, Harsh Rangwani, Varun Jampani, Maneesh Singh, R. Venkatesh Babu
[pdf]
[DOI]

IntereStyle: Encoding an Interest Region for Robust StyleGAN Inversion
Seung-Jun Moon, Gyeong-Moon Park
[pdf]
[DOI]

StyleLight: HDR Panorama Generation for Lighting Estimation and Editing
Guangcong Wang, Yinuo Yang, Chen Change Loy, Ziwei Liu
[pdf]
[DOI]

Contrastive Monotonic Pixel-Level Modulation
Kun Lu, Rongpeng Li, Honggang Zhang
[pdf]
[DOI]

Learning Cross-Video Neural Representations for High-Quality Frame Interpolation
Wentao Shangguan, Yu Sun, Weijie Gan, Ulugbek S. Kamilov
[pdf]
[DOI]

Learning Continuous Implicit Representation for Near-Periodic Patterns
Bowei Chen, Tiancheng Zhi, Martial Hebert, Srinivasa G. Narasimhan
[pdf]
[DOI]

End-to-End Graph-Constrained Vectorized Floorplan Generation with Panoptic Refinement
Jiachen Liu, Yuan Xue, Jose Duarte, Krishnendra Shekhawat, Zihan Zhou, Xiaolei Huang
[pdf]
[DOI]

Few-Shot Image Generation with Mixup-Based Distance Learning
Chaerin Kong, Jeesoo Kim, Donghoon Han, Nojun Kwak
[pdf]
[DOI]

A Style-Based GAN Encoder for High Fidelity Reconstruction of Images and Videos
Xu Yao, Alasdair Newson, Yann Gousseau, Pierre Hellier
[pdf]
[DOI]

FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs
Ziqiang Li, Chaoyue Wang, Heliang Zheng, Jing Zhang, Bin Li
[pdf]
[DOI]

BlobGAN: Spatially Disentangled Scene Representations
Dave Epstein, Taesung Park, Richard Zhang, Eli Shechtman, Alexei A. Efros
[pdf]
[DOI]

Unified Implicit Neural Stylization
Zhiwen Fan, Yifan Jiang, Peihao Wang, Xinyu Gong, Dejia Xu, Zhangyang Wang
[pdf]
[DOI]

GAN with Multivariate Disentangling for Controllable Hair Editing
Xuyang Guo, Meina Kan, Tianle Chen, Shiguang Shan
[pdf]
[DOI]

Discovering Transferable Forensic Features for CNN-Generated Images Detection
Keshigeyan Chandrasegaran, Ngoc-Trung Tran, Alexander Binder, Ngai-Man Cheung
[pdf]
[DOI]

Harmonizer: Learning to Perform White-Box Image and Video Harmonization
Zhanghan Ke, Chunyi Sun, Lei Zhu, Ke Xu, Rynson W.H. Lau
[pdf]
[DOI]

Text2LIVE: Text-Driven Layered Image and Video Editing
Omer Bar-Tal, Dolev Ofri-Amar, Rafail Fridman, Yoni Kasten, Tali Dekel
[pdf]
[DOI]

Digging into Radiance Grid for Real-Time View Synthesis with Detail Preservation
Jian Zhang, Jinchi Huang, Bowen Cai, Huan Fu, Mingming Gong, Chaohui Wang, Jiaming Wang, Hongchen Luo, Rongfei Jia, Binqiang Zhao, Xing Tang
[pdf]
[DOI]

StyleGAN-Human: A Data-Centric Odyssey of Human Generation
Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen Change Loy, Wayne Wu, Ziwei Liu
[pdf]
[DOI]

ColorFormer: Image Colorization via Color Memory Assisted Hybrid-Attention Transformer
Xiaozhong Ji, Boyuan Jiang, Donghao Luo, Guangpin Tao, Wenqing Chu, Zhifeng Xie, Chengjie Wang, Ying Tai
[pdf]
[DOI]

EAGAN: Efficient Two-Stage Evolutionary Architecture Search for GANs
Guohao Ying, Xin He, Bin Gao, Bo Han, Xiaowen Chu
[pdf]
[DOI]

Weakly-Supervised Stitching Network for Real-World Panoramic Image Generation
Dae-Young Song, Geonsoo Lee, HeeKyung Lee, Gi-Mun Um, Donghyeon Cho
[pdf]
[DOI]

DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation
Songhua Liu, Jingwen Ye, Sucheng Ren, Xinchao Wang
[pdf]
[DOI]

Multimodal Conditional Image Synthesis with Product-of-Experts GANs
Xun Huang, Arun Mallya, Ting-Chun Wang, Ming-Yu Liu
[pdf]
[DOI]

Auto-Regressive Image Synthesis with Integrated Quantization
Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Kaiwen Cui, Changgong Zhang, Shijian Lu
[pdf]
[DOI]

JoJoGAN: One Shot Face Stylization
Min Jin Chong, David Forsyth
[pdf]
[DOI]

VecGAN: Image-to-Image Translation with Interpretable Latent Directions
Yusuf Dalva, Said Fahri Altındiş, Aysegul Dundar
[pdf]
[DOI]

Any-Resolution Training for High-Resolution Image Synthesis
Lucy Chai, Michaël Gharbi, Eli Shechtman, Phillip Isola, Richard Zhang
[pdf]
[DOI]

CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer
Zijie Wu, Zhen Zhu, Junping Du, Xiang Bai
[pdf]
[DOI]

CANF-VC: Conditional Augmented Normalizing Flows for Video Compression
Yung-Han Ho, Chih-Peng Chang, Peng-Yu Chen, Alessandro Gnutti, Wen-Hsiao Peng
[pdf]
[DOI]

Bi-Level Feature Alignment for Versatile Image Translation and Manipulation
Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Kaiwen Cui, Aoran Xiao, Shijian Lu, Chunyan Miao
[pdf]
[DOI]

High-Fidelity Image Inpainting with GAN Inversion
Yongsheng Yu, Libo Zhang, Heng Fan, Tiejian Luo
[pdf]
[DOI]

DeltaGAN: Towards Diverse Few-Shot Image Generation with Sample-Specific Delta
Yan Hong, Li Niu, Jianfu Zhang, Liqing Zhang
[pdf]
[DOI]

Image Inpainting with Cascaded Modulation GAN and Object-Aware Training
Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Ning Xu, Sohrab Amirghodsi, Jiebo Luo
[pdf]
[DOI]

StyleFace: Towards Identity-Disentangled Face Generation on Megapixels
Yuchen Luo, Junwei Zhu, Keke He, Wenqing Chu, Ying Tai, Chengjie Wang, Junchi Yan
[pdf]
[DOI]

Video Extrapolation in Space and Time
Yunzhi Zhang, Jiajun Wu
[pdf]
[DOI]

Contrastive Learning for Diverse Disentangled Foreground Generation
Yuheng Li, Yijun Li, Jingwan Lu, Eli Shechtman, Yong Jae Lee, Krishna Kumar Singh
[pdf]
[DOI]

BIPS: Bi-modal Indoor Panorama Synthesis via Residual Depth-Aided Adversarial Learning
Changgyoon Oh, Wonjune Cho, Yujeong Chae, Daehee Park, Lin Wang, Kuk-Jin Yoon
[pdf]
[DOI]

Augmentation of rPPG Benchmark Datasets: Learning to Remove and Embed rPPG Signals via Double Cycle Consistent Learning from Unpaired Facial Videos
Cheng-Ju Hsieh, Wei-Hao Chung, Chiou-Ting Hsu
[pdf]
[DOI]

Geometry-Aware Single-Image Full-Body Human Relighting
Chaonan Ji, Tao Yu, Kaiwen Guo, Jingxin Liu, Yebin Liu
[pdf]
[DOI]

3D-Aware Indoor Scene Synthesis with Depth Priors
Zifan Shi, Yujun Shen, Jiapeng Zhu, Dit-Yan Yeung, Qifeng Chen
[pdf]
[DOI]

Deep Portrait Delighting
Joshua Weir, Junhong Zhao, Andrew Chalmers, Taehyun Rhee
[pdf]
[DOI]

Vector Quantized Image-to-Image Translation
Yu-Jie Chen, Shin-I Cheng, Wei-Chen Chiu, Hung-Yu Tseng, Hsin-Ying Lee
[pdf]
[DOI]

The Surprisingly Straightforward Scene Text Removal Method with Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis
Hyeonsu Lee, Chankyu Choi
[pdf]
[DOI]

Free-Viewpoint RGB-D Human Performance Capture and Rendering
Phong Nguyen-Ha, Nikolaos Sarafianos, Christoph Lassner, Janne Heikkilä, Tony Tung
[pdf]
[DOI]

Multiview Regenerative Morphing with Dual Flows
Chih-Jung Tsai, Cheng Sun, Hwann-Tzong Chen
[pdf]
[DOI]

Hallucinating Pose-Compatible Scenes
Tim Brooks, Alexei A. Efros
[pdf]
[DOI]

Motion and Appearance Adaptation for Cross-Domain Motion Transfer
Borun Xu, Biao Wang, Jinhong Deng, Jiale Tao, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan
[pdf]
[DOI]

Layered Controllable Video Generation
Jiahui Huang, Yuhe Jin, Kwang Moo Yi, Leonid Sigal
[pdf]
[DOI]

Custom Structure Preservation in Face Aging
Guillermo Gomez-Trenado, Stéphane Lathuilière, Pablo Mesejo, Óscar Cordón
[pdf]
[DOI]

Spatio-Temporal Deformable Attention Network for Video Deblurring
Huicong Zhang, Haozhe Xie, Hongxun Yao
[pdf]
[DOI]

NeuMesh: Learning Disentangled Neural Mesh-Based Implicit Field for Geometry and Texture Editing
Bangbang Yang, Chong Bao, Junyi Zeng, Hujun Bao, Yinda Zhang, Zhaopeng Cui, Guofeng Zhang
[pdf]
[DOI]

NeRF for Outdoor Scene Relighting
Viktor Rudnev, Mohamed Elgharib, William Smith, Lingjie Liu, Vladislav Golyanik, Christian Theobalt
[pdf]
[DOI]

CoGS: Controllable Generation and Search from Sketch and Style
Cusuh Ham, Gemma Canet Tarrés, Tu Bui, James Hays, Zhe Lin, John Collomosse
[pdf]
[DOI]

HairNet: Hairstyle Transfer with Pose Changes
Peihao Zhu, Rameen Abdal, John Femiani, Peter Wonka
[pdf]
[DOI]

Unbiased Multi-Modality Guidance for Image Inpainting
Yongsheng Yu, Dawei Du, Libo Zhang, Tiejian Luo
[pdf]
[DOI]

Intelli-Paint: Towards Developing More Human-Intelligible Painting Agents
Jaskirat Singh, Cameron Smith, Jose Echevarria, Liang Zheng
[pdf]
[DOI]

Motion Transformer for Unsupervised Image Animation
Jiale Tao, Biao Wang, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan
[pdf]
[DOI]

NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Chenfei Wu, Jian Liang, Lei Ji, Fan Yang, Yuejian Fang, Daxin Jiang, Nan Duan
[pdf]
[DOI]

EleGANt: Exquisite and Locally Editable GAN for Makeup Transfer
Chenyu Yang, Wanrong He, Yingqing Xu, Yang Gao
[pdf]
[DOI]

Editing Out-of-Domain GAN Inversion via Differential Activations
Haorui Song, Yong Du, Tianyi Xiang, Junyu Dong, Jing Qin, Shengfeng He
[pdf]
[DOI]

On the Robustness of Quality Measures for GANs
Motasem Alfarra, Juan C. Pérez, Anna Frühstück, Philip H. S. Torr, Peter Wonka, Bernard Ghanem
[pdf]
[DOI]

Sound-Guided Semantic Video Generation
Seung Hyun Lee, Gyeongrok Oh, Wonmin Byeon, Chanyoung Kim, Won Jeong Ryoo, Sang Ho Yoon, Hyunjun Cho, Jihyun Bae, Jinkyu Kim, Sangpil Kim
[pdf]
[DOI]

Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation
Lingzhi Zhang, Connelly Barnes, Kevin Wampler, Sohrab Amirghodsi, Eli Shechtman, Zhe Lin, Jianbo Shi
[pdf]
[DOI]

Controllable Video Generation through Global and Local Motion Dynamics
Aram Davtyan, Paolo Favaro
[pdf]
[DOI]

StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN
Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang
[pdf]
[DOI]

Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
Songwei Ge, Thomas Hayes, Harry Yang, Xi Yin, Guan Pang, David Jacobs, Jia-Bin Huang, Devi Parikh
[pdf]
[DOI]

Combining Internal and External Constraints for Unrolling Shutter in Videos
Eyal Naor, Itai Antebi, Shai Bagon, Michal Irani
[pdf]
[DOI]

WISE: Whitebox Image Stylization by Example-Based Learning
Winfried Lötzsch, Max Reimann, Martin Büssemeyer, Amir Semmo, Jürgen Döllner, Matthias Trapp
[pdf]
[DOI]

Neural Radiance Transfer Fields for Relightable Novel-View Synthesis with Global Illumination
Linjie Lyu, Ayush Tewari, Thomas Leimkühler, Marc Habermann, Christian Theobalt
[pdf]
[DOI]

Transformers As Meta-Learners for Implicit Neural Representations
Yinbo Chen, Xiaolong Wang
[pdf]
[DOI]

Style Your Hair: Latent Optimization for Pose-Invariant Hairstyle Transfer via Local-Style-Aware Hair Alignment
Taewoo Kim, Chaeyeon Chung, Yoonseo Kim, Sunghyun Park, Kangyeol Kim, Jaegul Choo
[pdf]
[DOI]

High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions
Sangyun Lee, Gyojung Gu, Sunghyun Park, Seunghwan Choi, Jaegul Choo
[pdf]
[DOI]

A Codec Information Assisted Framework for Efficient Compressed Video Super-Resolution
Hengsheng Zhang, Xueyi Zou, Jiaming Guo, Youliang Yan, Rong Xie, Li Song
[pdf]
[DOI]

Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis
Jeong-gi Kwak, Yuanming Li, Dongsik Yoon, Donghyeon Kim, David Han, Hanseok Ko
[pdf]
[DOI]

AdaNeRF: Adaptive Sampling for Real-Time Rendering of Neural Radiance Fields
Andreas Kurz, Thomas Neff, Zhaoyang Lv, Michael Zollhöfer, Markus Steinberger
[pdf]
[DOI]

Improving the Perceptual Quality of 2D Animation Interpolation
Shuhong Chen, Matthias Zwicker
[pdf]
[DOI]

Selective TransHDR: Transformer-Based Selective HDR Imaging Using Ghost Region Mask
Jou Won Song, Ye-In Park, Kyeongbo Kong, Jaeho Kwak, Suk-Ju Kang
[pdf]
[DOI]

Learning Series-Parallel Lookup Tables for Efficient Image Super-Resolution
Cheng Ma, Jingyi Zhang, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

GeoAug: Data Augmentation for Few-Shot NeRF with Geometry Constraints
Di Chen, Yu Liu, Lianghua Huang, Bin Wang, Pan Pan
[pdf]
[DOI]

DoodleFormer: Creative Sketch Drawing with Transformers
Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Jorma Laaksonen, Michael Felsberg
[pdf]
[DOI]

Implicit Neural Representations for Variable Length Human Motion Generation
Pablo Cervantes, Yusuke Sekikawa, Ikuro Sato, Koichi Shinoda
[pdf]
[DOI]

Learning Object Placement via Dual-Path Graph Completion
Siyuan Zhou, Liu Liu, Li Niu, Liqing Zhang
[pdf]
[DOI]

Expanded Adaptive Scaling Normalization for End to End Image Compression
Chajin Shin, Hyeongmin Lee, Hanbin Son, Sangjin Lee, Dogyoon Lee, Sangyoun Lee
[pdf]
[DOI]

Generator Knows What Discriminator Should Learn in Unconditional GANs
Gayoung Lee, Hyunsu Kim, Junho Kim, Seonghyeon Kim, Jung-Woo Ha, Yunjey Choi
[pdf]
[DOI]

Compositional Visual Generation with Composable Diffusion Models
Nan Liu, Shuang Li, Yilun Du, Antonio Torralba, Joshua B. Tenenbaum
[pdf]
[DOI]

ManiFest: Manifold Deformation for Few-Shot Image Translation
Fabio Pizzati, Jean-François Lalonde, Raoul de Charette
[pdf]
[DOI]

Supervised Attribute Information Removal and Reconstruction for Image Manipulation
Nannan Li, Bryan A. Plummer
[pdf]
[DOI]

BLT: Bidirectional Layout Transformer for Controllable Layout Generation
Xiang Kong, Lu Jiang, Huiwen Chang, Han Zhang, Yuan Hao, Haifeng Gong, Irfan Essa
[pdf]
[DOI]

Diverse Generation from a Single Video Made Possible
Niv Haim, Ben Feinstein, Niv Granot, Assaf Shocher, Shai Bagon, Tali Dekel, Michal Irani
[pdf]
[DOI]

Rayleigh EigenDirections (REDs): Nonlinear GAN Latent Space Traversals for Multidimensional Features
Guha Balakrishnan, Raghudeep Gadde, Aleix Martinez, Pietro Perona
[pdf]
[DOI]

Bridging the Domain Gap towards Generalization in Automatic Colorization
Hyejin Lee, Daehee Kim, Daeun Lee, Jinkyu Kim, Jaekoo Lee
[pdf]
[DOI]

Generating Natural Images with Direct Patch Distributions Matching
Ariel Elnekave, Yair Weiss
[pdf]
[DOI]

Context-Consistent Semantic Image Editing with Style-Preserved Modulation
Wuyang Luo, Su Yang, Hong Wang, Bo Long, Weishan Zhang
[pdf]
[DOI]

Eliminating Gradient Conflict in Reference-Based Line-Art Colorization
Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, Yibo Yang
[pdf]
[DOI]

Unsupervised Learning of Efficient Geometry-Aware Neural Articulated Representations
Atsuhiro Noguchi, Xiao Sun, Stephen Lin, Tatsuya Harada
[pdf]
[DOI]

JPEG Artifacts Removal via Contrastive Representation Learning
Xi Wang, Xueyang Fu, Yurui Zhu, Zheng-Jun Zha
[pdf]
[DOI]

Unpaired Deep Image Dehazing Using Contrastive Disentanglement Learning
Xiang Chen, Zhentao Fan, Pengpeng Li, Longgang Dai, Caihua Kong, Zhuoran Zheng, Yufeng Huang, Yufeng Li
[pdf]
[DOI]

Efficient Long-Range Attention Network for Image Super-Resolution
Xindong Zhang, Hui Zeng, Shi Guo, Lei Zhang
[pdf]
[DOI]

FlowFormer: A Transformer Architecture for Optical Flow
Zhaoyang Huang, Xiaoyu Shi, Chao Zhang, Qiang Wang, Ka Chun Cheung, Hongwei Qin, Jifeng Dai, Hongsheng Li
[pdf]
[DOI]

Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction
Yuanhao Cai, Jing Lin, Xiaowan Hu, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc Van Gool
[pdf]
[DOI]

Learning Shadow Correspondence for Video Shadow Detection
Xinpeng Ding, Jingwen Yang, Xiaowei Hu, Xiaomeng Li
[pdf]
[DOI]

Metric Learning Based Interactive Modulation for Real-World Super-Resolution
Chong Mou, Yanze Wu, Xintao Wang, Chao Dong, Jian Zhang, Ying Shan
[pdf]
[DOI]

Dynamic Dual Trainable Bounds for Ultra-Low Precision Super-Resolution Networks
Yunshan Zhong, Mingbao Lin, Xunchao Li, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji
[pdf]
[DOI]

OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers
Jialun Pei, Tianyang Cheng, Deng-Ping Fan, He Tang, Chuanbo Chen, Luc Van Gool
[pdf]
[DOI]

Highly Accurate Dichotomous Image Segmentation
Xuebin Qin, Hang Dai, Xiaobin Hu, Deng-Ping Fan, Ling Shao, Luc Van Gool
[pdf]
[DOI]

Boosting Supervised Dehazing Methods via Bi-Level Patch Reweighting
Xingyu Jiang, Hongkun Dou, Chengwei Fu, Bingquan Dai, Tianrun Xu, Yue Deng
[pdf]
[DOI]

Flow-Guided Transformer for Video Inpainting
Kaidong Zhang, Jingjing Fu, Dong Liu
[pdf]
[DOI]

Shift-tolerant Perceptual Similarity Metric
Abhijay Ghildyal, Feng Liu
[pdf]
[DOI]

Perception-Distortion Balanced ADMM Optimization for Single-Image Super-Resolution
Yuehan Zhang, Bo Ji, Jia Hao, Angela Yao
[pdf]
[DOI]

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
Yuchao Gu, Xintao Wang, Liangbin Xie, Chao Dong, Gen Li, Ying Shan, Ming-Ming Cheng
[pdf]
[DOI]

Uncertainty Learning in Kernel Estimation for Multi-stage Blind Image Super-Resolution
Zhenxuan Fang, Weisheng Dong, Xin Li, Jinjian Wu, Leida Li, Guangming Shi
[pdf]
[DOI]

Learning Spatio-Temporal Downsampling for Effective Video Upscaling
Xiaoyu Xiang, Yapeng Tian, Vijay Rengarajan, Lucas D. Young, Bo Zhu, Rakesh Ranjan
[pdf]
[DOI]

Learning Local Implicit Fourier Representation for Image Warping
Jaewon Lee, Kwang Pyo Choi, Kyong Hwan Jin
[pdf]
[DOI]

SepLUT: Separable Image-Adaptive Lookup Tables for Real-Time Image Enhancement
Canqian Yang, Meiguang Jin, Yi Xu, Rui Zhang, Ying Chen, Huaida Liu
[pdf]
[DOI]

Blind Image Decomposition
Junlin Han, Weihao Li, Pengfei Fang, Chunyi Sun, Jie Hong, Mohammad Ali Armin, Lars Petersson, Hongdong Li
[pdf]
[DOI]

MuLUT: Cooperating Multiple Look-Up Tables for Efficient Image Super-Resolution
Jiacheng Li, Chang Chen, Zhen Cheng, Zhiwei Xiong
[pdf]
[DOI]

Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution
Zhongwei Qiu, Huan Yang, Jianlong Fu, Dongmei Fu
[pdf]
[DOI]

Spatial-Frequency Domain Information Integration for Pan-Sharpening
Man Zhou, Jie Huang, Keyu Yan, Hu Yu, Xueyang Fu, Aiping Liu, Xian Wei, Feng Zhao
[pdf]
[DOI]

Adaptive Patch Exiting for Scalable Single Image Super-Resolution
Shizun Wang, Jiaming Liu, Kaixin Chen, Xiaoqi Li, Ming Lu, Yandong Guo
[pdf]
[DOI]

Efficient Meta-Tuning for Content-Aware Neural Video Delivery
Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang
[pdf]
[DOI]

Reference-Based Image Super-Resolution with Deformable Attention Transformer
Jiezhang Cao, Jingyun Liang, Kai Zhang, Yawei Li, Yulun Zhang, Wenguan Wang, Luc Van Gool
[pdf]
[DOI]

Local Color Distributions Prior for Image Enhancement
Haoyuan Wang, Ke Xu, Rynson W.H. Lau
[pdf]
[DOI]

L-CoDer: Language-Based Colorization with Color-Object Decoupling Transformer
Zheng Chang, Shuchen Weng, Yu Li, Si Li, Boxin Shi
[pdf]
[DOI]

From Face to Natural Image: Learning Real Degradation for Blind Image Super-Resolution
Xiaoming Li, Chaofeng Chen, Xianhui Lin, Wangmeng Zuo, Lei Zhang
[pdf]
[DOI]

Towards Interpretable Video Super-Resolution via Alternating Optimization
Jiezhang Cao, Jingyun Liang, Kai Zhang, Wenguan Wang, Qin Wang, Yulun Zhang, Hao Tang, Luc Van Gool
[pdf]
[DOI]

Event-Based Fusion for Motion Deblurring with Cross-Modal Attention
Lei Sun, Christos Sakaridis, Jingyun Liang, Qi Jiang, Kailun Yang, Peng Sun, Yaozu Ye, Kaiwei Wang, Luc Van Gool
[pdf]
[DOI]

Fast and High Quality Image Denoising via Malleable Convolution
Yifan Jiang, Bartlomiej Wronski, Ben Mildenhall, Jonathan T. Barron, Zhangyang Wang, Tianfan Xue
[pdf]
[DOI]

TAPE: Task-Agnostic Prior Embedding for Image Restoration
Lin Liu, Lingxi Xie, Xiaopeng Zhang, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Qi Tian
[pdf]
[DOI]

Uncertainty Inspired Underwater Image Enhancement
Zhenqi Fu, Wu Wang, Yue Huang, Xinghao Ding, Kai-Kuang Ma
[pdf]
[DOI]

Hourglass Attention Network for Image Inpainting
Ye Deng, Siqi Hui, Rongye Meng, Sanping Zhou, Jinjun Wang
[pdf]
[DOI]

Unfolded Deep Kernel Estimation for Blind Image Super-Resolution
Hongyi Zheng, Hongwei Yong, Lei Zhang
[pdf]
[DOI]

Event-Guided Deblurring of Unknown Exposure Time Videos
Taewoo Kim, Jeongmin Lee, Lin Wang, Kuk-Jin Yoon
[pdf]
[DOI]

ReCoNet: Recurrent Correction Network for Fast and Efficient Multi-Modality Image Fusion
Zhanbo Huang, Jinyuan Liu, Xin Fan, Risheng Liu, Wei Zhong, Zhongxuan Luo
[pdf]
[DOI]

Content Adaptive Latents and Decoder for Neural Image Compression
Guanbo Pan, Guo Lu, Zhihao Hu, Dong Xu
[pdf]
[DOI]

Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution
Jie Liang, Hui Zeng, Lei Zhang
[pdf]
[DOI]

Unidirectional Video Denoising by Mimicking Backward Recurrent Modules with Look-Ahead Forward Ones
Junyi Li, Xiaohe Wu, Zhenxing Niu, Wangmeng Zuo
[pdf]
[DOI]

Self-Supervised Learning for Real-World Super-Resolution from Dual Zoomed Observations
Zhilu Zhang, Ruohao Wang, Hongzhi Zhang, Yunjin Chen, Wangmeng Zuo
[pdf]
[DOI]

Secrets of Event-Based Optical Flow
Shintaro Shiba, Yoshimitsu Aoki, Guillermo Gallego
[pdf]
[DOI]

Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoiréing
Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Jiajun Shen, Jia Li, Xiaojuan Qi
[pdf]
[DOI]

ERDN: Equivalent Receptive Field Deformable Network for Video Deblurring
Bangrui Jiang, Zhihuai Xie, Zhen Xia, Songnan Li, Shan Liu
[pdf]
[DOI]

Rethinking Generic Camera Models for Deep Single Image Camera Calibration to Recover Rotation and Fisheye Distortion
Nobuhiko Wakai, Satoshi Sato, Yasunori Ishii, Takayoshi Yamashita
[pdf]
[DOI]

ART-SS: An Adaptive Rejection Technique for Semi-Supervised Restoration for Adverse Weather-Affected Images
Rajeev Yasarla, Carey E. Priebe, Vishal M. Patel
[pdf]
[DOI]

Fusion from Decomposition: A Self-Supervised Decomposition Approach for Image Fusion
Pengwei Liang, Junjun Jiang, Xianming Liu, Jiayi Ma
[pdf]
[DOI]

Learning Degradation Representations for Image Deblurring
Dasong Li, Yi Zhang, Ka Chun Cheung, Xiaogang Wang, Hongwei Qin, Hongsheng Li
[pdf]
[DOI]

Learning Mutual Modulation for Self-Supervised Cross-Modal Super-Resolution
Xiaoyu Dong, Naoto Yokoya, Longguang Wang, Tatsumi Uezato
[pdf]
[DOI]

Spectrum-Aware and Transferable Architecture Search for Hyperspectral Image Restoration
Wei He, Quanming Yao, Naoto Yokoya, Tatsumi Uezato, Hongyan Zhang, Liangpei Zhang
[pdf]
[DOI]

Neural Color Operators for Sequential Image Retouching
Yili Wang, Xin Li, Kun Xu, Dongliang He, Qi Zhang, Fu Li, Errui Ding
[pdf]
[DOI]

Optimizing Image Compression via Joint Learning with Denoising
Ka Leong Cheng, Yueqi Xie, Qifeng Chen
[pdf]
[DOI]

Restore Globally, Refine Locally: A Mask-Guided Scheme to Accelerate Super-Resolution Networks
Xiaotao Hu, Jun Xu, Shuhang Gu, Ming-Ming Cheng, Li Liu
[pdf]
[DOI]

Compiler-Aware Neural Architecture Search for On-Mobile Real-Time Super-Resolution
Yushu Wu, Yifan Gong, Pu Zhao, Yanyu Li, Zheng Zhan, Wei Niu, Hao Tang, Minghai Qin, Bin Ren, Yanzhi Wang
[pdf]
[DOI]

Modeling Mask Uncertainty in Hyperspectral Image Reconstruction
Jiamian Wang, Yulun Zhang, Xin Yuan, Ziyi Meng, Zhiqiang Tao
[pdf]
[DOI]

Perceiving and Modeling Density for Image Dehazing
Tian Ye, Yunchen Zhang, Mingchao Jiang, Liang Chen, Yun Liu, Sixiang Chen, Erkang Chen
[pdf]
[DOI]

Stripformer: Strip Transformer for Fast Image Deblurring
Fu-Jen Tsai, Yan-Tsung Peng, Yen-Yu Lin, Chung-Chi Tsai, Chia-Wen Lin
[pdf]
[DOI]

Deep Fourier-Based Exposure Correction Network with Spatial-Frequency Interaction
Jie Huang, Yajing Liu, Feng Zhao, Keyu Yan, Jinghao Zhang, Yukun Huang, Man Zhou, Zhiwei Xiong
[pdf]
[DOI]

Frequency and Spatial Dual Guidance for Image Dehazing
Hu Yu, Naishan Zheng, Man Zhou, Jie Huang, Zeyu Xiao, Feng Zhao
[pdf]
[DOI]

Towards Real-World HDRTV Reconstruction: A Data Synthesis-Based Approach
Zhen Cheng, Tao Wang, Yong Li, Fenglong Song, Chang Chen, Zhiwei Xiong
[pdf]
[DOI]

Learning Discriminative Shrinkage Deep Networks for Image Deconvolution
Pin-Hung Kuo, Jinshan Pan, Shao-Yi Chien, Ming-Hsuan Yang
[pdf]
[DOI]

KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution
Jiahong Fu, Hong Wang, Qi Xie, Qian Zhao, Deyu Meng, Zongben Xu
[pdf]
[DOI]

ARM: Any-Time Super-Resolution Method
Bohong Chen, Mingbao Lin, Kekai Sheng, Mengdan Zhang, Peixian Chen, Ke Li, Liujuan Cao, Rongrong Ji
[pdf]
[DOI]

Attention-Aware Learning for Hyperparameter Prediction in Image Processing Pipelines
Haina Qin, Longfei Han, Juan Wang, Congxuan Zhang, Yanwei Li, Bing Li, Weiming Hu
[pdf]
[DOI]

RealFlow: EM-Based Realistic Optical Flow Dataset Generation from Videos
Yunhui Han, Kunming Luo, Ao Luo, Jiangyu Liu, Haoqiang Fan, Guiming Luo, Shuaicheng Liu
[pdf]
[DOI]

Memory-Augmented Model-Driven Network for Pansharpening
Keyu Yan, Man Zhou, Li Zhang, Chengjun Xie
[pdf]
[DOI]

All You Need Is RAW: Defending against Adversarial Attacks with Camera Image Pipelines
Yuxuan Zhang, Bo Dong, Felix Heide
[pdf]
[DOI]

Ghost-Free High Dynamic Range Imaging with Context-Aware Transformer
Zhen Liu, Yinglong Wang, Bing Zeng, Shuaicheng Liu
[pdf]
[DOI]

Style-Guided Shadow Removal
Jin Wan, Hui Yin, Zhenyao Wu, Xinyi Wu, Yanting Liu, Song Wang
[pdf]
[DOI]

D2C-SR: A Divergence to Convergence Approach for Real-World Image Super-Resolution
Youwei Li, Haibin Huang, Lanpeng Jia, Haoqiang Fan, Shuaicheng Liu
[pdf]
[DOI]

GRIT-VLP: Grouped Mini-Batch Sampling for Efficient Vision and Language Pre-training
Jaeseok Byun, Taebaek Hwang, Jianlong Fu, Taesup Moon
[pdf]
[DOI]

Efficient Video Deblurring Guided by Motion Magnitude
Yusheng Wang, Yunfan Lu, Ye Gao, Lin Wang, Zhihang Zhong, Yinqiang Zheng, Atsushi Yamashita
[pdf]
[DOI]

Single Frame Atmospheric Turbulence Mitigation: A Benchmark Study and a New Physics-Inspired Transformer Model
Zhiyuan Mao, Ajay Jaiswal, Zhangyang Wang, Stanley H. Chan
[pdf]
[DOI]

Contextformer: A Transformer with Spatio-Channel Attention for Context Modeling in Learned Image Compression
A. Burakhan Koyuncu, Han Gao, Atanas Boev, Georgii Gaikov, Elena Alshina, Eckehard Steinbach
[pdf]
[DOI]

Image Super-Resolution with Deep Dictionary
Shunta Maeda
[pdf]
[DOI]

TempFormer: Temporally Consistent Transformer for Video Denoising
Mingyang Song, Yang Zhang, Tunç O. Aydın
[pdf]
[DOI]

RAWtoBit: A Fully End-to-End Camera ISP Network
Wooseok Jeong, Seung-Won Jung
[pdf]
[DOI]

DRCNet: Dynamic Image Restoration Contrastive Network
Fei Li, Lingfeng Shen, Yang Mi, Zhenbo Li
[pdf]
[DOI]

Zero-Shot Learning for Reflection Removal of Single 360-Degree Image
Byeong-Ju Han, Jae-Young Sim
[pdf]
[DOI]

Transformer with Implicit Edges for Particle-Based Physics Simulation
Yidi Shao, Chen Change Loy, Bo Dai
[pdf]
[DOI]

Rethinking Video Rain Streak Removal: A New Synthesis Model and a Deraining Network with Video Rain Prior
Shuai Wang, Lei Zhu, Huazhu Fu, Jing Qin, Carola-Bibiane Schönlieb, Wei Feng, Song Wang
[pdf]
[DOI]

Super-Resolution by Predicting Offsets: An Ultra-Efficient Super-Resolution Network for Rasterized Images
Jinjin Gu, Haoming Cai, Chenyu Dong, Ruofan Zhang, Yulun Zhang, Wenming Yang, Chun Yuan
[pdf]
[DOI]

Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance
Zhihang Zhong, Xiao Sun, Zhirong Wu, Yinqiang Zheng, Stephen Lin, Imari Sato
[pdf]
[DOI]

AlphaVC: High-Performance and Efficient Learned Video Compression
Yibo Shi, Yunying Ge, Jing Wang, Jue Mao
[pdf]
[DOI]

Content-Oriented Learned Image Compression
Meng Li, Shangyin Gao, Yihui Feng, Yibo Shi, Jing Wang
[pdf]
[DOI]

RRSR: Reciprocal Reference-Based Image Super-Resolution with Progressive Feature Alignment and Selection
Lin Zhang, Xin Li, Dongliang He, Fu Li, Yili Wang, Zhaoxiang Zhang
[pdf]
[DOI]

Contrastive Prototypical Network with Wasserstein Confidence Penalty
Haoqing Wang, Zhi-Hong Deng
[pdf]
[DOI]

Learn-to-Decompose: Cascaded Decomposition Network for Cross-Domain Few-Shot Facial Expression Recognition
Xinyi Zou, Yan Yan, Jing-Hao Xue, Si Chen, Hanzi Wang
[pdf]
[DOI]

Self-Support Few-Shot Semantic Segmentation
Qi Fan, Wenjie Pei, Yu-Wing Tai, Chi-Keung Tang
[pdf]
[DOI]

Few-Shot Object Detection with Model Calibration
Qi Fan, Chi-Keung Tang, Yu-Wing Tai
[pdf]
[DOI]

Self-Supervision Can Be a Good Few-Shot Learner
Yuning Lu, Liangjian Wen, Jianzhuang Liu, Yajing Liu, Xinmei Tian
[pdf]
[DOI]

tSF: Transformer-Based Semantic Filter for Few-Shot Learning
Jinxiang Lai, Siqian Yang, Wenlong Liu, Yi Zeng, Zhongyi Huang, Wenlong Wu, Jun Liu, Bin-Bin Gao, Chengjie Wang
[pdf]
[DOI]

Adversarial Feature Augmentation for Cross-Domain Few-Shot Classification
Yanxu Hu, Andy J. Ma
[pdf]
[DOI]

Constructing Balance from Imbalance for Long-Tailed Image Recognition
Yue Xu, Yong-Lu Li, Jiefeng Li, Cewu Lu
[pdf]
[DOI]

On Multi-Domain Long-Tailed Recognition, Imbalanced Domain Generalization and Beyond
Yuzhe Yang, Hao Wang, Dina Katabi
[pdf]
[DOI]

Few-Shot Video Object Detection
Qi Fan, Chi-Keung Tang, Yu-Wing Tai
[pdf]
[DOI]

Worst Case Matters for Few-Shot Recognition
Minghao Fu, Yun-Hao Cao, Jianxin Wu
[pdf]
[DOI]

Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification
Kai Yi, Xiaoqian Shen, Yunhao Gou, Mohamed Elhoseiny
[pdf]
[DOI]

Doubly Deformable Aggregation of Covariance Matrices for Few-Shot Segmentation
Zhitong Xiong, Haopeng Li, Xiao Xiang Zhu
[pdf]
[DOI]

Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation
Xinyu Shi, Dong Wei, Yu Zhang, Donghuan Lu, Munan Ning, Jiashun Chen, Kai Ma, Yefeng Zheng
[pdf]
[DOI]

Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning
Xingping Dong, Jianbing Shen, Ling Shao
[pdf]
[DOI]

CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition
Shreyank N Gowda, Laura Sevilla-Lara, Frank Keller, Marcus Rohrbach
[pdf]
[DOI]

Few-Shot Class-Incremental Learning for 3D Point Cloud Objects
Townim Chowdhury, Ali Cheraghian, Sameera Ramasinghe, Sahar Ahmadi, Morteza Saberi, Shafin Rahman
[pdf]
[DOI]

Meta-Learning with Less Forgetting on Large-Scale Non-stationary Task Distributions
Zhenyi Wang, Li Shen, Le Fang, Qiuling Suo, Donglin Zhan, Tiehang Duan, Mingchen Gao
[pdf]
[DOI]

DNA: Improving Few-Shot Transfer Learning with Low-Rank Decomposition and Alignment
Ziyu Jiang, Tianlong Chen, Xuxi Chen, Yu Cheng, Luowei Zhou, Lu Yuan, Ahmed Awadallah, Zhangyang Wang
[pdf]
[DOI]

Learning Instance and Task-Aware Dynamic Kernels for Few-Shot Learning
Rongkai Ma, Pengfei Fang, Gil Avraham, Yan Zuo, Tianyu Zhu, Tom Drummond, Mehrtash Harandi
[pdf]
[DOI]

Open-World Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding
Quande Liu, Youpeng Wen, Jianhua Han, Chunjing Xu, Hang Xu, Xiaodan Liang
[pdf]
[DOI]

Few-Shot Classification with Contrastive Learning
Zhanyuan Yang, Jinghua Wang, Yingying Zhu
[pdf]
[DOI]

Time-rEversed diffusioN tEnsor Transformer: A New TENET of Few-Shot Object Detection
Shan Zhang, Naila Murray, Lei Wang, Piotr Koniusz
[pdf]
[DOI]

Self-Promoted Supervision for Few-Shot Transformer
Bowen Dong, Pan Zhou, Shuicheng Yan, Wangmeng Zuo
[pdf]
[DOI]

Few-Shot Object Counting and Detection
Thanh Nguyen, Chau Pham, Khoi Nguyen, Minh Hoai
[pdf]
[DOI]

Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark
Kibok Lee, Hao Yang, Satyaki Chakraborty, Zhaowei Cai, Gurumurthy Swaminathan, Avinash Ravichandran, Onkar Dabeer
[pdf]
[DOI]

Cross-Domain Cross-Set Few-Shot Learning via Learning Compact and Aligned Representations
Wentao Chen, Zhang Zhang, Wei Wang, Liang Wang, Zilei Wang, Tieniu Tan
[pdf]
[DOI]

Mutually Reinforcing Structure with Proposal Contrastive Consistency for Few-Shot Object Detection
Tianxue Ma, Mingwei Bi, Jian Zhang, Wang Yuan, Zhizhong Zhang, Yuan Xie, Shouhong Ding, Lizhuang Ma
[pdf]
[DOI]

Dual Contrastive Learning with Anatomical Auxiliary Supervision for Few-Shot Medical Image Segmentation
Huisi Wu, Fangyan Xiao, Chongxin Liang
[pdf]
[DOI]

Improving Few-Shot Learning through Multi-task Representation Learning Theory
Quentin Bouniot, Ievgen Redko, Romaric Audigier, Angélique Loesch, Amaury Habrard
[pdf]
[DOI]

Tree Structure-Aware Few-Shot Image Classification via Hierarchical Aggregation
Min Zhang, Siteng Huang, Wenbin Li, Donglin Wang
[pdf]
[DOI]

Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments
Khoi D. Nguyen, Quoc-Huy Tran, Khoi Nguyen, Binh-Son Hua, Rang Nguyen
[pdf]
[DOI]

Temporal and Cross-Modal Attention for Audio-Visual Zero-Shot Learning
Otniel-Bogdan Mercea, Thomas Hummel, A. Sophia Koepke, Zeynep Akata
[pdf]
[DOI]

HM: Hybrid Masking for Few-Shot Segmentation
Seonghyeon Moon, Samuel S. Sohn, Honglu Zhou, Sejong Yoon, Vladimir Pavlovic, Muhammad Haris Khan, Mubbasir Kapadia
[pdf]
[DOI]

TransVLAD: Focusing on Locally Aggregated Descriptors for Few-Shot Learning
Haoquan Li, Laoming Zhang, Daoan Zhang, Lang Fu, Peng Yang, Jianguo Zhang
[pdf]
[DOI]

Kernel Relative-Prototype Spectral Filtering for Few-Shot Learning
Tao Zhang, Wu Huang
[pdf]
[DOI]

“This Is My Unicorn, Fluffy”: Personalizing Frozen Vision-Language Representations
Niv Cohen, Rinon Gal, Eli A. Meirom, Gal Chechik, Yuval Atzmon
[pdf]
[DOI]

CLOSE: Curriculum Learning on the Sharing Extent towards Better One-Shot NAS
Zixuan Zhou, Xuefei Ning, Yi Cai, Jiashu Han, Yiping Deng, Yuhan Dong, Huazhong Yang, Yu Wang
[pdf]
[DOI]

Streamable Neural Fields
Junwoo Cho, Seungtae Nam, Daniel Rho, Jong Hwan Ko, Eunbyung Park
[pdf]
[DOI]

Gradient-Based Uncertainty for Monocular Depth Estimation
Julia Hornauer, Vasileios Belagiannis
[pdf]
[DOI]

Online Continual Learning with Contrastive Vision Transformer
Zhen Wang, Liu Liu, Yajing Kong, Jiaxian Guo, Dacheng Tao
[pdf]
[DOI]

CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution
Taeho Kim, Yongin Kwon, Jemin Lee, Taeho Kim, Sangtae Ha
[pdf]
[DOI]

EAutoDet: Efficient Architecture Search for Object Detection
Xiaoxing Wang, Jiale Lin, Juanping Zhao, Xiaokang Yang, Junchi Yan
[pdf]
[DOI]

A Max-Flow Based Approach for Neural Architecture Search
Chao Xue, Xiaoxing Wang, Junchi Yan, Chun-Guang Li
[pdf]
[DOI]

OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses
Robik Shrestha, Kushal Kafle, Christopher Kanan
[pdf]
[DOI]

ERA: Enhanced Rational Activations
Martin Trimmel, Mihai Zanfir, Richard Hartley, Cristian Sminchisescu
[pdf]
[DOI]

Convolutional Embedding Makes Hierarchical Vision Transformer Stronger
Cong Wang, Hongmin Xu, Xiong Zhang, Li Wang, Zhitong Zheng, Haifeng Liu
[pdf]
[DOI]

Active Label Correction Using Robust Parameter Update and Entropy Propagation
Kwang In Kim
[pdf]
[DOI]

Unpaired Image Translation via Vector Symbolic Architectures
Justin Theiss, Jay Leverett, Daeil Kim, Aayush Prakash
[pdf]
[DOI]

UniNet: Unified Architecture Search with Convolution, Transformer, and MLP
Jihao Liu, Xin Huang, Guanglu Song, Hongsheng Li, Yu Liu
[pdf]
[DOI]

AMixer: Adaptive Weight Mixing for Self-Attention Free Vision Transformers
Yongming Rao, Wenliang Zhao, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

TinyViT: Fast Pretraining Distillation for Small Vision Transformers
Kan Wu, Jinnian Zhang, Houwen Peng, Mengchen Liu, Bin Xiao, Jianlong Fu, Lu Yuan
[pdf]
[DOI]

Equivariant Hypergraph Neural Networks
Jinwoo Kim, Saeyoon Oh, Sungjun Cho, Seunghoon Hong
[pdf]
[DOI]

ScaleNet: Searching for the Model to Scale
Jiyang Xie, Xiu Su, Shan You, Zhanyu Ma, Fei Wang, Chen Qian
[pdf]
[DOI]

Complementing Brightness Constancy with Deep Networks for Optical Flow Prediction
Vincent Le Guen, Clément Rambour, Nicolas Thome
[pdf]
[DOI]

ViTAS: Vision Transformer Architecture Search
Xiu Su, Shan You, Jiyang Xie, Mingkai Zheng, Fei Wang, Chen Qian, Changshui Zhang, Xiaogang Wang, Chang Xu
[pdf]
[DOI]

LidarNAS: Unifying and Searching Neural Architectures for 3D Point Clouds
Chenxi Liu, Zhaoqi Leng, Pei Sun, Shuyang Cheng, Charles R. Qi, Yin Zhou, Mingxing Tan, Dragomir Anguelov
[pdf]
[DOI]

Uncertainty-DTW for Time Series and Sequences
Lei Wang, Piotr Koniusz
[pdf]
[DOI]

Black-Box Few-Shot Knowledge Distillation
Dang Nguyen, Sunil Gupta, Kien Do, Svetha Venkatesh
[pdf]
[DOI]

Revisiting Batch Norm Initialization
Jim Davis, Logan Frank
[pdf]
[DOI]

SSBNet: Improving Visual Recognition Efficiency by Adaptive Sampling
Ho Man Kwan, Shenghui Song
[pdf]
[DOI]

Filter Pruning via Feature Discrimination in Deep Neural Networks
Zhiqiang He, Yaguan Qian, Yuqi Wang, Bin Wang, Xiaohui Guan, Zhaoquan Gu, Xiang Ling, Shaoning Zeng, Haijiang Wang, Wujie Zhou
[pdf]
[DOI]

LA3: Efficient Label-Aware AutoAugment
Mingjun Zhao, Shan Lu, Zixuan Wang, Xiaoli Wang, Di Niu
[pdf]
[DOI]

Interpretations Steered Network Pruning via Amortized Inferred Saliency Maps
Alireza Ganjdanesh, Shangqian Gao, Heng Huang
[pdf]
[DOI]

BA-Net: Bridge Attention for Deep Convolutional Neural Networks
Yue Zhao, Junzhou Chen, Zirui Zhang, Ronghui Zhang
[pdf]
[DOI]

SAU: Smooth Activation Function Using Convolution with Approximate Identities
Koushik Biswas, Sandeep Kumar, Shilpak Banerjee, Ashish Kumar Pandey
[pdf]
[DOI]

Multi-Exit Semantic Segmentation Networks
Alexandros Kouris, Stylianos I. Venieris, Stefanos Laskaridis, Nicholas Lane
[pdf]
[DOI]

Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks
Bernd Prach, Christoph H. Lampert
[pdf]
[DOI]

PointScatter: Point Set Representation for Tubular Structure Extraction
Dong Wang, Zhao Zhang, Ziwei Zhao, Yuhang Liu, Yihong Chen, Liwei Wang
[pdf]
[DOI]

Check and Link: Pairwise Lesion Correspondence Guides Mammogram Mass Detection
Ziwei Zhao, Dong Wang, Yihong Chen, Ziteng Wang, Liwei Wang
[pdf]
[DOI]

Graph-Constrained Contrastive Regularization for Semi-Weakly Volumetric Segmentation
Simon Reiß, Constantin Seibold, Alexander Freytag, Erik Rodner, Rainer Stiefelhagen
[pdf]
[DOI]

Generalizable Medical Image Segmentation via Random Amplitude Mixup and Domain-Specific Image Restoration
Ziqi Zhou, Lei Qi, Yinghuan Shi
[pdf]
[DOI]

Auto-FedRL: Federated Hyperparameter Optimization for Multi-Institutional Medical Image Segmentation
Pengfei Guo, Dong Yang, Ali Hatamizadeh, An Xu, Ziyue Xu, Wenqi Li, Can Zhao, Daguang Xu, Stephanie Harmon, Evrim Turkbey, Baris Turkbey, Bradford Wood, Francesca Patella, Elvira Stellato, Gianpaolo Carrafiello, Vishal M. Patel, Holger R. Roth
[pdf]
[DOI]

Personalizing Federated Medical Image Segmentation via Local Calibration
Jiacheng Wang, Yueming Jin, Liansheng Wang
[pdf]
[DOI]

One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement
Zihao Yin, Ping Gong, Chunyu Wang, Yizhou Yu, Yizhou Wang
[pdf]
[DOI]

Ultra-High-Resolution Unpaired Stain Transformation via Kernelized Instance Normalization
Ming-Yang Ho, Min-Sheng Wu, Che-Ming Wu
[pdf]
[DOI]

Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric Segmentation
Wenxuan Wang, Chen Chen, Jing Wang, Sen Zha, Yan Zhang, Jiangyun Li
[pdf]
[DOI]

ConCL: Concept Contrastive Learning for Dense Prediction Pre-training in Pathology Images
Jiawei Yang, Hanbo Chen, Yuan Liang, Junzhou Huang, Lei He, Jianhua Yao
[pdf]
[DOI]

CryoAI: Amortized Inference of Poses for Ab Initio Reconstruction of 3D Molecular Volumes from Real Cryo-EM Images
Axel Levy, Frédéric Poitevin, Julien Martel, Youssef Nashed, Ariana Peck, Nina Miolane, Daniel Ratner, Mike Dunne, Gordon Wetzstein
[pdf]
[DOI]

UniMiSS: Universal Medical Self-Supervised Learning via Breaking Dimensionality Barrier
Yutong Xie, Jianpeng Zhang, Yong Xia, Qi Wu
[pdf]
[DOI]

DLME: Deep Local-Flatness Manifold Embedding
Zelin Zang, Siyuan Li, Di Wu, Ge Wang, Kai Wang, Lei Shang, Baigui Sun, Hao Li, Stan Z. Li
[pdf]
[DOI]

Semi-Supervised Keypoint Detector and Descriptor for Retinal Image Matching
Jiazhen Liu, Xirong Li, Qijie Wei, Jie Xu, Dayong Ding
[pdf]
[DOI]

Graph Neural Network for Cell Tracking in Microscopy Videos
Tal Ben-Haim, Tammy Riklin Raviv
[pdf]
[DOI]

CXR Segmentation by AdaIN-Based Domain Adaptation and Knowledge Distillation
Yujin Oh, Jong Chul Ye
[pdf]
[DOI]

Accurate Detection of Proteins in Cryo-Electron Tomograms from Sparse Labels
Qinwen Huang, Ye Zhou, Hsuan-Fu Liu, Alberto Bartesaghi
[pdf]
[DOI]

K-SALSA: K-Anonymous Synthetic Averaging of Retinal Images via Local Style Alignment
Minkyu Jeon, Hyeonjin Park, Hyunwoo J. Kim, Michael Morley, Hyunghoon Cho
[pdf]
[DOI]

RadioTransformer: A Cascaded Global-Focal Transformer for Visual Attention-Guided Disease Classification
Moinak Bhattacharya, Shubham Jain, Prateek Prasanna
[pdf]
[DOI]

Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images
Kevin Thandiackal, Boqi Chen, Pushpak Pati, Guillaume Jaume, Drew F. K. Williamson, Maria Gabrani, Orcun Goksel
[pdf]
[DOI]

Learning Uncoupled-Modulation CVAE for 3D Action-Conditioned Human Motion Synthesis
Chongyang Zhong, Lei Hu, Zihao Zhang, Shihong Xia
[pdf]
[DOI]

Towards Grand Unification of Object Tracking
Bin Yan, Yi Jiang, Peize Sun, Dong Wang, Zehuan Yuan, Ping Luo, Huchuan Lu
[pdf]
[DOI]

ByteTrack: Multi-Object Tracking by Associating Every Detection Box
Yifu Zhang, Peize Sun, Yi Jiang, Dongdong Yu, Fucheng Weng, Zehuan Yuan, Ping Luo, Wenyu Liu, Xinggang Wang
[pdf]
[DOI]

Robust Multi-Object Tracking by Marginal Inference
Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, Wenyu Liu
[pdf]
[DOI]

PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object Tracking?
Aleksandr Kim, Guillem Brasó, Aljoša Ošep, Laura Leal-Taixé
[pdf]
[DOI]

Particle Video Revisited: Tracking through Occlusions Using Point Trajectories
Adam W. Harley, Zhaoyuan Fang, Katerina Fragkiadaki
[pdf]
[DOI]

Tracking Objects As Pixel-Wise Distributions
Zelin Zhao, Ze Wu, Yueqing Zhuang, Boxun Li, Jiaya Jia
[pdf]
[DOI]

CMT: Context-Matching-Guided Transformer for 3D Tracking in Point Clouds
Zhiyang Guo, Yunyao Mao, Wengang Zhou, Min Wang, Houqiang Li
[pdf]
[DOI]

Towards Generic 3D Tracking in RGBD Videos: Benchmark and Baseline
Jinyu Yang, Zhongqun Zhang, Zhe Li, Hyung Jin Chang, Aleš Leonardis, Feng Zheng
[pdf]
[DOI]

Hierarchical Latent Structure for Multi-modal Vehicle Trajectory Forecasting
Dooseop Choi, KyoungWook Min
[pdf]
[DOI]

AiATrack: Attention in Attention for Transformer Visual Tracking
Shenyuan Gao, Chunluan Zhou, Chao Ma, Xinggang Wang, Junsong Yuan
[pdf]
[DOI]

Disentangling Architecture and Training for Optical Flow
Deqing Sun, Charles Herrmann, Fitsum Reda, Michael Rubinstein, David J. Fleet, William T. Freeman
[pdf]
[DOI]

A Perturbation-Constrained Adversarial Attack for Evaluating the Robustness of Optical Flow
Jenny Schmalfuss, Philipp Scholze, Andrés Bruhn
[pdf]
[DOI]

Robust Landmark-Based Stent Tracking in X-Ray Fluoroscopy
Luojie Huang, Yikang Liu, Li Chen, Eric Z. Chen, Xiao Chen, Shanhui Sun
[pdf]
[DOI]

Social ODE: Multi-agent Trajectory Forecasting with Neural Ordinary Differential Equations
Song Wen, Hao Wang, Dimitris N. Metaxas
[pdf]
[DOI]

Social-SSL: Self-Supervised Cross-Sequence Representation Learning Based on Transformers for Multi-agent Trajectory Prediction
Li-Wu Tsao, Yan-Kai Wang, Hao-Siang Lin, Hong-Han Shuai, Lai-Kuan Wong, Wen-Huang Cheng
[pdf]
[DOI]

Diverse Human Motion Prediction Guided by Multi-level Spatial-Temporal Anchors
Sirui Xu, Yu-Xiong Wang, Liang-Yan Gui
[pdf]
[DOI]

Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction
Inhwan Bae, Jin-Hwi Park, Hae-Gon Jeon
[pdf]
[DOI]

Sequential Multi-View Fusion Network for Fast LiDAR Point Motion Estimation
Gang Zhang, Xiaoyan Li, Zhenhua Wang
[pdf]
[DOI]

E-Graph: Minimal Solution for Rigid Rotation with Extensibility Graphs
Yanyan Li, Federico Tombari
[pdf]
[DOI]

Point Cloud Compression with Range Image-Based Entropy Model for Autonomous Driving
Sukai Wang, Ming Liu
[pdf]
[DOI]

Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework
Botao Ye, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen
[pdf]
[DOI]

MotionCLIP: Exposing Human Motion Generation to CLIP Space
Guy Tevet, Brian Gordon, Amir Hertz, Amit H. Bermano, Daniel Cohen-Or
[pdf]
[DOI]

Backbone Is All Your Need: A Simplified Architecture for Visual Object Tracking
Boyu Chen, Peixia Li, Lei Bai, Lei Qiao, Qiuhong Shen, Bo Li, Weihao Gan, Wei Wu, Wanli Ouyang
[pdf]
[DOI]

Aware of the History: Trajectory Forecasting with the Local Behavior Data
Yiqi Zhong, Zhenyang Ni, Siheng Chen, Ulrich Neumann
[pdf]
[DOI]

Optical Flow Training under Limited Label Budget via Active Learning
Shuai Yuan, Xian Sun, Hannah Kim, Shuzhi Yu, Carlo Tomasi
[pdf]
[DOI]

Hierarchical Feature Embedding for Visual Tracking
Zhixiong Pi, Weitao Wan, Chong Sun, Changxin Gao, Nong Sang, Chen Li
[pdf]
[DOI]

Tackling Background Distraction in Video Object Segmentation
Suhwan Cho, Heansung Lee, Minhyeok Lee, Chaewon Park, Sungjun Jang, Minjung Kim, Sangyoun Lee
[pdf]
[DOI]

Social-Implicit: Rethinking Trajectory Prediction Evaluation and the Effectiveness of Implicit Maximum Likelihood Estimation
Abduallah Mohamed, Deyao Zhu, Warren Vu, Mohamed Elhoseiny, Christian Claudel
[pdf]
[DOI]

TEMOS: Generating Diverse Human Motions from Textual Descriptions
Mathis Petrovich, Michael J. Black, Gül Varol
[pdf]
[DOI]

Tracking Every Thing in the Wild
Siyuan Li, Martin Danelljan, Henghui Ding, Thomas E. Huang, Fisher Yu
[pdf]
[DOI]

HULC: 3D HUman Motion Capture with Pose Manifold SampLing and Dense Contact Guidance
Soshi Shimada, Vladislav Golyanik, Zhi Li, Patrick Pérez, Weipeng Xu, Christian Theobalt
[pdf]
[DOI]

Towards Sequence-Level Training for Visual Tracking
Minji Kim, Seungkwan Lee, Jungseul Ok, Bohyung Han, Minsu Cho
[pdf]
[DOI]

Learned Monocular Depth Priors in Visual-Inertial Initialization
Yunwen Zhou, Abhishek Kar, Eric Turner, Adarsh Kowdle, Chao X. Guo, Ryan C. DuToit, Konstantine Tsotsos
[pdf]
[DOI]

Robust Visual Tracking by Segmentation
Matthieu Paul, Martin Danelljan, Christoph Mayer, Luc Van Gool
[pdf]
[DOI]

MeshLoc: Mesh-Based Visual Localization
Vojtech Panek, Zuzana Kukelova, Torsten Sattler
[pdf]
[DOI]

S2F2: Single-Stage Flow Forecasting for Future Multiple Trajectories Prediction
Yu-Wen Chen, Hsuan-Kung Yang, Chu-Chi Chiu, Chun-Yi Lee
[pdf]
[DOI]

Large-Displacement 3D Object Tracking with Hybrid Non-local Optimization
Xuhui Tian, Xinran Lin, Fan Zhong, Xueying Qin
[pdf]
[DOI]

FEAR: Fast, Efficient, Accurate and Robust Visual Tracker
Vasyl Borsuk, Roman Vei, Orest Kupyn, Tetiana Martyniuk, Igor Krashenyi, Jiří Matas
[pdf]
[DOI]

PREF: Predictability Regularized Neural Motion Fields
Liangchen Song, Xuan Gong, Benjamin Planche, Meng Zheng, David Doermann, Junsong Yuan, Terrence Chen, Ziyan Wu
[pdf]
[DOI]

View Vertically: A Hierarchical Network for Trajectory Prediction via Fourier Spectrums
Conghao Wong, Beihao Xia, Ziming Hong, Qinmu Peng, Wei Yuan, Qiong Cao, Yibo Yang, Xinge You
[pdf]
[DOI]

HVC-Net: Unifying Homography, Visibility, and Confidence Learning for Planar Object Tracking
Haoxian Zhang, Yonggen Ling
[pdf]
[DOI]

RamGAN: Region Attentive Morphing GAN for Region-Level Makeup Transfer
Jianfeng Xiang, Junliang Chen, Wenshuang Liu, Xianxu Hou, Linlin Shen
[pdf]
[DOI]

SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image
Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Humphrey Shi, Zhangyang Wang
[pdf]
[DOI]

Entropy-Driven Sampling and Training Scheme for Conditional Diffusion Generation
Guangcong Zheng, Shengming Li, Hui Wang, Taiping Yao, Yang Chen, Shouhong Ding, Xi Li
[pdf]
[DOI]

Accelerating Score-Based Generative Models with Preconditioned Diffusion Sampling
Hengyuan Ma, Li Zhang, Xiatian Zhu, Jianfeng Feng
[pdf]
[DOI]

Learning to Generate Realistic LiDAR Point Clouds
Vlas Zyrianov, Xiyue Zhu, Shenlong Wang
[pdf]
[DOI]

RFNet-4D: Joint Object Reconstruction and Flow Estimation from 4D Point Clouds
Tuan-Anh Vu, Thanh Nguyen, Binh-Son Hua, Quang-Hieu Pham, Sai-Kit Yeung
[pdf]
[DOI]

Diverse Image Inpainting with Normalizing Flow
Cairong Wang, Yiming Zhu, Chun Yuan
[pdf]
[DOI]

Improved Masked Image Generation with Token-Critic
José Lezama, Huiwen Chang, Lu Jiang, Irfan Essa
[pdf]
[DOI]

TREND: Truncated Generalized Normal Density Estimation of Inception Embeddings for GAN Evaluation
Junghyuk Lee, Jong-Seok Lee
[pdf]
[DOI]

Exploring Gradient-Based Multi-directional Controls in GANs
Zikun Chen, Ruowei Jiang, Brendan Duke, Han Zhao, Parham Aarabi
[pdf]
[DOI]

Spatially Invariant Unsupervised 3D Object-Centric Learning and Scene Decomposition
Tianyu Wang, Miaomiao Liu, Kee Siong Ng
[pdf]
[DOI]

Neural Scene Decoration from a Single Photograph
Hong-Wing Pang, Yingshu Chen, Phuoc-Hieu Le, Binh-Son Hua, Thanh Nguyen, Sai-Kit Yeung
[pdf]
[DOI]

Outpainting by Queries
Kai Yao, Penglei Gao, Xi Yang, Jie Sun, Rui Zhang, Kaizhu Huang
[pdf]
[DOI]

Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes
Sam Bond-Taylor, Peter Hessey, Hiroshi Sasaki, Toby P. Breckon, Chris G. Willcocks
[pdf]
[DOI]

ChunkyGAN: Real Image Inversion via Segments
Adéla Šubrtová, David Futschik, Jan Čech, Michal Lukáč, Eli Shechtman, Daniel Sýkora
[pdf]
[DOI]

GAN Cocktail: Mixing GANs without Dataset Access
Omri Avrahami, Dani Lischinski, Ohad Fried
[pdf]
[DOI]

Geometry-Guided Progressive NeRF for Generalizable and Efficient Neural Human Rendering
Mingfei Chen, Jianfeng Zhang, Xiangyu Xu, Lijuan Liu, Yujun Cai, Jiashi Feng, Shuicheng Yan
[pdf]
[DOI]

Controllable Shadow Generation Using Pixel Height Maps
Yichen Sheng, Yifan Liu, Jianming Zhang, Wei Yin, A. Cengiz Oztireli, He Zhang, Zhe Lin, Eli Shechtman, Bedrich Benes
[pdf]
[DOI]

Learning Where to Look – Generative NAS Is Surprisingly Efficient
Jovita Lukasik, Steffen Jung, Margret Keuper
[pdf]
[DOI]

Subspace Diffusion Generative Models
Bowen Jing, Gabriele Corso, Renato Berlinghieri, Tommi Jaakkola
[pdf]
[DOI]

DuelGAN: A Duel between Two Discriminators Stabilizes the GAN Training
Jiaheng Wei, Minghao Liu, Jiahao Luo, Andrew Zhu, James Davis, Yang Liu
[pdf]
[DOI]

MINER: Multiscale Implicit Neural Representation
Vishwanath Saragadam, Jasper Tan, Guha Balakrishnan, Richard G. Baraniuk, Ashok Veeraraghavan
[pdf]
[DOI]

An Embedded Feature Whitening Approach to Deep Neural Network Optimization
Hongwei Yong, Lei Zhang
[pdf]
[DOI]

Q-FW: A Hybrid Classical-Quantum Frank-Wolfe for Quadratic Binary Optimization
Alp Yurtsever, Tolga Birdal, Vladislav Golyanik
[pdf]
[DOI]

Self-Supervised Learning of Visual Graph Matching
Chang Liu, Shaofeng Zhang, Xiaokang Yang, Junchi Yan
[pdf]
[DOI]

Scalable Learning to Optimize: A Learned Optimizer Can Train Big Models
Xuxi Chen, Tianlong Chen, Yu Cheng, Weizhu Chen, Ahmed Awadallah, Zhangyang Wang
[pdf]
[DOI]

QISTA-ImageNet: A Deep Compressive Image Sensing Framework Solving ℓq-Norm Optimization Problem
Gang-Xuan Lin, Shih-Wei Hu, Chun-Shien Lu
[pdf]
[DOI]

R-DFCIL: Relation-Guided Representation Learning for Data-Free Class Incremental Learning
Qiankun Gao, Chen Zhao, Bernard Ghanem, Jian Zhang
[pdf]
[DOI]

Domain Generalization by Mutual-Information Regularization with Pre-trained Models
Junbum Cha, Kyungjae Lee, Sungrae Park, Sanghyuk Chun
[pdf]
[DOI]

Predicting Is Not Understanding: Recognizing and Addressing Underspecification in Machine Learning
Damien Teney, Maxime Peyrard, Ehsan Abbasnejad
[pdf]
[DOI]

Neural-Sim: Learning to Generate Training Data with NeRF
Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet
[pdf]
[DOI]

Bayesian Optimization with Clustering and Rollback for CNN Auto Pruning
Hanwei Fan, Jiandong Mu, Wei Zhang
[pdf]
[DOI]

Learned Variational Video Color Propagation
Markus Hofinger, Erich Kobler, Alexander Effland, Thomas Pock
[pdf]
[DOI]

Continual Variational Autoencoder Learning via Online Cooperative Memorization
Fei Ye, Adrian G. Bors
[pdf]
[DOI]

Learning to Learn with Smooth Regularization
Yuanhao Xiong, Cho-Jui Hsieh
[pdf]
[DOI]

Incremental Task Learning with Incremental Rank Updates
Rakib Hyder, Ken Shao, Boyu Hou, Panos Markopoulos, Ashley Prater-Bennette, M. Salman Asif
[pdf]
[DOI]

Batch-Efficient EigenDecomposition for Small and Medium Matrices
Yue Song, Nicu Sebe, Wei Wang
[pdf]
[DOI]

Ensemble Learning Priors Driven Deep Unfolding for Scalable Video Snapshot Compressive Imaging
Chengshuai Yang, Shiyu Zhang, Xin Yuan
[pdf]
[DOI]

Approximate Discrete Optimal Transport Plan with Auxiliary Measure Method
Dongsheng An, Na Lei, Xianfeng Gu
[pdf]
[DOI]

A Comparative Study of Graph Matching Algorithms in Computer Vision
Stefan Haller, Lorenz Feineis, Lisa Hutschenreiter, Florian Bernard, Carsten Rother, Dagmar Kainmüller, Paul Swoboda, Bogdan Savchynskyy
[pdf]
[DOI]

Improving Generalization in Federated Learning by Seeking Flat Minima
Debora Caldarola, Barbara Caputo, Marco Ciccone
[pdf]
[DOI]

Semidefinite Relaxations of Truncated Least-Squares in Robust Rotation Search: Tight or Not
Liangzu Peng, Mahyar Fazlyab, René Vidal
[pdf]
[DOI]

Transfer without Forgetting
Matteo Boschini, Lorenzo Bonicelli, Angelo Porrello, Giovanni Bellitto, Matteo Pennisi, Simone Palazzo, Concetto Spampinato, Simone Calderara
[pdf]
[DOI]

AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation
Farshid Varno, Marzie Saghayi, Laya Rafiee Sevyeri, Sharut Gupta, Stan Matwin, Mohammad Havaei
[pdf]
[DOI]

Tackling Long-Tailed Category Distribution under Domain Shifts
Xiao Gu, Yao Guo, Zeju Li, Jianing Qiu, Qi Dou, Yuxuan Liu, Benny Lo, Guang-Zhong Yang
[pdf]
[DOI]

Doubly-Fused ViT: Fuse Information from Vision Transformer Doubly with Local Representation
Li Gao, Dong Nie, Bo Li, Xiaofeng Ren
[pdf]
[DOI]

Improving Vision Transformers by Revisiting High-Frequency Components
Jiawang Bai, Li Yuan, Shu-Tao Xia, Shuicheng Yan, Zhifeng Li, Wei Liu
[pdf]
[DOI]

Recurrent Bilinear Optimization for Binary Neural Networks
Sheng Xu, Yanjing Li, Tiancheng Wang, Teli Ma, Baochang Zhang, Peng Gao, Yu Qiao, Jinhu Lü, Guodong Guo
[pdf]
[DOI]

Neural Architecture Search for Spiking Neural Networks
Youngeun Kim, Yuhang Li, Hyoungseob Park, Yeshwanth Venkatesha, Priyadarshini Panda
[pdf]
[DOI]

Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification
Yang Liu, Lei Zhou, Pengcheng Zhang, Xiao Bai, Lin Gu, Xiaohan Yu, Jun Zhou, Edwin R. Hancock
[pdf]
[DOI]

DaViT: Dual Attention Vision Transformers
Mingyu Ding, Bin Xiao, Noel Codella, Ping Luo, Jingdong Wang, Lu Yuan
[pdf]
[DOI]

Optimal Transport for Label-Efficient Visible-Infrared Person Re-identification
Jiangming Wang, Zhizhong Zhang, Mingang Chen, Yi Zhang, Cong Wang, Bin Sheng, Yanyun Qu, Yuan Xie
[pdf]
[DOI]

Locality Guidance for Improving Vision Transformers on Tiny Datasets
Kehan Li, Runyi Yu, Zhennan Wang, Li Yuan, Guoli Song, Jie Chen
[pdf]
[DOI]

Neighborhood Collective Estimation for Noisy Label Identification and Correction
Jichang Li, Guanbin Li, Feng Liu, Yizhou Yu
[pdf]
[DOI]

Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay
Huan Liu, Li Gu, Zhixiang Chi, Yang Wang, Yuanhao Yu, Jun Chen, Jin Tang
[pdf]
[DOI]

Anti-Retroactive Interference for Lifelong Learning
Runqi Wang, Yuxiang Bao, Baochang Zhang, Jianzhuang Liu, Wentao Zhu, Guodong Guo
[pdf]
[DOI]

Towards Calibrated Hyper-Sphere Representation via Distribution Overlap Coefficient for Long-Tailed Learning
Hualiang Wang, Siming Fu, Xiaoxuan He, Hangxiang Fang, Zuozhu Liu, Haoji Hu
[pdf]
[DOI]

Dynamic Metric Learning with Cross-Level Concept Distillation
Wenzhao Zheng, Yuanhui Huang, Borui Zhang, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

MENet: A Memory-Based Network with Dual-Branch for Efficient Event Stream Processing
Linhui Sun, Yifan Zhang, Ke Cheng, Jian Cheng, Hanqing Lu
[pdf]
[DOI]

Out-of-Distribution Detection with Boundary Aware Learning
Sen Pei, Xin Zhang, Bin Fan, Gaofeng Meng
[pdf]
[DOI]

Learning Hierarchy Aware Features for Reducing Mistake Severity
Ashima Garg, Depanshu Sani, Saket Anand
[pdf]
[DOI]

Learning to Detect Every Thing in an Open World
Kuniaki Saito, Ping Hu, Trevor Darrell, Kate Saenko
[pdf]
[DOI]

KVT: k-NN Attention for Boosting Vision Transformers
Pichao Wang, Xue Wang, Fan Wang, Ming Lin, Shuning Chang, Hao Li, Rong Jin
[pdf]
[DOI]

Registration Based Few-Shot Anomaly Detection
Chaoqin Huang, Haoyan Guan, Aofan Jiang, Ya Zhang, Michael Spratling, Yan-Feng Wang
[pdf]
[DOI]

Improving Robustness by Enhancing Weak Subnets
Yong Guo, David Stutz, Bernt Schiele
[pdf]
[DOI]

Learning Invariant Visual Representations for Compositional Zero-Shot Learning
Tian Zhang, Kongming Liang, Ruoyi Du, Xian Sun, Zhanyu Ma, Jun Guo
[pdf]
[DOI]

Improving Covariance Conditioning of the SVD Meta-Layer by Orthogonality
Yue Song, Nicu Sebe, Wei Wang
[pdf]
[DOI]

Out-of-Distribution Detection with Semantic Mismatch under Masking
Yijun Yang, Ruiyuan Gao, Qiang Xu
[pdf]
[DOI]

Data-Free Neural Architecture Search via Recursive Label Calibration
Zechun Liu, Zhiqiang Shen, Yun Long, Eric Xing, Kwang-Ting Cheng, Chas Leichner
[pdf]
[DOI]

Learning from Multiple Annotator Noisy Labels via Sample-Wise Label Fusion
Zhengqi Gao, Fan-Keng Sun, Mingran Yang, Sucheng Ren, Zikai Xiong, Marc Engeler, Antonio Burazer, Linda Wildling, Luca Daniel, Duane S. Boning
[pdf]
[DOI]

Acknowledging the Unknown for Multi-Label Learning with Single Positive Labels
Donghao Zhou, Pengfei Chen, Qiong Wang, Guangyong Chen, Pheng-Ann Heng
[pdf]
[DOI]

AutoMix: Unveiling the Power of Mixup for Stronger Classifiers
Zicheng Liu, Siyuan Li, Di Wu, Zihan Liu, Zhiyuan Chen, Lirong Wu, Stan Z. Li
[pdf]
[DOI]

MaxViT: Multi-axis Vision Transformer
Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li
[pdf]
[DOI]

ScalableViT: Rethinking the Context-Oriented Generalization of Vision Transformer
Rui Yang, Hailong Ma, Jie Wu, Yansong Tang, Xuefeng Xiao, Min Zheng, Xiu Li
[pdf]
[DOI]

Three Things Everyone Should Know about Vision Transformers
Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Jakob Verbeek, Hervé Jégou
[pdf]
[DOI]

DeiT III: Revenge of the ViT
Hugo Touvron, Matthieu Cord, Hervé Jégou
[pdf]
[DOI]

MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition
Chuanguang Yang, Zhulin An, Helong Zhou, Linhang Cai, Xiang Zhi, Jiwen Wu, Yongjun Xu, Qian Zhang
[pdf]
[DOI]

Self-Feature Distillation with Uncertainty Modeling for Degraded Image Recognition
Zhou Yang, Weisheng Dong, Xin Li, Jinjian Wu, Leida Li, Guangming Shi
[pdf]
[DOI]

Novel Class Discovery without Forgetting
K J Joseph, Sujoy Paul, Gaurav Aggarwal, Soma Biswas, Piyush Rai, Kai Han, Vineeth N Balasubramanian
[pdf]
[DOI]

SAFA: Sample-Adaptive Feature Augmentation for Long-Tailed Image Classification
Yan Hong, Jianfu Zhang, Zhongyi Sun, Ke Yan
[pdf]
[DOI]

Negative Samples Are at Large: Leveraging Hard-Distance Elastic Loss for Re-identification
Hyungtae Lee, Sungmin Eum, Heesung Kwon
[pdf]
[DOI]

Discrete-Constrained Regression for Local Counting Models
Haipeng Xiong, Angela Yao
[pdf]
[DOI]

Breadcrumbs: Adversarial Class-Balanced Sampling for Long-Tailed Recognition
Bo Liu, Haoxiang Li, Hao Kang, Gang Hua, Nuno Vasconcelos
[pdf]
[DOI]

Chairs Can Be Stood On: Overcoming Object Bias in Human-Object Interaction Detection
Guangzhi Wang, Yangyang Guo, Yongkang Wong, Mohan Kankanhalli
[pdf]
[DOI]

A Fast Knowledge Distillation Framework for Visual Recognition
Zhiqiang Shen, Eric Xing
[pdf]
[DOI]

DICE: Leveraging Sparsification for Out-of-Distribution Detection
Yiyou Sun, Yixuan Li
[pdf]
[DOI]

Invariant Feature Learning for Generalized Long-Tailed Classification
Kaihua Tang, Mingyuan Tao, Jiaxin Qi, Zhenguang Liu, Hanwang Zhang
[pdf]
[DOI]

Sliced Recursive Transformer
Zhiqiang Shen, Zechun Liu, Eric Xing
[pdf]
[DOI]

Cross-Domain Ensemble Distillation for Domain Generalization
Kyungmoon Lee, Sungyeon Kim, Suha Kwak
[pdf]
[DOI]

Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels
Ganlong Zhao, Guanbin Li, Yipeng Qin, Feng Liu, Yizhou Yu
[pdf]
[DOI]

Hyperspherical Learning in Multi-Label Classification
Bo Ke, Yunquan Zhu, Mengtian Li, Xiujun Shu, Ruizhi Qiao, Bo Ren
[pdf]
[DOI]

When Active Learning Meets Implicit Semantic Data Augmentation
Zhuangzhuang Chen, Jin Zhang, Pan Wang, Jie Chen, Jianqiang Li
[pdf]
[DOI]

VL-LTR: Learning Class-Wise Visual-Linguistic Representation for Long-Tailed Visual Recognition
Changyao Tian, Wenhai Wang, Xizhou Zhu, Jifeng Dai, Yu Qiao
[pdf]
[DOI]

Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-of-Distribution Generalization
Jiaxin Qi, Kaihua Tang, Qianru Sun, Xian-Sheng Hua, Hanwang Zhang
[pdf]
[DOI]

Hierarchical Semi-Supervised Contrastive Learning for Contamination-Resistant Anomaly Detection
Gaoang Wang, Yibing Zhan, Xinchao Wang, Mingli Song, Klara Nahrstedt
[pdf]
[DOI]

Tracking by Associating Clips
Sanghyun Woo, Kwanyong Park, Seoung Wug Oh, In So Kweon, Joon-Young Lee
[pdf]
[DOI]

RealPatch: A Statistical Matching Framework for Model Patching with Real Samples
Sara Romiti, Christopher Inskip, Viktoriia Sharmanska, Novi Quadrianto
[pdf]
[DOI]

Background-Insensitive Scene Text Recognition with Text Semantic Segmentation
Liang Zhao, Zhenyao Wu, Xinyi Wu, Greg Wilsbacher, Song Wang
[pdf]
[DOI]

Semantic Novelty Detection via Relational Reasoning
Francesco Cappio Borlino, Silvia Bucci, Tatiana Tommasi
[pdf]
[DOI]

Improving Closed and Open-Vocabulary Attribute Prediction Using Transformers
Khoi Pham, Kushal Kafle, Zhe Lin, Zhihong Ding, Scott Cohen, Quan Tran, Abhinav Shrivastava
[pdf]
[DOI]

Training Vision Transformers with Only 2040 Images
Yun-Hao Cao, Hao Yu, Jianxin Wu
[pdf]
[DOI]

Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection
Sanghyun Woo, Kwanyong Park, Seoung Wug Oh, In So Kweon, Joon-Young Lee
[pdf]
[DOI]

TDAM: Top-Down Attention Module for Contextually Guided Feature Selection in CNNs
Shantanu Jaiswal, Basura Fernando, Cheston Tan
[pdf]
[DOI]

Automatic Check-Out via Prototype-Based Classifier Learning from Single-Product Exemplars
Hao Chen, Xiu-Shen Wei, Faen Zhang, Yang Shen, Hui Xu, Liang Xiao
[pdf]
[DOI]

Overcoming Shortcut Learning in a Target Domain by Generalizing Basic Visual Factors from a Source Domain
Piyapat Saranrittichai, Chaithanya Kumar Mummadi, Claudia Blaiotta, Mauricio Munoz, Volker Fischer
[pdf]
[DOI]

Photo-Realistic Neural Domain Randomization
Sergey Zakharov, Rareș Ambruș, Vitor Guizilini, Wadim Kehl, Adrien Gaidon
[pdf]
[DOI]

Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning
Ting Yao, Yingwei Pan, Yehao Li, Chong-Wah Ngo, Tao Mei
[pdf]
[DOI]

Tailoring Self-Supervision for Supervised Learning
WonJun Moon, Ji-Hwan Kim, Jae-Pil Heo
[pdf]
[DOI]

Difficulty-Aware Simulator for Open Set Recognition
WonJun Moon, Junho Park, Hyun Seok Seong, Cheol-Ho Cho, Jae-Pil Heo
[pdf]
[DOI]

Few-Shot Class-Incremental Learning from an Open-Set Perspective
Can Peng, Kun Zhao, Tianren Wang, Meng Li, Brian C. Lovell
[pdf]
[DOI]

FOSTER: Feature Boosting and Compression for Class-Incremental Learning
Fu-Yun Wang, Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan
[pdf]
[DOI]

Visual Knowledge Tracing
Neehar Kondapaneni, Pietro Perona, Oisin Mac Aodha
[pdf]
[DOI]

S3C: Self-Supervised Stochastic Classifiers for Few-Shot Class-Incremental Learning
Jayateja Kalla, Soma Biswas
[pdf]
[DOI]

Improving Fine-Grained Visual Recognition in Low Data Regimes via Self-Boosting Attention Mechanism
Yangyang Shu, Baosheng Yu, Haiming Xu, Lingqiao Liu
[pdf]
[DOI]

VSA: Learning Varied-Size Window Attention in Vision Transformers
Qiming Zhang, Yufei Xu, Jing Zhang, Dacheng Tao
[pdf]
[DOI]

Unbiased Manifold Augmentation for Coarse Class Subdivision
Baoming Yan, Ke Gao, Bo Gao, Lin Wang, Jiang Yang, Xiaobo Li
[pdf]
[DOI]

DenseHybrid: Hybrid Anomaly Detection for Dense Open-Set Recognition
Matej Grcić, Petra Bevandić, Siniša Šegvić
[pdf]
[DOI]

Rethinking Confidence Calibration for Failure Prediction
Fei Zhu, Zhen Cheng, Xu-Yao Zhang, Cheng-Lin Liu
[pdf]
[DOI]

Uncertainty-Guided Source-Free Domain Adaptation
Subhankar Roy, Martin Trapp, Andrea Pilzer, Juho Kannala, Nicu Sebe, Elisa Ricci, Arno Solin
[pdf]
[DOI]

Should All Proposals Be Treated Equally in Object Detection?
Yunsheng Li, Yinpeng Chen, Xiyang Dai, Dongdong Chen, Mengchen Liu, Pei Yu, Ying Jin, Lu Yuan, Zicheng Liu, Nuno Vasconcelos
[pdf]
[DOI]

VIP: Unified Certified Detection and Recovery for Patch Attack with Vision Transformers
Junbo Li, Huan Zhang, Cihang Xie
[pdf]
[DOI]

incDFM: Incremental Deep Feature Modeling for Continual Novelty Detection
Amanda Rios, Nilesh Ahuja, Ibrahima Ndiour, Utku Genc, Laurent Itti, Omesh Tickoo
[pdf]
[DOI]

IGFormer: Interaction Graph Transformer for Skeleton-Based Human Interaction Recognition
Yunsheng Pang, Qiuhong Ke, Hossein Rahmani, James Bailey, Jun Liu
[pdf]
[DOI]

PRIME: A Few Primitives Can Boost Robustness to Common Corruptions
Apostolos Modas, Rahul Rade, Guillermo Ortiz-Jiménez, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard
[pdf]
[DOI]

Rotation Regularization without Rotation
Takumi Kobayashi
[pdf]
[DOI]

Towards Accurate Open-Set Recognition via Background-Class Regularization
Wonwoo Cho, Jaegul Choo
[pdf]
[DOI]

In Defense of Image Pre-training for Spatiotemporal Recognition
Xianhang Li, Huiyu Wang, Chen Wei, Jieru Mei, Alan Yuille, Yuyin Zhou, Cihang Xie
[pdf]
[DOI]

Augmenting Deep Classifiers with Polynomial Neural Networks
Grigorios G. Chrysos, Markos Georgopoulos, Jiankang Deng, Jean Kossaifi, Yannis Panagakis, Anima Anandkumar
[pdf]
[DOI]

Learning with Noisy Labels by Efficient Transition Matrix Estimation to Combat Label Miscorrection
Seong Min Kye, Kwanghee Choi, Joonyoung Yi, Buru Chang
[pdf]
[DOI]

Online Task-Free Continual Learning with Dynamic Sparse Distributed Memory
Julien Pourcel, Ngoc-Son Vu, Robert M. French
[pdf]
[DOI]

Contrastive Deep Supervision
Linfeng Zhang, Xin Chen, Junbo Zhang, Runpei Dong, Kaisheng Ma
[pdf]
[DOI]

Discriminability-Transferability Trade-Off: An Information-Theoretic Perspective
Quan Cui, Bingchen Zhao, Zhao-Min Chen, Borui Zhao, Renjie Song, Boyan Zhou, Jiajun Liang, Osamu Yoshie
[pdf]
[DOI]

LocVTP: Video-Text Pre-training for Temporal Localization
Meng Cao, Tianyu Yang, Junwu Weng, Can Zhang, Jue Wang, Yuexian Zou
[pdf]
[DOI]

Few-Shot End-to-End Object Detection via Constantly Concentrated Encoding across Heads
Jiawei Ma, Guangxing Han, Shiyuan Huang, Yuncong Yang, Shih-Fu Chang
[pdf]
[DOI]

Implicit Neural Representations for Image Compression
Yannick Strümpler, Janis Postels, Ren Yang, Luc Van Gool, Federico Tombari
[pdf]
[DOI]

LiP-Flow: Learning Inference-Time Priors for Codec Avatars via Normalizing Flows in Latent Space
Emre Aksan, Shugao Ma, Akin Caliskan, Stanislav Pidhorskyi, Alexander Richard, Shih-En Wei, Jason Saragih, Otmar Hilliges
[pdf]
[DOI]

Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining
Qihang Zhang, Zhenghao Peng, Bolei Zhou
[pdf]
[DOI]

Learning Ego 3D Representation As Ray Tracing
Jiachen Lu, Zheyuan Zhou, Xiatian Zhu, Hang Xu, Li Zhang
[pdf]
[DOI]

Static and Dynamic Concepts for Self-Supervised Video Representation Learning
Rui Qian, Shuangrui Ding, Xian Liu, Dahua Lin
[pdf]
[DOI]

SphereFed: Hyperspherical Federated Learning
Xin Dong, Sai Qian Zhang, Ang Li, H.T. Kung
[pdf]
[DOI]

Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning
Yuxiao Chen, Long Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Ligong Han, Dimitris N. Metaxas
[pdf]
[DOI]

Posterior Refinement on Metric Matrix Improves Generalization Bound in Metric Learning
Mingda Wang, Canqian Yang, Yi Xu
[pdf]
[DOI]

Balancing Stability and Plasticity through Advanced Null Space in Continual Learning
Yajing Kong, Liu Liu, Zhen Wang, Dacheng Tao
[pdf]
[DOI]

DisCo: Remedying Self-Supervised Learning on Lightweight Models with Distilled Contrastive Learning
Yuting Gao, Jia-Xin Zhuang, Shaohui Lin, Hao Cheng, Xing Sun, Ke Li, Chunhua Shen
[pdf]
[DOI]

CoSCL: Cooperation of Small Continual Learners Is Stronger than a Big One
Liyuan Wang, Xingxing Zhang, Qian Li, Jun Zhu, Yi Zhong
[pdf]
[DOI]

Manifold Adversarial Learning for Cross-Domain 3D Shape Representation
Hao Huang, Cheng Chen, Yi Fang
[pdf]
[DOI]

Fast-MoCo: Boost Momentum-Based Contrastive Learning with Combinatorial Patches
Yuanzheng Ci, Chen Lin, Lei Bai, Wanli Ouyang
[pdf]
[DOI]

LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling
Boyan Jiang, Xinlin Ren, Mingsong Dou, Xiangyang Xue, Yanwei Fu, Yinda Zhang
[pdf]
[DOI]

On the Versatile Uses of Partial Distance Correlation in Deep Learning
Xingjian Zhen, Zihang Meng, Rudrasis Chakraborty, Vikas Singh
[pdf]
[DOI]

Self-Regulated Feature Learning via Teacher-Free Feature Distillation
Lujun Li
[pdf]
[DOI]

Balancing between Forgetting and Acquisition in Incremental Subpopulation Learning
Mingfu Liang, Jiahuan Zhou, Wei Wei, Ying Wu
[pdf]
[DOI]

Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification
Xulin Li, Yan Lu, Bin Liu, Yating Liu, Guojun Yin, Qi Chu, Jinyang Huang, Feng Zhu, Rui Zhao, Nenghai Yu
[pdf]
[DOI]

DAS: Densely-Anchored Sampling for Deep Metric Learning
Lizhao Liu, Shangxin Huang, Zhuangwei Zhuang, Ran Yang, Mingkui Tan, Yaowei Wang
[pdf]
[DOI]

Learn from All: Erasing Attention Consistency for Noisy Label Facial Expression Recognition
Yuhang Zhang, Chengrui Wang, Xu Ling, Weihong Deng
[pdf]
[DOI]

A Non-Isotropic Probabilistic Take On Proxy-Based Deep Metric Learning
Michael Kirchhof, Karsten Roth, Zeynep Akata, Enkelejda Kasneci
[pdf]
[DOI]

TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers
Jihao Liu, Boxiao Liu, Hang Zhou, Hongsheng Li, Yu Liu
[pdf]
[DOI]

UFO: Unified Feature Optimization
Teng Xi, Yifan Sun, Deli Yu, Bi Li, Nan Peng, Gang Zhang, Xinyu Zhang, Zhigang Wang, Jinwen Chen, Jian Wang, Lufei Liu, Haocheng Feng, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang
[pdf]
[DOI]

Sound Localization by Self-Supervised Time Delay Estimation
Ziyang Chen, David F. Fouhey, Andrew Owens
[pdf]
[DOI]

X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation
Yinan He, Gengshi Huang, Siyu Chen, Jianing Teng, Kun Wang, Zhenfei Yin, Lu Sheng, Ziwei Liu, Yu Qiao, Jing Shao
[pdf]
[DOI]

SLIP: Self-Supervision Meets Language-Image Pre-training
Norman Mu, Alexander Kirillov, David Wagner, Saining Xie
[pdf]
[DOI]

Discovering Deformable Keypoint Pyramids
Jianing Qian, Anastasios Panagopoulos, Dinesh Jayaraman
[pdf]
[DOI]

Neural Video Compression Using GANs for Detail Synthesis and Propagation
Fabian Mentzer, Eirikur Agustsson, Johannes Ballé, David Minnen, Nick Johnston, George Toderici
[pdf]
[DOI]

A Contrastive Objective for Learning Disentangled Representations
Jonathan Kahana, Yedid Hoshen
[pdf]
[DOI]

PT4AL: Using Self-Supervised Pretext Tasks for Active Learning
John Seon Keun Yi, Minseok Seo, Jongchan Park, Dong-Geol Choi
[pdf]
[DOI]

ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer
Haokui Zhang, Wenze Hu, Xiaoyu Wang
[pdf]
[DOI]

DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning
Zifeng Wang, Zizhao Zhang, Sayna Ebrahimi, Ruoxi Sun, Han Zhang, Chen-Yu Lee, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister
[pdf]
[DOI]

Unifying Visual Contrastive Learning for Object Recognition from a Graph Perspective
Shixiang Tang, Feng Zhu, Lei Bai, Rui Zhao, Chenyu Wang, Wanli Ouyang
[pdf]
[DOI]

Decoupled Contrastive Learning
Chun-Hsiao Yeh, Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu, Yubei Chen, Yann LeCun
[pdf]
[DOI]

Joint Learning of Localized Representations from Medical Images and Reports
Philip Müller, Georgios Kaissis, Congyu Zou, Daniel Rueckert
[pdf]
[DOI]

The Challenges of Continuous Self-Supervised Learning
Senthil Purushwalkam, Pedro Morgado, Abhinav Gupta
[pdf]
[DOI]

Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval
Zhixin Ling, Zhen Xing, Jian Zhou, Xiangdong Zhou
[pdf]
[DOI]

Identifying Hard Noise in Long-Tailed Sample Distribution
Xuanyu Yi, Kaihua Tang, Xian-Sheng Hua, Joo-Hwee Lim, Hanwang Zhang
[pdf]
[DOI]

Relative Contrastive Loss for Unsupervised Representation Learning
Shixiang Tang, Feng Zhu, Lei Bai, Rui Zhao, Wanli Ouyang
[pdf]
[DOI]

Fine-Grained Fashion Representation Learning by Online Deep Clustering
Yang Jiao, Ning Xie, Yan Gao, Chien-chih Wang, Yi Sun
[pdf]
[DOI]

NashAE: Disentangling Representations through Adversarial Covariance Minimization
Eric Yeats, Frank Liu, David Womble, Hai Li
[pdf]
[DOI]

A Gyrovector Space Approach for Symmetric Positive Semi-Definite Matrix Learning
Xuan Son Nguyen
[pdf]
[DOI]

Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training
Haoxuan You, Luowei Zhou, Bin Xiao, Noel Codella, Yu Cheng, Ruochen Xu, Shih-Fu Chang, Lu Yuan
[pdf]
[DOI]

Contrasting Quadratic Assignments for Set-Based Representation Learning
Artem Moskalev, Ivan Sosnovik, Volker Fischer, Arnold Smeulders
[pdf]
[DOI]

Class-Incremental Learning with Cross-Space Clustering and Controlled Transfer
Arjun Ashok, K J Joseph, Vineeth N Balasubramanian
[pdf]
[DOI]

Object Discovery and Representation Networks
Olivier J. Hénaff, Skanda Koppula, Evan Shelhamer, Daniel Zoran, Andrew Jaegle, Andrew Zisserman, João Carreira, Relja Arandjelović
[pdf]
[DOI]

Trading Positional Complexity vs Deepness in Coordinate Networks
Jianqiao Zheng, Sameera Ramasinghe, Xueqian Li, Simon Lucey
[pdf]
[DOI]

MVDG: A Unified Multi-View Framework for Domain Generalization
Jian Zhang, Lei Qi, Yinghuan Shi, Yang Gao
[pdf]
[DOI]

Panoptic Scene Graph Generation
Jingkang Yang, Yi Zhe Ang, Zujin Guo, Kaiyang Zhou, Wayne Zhang, Ziwei Liu
[pdf]
[DOI]

Object-Compositional Neural Implicit Surfaces
Qianyi Wu, Xian Liu, Yuedong Chen, Kejie Li, Chuanxia Zheng, Jianfei Cai, Jianmin Zheng
[pdf]
[DOI]

RigNet: Repetitive Image Guided Network for Depth Completion
Zhiqiang Yan, Kun Wang, Xiang Li, Zhenyu Zhang, Jun Li, Jian Yang
[pdf]
[DOI]

FADE: Fusing the Assets of Decoder and Encoder for Task-Agnostic Upsampling
Hao Lu, Wenze Liu, Hongtao Fu, Zhiguo Cao
[pdf]
[DOI]

LiDAL: Inter-Frame Uncertainty Based Active Learning for 3D LiDAR Semantic Segmentation
Zeyu Hu, Xuyang Bai, Runze Zhang, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai
[pdf]
[DOI]

Hierarchical Memory Learning for Fine-Grained Scene Graph Generation
Youming Deng, Yansheng Li, Yongjun Zhang, Xiang Xiang, Jian Wang, Jingdong Chen, Jiayi Ma
[pdf]
[DOI]

DODA: Data-Oriented Sim-to-Real Domain Adaptation for 3D Semantic Segmentation
Runyu Ding, Jihan Yang, Li Jiang, Xiaojuan Qi
[pdf]
[DOI]

MTFormer: Multi-task Learning via Transformer and Cross-Task Reasoning
Xiaogang Xu, Hengshuang Zhao, Vibhav Vineet, Ser-Nam Lim, Antonio Torralba
[pdf]
[DOI]

MonoPLFlowNet: Permutohedral Lattice FlowNet for Real-Scale 3D Scene Flow Estimation with Monocular Images
Runfa Li, Truong Nguyen
[pdf]
[DOI]

TO-Scene: A Large-Scale Dataset for Understanding 3D Tabletop Scenes
Mutian Xu, Pei Chen, Haolin Liu, Xiaoguang Han
[pdf]
[DOI]

Is It Necessary to Transfer Temporal Knowledge for Domain Adaptive Video Semantic Segmentation?
Xinyi Wu, Zhenyao Wu, Jin Wan, Lili Ju, Song Wang
[pdf]
[DOI]

Meta Spatio-Temporal Debiasing for Video Scene Graph Generation
Li Xu, Haoxuan Qu, Jason Kuen, Jiuxiang Gu, Jun Liu
[pdf]
[DOI]

Improving the Reliability for Confidence Estimation
Haoxuan Qu, Yanchao Li, Lin Geng Foo, Jason Kuen, Jiuxiang Gu, Jun Liu
[pdf]
[DOI]

Fine-Grained Scene Graph Generation with Data Transfer
Ao Zhang, Yuan Yao, Qianyu Chen, Wei Ji, Zhiyuan Liu, Maosong Sun, Tat-Seng Chua
[pdf]
[DOI]

Pose2Room: Understanding 3D Scenes from Human Activities
Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner
[pdf]
[DOI]

Towards Hard-Positive Query Mining for DETR-Based Human-Object Interaction Detection
Xubin Zhong, Changxing Ding, Zijian Li, Shaoli Huang
[pdf]
[DOI]

Discovering Human-Object Interaction Concepts via Self-Compositional Learning
Zhi Hou, Baosheng Yu, Dacheng Tao
[pdf]
[DOI]

Primitive-Based Shape Abstraction via Nonparametric Bayesian Inference
Yuwei Wu, Weixiao Liu, Sipu Ruan, Gregory S. Chirikjian
[pdf]
[DOI]

Stereo Depth Estimation with Echoes
Chenghao Zhang, Kun Tian, Bolin Ni, Gaofeng Meng, Bin Fan, Zhaoxiang Zhang, Chunhong Pan
[pdf]
[DOI]

Inverted Pyramid Multi-task Transformer for Dense Scene Understanding
Hanrong Ye, Dan Xu
[pdf]
[DOI]

PETR: Position Embedding Transformation for Multi-View 3D Object Detection
Yingfei Liu, Tiancai Wang, Xiangyu Zhang, Jian Sun
[pdf]
[DOI]

S2Net: Stochastic Sequential Pointcloud Forecasting
Xinshuo Weng, Junyu Nan, Kuan-Hui Lee, Rowan McAllister, Adrien Gaidon, Nicholas Rhinehart, Kris M. Kitani
[pdf]
[DOI]

RA-Depth: Resolution Adaptive Self-Supervised Monocular Depth Estimation
Mu He, Le Hui, Yikai Bian, Jian Ren, Jin Xie, Jian Yang
[pdf]
[DOI]

PolyphonicFormer: Unified Query Learning for Depth-Aware Video Panoptic Segmentation
Haobo Yuan, Xiangtai Li, Yibo Yang, Guangliang Cheng, Jing Zhang, Yunhai Tong, Lefei Zhang, Dacheng Tao
[pdf]
[DOI]

SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds
Qingyong Hu, Bo Yang, Guangchi Fang, Yulan Guo, Aleš Leonardis, Niki Trigoni, Andrew Markham
[pdf]
[DOI]

PointMixer: MLP-Mixer for Point Cloud Understanding
Jaesung Choe, Chunghyun Park, Francois Rameau, Jaesik Park, In So Kweon
[pdf]
[DOI]

Initialization and Alignment for Adversarial Texture Optimization
Xiaoming Zhao, Zhizhen Zhao, Alexander G. Schwing
[pdf]
[DOI]

MOTR: End-to-End Multiple-Object Tracking with TRansformer
Fangao Zeng, Bin Dong, Yuang Zhang, Tiancai Wang, Xiangyu Zhang, Yichen Wei
[pdf]
[DOI]

GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing
Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen
[pdf]
[DOI]

LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments
Henry Howard-Jenkins, Victor Adrian Prisacariu
[pdf]
[DOI]

3D-PL: Domain Adaptive Depth Estimation with 3D-Aware Pseudo-Labeling
Yu-Ting Yen, Chia-Ni Lu, Wei-Chen Chiu, Yi-Hsuan Tsai
[pdf]
[DOI]

Panoptic-PartFormer: Learning a Unified Model for Panoptic Part Segmentation
Xiangtai Li, Shilin Xu, Yibo Yang, Guangliang Cheng, Yunhai Tong, Dacheng Tao
[pdf]
[DOI]

Salient Object Detection for Point Clouds
Songlin Fan, Wei Gao, Ge Li
[pdf]
[DOI]

Learning Semantic Segmentation from Multiple Datasets with Label Shifts
Dongwan Kim, Yi-Hsuan Tsai, Yumin Suh, Masoud Faraki, Sparsh Garg, Manmohan Chandraker, Bohyung Han
[pdf]
[DOI]

Weakly Supervised 3D Scene Segmentation with Region-Level Boundary Awareness and Instance Discrimination
Kangcheng Liu, Yuzhi Zhao, Qiang Nie, Zhi Gao, Ben M. Chen
[pdf]
[DOI]

Towards Open-Vocabulary Scene Graph Generation with Prompt-Based Finetuning
Tao He, Lianli Gao, Jingkuan Song, Yuan-Fang Li
[pdf]
[DOI]

Variance-Aware Weight Initialization for Point Convolutional Neural Networks
Pedro Hermosilla, Michael Schelling, Tobias Ritschel, Timo Ropinski
[pdf]
[DOI]

Break and Make: Interactive Structural Understanding Using LEGO Bricks
Aaron Walsman, Muru Zhang, Klemen Kotar, Karthik Desingh, Ali Farhadi, Dieter Fox
[pdf]
[DOI]

Bi-PointFlowNet: Bidirectional Learning for Point Cloud Based Scene Flow Estimation
Wencan Cheng, Jong Hwan Ko
[pdf]
[DOI]

3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching
Runyu Mao, Chen Bai, Yatong An, Fengqing Zhu, Cheng Lu
[pdf]
[DOI]

Video Restoration Framework and Its Meta-Adaptations to Data-Poor Conditions
Prashant W Patil, Sunil Gupta, Santu Rana, Svetha Venkatesh
[pdf]
[DOI]

MonteBoxFinder: Detecting and Filtering Primitives to Fit a Noisy Point Cloud
Michaël Ramamonjisoa, Sinisa Stekovic, Vincent Lepetit
[pdf]
[DOI]

Scene Text Recognition with Permuted Autoregressive Sequence Models
Darwin Bautista, Rowel Atienza
[pdf]
[DOI]

When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition
Bohan Li, Ye Yuan, Dingkang Liang, Xiao Liu, Zhilong Ji, Jinfeng Bai, Wenyu Liu, Xiang Bai
[pdf]
[DOI]

Detecting Tampered Scene Text in the Wild
Yuxin Wang, Hongtao Xie, Mengting Xing, Jing Wang, Shenggao Zhu, Yongdong Zhang
[pdf]
[DOI]

Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning
Jingqun Tang, Wenming Qian, Luchuan Song, Xiena Dong, Lan Li, Xiang Bai
[pdf]
[DOI]

GLASS: Global to Local Attention for Scene-Text Spotting
Roi Ronen, Shahar Tsiper, Oron Anschel, Inbal Lavi, Amir Markovitz, R. Manmatha
[pdf]
[DOI]

COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated Texts
Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa
[pdf]
[DOI]

Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting
Chuhui Xue, Wenqing Zhang, Yu Hao, Shijian Lu, Philip H. S. Torr, Song Bai
[pdf]
[DOI]

Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition
Xudong Xie, Ling Fu, Zhifei Zhang, Zhaowen Wang, Xiang Bai
[pdf]
[DOI]

Levenshtein OCR
Cheng Da, Peng Wang, Cong Yao
[pdf]
[DOI]

Multi-Granularity Prediction for Scene Text Recognition
Peng Wang, Cheng Da, Cong Yao
[pdf]
[DOI]

Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting
Ying Chen, Liang Qiao, Zhanzhan Cheng, Shiliang Pu, Yi Niu, Xi Li
[pdf]
[DOI]

Contextual Text Block Detection towards Scene Text Understanding
Chuhui Xue, Jiaxing Huang, Wenqing Zhang, Shijian Lu, Changhu Wang, Song Bai
[pdf]
[DOI]

CoMER: Modeling Coverage for Transformer-Based Handwritten Mathematical Expression Recognition
Wenqi Zhao, Liangcai Gao
[pdf]
[DOI]

Don’t Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context
Chongyu Liu, Lianwen Jin, Yuliang Liu, Canjie Luo, Bangdong Chen, Fengjun Guo, Kai Ding
[pdf]
[DOI]

TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers
Oren Nuriel, Sharon Fogel, Ron Litman
[pdf]
[DOI]

Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features
Byeonghu Na, Yoonsik Kim, Sungrae Park
[pdf]
[DOI]

SGBANet: Semantic GAN and Balanced Attention Network for Arbitrarily Oriented Scene Text Recognition
Dajian Zhong, Shujing Lyu, Palaiahnakote Shivakumara, Bing Yin, Jiajia Wu, Umapada Pal, Yue Lu
[pdf]
[DOI]

Pure Transformer with Integrated Experts for Scene Text Recognition
Yew Lee Tan, Adams Wai-Kin Kong, Jung-Jae Kim
[pdf]
[DOI]

OCR-Free Document Understanding Transformer
Geewook Kim, Teakgyu Hong, Moonbin Yim, JeongYeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park
[pdf]
[DOI]

CAR: Class-Aware Regularizations for Semantic Segmentation
Ye Huang, Di Kang, Liang Chen, Xuefei Zhe, Wenjing Jia, Linchao Bao, Xiangjian He
[pdf]
[DOI]

Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation
Yuyang Zhao, Zhun Zhong, Na Zhao, Nicu Sebe, Gim Hee Lee
[pdf]
[DOI]

SeqFormer: Sequential Transformer for Video Instance Segmentation
Junfeng Wu, Yi Jiang, Song Bai, Wenqing Zhang, Xiang Bai
[pdf]
[DOI]

Saliency Hierarchy Modeling via Generative Kernels for Salient Object Detection
Wenhu Zhang, Liangli Zheng, Huanyu Wang, Xintian Wu, Xi Li
[pdf]
[DOI]

In Defense of Online Models for Video Instance Segmentation
Junfeng Wu, Qihao Liu, Yi Jiang, Song Bai, Alan Yuille, Xiang Bai
[pdf]
[DOI]

Active Pointly-Supervised Instance Segmentation
Chufeng Tang, Lingxi Xie, Gang Zhang, Xiaopeng Zhang, Qi Tian, Xiaolin Hu
[pdf]
[DOI]

A Transformer-Based Decoder for Semantic Segmentation with Multi-level Context Mining
Bowen Shi, Dongsheng Jiang, Xiaopeng Zhang, Han Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian
[pdf]
[DOI]

XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
Ho Kei Cheng, Alexander G. Schwing
[pdf]
[DOI]

Self-Distillation for Robust LiDAR Semantic Segmentation in Autonomous Driving
Jiale Li, Hang Dai, Yong Ding
[pdf]
[DOI]

2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds
Xu Yan, Jiantao Gao, Chaoda Zheng, Chao Zheng, Ruimao Zhang, Shuguang Cui, Zhen Li
[pdf]
[DOI]

Extract Free Dense Labels from CLIP
Chong Zhou, Chen Change Loy, Bo Dai
[pdf]
[DOI]

3D Compositional Zero-Shot Learning with DeCompositional Consensus
Muhammad Ferjad Naeem, Evin Pınar Örnek, Yongqin Xian, Luc Van Gool, Federico Tombari
[pdf]
[DOI]

Video Mask Transfiner for High-Quality Video Instance Segmentation
Lei Ke, Henghui Ding, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu
[pdf]
[DOI]

Box-Supervised Instance Segmentation with Level Set Evolution
Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Xian-Sheng Hua, Lei Zhang
[pdf]
[DOI]

Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding
Hao Wen, Yunze Liu, Jingwei Huang, Bo Duan, Li Yi
[pdf]
[DOI]

Adaptive Agent Transformer for Few-Shot Segmentation
Yuan Wang, Rui Sun, Zhe Zhang, Tianzhu Zhang
[pdf]
[DOI]

Waymo Open Dataset: Panoramic Video Panoptic Segmentation
Jieru Mei, Alex Zihao Zhu, Xinchen Yan, Hang Yan, Siyuan Qiao, Yukun Zhu, Liang-Chieh Chen, Henrik Kretzschmar
[pdf]
[DOI]

TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation
Zhaoyuan Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li, Rong Jin
[pdf]
[DOI]

AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-Shot Interactions
Yian Wang, Ruihai Wu, Kaichun Mo, Jiaqi Ke, Qingnan Fan, Leonidas J. Guibas, Hao Dong
[pdf]
[DOI]

Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation
Sunghwan Hong, Seokju Cho, Jisu Nam, Stephen Lin, Seungryong Kim
[pdf]
[DOI]

Fine-Grained Egocentric Hand-Object Segmentation: Dataset, Model, and Applications
Lingzhi Zhang, Shenghao Zhou, Simon Stent, Jianbo Shi
[pdf]
[DOI]

Perceptual Artifacts Localization for Inpainting
Lingzhi Zhang, Yuqian Zhou, Connelly Barnes, Sohrab Amirghodsi, Zhe Lin, Eli Shechtman, Jianbo Shi
[pdf]
[DOI]

2D Amodal Instance Segmentation Guided by 3D Shape Prior
Zhixuan Li, Weining Ye, Tingting Jiang, Tiejun Huang
[pdf]
[DOI]

Data Efficient 3D Learner via Knowledge Transferred from 2D Model
Ping-Chung Yu, Cheng Sun, Min Sun
[pdf]
[DOI]

Adaptive Spatial-BCE Loss for Weakly Supervised Semantic Segmentation
Tong Wu, Guangyu Gao, Junshi Huang, Xiaolin Wei, Xiaoming Wei, Chi Harold Liu
[pdf]
[DOI]

Dense Gaussian Processes for Few-Shot Segmentation
Joakim Johnander, Johan Edstedt, Michael Felsberg, Fahad Shahbaz Khan, Martin Danelljan
[pdf]
[DOI]

3D Instances as 1D Kernels
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong
[pdf]
[DOI]

TransMatting: Enhancing Transparent Objects Matting with Transformers
Huanqia Cai, Fanglei Xue, Lele Xu, Lili Guo
[pdf]
[DOI]

MVSalNet: Multi-View Augmentation for RGB-D Salient Object Detection
Jiayuan Zhou, Lijun Wang, Huchuan Lu, Kaining Huang, Xinchu Shi, Bocong Liu
[pdf]
[DOI]

k-Means Mask Transformer
Qihang Yu, Huiyu Wang, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
[pdf]
[DOI]

SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness
Jindong Gu, Hengshuang Zhao, Volker Tresp, Philip H. S. Torr
[pdf]
[DOI]

Adversarial Erasing Framework via Triplet with Gated Pyramid Pooling Layer for Weakly Supervised Semantic Segmentation
Sung-Hoon Yoon, Hyeokjun Kweon, Jegyeong Cho, Shinjeong Kim, Kuk-Jin Yoon
[pdf]
[DOI]

Continual Semantic Segmentation via Structure Preserving and Projected Feature Alignment
Zihan Lin, Zilei Wang, Yixin Zhang
[pdf]
[DOI]

Interclass Prototype Relation for Few-Shot Segmentation
Atsuro Okazawa
[pdf]
[DOI]

Slim Scissors: Segmenting Thin Object from Synthetic Background
Kunyang Han, Jun Hao Liew, Jiashi Feng, Huawei Tian, Yao Zhao, Yunchao Wei
[pdf]
[DOI]

Abstracting Sketches through Simple Primitives
Stephan Alaniz, Massimiliano Mancini, Anjan Dutta, Diego Marcos, Zeynep Akata
[pdf]
[DOI]

Multi-Scale and Cross-Scale Contrastive Learning for Semantic Segmentation
Theodoros Pissas, Claudio S. Ravasio, Lyndon Da Cruz, Christos Bergeles
[pdf]
[DOI]

One-Trimap Video Matting
Hongje Seong, Seoung Wug Oh, Brian Price, Euntai Kim, Joon-Young Lee
[pdf]
[DOI]

D2ADA: Dynamic Density-Aware Active Domain Adaptation for Semantic Segmentation
Tsung-Han Wu, Yi-Syuan Liou, Shao-Ji Yuan, Hsin-Ying Lee, Tung-I Chen, Kuan-Chih Huang, Winston H. Hsu
[pdf]
[DOI]

Learning Quality-Aware Dynamic Memory for Video Object Segmentation
Yong Liu, Ran Yu, Fei Yin, Xinyuan Zhao, Wei Zhao, Weihao Xia, Yujiu Yang
[pdf]
[DOI]

Learning Implicit Feature Alignment Function for Semantic Segmentation
Hanzhe Hu, Yinbo Chen, Jiarui Xu, Shubhankar Borse, Hong Cai, Fatih Porikli, Xiaolong Wang
[pdf]
[DOI]

Quantum Motion Segmentation
Federica Arrigoni, Willi Menapace, Marcel Seelbach Benkner, Elisa Ricci, Vladislav Golyanik
[pdf]
[DOI]

Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation
Feng Zhu, Zongxin Yang, Xin Yu, Yi Yang, Yunchao Wei
[pdf]
[DOI]

Laplacian Mesh Transformer: Dual Attention and Topology Aware Network for 3D Mesh Classification and Segmentation
Xiao-Juan Li, Jie Yang, Fang-Lue Zhang
[pdf]
[DOI]

Geodesic-Former: A Geodesic-Guided Few-Shot 3D Point Cloud Instance Segmenter
Tuan Ngo, Khoi Nguyen
[pdf]
[DOI]

Union-Set Multi-source Model Adaptation for Semantic Segmentation
Zongyao Li, Ren Togo, Takahiro Ogawa, Miki Haseyama
[pdf]
[DOI]

Point MixSwap: Attentional Point Cloud Mixing via Swapping Matched Structural Divisions
Ardian Umam, Cheng-Kun Yang, Yung-Yu Chuang, Jen-Hui Chuang, Yen-Yu Lin
[pdf]
[DOI]

BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation
Ye Yu, Jialin Yuan, Gaurav Mittal, Li Fuxin, Mei Chen
[pdf]
[DOI]

SPSN: Superpixel Prototype Sampling Network for RGB-D Salient Object Detection
Minhyeok Lee, Chaewon Park, Suhwan Cho, Sangyoun Lee
[pdf]
[DOI]

Global Spectral Filter Memory Network for Video Object Segmentation
Yong Liu, Ran Yu, Jiahao Wang, Xinyuan Zhao, Yitong Wang, Yansong Tang, Yujiu Yang
[pdf]
[DOI]

Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer
Omkar Thawakar, Sanath Narayan, Jiale Cao, Hisham Cholakkal, Rao Muhammad Anwer, Muhammad Haris Khan, Salman Khan, Michael Felsberg, Fahad Shahbaz Khan
[pdf]
[DOI]

RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation
Haodi He, Yuhui Yuan, Xiangyu Yue, Han Hu
[pdf]
[DOI]

Learning Topological Interactions for Multi-Class Medical Image Segmentation
Saumya Gupta, Xiaoling Hu, James Kaan, Michael Jin, Mutshipay Mpoy, Katherine Chung, Gagandeep Singh, Mary Saltz, Tahsin Kurc, Joel Saltz, Apostolos Tassiopoulos, Prateek Prasanna, Chao Chen
[pdf]
[DOI]

Unsupervised Segmentation in Real-World Images via Spelke Object Inference
Honglin Chen, Rahul Venkatesh, Yoni Friedman, Jiajun Wu, Joshua B. Tenenbaum, Daniel L. K. Yamins, Daniel M. Bear
[pdf]
[DOI]

A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-Language Model
Mengde Xu, Zheng Zhang, Fangyun Wei, Yutong Lin, Yue Cao, Han Hu, Xiang Bai
[pdf]
[DOI]

Fast Two-View Motion Segmentation Using Christoffel Polynomials
Bengisu Ozbay, Octavia Camps, Mario Sznaier
[pdf]
[DOI]

UCTNet: Uncertainty-Aware Cross-Modal Transformer Network for Indoor RGB-D Semantic Segmentation
Xiaowen Ying, Mooi Choo Chuah
[pdf]
[DOI]

Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation
Geon Lee, Chanho Eom, Wonkyung Lee, Hyekang Park, Bumsub Ham
[pdf]
[DOI]

Learning Regional Purity for Instance Segmentation on 3D Point Clouds
Shichao Dong, Guosheng Lin, Tzu-Yi Hung
[pdf]
[DOI]

Cross-Domain Few-Shot Semantic Segmentation
Shuo Lei, Xuchao Zhang, Jianfeng He, Fanglan Chen, Bowen Du, Chang-Tien Lu
[pdf]
[DOI]

Generative Subgraph Contrast for Self-Supervised Graph Representation Learning
Yuehui Han, Le Hui, Haobo Jiang, Jianjun Qian, Jin Xie
[pdf]
[DOI]

SdAE: Self-Distillated Masked Autoencoder
Yabo Chen, Yuchen Liu, Dongsheng Jiang, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong, Qi Tian
[pdf]
[DOI]

Demystifying Unsupervised Semantic Correspondence Estimation
Mehmet Aygün, Oisin Mac Aodha
[pdf]
[DOI]

Open-Set Semi-Supervised Object Detection
Yen-Cheng Liu, Chih-Yao Ma, Xiaoliang Dai, Junjiao Tian, Peter Vajda, Zijian He, Zsolt Kira
[pdf]
[DOI]

Vibration-Based Uncertainty Estimation for Learning from Limited Supervision
Hengtong Hu, Lingxi Xie, Xinyue Huo, Richang Hong, Qi Tian
[pdf]
[DOI]

Concurrent Subsidiary Supervision for Unsupervised Source-Free Domain Adaptation
Jogendra Nath Kundu, Suvaansh Bhambri, Akshay Kulkarni, Hiran Sarkar, Varun Jampani, R. Venkatesh Babu
[pdf]
[DOI]

Weakly Supervised Object Localization through Inter-class Feature Similarity and Intra-Class Appearance Consistency
Jun Wei, Sheng Wang, S. Kevin Zhou, Shuguang Cui, Zhen Li
[pdf]
[DOI]

Active Learning Strategies for Weakly-Supervised Object Detection
Huy V. Vo, Oriane Siméoni, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Jean Ponce
[pdf]
[DOI]

Mc-BEiT: Multi-Choice Discretization for Image BERT Pre-training
Xiaotong Li, Yixiao Ge, Kun Yi, Zixuan Hu, Ying Shan, Ling-Yu Duan
[pdf]
[DOI]

Bootstrapped Masked Autoencoders for Vision BERT Pretraining
Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu
[pdf]
[DOI]

Unsupervised Visual Representation Learning by Synchronous Momentum Grouping
Bo Pang, Yifan Zhang, Yaoyi Li, Jia Cai, Cewu Lu
[pdf]
[DOI]

Improving Few-Shot Part Segmentation Using Coarse Supervision
Oindrila Saha, Zezhou Cheng, Subhransu Maji
[pdf]
[DOI]

What to Hide from Your Students: Attention-Guided Masked Image Modeling
Ioannis Kakogeorgiou, Spyros Gidaris, Bill Psomas, Yannis Avrithis, Andrei Bursuc, Konstantinos Karantzalos, Nikos Komodakis
[pdf]
[DOI]

Pointly-Supervised Panoptic Segmentation
Junsong Fan, Zhaoxiang Zhang, Tieniu Tan
[pdf]
[DOI]

MVP: Multimodality-Guided Visual Pre-training
Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
[pdf]
[DOI]

Locally Varying Distance Transform for Unsupervised Visual Anomaly Detection
Wen-Yan Lin, Zhonghang Liu, Siying Liu
[pdf]
[DOI]

HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation
Lukas Hoyer, Dengxin Dai, Luc Van Gool
[pdf]
[DOI]

SPot-the-Difference Self-Supervised Pre-training for Anomaly Detection and Segmentation
Yang Zou, Jongheon Jeong, Latha Pemula, Dongqing Zhang, Onkar Dabeer
[pdf]
[DOI]

Dual-Domain Self-Supervised Learning and Model Adaption for Deep Compressive Imaging
Yuhui Quan, Xinran Qin, Tongyao Pang, Hui Ji
[pdf]
[DOI]

Unsupervised Selective Labeling for More Effective Semi-Supervised Learning
Xudong Wang, Long Lian, Stella X. Yu
[pdf]
[DOI]

Max Pooling with Vision Transformers Reconciles Class and Shape in Weakly Supervised Semantic Segmentation
Simone Rossetti, Damiano Zappia, Marta Sanzari, Marco Schaerf, Fiora Pirri
[pdf]
[DOI]

Dense Siamese Network for Dense Unsupervised Learning
Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy
[pdf]
[DOI]

Multi-Granularity Distillation Scheme towards Lightweight Semi-Supervised Semantic Segmentation
Jie Qin, Jie Wu, Ming Li, Xuefeng Xiao, Min Zheng, Xingang Wang
[pdf]
[DOI]

CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation
Feng Wang, Huiyu Wang, Chen Wei, Alan Yuille, Wei Shen
[pdf]
[DOI]

Self-Filtering: A Noise-Aware Sample Selection for Label Noise with Confidence Penalization
Qi Wei, Haoliang Sun, Xiankai Lu, Yilong Yin
[pdf]
[DOI]

RDA: Reciprocal Distribution Alignment for Robust Semi-Supervised Learning
Yue Duan, Lei Qi, Lei Wang, Luping Zhou, Yinghuan Shi
[pdf]
[DOI]

MemSAC: Memory Augmented Sample Consistency for Large Scale Domain Adaptation
Tarun Kalluri, Astuti Sharma, Manmohan Chandraker
[pdf]
[DOI]

United Defocus Blur Detection and Deblurring via Adversarial Promoting Learning
Wenda Zhao, Fei Wei, You He, Huchuan Lu
[pdf]
[DOI]

Synergistic Self-Supervised and Quantization Learning
Yun-Hao Cao, Peiqin Sun, Yechang Huang, Jianxin Wu, Shuchang Zhou
[pdf]
[DOI]

Semi-Supervised Vision Transformers
Zejia Weng, Xitong Yang, Ang Li, Zuxuan Wu, Yu-Gang Jiang
[pdf]
[DOI]

Domain Adaptive Video Segmentation via Temporal Pseudo Supervision
Yun Xing, Dayan Guan, Jiaxing Huang, Shijian Lu
[pdf]
[DOI]

Diverse Learner: Exploring Diverse Supervision for Semi-Supervised Object Detection
Linfeng Li, Minyue Jiang, Yue Yu, Wei Zhang, Xiangru Lin, Yingying Li, Xiao Tan, Jingdong Wang, Errui Ding
[pdf]
[DOI]

A Closer Look at Invariances in Self-Supervised Pre-training for 3D Vision
Lanxiao Li, Michael Heizmann
[pdf]
[DOI]

ConMatch: Semi-Supervised Learning with Confidence-Guided Consistency Regularization
Jiwon Kim, Youngjo Min, Daehwan Kim, Gyuseong Lee, Junyoung Seo, Kwangrok Ryoo, Seungryong Kim
[pdf]
[DOI]

FedX: Unsupervised Federated Learning with Cross Knowledge Distillation
Sungwon Han, Sungwon Park, Fangzhao Wu, Sundong Kim, Chuhan Wu, Xing Xie, Meeyoung Cha
[pdf]
[DOI]

W2N: Switching from Weak Supervision to Noisy Supervision for Object Detection
Zitong Huang, Yiping Bao, Bowen Dong, Erjin Zhou, Wangmeng Zuo
[pdf]
[DOI]

Decoupled Adversarial Contrastive Learning for Self-Supervised Adversarial Robustness
Chaoning Zhang, Kang Zhang, Chenshuang Zhang, Axi Niu, Jiu Feng, Chang D. Yoo, In So Kweon
[pdf]
[DOI]

GOCA: Guided Online Cluster Assignment for Self-Supervised Video Representation Learning
Huseyin Coskun, Alireza Zareian, Joshua L. Moore, Federico Tombari, Chen Wang
[pdf]
[DOI]

Constrained Mean Shift Using Distant Yet Related Neighbors for Representation Learning
K L Navaneet, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Kossar Pourahmadi, Akshayvarun Subramanya, Hamed Pirsiavash
[pdf]
[DOI]

Revisiting the Critical Factors of Augmentation-Invariant Representation Learning
Junqiang Huang, Xiangwen Kong, Xiangyu Zhang
[pdf]
[DOI]

CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation
Lu Qi, Jason Kuen, Zhe Lin, Jiuxiang Gu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen, Ming-Hsuan Yang, Jiaya Jia
[pdf]
[DOI]

Dual Adaptive Transformations for Weakly Supervised Point Cloud Segmentation
Zhonghua Wu, Yicheng Wu, Guosheng Lin, Jianfei Cai, Chen Qian
[pdf]
[DOI]

Semantic-Aware Fine-Grained Correspondence
Yingdong Hu, Renhao Wang, Kaifeng Zhang, Yang Gao
[pdf]
[DOI]

Self-Supervised Classification Network
Elad Amrani, Leonid Karlinsky, Alex Bronstein
[pdf]
[DOI]

Data Invariants to Understand Unsupervised Out-of-Distribution Detection
Lars Doorenbos, Raphael Sznitman, Pablo Márquez-Neila
[pdf]
[DOI]

Domain Invariant Masked Autoencoders for Self-Supervised Learning from Multi-Domains
Haiyang Yang, Shixiang Tang, Meilin Chen, Yizhou Wang, Feng Zhu, Lei Bai, Rui Zhao, Wanli Ouyang
[pdf]
[DOI]

Semi-Supervised Object Detection via Virtual Category Learning
Changrui Chen, Kurt Debattista, Jungong Han
[pdf]
[DOI]

Completely Self-Supervised Crowd Counting via Distribution Matching
Deepak Babu Sam, Abhinav Agarwalla, Jimmy Joseph, Vishwanath A. Sindagi, R. Venkatesh Babu, Vishal M. Patel
[pdf]
[DOI]

Coarse-to-Fine Incremental Few-Shot Learning
Xiang Xiang, Yuwen Tan, Qian Wan, Jing Ma, Alan Yuille, Gregory D. Hager
[pdf]
[DOI]

Learning Unbiased Transferability for Domain Adaptation by Uncertainty Modeling
Jian Hu, Haowen Zhong, Fei Yang, Shaogang Gong, Guile Wu, Junchi Yan
[pdf]
[DOI]

Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition
Shreyank N Gowda, Marcus Rohrbach, Frank Keller, Laura Sevilla-Lara
[pdf]
[DOI]

CYBORGS: Contrastively Bootstrapping Object Representations by Grounding in Segmentation
Renhao Wang, Hang Zhao, Yang Gao
[pdf]
[DOI]

PSS: Progressive Sample Selection for Open-World Visual Representation Learning
Tianyue Cao, Yongxin Wang, Yifan Xing, Tianjun Xiao, Tong He, Zheng Zhang, Hao Zhou, Joseph Tighe
[pdf]
[DOI]

Improving Self-Supervised Lightweight Model Learning via Hard-Aware Metric Distillation
Hao Liu, Mang Ye
[pdf]
[DOI]

Object Discovery via Contrastive Learning for Weakly Supervised Object Detection
Jinhwan Seo, Wonho Bae, Danica J. Sutherland, Junhyug Noh, Daijin Kim
[pdf]
[DOI]

Stochastic Consensus: Enhancing Semi-Supervised Learning with Consistency of Stochastic Classifiers
Hui Tang, Lin Sun, Kui Jia
[pdf]
[DOI]

DiffuseMorph: Unsupervised Deformable Image Registration Using Diffusion Model
Boah Kim, Inhwa Han, Jong Chul Ye
[pdf]
[DOI]

Semi-Leak: Membership Inference Attacks against Semi-Supervised Learning
Xinlei He, Hongbin Liu, Neil Zhenqiang Gong, Yang Zhang
[pdf]
[DOI]

OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning
Mamshad Nayeem Rizve, Navid Kardan, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah
[pdf]
[DOI]

Embedding Contrastive Unsupervised Features to Cluster in- and Out-of-Distribution Noise in Corrupted Image Datasets
Paul Albert, Eric Arazo, Noel E. O’Connor, Kevin McGuinness
[pdf]
[DOI]

Unsupervised Few-Shot Image Classification by Learning Features into Clustering Space
Shuo Li, Fang Liu, Zehua Hao, Kaibo Zhao, Licheng Jiao
[pdf]
[DOI]

Towards Realistic Semi-Supervised Learning
Mamshad Nayeem Rizve, Navid Kardan, Mubarak Shah
[pdf]
[DOI]

Masked Siamese Networks for Label-Efficient Learning
Mahmoud Assran, Mathilde Caron, Ishan Misra, Piotr Bojanowski, Florian Bordes, Pascal Vincent, Armand Joulin, Michael Rabbat, Nicolas Ballas
[pdf]
[DOI]

Natural Synthetic Anomalies for Self-Supervised Anomaly Detection and Localization
Hannah M. Schlüter, Jeremy Tan, Benjamin Hou, Bernhard Kainz
[pdf]
[DOI]

Understanding Collapse in Non-Contrastive Siamese Representation Learning
Alexander C. Li, Alexei A. Efros, Deepak Pathak
[pdf]
[DOI]

Federated Self-Supervised Learning for Video Understanding
Yasar Abbas Ur Rehman, Yan Gao, Jiajun Shen, Pedro Porto Buarque de Gusmão, Nicholas Lane
[pdf]
[DOI]

Towards Efficient and Effective Self-Supervised Learning of Visual Representations
Sravanti Addepalli, Kaushal Bhogale, Priyam Dey, R. Venkatesh Babu
[pdf]
[DOI]

DSR – A Dual Subspace Re-Projection Network for Surface Anomaly Detection
Vitjan Zavrtanik, Matej Kristan, Danijel Skočaj
[pdf]
[DOI]

PseudoAugment: Learning to Use Unlabeled Data for Data Augmentation in Point Clouds
Zhaoqi Leng, Shuyang Cheng, Benjamin Caine, Weiyue Wang, Xiao Zhang, Jonathon Shlens, Mingxing Tan, Dragomir Anguelov
[pdf]
[DOI]

MVSTER: Epipolar Transformer for Efficient Multi-View Stereo
Xiaofeng Wang, Zheng Zhu, Guan Huang, Fangbo Qin, Yun Ye, Yijia He, Xu Chi, Xingang Wang
[pdf]
[DOI]

RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild
Jason Y. Zhang, Deva Ramanan, Shubham Tulsiani
[pdf]
[DOI]

R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis
Huan Wang, Jian Ren, Zeng Huang, Kyle Olszewski, Menglei Chai, Yun Fu, Sergey Tulyakov
[pdf]
[DOI]

KD-MVS: Knowledge Distillation Based Self-Supervised Learning for Multi-View Stereo
Yikang Ding, Qingtian Zhu, Xiangyue Liu, Wentao Yuan, Haotian Zhang, Chi Zhang
[pdf]
[DOI]

SALVe: Semantic Alignment Verification for Floorplan Reconstruction from Sparse Panoramas
John Lambert, Yuguang Li, Ivaylo Boyadzhiev, Lambert Wixson, Manjunath Narayana, Will Hutchcroft, James Hays, Frank Dellaert, Sing Bing Kang
[pdf]
[DOI]

RC-MVSNet: Unsupervised Multi-View Stereo with Neural Rendering
Di Chang, Aljaž Božič, Tong Zhang, Qingsong Yan, Yingcong Chen, Sabine Süsstrunk, Matthias Nießner
[pdf]
[DOI]

Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll
[pdf]
[DOI]

NeILF: Neural Incident Light Field for Physically-Based Material Estimation
Yao Yao, Jingyang Zhang, Jingbo Liu, Yihang Qu, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan
[pdf]
[DOI]

ARF: Artistic Radiance Fields
Kai Zhang, Nick Kolkin, Sai Bi, Fujun Luan, Zexiang Xu, Eli Shechtman, Noah Snavely
[pdf]
[DOI]

Multiview Stereo with Cascaded Epipolar RAFT
Zeyu Ma, Zachary Teed, Jia Deng
[pdf]
[DOI]

ARAH: Animatable Volume Rendering of Articulated Human SDFs
Shaofei Wang, Katja Schwarz, Andreas Geiger, Siyu Tang
[pdf]
[DOI]

ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer
Hongkai Chen, Zixin Luo, Lei Zhou, Yurun Tian, Mingmin Zhen, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan
[pdf]
[DOI]

NDF: Neural Deformable Fields for Dynamic Human Modelling
Ruiqi Zhang, Jie Chen
[pdf]
[DOI]

Neural Density-Distance Fields
Itsuki Ueda, Yoshihiro Fukuhara, Hirokatsu Kataoka, Hiroaki Aizawa, Hidehiko Shishido, Itaru Kitahara
[pdf]
[DOI]

NeXT: Towards High Quality Neural Radiance Fields via Multi-Skip Transformer
Yunxiao Wang, Yanjie Li, Peidong Liu, Tao Dai, Shu-Tao Xia
[pdf]
[DOI]

Learning Online Multi-sensor Depth Fusion
Erik Sandström, Martin R. Oswald, Suryansh Kumar, Silvan Weder, Fisher Yu, Cristian Sminchisescu, Luc Van Gool
[pdf]
[DOI]

BungeeNeRF: Progressive Neural Radiance Field for Extreme Multi-Scale Scene Rendering
Yuanbo Xiangli, Linning Xu, Xingang Pan, Nanxuan Zhao, Anyi Rao, Christian Theobalt, Bo Dai, Dahua Lin
[pdf]
[DOI]

Decomposing the Tangent of Occluding Boundaries according to Curvatures and Torsions
Huizong Yang, Anthony Yezzi
[pdf]
[DOI]

NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors
Jiepeng Wang, Peng Wang, Xiaoxiao Long, Christian Theobalt, Taku Komura, Lingjie Liu, Wenping Wang
[pdf]
[DOI]

Generalizable Patch-Based Neural Rendering
Mohammed Suhail, Carlos Esteves, Leonid Sigal, Ameesh Makadia
[pdf]
[DOI]

Improving RGB-D Point Cloud Registration by Learning Multi-Scale Local Linear Transformation
Ziming Wang, Xiaoliang Huo, Zhenghao Chen, Jing Zhang, Lu Sheng, Dong Xu
[pdf]
[DOI]

Real-Time Neural Character Rendering with Pose-Guided Multiplane Images
Hao Ouyang, Bo Zhang, Pan Zhang, Hao Yang, Jiaolong Yang, Dong Chen, Qifeng Chen, Fang Wen
[pdf]
[DOI]

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views
Xiaoxiao Long, Cheng Lin, Peng Wang, Taku Komura, Wenping Wang
[pdf]
[DOI]

Disentangling Object Motion and Occlusion for Unsupervised Multi-Frame Monocular Depth
Ziyue Feng, Liang Yang, Longlong Jing, Haiyan Wang, YingLi Tian, Bing Li
[pdf]
[DOI]

Depth Field Networks for Generalizable Multi-View Scene Representation
Vitor Guizilini, Igor Vasiljevic, Jiading Fang, Rareș Ambruș, Greg Shakhnarovich, Matthew R. Walter, Adrien Gaidon
[pdf]
[DOI]

Context-Enhanced Stereo Transformer
Weiyu Guo, Zhaoshuo Li, Yongkui Yang, Zheng Wang, Russell H. Taylor, Mathias Unberath, Alan Yuille, Yingwei Li
[pdf]
[DOI]

PCW-Net: Pyramid Combination and Warping Cost Volume for Stereo Matching
Zhelun Shen, Yuchao Dai, Xibin Song, Zhibo Rao, Dingfu Zhou, Liangjun Zhang
[pdf]
[DOI]

Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images
Yuan Liu, Yilin Wen, Sida Peng, Cheng Lin, Xiaoxiao Long, Taku Komura, Wenping Wang
[pdf]
[DOI]

Latency-Aware Collaborative Perception
Zixing Lei, Shunli Ren, Yue Hu, Wenjun Zhang, Siheng Chen
[pdf]
[DOI]

TensoRF: Tensorial Radiance Fields
Anpei Chen, Zexiang Xu, Andreas Geiger, Jingyi Yu, Hao Su
[pdf]
[DOI]

NeFSAC: Neurally Filtered Minimal Samples
Luca Cavalli, Marc Pollefeys, Daniel Barath
[pdf]
[DOI]

SNeS: Learning Probably Symmetric Neural Surfaces from Incomplete Data
Eldar Insafutdinov, Dylan Campbell, João F. Henriques, Andrea Vedaldi
[pdf]
[DOI]

HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields
Kim Jun-Seong, Kim Yu-Ji, Moon Ye-Bin, Tae-Hyun Oh
[pdf]
[DOI]

NeuMan: Neural Human Radiance Field from a Single Video
Wei Jiang, Kwang Moo Yi, Golnoosh Samei, Oncel Tuzel, Anurag Ranjan
[pdf]
[DOI]

TAVA: Template-Free Animatable Volumetric Actors
Ruilong Li, Julian Tanke, Minh Vo, Michael Zollhöfer, Jürgen Gall, Angjoo Kanazawa, Christoph Lassner
[pdf]
[DOI]

EASNet: Searching Elastic and Accurate Network Architecture for Stereo Matching
Qiang Wang, Shaohuai Shi, Kaiyong Zhao, Xiaowen Chu
[pdf]
[DOI]

Relative Pose from SIFT Features
Daniel Barath, Zuzana Kukelova
[pdf]
[DOI]

Selection and Cross Similarity for Event-Image Deep Stereo
Hoonhee Cho, Kuk-Jin Yoon
[pdf]
[DOI]

D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
Zhenyu Chen, Qirui Wu, Matthias Nießner, Angel X. Chang
[pdf]
[DOI]

CIRCLE: Convolutional Implicit Reconstruction and Completion for Large-Scale Indoor Scene
Hao-Xiang Chen, Jiahui Huang, Tai-Jiang Mu, Shi-Min Hu
[pdf]
[DOI]

ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild
Wang Zhao, Shaohui Liu, Hengkai Guo, Wenping Wang, Yong-Jin Liu
[pdf]
[DOI]

4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding
Yujin Chen, Matthias Nießner, Angela Dai
[pdf]
[DOI]

Few ‘Zero Level Set’-Shot Learning of Shape Signed Distance Functions in Feature Space
Amine Ouasfi, Adnane Boukhayma
[pdf]
[DOI]

Solution Space Analysis of Essential Matrix Based on Algebraic Error Minimization
Gaku Nakano
[pdf]
[DOI]

Approximate Differentiable Rendering with Algebraic Surfaces
Leonid Keselman, Martial Hebert
[pdf]
[DOI]

CoVisPose: Co-Visibility Pose Transformer for Wide-Baseline Relative Pose Estimation in 360° Indoor Panoramas
Will Hutchcroft, Yuguang Li, Ivaylo Boyadzhiev, Zhiqiang Wan, Haiyan Wang, Sing Bing Kang
[pdf]
[DOI]

Affine Correspondences between Multi-Camera Systems for 6DOF Relative Pose Estimation
Banglei Guan, Ji Zhao
[pdf]
[DOI]

GraphFit: Learning Multi-Scale Graph-Convolutional Representation for Point Cloud Normal Estimation
Keqiang Li, Mingyang Zhao, Huaiyu Wu, Dong-Ming Yan, Zhen Shen, Fei-Yue Wang, Gang Xiong
[pdf]
[DOI]

IS-MVSNet: Importance Sampling-Based MVSNet
Likang Wang, Yue Gong, Xinjun Ma, Qirui Wang, Kaixuan Zhou, Lei Chen
[pdf]
[DOI]

Point Scene Understanding via Disentangled Instance Mesh Reconstruction
Jiaxiang Tang, Xiaokang Chen, Jingbo Wang, Gang Zeng
[pdf]
[DOI]

DiffuStereo: High Quality Human Reconstruction via Diffusion-Based Stereo Using Sparse Cameras
Ruizhi Shao, Zerong Zheng, Hongwen Zhang, Jingxiang Sun, Yebin Liu
[pdf]
[DOI]

Space-Partitioning RANSAC
Daniel Barath, Gábor Valasek
[pdf]
[DOI]

SimpleRecon: 3D Reconstruction without 3D Convolutions
Mohamed Sayed, John Gibson, Jamie Watson, Victor Prisacariu, Michael Firman, Clément Godard
[pdf]
[DOI]

Structure and Motion from Casual Videos
Zhoutong Zhang, Forrester Cole, Zhengqi Li, Noah Snavely, Michael Rubinstein, William T. Freeman
[pdf]
[DOI]

What Matters for 3D Scene Flow Network
Guangming Wang, Yunzhe Hu, Zhe Liu, Yiyang Zhou, Masayoshi Tomizuka, Wei Zhan, Hesheng Wang
[pdf]
[DOI]

Correspondence Reweighted Translation Averaging
Lalit Manam, Venu Madhav Govindu
[pdf]
[DOI]

Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images
Radu Alexandru Rosu, Shunsuke Saito, Ziyan Wang, Chenglei Wu, Sven Behnke, Giljoo Nam
[pdf]
[DOI]

GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs
Xin Liu, Xiaofei Shao, Bo Wang, Yali Li, Shengjin Wang
[pdf]
[DOI]

Objects Can Move: 3D Change Detection by Geometric Transformation Consistency
Aikaterini Adam, Torsten Sattler, Konstantinos Karantzalos, Tomas Pajdla
[pdf]
[DOI]

Language-Grounded Indoor 3D Semantic Segmentation in the Wild
Dávid Rozenberszki, Or Litany, Angela Dai
[pdf]
[DOI]

Beyond Periodicity: Towards a Unifying Framework for Activations in Coordinate-MLPs
Sameera Ramasinghe, Simon Lucey
[pdf]
[DOI]

Deforming Radiance Fields with Cages
Tianhan Xu, Tatsuya Harada
[pdf]
[DOI]

FLEX: Extrinsic Parameters-Free Multi-View 3D Human Motion Reconstruction
Brian Gordon, Sigal Raab, Guy Azov, Raja Giryes, Daniel Cohen-Or
[pdf]
[DOI]

MODE: Multi-View Omnidirectional Depth Estimation with 360° Cameras
Ming Li, Xueqian Jin, Xuejiao Hu, Jingzhao Dai, Sidan Du, Yang Li
[pdf]
[DOI]

GigaDepth: Learning Depth from Structured Light with Branching Neural Networks
Simon Schreiberhuber, Jean-Baptiste Weibel, Timothy Patten, Markus Vincze
[pdf]
[DOI]

ActiveNeRF: Learning Where to See with Uncertainty Estimation
Xuran Pan, Zihang Lai, Shiji Song, Gao Huang
[pdf]
[DOI]

PoserNet: Refining Relative Camera Poses Exploiting Object Detections
Matteo Taiana, Matteo Toso, Stuart James, Alessio Del Bue
[pdf]
[DOI]

Gaussian Activated Neural Radiance Fields for High Fidelity Reconstruction & Pose Estimation
Shin-Fang Chng, Sameera Ramasinghe, Jamie Sherrah, Simon Lucey
[pdf]
[DOI]

Unbiased Gradient Estimation for Differentiable Surface Splatting via Poisson Sampling
Jan U. Müller, Michael Weinmann, Reinhard Klein
[pdf]
[DOI]

Towards Learning Neural Representations from Shadows
Kushagra Tiwary, Tzofi Klinghoffer, Ramesh Raskar
[pdf]
[DOI]

Class-Incremental Novel Class Discovery
Subhankar Roy, Mingxuan Liu, Zhun Zhong, Nicu Sebe, Elisa Ricci
[pdf]
[DOI]

Unknown-Oriented Learning for Open Set Domain Adaptation
Jie Liu, Xiaoqing Guo, Yixuan Yuan
[pdf]
[DOI]

Prototype-Guided Continual Adaptation for Class-Incremental Unsupervised Domain Adaptation
Hongbin Lin, Yifan Zhang, Zhen Qiu, Shuaicheng Niu, Chuang Gan, Yanxia Liu, Mingkui Tan
[pdf]
[DOI]

DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation
Xin Lai, Zhuotao Tian, Xiaogang Xu, Yingcong Chen, Shu Liu, Hengshuang Zhao, Liwei Wang, Jiaya Jia
[pdf]
[DOI]

Class-Agnostic Object Counting Robust to Intraclass Diversity
Shenjian Gong, Shanshan Zhang, Jian Yang, Dengxin Dai, Bernt Schiele
[pdf]
[DOI]

Burn after Reading: Online Adaptation for Cross-Domain Streaming Data
Luyu Yang, Mingfei Gao, Zeyuan Chen, Ran Xu, Abhinav Shrivastava, Chetan Ramaiah
[pdf]
[DOI]

Mind the Gap in Distilling StyleGANs
Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy
[pdf]
[DOI]

Improving Test-Time Adaptation via Shift-Agnostic Weight Regularization and Nearest Source Prototypes
Sungha Choi, Seunghan Yang, Seokeon Choi, Sungrack Yun
[pdf]
[DOI]

Learning Instance-Specific Adaptation for Cross-Domain Segmentation
Yuliang Zou, Zizhao Zhang, Chun-Liang Li, Han Zhang, Tomas Pfister, Jia-Bin Huang
[pdf]
[DOI]

RegionCL: Exploring Contrastive Region Pairs for Self-Supervised Representation Learning
Yufei Xu, Qiming Zhang, Jing Zhang, Dacheng Tao
[pdf]
[DOI]

Long-Tailed Class Incremental Learning
Xialei Liu, Yu-Song Hu, Xu-Sheng Cao, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng
[pdf]
[DOI]

DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning
Hyounguk Shon, Janghyeon Lee, Seung Hwan Kim, Junmo Kim
[pdf]
[DOI]

Adversarial Partial Domain Adaptation by Cycle Inconsistency
Kun-Yu Lin, Jiaming Zhou, Yukun Qiu, Wei-Shi Zheng
[pdf]
[DOI]

Combating Label Distribution Shift for Active Domain Adaptation
Sehyun Hwang, Sohyun Lee, Sungyeon Kim, Jungseul Ok, Suha Kwak
[pdf]
[DOI]

GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation
Cristiano Saltori, Evgeny Krivosheev, Stéphane Lathuilière, Nicu Sebe, Fabio Galasso, Giuseppe Fiameni, Elisa Ricci, Fabio Poiesi
[pdf]
[DOI]

CoSMix: Compositional Semantic Mix for Domain Adaptation in 3D LiDAR Segmentation
Cristiano Saltori, Fabio Galasso, Giuseppe Fiameni, Nicu Sebe, Elisa Ricci, Fabio Poiesi
[pdf]
[DOI]

A Unified Framework for Domain Adaptive Pose Estimation
Donghyun Kim, Kaihong Wang, Kate Saenko, Margrit Betke, Stan Sclaroff
[pdf]
[DOI]

A Broad Study of Pre-training for Domain Generalization and Adaptation
Donghyun Kim, Kaihong Wang, Stan Sclaroff, Kate Saenko
[pdf]
[DOI]

Prior Knowledge Guided Unsupervised Domain Adaptation
Tao Sun, Cheng Lu, Haibin Ling
[pdf]
[DOI]

GCISG: Guided Causal Invariant Learning for Improved Syn-to-Real Generalization
Gilhyun Nam, Gyeongjae Choi, Kyungmin Lee
[pdf]
[DOI]

AcroFOD: An Adaptive Method for Cross-Domain Few-Shot Object Detection
Yipeng Gao, Lingxiao Yang, Yunmu Huang, Song Xie, Shiyong Li, Wei-Shi Zheng
[pdf]
[DOI]

Unsupervised Domain Adaptation for One-Stage Object Detector Using Offsets to Bounding Box
Jayeon Yoo, Inseop Chung, Nojun Kwak
[pdf]
[DOI]

Visual Prompt Tuning
Menglin Jia, Luming Tang, Bor-Chun Chen, Claire Cardie, Serge Belongie, Bharath Hariharan, Ser-Nam Lim
[pdf]
[DOI]

Quasi-Balanced Self-Training on Noise-Aware Synthesis of Object Point Clouds for Closing Domain Gap
Yongwei Chen, Zihao Wang, Longkun Zou, Ke Chen, Kui Jia
[pdf]
[DOI]

Interpretable Open-Set Domain Adaptation via Angular Margin Separation
Xinhao Li, Jingjing Li, Zhekai Du, Lei Zhu, Wen Li
[pdf]
[DOI]

TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation
Rui Gong, Martin Danelljan, Dengxin Dai, Danda Pani Paudel, Ajad Chhatkuli, Fisher Yu, Luc Van Gool
[pdf]
[DOI]

Prototypical Contrast Adaptation for Domain Adaptive Semantic Segmentation
Zhengkai Jiang, Yuxi Li, Ceyuan Yang, Peng Gao, Yabiao Wang, Ying Tai, Chengjie Wang
[pdf]
[DOI]

RBC: Rectifying the Biased Context in Continual Semantic Segmentation
Hanbin Zhao, Fengyu Yang, Xinghe Fu, Xi Li
[pdf]
[DOI]

Factorizing Knowledge in Neural Networks
Xingyi Yang, Jingwen Ye, Xinchao Wang
[pdf]
[DOI]

Contrastive Vicinal Space for Unsupervised Domain Adaptation
Jaemin Na, Dongyoon Han, Hyung Jin Chang, Wonjun Hwang
[pdf]
[DOI]

Cross-Modal Knowledge Transfer without Task-Relevant Source Data
Sk Miraj Ahmed, Suhas Lohit, Kuan-Chuan Peng, Michael J. Jones, Amit K. Roy-Chowdhury
[pdf]
[DOI]

Online Domain Adaptation for Semantic Segmentation in Ever-Changing Conditions
Theodoros Panagiotakopoulos, Pier Luigi Dovesi, Linus Härenstam-Nielsen, Matteo Poggi
[pdf]
[DOI]

Source-Free Video Domain Adaptation by Learning Temporal Consistency for Action Recognition
Yuecong Xu, Jianfei Yang, Haozhi Cao, Keyu Wu, Min Wu, Zhenghua Chen
[pdf]
[DOI]

BMD: A General Class-Balanced Multicentric Dynamic Prototype Strategy for Source-Free Domain Adaptation
Sanqing Qu, Guang Chen, Jing Zhang, Zhijun Li, Wei He, Dacheng Tao
[pdf]
[DOI]

Generalized Brain Image Synthesis with Transferable Convolutional Sparse Coding Networks
Yawen Huang, Feng Zheng, Xu Sun, Yuexiang Li, Ling Shao, Yefeng Zheng
[pdf]
[DOI]

Incomplete Multi-View Domain Adaptation via Channel Enhancement and Knowledge Transfer
Haifeng Xia, Pu Wang, Zhengming Ding
[pdf]
[DOI]

DistPro: Searching a Fast Knowledge Distillation Process via Meta Optimization
Xueqing Deng, Dawei Sun, Shawn Newsam, Peng Wang
[pdf]
[DOI]

ML-BPM: Multi-Teacher Learning with Bidirectional Photometric Mixing for Open Compound Domain Adaptation in Semantic Segmentation
Fei Pan, Sungsu Hur, Seokju Lee, Junsik Kim, In So Kweon
[pdf]
[DOI]

PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks
Nan Ding, Xi Chen, Tomer Levinboim, Soravit Changpinyo, Radu Soricut
[pdf]
[DOI]

Personalized Education: Blind Knowledge Distillation
Xiang Deng, Jian Zheng, Zhongfei Zhang
[pdf]
[DOI]

Not All Models Are Equal: Predicting Model Transferability in a Self-Challenging Fisher Space
Wenqi Shao, Xun Zhao, Yixiao Ge, Zhaoyang Zhang, Lei Yang, Xiaogang Wang, Ying Shan, Ping Luo
[pdf]
[DOI]

How Stable Are Transferability Metrics Evaluations?
Andrea Agostinelli, Michal Pándy, Jasper Uijlings, Thomas Mensink, Vittorio Ferrari
[pdf]
[DOI]

Attention Diversification for Domain Generalization
Rang Meng, Xianfeng Li, Weijie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Mingli Song, Di Xie, Shiliang Pu
[pdf]
[DOI]

ESS: Learning Event-Based Semantic Segmentation from Still Images
Zhaoning Sun, Nico Messikommer, Daniel Gehrig, Davide Scaramuzza
[pdf]
[DOI]

An Efficient Spatio-Temporal Pyramid Transformer for Action Detection
Yuetian Weng, Zizheng Pan, Mingfei Han, Xiaojun Chang, Bohan Zhuang
[pdf]
[DOI]

Human Trajectory Prediction via Neural Social Physics
Jiangbei Yue, Dinesh Manocha, He Wang
[pdf]
[DOI]

Towards Open Set Video Anomaly Detection
Yuansheng Zhu, Wentao Bao, Qi Yu
[pdf]
[DOI]

ECLIPSE: Efficient Long-Range Video Retrieval Using Sight and Sound
Yan-Bo Lin, Jie Lei, Mohit Bansal, Gedas Bertasius
[pdf]
[DOI]

Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
Haoyue Cheng, Zhaoyang Liu, Hang Zhou, Chen Qian, Wayne Wu, Limin Wang
[pdf]
[DOI]

Less than Few: Self-Shot Video Instance Segmentation
Pengwan Yang, Yuki M. Asano, Pascal Mettes, Cees G. M. Snoek
[pdf]
[DOI]

Adaptive Face Forgery Detection in Cross Domain
Luchuan Song, Zheng Fang, Xiaodan Li, Xiaoyi Dong, Zhenchao Jin, Yuefeng Chen, Siwei Lyu
[pdf]
[DOI]

Real-Time Online Video Detection with Temporal Smoothing Transformers
Yue Zhao, Philipp Krähenbühl
[pdf]
[DOI]

TALLFormer: Temporal Action Localization with a Long-Memory Transformer
Feng Cheng, Gedas Bertasius
[pdf]
[DOI]

Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation
Guolei Sun, Yun Liu, Hao Tang, Ajad Chhatkuli, Le Zhang, Luc Van Gool
[pdf]
[DOI]

TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency
Medhini Narasimhan, Arsha Nagrani, Chen Sun, Michael Rubinstein, Trevor Darrell, Anna Rohrbach, Cordelia Schmid
[pdf]
[DOI]

Rethinking Learning Approaches for Long-Term Action Anticipation
Megha Nawhal, Akash Abdu Jyothi, Greg Mori
[pdf]
[DOI]

DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
Yuxuan Liang, Pan Zhou, Roger Zimmermann, Shuicheng Yan
[pdf]
[DOI]

Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation
Gensheng Pei, Fumin Shen, Yazhou Yao, Guo-Sen Xie, Zhenmin Tang, Jinhui Tang
[pdf]
[DOI]

PAC-Net: Highlight Your Video via History Preference Modeling
Hang Wang, Penghao Zhou, Chong Zhou, Zhao Zhang, Xing Sun
[pdf]
[DOI]

How Severe Is Benchmark-Sensitivity in Video Self-Supervised Learning?
Fida Mohammad Thoker, Hazel Doughty, Piyush Bagad, Cees G. M. Snoek
[pdf]
[DOI]

A Sliding Window Scheme for Online Temporal Action Localization
Young Hwi Kim, Hyolim Kang, Seon Joo Kim
[pdf]
[DOI]

ERA: Expert Retrieval and Assembly for Early Action Prediction
Lin Geng Foo, Tianjiao Li, Hossein Rahmani, Qiuhong Ke, Jun Liu
[pdf]
[DOI]

Dual Perspective Network for Audio-Visual Event Localization
Varshanth Rao, Md Ibrahim Khalil, Haoda Li, Peng Dai, Juwei Lu
[pdf]
[DOI]

NSNet: Non-Saliency Suppression Sampler for Efficient Video Recognition
Boyang Xia, Wenhao Wu, Haoran Wang, Rui Su, Dongliang He, Haosen Yang, Xiaoran Fan, Wanli Ouyang
[pdf]
[DOI]

Video Activity Localisation with Uncertainties in Temporal Boundary
Jiabo Huang, Hailin Jin, Shaogang Gong, Yang Liu
[pdf]
[DOI]

Temporal Saliency Query Network for Efficient Video Recognition
Boyang Xia, Zhihao Wang, Wenhao Wu, Haoran Wang, Jungong Han
[pdf]
[DOI]

Efficient One-Stage Video Object Detection by Exploiting Temporal Consistency
Guanxiong Sun, Yang Hua, Guosheng Hu, Neil Robertson
[pdf]
[DOI]

Leveraging Action Affinity and Continuity for Semi-Supervised Temporal Action Segmentation
Guodong Ding, Angela Yao
[pdf]
[DOI]

Spotting Temporally Precise, Fine-Grained Events in Video
James Hong, Haotian Zhang, Michaël Gharbi, Matthew Fisher, Kayvon Fatahalian
[pdf]
[DOI]

Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation
Nadine Behrmann, S. Alireza Golestaneh, Zico Kolter, Jürgen Gall, Mehdi Noroozi
[pdf]
[DOI]

Efficient Video Transformers with Spatial-Temporal Token Selection
Junke Wang, Xitong Yang, Hengduo Li, Li Liu, Zuxuan Wu, Yu-Gang Jiang
[pdf]
[DOI]

Long Movie Clip Classification with State-Space Video Models
Md Mohaiminul Islam, Gedas Bertasius
[pdf]
[DOI]

Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
[pdf]
[DOI]

Asymmetric Relation Consistency Reasoning for Video Relation Grounding
Huan Li, Ping Wei, Jiapeng Li, Zeyu Ma, Jiahui Shang, Nanning Zheng
[pdf]
[DOI]

Self-Supervised Social Relation Representation for Human Group Detection
Jiacheng Li, Ruize Han, Haomin Yan, Zekun Qian, Wei Feng, Song Wang
[pdf]
[DOI]

K-Centered Patch Sampling for Efficient Video Recognition
Seong Hyeon Park, Jihoon Tack, Byeongho Heo, Jung-Woo Ha, Jinwoo Shin
[pdf]
[DOI]

A Deep Moving-Camera Background Model
Guy Erez, Ron Shapira Weber, Oren Freifeld
[pdf]
[DOI]

GraphVid: It Only Takes a Few Nodes to Understand a Video
Eitan Kosman, Dotan Di Castro
[pdf]
[DOI]

Delta Distillation for Efficient Video Processing
Amirhossein Habibian, Haitam Ben Yahia, Davide Abati, Efstratios Gavves, Fatih Porikli
[pdf]
[DOI]

MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning
David Junhao Zhang, Kunchang Li, Yali Wang, Yunpeng Chen, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou
[pdf]
[DOI]

COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality
Honglu Zhou, Asim Kadav, Aviv Shamsian, Shijie Geng, Farley Lai, Long Zhao, Ting Liu, Mubbasir Kapadia, Hans Peter Graf
[pdf]
[DOI]

E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context
Zizhang Li, Mengmeng Wang, Huaijin Pi, Kechun Xu, Jianbiao Mei, Yong Liu
[pdf]
[DOI]

TDViT: Temporal Dilated Video Transformer for Dense Video Tasks
Guanxiong Sun, Yang Hua, Guosheng Hu, Neil Robertson
[pdf]
[DOI]

Semi-Supervised Learning of Optical Flow by Flow Supervisor
Woobin Im, Sebin Lee, Sung-Eui Yoon
[pdf]
[DOI]

Flow Graph to Video Grounding for Weakly-Supervised Multi-step Localization
Nikita Dvornik, Isma Hadji, Hai Pham, Dhaivat Bhatt, Brais Martinez, Afsaneh Fazly, Allan D. Jepson
[pdf]
[DOI]

Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion
Yiheng Li, Connelly Barnes, Kun Huang, Fang-Lue Zhang
[pdf]
[DOI]

MaCLR: Motion-Aware Contrastive Learning of Representations for Videos
Fanyi Xiao, Joseph Tighe, Davide Modolo
[pdf]
[DOI]

Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
Kyle Min, Sourya Roy, Subarna Tripathi, Tanaya Guha, Somdeb Majumdar
[pdf]
[DOI]

Frozen CLIP Models Are Efficient Video Learners
Ziyi Lin, Shijie Geng, Renrui Zhang, Peng Gao, Gerard de Melo, Xiaogang Wang, Jifeng Dai, Yu Qiao, Hongsheng Li
[pdf]
[DOI]

PIP: Physical Interaction Prediction via Mental Simulation with Span Selection
Jiafei Duan, Samson Yu, Soujanya Poria, Bihan Wen, Cheston Tan
[pdf]
[DOI]

Panoramic Vision Transformer for Saliency Detection in 360° Videos
Heeseung Yun, Sehun Lee, Gunhee Kim
[pdf]
[DOI]

Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration
Aditi Basu Bal, Ramy Mounir, Sathyanarayanan Aakur, Sudeep Sarkar, Anuj Srivastava
[pdf]
[DOI]

Motion Sensitive Contrastive Learning for Self-Supervised Video Representation
Jingcheng Ni, Nan Zhou, Jie Qin, Qian Wu, Junqi Liu, Boxun Li, Di Huang
[pdf]
[DOI]

Dynamic Temporal Filtering in Video Models
Fuchen Long, Zhaofan Qiu, Yingwei Pan, Ting Yao, Chong-Wah Ngo, Tao Mei
[pdf]
[DOI]

Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification
Renrui Zhang, Wei Zhang, Rongyao Fang, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li
[pdf]
[DOI]

Temporal Lift Pooling for Continuous Sign Language Recognition
Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng
[pdf]
[DOI]

MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes
Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang
[pdf]
[DOI]

SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding
Mengxue Qu, Yu Wu, Wu Liu, Qiqi Gong, Xiaodan Liang, Olga Russakovsky, Yao Zhao, Yunchao Wei
[pdf]
[DOI]

Cross-Modal Prototype Driven Network for Radiology Report Generation
Jun Wang, Abhir Bhalerao, Yulan He
[pdf]
[DOI]

TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts
Chuan Guo, Xinxin Zuo, Sen Wang, Li Cheng
[pdf]
[DOI]

SeqTR: A Simple Yet Universal Network for Visual Grounding
Chaoyang Zhu, Yiyi Zhou, Yunhang Shen, Gen Luo, Xingjia Pan, Mingbao Lin, Chao Chen, Liujuan Cao, Xiaoshuai Sun, Rongrong Ji
[pdf]
[DOI]

VTC: Improving Video-Text Retrieval with User Comments
Laura Hanu, James Thewlis, Yuki M. Asano, Christian Rupprecht
[pdf]
[DOI]

FashionViL: Fashion-Focused Vision-and-Language Representation Learning
Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang
[pdf]
[DOI]

Weakly Supervised Grounding for VQA in Vision-Language Transformers
Aisha Urooj, Hilde Kuehne, Chuang Gan, Niels Da Vitoria Lobo, Mubarak Shah
[pdf]
[DOI]

Automatic Dense Annotation of Large-Vocabulary Sign Language Videos
Liliane Momeni, Hannah Bull, K R Prajwal, Samuel Albanie, Gül Varol, Andrew Zisserman
[pdf]
[DOI]

MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval
Yuying Ge, Yixiao Ge, Xihui Liu, Jinpeng Wang, Jianping Wu, Ying Shan, Xiaohu Qie, Ping Luo
[pdf]
[DOI]

GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval
Yuxuan Wang, Difei Gao, Licheng Yu, Weixian Lei, Matt Feiszli, Mike Zheng Shou
[pdf]
[DOI]

A Simple and Robust Correlation Filtering Method for Text-Based Person Search
Wei Suo, Mengyang Sun, Kai Niu, Yiqi Gao, Peng Wang, Yanning Zhang, Qi Wu
[pdf]
[DOI]

Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing
Benedikt Boecking, Naoto Usuyama, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Stephanie Hyland, Maria Wetscherek, Tristan Naumann, Aditya Nori, Javier Alvarez-Valle, Hoifung Poon, Ozan Oktay
[pdf]
[DOI]

Generative Negative Text Replay for Continual Vision-Language Pretraining
Shipeng Yan, Lanqing Hong, Hang Xu, Jianhua Han, Tinne Tuytelaars, Zhenguo Li, Xuming He
[pdf]
[DOI]

Video Graph Transformer for Video Question Answering
Junbin Xiao, Pan Zhou, Tat-Seng Chua, Shuicheng Yan
[pdf]
[DOI]

Trace Controlled Text to Image Generation
Kun Yan, Lei Ji, Chenfei Wu, Jianmin Bao, Ming Zhou, Nan Duan, Shuai Ma
[pdf]
[DOI]

Video Question Answering with Iterative Video-Text Co-Tokenization
AJ Piergiovanni, Kairo Morton, Weicheng Kuo, Michael S. Ryoo, Anelia Angelova
[pdf]
[DOI]

Rethinking Data Augmentation for Robust Visual Question Answering
Long Chen, Yuhang Zheng, Jun Xiao
[pdf]
[DOI]

Explicit Image Caption Editing
Zhen Wang, Long Chen, Wenbo Ma, Guangxing Han, Yulei Niu, Jian Shao, Jun Xiao
[pdf]
[DOI]

Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding
Jiachang Hao, Haifeng Sun, Pengfei Ren, Jingyu Wang, Qi Qi, Jianxin Liao
[pdf]
[DOI]

Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly
Spencer Whitehead, Suzanne Petryk, Vedaad Shakib, Joseph Gonzalez, Trevor Darrell, Anna Rohrbach, Marcus Rohrbach
[pdf]
[DOI]

GRIT: Faster and Better Image Captioning Transformer Using Dual Visual Features
Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani
[pdf]
[DOI]

Selective Query-Guided Debiasing for Video Corpus Moment Retrieval
Sunjae Yoon, Ji Woo Hong, Eunseop Yoon, Dahyun Kim, Junyeong Kim, Hee Suk Yoon, Chang D. Yoo
[pdf]
[DOI]

Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding
Cheng Shi, Sibei Yang
[pdf]
[DOI]

Object-Centric Unsupervised Image Captioning
Zihang Meng, David Yang, Xuefei Cao, Ashish Shah, Ser-Nam Lim
[pdf]
[DOI]

Contrastive Vision-Language Pre-training with Limited Resources
Quan Cui, Boyan Zhou, Yu Guo, Weidong Yin, Hao Wu, Osamu Yoshie, Yubo Chen
[pdf]
[DOI]

Learning Linguistic Association towards Efficient Text-Video Retrieval
Sheng Fang, Shuhui Wang, Junbao Zhuo, Xinzhe Han, Qingming Huang
[pdf]
[DOI]

ASSISTER: Assistive Navigation via Conditional Instruction Generation
Zanming Huang, Zhongkai Shangguan, Jimuyang Zhang, Gilad Bar, Matthew Boyd, Eshed Ohn-Bar
[pdf]
[DOI]

X-DETR: A Versatile Architecture for Instance-Wise Vision-Language Tasks
Zhaowei Cai, Gukyeong Kwon, Avinash Ravichandran, Erhan Bas, Zhuowen Tu, Rahul Bhotika, Stefano Soatto
[pdf]
[DOI]

Learning Disentanglement with Decoupled Labels for Vision-Language Navigation
Wenhao Cheng, Xingping Dong, Salman Khan, Jianbing Shen
[pdf]
[DOI]

Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input
Qingpei Guo, Kaisheng Yao, Wei Chu
[pdf]
[DOI]

Word-Level Fine-Grained Story Visualization
Bowen Li
[pdf]
[DOI]

Unifying Event Detection and Captioning as Sequence Generation via Pre-training
Qi Zhang, Yuqing Song, Qin Jin
[pdf]
[DOI]

Multimodal Transformer with Variable-Length Memory for Vision-and-Language Navigation
Chuang Lin, Yi Jiang, Jianfei Cai, Lizhen Qu, Gholamreza Haffari, Zehuan Yuan
[pdf]
[DOI]

Fine-Grained Visual Entailment
Christopher Thomas, Yipeng Zhang, Shih-Fu Chang
[pdf]
[DOI]

Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds
Ayush Jain, Nikolaos Gkanatsios, Ishita Mediratta, Katerina Fragkiadaki
[pdf]
[DOI]

New Datasets and Models for Contextual Reasoning in Visual Dialog
Yifeng Zhang, Ming Jiang, Qi Zhao
[pdf]
[DOI]

VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection
Joanna Hong, Minsu Kim, Yong Man Ro
[pdf]
[DOI]

Classification-Regression for Chart Comprehension
Matan Levy, Rami Ben-Ari, Dani Lischinski
[pdf]
[DOI]

AssistQ: Affordance-Centric Question-Driven Task Completion for Egocentric Assistant
Benita Wong, Joya Chen, You Wu, Stan Weixian Lei, Dongxing Mao, Difei Gao, Mike Zheng Shou
[pdf]
[DOI]

FindIt: Generalized Localization with Natural Language Queries
Weicheng Kuo, Fred Bertsch, Wei Li, AJ Piergiovanni, Mohammad Saffar, Anelia Angelova
[pdf]
[DOI]

UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang, Zhe Gan, Jianfeng Wang, Xiaowei Hu, Faisal Ahmed, Zicheng Liu, Yumao Lu, Lijuan Wang
[pdf]
[DOI]

Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Golnaz Ghiasi, Xiuye Gu, Yin Cui, Tsung-Yi Lin
[pdf]
[DOI]

The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning
Jack Hessel, Jena D. Hwang, Jae Sung Park, Rowan Zellers, Chandra Bhagavatula, Anna Rohrbach, Kate Saenko, Yejin Choi
[pdf]
[DOI]

Speaker-Adaptive Lip Reading with User-Dependent Padding
Minsu Kim, Hyunjun Kim, Yong Man Ro
[pdf]
[DOI]

TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation
Tan M. Dinh, Rang Nguyen, Binh-Son Hua
[pdf]
[DOI]

SemAug: Semantically Meaningful Image Augmentations for Object Detection through Language Grounding
Morgan Heisler, Amin Banitalebi-Dehkordi, Yong Zhang
[pdf]
[DOI]

Referring Object Manipulation of Natural Images with Conditional Classifier-Free Guidance
Myungsub Choi
[pdf]
[DOI]

NewsStories: Illustrating Articles with Visual Summaries
Reuben Tan, Bryan A. Plummer, Kate Saenko, JP Lewis, Avneesh Sud, Thomas Leung
[pdf]
[DOI]

Webly Supervised Concept Expansion for General Purpose Vision Models
Amita Kamath, Christopher Clark, Tanmay Gupta, Eric Kolve, Derek Hoiem, Aniruddha Kembhavi
[pdf]
[DOI]

FedVLN: Privacy-Preserving Federated Vision-and-Language Navigation
Kaiwen Zhou, Xin Eric Wang
[pdf]
[DOI]

CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval
Haoran Wang, Dongliang He, Wenhao Wu, Boyang Xia, Min Yang, Fu Li, Yunlong Yu, Zhong Ji, Errui Ding, Jingdong Wang
[pdf]
[DOI]

Language-Driven Artistic Style Transfer
Tsu-Jui Fu, Xin Eric Wang, William Yang Wang
[pdf]
[DOI]

Single-Stream Multi-level Alignment for Vision-Language Pretraining
Zaid Khan, Vijay Kumar B G, Xiang Yu, Samuel Schulter, Manmohan Chandraker, Yun Fu
[pdf]
[DOI]

Most and Least Retrievable Images in Visual-Language Query Systems
Liuwan Zhu, Rui Ning, Jiang Li, Chunsheng Xin, Hongyi Wu
[pdf]
[DOI]

Sports Video Analysis on Large-Scale Data
Dekun Wu, He Zhao, Xingce Bao, Richard P. Wildes
[pdf]
[DOI]

Grounding Visual Representations with Texts for Domain Generalization
Seonwoo Min, Nokyung Park, Siwon Kim, Seunghyun Park, Jinkyu Kim
[pdf]
[DOI]

Bridging the Visual Semantic Gap in VLN via Semantically Richer Instructions
Joaquín Ossandón, Benjamín Earle, Alvaro Soto
[pdf]
[DOI]

StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation
Adyasha Maharana, Darryl Hannan, Mohit Bansal
[pdf]
[DOI]

VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
Katherine Crowson, Stella Biderman, Daniel Kornis, Dashiell Stander, Eric Hallahan, Louis Castricato, Edward Raff
[pdf]
[DOI]

Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation
Xian Liu, Yinghao Xu, Qianyi Wu, Hang Zhou, Wayne Wu, Bolei Zhou
[pdf]
[DOI]

End-to-End Active Speaker Detection
Juan León Alcázar, Moritz Cordes, Chen Zhao, Bernard Ghanem
[pdf]
[DOI]

Emotion Recognition for Multiple Context Awareness
Dingkang Yang, Shuai Huang, Shunli Wang, Yang Liu, Peng Zhai, Liuzhen Su, Mingcheng Li, Lihua Zhang
[pdf]
[DOI]

Adaptive Fine-Grained Sketch-Based Image Retrieval
Ayan Kumar Bhunia, Aneeshan Sain, Parth Hiren Shah, Animesh Gupta, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
[pdf]
[DOI]

Quantized GAN for Complex Music Generation from Dance Videos
Ye Zhu, Kyle Olszewski, Yu Wu, Panos Achlioptas, Menglei Chai, Yan Yan, Sergey Tulyakov
[pdf]
[DOI]

Uncertainty-Aware Multi-modal Learning via Cross-Modal Random Network Prediction
Hu Wang, Jianpeng Zhang, Yuanhong Chen, Congbo Ma, Jodie Avery, Louise Hull, Gustavo Carneiro
[pdf]
[DOI]

Localizing Visual Sounds the Easy Way
Shentong Mo, Pedro Morgado
[pdf]
[DOI]

Learning Visual Styles from Audio-Visual Associations
Tingle Li, Yichen Liu, Andrew Owens, Hang Zhao
[pdf]
[DOI]

Remote Respiration Monitoring of Moving Person Using Radio Signals
Jae-Ho Choi, Ki-Bong Kang, Kyung-Tae Kim
[pdf]
[DOI]

Camera Pose Estimation and Localization with Active Audio Sensing
Karren Yang, Michael Firman, Eric Brachmann, Clément Godard
[pdf]
[DOI]

PACS: A Dataset for Physical Audiovisual Commonsense Reasoning
Samuel Yu, Peter Wu, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency
[pdf]
[DOI]

VoViT: Low Latency Graph-Based Audio-Visual Voice Separation Transformer
Juan F. Montesinos, Venkatesh S. Kadandale, Gloria Haro
[pdf]
[DOI]

Telepresence Video Quality Assessment
Zhenqiang Ying, Deepti Ghadiyaram, Alan Bovik
[pdf]
[DOI]

MultiMAE: Multi-modal Multi-task Masked Autoencoders
Roman Bachmann, David Mizrahi, Andrei Atanov, Amir Zamir
[pdf]
[DOI]

AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation
Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey
[pdf]
[DOI]

Audio-Visual Segmentation
Jinxing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong
[pdf]
[DOI]

Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression
Yeying Jin, Wenhan Yang, Robby T. Tan
[pdf]
[DOI]

Relationformer: A Unified Framework for Image-to-Graph Generation
Suprosanna Shit, Rajat Koner, Bastian Wittmann, Johannes Paetzold, Ivan Ezhov, Hongwei Li, Jiazhen Pan, Sahand Sharifzadeh, Georgios Kaissis, Volker Tresp, Bjoern Menze
[pdf]
[DOI]

GAMa: Cross-view Video Geo-localization
Shruti Vyas, Chen Chen, Mubarak Shah
[pdf]
[DOI]

Revisiting a kNN-based Image Classification System with High-capacity Storage
Kengo Nakata, Youyang Ng, Daisuke Miyashita, Asuka Maki, Yu-Chieh Lin, Jun Deguchi
[pdf]
[DOI]

Geometric Representation Learning for Document Image Rectification
Hao Feng, Wengang Zhou, Jiajun Deng, Yuechen Wang, Houqiang Li
[pdf]
[DOI]

S2-VER: Semi-Supervised Visual Emotion Recognition
Guoli Jia, Jufeng Yang
[pdf]
[DOI]

Image Coding for Machines with Omnipotent Feature Learning
Ruoyu Feng, Xin Jin, Zongyu Guo, Runsen Feng, Yixin Gao, Tianyu He, Zhizheng Zhang, Simeng Sun, Zhibo Chen
[pdf]
[DOI]

Feature Representation Learning for Unsupervised Cross-Domain Image Retrieval
Conghui Hu, Gim Hee Lee
[pdf]
[DOI]

Fashionformer: A Simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition
Shilin Xu, Xiangtai Li, Jingbo Wang, Guangliang Cheng, Yunhai Tong, Dacheng Tao
[pdf]
[DOI]

Semantic-Guided Multi-Mask Image Harmonization
Xuqian Ren, Yifan Liu
[pdf]
[DOI]

Learning an Isometric Surface Parameterization for Texture Unwrapping
Sagnik Das, Ke Ma, Zhixin Shu, Dimitris Samaras
[pdf]
[DOI]

Towards Regression-Free Neural Networks for Diverse Compute Platforms
Rahul Duggal, Hao Zhou, Shuo Yang, Jun Fang, Yuanjun Xiong, Wei Xia
[pdf]
[DOI]

Relationship Spatialization for Depth Estimation
Xiaoyu Xu, Jiayan Qiu, Xinchao Wang, Zhou Wang
[pdf]
[DOI]

Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models
Chenfeng Xu, Shijia Yang, Tomer Galanti, Bichen Wu, Xiangyu Yue, Bohan Zhai, Wei Zhan, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka
[pdf]
[DOI]

FAR: Fourier Aerial Video Recognition
Divya Kothandaraman, Tianrui Guan, Xijun Wang, Shuowen Hu, Ming Lin, Dinesh Manocha
[pdf]
[DOI]

Translating a Visual LEGO Manual to a Machine-Executable Plan
Ruocheng Wang, Yunzhi Zhang, Jiayuan Mao, Chin-Yi Cheng, Jiajun Wu
[pdf]
[DOI]

Fabric Material Recovery from Video Using Multi-Scale Geometric Auto-Encoder
Junbang Liang, Ming Lin
[pdf]
[DOI]

MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment
Jie Ren, Wenteng Liang, Ran Yan, Luo Mai, Shiwen Liu, Xiao Liu
[pdf]
[DOI]

The One Where They Reconstructed 3D Humans and Environments in TV Shows
Georgios Pavlakos, Ethan Weber, Matthew Tancik, Angjoo Kanazawa
[pdf]
[DOI]

TALISMAN: Targeted Active Learning for Object Detection with Rare Classes and Slices Using Submodular Mutual Information
Suraj Kothawade, Saikat Ghosh, Sumit Shekhar, Yu Xiang, Rishabh Iyer
[pdf]
[DOI]

An Efficient Person Clustering Algorithm for Open Checkout-Free Groceries
Junde Wu, Yu Zhang, Rao Fu, Yuanpei Liu, Jing Gao
[pdf]
[DOI]

POP: Mining POtential Performance of New Fashion Products via Webly Cross-Modal Query Expansion
Christian Joppi, Geri Skenderi, Marco Cristani
[pdf]
[DOI]

Pose Forecasting in Industrial Human-Robot Collaboration
Alessio Sampieri, Guido Maria D’Amely di Melendugno, Andrea Avogaro, Federico Cunico, Francesco Setti, Geri Skenderi, Marco Cristani, Fabio Galasso
[pdf]
[DOI]

Actor-Centered Representations for Action Localization in Streaming Videos
Sathyanarayanan Aakur, Sudeep Sarkar
[pdf]
[DOI]

Bandwidth-Aware Adaptive Codec for DNN Inference Offloading in IoT
Xiufeng Xie, Ning Zhou, Wentao Zhu, Ji Liu
[pdf]
[DOI]

Domain Knowledge-Informed Self-Supervised Representations for Workout Form Assessment
Paritosh Parmar, Amol Gharat, Helge Rhodin
[pdf]
[DOI]

Responsive Listening Head Generation: A Benchmark Dataset and Baseline
Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, Tao Mei
[pdf]
[DOI]

Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular Depth Estimation by Integrating IMU Motion Dynamics
Sen Zhang, Jing Zhang, Dacheng Tao
[pdf]
[DOI]

TIPS: Text-Induced Pose Synthesis
Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein
[pdf]
[DOI]

Addressing Heterogeneity in Federated Learning via Distributional Transformation
Haolin Yuan, Bo Hui, Yuchen Yang, Philippe Burlina, Neil Zhenqiang Gong, Yinzhi Cao
[pdf]
[DOI]

Where in the World Is This Image? Transformer-Based Geo-Localization in the Wild
Shraman Pramanick, Ewa M. Nowara, Joshua Gleason, Carlos D. Castillo, Rama Chellappa
[pdf]
[DOI]

Colorization for In Situ Marine Plankton Images
Guannan Guo, Qi Lin, Tao Chen, Zhenghui Feng, Zheng Wang, Jianping Li
[pdf]
[DOI]

Efficient Deep Visual and Inertial Odometry with Adaptive Visual Modality Selection
Mingyu Yang, Yu Chen, Hun-Seok Kim
[pdf]
[DOI]

A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch
Patsorn Sangkloy, Wittawat Jitkrittum, Diyi Yang, James Hays
[pdf]
[DOI]

A Cloud 3D Dataset and Application-Specific Learned Image Compression in Cloud 3D
Tianyi Liu, Sen He, Vinodh Kumaran Jayakumar, Wei Wang
[pdf]
[DOI]

AutoTransition: Learning to Recommend Video Transition Effects
Yaojie Shen, Libo Zhang, Kai Xu, Xiaojie Jin
[pdf]
[DOI]

Online Segmentation of LiDAR Sequences: Dataset and Algorithm
Romain Loiseau, Mathieu Aubry, Loïc Landrieu
[pdf]
[DOI]

Open-World Semantic Segmentation for LIDAR Point Clouds
Jun Cen, Peng Yun, Shiwei Zhang, Junhao Cai, Di Luan, Mingqian Tang, Ming Liu, Michael Yu Wang
[pdf]
[DOI]

KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients
Niklas Hanselmann, Katrin Renz, Kashyap Chitta, Apratim Bhattacharyya, Andreas Geiger
[pdf]
[DOI]

Differentiable Raycasting for Self-Supervised Occupancy Forecasting
Tarasha Khurana, Peiyun Hu, Achal Dave, Jason Ziglar, David Held, Deva Ramanan
[pdf]
[DOI]

InAction: Interpretable Action Decision Making for Autonomous Driving
Taotao Jing, Haifeng Xia, Renran Tian, Haoran Ding, Xiao Luo, Joshua Domeyer, Rini Sherony, Zhengming Ding
[pdf]
[DOI]

CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for Robust 3D Object Detection
Jyh-Jing Hwang, Henrik Kretzschmar, Joshua Manela, Sean Rafferty, Nicholas Armstrong-Crews, Tiffany Chen, Dragomir Anguelov
[pdf]
[DOI]

CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving
Kaican Li, Kai Chen, Haoyu Wang, Lanqing Hong, Chaoqiang Ye, Jianhua Han, Yukuai Chen, Wei Zhang, Chunjing Xu, Dit-Yan Yeung, Xiaodan Liang, Zhenguo Li, Hang Xu
[pdf]
[DOI]

Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving
Mahyar Najibi, Jingwei Ji, Yin Zhou, Charles R. Qi, Xinchen Yan, Scott Ettinger, Dragomir Anguelov
[pdf]
[DOI]

StretchBEV: Stretching Future Instance Prediction Spatially and Temporally
Adil Kaan Akan, Fatma Güney
[pdf]
[DOI]

RCLane: Relay Chain Prediction for Lane Detection
Shenghua Xu, Xinyue Cai, Bin Zhao, Li Zhang, Hang Xu, Yanwei Fu, Xiangyang Xue
[pdf]
[DOI]

Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-Modal Distillation
Antonin Vobecky, David Hurych, Oriane Siméoni, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Josef Sivic
[pdf]
[DOI]

CenterFormer: Center-based Transformer for 3D Object Detection
Zixiang Zhou, Xiangchen Zhao, Yu Wang, Panqu Wang, Hassan Foroosh
[pdf]
[DOI]

Physical Attack on Monocular Depth Estimation with Optimal Adversarial Patches
Zhiyuan Cheng, James Liang, Hongjun Choi, Guanhong Tao, Zhiwen Cao, Dongfang Liu, Xiangyu Zhang
[pdf]
[DOI]

ST-P3: End-to-End Vision-Based Autonomous Driving via Spatial-Temporal Feature Learning
Shengchao Hu, Li Chen, Penghao Wu, Hongyang Li, Junchi Yan, Dacheng Tao
[pdf]
[DOI]

PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark
Li Chen, Chonghao Sima, Yang Li, Zehan Zheng, Jiajie Xu, Xiangwei Geng, Hongyang Li, Conghui He, Jianping Shi, Yu Qiao, Junchi Yan
[pdf]
[DOI]

PointFix: Learning to Fix Domain Bias for Robust Online Stereo Adaptation
Kwonyoung Kim, Jungin Park, Jiyoung Lee, Dongbo Min, Kwanghoon Sohn
[pdf]
[DOI]

BRNet: Exploring Comprehensive Features for Monocular Depth Estimation
Wencheng Han, Junbo Yin, Xiaogang Jin, Xiangdong Dai, Jianbing Shen
[pdf]
[DOI]

SiamDoGe: Domain Generalizable Semantic Segmentation Using Siamese Network
Zhenyao Wu, Xinyi Wu, Xiaoping Zhang, Lili Ju, Song Wang
[pdf]
[DOI]

Context-Aware Streaming Perception in Dynamic Environments
Gur-Eyal Sela, Ionel Gog, Justin Wong, Kumar Krishna Agrawal, Xiangxi Mo, Sukrit Kalra, Peter Schafhalter, Eric Leong, Xin Wang, Bharathan Balaji, Joseph Gonzalez, Ion Stoica
[pdf]
[DOI]

SpOT: Spatiotemporal Modeling for 3D Object Tracking
Colton Stearns, Davis Rempe, Jie Li, Rareș Ambruș, Sergey Zakharov, Vitor Guizilini, Yanchao Yang, Leonidas J. Guibas
[pdf]
[DOI]

Multimodal Transformer for Automatic 3D Annotation and Object Detection
Chang Liu, Xiaoyan Qian, Binxiao Huang, Xiaojuan Qi, Edmund Lam, Siew-Chong Tan, Ngai Wong
[pdf]
[DOI]

Dynamic 3D Scene Analysis by Point Cloud Accumulation
Shengyu Huang, Zan Gojcic, Jiahui Huang, Andreas Wieser, Konrad Schindler
[pdf]
[DOI]

Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection
Xin Li, Botian Shi, Yuenan Hou, Xingjiao Wu, Tianlong Ma, Yikang Li, Liang He
[pdf]
[DOI]

JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes
Haimei Zhao, Jing Zhang, Sen Zhang, Dacheng Tao
[pdf]
[DOI]

Semi-Supervised 3D Object Detection with Proficient Teachers
Junbo Yin, Jin Fang, Dingfu Zhou, Liangjun Zhang, Cheng-Zhong Xu, Jianbing Shen, Wenguan Wang
[pdf]
[DOI]

Point Cloud Compression with Sibling Context and Surface Priors
Zhili Chen, Zian Qian, Sukai Wang, Qifeng Chen
[pdf]
[DOI]

Lane Detection Transformer Based on Multi-Frame Horizontal and Vertical Attention and Visual Transformer Module
Han Zhang, Yunchao Gu, Xinliang Wang, Junjun Pan, Minghui Wang
[pdf]
[DOI]

ProposalContrast: Unsupervised Pre-training for LiDAR-Based 3D Object Detection
Junbo Yin, Dingfu Zhou, Liangjun Zhang, Jin Fang, Cheng-Zhong Xu, Jianbing Shen, Wenguan Wang
[pdf]
[DOI]

PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map
Chenfeng Xu, Tian Li, Chen Tang, Lingfeng Sun, Kurt Keutzer, Masayoshi Tomizuka, Alireza Fathi, Wei Zhan
[pdf]
[DOI]

Master of All: Simultaneous Generalization of Urban-Scene Segmentation to All Adverse Weather Conditions
Nikhil Reddy, Abhinav Singhal, Abhishek Kumar, Mahsa Baktashmotlagh, Chetan Arora
[pdf]
[DOI]

LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds
Minghua Liu, Yin Zhou, Charles R. Qi, Boqing Gong, Hao Su, Dragomir Anguelov
[pdf]
[DOI]

Visual Cross-View Metric Localization with Dense Uncertainty Estimates
Zimin Xia, Olaf Booij, Marco Manfredi, Julian F. P. Kooij
[pdf]
[DOI]

V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer
Runsheng Xu, Hao Xiang, Zhengzhong Tu, Xin Xia, Ming-Hsuan Yang, Jiaqi Ma
[pdf]
[DOI]

DevNet: Self-Supervised Monocular Depth Learning via Density Volume Construction
Kaichen Zhou, Lanqing Hong, Changhao Chen, Hang Xu, Chaoqiang Ye, Qingyong Hu, Zhenguo Li
[pdf]
[DOI]

Action-Based Contrastive Learning for Trajectory Prediction
Marah Halawa, Olaf Hellwich, Pia Bideau
[pdf]
[DOI]

Radatron: Accurate Detection Using Multi-Resolution Cascaded MIMO Radar
Sohrab Madani, Jayden Guan, Waleed Ahmed, Saurabh Gupta, Haitham Hassanieh
[pdf]
[DOI]

LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection
Yi Wei, Zibu Wei, Yongming Rao, Jiaxin Li, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

Efficient Point Cloud Segmentation with Geometry-Aware Sparse Networks
Maosheng Ye, Rui Wan, Shuangjie Xu, Tongyi Cao, Qifeng Chen
[pdf]
[DOI]

FH-Net: A Fast Hierarchical Network for Scene Flow Estimation on Real-World Point Clouds
Lihe Ding, Shaocong Dong, Tingfa Xu, Xinli Xu, Jie Wang, Jianan Li
[pdf]
[DOI]

SpatialDETR: Robust Scalable Transformer-Based 3D Object Detection from Multi-View Camera Images with Global Cross-Sensor Attention
Simon Doll, Richard Schulz, Lukas Schneider, Viviane Benzin, Markus Enzweiler, Hendrik P.A. Lensch
[pdf]
[DOI]

Pixel-Wise Energy-Biased Abstention Learning for Anomaly Segmentation on Complex Urban Driving Scenes
Yu Tian, Yuyuan Liu, Guansong Pang, Fengbei Liu, Yuanhong Chen, Gustavo Carneiro
[pdf]
[DOI]

Rethinking Closed-Loop Training for Autonomous Driving
Chris Zhang, Runsheng Guo, Wenyuan Zeng, Yuwen Xiong, Binbin Dai, Rui Hu, Mengye Ren, Raquel Urtasun
[pdf]
[DOI]

SLiDE: Self-Supervised LiDAR De-Snowing through Reconstruction Difficulty
Gwangtak Bae, Byungjun Kim, Seongyong Ahn, Jihong Min, Inwook Shim
[pdf]
[DOI]

Generative Meta-Adversarial Network for Unseen Object Navigation
Sixian Zhang, Weijie Li, Xinhang Song, Yubing Bai, Shuqiang Jiang
[pdf]
[DOI]

Object Manipulation via Visual Target Localization
Kiana Ehsani, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi
[pdf]
[DOI]

MoDA: Map Style Transfer for Self-Supervised Domain Adaptation of Embodied Agents
Eun Sun Lee, Junho Kim, SangWon Park, Young Min Kim
[pdf]
[DOI]

Housekeep: Tidying Virtual Households Using Commonsense Reasoning
Yash Kant, Arun Ramachandran, Sriram Yenamandra, Igor Gilitschenski, Dhruv Batra, Andrew Szot, Harsh Agrawal
[pdf]
[DOI]

Domain Randomization-Enhanced Depth Simulation and Restoration for Perceiving and Grasping Specular and Transparent Objects
Qiyu Dai, Jiyao Zhang, Qiwei Li, Tianhao Wu, Hao Dong, Ziyuan Liu, Ping Tan, He Wang
[pdf]
[DOI]

Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction
Chia-Chi Chuang, Donglin Yang, Chuan Wen, Yang Gao
[pdf]
[DOI]

OPD: Single-View 3D Openable Part Detection
Hanxiao Jiang, Yongsen Mao, Manolis Savva, Angel X. Chang
[pdf]
[DOI]

AirDet: Few-Shot Detection without Fine-Tuning for Autonomous Exploration
Bowen Li, Chen Wang, Pranay Reddy, Seungchan Kim, Sebastian Scherer
[pdf]
[DOI]

TransGrasp: Grasp Pose Estimation of a Category of Objects by Transferring Grasps from Only One Labeled Instance
Hongtao Wen, Jianhang Yan, Wanli Peng, Yi Sun
[pdf]
[DOI]

StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning
Jinghuan Shang, Kumara Kahatapitiya, Xiang Li, Michael S. Ryoo
[pdf]
[DOI]

TIDEE: Tidying Up Novel Rooms Using Visuo-Semantic Commonsense Priors
Gabriel Sarch, Zhaoyuan Fang, Adam W. Harley, Paul Schydlo, Michael J. Tarr, Saurabh Gupta, Katerina Fragkiadaki
[pdf]
[DOI]

Learning Efficient Multi-agent Cooperative Visual Exploration
Chao Yu, Xinyi Yang, Jiaxuan Gao, Huazhong Yang, Yu Wang, Yi Wu
[pdf]
[DOI]

Zero-Shot Category-Level Object Pose Estimation
Walter Goodwin, Sagar Vaze, Ioannis Havoutis, Ingmar Posner
[pdf]
[DOI]

Sim-to-Real 6D Object Pose Estimation via Iterative Self-Training for Robotic Bin Picking
Kai Chen, Rui Cao, Stephen James, Yichuan Li, Yun-Hui Liu, Pieter Abbeel, Qi Dou
[pdf]
[DOI]

Active Audio-Visual Separation of Dynamic Sound Sources
Sagnik Majumder, Kristen Grauman
[pdf]
[DOI]

DexMV: Imitation Learning for Dexterous Manipulation from Human Videos
Yuzhe Qin, Yueh-Hua Wu, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang
[pdf]
[DOI]

Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments
Jacob Krantz, Stefan Lee
[pdf]
[DOI]

Style-Agnostic Reinforcement Learning
Juyong Lee, Seokjun Ahn, Jaesik Park
[pdf]
[DOI]

Self-Supervised Interactive Object Segmentation through a Singulation-and-Grasping Approach
Houjian Yu, Changhyun Choi
[pdf]
[DOI]

Learning from Unlabeled 3D Environments for Vision-and-Language Navigation
Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev
[pdf]
[DOI]

BodySLAM: Joint Camera Localisation, Mapping, and Human Motion Tracking
Dorian F. Henning, Tristan Laidlow, Stefan Leutenegger
[pdf]
[DOI]

FusionVAE: A Deep Hierarchical Variational Autoencoder for RGB Image Fusion
Fabian Duffhauss, Ngo Anh Vien, Hanna Ziesche, Gerhard Neumann
[pdf]
[DOI]

Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning
Chi Zhang, Sirui Xie, Baoxiong Jia, Ying Nian Wu, Song-Chun Zhu, Yixin Zhu
[pdf]
[DOI]

Video Dialog As Conversation about Objects Living in Space-Time
Hoang-Anh Pham, Thao Minh Le, Vuong Le, Tu Minh Phuong, Truyen Tran
[pdf]
[DOI]

Quaternion Equivariant Capsule Networks for 3D Point Clouds
Yongheng Zhao, Tolga Birdal, Jan Eric Lenssen, Emanuele Menegatti, Leonidas Guibas, Federico Tombari
[pdf]
[DOI]

DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares
Yizhak Ben-Shabat, Stephen Gould
[pdf]
[DOI]

NSGANetV2: Evolutionary Multi-Objective Surrogate-Assisted Neural Architecture Search
Zhichao Lu, Kalyanmoy Deb, Erik Goodman, Wolfgang Banzhaf, Vishnu Naresh Boddeti
[pdf]
[DOI]

Describing Textures using Natural Language
Chenyun Wu, Mikayla Timm, Subhransu Maji
[pdf]
[DOI]

Empowering Relational Network by Self-Attention Augmented Conditional Random Fields for Group Activity Recognition
Rizard Renanda Adhi Pramono, Yie Tarng Chen, Wen Hsien Fang
[pdf]
[DOI]

AiR: Attention with Reasoning Capability
Shi Chen, Ming Jiang, Jinhui Yang, Qi Zhao
[pdf]
[DOI]

Self6D: Self-Supervised Monocular 6D Object Pose Estimation
Gu Wang, Fabian Manhardt, Jianzhun Shao, Xiangyang Ji, Nassir Navab, Federico Tombari
[pdf]
[DOI]

Invertible Image Rescaling
Mingqing Xiao, Shuxin Zheng, Chang Liu, Yaolong Wang, Di He, Guolin Ke, Jiang Bian, Zhouchen Lin, Tie-Yan Liu
[pdf]
[DOI]

Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation
Yingda Xia, Yi Zhang, Fengze Liu, Wei Shen, Alan L. Yuille
[pdf]
[DOI]

House-GAN: Relational Generative Adversarial Networks for Graph-constrained House Layout Generation
Nelson Nauata, Kai-Hung Chang, Chin-Yi Cheng, Greg Mori, Yasutaka Furukawa
[pdf]
[DOI]

Crowdsampling the Plenoptic Function
Zhengqi Li, Wenqi Xian, Abe Davis, Noah Snavely
[pdf]
[DOI]

VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment
Hanyue Tu, Chunyu Wang, Wenjun Zeng
[pdf]
[DOI]

End-to-End Object Detection with Transformers
Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko
[pdf]
[DOI]

DeepSFM: Structure From Motion Via Deep Bundle Adjustment
Xingkui Wei, Yinda Zhang, Zhuwen Li, Yanwei Fu, Xiangyang Xue
[pdf]
[DOI]

Ladybird: Quasi-Monte Carlo Sampling for Deep Implicit Field Based 3D Reconstruction with Symmetry
Yifan Xu, Tianqi Fan, Yi Yuan, Gurprit Singh
[pdf]
[DOI]

Segment as Points for Efficient Online Multi-Object Tracking and Segmentation
Zhenbo Xu, Wei Zhang, Xiao Tan, Wei Yang, Huan Huang, Shilei Wen, Errui Ding, Liusheng Huang
[pdf]
[DOI]

Conditional Convolutions for Instance Segmentation
Zhi Tian, Chunhua Shen, Hao Chen
[pdf]
[DOI]

MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution
Taojiannan Yang, Sijie Zhu, Chen Chen, Shen Yan, Mi Zhang, Andrew Willis
[pdf]
[DOI]

Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset
Menglin Jia, Mengyun Shi, Mikhail Sirotenko, Yin Cui, Claire Cardie, Bharath Hariharan, Hartwig Adam, Serge Belongie
[pdf]
[DOI]

Privacy Preserving Structure-from-Motion
Marcel Geppert, Viktor Larsson, Pablo Speciale, Johannes L. Schönberger, Marc Pollefeys
[pdf]
[DOI]

Rewriting a Deep Generative Model
David Bau, Steven Liu, Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba
[pdf]
[DOI]

Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets
Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
[pdf]
[DOI]

Long-term Human Motion Prediction with Scene Context
Zhe Cao, Hang Gao, Karttikeya Mangalam, Qi-Zhi Cai, Minh Vo, Jitendra Malik
[pdf]
[DOI]

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
[pdf]
[DOI]

ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes
Panos Achlioptas, Ahmed Abdelreheem, Fei Xia, Mohamed Elhoseiny, Leonidas Guibas
[pdf]
[DOI]

MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images
Benjamin Attal, Selena Ling, Aaron Gokaslan, Christian Richardt, James Tompkin
[pdf]
[DOI]

Learning and Aggregating Deep Local Descriptors for Instance-level Recognition
Giorgos Tolias, Tomas Jenicek, Ondřej Chum
[pdf]
[DOI]

A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem
George Terzakis, Manolis Lourakis
[pdf]
[DOI]

Learn to Recover Visible Color for Video Surveillance in a Day
Guangming Wu, Yinqiang Zheng, Zhiling Guo, Zekun Cai, Xiaodan Shi, Xin Ding, Yifei Huang, Yimin Guo, Ryosuke Shibasaki
[pdf]
[DOI]

Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single Images
Heming Zhu, Yu Cao, Hang Jin, Weikai Chen, Dong Du, Zhangye Wang, Shuguang Cui, Xiaoguang Han
[pdf]
[DOI]

Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation
Zhenda Xie, Zheng Zhang, Xizhou Zhu, Gao Huang, Stephen Lin
[pdf]
[DOI]

BorderDet: Border Feature for Dense Object Detection
Han Qiu, Yuchen Ma, Zeming Li, Songtao Liu, Jian Sun
[pdf]
[DOI]

Regularization with Latent Space Virtual Adversarial Training
Genki Osada, Budrul Ahsan, Revoti Prasad Bora, Takashi Nishide
[pdf]
[DOI]

Du²Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels
Yinda Zhang, Neal Wadhwa, Sergio Orts-Escolano, Christian Häne, Sean Fanello, Rahul Garg
[pdf]
[DOI]

Model-Agnostic Boundary-Adversarial Sampling for Test-Time Generalization in Few-Shot Learning
Jaekyeom Kim, Hyoungseok Kim, Gunhee Kim
[pdf]
[DOI]

Targeted Attack for Deep Hashing based Retrieval
Jiawang Bai, Bin Chen, Yiming Li, Dongxian Wu, Weiwei Guo, Shu-Tao Xia, En-Hui Yang
[pdf]
[DOI]

Gradient Centralization: A New Optimization Technique for Deep Neural Networks
Hongwei Yong, Jianqiang Huang, Xiansheng Hua, Lei Zhang
[pdf]
[DOI]

Content-Aware Unsupervised Deep Homography Estimation
Jirong Zhang, Chuan Wang, Shuaicheng Liu, Lanpeng Jia, Nianjin Ye, Jue Wang, Ji Zhou, Jian Sun
[pdf]
[DOI]

Multi-View Optimization of Local Feature Geometry
Mihai Dusmanu, Johannes L. Schönberger, Marc Pollefeys
[pdf]
[DOI]

The Phong Surface: Efficient 3D Model Fitting using Lifted Optimization
Jingjing Shen, Thomas J. Cashman, Qi Ye, Tim Hutton, Toby Sharp, Federica Bogo, Andrew Fitzgibbon, Jamie Shotton
[pdf]
[DOI]

Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video
Miao Liu, Siyu Tang, Yin Li, James M. Rehg
[pdf]
[DOI]

Learning Stereo from Single Images
Jamie Watson, Oisin Mac Aodha, Daniyar Turmukhambetov, Gabriel J. Brostow, Michael Firman
[pdf]
[DOI]

Prototype Rectification for Few-Shot Learning
Jinlu Liu, Liang Song, Yongqiang Qin
[pdf]
[DOI]

Learning Feature Descriptors using Camera Pose Supervision
Qianqian Wang, Xiaowei Zhou, Bharath Hariharan, Noah Snavely
[pdf]
[DOI]

Semantic Flow for Fast and Accurate Scene Parsing
Xiangtai Li, Ansheng You, Zhen Zhu, Houlong Zhao, Maoke Yang, Kuiyuan Yang, Shaohua Tan, Yunhai Tong
[pdf]
[DOI]

Appearance Consensus Driven Self-Supervised Human Mesh Recovery
Jogendra Nath Kundu, Mugalodi Rakesh, Varun Jampani, Rahul Mysore Venkatesh, R. Venkatesh Babu
[pdf]
[DOI]

Diffraction Line Imaging
Mark Sheinin, Dinesh N. Reddy, Matthew O’Toole, Srinivasa G. Narasimhan
[pdf]
[DOI]

Aligning and Projecting Images to Class-conditional Generative Networks
Minyoung Huh, Richard Zhang, Jun-Yan Zhu, Sylvain Paris, Aaron Hertzmann
[pdf]
[DOI]

Suppress and Balance: A Simple Gated Network for Salient Object Detection
Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Lei Zhang
[pdf]
[DOI]

Visual Memorability for Robotic Interestingness via Unsupervised Online Learning
Chen Wang, Wenshan Wang, Yuheng Qiu, Yafei Hu, Sebastian Scherer
[pdf]
[DOI]

Post-Training Piecewise Linear Quantization for Deep Neural Networks
Jun Fang, Ali Shafiee, Hamzah Abdel-Aziz, David Thorsley, Georgios Georgiadis, Joseph H. Hassoun
[pdf]
[DOI]

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification
Yang Zou, Xiaodong Yang, Zhiding Yu, B.V.K. Vijaya Kumar, Jan Kautz
[pdf]
[DOI]

In-Home Daily-Life Captioning Using Radio Signals
Lijie Fan, Tianhong Li, Yuan Yuan, Dina Katabi
[pdf]
[DOI]

Self-Challenging Improves Cross-Domain Generalization
Zeyi Huang, Haohan Wang, Eric P. Xing, Dong Huang
[pdf]
[DOI]

A Competence-aware Curriculum for Visual Concepts Learning via Question Answering
Qing Li, Siyuan Huang, Yining Hong, Song-Chun Zhu
[pdf]
[DOI]

Multitask Learning Strengthens Adversarial Robustness
Chengzhi Mao, Amogh Gupta, Vikram Nitin, Baishakhi Ray, Shuran Song, Junfeng Yang, Carl Vondrick
[pdf]
[DOI]

S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search
Zhihang Yuan, Bingzhe Wu, Guangyu Sun, Zheng Liang, Shiwan Zhao, Weichen Bi
[pdf]
[DOI]

Improving Deep Video Compression by Resolution-adaptive Flow Coding
Zhihao Hu, Zhenghao Chen, Dong Xu, Guo Lu, Wanli Ouyang, Shuhang Gu
[pdf]
[DOI]

Motion Capture from Internet Videos
Junting Dong, Qing Shuai, Yuanqing Zhang, Xian Liu, Xiaowei Zhou, Hujun Bao
[pdf]
[DOI]

Appearance-Preserving 3D Convolution for Video-based Person Re-identification
Xinqian Gu, Hong Chang, Bingpeng Ma, Hongkai Zhang, Xilin Chen
[pdf]
[DOI]

Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization
Dylan Campbell, Liu Liu, Stephen Gould
[pdf]
[DOI]

Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation
Xingang Pan, Xiaohang Zhan, Bo Dai, Dahua Lin, Chen Change Loy, Ping Luo
[pdf]
[DOI]

Deep Spatial-angular Regularization for Compressive Light Field Reconstruction over Coded Apertures
Mantang Guo, Junhui Hou, Jing Jin, Jie Chen, Lap-Pui Chau
[pdf]
[DOI]

Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling
Xuesong Niu, Zitong Yu, Hu Han, Xiaobai Li, Shiguang Shan, Guoying Zhao
[pdf]
[DOI]

Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction
Bharat Lal Bhatnagar, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll
[pdf]
[DOI]

Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network
Tsai-Shien Chen, Chih-Ting Liu, Chih-Wei Wu, Shao-Yi Chien
[pdf]
[DOI]

Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation
Guolei Sun, Wenguan Wang, Jifeng Dai, Luc Van Gool
[pdf]
[DOI]

CoReNet: Coherent 3D Scene Reconstruction from a Single RGB Image
Stefan Popov, Pablo Bauszat, Vittorio Ferrari
[pdf]
[DOI]

Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs
Lei Huang, Jie Qin, Li Liu, Fan Zhu, Ling Shao
[pdf]
[DOI]

RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
Zachary Teed, Jia Deng
[pdf]
[DOI]

Domain-invariant Stereo Matching Networks
Feihu Zhang, Xiaojuan Qi, Ruigang Yang, Victor Prisacariu, Benjamin Wah, Philip Torr
[pdf]
[DOI]

DeepHandMesh: A Weakly-supervised Deep Encoder-Decoder Framework for High-fidelity Hand Mesh Modeling
Gyeongsik Moon, Takaaki Shiratori, Kyoung Mu Lee
[pdf]
[DOI]

Content Adaptive and Error Propagation Aware Deep Video Compression
Guo Lu, Chunlei Cai, Xiaoyun Zhang, Li Chen, Wanli Ouyang, Dong Xu, Zhiyong Gao
[pdf]
[DOI]

Towards Streaming Perception
Mengtian Li, Yu-Xiong Wang, Deva Ramanan
[pdf]
[DOI]

Towards Automated Testing and Robustification by Semantic Adversarial Data Generation
Rakshith Shetty, Mario Fritz, Bernt Schiele
[pdf]
[DOI]

Adversarial Generative Grammars for Human Activity Prediction
AJ Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S. Ryoo
[pdf]
[DOI]

GDumb: A Simple Approach that Questions Our Progress in Continual Learning
Ameya Prabhu, Philip H. S. Torr, Puneet K. Dokania
[pdf]
[DOI]

Learning Lane Graph Representations for Motion Forecasting
Ming Liang, Bin Yang, Rui Hu, Yun Chen, Renjie Liao, Song Feng, Raquel Urtasun
[pdf]
[DOI]

What Matters in Unsupervised Optical Flow
Rico Jonschkowski, Austin Stone, Jonathan T. Barron, Ariel Gordon, Kurt Konolige, Anelia Angelova
[pdf]
[DOI]

Synthesis and Completion of Facades from Satellite Imagery
Xiaowei Zhang, Christopher May, Daniel Aliaga
[pdf]
[DOI]

Mapillary Planet-Scale Depth Dataset
Manuel López Antequera, Pau Gargallo, Markus Hofinger, Samuel Rota Bulò, Yubin Kuang, Peter Kontschieder
[pdf]
[DOI]

V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction
Tsun-Hsuan Wang, Sivabalan Manivasagam, Ming Liang, Bin Yang, Wenyuan Zeng, Raquel Urtasun
[pdf]
[DOI]

Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters
Haoyu Liang, Zhihao Ouyang, Yuyuan Zeng, Hang Su, Zihao He, Shu-Tao Xia, Jun Zhu, Bo Zhang
[pdf]
[DOI]

EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning
Bailin Li, Bowen Wu, Jiang Su, Guangrun Wang
[pdf]
[DOI]

Intrinsic Point Cloud Interpolation via Dual Latent Space Navigation
Marie-Julie Rakotosaona, Maks Ovsjanikov
[pdf]
[DOI]

Cross-Domain Cascaded Deep Translation
Oren Katzir, Dani Lischinski, Daniel Cohen-Or
[pdf]
[DOI]

“Look Ma, no landmarks!” – Unsupervised, Model-based Dense Face Alignment
Tatsuro Koizumi, William A. P. Smith
[pdf]
[DOI]

Online Invariance Selection for Local Feature Descriptors
Rémi Pautrat, Viktor Larsson, Martin R. Oswald, Marc Pollefeys
[pdf]
[DOI]

Rethinking Image Inpainting via a Mutual Encoder-Decoder with Feature Equalizations
Hongyu Liu, Bin Jiang, Yibing Song, Wei Huang, Chao Yang
[pdf]
[DOI]

TextCaps: a Dataset for Image Captioning with Reading Comprehension
Oleksii Sidorov, Ronghang Hu, Marcus Rohrbach, Amanpreet Singh
[pdf]
[DOI]

It is not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction
Karttikeya Mangalam, Harshayu Girase, Shreyas Agarwal, Kuan-Hui Lee, Ehsan Adeli, Jitendra Malik, Adrien Gaidon
[pdf]
[DOI]

Learning What to Learn for Video Object Segmentation
Goutam Bhat, Felix Järemo Lawin, Martin Danelljan, Andreas Robinson, Michael Felsberg, Luc Van Gool, Radu Timofte
[pdf]
[DOI]

SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing
Garvita Tiwari, Bharat Lal Bhatnagar, Tony Tung, Gerard Pons-Moll
[pdf]
[DOI]

LIMP: Learning Latent Shape Representations with Metric Preservation Priors
Luca Cosmo, Antonio Norelli, Oshri Halimi, Ron Kimmel, Emanuele Rodolà
[pdf]
[DOI]

Unsupervised Sketch to Photo Synthesis
Runtao Liu, Qian Yu, Stella X. Yu
[pdf]
[DOI]

A Simple Way to Make Neural Networks Robust Against Diverse Image Corruptions
Evgenia Rusak, Lukas Schott, Roland S. Zimmermann, Julian Bitterwolf, Oliver Bringmann, Matthias Bethge, Wieland Brendel
[pdf]
[DOI]

SoftPoolNet: Shape Descriptor for Point Cloud Completion and Classification
Yida Wang, David Joseph Tan, Nassir Navab, Federico Tombari
[pdf]
[DOI]

Hierarchical Face Aging through Disentangled Latent Characteristics
Peipei Li, Huaibo Huang, Yibo Hu, Xiang Wu, Ran He, Zhenan Sun
[pdf]
[DOI]

Hybrid Models for Open Set Recognition
Hongjie Zhang, Ang Li, Jie Guo, Yanwen Guo
[pdf]
[DOI]

TopoGAN: A Topology-Aware Generative Adversarial Network
Fan Wang, Huidong Liu, Dimitris Samaras, Chao Chen
[pdf]
[DOI]

Learning to Localize Actions from Moments
Fuchen Long, Ting Yao, Zhaofan Qiu, Xinmei Tian, Jiebo Luo, Tao Mei
[pdf]
[DOI]

ForkGAN: Seeing into the Rainy Night
Ziqiang Zheng, Yang Wu, Xinran Han, Jianbo Shi
[pdf]
[DOI]

TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning
Xinwei Sun, Yilun Xu, Peng Cao, Yuqing Kong, Lingjing Hu, Shanghang Zhang, Yizhou Wang
[pdf]
[DOI]

ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval
Quan Cui, Qing-Yuan Jiang, Xiu-Shen Wei, Wu-Jun Li, Osamu Yoshie
[pdf]
[DOI]

TSIT: A Simple and Versatile Framework for Image-to-Image Translation
Liming Jiang, Changxu Zhang, Mingyang Huang, Chunxiao Liu, Jianping Shi, Chen Change Loy
[pdf]
[DOI]

ProxyBNN: Learning Binarized Neural Networks via Proxy Matrices
Xiangyu He, Zitao Mo, Ke Cheng, Weixiang Xu, Qinghao Hu, Peisong Wang, Qingshan Liu, Jian Cheng
[pdf]
[DOI]

HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation
Can Wang, Jiefeng Li, Wentao Liu, Chen Qian, Cewu Lu
[pdf]
[DOI]

Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve
Weicheng Kuo, Anelia Angelova, Tsung-Yi Lin, Angela Dai
[pdf]
[DOI]

A Unified Framework of Surrogate Loss by Refactoring and Interpolation
Lanlan Liu, Mingzhe Wang, Jia Deng
[pdf]
[DOI]

Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images
Sai Bi, Zexiang Xu, Kalyan Sunkavalli, Miloš Hašan, Yannick Hold-Geoffroy, David Kriegman, Ravi Ramamoorthi
[pdf]
[DOI]

Memory-augmented Dense Predictive Coding for Video Representation Learning
Tengda Han, Weidi Xie, Andrew Zisserman
[pdf]
[DOI]

PointMixup: Augmentation for Point Clouds
Yunlu Chen, Vincent Tao Hu, Efstratios Gavves, Thomas Mensink, Pascal Mettes, Pengwan Yang, Cees G. M. Snoek
[pdf]
[DOI]

Identity-Guided Human Semantic Parsing for Person Re-Identification
Kuan Zhu, Haiyun Guo, Zhiwei Liu, Ming Tang, Jinqiao Wang
[pdf]
[DOI]

Learning Gradient Fields for Shape Generation
Ruojin Cai, Guandao Yang, Hadar Averbuch-Elor, Zekun Hao, Serge Belongie, Noah Snavely, Bharath Hariharan
[pdf]
[DOI]

COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder
Kuniaki Saito, Kate Saenko, Ming-Yu Liu
[pdf]
[DOI]

Corner Proposal Network for Anchor-free, Two-stage Object Detection
Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
[pdf]
[DOI]

PhraseClick: Toward Achieving Flexible Interactive Segmentation by Phrase and Click
Henghui Ding, Scott Cohen, Brian Price, Xudong Jiang
[pdf]
[DOI]

Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing
Yapeng Tian, Dingzeyu Li, Chenliang Xu
[pdf]
[DOI]

Learning Delicate Local Representations for Multi-Person Pose Estimation
Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun
[pdf]
[DOI]

Learning to Plan with Uncertain Topological Maps
Edward Beeching, Jilles Dibangoye, Olivier Simonin, Christian Wolf
[pdf]
[DOI]

Neural Design Network: Graphic Layout Generation with Constraints
Hsin-Ying Lee, Lu Jiang, Irfan Essa, Phuong B Le, Haifeng Gong, Ming-Hsuan Yang, Weilong Yang
[pdf]
[DOI]

Learning Open Set Network with Discriminative Reciprocal Points
Guangyao Chen, Limeng Qiao, Yemin Shi, Peixi Peng, Jia Li, Tiejun Huang, Shiliang Pu, Yonghong Tian
[pdf]
[DOI]

Convolutional Occupancy Networks
Songyou Peng, Michael Niemeyer, Lars Mescheder, Marc Pollefeys, Andreas Geiger
[pdf]
[DOI]

Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry
He Chen, Pengfei Guo, Pengfei Li, Gim Hee Lee, Gregory Chirikjian
[pdf]
[DOI]

TIDE: A General Toolbox for Identifying Object Detection Errors
Daniel Bolya, Sean Foley, James Hays, Judy Hoffman
[pdf]
[DOI]

PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
Saining Xie, Jiatao Gu, Demi Guo, Charles R. Qi, Leonidas Guibas, Or Litany
[pdf]
[DOI]

DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation
Xuefei Ning, Tianchen Zhao, Wenshuo Li, Peng Lei, Yu Wang, Huazhong Yang
[pdf]
[DOI]

Circumventing Outliers of AutoAugment with Knowledge Distillation
Longhui Wei, An Xiao, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Qi Tian
[pdf]
[DOI]

S2DNet: Learning Image Features for Accurate Sparse-to-Dense Matching
Hugo Germain, Guillaume Bourmaud, Vincent Lepetit
[pdf]
[DOI]

RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving
Peixuan Li, Huaici Zhao, Pengfei Liu, Feidao Cao
[pdf]
[DOI]

Video Object Segmentation with Episodic Graph Memory Networks
Xiankai Lu, Wenguan Wang, Martin Danelljan, Tianfei Zhou, Jianbing Shen, Luc Van Gool
[pdf]
[DOI]

Rethinking Bottleneck Structure for Efficient Mobile Network Design
Daquan Zhou, Qibin Hou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan
[pdf]
[DOI]

Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks
Jeffrey O. Zhang, Alexander Sax, Amir Zamir, Leonidas Guibas, Jitendra Malik
[pdf]
[DOI]

Towards Part-aware Monocular 3D Human Pose Estimation: An Architecture Search Approach
Zerui Chen, Yan Huang, Hongyuan Yu, Bin Xue, Ke Han, Yiru Guo, Liang Wang
[pdf]
[DOI]

REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets
Angelina Wang, Arvind Narayanan, Olga Russakovsky
[pdf]
[DOI]

Contrastive Learning for Weakly Supervised Phrase Grounding
Tanmay Gupta, Arash Vahdat, Gal Chechik, Xiaodong Yang, Jan Kautz, Derek Hoiem
[pdf]
[DOI]

Collaborative Learning of Gesture Recognition and 3D Hand Pose Estimation with Multi-Order Feature Analysis
Siyuan Yang, Jun Liu, Shijian Lu, Meng Hwa Er, Alex C. Kot
[pdf]
[DOI]

Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors
Zuxuan Wu, Ser-Nam Lim, Larry S. Davis, Tom Goldstein
[pdf]
[DOI]

TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images
Jianxin Lin, Yingxue Pang, Yingce Xia, Zhibo Chen, Jiebo Luo
[pdf]
[DOI]

Semi-Siamese Training for Shallow Face Learning
Hang Du, Hailin Shi, Yuchi Liu, Jun Wang, Zhen Lei, Dan Zeng, Tao Mei
[pdf]
[DOI]

GAN Slimming: All-in-One GAN Compression by A Unified Optimization Framework
Haotao Wang, Shupeng Gui, Haichuan Yang, Ji Liu, Zhangyang Wang
[pdf]
[DOI]

Human Interaction Learning on 3D Skeleton Point Clouds for Video Violence Recognition
Yukun Su, Guosheng Lin, Jinhui Zhu, Qingyao Wu
[pdf]
[DOI]

Binarized Neural Network for Single Image Super Resolution
Jingwei Xin, Nannan Wang, Xinrui Jiang, Jie Li, Heng Huang, Xinbo Gao
[pdf]
[DOI]

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
[pdf]
[DOI]

Adaptive Computationally Efficient Network for Monocular 3D Hand Pose Estimation
Zhipeng Fan, Jun Liu, Yao Wang
[pdf]
[DOI]

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking
Jinlong Peng, Changan Wang, Fangbin Wan, Yang Wu, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yanwei Fu
[pdf]
[DOI]

Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets
Tong Wu, Qingqiu Huang, Ziwei Liu, Yu Wang, Dahua Lin
[pdf]
[DOI]

Hamiltonian Dynamics for Real-World Shape Interpolation
Marvin Eisenberger, Daniel Cremers
[pdf]
[DOI]

Learning to Scale Multilingual Representations for Vision-Language Tasks
Andrea Burns, Donghyun Kim, Derry Wijaya, Kate Saenko, Bryan A. Plummer
[pdf]
[DOI]

Multi-modal Transformer for Video Retrieval
Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid
[pdf]
[DOI]

Feature Representation Matters: End-to-End Learning for Reference-based Image Super-resolution
Yanchun Xie, Jimin Xiao, Mingjie Sun, Chao Yao, Kaizhu Huang
[pdf]
[DOI]

RobustFusion: Human Volumetric Capture with Data-driven Visual Cues using a RGBD Camera
Zhuo Su, Lan Xu, Zerong Zheng, Tao Yu, Yebin Liu, Lu Fang
[pdf]
[DOI]

Surface Normal Estimation of Tilted Images via Spatial Rectifier
Tien Do, Khiem Vuong, Stergios I. Roumeliotis, Hyun Soo Park
[pdf]
[DOI]

Multimodal Shape Completion via Conditional Generative Adversarial Networks
Rundi Wu, Xuelin Chen, Yixin Zhuang, Baoquan Chen
[pdf]
[DOI]

Generative Sparse Detection Networks for 3D Single-shot Object Detection
JunYoung Gwak, Christopher Choy, Silvio Savarese
[pdf]
[DOI]

Grounded Situation Recognition
Sarah Pratt, Mark Yatskar, Luca Weihs, Ali Farhadi, Aniruddha Kembhavi
[pdf]
[DOI]

Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
Shaoxiang Chen, Wenhao Jiang, Wei Liu, Yu-Gang Jiang
[pdf]
[DOI]

Unpaired Learning of Deep Image Denoising
Xiaohe Wu, Ming Liu, Yue Cao, Dongwei Ren, Wangmeng Zuo
[pdf]
[DOI]

Self-supervising Fine-grained Region Similarities for Large-scale Image Localization
Yixiao Ge, Haibo Wang, Feng Zhu, Rui Zhao, Hongsheng Li
[pdf]
[DOI]

Rotationally-Temporally Consistent Novel View Synthesis of Human Performance Video
Youngjoong Kwon, Stefano Petrangeli, Dahun Kim, Haoliang Wang, Eunbyung Park, Viswanathan Swaminathan, Henry Fuchs
[pdf]
[DOI]

Side-Aware Boundary Localization for More Precise Object Detection
Jiaqi Wang, Wenwei Zhang, Yuhang Cao, Kai Chen, Jiangmiao Pang, Tao Gong, Jianping Shi, Chen Change Loy, Dahua Lin
[pdf]
[DOI]

SF-Net: Single-Frame Supervision for Temporal Action Localization
Fan Ma, Linchao Zhu, Yi Yang, Shengxin Zha, Gourab Kundu, Matt Feiszli, Zheng Shou
[pdf]
[DOI]

Negative Margin Matters: Understanding Margin in Few-shot Classification
Bin Liu, Yue Cao, Yutong Lin, Qi Li, Zheng Zhang, Mingsheng Long, Han Hu
[pdf]
[DOI]

Particularity beyond Commonality: Unpaired Identity Transfer with Multiple References
Ruizheng Wu, Xin Tao, Yingcong Chen, Xiaoyong Shen, Jiaya Jia
[pdf]
[DOI]

Tracking Objects as Points
Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl
[pdf]
[DOI]

CPGAN: Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis
Jiadong Liang, Wenjie Pei, Feng Lu
[pdf]
[DOI]

Transporting Labels via Hierarchical Optimal Transport for Semi-Supervised Learning
Fariborz Taherkhani, Ali Dabouei, Sobhan Soleymani, Jeremy Dawson, Nasser M. Nasrabadi
[pdf]
[DOI]

MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning
Simon Vandenhende, Stamatios Georgoulis, Luc Van Gool
[pdf]
[DOI]

Learning to Factorize and Relight a City
Andrew Liu, Shiry Ginosar, Tinghui Zhou, Alexei A. Efros, Noah Snavely
[pdf]
[DOI]

Region Graph Embedding Network for Zero-Shot Learning
Guo-Sen Xie, Li Liu, Fan Zhu, Fang Zhao, Zheng Zhang, Yazhou Yao, Jie Qin, Ling Shao
[pdf]
[DOI]

GRAB: A Dataset of Whole-Body Human Grasping of Objects
Omid Taheri, Nima Ghorbani, Michael J. Black, Dimitrios Tzionas
[pdf]
[DOI]

DEMEA: Deep Mesh Autoencoders for Non-Rigidly Deforming Objects
Edgar Tretschk, Ayush Tewari, Michael Zollhöfer, Vladislav Golyanik, Christian Theobalt
[pdf]
[DOI]

RANSAC-Flow: Generic Two-stage Image Alignment
Xi Shen, François Darmon, Alexei A. Efros, Mathieu Aubry
[pdf]
[DOI]

Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds
Arun Balajee Vasudevan, Dengxin Dai, Luc Van Gool
[pdf]
[DOI]

Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images
Kiru Park, Timothy Patten, Markus Vincze
[pdf]
[DOI]

Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking
Jianfeng Yan, Zizhuang Wei, Hongwei Yi, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai
[pdf]
[DOI]

Pixel-Pair Occlusion Relationship Map (P2ORM): Formulation, Inference & Application
Xuchong Qiu, Yang Xiao, Chaohui Wang, Renaud Marlet
[pdf]
[DOI]

MovieNet: A Holistic Dataset for Movie Understanding
Qingqiu Huang, Yu Xiong, Anyi Rao, Jiaze Wang, Dahua Lin
[pdf]
[DOI]

Short-Term and Long-Term Context Aggregation Network for Video Inpainting
Ang Li, Shanshan Zhao, Xingjun Ma, Mingming Gong, Jianzhong Qi, Rui Zhang, Dacheng Tao, Ramamohanarao Kotagiri
[pdf]
[DOI]

DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization
Juan Du, Rui Wang, Daniel Cremers
[pdf]
[DOI]

Face Super-Resolution Guided by 3D Facial Priors
Xiaobin Hu, Wenqi Ren, John LaMaster, Xiaochun Cao, Xiaoming Li, Zechao Li, Bjoern Menze, Wei Liu
[pdf]
[DOI]

Label Propagation with Augmented Anchors: A Simple Semi-Supervised Learning baseline for Unsupervised Domain Adaptation
Yabin Zhang, Bin Deng, Kui Jia, Lei Zhang
[pdf]
[DOI]

Are Labels Necessary for Neural Architecture Search?
Chenxi Liu, Piotr Dollár, Kaiming He, Ross Girshick, Alan Yuille, Saining Xie
[pdf]
[DOI]

BLSM: A Bone-Level Skinned Model of the Human Mesh
Haoyang Wang, Riza Alp Güler, Iasonas Kokkinos, George Papandreou, Stefanos Zafeiriou
[pdf]
[DOI]

Associative Alignment for Few-shot Image Classification
Arman Afrasiyabi, Jean-François Lalonde, Christian Gagné
[pdf]
[DOI]

Cyclic Functional Mapping: Self-supervised Correspondence between Non-isometric Deformable Shapes
Dvir Ginzburg, Dan Raviv
[pdf]
[DOI]

View-Invariant Probabilistic Embedding for Human Pose
Jennifer J. Sun, Jiaping Zhao, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Ting Liu
[pdf]
[DOI]

Contact and Human Dynamics from Monocular Video
Davis Rempe, Leonidas J. Guibas, Aaron Hertzmann, Bryan Russell, Ruben Villegas, Jimei Yang
[pdf]
[DOI]

PointPWC-Net: Cost Volume on Point Clouds for (Self-)Supervised Scene Flow Estimation
Wenxuan Wu, Zhi Yuan Wang, Zhuwen Li, Wei Liu, Li Fuxin
[pdf]
[DOI]

Points2Surf: Learning Implicit Surfaces from Point Clouds
Philipp Erler, Paul Guerrero, Stefan Ohrhallinger, Niloy J. Mitra, Michael Wimmer
[pdf]
[DOI]

Few-Shot Scene-Adaptive Anomaly Detection
Yiwei Lu, Frank Yu, Mahesh Kumar Krishna Reddy, Yang Wang
[pdf]
[DOI]

Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting
Bindita Chaudhuri, Noranart Vesdapunt, Linda Shapiro, Baoyuan Wang
[pdf]
[DOI]

Entropy Minimisation Framework for Event-based Vision Model Estimation
Urbano Miguel Nunes, Yiannis Demiris
[pdf]
[DOI]

Reconstructing NBA Players
Luyang Zhu, Konstantinos Rematas, Brian Curless, Steven M. Seitz, Ira Kemelmacher-Shlizerman
[pdf]
[DOI]

PIoU Loss: Towards Accurate Oriented Object Detection in Complex Environments
Zhiming Chen, Kean Chen, Weiyao Lin, John See, Hui Yu, Yan Ke, Cong Yang
[pdf]
[DOI]

TENet: Triple Excitation Network for Video Salient Object Detection
Sucheng Ren, Chu Han, Xin Yang, Guoqiang Han, Shengfeng He
[pdf]
[DOI]

Deep Feedback Inverse Problem Solver
Wei-Chiu Ma, Shenlong Wang, Jiayuan Gu, Sivabalan Manivasagam, Antonio Torralba, Raquel Urtasun
[pdf]
[DOI]

Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification
Liuyu Xiang, Guiguang Ding, Jungong Han
[pdf]
[DOI]

Hallucinating Visual Instances in Total Absentia
Jiayan Qiu, Yiding Yang, Xinchao Wang, Dacheng Tao
[pdf]
[DOI]

Weakly-supervised 3D Shape Completion in the Wild
Jiayuan Gu, Wei-Chiu Ma, Sivabalan Manivasagam, Wenyuan Zeng, Zihao Wang, Yuwen Xiong, Hao Su, Raquel Urtasun
[pdf]
[DOI]

DTVNet: Dynamic Time-lapse Video Generation via Single Still Image
Jiangning Zhang, Chao Xu, Liang Liu, Mengmeng Wang, Xia Wu, Yong Liu, Yunliang Jiang
[pdf]
[DOI]

CLIFFNet for Monocular Depth Estimation with Hierarchical Embedding Loss
Lijun Wang, Jianming Zhang, Yifan Wang, Huchuan Lu, Xiang Ruan
[pdf]
[DOI]

Collaborative Video Object Segmentation by Foreground-Background Integration
Zongxin Yang, Yunchao Wei, Yi Yang
[pdf]
[DOI]

Adaptive Margin Diversity Regularizer for handling Data Imbalance in Zero-Shot SBIR
Titir Dutta, Anurag Singh, Soma Biswas
[pdf]
[DOI]

ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation
Xucong Zhang, Seonwook Park, Thabo Beeler, Derek Bradley, Siyu Tang, Otmar Hilliges
[pdf]
[DOI]

Calibration-free Structure-from-Motion with Calibrated Radial Trifocal Tensors
Viktor Larsson, Nicolas Zobernig, Kasim Taskin, Marc Pollefeys
[pdf]
[DOI]

Occupancy Anticipation for Efficient Exploration and Navigation
Santhosh K. Ramakrishnan, Ziad Al-Halah, Kristen Grauman
[pdf]
[DOI]

Unified Image and Video Saliency Modeling
Richard Droste, Jianbo Jiao, J. Alison Noble
[pdf]
[DOI]

TAO: A Large-Scale Benchmark for Tracking Any Object
Achal Dave, Tarasha Khurana, Pavel Tokmakov, Cordelia Schmid, Deva Ramanan
[pdf]
[DOI]

A Generalization of Otsu’s Method and Minimum Error Thresholding
Jonathan T. Barron
[pdf]
[DOI]

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks
Unnat Jain, Luca Weihs, Eric Kolve, Ali Farhadi, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing
[pdf]
[DOI]

Big Transfer (BiT): General Visual Representation Learning
Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby
[pdf]
[DOI]

VisualCOMET: Reasoning about the Dynamic Context of a Still Image
Jae Sung Park, Chandra Bhagavatula, Roozbeh Mottaghi, Ali Farhadi, Yejin Choi
[pdf]
[DOI]

Few-shot Action Recognition with Permutation-invariant Attention
Hongguang Zhang, Li Zhang, Xiaojuan Qi, Hongdong Li, Philip H. S. Torr, Piotr Koniusz
[pdf]
[DOI]

Character Grounding and Re-Identification in Story of Videos and Text Descriptions
Youngjae Yu, Jongseok Kim, Heeseung Yun, Jiwan Chung, Gunhee Kim
[pdf]
[DOI]

AABO: Adaptive Anchor Box Optimization for Object Detection via Bayesian Sub-sampling
Wenshuo Ma, Tingzhong Tian, Hang Xu, Yimin Huang, Zhenguo Li
[pdf]
[DOI]

Learning Visual Context by Comparison
Minchul Kim, Jongchan Park, Seil Na, Chang Min Park, Donggeun Yoo
[pdf]
[DOI]

Large Scale Holistic Video Understanding
Ali Diba, Mohsen Fayyaz, Vivek Sharma, Manohar Paluri, Jürgen Gall, Rainer Stiefelhagen, Luc Van Gool
[pdf]
[DOI]

Indirect Local Attacks for Context-aware Semantic Segmentation Networks
Krishna Kanth Nakka, Mathieu Salzmann
[pdf]
[DOI]

Predicting Visual Overlap of Images Through Interpretable Non-Metric Box Embeddings
Anita Rau, Guillermo Garcia-Hernando, Danail Stoyanov, Gabriel J. Brostow, Daniyar Turmukhambetov
[pdf]
[DOI]

Connecting Vision and Language with Localized Narratives
Jordi Pont-Tuset, Jasper Uijlings, Soravit Changpinyo, Radu Soricut, Vittorio Ferrari
[pdf]
[DOI]

Adversarial T-shirt! Evading Person Detectors in A Physical World
Kaidi Xu, Gaoyuan Zhang, Sijia Liu, Quanfu Fan, Mengshu Sun, Hongge Chen, Pin-Yu Chen, Yanzhi Wang, Xue Lin
[pdf]
[DOI]

Bounding-box Channels for Visual Relationship Detection
Sho Inayoshi, Keita Otani, Antonio Tejero-de-Pablos, Tatsuya Harada
[pdf]
[DOI]

Minimal Rolling Shutter Absolute Pose with Unknown Focal Length and Radial Distortion
Zuzana Kukelova, Cenek Albl, Akihiro Sugimoto, Konrad Schindler, Tomas Pajdla
[pdf]
[DOI]

SRFlow: Learning the Super-Resolution Space with Normalizing Flow
Andreas Lugmayr, Martin Danelljan, Luc Van Gool, Radu Timofte
[pdf]
[DOI]

DeepGMR: Learning Latent Gaussian Mixture Models for Registration
Wentao Yuan, Benjamin Eckart, Kihwan Kim, Varun Jampani, Dieter Fox, Jan Kautz
[pdf]
[DOI]

Active Perception using Light Curtains for Autonomous Driving
Siddharth Ancha, Yaadhav Raaj, Peiyun Hu, Srinivasa G. Narasimhan, David Held
[pdf]
[DOI]

Invertible Neural BRDF for Object Inverse Rendering
Zhe Chen, Shohei Nobuhara, Ko Nishino
[pdf]
[DOI]

Semi-supervised Semantic Segmentation via Strong-weak Dual-branch Network
Wenfeng Luo, Meng Yang
[pdf]
[DOI]

Practical Deep Raw Image Denoising on Mobile Devices
Yuzhi Wang, Haibin Huang, Qin Xu, Jiaming Liu, Yiqun Liu, Jue Wang
[pdf]
[DOI]

SoundSpaces: Audio-Visual Navigation in 3D Environments
Changan Chen, Unnat Jain, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, Kristen Grauman
[pdf]
[DOI]

Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization
Yuanhao Zhai, Le Wang, Wei Tang, Qilin Zhang, Junsong Yuan, Gang Hua
[pdf]
[DOI]

Erasing Appearance Preservation in Optimization-based Smoothing
Lvmin Zhang, Chengze Li, Yi Ji, Chunping Liu, Tien-tsin Wong
[pdf]
[DOI]

Counterfactual Vision-and-Language Navigation via Adversarial Path Sampler
Tsu-Jui Fu, Xin Eric Wang, Matthew F. Peterson, Scott T. Grafton, Miguel P. Eckstein, William Yang Wang
[pdf]
[DOI]

Guided Deep Decoder: Unsupervised Image Pair Fusion
Tatsumi Uezato, Danfeng Hong, Naoto Yokoya, Wei He
[pdf]
[DOI]

Filter Style Transfer between Photos
Jonghwa Yim, Jisung Yoo, Won-joon Do, Beomsu Kim, Jihwan Choe
[pdf]
[DOI]

JGR-P2O: Joint Graph Reasoning based Pixel-to-Offset Prediction Network for 3D Hand Pose Estimation from a Single Depth Image
Linpu Fang, Xingyan Liu, Li Liu, Hang Xu, Wenxiong Kang
[pdf]
[DOI]

Dynamic Group Convolution for Accelerating Convolutional Neural Networks
Zhuo Su, Linpu Fang, Wenxiong Kang, Dewen Hu, Matti Pietikäinen, Li Liu
[pdf]
[DOI]

RD-GAN: Few/Zero-Shot Chinese Character Style Transfer via Radical Decomposition and Rendering
Yaoxiong Huang, Mengchao He, Lianwen Jin, Yongpan Wang
[pdf]
[DOI]

Object-Contextual Representations for Semantic Segmentation
Yuhui Yuan, Xilin Chen, Jingdong Wang
[pdf]
[DOI]

Efficient Spatio-Temporal Recurrent Neural Network for Video Deblurring
Zhihang Zhong, Ye Gao, Yinqiang Zheng, Bo Zheng
[pdf]
[DOI]

Joint Semantic Instance Segmentation on Graphs with the Semantic Mutex Watershed
Steffen Wolf, Yuyan Li, Constantin Pape, Alberto Bailoni, Anna Kreshuk, Fred A. Hamprecht
[pdf]
[DOI]

Photon-Efficient 3D Imaging with A Non-Local Neural Network
Jiayong Peng, Zhiwei Xiong, Xin Huang, Zheng-Ping Li, Dong Liu, Feihu Xu
[pdf]
[DOI]

GeLaTO: Generative Latent Textured Objects
Ricardo Martin-Brualla, Rohit Pandey, Sofien Bouaziz, Matthew Brown, Dan B Goldman
[pdf]
[DOI]

Improving Vision-and-Language Navigation with Image-Text Pairs from the Web
Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra
[pdf]
[DOI]

Directional Temporal Modeling for Action Recognition
Xinyu Li, Bing Shuai, Joseph Tighe
[pdf]
[DOI]

Shonan Rotation Averaging: Global Optimality by Surfing SO(p)^n
Frank Dellaert, David M. Rosen, Jing Wu, Robert Mahony, Luca Carlone
[pdf]
[DOI]

Semantic Curiosity for Active Visual Learning
Devendra Singh Chaplot, Helen Jiang, Saurabh Gupta, Abhinav Gupta
[pdf]
[DOI]

Multi-Temporal Recurrent Neural Networks For Progressive Non-Uniform Single Image Deblurring With Incremental Temporal Training
Dongwon Park, Dong Un Kang, Jisoo Kim, Se Young Chun
[pdf]
[DOI]

ProgressFace: Scale-Aware Progressive Learning for Face Detection
Jiashu Zhu, Dong Li, Tiantian Han, Lu Tian, Yi Shan
[pdf]
[DOI]

Learning Multi-layer Latent Variable Model via Variational Optimization of Short Run MCMC for Approximate Inference
Erik Nijkamp, Bo Pang, Tian Han, Linqi Zhou, Song-Chun Zhu, Ying Nian Wu
[pdf]
[DOI]

CoTeRe-Net: Discovering Collaborative Ternary Relations in Videos
Zhensheng Shi, Cheng Guan, Liangjie Cao, Qianqian Li, Ju Liang, Zhaorui Gu, Haiyong Zheng, Bing Zheng
[pdf]
[DOI]

Modeling the Effects of Windshield Refraction for Camera Calibration
Frank Verbiest, Marc Proesmans, Luc Van Gool
[pdf]
[DOI]

Unsupervised Domain Adaptation for Semantic Segmentation of NIR Images through Generative Latent Search
Prashant Pandey, Aayush Kumar Tyagi, Sameer Ambekar, Prathosh AP
[pdf]
[DOI]

PROFIT: A Novel Training Method for sub-4-bit MobileNet Models
Eunhyeok Park, Sungjoo Yoo
[pdf]
[DOI]

Visual Relation Grounding in Videos
Junbin Xiao, Xindi Shang, Xun Yang, Sheng Tang, Tat-Seng Chua
[pdf]
[DOI]

Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows
Andrei Zanfir, Eduard Gabriel Bazavan, Hongyi Xu, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu
[pdf]
[DOI]

Controlling Style and Semantics in Weakly-Supervised Image Generation
Dario Pavllo, Aurelien Lucchi, Thomas Hofmann
[pdf]
[DOI]

Jointly learning visual motion and confidence from local patches in event cameras
Daniel R. Kepple, Daewon Lee, Colin Prepsius, Volkan Isler, Il Memming Park, Daniel D. Lee
[pdf]
[DOI]

SODA: Story Oriented Dense Video Captioning Evaluation Framework
Soichiro Fujita, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata
[pdf]
[DOI]

Sketch-Guided Object Localization in Natural Images
Aditay Tripathi, Rajath R. Dani, Anand Mishra, Anirban Chakraborty
[pdf]
[DOI]

A unifying mutual information view of metric learning: cross-entropy vs. pairwise losses
Malik Boudiaf, Jérôme Rony, Imtiaz Masud Ziko, Eric Granger, Marco Pedersoli, Pablo Piantanida, Ismail Ben Ayed
[pdf]
[DOI]

Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
Jize Cao, Zhe Gan, Yu Cheng, Licheng Yu, Yen-Chun Chen, Jingjing Liu
[pdf]
[DOI]

The Hessian Penalty: A Weak Prior for Unsupervised Disentanglement
William Peebles, John Peebles, Jun-Yan Zhu, Alexei Efros, Antonio Torralba
[pdf]
[DOI]

STAR: Sparse Trained Articulated Human Body Regressor
Ahmed A. A. Osman, Timo Bolkart, Michael J. Black
[pdf]
[DOI]

Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer
Xinghao Chen, Yiman Zhang, Yunhe Wang, Han Shu, Chunjing Xu, Chang Xu
[pdf]
[DOI]

Collaboration by Competition: Self-coordinated Knowledge Amalgamation for Multi-talent Student Learning
Sihui Luo, Wenwen Pan, Xinchao Wang, Dazhou Wang, Haihong Tang, Mingli Song
[pdf]
[DOI]

Do Not Disturb Me: Person Re-identification Under the Interference of Other Pedestrians
Shizhen Zhao, Changxin Gao, Jun Zhang, Hao Cheng, Chuchu Han, Xinyang Jiang, Xiaowei Guo, Wei-Shi Zheng, Nong Sang, Xing Sun
[pdf]
[DOI]

Learning 3D Part Assembly from a Single Image
Yichen Li, Kaichun Mo, Lin Shao, Minhyuk Sung, Leonidas Guibas
[pdf]
[DOI]

PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions
Kaichun Mo, He Wang, Xinchen Yan, Leonidas Guibas
[pdf]
[DOI]

Highly Efficient Salient Object Detection with 100K Parameters
Shang-Hua Gao, Yong-Qiang Tan, Ming-Ming Cheng, Chengze Lu, Yunpeng Chen, Shuicheng Yan
[pdf]
[DOI]

HardGAN: A Haze-Aware Representation Distillation GAN for Single Image Dehazing
Qili Deng, Ziling Huang, Chung-Chi Tsai, Chia-Wen Lin
[pdf]
[DOI]

Lifespan Age Transformation Synthesis
Roy Or-El, Soumyadip Sengupta, Ohad Fried, Eli Shechtman, Ira Kemelmacher-Shlizerman
[pdf]
[DOI]

Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation
Xingchao Peng, Yichen Li, Kate Saenko
[pdf]
[DOI]

Simulating Content Consistent Vehicle Datasets with Attribute Descent
Yue Yao, Liang Zheng, Xiaodong Yang, Milind Naphade, Tom Gedeon
[pdf]
[DOI]

Multiview Detection with Feature Perspective Transformation
Yunzhong Hou, Liang Zheng, Stephen Gould
[pdf]
[DOI]

Learning Object Relation Graph and Tentative Policy for Visual Navigation
Heming Du, Xin Yu, Liang Zheng
[pdf]
[DOI]

Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition
Chenyang Si, Xuecheng Nie, Wei Wang, Liang Wang, Tieniu Tan, Jiashi Feng
[pdf]
[DOI]

Across Scales & Across Dimensions: Temporal Super-Resolution using Deep Internal Learning
Liad Pollak Zuckerman, Eyal Naor, George Pisha, Shai Bagon, Michal Irani
[pdf]
[DOI]

Inducing Optimal Attribute Representations for Conditional GANs
Binod Bhattarai, Tae-Kyun Kim
[pdf]
[DOI]

AR-Net: Adaptive Frame Resolution for Efficient Action Recognition
Yue Meng, Chung-Ching Lin, Rameswar Panda, Prasanna Sattigeri, Leonid Karlinsky, Aude Oliva, Kate Saenko, Rogerio Feris
[pdf]
[DOI]

Image-to-Voxel Model Translation for 3D Scene Reconstruction and Segmentation
Vladimir V. Kniaz, Vladimir A. Knyaz, Fabio Remondino, Artem Bordodymov, Petr Moshkantsev
[pdf]
[DOI]

Consistency Guided Scene Flow Estimation
Yuhua Chen, Luc Van Gool, Cordelia Schmid, Cristian Sminchisescu
[pdf]
[DOI]

Autoregressive Unsupervised Image Segmentation
Yassine Ouali, Céline Hudelot, Myriam Tami
[pdf]
[DOI]

Controllable Image Synthesis via SegVAE
Yen-Chi Cheng, Hsin-Ying Lee, Min Sun, Ming-Hsuan Yang
[pdf]
[DOI]

Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search
Yuan Tian, Qin Wang, Zhiwu Huang, Wen Li, Dengxin Dai, Minghao Yang, Jun Wang, Olga Fink
[pdf]
[DOI]

Efficient Non-Line-of-Sight Imaging from Transient Sinograms
Mariko Isogawa, Dorian Chan, Ye Yuan, Kris Kitani, Matthew O’Toole
[pdf]
[DOI]

Texture Hallucination for Large-Factor Painting Super-Resolution
Yulun Zhang, Zhifei Zhang, Stephen DiVerdi, Zhaowen Wang, Jose Echevarria, Yun Fu
[pdf]
[DOI]

Learning