ECVA | European Computer Vision Association

Data Efficient 3D Learner via Knowledge Transferred from 2D Model

Ping-Chung Yu, Cheng Sun, Min Sun

Abstract

"Collecting and labeling 3D data is costly. As a result, 3D training resources are typically limited in quantity compared to their 2D image counterparts. In this work, we address the data scarcity challenge of 3D tasks by transferring knowledge from strong 2D models via abundant RGB-D images. Specifically, we use a strong, well-trained semantic segmentation model for 2D images to augment abundant RGB-D images with pseudo-labels. The augmented dataset can then be used to pre-train 3D models. Finally, by simply fine-tuning on a few labeled 3D instances, our method already outperforms the existing state of the art tailored for 3D label efficiency. We further improve our results by using simple semi-supervised techniques (i.e., mean teacher and entropy minimization). We verify the effectiveness of our pre-training with two popular 3D models on three different tasks. On the official ScanNet evaluation, we establish new state-of-the-art semantic segmentation results on the data-efficient track."
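
The abstract names three ingredients: lifting 2D pseudo-labels onto RGB-D data, entropy minimization, and a mean teacher. The sketch below illustrates one plausible form of each in NumPy. It is not the authors' implementation: the pinhole camera model, the function names, and the parameters (`fx`, `fy`, `cx`, `cy`, `momentum`) are all assumptions made for illustration, and the abstract does not specify how the paper performs any of these steps.

```python
import numpy as np

def backproject_with_labels(depth, labels_2d, fx, fy, cx, cy):
    """Lift per-pixel pseudo-labels to a labeled 3D point cloud.

    Assumes a pinhole camera with hypothetical intrinsics (fx, fy, cx, cy)
    and treats zero depth as a missing measurement.
    """
    h, w = depth.shape
    v, u = np.indices((h, w))        # pixel row (v) and column (u) grids
    valid = depth > 0                # keep only pixels with a depth reading
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    points = np.stack([x, y, z], axis=1)   # (N, 3) camera-frame coordinates
    return points, labels_2d[valid]

def entropy_loss(probs, eps=1e-8):
    """Mean Shannon entropy of per-point class probabilities, shape (N, C).

    Minimizing this term pushes predictions on unlabeled points toward
    confident (low-entropy) distributions.
    """
    return float(-(probs * np.log(probs + eps)).sum(axis=1).mean())

def ema_update(teacher, student, momentum=0.99):
    """Mean-teacher step: teacher weights follow an exponential moving
    average of the student weights. Both arguments are dicts of arrays."""
    return {k: momentum * teacher[k] + (1.0 - momentum) * student[k]
            for k in teacher}
```

In a pipeline of this kind, `labels_2d` would come from the 2D segmentation model's per-pixel predictions on the RGB frame, and the resulting labeled points would serve as pre-training targets for a 3D network before fine-tuning on the few available 3D labels.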
