Adaptive Co-Teaching for Unsupervised Monocular Depth Estimation
"Unsupervised depth estimation using photometric losses suffers from local minimum and training instability. We address this issue by proposing an adaptive co-teaching framework to distill the learned knowledge from unsupervised teacher networks to a student network. We design an ensemble architecture for our teacher networks, integrating a depth basis decoder with multiple depth coefficient decoders. Depth prediction can then be formulated as a combination of the predicted depth bases weighted by coefficients. By further constraining their correlations, multiple coefficient decoders can yield a diversity of depth predictions, serving as the ensemble teachers. During the co-teaching step, our method allows different supervision sources from not only ensemble teachers but also photometric losses to constantly compete with each other, and adaptively select the optimal ones to teach the student, which effectively improves the ability of the student to jump out of the local minimum. Our method is shown to significantly benefit unsupervised depth estimation and sets new state of the art on both KITTI and Nuscenes datasets."