ECVA | European Computer Vision Association

Distance-Normalized Unified Representation for Monocular 3D Object Detection

Xuepeng Shi, Zhixiang Chen, Tae-Kyun Kim

Abstract

Monocular 3D object detection plays an important role in autonomous driving and remains challenging. To achieve fast and accurate monocular 3D object detection, we introduce a single-stage, multi-scale framework, termed UR3D, that learns a unified representation for objects within different distance ranges. UR3D exploits scale information to formulate the different detection tasks, which reduces the required model capacity while keeping detection accurate. In addition, distance estimation is enhanced by a distance-guided NMS, which automatically selects candidate boxes with better distance estimates. Finally, an efficient fully convolutional cascaded point regression method is proposed to infer accurate locations of the projected 2D corners and centers of 3D boxes; these can be used to recover the physical size and orientation of an object through a projection-consistency loss. Experimental results on the challenging KITTI autonomous driving dataset show that UR3D achieves accurate monocular 3D object detection with a compact architecture.
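
The abstract refers to the projected 2D corners and centers of 3D boxes. As a rough illustration of the underlying geometry only, and not the authors' implementation, the sketch below builds the eight corners of a 3D box in camera coordinates and projects them with a pinhole camera model. The function names, the (h, w, l) size ordering, the use of the geometric box center, and the intrinsics in the usage note are all assumptions made for this example.

```python
import math


def box3d_corners(center, size, yaw):
    """Return the 8 corners of a 3D box in camera coordinates.

    Assumptions (not taken from the paper): `center` is the geometric
    center (x, y, z), `size` is (h, w, l), and `yaw` is a rotation in
    radians about the camera Y axis.
    """
    cx, cy, cz = center
    h, w, l = size
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (l / 2, -l / 2):
        for dy in (h / 2, -h / 2):
            for dz in (w / 2, -w / 2):
                # Rotate the offset about the Y axis, then translate.
                x = c * dx + s * dz
                z = -s * dx + c * dz
                corners.append((cx + x, cy + dy, cz + z))
    return corners


def project(point, fx, fy, u0, v0):
    """Project a 3D camera-frame point to pixel coordinates (pinhole model)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (fx * x / z + u0, fy * y / z + v0)
```

For example, with hypothetical intrinsics fx = fy = 700, u0 = 600, v0 = 180, calling project on each element of box3d_corners((0, 0, 10), (2, 2, 4), 0.0) gives eight 2D points. A projection-consistency objective would, under this reading, compare such points against the regressed 2D corners; the exact loss used by UR3D is not given in the abstract.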
