Rethinking Pseudo-LiDAR Representation
Recently proposed pseudo-LiDAR based 3D detectors have greatly improved benchmark results on the monocular/stereo 3D detection task. However, the underlying mechanism remains obscure to the research community. In this paper, we perform an in-depth investigation and observe that the pseudo-LiDAR representation is effective because of the coordinate transformation, rather than the data representation itself. Based on this observation, we design an image-based CNN detector named PatchNet, a generalized formulation that can represent pseudo-LiDAR based 3D detectors. In PatchNet, we organize pseudo-LiDAR data as an image representation, which means existing 2D CNN designs can be easily utilized to extract deep features from the input data and boost 3D detection performance. We conduct extensive experiments on the challenging KITTI dataset, where the proposed PatchNet outperforms all existing pseudo-LiDAR based counterparts. Code has been made available at: https://github.com/xinzhuma/patchnet
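
To make the idea of organizing pseudo-LiDAR data as an image more concrete, the following is a minimal sketch, not the PatchNet implementation. It assumes a standard pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy` and an estimated per-pixel depth map, none of which are specified in the abstract. Each pixel is mapped to a 3D point, but the result keeps the H x W image layout instead of being flattened into a point set.

```python
def depth_to_coordinate_map(depth, fx, fy, cx, cy):
    """Back-project an H x W depth map into an H x W grid of (x, y, z) tuples.

    Assumption: pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); `depth` is a list of rows holding z in camera coordinates.
    The function name and parameters are illustrative, not from the paper.
    """
    coords = []
    for v, row in enumerate(depth):          # v: pixel row index
        coords.append([
            ((u - cx) * z / fx,              # x: horizontal offset scaled by depth
             (v - cy) * z / fy,              # y: vertical offset scaled by depth
             z)                              # z: depth itself
            for u, z in enumerate(row)       # u: pixel column index
        ])
    return coords


# Usage: a constant-depth 4 x 6 map with made-up intrinsics.
depth = [[10.0] * 6 for _ in range(4)]
coords = depth_to_coordinate_map(depth, fx=2.0, fy=2.0, cx=3.0, cy=2.0)
```

Because the output preserves the pixel grid, it can be treated as a three-channel image whose channels are x, y, z, which is the kind of input a 2D CNN can consume directly. In practice this loop would typically be vectorized with an array library.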