3D Scene Reconstruction from a Single Viewport
We present a novel approach for inferring volumetric reconstructions from a single viewport, based only on an RGB image and a reconstructed normal image. To reconstruct regions in 3D that are occluded in the 2D image, we propose to learn this information from synthetically generated high-resolution data. To this end, we introduce a deep network architecture designed specifically for volumetric TSDF (truncated signed distance function) data, built around a tree net architecture. Our framework handles a 3D resolution of $512^3$ by means of a dedicated compression technique based on a modified autoencoder. Furthermore, we introduce a novel loss shaping technique for 3D data that guides the learning process towards regions where free and occupied space are close to each other. Experiments on synthetic and realistic benchmark data show that this leads to very good reconstruction results, both visually and in terms of quantitative measures.
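
To illustrate what loss shaping on TSDF data could look like, the following is a minimal sketch under stated assumptions and not the formulation used in this work. It weights a per-voxel L1 error more strongly where the target TSDF is close to zero, that is, where free and occupied space meet. The function name `shaped_l1_loss`, the linear weighting scheme, and the parameters `truncation` and `boost` are all hypothetical choices made for illustration.

```python
import numpy as np


def shaped_l1_loss(pred_tsdf, target_tsdf, truncation=1.0, boost=4.0):
    """Weighted L1 loss that emphasises voxels near the surface.

    Hypothetical illustration: the weighting below is an assumption,
    not the loss shaping defined in the paper.
    """
    pred_tsdf = np.asarray(pred_tsdf, dtype=np.float64)
    target_tsdf = np.asarray(target_tsdf, dtype=np.float64)

    # |target| is 0 at the surface and reaches `truncation` far from it,
    # so `proximity` is 1 at the surface and 0 in clearly free/occupied space.
    proximity = 1.0 - np.clip(np.abs(target_tsdf) / truncation, 0.0, 1.0)

    # Voxels at the surface get weight (1 + boost); distant voxels get weight 1.
    weights = 1.0 + boost * proximity

    # Normalise by the total weight so the loss scale stays comparable
    # across volumes with different amounts of surface.
    error = np.abs(pred_tsdf - target_tsdf)
    return float(np.sum(weights * error) / np.sum(weights))
```

With this weighting, an error of a given size costs more when it occurs at a voxel near the surface than when it occurs deep inside free or occupied space, which is the qualitative behaviour described above.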