Dual Grid Net: Hand Mesh Vertex Regression from Single Depth Maps
We aim to recover the dense 3D surface of the hand from depth maps and propose a network that can predict mesh vertices, transformation matrices for every joint and joint coordinates in a single forward pass. Use fully convolutional architectures, we first map depth image features to the mesh grid and then regress the mesh coordinates into real world 3D coordinates. The final mesh is found by sampling from the mesh grid refit in closed-form based on an articulated template mesh.
When trained with supervision from sparse key-points, our accuracy is comparable with state-of-the-art on the NYU dataset for key point localization, all while recovering mesh vertices and dense correspondences. Under multi-view settings for training, our framework can also learn through self-supervision by minimizing a set of data-fitting terms and kinematic priors. Our approach is competitive with strongly supervised methods and showcases the potential for self-supervision in dense mesh estimation."