Geometry Constrained Weakly Supervised Object Localization
We propose a geometry constrained network, termed GC-Net, for weakly supervised object localization (WSOL). GC-Net consists of three modules: a detector, a generator and a classifier. The detector predicts the object location, defined by a set of coefficients describing a geometric shape (i.e., an ellipse or a rectangle), which is geometrically constrained by the mask produced by the generator. The classifier takes the resulting masked images as input and performs two complementary classification tasks, one for the object and one for the background. To make the mask more compact and more complete, we propose a novel multi-task loss function that takes into account the area of the geometric shape, the categorical cross entropy and the negative entropy. In contrast to previous approaches, GC-Net is trained end-to-end and predicts the object location without any post-processing (e.g., thresholding) that may require additional tuning. Extensive experiments on the CUB-200-2011 and ILSVRC2012 datasets show that GC-Net outperforms state-of-the-art methods by a large margin. Our source code is available at https://github.com/lwzeng/GC-Net.
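To make the three loss terms concrete, the following is a minimal NumPy sketch and not the released GC-Net implementation. Everything beyond the three named terms is an assumption made for illustration: the function names, the sigmoid-based soft rectangle mask, the unit weights `w_area`, `w_ce` and `w_ne`, and the choice to apply the cross entropy to the object prediction and the negative entropy to the background prediction.

```python
import numpy as np

def soft_rectangle_mask(cx, cy, w, h, size=32, sharpness=20.0):
    """Soft mask in [0, 1] for an axis-aligned rectangle (hypothetical helper).

    Centre (cx, cy), width w and height h are in normalized [0, 1] coordinates.
    """
    ys, xs = np.meshgrid(np.linspace(0.0, 1.0, size),
                         np.linspace(0.0, 1.0, size), indexing="ij")
    sig = lambda t: 1.0 / (1.0 + np.exp(-t))
    # Product of two sigmoids per axis is close to 1 inside, close to 0 outside.
    in_x = sig(sharpness * (xs - (cx - w / 2))) * sig(sharpness * ((cx + w / 2) - xs))
    in_y = sig(sharpness * (ys - (cy - h / 2))) * sig(sharpness * ((cy + h / 2) - ys))
    return in_x * in_y

def cross_entropy(probs, label, eps=1e-12):
    """Categorical cross entropy for a single example with an integer label."""
    return float(-np.log(probs[label] + eps))

def negative_entropy(probs, eps=1e-12):
    """Negative Shannon entropy, sum(p * log p); lowest for a uniform prediction."""
    return float(np.sum(probs * np.log(probs + eps)))

def multi_task_loss(mask, obj_probs, bg_probs, label,
                    w_area=1.0, w_ce=1.0, w_ne=1.0):
    """Weighted sum of the three terms; the weights are assumed, not from the paper."""
    area = float(mask.mean())                # smaller shape -> more compact mask
    ce = cross_entropy(obj_probs, label)     # masked object should be classifiable
    ne = negative_entropy(bg_probs)          # masked background should be uninformative
    return w_area * area + w_ce * ce + w_ne * ne

# Usage with made-up classifier outputs for a 4-class problem.
mask = soft_rectangle_mask(0.5, 0.5, 0.4, 0.3)
obj_probs = np.array([0.7, 0.1, 0.1, 0.1])
bg_probs = np.full(4, 0.25)
loss = multi_task_loss(mask, obj_probs, bg_probs, label=0)
```

In this sketch the area term pushes the shape to shrink, while the cross entropy term penalizes shrinking so far that the object can no longer be recognized. The negative entropy term penalizes leaving recognizable object parts in the background, which is one plausible reading of how the loss makes the mask both compact and complete.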