Few-Shot End-to-End Object Detection via Constantly Concentrated Encoding across Heads

Jiawei Ma, Guangxing Han, Shiyuan Huang, Yuncong Yang, Shih-Fu Chang ;


"Few-shot object detection (FSOD) aims to detect objects of new classes and learn effective models without exhaustive annotation. The end-to-end detection framework has been proposed to generate sparse proposals and set a stack of detection heads to improve the performance. For each proposal, the predictions at lower heads are fed into deeper heads. However, the deeper head may not concentrate on the detected objects and then degrades, resulting in inefficient training and further limiting the performance gain in few-shot scenario. In this paper, we propose a few-shot adaptation strategy, Constantly Concentrated Encoding across heads (CoCo-RCNN), for the end-to-end detectors. For each class, we gather the encodings which detect on its object instances and then train them to be discriminative to avoid degraded prediction. In addition, we embed the class-relevant encodings to the learnable proposals to facilitate the adaptation at lower heads. Extensive experimental results show that our model brought clear gain on benchmarks. Detailed ablation studies are provided to justify the selection of each component."

Related Material

[pdf] [supplementary material] [DOI]