Intra-class Feature Variation Distillation for Semantic Segmentation
Current state-of-the-art semantic segmentation methods usually require substantial computational resources to segment accurately. Knowledge distillation is one promising way to achieve a good trade-off between segmentation accuracy and efficiency. In this paper, unlike previous methods that distill dense pairwise relations, we propose a novel intra-class feature variation distillation (IFVD) that transfers the intra-class feature variation (IFV) of the cumbersome model (teacher) to the compact model (student). Concretely, we compute the feature center of each class (regarded as its prototype) and characterize the IFV as the set of similarities between the feature at each pixel and its corresponding class-wise prototype. The teacher model usually learns a more robust intra-class feature representation than the student model, so the two models exhibit different IFV. Transferring the IFV from teacher to student could help the student better mimic the teacher's feature distribution, and thus improve segmentation accuracy. We evaluate the proposed approach on three widely adopted benchmarks, Cityscapes, CamVid and Pascal VOC 2012, where it consistently improves upon state-of-the-art methods.
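The IFV computation described above can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the reference implementation: the abstract does not fix the similarity function or the distillation loss, so cosine similarity and a mean squared difference are assumed here, and the names ifv_map and ifv_distillation_loss are hypothetical.

```python
import numpy as np


def ifv_map(features, labels, num_classes, eps=1e-8):
    """Per-pixel similarity to the class prototype.

    features: (C, H, W) float array of pixel features.
    labels:   (H, W) integer array of class indices.
    Returns an (H, W) array. Cosine similarity is an assumption;
    the abstract only says "similarity".
    """
    c, h, w = features.shape
    flat = features.reshape(c, -1).T      # (H*W, C), one row per pixel
    lab = labels.reshape(-1)              # (H*W,)
    sim = np.zeros(lab.shape[0], dtype=np.float64)
    for k in range(num_classes):
        mask = lab == k
        if not mask.any():
            continue                      # class absent from this image
        # Prototype: mean feature over all pixels of class k.
        proto = flat[mask].mean(axis=0)
        num = flat[mask] @ proto
        den = np.linalg.norm(flat[mask], axis=1) * np.linalg.norm(proto) + eps
        sim[mask] = num / den
    # Pixels whose label lies outside [0, num_classes) keep similarity 0.
    return sim.reshape(h, w)


def ifv_distillation_loss(teacher_feat, student_feat, labels, num_classes):
    """Mean squared difference between teacher and student IFV maps.

    The squared-error form is an assumption made for illustration.
    """
    t = ifv_map(teacher_feat, labels, num_classes)
    s = ifv_map(student_feat, labels, num_classes)
    return float(np.mean((t - s) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 3, size=(4, 4))
    teacher = rng.normal(size=(8, 4, 4))  # wider teacher features
    student = rng.normal(size=(4, 4, 4))  # narrower student features
    print(ifv_distillation_loss(teacher, student, labels, num_classes=3))
```

Because each IFV map holds one scalar per pixel, the teacher and student features may have different channel counts, provided their spatial sizes match the label map.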