Local Correlation Consistency for Knowledge Distillation
Extracting sufficient knowledge from the teacher network is critical for improving the student network in knowledge distillation. Existing methods mainly enforce consistency of instance-level features and their relationships, but neglect local features and their correlations, which also carry fine details and discriminative patterns. In this paper, we propose a local correlation exploration framework for knowledge distillation. It models three kinds of local knowledge: the intra-instance local relationship, the inter-instance relationship at the same local position, and the inter-instance relationship across different local positions. Moreover, to make the student focus on the informative local regions of the teacher's feature maps, we propose a novel class-aware attention module that highlights class-relevant regions and removes confusing class-irrelevant regions, which makes the local correlation knowledge more accurate and valuable. We conduct extensive experiments and ablation studies on challenging datasets, including CIFAR100 and ImageNet, to demonstrate the superiority of our method over state-of-the-art methods.
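
To make the three kinds of local knowledge concrete, the following is a minimal illustrative sketch, not the method's actual formulation. It assumes that each feature map has already been split into local patches, that a "relationship" is measured by cosine similarity, and that teacher and student relationships are matched with a mean squared error. The function names, the similarity measure, and the loss are all hypothetical choices made for illustration; the class-aware attention module is not modeled here.

```python
import math

# Assumed layout: feats[i][p] is the feature vector (a list of floats)
# of local patch p in instance i. Needs >= 2 instances and >= 2 patches.

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def intra_instance(feats):
    """Similarities between all pairs of patches inside each instance."""
    return [cosine(inst[p], inst[q])
            for inst in feats
            for p in range(len(inst)) for q in range(len(inst))]

def inter_same_position(feats):
    """Similarities between instances, compared at the same patch position."""
    n, k = len(feats), len(feats[0])
    return [cosine(feats[i][p], feats[j][p])
            for p in range(k) for i in range(n) for j in range(n)]

def inter_cross_position(feats):
    """Similarities between different instances at different patch positions."""
    n, k = len(feats), len(feats[0])
    return [cosine(feats[i][p], feats[j][q])
            for i in range(n) for j in range(n) if i != j
            for p in range(k) for q in range(k) if p != q]

def mse(a, b):
    """Mean squared error between two equal-length lists of numbers."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def local_correlation_loss(teacher, student):
    """Sum of the three relationship mismatches (assumed equal weighting)."""
    kinds = (intra_instance, inter_same_position, inter_cross_position)
    return sum(mse(kind(teacher), kind(student)) for kind in kinds)

# Toy batch: 2 instances, 2 patches each. The student uses 3 channels and
# the teacher 2; only the similarities are compared, never raw features.
teacher = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [1.0, 0.0]]]
student = [[[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
           [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]]
loss = local_correlation_loss(teacher, student)
```

Because each relationship is computed within a single network, this sketch compares teacher and student through their similarity values only, so the two networks do not need feature maps of the same channel width.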