Dynamic Metric Learning with Cross-Level Concept Distillation
"A good similarity metric should be consistent with the human perception of similarities: a sparrow is more similar to an owl if compared to a dog but is more similar to a dog if compared to a car. It depends on the semantic levels to determine if two images are from the same class. As most existing metric learning methods push away interclass samples and pull closer intraclass samples, it seems contradictory if the labels cross semantic levels. The core problem is that a negative pair on a finer semantic level can be a positive pair on a coarser semantic level, so pushing away this pair damages the class structure on the coarser semantic level. We identify the negative repulsion as the key obstacle in existing methods since a positive pair is always positive for coarser semantic levels but not for negative pairs. Our solution, cross-level concept distillation (CLCD), is simple in concept: we only pull closer positive pairs. To facilitate the cross-level semantic structure of the image representations, we propose a hierarchical concept refiner to construct multiple levels of concept embeddings of an image and then pull closer the distance of the corresponding concepts. Extensive experiments demonstrate that the proposed CLCD method outperforms all other competing methods on the hierarchically labeled datasets."