Exploiting Temporal Coherence for Self-Supervised One-shot Video Re-identification
While supervised techniques in re-identification are extremely effective, the need for large amounts of annotations makes them impractical for large camera networks. One-shot re-identification, which uses a singular labeled tracklet for each identity along with a pool of unlabeled tracklets, is a potential candidate towards reducing this labeling effort. Current one-shot re-identification methods function by modeling the inter-relationships amongst the labeled and the unlabeled data, but fail to fully exploit such relationships that exist within the pool of unlabeled data itself. In this paper, we propose a new framework named Temporal Consistency Progressive Learning, which uses temporal coherence as a novel self-supervised auxiliary task in the one-shot learning paradigm to capture such relationships amongst the unlabeled tracklets. Optimizing two new losses, which enforce consistency on a local and global scale, our framework can learn learn richer and more discriminative representations. Extensive experiments on two challenging video re-identification datasets - MARS and DukeMTMC-VideoReID - demonstrate that our proposed method is able to estimate the true labels of the unlabeled data more accurately by up to $8\%$, and obtain significantly better re-identification performance compared to the existing state-of-the-art techniques."