Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking

Yingjie Yao, Xiaohe Wu, Lei Zhang, Shiguang Shan, Wangmeng Zuo; The European Conference on Computer Vision (ECCV), 2018, pp. 552-567


Correlation filter (CF) based trackers generally include two modules, i.e., feature representation and on-line model adaptation. In existing off-line deep learning models for CF trackers, model adaptation is usually either abandoned or given a closed-form solution so that the deep representation can be learned in an end-to-end manner. However, such solutions fail to exploit the advances in CF models and cannot achieve accuracy competitive with state-of-the-art CF trackers. In this paper, we investigate the joint learning of deep representation and model adaptation, where an updater network is introduced for better tracking on the future frame by taking the current frame representation, the tracking result, and the last CF tracker as input. By modeling the representor as a convolutional neural network (CNN), we truncate the alternating direction method of multipliers (ADMM) and interpret it as a deep updater network, resulting in our model for learning representation and truncated inference (RTINet). Experiments demonstrate that our RTINet tracker achieves favorable tracking accuracy against state-of-the-art trackers, and that its rapid version runs at a real-time speed of 24 fps. The code and pre-trained model will be publicly available.
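For readers unfamiliar with the closed-form model adaptation that the abstract contrasts with, the following is a minimal sketch of a single-channel ridge-regression correlation filter solved element-wise in the Fourier domain. It is not RTINet and not the authors' code: the function names (`gaussian_label`, `train_filter`, `respond`, `peak`), the random test patch, and the values of `lam` and `sigma` are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_label(shape, center, sigma=2.0):
    """Desired response: a 2-D Gaussian peaked at `center` (row, col)."""
    rows = np.arange(shape[0])[:, None]
    cols = np.arange(shape[1])[None, :]
    d2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def train_filter(x, y, lam=1e-2):
    """Closed-form ridge-regression filter, computed per frequency.

    Minimizes |H * X - Y|^2 + lam * |H|^2 independently at each frequency,
    which gives H = Y * conj(X) / (|X|^2 + lam).
    """
    x_hat = np.fft.fft2(x)
    y_hat = np.fft.fft2(y)
    return y_hat * np.conj(x_hat) / (x_hat * np.conj(x_hat) + lam)

def respond(h_hat, z):
    """Response map of the learned filter on a new patch `z`."""
    return np.real(np.fft.ifft2(h_hat * np.fft.fft2(z)))

def peak(response):
    """(row, col) location of the maximum response."""
    idx = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in idx)

# Hypothetical data: a random patch stands in for a feature map.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
y = gaussian_label(x.shape, center=(16, 16))
h_hat = train_filter(x, y)

# A circularly shifted copy of the patch simulates target motion.
z = np.roll(x, shift=(3, 5), axis=(0, 1))
peak_train = peak(respond(h_hat, x))
peak_shifted = peak(respond(h_hat, z))
```

Comparing `peak_train` with `peak_shifted` shows the response peak following the circular shift applied to the patch, which is how a CF tracker localizes the target. The abstract's proposal replaces this kind of fixed closed-form update with a learned updater derived from truncated ADMM iterations, which this sketch does not attempt to reproduce.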

Related Material

author = {Yao, Yingjie and Wu, Xiaohe and Zhang, Lei and Shan, Shiguang and Zuo, Wangmeng},
title = {Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}