SSBNet: Improving Visual Recognition Efficiency by Adaptive Sampling
"Downsampling is widely adopted to achieve a good trade-off between accuracy and latency in visual recognition. However, the commonly used pooling layers are not learned, which can cause loss of information. As an alternative dimension reduction method, adaptive sampling weights and processes the regions that are relevant to the task, and can therefore better preserve useful information. So far, however, the use of adaptive sampling has been limited to certain layers. In this paper, we show that using adaptive sampling as the main component of a deep neural network can improve network efficiency. In particular, we propose SSBNet, which is built by inserting sampling layers into existing networks such as ResNet. The proposed SSBNet achieves competitive results on the ImageNet and COCO datasets. For example, SSB-ResNet-RS-200 achieves 82.6% accuracy on ImageNet, which is 0.6 percentage points higher than the baseline ResNet-RS-152 with similar complexity. Visualization shows the advantage of SSBNet in allowing different layers to focus on different positions, and ablation studies further validate the advantage of adaptive sampling over uniform methods."
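
To make the contrast between uniform and adaptive downsampling concrete, the following is a minimal one-dimensional sketch. It is not the SSBNet sampling layer: it assumes that a non-negative saliency profile is already given (in a network it would be predicted by learned layers), and it uses inverse-CDF sampling as a stand-in technique. The function names `uniform_positions`, `adaptive_positions`, and `resample` are hypothetical and chosen for illustration only.

```python
import numpy as np


def uniform_positions(n_in, n_out):
    """Evenly spaced sample positions over the index range [0, n_in - 1]."""
    return np.linspace(0.0, n_in - 1, n_out)


def adaptive_positions(saliency, n_out, floor=1e-6):
    """Sample positions that concentrate where the saliency is high.

    Assumption: `saliency` is a 1D non-negative array. The positions are
    obtained by inverting the normalized cumulative sum of the saliency
    (inverse-CDF sampling), so high-saliency regions receive more samples.
    """
    s = np.maximum(np.asarray(saliency, dtype=float), floor)  # keep the CDF strictly increasing
    cdf = np.cumsum(s) / s.sum()
    targets = (np.arange(n_out) + 0.5) / n_out  # evenly spaced quantiles
    return np.interp(targets, cdf, np.arange(len(s)))


def resample(signal, positions):
    """Linearly interpolate `signal` at fractional index positions."""
    signal = np.asarray(signal, dtype=float)
    return np.interp(positions, np.arange(len(signal)), signal)


# Toy input: 16 positions, with indices 6..9 marked as task-relevant.
saliency = np.ones(16)
saliency[6:10] = 10.0
signal = np.arange(16, dtype=float)

uniform = uniform_positions(len(signal), 4)
adaptive = adaptive_positions(saliency, 4)

uniform_samples = resample(signal, uniform)
adaptive_samples = resample(signal, adaptive)
```

In this toy setting both methods reduce 16 positions to 4, but the adaptive positions cluster around the high-saliency span while the uniform positions are spread evenly regardless of content. A 2D feature-map version would typically resample with bilinear interpolation on a learned, non-uniform grid.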