Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks
We propose novel approaches for simultaneously identifying the important weights of a convolutional neural network (ConvNet) and giving those weights more attention during training. More formally, we identify two characteristics of a weight, its magnitude and its location, that can be linked to its importance. By targeting these characteristics during training, we develop two separate weight excitation (WE) mechanisms, both implemented as weight reparameterizations that modify backpropagation. Using WE, we demonstrate significant improvements over popular baseline ConvNets on multiple computer vision applications (e.g., a 1.3% accuracy improvement over the ResNet50 baseline on ImageNet image classification). These improvements come at no extra computational cost and require no structural change to the ConvNet during inference. Additionally, including WE methods in a convolution block is straightforward, requiring only a few lines of extra code. Lastly, WE mechanisms can provide complementary benefits when used with external attention mechanisms such as the popular Squeeze-and-Excitation attention block.
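To make the idea of a reparameterization-based backpropagation modification concrete, the following is a minimal sketch of how a magnitude-dependent reparameterization can give larger weights a larger gradient gain. It is an illustration under stated assumptions, not necessarily the formulation used in this work: the inverse-hyperbolic-tangent form, the margin `eps`, and the function names `excite`, `excite_grad`, and `reparameterize` are all hypothetical choices made for this example.

```python
import math

# Hypothetical illustration: the atanh form and `eps` are assumptions,
# chosen only because the slope of atanh grows with the input magnitude.

def excite(w, m_a):
    """Reparameterized weight: close to w near zero, amplified for large |w|."""
    return m_a * math.atanh(w / m_a)

def excite_grad(w, m_a):
    """Derivative of excite with respect to w: 1 / (1 - (w / m_a)^2).

    During backpropagation the gradient reaching w is multiplied by this
    factor, so larger-magnitude weights receive a larger gain.
    """
    return 1.0 / (1.0 - (w / m_a) ** 2)

def reparameterize(weights, eps=0.1):
    """Return (effective weights, gradient gains) for a flat list of weights.

    m_a is set slightly above the largest magnitude so that atanh stays
    finite. Assumes at least one weight is non-zero.
    """
    m_a = (1.0 + eps) * max(abs(w) for w in weights)
    effective = [excite(w, m_a) for w in weights]
    gains = [excite_grad(w, m_a) for w in weights]
    return effective, gains

weights = [0.01, -0.2, 0.5]
effective, gains = reparameterize(weights)
```

In this sketch the convolution would use `effective` in place of `weights` during training. Because the mapping depends only on the weights, the effective weights can be computed once after training and stored, which is consistent with the claim of no extra cost at inference.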