Towards Ultra Low Latency Spiking Neural Networks for Vision and Sequential Tasks Using Temporal Pruning
"Spiking Neural Networks (SNNs) can be energy-efficient alternatives to commonly used deep neural networks (DNNs). However, computation over multiple timesteps increases latency and energy consumption, and incurs memory access overhead for the membrane potentials. Latency reduction is therefore pivotal to obtaining SNNs with high energy efficiency, yet reducing latency can degrade accuracy. To optimize the accuracy-energy-latency trade-off, we propose a temporal pruning method that starts with an SNN of T timesteps and reduces T at every iteration of training, with threshold and leak as trainable parameters. This results in a continuum of SNNs from T timesteps all the way down to a single timestep. Training SNNs directly with 1 timestep results in convergence failure due to layerwise spike vanishing and the difficulty of finding optimum thresholds. The proposed temporal pruning overcomes this by maintaining sufficient spiking activity, which enables suitable layerwise thresholds to be learned with backpropagation. Using the proposed algorithm, we achieve top-1 accuracy of 93.05%, 70.15% and 69.00% on CIFAR-10, CIFAR-100 and ImageNet, respectively, with VGG16, in just 1 timestep. Note that SNNs with leaky-integrate-and-fire (LIF) neurons behave as Recurrent Neural Networks (RNNs), with the membrane potential retaining information about previous inputs. The proposed SNNs also enable sequential tasks such as reinforcement learning on the Cartpole and Atari Pong environments using only 1 to 5 timesteps."
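The layerwise spike vanishing mentioned in the abstract can be illustrated with a toy model. The following is a minimal sketch, not the authors' implementation: the function name `lif_chain`, the soft-reset rule, the chain of single neurons, and all numeric values are assumptions chosen for illustration. Thresholds are set by hand here, standing in for the thresholds that the paper learns with backpropagation.

```python
def lif_chain(input_current, weights, thresholds, leak, timesteps):
    """Count spikes per layer in a chain of single LIF neurons.

    Hypothetical toy model: each layer is one neuron, and a spike from
    layer i is passed to layer i+1 as an input of 1.0 scaled by a weight.
    """
    potentials = [0.0] * len(weights)  # membrane potential of each layer
    counts = [0] * len(weights)        # spikes emitted by each layer
    for _ in range(timesteps):
        signal = input_current
        for i, (w, th) in enumerate(zip(weights, thresholds)):
            # Leaky integration: the potential carries over between
            # timesteps, which is what gives the neuron its memory.
            potentials[i] = leak * potentials[i] + w * signal
            if potentials[i] >= th:
                counts[i] += 1
                potentials[i] -= th    # soft reset (an assumption)
                signal = 1.0
            else:
                signal = 0.0
    return counts


weights = [0.5, 0.5, 0.5]

# Fixed thresholds of 1.0: activity thins out with depth as T shrinks.
for T in (8, 4, 2, 1):
    print(T, lif_chain(1.0, weights, [1.0, 1.0, 1.0], leak=1.0, timesteps=T))

# Lower thresholds (hand-picked stand-ins for learned ones) keep every
# layer spiking even at a single timestep.
print(1, lif_chain(1.0, weights, [0.5, 0.5, 0.5], leak=1.0, timesteps=1))
```

With thresholds fixed at 1.0, the deepest layer stops spiking first as T is reduced, and at T = 1 no layer spikes at all. In this toy setting, lowering the thresholds restores activity in every layer at T = 1, which matches the intuition for why suitable layerwise thresholds matter at very low latency.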