Patch-wise Attack for Fooling Deep Neural Network

Lianli Gao, Qilong Zhang, Jingkuan Song, Xianglong Liu, Heng Tao Shen

Abstract


By adding human-imperceptible noise to clean images, the resultant adversarial examples can fool other unknown models. Features of a pixel extracted by deep neural networks (DNNs) are influenced by its surrounding regions, and different DNNs generally focus on different discriminative regions in recognition. Motivated by this, we propose a \underline{patch}-wise iterative algorithm -- a black-box attack towards mainstream normally trained and defense models -- which differs from existing attack methods that manipulate \underline{pixel}-wise noise. In this way, our adversarial examples achieve strong transferability without sacrificing the performance of the substitute model. Specifically, we introduce an amplification factor to the step size in each iteration, and one pixel's overall gradient overflowing the $\epsilon$-constraint is properly assigned to its surrounding regions by a projection kernel. Our method can be generally integrated into any gradient-based attack method. Compared with the current state-of-the-art attacks, we significantly improve the success rate by 9.2\% for defense models and 3.7\% for normally trained models on average. Our anonymous code is available at \url{http://tiny.cc/da7vkz}.
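The sketch below illustrates the patch-wise idea described in the abstract: the step size is amplified in each iteration, and the portion of the accumulated update that would overflow the $\epsilon$-ball is redistributed to neighbouring pixels through a uniform projection kernel. This is a minimal illustration rather than the authors' released code; the function name, hyperparameter names (`amplification`, `kernel_size`, `project_factor`), and their default values are assumptions for demonstration only.

```python
# Illustrative sketch of a patch-wise iterative attack (not the authors' implementation).
import torch
import torch.nn.functional as F

def patch_wise_attack(model, x, y, eps=16/255, steps=10,
                      amplification=10.0, kernel_size=3, project_factor=0.8):
    alpha = eps / steps                      # basic step size
    alpha_beta = alpha * amplification       # amplified step size (assumed factor)
    gamma = alpha_beta * project_factor      # step size for the projected overflow
    c = x.shape[1]
    # uniform projection kernel, applied channel-wise (depthwise convolution)
    w_p = torch.ones(c, 1, kernel_size, kernel_size, device=x.device) / kernel_size ** 2

    x_adv = x.clone().detach()
    accumulated = torch.zeros_like(x)        # accumulated amplified noise
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]

        accumulated = accumulated + alpha_beta * grad.sign()
        # gradient mass that overflows the eps-constraint, to be reassigned
        # to the surrounding region via the projection kernel
        overflow = torch.clamp(accumulated.abs() - eps, min=0) * accumulated.sign()
        projected = gamma * torch.sign(
            F.conv2d(overflow, w_p, padding=kernel_size // 2, groups=c))

        x_adv = x_adv.detach() + alpha_beta * grad.sign() + projected
        # keep the example inside the eps-ball and the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```

Under these assumptions, the call would look like `x_adv = patch_wise_attack(substitute_model, images, labels)`, after which `x_adv` is evaluated against unseen target models to measure transferability.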
