Manifold Projection for Adversarial Defense on Face Recognition
Although face recognition systems based on deep convolutional neural networks have achieved remarkable success, they are susceptible to adversarial images: carefully constructed, imperceptible perturbations can easily mislead deep neural networks. A recent study has shown that, in addition to regular off-manifold adversarial images, there are also adversarial images on the manifold. In this paper, we propose the adversarial variational autoencoder (A-VAE), a novel framework to tackle both types of attacks. We hypothesize that both off-manifold and on-manifold attacks move the image away from the high-probability region of the image manifold. We use a variational autoencoder (VAE) to estimate a lower bound on the log-likelihood of an image and to project input images back into the high-probability regions of the image manifold. At inference time, our model synthesizes multiple similar realizations of a given image by random sampling and selects the nearest neighbor of the given image as the final input to the face recognition model. We also use adversarial training to enhance the robustness of our model against adversarial perturbations. As a preprocessing operation, our method is attack-agnostic and can adapt to a wide range of resolutions. Experimental results on LFW demonstrate that our method achieves a state-of-the-art defense success rate against conventional off-manifold attacks such as FGSM, PGD, and C&W under both grey-box and white-box settings, and even against on-manifold attacks.
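The inference-time step described above (sample several realizations, keep the one nearest to the input) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `project_to_manifold`, `toy_encode`, and `toy_decode` are hypothetical, the toy encoder and decoder merely stand in for a trained VAE, and the use of pixel-space Euclidean distance as the nearest-neighbor criterion is an assumption, since the abstract does not state which distance is used.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two flattened images (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def project_to_manifold(
    image: Sequence[float],
    encode: Callable[[Sequence[float]], Tuple[Vector, Vector]],
    decode: Callable[[Sequence[float]], Vector],
    num_samples: int = 8,
    rng: Optional[random.Random] = None,
) -> Vector:
    """Draw several VAE reconstructions and return the one closest to `image`."""
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")
    rng = rng or random.Random(0)
    mu, log_var = encode(image)  # parameters of the approximate posterior
    best: Vector = []
    best_dist = float("inf")
    for _ in range(num_samples):
        # Reparameterized sample: z = mu + sigma * eps, with eps ~ N(0, 1).
        z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
             for m, lv in zip(mu, log_var)]
        candidate = decode(z)
        dist = l2_distance(candidate, image)
        if dist < best_dist:
            best, best_dist = candidate, dist
    return best


# Hypothetical stand-ins for a trained encoder and decoder (illustration only).
def toy_encode(x: Sequence[float]) -> Tuple[Vector, Vector]:
    return list(x), [-2.0] * len(x)  # mean = input, small fixed log-variance


def toy_decode(z: Sequence[float]) -> Vector:
    return [min(1.0, max(0.0, v)) for v in z]  # clamp to valid pixel range
```

In a real pipeline, the returned vector would be reshaped to an image and passed to the face recognition model in place of the original input. Increasing `num_samples` can only keep or reduce the distance to the input for a fixed random stream, at the cost of more decoder passes.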