SemanticAdv: Generating Adversarial Examples via Attribute-conditioned Image Editing
Deep neural networks (DNNs) have achieved great success in various vision applications due to their strong expressive power. However, recent studies have shown that DNNs are vulnerable to adversarial examples: manipulated instances crafted to mislead DNNs into making incorrect predictions. Currently, most such adversarial examples aim to guarantee "subtle perturbation" by limiting the $L_p$ norm of the perturbation. In this paper, we propose SemanticAdv, which generates a new type of semantically realistic adversarial example via attribute-conditioned image editing. Compared to existing methods, SemanticAdv enables fine-grained analysis and evaluation of DNNs with input variations in the attribute space. We conduct comprehensive experiments showing that our adversarial examples not only exhibit semantically meaningful appearances but also achieve high targeted attack success rates under both whitebox and blackbox settings. Moreover, we show that existing pixel-based and attribute-based defense methods fail to defend against our attribute-conditioned adversarial examples. We demonstrate the applicability of SemanticAdv to both face recognition and general street-view images to show its generalization. Such non-$L_p$-bounded adversarial examples with controlled attribute manipulation can shed light on the vulnerabilities of DNNs and inform novel defense approaches.
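To make the contrast with $L_p$-bounded attacks concrete, the following is a minimal toy sketch (not the paper's actual method) of the attribute-space idea: instead of perturbing pixels under a norm bound, we search along a semantic editing path between an original input and its attribute-edited version for a point that the classifier assigns to a target class. All names (`classifier`, `semantic_attack`, the linear "generator" stand-in) are hypothetical illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)

def classifier(x, W):
    """Toy linear classifier returning class scores (stand-in for a DNN)."""
    return W @ x

def semantic_attack(x_orig, x_edited, W, target, steps=101):
    """Line-search over alpha in [0, 1]: each candidate lies on the
    semantic editing path between x_orig and its attribute-edited
    version, so no pixel-space L_p bound constrains the change.
    Returns the best interpolation coefficient and whether the
    targeted attack succeeded."""
    best_alpha, best_margin = 0.0, -np.inf
    for alpha in np.linspace(0.0, 1.0, steps):
        x = (1 - alpha) * x_orig + alpha * x_edited
        scores = classifier(x, W)
        # margin > 0 means the target class outscores all others
        margin = scores[target] - np.max(np.delete(scores, target))
        if margin > best_margin:
            best_margin, best_alpha = margin, alpha
    return best_alpha, bool(best_margin > 0)

# Toy data: 8-dimensional "images", 3 classes.
W = rng.normal(size=(3, 8))
x_orig = rng.normal(size=8)
# Stand-in for an attribute-conditioned edit (e.g., adding a smile).
x_edited = x_orig + 2.0 * rng.normal(size=8)

alpha, success = semantic_attack(x_orig, x_edited, W, target=1)
print(f"best alpha = {alpha:.2f}, targeted attack success = {success}")
```

In the full setting, the interpolation would happen inside an attribute-conditioned generative model rather than in pixel space, which is what keeps the resulting examples semantically realistic.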