Connecting the Dots: Detecting Adversarial Perturbations Using Context Inconsistency

Shasha Li, Shitong Zhu, Sudipta Paul, Amit Roy-Chowdhury, Chengyu Song, Srikanth Krishnamurthy, Ananthram Swami, Kevin S. Chan

Abstract


There has been a recent surge in research on adversarial perturbations that defeat Deep Neural Networks (DNNs); most of these attacks target object classifiers. Inspired by the observation that humans can recognize objects that appear out of place in a scene, or alongside other unlikely objects, we augment the DNN with a system that learns context consistency rules during training and checks for violations of these rules during testing. In brief, our approach builds a set of autoencoders, one for each object class, trained so that the discrepancy between input and output is large when the sample contains a perturbation that triggers a context violation. Experiments on PASCAL VOC and MS COCO show that our method effectively detects various adversarial attacks and achieves high ROC-AUC (over 0.95 in most cases); this corresponds to a 20-45% improvement over a baseline context-agnostic method.
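To make the detection idea concrete, below is a minimal sketch of per-class autoencoder-based detection via reconstruction discrepancy. It is an illustration under assumptions, not the paper's actual implementation: the ClassAutoEncoder architecture, the feature dimensionality, and the threshold are all hypothetical placeholders.

    # Hypothetical sketch: one autoencoder per object class, trained on
    # benign (clean) region features. At test time, a large discrepancy
    # between the autoencoder's input and output flags a likely
    # adversarial perturbation / context inconsistency.
    import torch
    import torch.nn as nn


    class ClassAutoEncoder(nn.Module):
        """Illustrative autoencoder for a single object class."""

        def __init__(self, dim: int = 128, bottleneck: int = 32):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(dim, bottleneck), nn.ReLU())
            self.decoder = nn.Linear(bottleneck, dim)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.decoder(self.encoder(x))


    def is_adversarial(features: torch.Tensor,
                       predicted_class: int,
                       autoencoders: dict,
                       threshold: float = 0.1) -> bool:
        """Flag a detection as inconsistent if the autoencoder for the
        predicted class cannot reconstruct the region's features well.
        The threshold would be calibrated on clean validation data."""
        ae = autoencoders[predicted_class]
        with torch.no_grad():
            recon = ae(features)
            discrepancy = torch.mean((recon - features) ** 2).item()
        return discrepancy > threshold

A plausible usage: train each ClassAutoEncoder only on features of correctly labeled, unperturbed objects of its class, so that perturbed or contextually inconsistent inputs reconstruct poorly and exceed the threshold.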
