Given a state-of-the-art deep neural network classifier, we show the existence of a universal (image-agnostic) and very small perturbation vector that causes natural images to be misclassified with high probability. We propose a systematic algorithm for computing universal perturbations, and show that state-of-the-art deep neural networks are highly vulnerable to such perturbations, albeit being quasi-imperceptible to the human eye. [...]
Can we find a single small image perturbation that fools a state-of-the-art deep neural network classifier on all natural images? We show in this paper the existence of such quasi-imperceptible universal perturbation vectors that lead to misclassified natural images with high probability. Specifically, by adding such a quasi-imperceptible perturbation to natural images, the label estimated by the deep neural network is changed with high probability.
Such perturbations are dubbed universal, as they are image-agnostic. The existence of these perturbations is problematic when the classifier is deployed in real-world (and possibly hostile) environments, as such a single perturbation can be exploited by adversaries to break the classifier. Indeed, the perturbation process involves the mere addition of one very small perturbation to all natural images, and can be relatively straightforward to implement by adversaries in real-world environments, while being relatively difficult to detect as such perturbations are very small and thus do not significantly affect data distributions. The surprising existence of universal perturbations further reveals new insights on the topology of the decision boundaries of deep neural networks.
This technology could dramatically impact the SCORPION STARE program. But I know how I'm convolving my selfies from now on!