Today in CV-Dazzle: Universal Adversarial Perturbations

Universal Adversarial Perturbations

Given a state-of-the-art deep neural network classifier, we show the existence of a universal (image-agnostic) and very small perturbation vector that causes natural images to be misclassified with high probability. We propose a systematic algorithm for computing universal perturbations, and show that state-of-the-art deep neural networks are highly vulnerable to such perturbations, albeit being quasi-imperceptible to the human eye. [...]

Can we find a single small image perturbation that fools a state-of-the-art deep neural network classifier on all natural images? We show in this paper the existence of such quasi-imperceptible universal perturbation vectors that lead to misclassified natural images with high probability. Specifically, by adding such a quasi-imperceptible perturbation to natural images, the label estimated by the deep neural network is changed with high probability.

Such perturbations are dubbed universal, as they are image-agnostic. The existence of these perturbations is problematic when the classifier is deployed in real-world (and possibly hostile) environments, as such a single perturbation can be exploited by adversaries to break the classifier. Indeed, the perturbation process involves the mere addition of one very small perturbation to all natural images, and can be relatively straightforward to implement by adversaries in real-world environments, while being relatively difficult to detect as such perturbations are very small and thus do not significantly affect data distributions. The surprising existence of universal perturbations further reveals new insights on the topology of the decision boundaries of deep neural networks.
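To make "the mere addition of one very small perturbation to all natural images" concrete, here is a minimal Python sketch. It is not the paper's algorithm for computing the perturbation, which the quoted text does not spell out. It only shows the application step and how a fooling rate could be measured. The classifier is a hypothetical stand-in (a fixed random linear scorer), and every name here (`project_linf`, `apply_universal`, `fooling_rate`, `toy_classify`) is an illustrative assumption.

```python
import random

def project_linf(v, eps):
    """Clip each component of v to [-eps, eps] so the perturbation stays small."""
    return [max(-eps, min(eps, x)) for x in v]

def apply_universal(image, v):
    """Add the same perturbation v to an image, keeping pixels in [0, 1]."""
    return [max(0.0, min(1.0, p + d)) for p, d in zip(image, v)]

def fooling_rate(classify, images, v):
    """Fraction of images whose predicted label changes once v is added."""
    changed = sum(
        classify(img) != classify(apply_universal(img, v)) for img in images
    )
    return changed / len(images)

# Hypothetical stand-in for a real network: a fixed linear score,
# thresholded at zero. Real classifiers are far more complex.
rng = random.Random(0)
DIM = 64
weights = [rng.uniform(-1, 1) for _ in range(DIM)]

def toy_classify(image):
    score = sum(w * (p - 0.5) for w, p in zip(weights, image))
    return 1 if score > 0 else 0

# "Images" here are just flat lists of random pixel values in [0, 1).
images = [[rng.random() for _ in range(DIM)] for _ in range(200)]

# One image-agnostic perturbation: a large step along the weight vector,
# projected back into a small L-infinity ball of radius eps.
eps = 0.05
v = project_linf([1000.0 * w for w in weights], eps)

print("fooling rate with v:   ", fooling_rate(toy_classify, images, v))
print("fooling rate with zero:", fooling_rate(toy_classify, images, [0.0] * DIM))
```

The point of the sketch is that `v` is computed once and reused unchanged for every image, which is what makes the perturbation "universal" in the sense quoted above. For a linear toy model, finding such a direction is trivial; the paper's claim is that something comparable exists for deep networks.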

This technology could dramatically impact the SCORPION STARE program. But I know how I'm convolving my selfies from now on!



13 Responses:

  1. Also looks like a step towards optic nerve hacking. If it perturbs artificial neural nets why not meat ones?

    • jwz says:

      Photosensitive epilepsy as malloc bomb.

    • tfb says:

      More interesting is that these perturbations don't seem to work on humans: it's been obvious for a while that we're in the late stages of an AI-hype cycle (which must be the fourth?), with claims being made that AI systems are just about to be better than humans at (image recognition|driving cars|medical diagnosis|&c). Except they turn out to be absurdly fragile in the way that AI systems have always been and humans tend not to be. And here's an example of just such fragility.

      (Note I'm not claiming that human optical systems can do magic, or are not susceptible to tricks like this, just that such tricks may be harder to pull off and less general.)

      • X says:

        I can guess a reason for the discrepancy: Human visual recognition algorithms have been "trained"/evolved with only the color space, resolution, and visual information available to human eyes. Image recognition AIs, on the other hand, use (I could imagine) only the raw, full colourspace, high-resolution image data.

        If an image classifier were to be trained on only data available to human eyes, a "universal perturbation" effective on this AI just might be effective on human vision as well.

        • jwz says:

          A filter that prevents both human eyes and A-eyes from classifying an image is a solid black rectangle.

  2. Brian Van Nieuwenhoven says:

    Oh, great, iPhoto tagged all my selfies again with "Nyarlathotep", SMH

  3. pavel_lishin says:

    Great, can't wait to order my pair of shatter shades so I don't accidentally see a Langford Parrot in the subway on my way to work.

  4. JJP says:

    Video demonstration (cited in the paper):

  5. Opus says:

    (digs through his closet for that ol' cylinder seal containing the nam-shub of Enki...)

  6. Rob says:

    Is this one of those images where you slightly cross your eyes and all of a sudden you see a picture of a unicorn or something in 3-D? I've been staring at it for the past hour and still nothing...

  7. pat says:

    Watching the video, the misclassifications all seem to be "here's something that's usually pretty smooth/flat, and it's misclassified as something with texture". That's no huge surprise, given that the perturbation reminds me of the patterns you see if you look really closely at animal skin. It would be interesting to see what happens to the classifier if you were to apply it to images that already exhibit texture at the scale the "universal perturbation" looks to have.

    I only glanced at the paper, but section 4 (Explaining the vulnerability to universal perturbations) reads to me like they've identified "decision boundaries", where the algorithms are particularly sensitive to disturbances.

    • yan says:

      I had a similar thought. The multiple IDs as "space heater" reminded me of the moiré patterns that occur when scanning printed material at a sympathetic resolution to the printing resolution.

  8. Tim says:

    Re: dazzle, this floating hotel seems suitably dazzling: