  Towards the First Adversarially Robust Neural Network Model on MNIST

Schott, L., Rauber, J., Bethge, M., & Brendel, W. (2019). Towards the First Adversarially Robust Neural Network Model on MNIST. In Seventh International Conference on Learning Representations (ICLR 2019) (pp. 1-16).

Basic

Item Permalink: http://hdl.handle.net/21.11116/0000-0003-743A-A
Version Permalink: http://hdl.handle.net/21.11116/0000-0003-743B-9
Genre: Conference Paper

Creators

Creators:
Schott, L., Author
Rauber, J., Author
Bethge, M. (1, 2), Author
Brendel, W., Author
Affiliations:
1. Research Group Computational Vision and Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497805
2. Max Planck Institute for Biological Cybernetics, Max Planck Society, Spemannstrasse 38, 72076 Tübingen, DE, ou_1497794

Content

Free keywords: -
 Abstract: Despite much effort, deep neural networks remain highly susceptible to tiny input perturbations and even for MNIST, one of the most common toy datasets in computer vision, no neural network model exists for which adversarial perturbations are large and make semantic sense to humans. We show that even the widely recognized and by far most successful defense by Madry et al. (1) overfits on the L-infinity metric (it's highly susceptible to L2 and L0 perturbations), (2) classifies unrecognizable images with high certainty, (3) performs not much better than simple input binarization and (4) features adversarial perturbations that make little sense to humans. These results suggest that MNIST is far from being solved in terms of adversarial robustness. We present a novel robust classification model that performs analysis by synthesis using learned class-conditional data distributions. We derive bounds on the robustness and go to great length to empirically evaluate our model using maximally effective adversarial attacks by (a) applying decision-based, score-based, gradient-based and transfer-based attacks for several different Lp norms, (b) by designing a new attack that exploits the structure of our defended model and (c) by devising a novel decision-based attack that seeks to minimize the number of perturbed pixels (L0). The results suggest that our approach yields state-of-the-art robustness on MNIST against L0, L2 and L-infinity perturbations and we demonstrate that most adversarial examples are strongly perturbed towards the perceptual boundary between the original and the adversarial class.
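To make the abstract's core idea more concrete, the sketch below illustrates "analysis by synthesis": one class-conditional generative model is learned per class, and an input is assigned to the class whose model explains it best. This is a toy stand-in written for illustration, not the authors' code; it substitutes a per-pixel Bernoulli model fit on binarized images (binarization being the simple baseline defense mentioned in the abstract) for the paper's learned per-class variational autoencoders, and it uses synthetic data in place of MNIST.

# Toy sketch (assumption: not the authors' implementation) of classification by
# class-conditional generative models, as described in the abstract.
import numpy as np

def binarize(x, threshold=0.5):
    # Simple input binarization, mentioned in the abstract as a baseline defense.
    return (x > threshold).astype(np.float64)

class ClassConditionalBernoulli:
    # One Bernoulli "on" probability per pixel and per class, fit by counting.
    def __init__(self, n_classes, n_pixels, smoothing=1.0):
        self.theta = np.full((n_classes, n_pixels), 0.5)
        self.n_classes = n_classes
        self.smoothing = smoothing

    def fit(self, x, y):
        xb = binarize(x)
        for c in range(self.n_classes):
            xc = xb[y == c]
            # Laplace-smoothed per-pixel probability for class c
            self.theta[c] = (xc.sum(axis=0) + self.smoothing) / (len(xc) + 2 * self.smoothing)
        return self

    def class_log_likelihoods(self, x):
        xb = binarize(x)
        log_t = np.log(self.theta)
        log_1mt = np.log(1.0 - self.theta)
        # log p(x | c) summed over pixels, for every class c
        return xb @ log_t.T + (1.0 - xb) @ log_1mt.T

    def predict(self, x):
        # Analysis by synthesis: pick the class whose generative model fits best.
        return self.class_log_likelihoods(x).argmax(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Tiny synthetic stand-in for MNIST: 2 classes, 64 "pixels"
    x0 = rng.random((100, 64)) * 0.4          # class 0: mostly dark pixels
    x1 = 0.6 + rng.random((100, 64)) * 0.4    # class 1: mostly bright pixels
    x = np.vstack([x0, x1])
    y = np.array([0] * 100 + [1] * 100)
    model = ClassConditionalBernoulli(n_classes=2, n_pixels=64).fit(x, y)
    print("train accuracy:", (model.predict(x) == y).mean())

The structural point carried over from the abstract is that the decision rests entirely on per-class log-likelihoods, so an adversary has to perturb the input until the wrong class's generative model explains it better than the correct one's, which is what pushes adversarial examples towards the perceptual boundary between the original and the adversarial class.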

Details

Language(s):
 Dates: 2019-04
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Method: -
 Identifiers: -
 Degree: -

Event

Title: Seventh International Conference on Learning Representations (ICLR 2019)
Place of Event: New Orleans, LA, USA
Start-/End Date: 2019-05-06 - 2019-05-09

Source 1

Title: Seventh International Conference on Learning Representations (ICLR 2019)
Source Genre: Proceedings
 Creator(s):
Affiliations:
Publ. Info: -
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 1 - 16
Identifier: -