Free keywords:
Computer Science, Computer Vision and Pattern Recognition, cs.CV, Computer Science, Cryptography and Security, cs.CR, Computer Science, Learning, cs.LG, Statistics, Machine Learning, stat.ML
Abstract:
Obtaining deep networks that are robust against adversarial examples and
generalize well is an open problem. A recent hypothesis even states that models
that are both robust and accurate are impossible, i.e., adversarial robustness and
generalization are conflicting goals. In an effort to clarify the relationship
between robustness and generalization, we assume an underlying, low-dimensional
data manifold and show that: 1. regular adversarial examples leave the
manifold; 2. adversarial examples constrained to the manifold, i.e.,
on-manifold adversarial examples, exist; 3. on-manifold adversarial examples
are generalization errors, and on-manifold adversarial training boosts
generalization; and 4. regular robustness is independent of generalization.
Together, these findings imply that models that are both robust and accurate are possible.
However, different models (architectures, training strategies, etc.) can exhibit
different robustness and generalization characteristics. To confirm our claims,
we present extensive experiments on synthetic data (with access to the true
manifold) as well as on EMNIST, Fashion-MNIST and CelebA.
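
The distinction between regular and on-manifold adversarial examples can be illustrated with a minimal sketch. It is not the method used in the paper: the flat toy manifold, the linear classifier weights `W_CLF`, and the helpers `decode`, `manifold_distance`, and `step_against` are all assumptions made up for illustration. A regular perturbation steps along the classifier's gradient in input space, while an on-manifold perturbation steps in latent space and then decodes, so it cannot leave the manifold.

```python
import math

# Toy data manifold (assumption): the plane x3 = 0 inside R^3,
# parameterized by a 2-D latent code z via decode(z) = (z1, z2, 0).
def decode(z):
    return [z[0], z[1], 0.0]

def manifold_distance(x):
    # Distance from x to the plane x3 = 0.
    return abs(x[2])

# Hypothetical linear classifier score; its gradient w.r.t. x is W_CLF.
W_CLF = [1.0, 0.0, 2.0]

def score(x):
    return sum(wi * xi for wi, xi in zip(W_CLF, x))

def step_against(point, grad, eps):
    # Move `point` by an L2 step of length eps against `grad`.
    norm = math.sqrt(sum(g * g for g in grad))
    return [p - eps * g / norm for p, g in zip(point, grad)]

z = [0.5, 0.3]
x = decode(z)
eps = 0.4

# Regular perturbation: step in input space along the input gradient.
x_regular = step_against(x, W_CLF, eps)

# On-manifold perturbation: step in latent space, then decode.
# For this decoder the latent gradient is the first two entries of W_CLF.
grad_z = W_CLF[:2]
x_on = decode(step_against(z, grad_z, eps))

print("regular, distance to manifold:", manifold_distance(x_regular))
print("on-manifold, distance to manifold:", manifold_distance(x_on))
```

In this toy setting both perturbations lower the classifier score, but only the regular one moves off the plane. With real data the manifold is unknown and would have to be approximated, for example by a learned generative model.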