
Record

  Confidence-Calibrated Adversarial Training and Detection: More Robust Models Generalizing Beyond the Attack Used During Training

Stutz, D., Hein, M., & Schiele, B. (2019). Confidence-Calibrated Adversarial Training and Detection: More Robust Models Generalizing Beyond the Attack Used During Training. Retrieved from http://arxiv.org/abs/1910.06259.


Basic data

Genre: Research paper

Files

arXiv:1910.06259.pdf (Preprint), 2MB
Name:
arXiv:1910.06259.pdf
Description:
File downloaded from arXiv at 2019-12-09 13:21
OA status:
Visibility:
Public
MIME type / checksum:
application/pdf / [MD5]
Technical metadata:
Copyright date:
-
Copyright info:
-


Creators

Creators:
Stutz, David¹, Author
Hein, Matthias², Author
Schiele, Bernt¹, Author
Affiliations:
¹ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society, ou_1116547
² External Organizations, ou_persistent22

Content

Keywords: Computer Science, Learning (cs.LG); Computer Science, Cryptography and Security (cs.CR); Computer Science, Computer Vision and Pattern Recognition (cs.CV); Statistics, Machine Learning (stat.ML)
Abstract: Adversarial training is the standard approach for training models robust against adversarial examples. However, especially for complex datasets, adversarial training incurs a significant loss in accuracy and is known to generalize poorly to stronger attacks, e.g., larger perturbations or other threat models. In this paper, we introduce confidence-calibrated adversarial training (CCAT), where the key idea is to enforce that the confidence on adversarial examples decays with their distance to the attacked examples. We show that CCAT better preserves the accuracy of normal training, while robustness against adversarial examples is achieved via confidence thresholding, i.e., detecting adversarial examples based on their confidence. Most importantly, in strong contrast to adversarial training, the robustness of CCAT generalizes to larger perturbations and other threat models not encountered during training. For evaluation, we extend the commonly used robust test error to our detection setting, present an adaptive attack with backtracking, and allow the attacker to select, per test example, the worst-case adversarial example from multiple black- and white-box attacks. We present experimental results using $L_\infty$, $L_2$, $L_1$ and $L_0$ attacks on MNIST, SVHN and CIFAR10.
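
The abstract names two concrete mechanisms: a training target whose confidence decays with the size of the perturbation, and detection of adversarial examples by thresholding the model's confidence at test time. The short NumPy sketch below illustrates both ideas; it is an illustration rather than the authors' implementation, and the helper names (ccat_target, reject_low_confidence), the decay exponent rho, and the threshold tau are assumptions chosen for exposition.

import numpy as np

def ccat_target(one_hot, delta_norm, eps, num_classes, rho=10.0):
    # Sketch of a confidence-calibrated training target: interpolate
    # between the true one-hot label and the uniform distribution so
    # that target confidence decays as the relative perturbation size
    # ||delta|| / eps grows. rho (assumed name) controls the decay rate.
    lam = (1.0 - min(1.0, delta_norm / eps)) ** rho
    uniform = np.full(num_classes, 1.0 / num_classes)
    return lam * one_hot + (1.0 - lam) * uniform

def reject_low_confidence(probs, tau=0.9):
    # Detection by confidence thresholding, as described in the abstract:
    # flag inputs whose maximum predicted probability falls below tau.
    return probs.max(axis=-1) < tau

# A clean input (zero perturbation) keeps a confident one-hot target,
# while a maximally perturbed one is pushed toward the uniform distribution.
label = np.eye(10)[3]
print(np.round(ccat_target(label, delta_norm=0.0, eps=0.03, num_classes=10), 3))
print(np.round(ccat_target(label, delta_norm=0.03, eps=0.03, num_classes=10), 3))

# A confident prediction passes the threshold; a diffuse one is rejected.
probs = np.array([[0.97, 0.01, 0.02],
                  [0.40, 0.35, 0.25]])
print(reject_low_confidence(probs, tau=0.9))  # [False  True]

Interpolating between the one-hot label and the uniform distribution is one simple way to realize the confidence decay the abstract calls for; the paper itself specifies the exact target and training procedure.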

Details

Language(s): eng - English
Date: 2019-10-14, 2019-11-25, 2019
Publication status: Published online
Pages: -
Place, publisher, edition: -
Table of contents: -
Review method: -
Identifiers: arXiv: 1910.06259
URI: http://arxiv.org/abs/1910.06259
BibTex Citekey: Stutz_arXiv1910.06259
Degree type: -
