B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable

Arya, Shreyash; Rao, Sukrut; Boehle, Moritz; Schiele, Bernt

Arya, S., Rao, S., Boehle, M., & Schiele, B. (2024). B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, et al. (Eds.), Advances in Neural Information Processing Systems 37 (pp. 62756-62786). Curran Associates, Inc.

Item status: Released

Basic data

Record permalink: https://hdl.handle.net/21.11116/0000-0010-0FBE-9

Version permalink: https://hdl.handle.net/21.11116/0000-0010-C40E-2

Genre: Conference paper

LaTeX: B-cosification: {T}ransforming Deep Neural Networks to be Inherently Interpretable

Files

arXiv:2411.00715.pdf (Preprint), 26MB

File permalink:
https://hdl.handle.net/21.11116/0000-0010-0FC0-5

Name:
arXiv:2411.00715.pdf

Description:
File downloaded from arXiv on 2024-11-12 at 09:15. Neural Information Processing Systems (NeurIPS) 2024

OA status:
Not specified

Visibility:
Public

MIME type / checksum:
application/pdf / [MD5]

Copyright date:
-

Copyright info:
-

License:
http://arxiv.org/licenses/nonexclusive-distrib/1.0/

External references

External reference:
https://proceedings.neurips.cc/paper_files/paper/2024/file/72d50a87b218d84c175d16f4557f7e12-Paper-Conference.pdf (publisher version), open access status unknown

Description:
-

OA status:
Not specified

Creators

Creators:
Arya, Shreyash¹, Author
Rao, Sukrut¹, Author
Boehle, Moritz¹, Author
Schiele, Bernt¹, Author

Affiliations:
¹ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society, ou_1116547

Content

Keywords: Computer Science, Computer Vision and Pattern Recognition, cs.CV; Computer Science, Artificial Intelligence, cs.AI; Computer Science, Learning, cs.LG

Abstract: B-cos Networks have been shown to be effective for obtaining highly human
interpretable explanations of model decisions by architecturally enforcing
stronger alignment between inputs and weight. B-cos variants of convolutional
networks (CNNs) and vision transformers (ViTs), which primarily replace linear
layers with B-cos transformations, perform competitively to their respective
standard variants while also yielding explanations that are faithful by design.
However, it has so far been necessary to train these models from scratch, which
is increasingly infeasible in the era of large, pre-trained foundation models.
In this work, inspired by the architectural similarities in standard DNNs and
B-cos networks, we propose 'B-cosification', a novel approach to transform
existing pre-trained models to become inherently interpretable. We perform a
thorough study of design choices to perform this conversion, both for
convolutional neural networks and vision transformers. We find that
B-cosification can yield models that are on par with B-cos models trained from
scratch in terms of interpretability, while often outperforming them in terms
of classification performance at a fraction of the training cost. Subsequently,
we apply B-cosification to a pretrained CLIP model, and show that, even with
limited data and compute cost, we obtain a B-cosified version that is highly
interpretable and competitive on zero shot performance across a variety of
datasets. We release our code and pre-trained model weights at
https://github.com/shrebox/B-cosification.
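
For readers unfamiliar with the B-cos transformations mentioned in the abstract, the following is a minimal sketch of a single B-cos unit. The formula is an assumption taken from the earlier B-cos Networks literature and is not stated in this record: the weight vector is rescaled to unit norm and the linear response is multiplied by |cos(x, w)|^(B-1). The function name bcos_unit and the default B = 2 are illustrative choices, not the authors' implementation.

```python
import math


def bcos_unit(x, w, b=2.0, eps=1e-12):
    """Illustrative B-cos response of one unit (hypothetical helper).

    x: input vector, w: weight vector, b: the exponent B.
    Assumed formula: |cos(x, w)|^(B-1) * (w / ||w||) . x
    """
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    x_norm = math.sqrt(sum(xi * xi for xi in x))
    # Linear response with the weight rescaled to unit norm.
    linear = sum(wi * xi for wi, xi in zip(w, x)) / (w_norm + eps)
    # Cosine similarity between the input and the weight vector.
    cos = linear / (x_norm + eps)
    # Scale the response by |cos|^(B-1); B = 1 leaves it unchanged.
    return (abs(cos) ** (b - 1.0)) * linear


if __name__ == "__main__":
    x = [3.0, 4.0]
    w = [1.0, 0.0]
    print("B = 1 (plain linear unit, unit-norm weight):", bcos_unit(x, w, b=1.0))
    print("B = 2 (poorly aligned inputs are damped):   ", bcos_unit(x, w, b=2.0))
```

With B = 1 the unit reduces to a plain linear unit with a unit-norm weight, which illustrates the architectural similarity to standard layers that the abstract refers to.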

Details

Language(s): eng - English

Date: Created: 2024-11-01; Accepted: 2024; Published online: 2024

Publication status: Published online

Pages: 31 p.

Place, publisher, edition: -

Table of contents: -

Review type: -

Identifiers: BibTeX citekey: Arya_Neurips24

Degree type: -

Event

Title: 38th Conference on Neural Information Processing Systems

Location: Vancouver, Canada

Start/end date: 2024-12-10 - 2024-12-15

Source 1

Title: Advances in Neural Information Processing Systems 37

Short title: NeurIPS 2024

Source genre: Proceedings

Creators:
Globerson, A.¹, Editor
Mackey, L.¹, Editor
Belgrave, D.¹, Editor
Fan, A.¹, Editor
Paquet, U.¹, Editor
Tomczak, J.¹, Editor
Zhang, C.¹, Editor

Affiliations:
¹ External Organizations, ou_persistent22

Place, publisher, edition: Curran Associates, Inc.

Pages: -
Volume / issue: -
Article number: -
Start / end page: 62756 - 62786
Identifier: -