English
 
Help Privacy Policy Disclaimer
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT

Released

Paper

Optimising for Interpretability: Convolutional Dynamic Alignment Networks

MPS-Authors
/persons/resource/persons230363

Böhle,  Moritz Daniel
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society;

/persons/resource/persons45383

Schiele,  Bernt
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society;

External Resource
No external resources are shared
Fulltext (restricted access)
There are currently no full texts shared for your IP range.
Fulltext (public)

arXiv:2109.13004.pdf
(Preprint), 14MB

Supplementary Material (public)
There is no public supplementary material available
Citation

Böhle, M. D., Fritz, M., & Schiele, B. (2021). Optimising for Interpretability: Convolutional Dynamic Alignment Networks. Retrieved from https://arxiv.org/abs/2109.13004.


Cite as: https://hdl.handle.net/21.11116/0000-0009-8113-F
Abstract
We introduce a new family of neural network models called Convolutional
Dynamic Alignment Networks (CoDA Nets), which are performant classifiers with a
high degree of inherent interpretability. Their core building blocks are
Dynamic Alignment Units (DAUs), which are optimised to transform their inputs
with dynamically computed weight vectors that align with task-relevant
patterns. As a result, CoDA Nets model the classification prediction through a
series of input-dependent linear transformations, allowing for linear
decomposition of the output into individual input contributions. Given the
alignment of the DAUs, the resulting contribution maps align with
discriminative input patterns. These model-inherent decompositions are of high
visual quality and outperform existing attribution methods under quantitative
metrics. Further, CoDA Nets constitute performant classifiers, achieving on par
results to ResNet and VGG models on e.g. CIFAR-10 and TinyImagenet. Lastly,
CoDA Nets can be combined with conventional neural network models to yield
powerful classifiers that more easily scale to complex datasets such as
Imagenet whilst exhibiting an increased interpretable depth, i.e., the output
can be explained well in terms of contributions from intermediate layers within
the network.