Of Human Observers and Deep Neural Networks: A Detailed Psychophysical Comparison


Geirhos, R., Jannsen, D., Schütt, H., Bethge, M., & Wichmann, F. (2017). Of Human Observers and Deep Neural Networks: A Detailed Psychophysical Comparison. Poster presented at 17th Annual Meeting of the Vision Sciences Society (VSS 2017), St. Pete Beach, FL, USA.

Cite as: http://hdl.handle.net/21.11116/0000-0000-C445-5
Deep Neural Networks (DNNs) have recently been put forward as computational models for feedforward processing in the human and monkey ventral streams. Not only do they achieve human-level performance in image classification tasks, but recent studies have also found striking similarities between DNNs and ventral stream processing in terms of the learned representations (e.g. Cadieu et al., 2014, PLOS Comput. Biol.) or the spatial and temporal stages of processing (Cichy et al., 2016, arXiv). To obtain a more precise understanding of the similarities and differences between current DNNs and the human visual system, we investigate how classification accuracy depends on image properties such as colour, contrast and the amount of additive visual noise, as well as on image distortions produced by the Eidolon Factory. We report results from a series of image classification (object recognition) experiments on both human observers and three DNNs (AlexNet, VGG-16, GoogLeNet). We used experimental conditions favouring single-fixation, purely feedforward processing in human observers (a short presentation time of t = 200 ms followed by a high-contrast mask); additionally, we used exactly the same images from 16 basic-level categories for human observers and DNNs. Under non-manipulated conditions (colour, full-contrast, noise-free images) we find that DNNs indeed outperform human observers (96.2% correct versus 88.5%). However, human observers clearly outperform DNNs under all of the image-degrading manipulations: most strikingly, DNN performance breaks down severely with even small quantities of random visual noise. Our findings reinforce how robust the human visual system is against various image degradations, and indicate that there may still be marked differences in the way the human visual system and the three tested DNNs process visual information. We discuss which differences between known properties of the early and higher visual system and DNNs may be responsible for the behavioural discrepancies we find.
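
To make the image manipulations named in the abstract concrete (colour removal, contrast reduction, additive noise), the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' stimulus pipeline: the abstract gives no parameter values, so the function names, the Rec. 601 greyscale weights, the mid-grey anchor of 0.5, the uniform noise distribution and the example values `contrast=0.3` and `width=0.1` are all hypothetical. Eidolon distortions are not covered.

```python
import random

def to_grayscale(rgb_image):
    """Convert an RGB image (rows of (r, g, b) tuples in [0, 1]) to luminance."""
    # Rec. 601 luma weights: an assumption, the abstract names no conversion.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]

def reduce_contrast(gray_image, contrast):
    """Scale pixel values toward mid-grey (0.5); contrast=1.0 is a no-op."""
    return [[0.5 + contrast * (p - 0.5) for p in row] for row in gray_image]

def add_uniform_noise(gray_image, width, rng):
    """Add independent uniform noise in [-width, width] per pixel, clipped to [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.uniform(-width, width))) for p in row]
            for row in gray_image]

# Hypothetical 4x4 random "image" standing in for a real photograph.
rng = random.Random(0)
rgb = [[(rng.random(), rng.random(), rng.random()) for _ in range(4)]
       for _ in range(4)]

gray = to_grayscale(rgb)                                     # colour removed
low_contrast = reduce_contrast(gray, contrast=0.3)           # illustrative value
noisy = add_uniform_noise(low_contrast, width=0.1, rng=rng)  # illustrative value
```

Feeding the same degraded arrays to human observers and to each network is what allows accuracy to be compared condition by condition, as the abstract describes.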