  DeepGaze II: A big step towards explaining all information in image-based saliency

Kümmerer, M., & Bethge, M. (2016). DeepGaze II: A big step towards explaining all information in image-based saliency. Poster presented at 16th Annual Meeting of the Vision Sciences Society (VSS 2016), St. Pete Beach, FL, USA.

Creators

Creators:
Kümmerer, M.¹, Author
Bethge, M.¹, Author
Affiliations:
¹ Werner Reichardt Centre for Integrative Neuroscience, University of Tübingen

Content

Free keywords: -
 Abstract: When free-viewing scenes, the first few fixations of human observers are driven in part by bottom-up attention. Over the last decade, various models have been proposed to explain these fixations. We recently standardized model comparison using an information-theoretic framework and showed that these models captured no more than one third of the explainable mutual information between image content and fixation locations, which might be partially due to the limited data available (Kümmerer et al., PNAS, in press). Subsequently, we showed that this limitation can be tackled effectively with a transfer learning strategy. Our model "DeepGaze I" uses a neural network (AlexNet) that was originally trained for object recognition on the ImageNet dataset. It achieved a large improvement over the previous state of the art, explaining 56% of the explainable information (Kümmerer et al., ICLR 2015).
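
The "explainable information" percentages can be made concrete. In the information-theoretic framework, a model's average log-likelihood gain over a baseline (typically a center-bias model) is expressed as a fraction of the gain achieved by a gold-standard model estimated from other observers' fixations. A minimal sketch of that ratio, with hypothetical per-pixel log-density arrays as inputs (function names and data layout are illustrative, not the published code):

```python
import numpy as np

def information_gain(log_density, log_baseline, fixations):
    """Average log-likelihood gain (bits per fixation) of a model over a
    baseline. Both maps are 2D arrays of per-pixel log-probabilities, each
    normalized so that exp() sums to 1 over the image."""
    rows, cols = zip(*fixations)           # fixation locations as (row, col)
    gain_nats = log_density[rows, cols] - log_baseline[rows, cols]
    return gain_nats.mean() / np.log(2)    # convert nats to bits

def fraction_explained(log_model, log_gold, log_baseline, fixations):
    """Fraction of the gold standard's information gain that the model
    captures -- e.g. ~0.56 for DeepGaze I, ~0.88 for DeepGaze II."""
    model_gain = information_gain(log_model, log_baseline, fixations)
    gold_gain = information_gain(log_gold, log_baseline, fixations)
    return model_gain / gold_gain
```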

A new generation of object recognition models has since been developed, substantially outperforming AlexNet. The success of "DeepGaze I" and similar models suggests that features yielding good object recognition performance can be exploited for better saliency prediction, and that object recognition and fixation prediction performance are correlated. Here we test this hypothesis. Our new model "DeepGaze II" uses the VGG network to convert an image into a high-dimensional representation, which is then fed through a second, smaller network to yield a density prediction. The second network is pre-trained using maximum likelihood on the SALICON dataset and fine-tuned on the MIT1003 dataset. Remarkably, DeepGaze II explains 88% of the explainable information on held-out data, and has since achieved top performance on the MIT Saliency Benchmark. The problem of predicting where people look under free-viewing conditions could be solved very soon. That fixation prediction performance is closely tied to object recognition informs theories of attentional selection in scene viewing.
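A rough sketch of the architecture just described, assuming a PyTorch-style implementation with a recent torchvision: the layer sizes, the use of the full VGG-19 stack rather than a hand-picked subset of layers, and the omission of the full model's center bias and Gaussian blur are all simplifications, not the published configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg19

class DeepGazeIISketch(nn.Module):
    """Frozen VGG features, read out by a small pointwise network
    that is trained by maximum likelihood on fixation data."""

    def __init__(self):
        super().__init__()
        # Frozen, ImageNet-pretrained VGG-19 feature extractor.
        self.features = vgg19(weights="IMAGENET1K_V1").features.eval()
        for p in self.features.parameters():
            p.requires_grad_(False)
        # Second, smaller network: 1x1 convolutions only, so it recombines
        # the fixed features without learning new spatial patterns.
        self.readout = nn.Sequential(
            nn.Conv2d(512, 16, 1), nn.Softplus(),
            nn.Conv2d(16, 32, 1), nn.Softplus(),
            nn.Conv2d(32, 1, 1),
        )

    def forward(self, image):
        # image: (N, 3, H, W), ImageNet-normalized
        feats = self.features(image)           # (N, 512, h, w)
        logits = self.readout(feats)           # (N, 1, h, w)
        n, _, h, w = logits.shape
        # A softmax over all pixels turns the map into a probability
        # density, which is what makes maximum-likelihood training possible.
        return F.log_softmax(logits.view(n, -1), dim=1).view(n, 1, h, w)

def fixation_nll(log_density, rows, cols):
    """Maximum-likelihood objective: negative log-density at the observed
    fixation locations (long tensors, in feature-map coordinates)."""
    n = log_density.shape[0]
    return -log_density[torch.arange(n), 0, rows, cols].mean()
```

Pre-training on SALICON and fine-tuning on MIT1003 would both minimize this same objective, only on different fixation datasets.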

Details

Language(s):
 Dates: 2016-08
 Publication Status: Issued
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: DOI: 10.1167/16.12.330
BibTeX Citekey: KummererB2016
 Degree: -

Event

Title: 16th Annual Meeting of the Vision Sciences Society (VSS 2016)
Place of Event: St. Pete Beach, FL, USA
Start-/End Date: 2016-05-13 - 2016-05-18

Source 1

Title: Journal of Vision
Source Genre: Journal
Publ. Info: Charlottesville, VA : Scholar One, Inc.
Pages: -
Volume / Issue: 16 (12)
Sequence Number: -
Start / End Page: 330
Identifier: ISSN: 1534-7362
CoNE: https://pure.mpg.de/cone/journals/resource/111061245811050