  DeepGaze II: Predicting fixations from deep features over time and tasks

Kümmerer, M., Wallis, T., & Bethge, M. (2017). DeepGaze II: Predicting fixations from deep features over time and tasks. Poster presented at 17th Annual Meeting of the Vision Sciences Society (VSS 2017), St. Pete Beach, FL, USA.

Creators

 Creators:
Kümmerer, M, Author              
Wallis, T, Author
Bethge, M (1), Author
Affiliations:
(1) External Organizations

Content

Free keywords: -
 Abstract: Where humans choose to look can tell us a lot about behaviour in a variety of tasks. Over the last decade, numerous models have been proposed to explain fixations when viewing still images. Until recently, these models failed to capture a substantial amount of the explainable mutual information between image content and fixation locations (Kümmerer et al., PNAS 2015). This limitation can be tackled effectively with a transfer learning strategy (“DeepGaze I”, Kümmerer et al., ICLR workshop 2015), in which features learned for object recognition are used to predict fixations. Our new model “DeepGaze II” converts an image into the high-dimensional feature space of the VGG network. A simple readout network is then used to yield a density prediction. The readout network is pre-trained on the SALICON dataset and fine-tuned on the MIT1003 dataset. DeepGaze II explains 82% of the explainable information on held-out data and achieves top performance on the MIT Saliency Benchmark. The modular architecture of DeepGaze II allows a number of interesting applications. By retraining on partial data, we show that fixations after 500 ms of presentation time are driven by qualitatively different features than fixations during the first 500 ms, and we can predict on which images these changes will be largest. Additionally, we analyse how different viewing tasks (dataset from Koehler et al. 2014) change fixation behaviour and show that we are able to predict the viewing task from the fixation locations. Finally, we investigate how much fixations are driven by low-level cues versus high-level content: by replacing the VGG features with isotropic mean-luminance-contrast features, we create a low-level saliency model that outperforms all saliency models before DeepGaze I (including saliency models using DNNs and other high-level features). We analyse how the contributions of high-level and low-level features to fixation locations change over time.

Details

Language(s):
 Dates: 2017-08
 Publication Status: Published in print
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: DOI: 10.1167/17.10.1147
BibTex Citekey: KummererWB2017
 Degree: -

Event

Title: 17th Annual Meeting of the Vision Sciences Society (VSS 2017)
Place of Event: St. Pete Beach, FL, USA
Start-/End Date: 2017-05-19 - 2017-05-24

Source 1

Title: Journal of Vision
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: Charlottesville, VA : Scholar One, Inc.
Pages: -
Volume / Issue: 17 (10)
Sequence Number: -
Start / End Page: 1147
Identifier: ISSN: 1534-7362
CoNE: https://pure.mpg.de/cone/journals/resource/111061245811050