Released

Poster

DeepGaze II: Predicting fixations from deep features over time and tasks

Citation

Kümmerer, M., Wallis, T., & Bethge, M. (2017). DeepGaze II: Predicting fixations from deep features over time and tasks. Poster presented at 17th Annual Meeting of the Vision Sciences Society (VSS 2017), St. Pete Beach, FL, USA.


Cite as: https://hdl.handle.net/21.11116/0000-0000-C441-9
Abstract
Where humans choose to look can tell us a lot about behaviour in a variety of tasks. Over the last decade numerous models have been proposed to explain fixations when viewing still images. Until recently these models failed to capture a substantial amount of the explainable mutual information between image content and fixation locations (Kümmerer et al., PNAS 2015). This limitation can be tackled effectively with a transfer learning strategy (“DeepGaze I”, Kümmerer et al., ICLR workshop 2015), in which features learned for object recognition are used to predict fixations. Our new model “DeepGaze II” converts an image into the high-dimensional feature space of the VGG network. A simple readout network is then used to yield a density prediction. The readout network is pre-trained on the SALICON dataset and fine-tuned on the MIT1003 dataset. DeepGaze II explains 82% of the explainable information on held-out data and achieves top performance on the MIT Saliency Benchmark. The modular architecture of DeepGaze II allows a number of interesting applications. By retraining on partial data, we show that fixations after 500 ms of presentation time are driven by qualitatively different features than those in the first 500 ms, and we can predict on which images these changes will be largest. Additionally, we analyse how different viewing tasks (dataset from Koehler et al., 2014) change fixation behaviour and show that we are able to predict the viewing task from the fixation locations. Finally, we investigate how much fixations are driven by low-level cues versus high-level content: by replacing the VGG features with isotropic mean-luminance-contrast features, we create a low-level saliency model that outperforms all saliency models preceding DeepGaze I (including saliency models using DNNs and other high-level features). We analyse how the contributions of high-level and low-level features to fixation locations change over time.
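
A minimal sketch of the "deep features plus readout" idea described in the abstract, assuming a frozen pretrained VGG-19 backbone from torchvision and an illustrative 1x1-convolution readout; the specific layer selection, readout widths, centre bias, and blur used in DeepGaze II are not reproduced here.

# Illustrative sketch only: a fixation-density model built from frozen
# object-recognition features and a small trainable readout network.
# Layer choice, readout sizes, and the absence of centre bias / blur are
# assumptions for brevity, not the authors' exact configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg19

class ReadoutSaliency(nn.Module):
    def __init__(self):
        super().__init__()
        # Frozen VGG-19 convolutional features (transfer learning: the
        # backbone was trained for object recognition, not for fixations).
        self.backbone = vgg19(weights="IMAGENET1K_V1").features.eval()
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Simple trainable readout: pointwise (1x1) convolutions mapping
        # the high-dimensional feature space to a single saliency channel.
        self.readout = nn.Sequential(
            nn.Conv2d(512, 16, kernel_size=1), nn.Softplus(),
            nn.Conv2d(16, 32, kernel_size=1), nn.Softplus(),
            nn.Conv2d(32, 1, kernel_size=1),
        )

    def forward(self, image):
        feats = self.backbone(image)         # (B, 512, H/32, W/32)
        logits = self.readout(feats)         # (B, 1, h, w)
        b, _, h, w = logits.shape
        # Normalise over spatial locations to obtain a fixation density.
        return F.log_softmax(logits.view(b, -1), dim=1).view(b, 1, h, w)

# Training would maximise the log-likelihood of recorded fixation locations
# under this density (e.g. pre-train on SALICON, fine-tune on MIT1003).
model = ReadoutSaliency()
log_density = model(torch.randn(1, 3, 224, 224))

Because only the readout is trained, the same backbone features can be reused while the readout is retrained on partial data (e.g. early versus late fixations, or different viewing tasks), which is the kind of modularity the abstract exploits.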