  Extending DeepGaze II: Scanpath prediction from deep features

Kümmerer, M., Wallis, T., & Bethge, M. (2018). Extending DeepGaze II: Scanpath prediction from deep features. Journal of Vision, 18(10): 32.21, 371.

Genre: Meeting Abstract

Creators:
Kümmerer, M. (1), Author
Wallis, T., Author
Bethge, M. (1), Author
Affiliations:
(1) External Organizations

Content

Free keywords: -
 Abstract: Predicting where humans choose to fixate can help us understand a variety of human behaviour. Recent years have seen substantial progress in predicting spatial fixation distributions for static images. Our own model "DeepGaze II" (Kümmerer et al., ICCV 2017) extracts pretrained deep neural network features from the VGG network for each input image and uses a simple pixelwise readout network to predict fixation distributions from these features. DeepGaze II is state-of-the-art for predicting free-viewing fixation densities according to the established MIT Saliency Benchmark. However, DeepGaze II predicts only spatial fixation distributions, not scanpaths. The model therefore ignores crucial structure in the fixation selection process. Here we extend DeepGaze II to predict fixation densities conditioned on the previous scanpath. We add feature maps encoding the previous scanpath (e.g. the distance of image pixels to previous fixations) to the input of the readout network. Apart from these few additional feature maps, the architecture is exactly that of DeepGaze II. The model is trained on ground-truth human fixation data (MIT1003) using maximum-likelihood optimization. Even using only the last fixation location increases performance by approximately 30% relative to DeepGaze II and reproduces the strong spatial fixation clustering effect reported previously (Engbert et al., JoV 2015). This contradicts the way Inhibition of Return has often been used in computational models of fixation selection. Using a history of two fixations increases performance further, and the model learns a suppression effect around the earlier fixation location. Because our model is probabilistic, we can sample new scanpaths from it that capture the statistics of human scanpaths much better than scanpaths sampled from a purely spatial distribution. The modular architecture of our model allows us to explore the effects of many different possible factors on fixation selection.
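
The abstract names three ingredients: a distance-to-previous-fixation feature map, a fixation density conditioned on the scanpath, and sampling of new scanpaths. The following is a minimal sketch of how these pieces could fit together, not the authors' implementation. It assumes a uniform placeholder in place of the DeepGaze II spatial density and replaces the learned readout network with a hand-set log-linear rule; the function names and the constants `weight` and `sigma` are hypothetical and chosen only for illustration.

```python
import math
import random


def distance_map(height, width, fixation):
    """Distance in pixels from every pixel to one fixation given as (x, y)."""
    fx, fy = fixation
    return [[math.hypot(x - fx, y - fy) for x in range(width)]
            for y in range(height)]


def conditional_log_density(spatial_log_density, last_fixation,
                            weight=2.0, sigma=1.5):
    """Toy readout: spatial log-density plus a hand-set proximity bonus.

    ASSUMPTION: the real readout network is learned from data; `weight`
    and `sigma` are hypothetical constants, not values from the abstract.
    """
    height, width = len(spatial_log_density), len(spatial_log_density[0])
    dist = distance_map(height, width, last_fixation)
    logits = [[spatial_log_density[y][x]
               + weight * math.exp(-dist[y][x] ** 2 / (2 * sigma ** 2))
               for x in range(width)] for y in range(height)]
    # Normalise with log-sum-exp so the density sums to 1 over all pixels.
    peak = max(max(row) for row in logits)
    log_z = peak + math.log(sum(math.exp(v - peak)
                                for row in logits for v in row))
    return [[v - log_z for v in row] for row in logits]


def scanpath_log_likelihood(spatial_log_density, scanpath, **kwargs):
    """Sum of log p(fixation_i | fixation_{i-1}) along one scanpath.

    Maximum-likelihood training would maximise this quantity over the
    readout parameters; here the parameters are simply fixed.
    """
    total = 0.0
    for previous, (x, y) in zip(scanpath, scanpath[1:]):
        log_p = conditional_log_density(spatial_log_density, previous, **kwargs)
        total += log_p[y][x]
    return total


def sample_scanpath(spatial_log_density, start, length, rng):
    """Draw fixations one at a time, each conditioned on the previous one."""
    width = len(spatial_log_density[0])
    scanpath = [start]
    while len(scanpath) < length:
        log_p = conditional_log_density(spatial_log_density, scanpath[-1])
        flat = [math.exp(v) for row in log_p for v in row]
        index = rng.choices(range(len(flat)), weights=flat, k=1)[0]
        scanpath.append((index % width, index // width))
    return scanpath


# Uniform spatial density on an 8x8 grid (placeholder for a DeepGaze II output).
H, W = 8, 8
uniform = [[-math.log(H * W)] * W for _ in range(H)]
near = scanpath_log_likelihood(uniform, [(2, 2), (3, 2)])  # short saccade
far = scanpath_log_likelihood(uniform, [(2, 2), (7, 7)])   # long saccade
path = sample_scanpath(uniform, (4, 4), 5, random.Random(0))
```

With a positive `weight`, the short saccade receives a higher log-likelihood than the long one, which mimics the clustering around the last fixation that the abstract reports. A second distance map for the earlier fixation, with its own weight, would be the analogous way to represent a two-fixation history in this toy setting.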

Details

Language(s):
 Dates: 2018-05
 Publication Status: Published in print
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: BibTex Citekey: KummererWB2018
DOI: 10.1167/18.10.371
 Degree: -

Event

Title: 18th Annual Meeting of the Vision Sciences Society (VSS 2018)
Place of Event: St. Pete Beach, FL, USA
Start-/End Date: 2018-05-18 - 2018-05-23

Source 1

Title: Journal of Vision
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: Charlottesville, VA : Scholar One, Inc.
Pages: -
Volume / Issue: 18 (10)
Sequence Number: 32.21
Start / End Page: 371
Identifier: ISSN: 1534-7362
CoNE: https://pure.mpg.de/cone/journals/resource/111061245811050