Benutzerhandbuch Datenschutzhinweis Impressum Kontakt





A parametric texture model based on deep convolutional features closely matches texture appearance for humans

Es sind keine MPG-Autoren in der Publikation vorhanden
Externe Ressourcen

(beliebiger Volltext)

Volltexte (frei zugänglich)
Es sind keine frei zugänglichen Volltexte verfügbar
Ergänzendes Material (frei zugänglich)
Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar

Wallis, T., Funke, C., Ecker, A., Gatys, L., Wichmann, F., & Bethge, M. (2017). A parametric texture model based on deep convolutional features closely matches texture appearance for humans. Poster presented at 17th Annual Meeting of the Vision Sciences Society (VSS 2017), St. Pete Beach, FL, USA.

Much of our visual environment consists of texture—“stuff” like cloth, bark or gravel as distinct from “things” like dresses, trees or paths—and we humans are adept at perceiving textures and their subtle variation. How does our visual system achieve this feat? Here we psychophysically evaluate a new parameteric model of texture appearance (the CNN texture model; Gatys et al., 2015) that is based on the features encoded by a deep convolutional neural network (deep CNN) trained to recognise objects in images (the VGG-19; Simonyan and Zisserman, 2015). By cumulatively matching the correlations of deep features up to a given layer (using up to five convolutional layers) we were able to evaluate models of increasing complexity. We used a three-alternative spatial oddity task to test whether model-generated textures could be discriminated from original natural textures under two viewing conditions: when test patches were briefly presented to the parafovea (“single fixation”) and when observers were able to make eye movements to all three patches (“inspection”). For 9 of the 12 source textures we tested, the models using more than three layers produced images that were indiscriminable from the originals even under foveal inspection. The venerable parameteric texture model of Portilla and Simoncelli (Portilla and Simoncelli, 2000) was also able to match the appearance of these textures in the single fixation condition, but not under inspection. Of the three source textures our model could not match, two contain strong periodicities. In a second experiment, we found that matching the power spectrum in addition to the deep features used above (Liu et al., 2016) greatly improved matches for these two textures. These results suggest that the features learned by deep CNNs encode statistical regularities of natural scenes that capture important aspects of material perception in humans.