A parametric texture model based on deep convolutional features closely matches texture appearance for humans

Wallis, TSA; Funke, CM; Ecker, AS; Gatys, LA; Wichmann, FA; Bethge, M

doi:10.1167/17.10.1081

Poster

MPG Authors

No MPG authors are listed for this publication.

Citation

Wallis, T., Funke, C., Ecker, A., Gatys, L., Wichmann, F., & Bethge, M. (2017). A parametric texture model based on deep convolutional features closely matches texture appearance for humans. Poster presented at 17th Annual Meeting of the Vision Sciences Society (VSS 2017), St. Pete Beach, FL, USA.

Citation link: https://hdl.handle.net/21.11116/0000-0000-C403-F

Abstract

Much of our visual environment consists of texture (“stuff” like cloth, bark or gravel, as distinct from “things” like dresses, trees or paths), and we humans are adept at perceiving textures and their subtle variation. How does our visual system achieve this feat? Here we psychophysically evaluate a new parametric model of texture appearance (the CNN texture model; Gatys et al., 2015) that is based on the features encoded by a deep convolutional neural network (deep CNN) trained to recognise objects in images (the VGG-19; Simonyan and Zisserman, 2015). By cumulatively matching the correlations of deep features up to a given layer (using up to five convolutional layers) we were able to evaluate models of increasing complexity. We used a three-alternative spatial oddity task to test whether model-generated textures could be discriminated from original natural textures under two viewing conditions: when test patches were briefly presented to the parafovea (“single fixation”) and when observers were able to make eye movements to all three patches (“inspection”). For 9 of the 12 source textures we tested, the models using more than three layers produced images that were indiscriminable from the originals even under foveal inspection. The venerable parametric texture model of Portilla and Simoncelli (2000) was also able to match the appearance of these textures in the single fixation condition, but not under inspection. Of the three source textures our model could not match, two contain strong periodicities. In a second experiment, we found that matching the power spectrum in addition to the deep features used above (Liu et al., 2016) greatly improved matches for these two textures. These results suggest that the features learned by deep CNNs encode statistical regularities of natural scenes that capture important aspects of material perception in humans.
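
The idea of "cumulatively matching the correlations of deep features up to a given layer" can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `gram_matrix` and `cumulative_texture_loss`, the normalisation by the number of spatial positions, and the random arrays standing in for VGG-19 activations are all assumptions made for illustration. The sketch computes the channel-by-channel correlation (Gram) matrix of each feature map and sums the squared differences between original and synthesised textures over the first `n_layers` layers.

```python
import numpy as np


def gram_matrix(features):
    """Spatially averaged channel correlations of one feature map.

    features: array of shape (channels, height, width).
    Returns an array of shape (channels, channels).
    The division by height * width is an assumed normalisation.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)


def cumulative_texture_loss(orig_feats, synth_feats, n_layers):
    """Sum of squared Gram-matrix differences over the first n_layers layers."""
    loss = 0.0
    for f_orig, f_synth in zip(orig_feats[:n_layers], synth_feats[:n_layers]):
        diff = gram_matrix(f_orig) - gram_matrix(f_synth)
        loss += float(np.sum(diff ** 2))
    return loss


# Hypothetical stand-ins for feature maps from three network layers.
# In practice these would be activations of a trained CNN, not noise.
rng = np.random.default_rng(0)
shapes = [(4, 16, 16), (8, 8, 8), (16, 4, 4)]
orig = [rng.standard_normal(s) for s in shapes]
synth = [rng.standard_normal(s) for s in shapes]

for n in range(1, len(shapes) + 1):
    print(n, cumulative_texture_loss(orig, synth, n))
```

Because each layer contributes a non-negative term, the loss can only grow as more layers are included, so matching more layers constrains the synthesised image more tightly. In a synthesis setting the loss would be minimised with respect to the pixels of the synthesised image, a step this sketch omits.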