hide
Free keywords:

Abstract:
A long standing question of biological vision research is to identify the computational goal
underlying the response properties of sensory neurons in the early visual system. Some response
properties of visual neurons such as bandpass filtering and contrast gain control have
been shown to exhibit a clear advantage in terms of redundancy reduction. The situation is less
clear in the case of complex cells whose defining property is that of phase invariance. While
it has been shown that complex cells can be learned based on the redundancy reduction principle
by means of subspace ICA [Hyvarinen Hoyer 2000], the resulting gain in redundancy
reduction is very small [Sinz, Simoncelli, Bethge 2010]. Slow feature analysis (SFA, [Wiskott
Sejnowski 2002]) advocates an alternative objective function which does not seek to fit a
density model but constitutes a special case of oriented PCA by maximizing the signal to noise
ratio when treating temporal changes as noise.
Here we set out to evaluate SFA with respect to two important empirical properties of complex
cells RFs: (1) locality (i.e. finite RF size) and (2) an inverse relationship between RF size and
RF spatial frequency. To this end we use an approach similar to that employed by [Field 1987]
for sparse coding. Instead of single Gabor functions, however, we use the energy model of
complex cells which is built with a (quadrature) pair of even and odd symmetric Gabor filters.
We evaluate the objective function of SFA on the energy model responses to motion sequences
of natural images for different spatial frequencies and envelope sizes, with patch sizes ranging
from 6464 to 512512.
We find that the objective function of SFA grows without bound for increasing envelope size
(see Figure, blue line). Consequently, SFA by itself cannot explain spatially localized RFs but
would need to evoke other mechanisms such as anatomical wiring constraints to limit the size
of the RF. It is unlikely, however, that such anatomical constraints are able to reproduce the
inverse relationship between RF size and spatial frequency.
64x6 4 2 56x256 512x512
0
1
2
3
4
5
6
Patch size in pixels
optimal envelop width/wavelength
ICA
SFA
Range of physiological
data [Ringach 2002]
In contrast to SFA, the objective function of subspace ICA yields
a clear optimum for finite envelope sizes, regardless of assumed
patch size (see Figure, red line). In particular, the optimum envelope
size is inversely proportional to spatial frequency — just
as observed for physiologically measured RFs in primary visual
cortex of cat [Field Tolhust 1986] and monkey ([Ringach 2002],
histogram see Figure).
We conclude that SFA fails to reproduce important features of
complex cells. In contrast, the envelope size predicted by subspace
ICA lies well within the range of physiologically measured
receptive field sizes. As a consequence, if we interpret complex cell coding as a step towards
building an invariant representation, the underlying algorithm is more likely to resemble a
sparse coding strategy as employed by subspace ICA than the covariance based learning rule
employed by SFA.