Abstract
Although understanding the processing of natural sounds is an important goal in auditory neuroscience, relatively little is known about the neural coding of these sounds. Recently we demonstrated that the spectral-temporal receptive field (STRF), a description of the stimulus-response function of auditory neurons, can be derived from responses to arbitrary ensembles of complex sounds, including vocalizations. In this study, we use this method to investigate the auditory processing of natural sounds in songbirds. We record neural responses from several regions of the songbird auditory forebrain to a large ensemble of bird songs and use these data to calculate STRFs, the best linear model of the spectral-temporal features of sound to which auditory neurons respond. We find that these neurons respond to a wide variety of features in song, ranging from simple tonal components to more complex spectral-temporal structures such as frequency sweeps and multi-peaked frequency stacks. We quantify the spectral and temporal characteristics of these features by extracting several parameters from the STRFs. Moreover, we assess the linearity versus nonlinearity of encoding by quantifying how well the STRFs predict the neural responses to songs. Our results reveal successively more complex functional stages of song analysis by neurons in the auditory forebrain. When we map the properties of auditory forebrain neurons, as characterized by the STRF parameters, onto conventional anatomical subdivisions of the auditory forebrain, we find that although some properties are shared across subregions, the distributions of several parameters are suggestive of hierarchical processing.
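For concreteness, the linear STRF model referred to above can be written in its standard form (the notation here is ours; the abstract itself gives no equations, and the symbols $S$, $\hat{r}$, $r_{0}$, and $\tau$ are introduced for illustration):

\[
\hat{r}(t) = r_{0} + \sum_{f} \sum_{\tau} \mathrm{STRF}(f,\tau)\, S(f,\, t - \tau),
\]

where $S(f,t)$ is the time-frequency representation (spectrogram) of the song stimulus, $r_{0}$ is the mean firing rate, and $\tau$ ranges over the temporal extent of the STRF. Under this formulation, the prediction quality mentioned in the abstract can be assessed by comparing the predicted rate $\hat{r}(t)$ with the measured firing rate, for example via their correlation coefficient; systematic shortfalls in prediction then index the nonlinear component of the encoding.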