Speech fine structure contains critical temporal cues to support speech 
segmentation

Teng, Xiangbin; Cogan, Gregory; Poeppel, David

doi:10.1016/j.neuroimage.2019.116152

Datensatz

DATENSATZ AKTIONENEXPORT

Zur Ablage hinzufügen

Lokale TagsFreigabegeschichteDetailsÜbersicht

Freigegeben

Zeitschriftenartikel

Speech fine structure contains critical temporal cues to support speech segmentation

MPG-Autoren

/persons/resource/persons212714

Teng, Xiangbin
Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Max Planck Society;

/persons/resource/persons173724

Poeppel, David
Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Max Planck Society;
Department of Psychology, New York University;

Externe Ressourcen

Es sind keine externen Ressourcen hinterlegt

Volltexte (beschränkter Zugriff)

Für Ihren IP-Bereich sind aktuell keine Volltexte freigegeben.

Volltexte (frei zugänglich)

Es sind keine frei zugänglichen Volltexte in PuRe verfügbar

Ergänzendes Material (frei zugänglich)

Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar

Zitation

Teng, X., Cogan, G., & Poeppel, D. (2019). Speech fine structure contains critical temporal cues to support speech segmentation. NeuroImage, 202: 116152. doi:10.1016/j.neuroimage.2019.116152.

Zitierlink: https://hdl.handle.net/21.11116/0000-0003-6938-9

Zusammenfassung

Segmenting the continuous speech stream into units for further perceptual and linguistic analyses is fundamental to speech recognition. The speech amplitude envelope (SE) has long been considered a fundamental temporal cue for segmenting speech. Does the temporal fine structure (TFS), a significant part of speech signals often considered to contain primarily spectral information, contribute to speech segmentation? Using magnetoencephalography, we show that the TFS entrains cortical oscillatory responses between 3-6 Hz and demonstrate, using mutual information analysis, that (i) the temporal information in the TFS can be reconstructed from a measure of frame-to-frame spectral change and correlates with the SE and (ii) that spectral resolution is key to the extraction of such temporal information. Furthermore, we show behavioural evidence that, when the SE is temporally distorted, the TFS provides cues for speech segmentation and aids speech recognition significantly. Our findings show that it is insufficient to investigate solely the SE to understand temporal speech segmentation, as the SE and the TFS derived from a band-filtering method convey comparable, if not inseparable, temporal information. We argue for a more synthetic view of speech segmentation — the auditory system groups speech signals coherently in both temporal and spectral domains.