
Talk

Speech rate variation: How to perceive fast and slow speech, and how to speed up and slow down in speech production

MPS-Authors

Maslowski, Merel
Psychology of Language Department, MPI for Psycholinguistics, Max Planck Society;
International Max Planck Research School for Language Sciences, MPI for Psycholinguistics, Max Planck Society;


Rodd, Joe
Psychology of Language Department, MPI for Psycholinguistics, Max Planck Society;

External Resource

Link to ACLC Seminar site
(Supplementary material)

Citation

Maslowski, M., & Rodd, J. (2019). Speech rate variation: How to perceive fast and slow speech, and how to speed up and slow down in speech production. Talk presented at the ACLC Seminar. Amsterdam, The Netherlands. 2019-04-26.


Cite as: https://hdl.handle.net/21.11116/0000-0003-151E-5
Abstract
Speech rate is one of the more salient stylistic dimensions along which speech can vary. We present both sides of this story: how listeners make use of this variation to optimise speech perception, and how the speech production system is modulated to produce speech at different rates.

Listeners take speech rate variation into account by normalizing vowel duration relative to the contextual speech rate: an ambiguous Dutch word intermediate between /mAt/ and /ma:t/ is perceived as short /mAt/ when embedded in a slow context, but as long /ma:t/ in a fast context. Many have argued that rate normalization involves low-level, early, and automatic perceptual processing. However, prior research on rate-dependent speech perception has relied exclusively on explicit recognition tasks, which involve both perceptual processing and decision making. Speech rate effects are induced both by local, adjacent temporal cues and by global, non-adjacent cues. In this talk, I present evidence that local rate normalization takes place, at least in part, at a perceptual level, even in the absence of an explicit recognition task. In contrast, global effects of speech rate seem to involve higher-level cognitive adjustments, possibly taking place at a later decision-making stage.
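The normalization idea above can be sketched as a toy classifier (this is an illustrative simplification, not the authors' experimental model; the durations, threshold, and function name are invented for exposition):

```python
# Illustrative sketch: rate normalization as classification of a vowel's
# duration relative to the speech rate of the surrounding context.

def normalize_and_classify(vowel_ms, context_syllable_ms, threshold=1.0):
    """Classify an ambiguous vowel as short /mAt/ or long /ma:t/.

    The vowel's raw duration is scaled by the mean syllable duration of
    the context; a relative duration above `threshold` is heard as long.
    All numeric values are hypothetical.
    """
    relative = vowel_ms / context_syllable_ms
    return "/ma:t/" if relative > threshold else "/mAt/"

# The same 120 ms vowel flips category depending on the context rate:
print(normalize_and_classify(120, context_syllable_ms=200))  # slow context -> /mAt/
print(normalize_and_classify(120, context_syllable_ms=100))  # fast context -> /ma:t/
```

In a slow context the fixed-duration vowel is relatively short and is heard as /mAt/; in a fast context the same duration is relatively long and is heard as /ma:t/, matching the direction of the effect described above.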

That speakers can vary their speech rate is evident, but how they accomplish this has hardly been studied. Consider an analogy: when walking, speed can be increased continuously, within limits, but to speed up further, humans must run. Are there multiple qualitatively distinct speech 'gaits' that resemble walking and running? Or is control achieved solely by continuous modulation of a single gait? These possibilities are investigated through simulations of a new connectionist computational model of the cognitive process of speech production. The model has parameters that can be adjusted to fit the temporal characteristics of natural speech at different rates. During training, different clusters of parameter values (regimes) were identified for different speech rates. In a one-gait system, the regimes used to achieve fast and slow speech are qualitatively similar but quantitatively different. In a multiple-gait system, there is no linear relationship between the parameter settings associated with each gait, so moving from speaking slowly to speaking fast requires an abrupt shift in parameter values. After training, the model achieved good fits at all three speech rates. The parameter settings associated with each speech rate were not linearly related, suggesting the presence of cognitive gaits, and thus that speakers make use of distinct cognitive configurations for different speech rates.
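The linearity diagnostic described above can be sketched as follows (a hypothetical illustration, not the authors' model: the diagnostic, parameter vectors, and the simplifying assumption that the medium-rate regime should sit halfway between slow and fast are all invented for exposition):

```python
# Sketch of a "gait" diagnostic: if one continuous control regime
# underlies all speech rates, the parameter settings for slow, medium,
# and fast speech should fall roughly on a single line in parameter
# space; a large deviation suggests an abrupt shift between
# qualitatively distinct regimes (gaits). All numbers are invented.

import math

def deviation_from_line(slow, medium, fast):
    """Distance of the medium-rate parameter vector from the midpoint
    of the slow- and fast-rate vectors (simplification: assumes a
    linearly related medium regime would sit halfway between them)."""
    midpoint = [(s + f) / 2 for s, f in zip(slow, fast)]
    return math.dist(medium, midpoint)

# One-gait pattern: medium is the average of slow and fast.
one_gait = deviation_from_line([1.0, 2.0], [2.0, 3.0], [3.0, 4.0])

# Multi-gait pattern: fast speech jumps to a different parameter regime.
multi_gait = deviation_from_line([1.0, 2.0], [1.1, 2.1], [5.0, -1.0])

print(one_gait)   # ~0: consistent with continuous modulation
print(multi_gait) # large: consistent with distinct gaits
```

A near-zero deviation is what a single continuously modulated gait would predict; the model's actual finding, that parameter settings across rates were not linearly related, corresponds to the second, large-deviation pattern.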