
Journal Article

Analyzing speech in both time and space: Generalized additive mixed models can uncover systematic patterns of variation in vocal tract shape in real-time MRI

MPS-Authors

Joseph,  A.
Research Group of Biomedical NMR, MPI for Biophysical Chemistry, Max Planck Society;

Voit,  D.
Research Group of Biomedical NMR, MPI for Biophysical Chemistry, Max Planck Society;


Frahm,  J.
Research Group of Biomedical NMR, MPI for Biophysical Chemistry, Max Planck Society;

Fulltext (public)
3374957.pdf (Publisher version), 6 MB

Citation

Carignan, C., Hoole, P., Kunay, E., Pouplier, M., Joseph, A., Voit, D., et al. (2020). Analyzing speech in both time and space: Generalized additive mixed models can uncover systematic patterns of variation in vocal tract shape in real-time MRI. Laboratory Phonology: Journal of the Association for Laboratory Phonology, 11. doi:10.5334/labphon.214.


Cite as: https://hdl.handle.net/21.11116/0000-000A-8EAE-3
Abstract
We present a method of using generalized additive mixed models (GAMMs) to analyze midsagittal vocal tract data obtained from real-time magnetic resonance imaging (rt-MRI) video of speech production. Applied to rt-MRI data, GAMMs allow for observation of factor effects on vocal tract shape throughout two key dimensions: time (vocal tract change over the temporal course of a speech segment) and space (location of change within the vocal tract). Examples of this method are provided for rt-MRI data collected at a temporal resolution of 20 ms and a spatial resolution of 1.41 mm, for 36 native speakers of German. The rt-MRI data were quantified as 28-point semi-polar-grid aperture functions. Three test cases are provided as a way of observing vocal tract differences between: (1) /aː/ and /iː/, (2) /aː/ and /aɪ/, and (3) accentuated and unstressed /aː/. The results for each GAMM are independently validated using functional linear mixed models (FLMMs) constructed from data obtained at 20% and 80% of the vowel interval. In each case, the two methods yield similar results. In light of the method similarities, we propose that GAMMs are a robust, powerful, and interpretable method of simultaneously analyzing both temporal and spatial effects in rt-MRI video of speech.
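The sketch below is a minimal, hypothetical illustration of the core idea described in the abstract: modelling aperture as a smooth function of both normalized time and gridline position via a tensor-product term. It is not the authors' analysis code (the paper's GAMMs are statistical models that also include speaker-level random terms and factor comparisons, none of which are reproduced here); the Python library (pygam), simulated data, column layout, and grid sizes are all assumptions for illustration only.

```python
# Illustrative sketch only (not the published analysis): fit a
# tensor-product smooth of normalized time x vocal-tract gridline
# position to aperture values for a single vowel category.
import numpy as np
from pygam import LinearGAM, te

rng = np.random.default_rng(0)

# Simulated stand-in for rt-MRI aperture data:
# each row = one gridline measurement at one time frame.
n_frames, n_gridlines = 50, 28  # e.g., 20-ms frames, 28-point semi-polar grid
time = np.repeat(np.linspace(0, 1, n_frames), n_gridlines)    # normalized vowel time
gridline = np.tile(np.arange(1, n_gridlines + 1), n_frames)   # gridline index along the tract
aperture = (np.sin(np.pi * time) * np.cos(gridline / 5.0)
            + rng.normal(0, 0.1, time.size))                  # fake aperture values

X = np.column_stack([time, gridline])

# Tensor-product smooth over time and space, analogous in spirit to a
# GAMM te(time, gridline) term; speaker random effects are omitted here.
gam = LinearGAM(te(0, 1)).fit(X, aperture)

# Predicted aperture surface: rows = time points, columns = gridline positions,
# i.e., vocal tract shape evolving over the course of the vowel.
grid_t, grid_g = np.meshgrid(np.linspace(0, 1, 20),
                             np.arange(1, n_gridlines + 1),
                             indexing="ij")
surface = gam.predict(np.column_stack([grid_t.ravel(), grid_g.ravel()]))
surface = surface.reshape(grid_t.shape)
print(surface.shape)  # (20, 28)
```

In the paper's design, differences between factor levels (e.g., /aː/ vs. /iː/) would be assessed by comparing such time-by-space surfaces across conditions, with the FLMM analyses at 20% and 80% of the vowel interval serving as an independent check; the simple single-condition fit above only shows the shape of the tensor-product term itself.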