Tracking contours of orofacial articulators from real-time MRI of speech.

Labrunie, M.; Badin, P.; Voit, D.; Joseph, A. A.; Lamalle, L.; Vilain, C.; Boë, L. J.; Frahm, J.

doi:10.21437/Interspeech.2016-78

Conference Paper

MPS-Authors

Voit, D.
Biomedical NMR Research GmbH, MPI for biophysical chemistry, Max Planck Society

Joseph, A. A.
Biomedical NMR Research GmbH, MPI for biophysical chemistry, Max Planck Society

Frahm, J.
Biomedical NMR Research GmbH, MPI for biophysical chemistry, Max Planck Society

External Resource

https://hal.archives-ouvertes.fr/hal-01368251/document
(Publisher version)

Fulltext (public)

2354985.pdf
(Publisher version), 184KB

Citation

Labrunie, M., Badin, P., Voit, D., Joseph, A. A., Lamalle, L., Vilain, C., et al. (2016). Tracking contours of orofacial articulators from real-time MRI of speech. In Proceedings of Interspeech 2016 (pp. 470-474). Red Hook, N.Y.: Curran. doi:10.21437/Interspeech.2016-78.

Cite as: https://hdl.handle.net/11858/00-001M-0000-002B-AEC0-6

Abstract

We introduce a method for predicting the midsagittal contours of orofacial articulators from real-time MRI data. A corpus of about 26 minutes of speech was recorded from a French speaker at a rate of 55 images/s using highly undersampled radial gradient-echo MRI with image reconstruction by nonlinear inversion. The contours of each articulator were manually traced on a set of about 60 images, selected by hierarchical clustering to optimally represent the diversity of the speaker's articulations. These data serve to build articulator-specific Principal Component Analysis (PCA) models of contours and associated image intensities, as well as multilinear regression (MLR) models that predict contour parameters from image parameters. The contours obtained by MLR are then refined, using local information about pixel intensity profiles along the contours' normals, by means of modified Active Shape Models (ASM) trained on the same data. The method achieves RMS distances between predicted points and reference contours of 0.54 to 0.93 mm, depending on the articulator. Processing the corpus demonstrated the efficiency of the procedure, although further improvements remain possible. This work opens new perspectives for studying articulatory motion in speech.
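The PCA-plus-MLR stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, array sizes, and the noise-free synthetic data are all assumptions made for illustration, and the ASM refinement step is omitted. The error computed at the end is a plain RMS over coordinates, not the point-to-contour distance reported in the paper.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the mean and the first n_components principal axes (as rows) of X."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(X, mean, axes):
    """PCA scores of X (one row per sample)."""
    return (X - mean) @ axes.T

def fit_mlr(A, B):
    """Least-squares weights W, intercept included, such that [A, 1] @ W ~ B."""
    A1 = np.hstack([A, np.ones((A.shape[0], 1))])
    W, *_ = np.linalg.lstsq(A1, B, rcond=None)
    return W

def predict_contours(images, img_pca, cnt_pca, W):
    """Image pixels -> image scores -> contour scores -> contour coordinates."""
    img_scores = project(images, *img_pca)
    A1 = np.hstack([img_scores, np.ones((img_scores.shape[0], 1))])
    cnt_scores = A1 @ W
    mean, axes = cnt_pca
    return cnt_scores @ axes + mean

# Hypothetical synthetic data: 60 "traced" frames driven by 3 latent factors.
rng = np.random.default_rng(0)
n_frames, n_pixels, n_coords, n_latent = 60, 400, 80, 3
latent = rng.normal(size=(n_frames, n_latent))
images = latent @ rng.normal(size=(n_latent, n_pixels))    # flattened intensities
contours = latent @ rng.normal(size=(n_latent, n_coords))  # flattened (x, y) points

img_pca = fit_pca(images, n_latent)
cnt_pca = fit_pca(contours, n_latent)
W = fit_mlr(project(images, *img_pca), project(contours, *cnt_pca))

predicted = predict_contours(images, img_pca, cnt_pca, W)
rms_error = float(np.sqrt(np.mean((predicted - contours) ** 2)))
```

Because the synthetic images and contours here are exact linear functions of the same latent factors, the regression recovers the training contours almost perfectly. Real MRI data would leave a residual error, which is the part the abstract says the ASM refinement addresses.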