Articulatory feature classification using convolutional neural networks

Merkx, Danny; Scharenborg, Odette

doi:10.21437/Interspeech.2018-2275

Item

ITEM ACTIONSEXPORT

Add to Basket

Local TagsRelease HistoryDetailsSummary

Released

Conference Paper

Articulatory feature classification using convolutional neural networks

MPS-Authors

/persons/resource/persons282348

Merkx, Danny
Center for Language Studies, External Organizations;
International Max Planck Research School for Language Sciences, MPI for Psycholinguistics, Max Planck Society;

External Resource

No external resources are shared

Fulltext (restricted access)

There are currently no full texts shared for your IP range.

Fulltext (public)

merkx18_interspeech.pdf
(Publisher version), 2MB

Supplementary Material (public)

There is no public supplementary material available

Citation

Merkx, D., & Scharenborg, O. (2018). Articulatory feature classification using convolutional neural networks. In Proceedings of Interspeech 2018 (pp. 2142-2146). doi:10.21437/Interspeech.2018-2275.

Cite as: https://hdl.handle.net/21.11116/0000-000B-639A-8

Abstract

The ultimate goal of our research is to improve an existing speech-based computational model of human speech recognition on the task of simulating the role of fine-grained phonetic information in human speech processing. As part of this work we are investigating articulatory feature classifiers that are able to create reliable and accurate transcriptions of the articulatory behaviour encoded in the acoustic speech signal. Articulatory feature (AF) modelling of speech has received a considerable amount of attention in automatic speech recognition research. Different approaches have been used to build AF classifiers, most notably multi-layer perceptrons. Recently, deep neural networks have been applied to the task of AF classification. This paper aims to improve AF classification by investigating two different approaches: 1) investigating the usefulness of a deep Convolutional neural network (CNN) for AF classification; 2) integrating the Mel filtering operation into the CNN architecture. The results showed a remarkable improvement in classification accuracy of the CNNs over state-of-the-art AF classification results for Dutch, most notably in the minority classes. Integrating the Mel filtering operation into the CNN architecture did not further improve classification performance.