Read my lips: Speech distortions in musical lyrics can be overcome (slightly) by facial information

Massaro, Dominic W.; Jesse, Alexandra

doi:10.1016/j.specom.2008.05.013

Journal Article

MPS-Authors

Jesse, Alexandra
Language Comprehension Group, MPI for Psycholinguistics, Max Planck Society;
Other Research, MPI for Psycholinguistics, Max Planck Society;

Fulltext (public)

Massaro_2009_read.pdf
(Publisher version), 283KB

Citation

Massaro, D. W., & Jesse, A. (2009). Read my lips: Speech distortions in musical lyrics can be overcome (slightly) by facial information. Speech Communication, 51(7), 604-621. doi:10.1016/j.specom.2008.05.013.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-1A98-A

Abstract

Understanding the lyrics of many contemporary songs is difficult, and an earlier study [Hidalgo-Barnes, M., Massaro, D.W., 2007. Read my lips: an animated face helps communicate musical lyrics. Psychomusicology 19, 3–12] showed a benefit for lyrics recognition when seeing a computer-animated talking head (Baldi) mouthing the lyrics along with hearing the singer. However, the contribution of visual information was relatively small compared to what is usually found for speech. In the current experiments, our goal was to determine why the face appears to contribute less when aligned with sung lyrics than when aligned with normal speech presented in noise. The first experiment compared the contribution of the talking head with the originally sung lyrics versus the case when it was aligned with the Festival text-to-speech synthesis (TtS) spoken at the original duration of the song’s lyrics. A small and similar influence of the face was found in both conditions. In the second experiment, we compared the presence of the face when the durations of the TtS were equated with the duration of the original musical lyrics to the case when the lyrics were read with typical TtS durations and this speech was embedded in noise. The results indicated that the unusual, temporally distorted durations of musical lyrics decrease the contribution of the visible speech from the face.