Journal Article

Phoneme detection as a tool for comparing perception of natural and synthetic speech

Fulltext (public)

Cutler_1993_Phoneme detection.pdf
(Publisher version), 780KB

Nix, A. J., Mehta, G., Dye, J., & Cutler, A. (1993). Phoneme detection as a tool for comparing perception of natural and synthetic speech. Computer Speech and Language, 7, 211-228. doi:10.1006/csla.1993.1011.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-3542-3

On simple intelligibility measures, high-quality synthesiser output now scores almost as well as natural speech. Nevertheless, it is widely agreed that perception of synthetic speech is a harder task for listeners than perception of natural speech; in particular, it has been hypothesized that listeners have difficulty identifying phonemes in synthetic speech. If so, a simple measure of the speed with which a phoneme can be identified should prove a useful tool for comparing perception of synthetic and natural speech. The phoneme detection task was here used in three experiments comparing perception of natural and synthetic speech. In the first, response times to synthetic and natural targets were not significantly different, but in the second and third experiments response times to synthetic targets were significantly slower than to natural targets. A speed-accuracy tradeoff in the third experiment suggests that an important factor in this task is the response criterion adopted by subjects. It is concluded that the phoneme detection task is a useful tool for investigating phonetic processing of synthetic speech input, but subjects must be encouraged to adopt a response criterion which emphasizes rapid responding. When this is the case, significantly longer response times for synthetic targets can indicate a processing disadvantage for synthetic speech at an early level of phonetic analysis.