  On the impact of language familiarity in talker change detection

Sharma, N., Krishnamohan, V., Ganapathy, S., Gangopadhayay, A., & Fink, L. (2020). On the impact of language familiarity in talker change detection. In The Institute of Electrical and Electronics Engineers Signal Processing Society (Ed.), 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings (pp. 6249-6253). doi:10.1109/ICASSP40776.2020.9054294.

Basic

Genre: Conference Paper

Creators

Creators:
Sharma, Neeraj (1), Author
Krishnamohan, Venkat (2), Author
Ganapathy, Sriram (2), Author
Gangopadhayay, Ahana (3), Author
Fink, Lauren (4, 5), Author
Affiliations:
(1) Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA
(2) Learning and Extraction of Acoustic Patterns (LEAP) lab, Indian Institute of Science, Bangalore
(3) Electrical and Systems Engineering, Washington University in St. Louis, MO, USA
(4) Department of Music, Max Planck Institute for Empirical Aesthetics, Max Planck Society
(5) Center for Mind and Brain, Univ. of California, Davis, CA, USA

Content

Free keywords: Talker change detection, language familiarity, benchmarking speaker diarization, response time, human versus machine
 Abstract: The ability to detect talker changes when listening to conversational speech is fundamental to perception and understanding of multi-talker speech. In this paper, we propose an experimental paradigm to provide insights on the impact of language familiarity on talker change detection. Two multi-talker speech stimulus sets, one in a language familiar to the listeners (English) and the other unfamiliar (Chinese), are created. A listening test is performed in which listeners indicate the number of talkers in the presented stimuli. Analysis of human performance shows statistically significant results for: (a) lower miss (and a higher false alarm) rate in familiar versus unfamiliar language, and (b) longer response time in familiar versus unfamiliar language. These results signify a link between perception of talker attributes and language proficiency. Subsequently, a machine system is designed to perform the same task. The system makes use of the current state-of-the-art diarization approach with x-vector embeddings. A performance comparison on the same stimulus set indicates that the machine system falls short of human performance by a huge margin, for both languages.

Details

Language(s): eng - English
 Dates: 2020-05-14
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: DOI: 10.1109/ICASSP40776.2020.9054294
 Degree: -

Event

Title: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Place of Event: Barcelona, Spain
Start-/End Date: 2020-05-04 - 2020-05-08

Source 1

Title: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings
Source Genre: Proceedings
Creator(s):
The Institute of Electrical and Electronics Engineers Signal Processing Society, Editor
Affiliations: -
Publ. Info: -
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 6249 - 6253
Identifier: ISBN: 978-1-5090-6631-5
ISBN: 978-1-5090-6632-2