  Talker diarization in the wild: The case of child-centered daylong audio-recordings

Cristia, A., Ganesh, S., Casillas, M., & Ganapathy, S. (2018). Talker diarization in the wild: The case of child-centered daylong audio-recordings. In Proceedings of Interspeech 2018 (pp. 2583-2587). doi:10.21437/Interspeech.2018-2078.

Basic

Genre: Conference Paper

Files

Cristia_etal_2018.pdf (Publisher version), 3MB
Name: Cristia_etal_2018.pdf
Description: -
Visibility: Public
MIME-Type: application/pdf
Copyright Date: -
Copyright Info: -
License: -

Creators

 Creators:
Cristia, Alejandrina (1), Author
Ganesh, Shobhana (2), Author
Casillas, Marisa (3), Author
Ganapathy, Sriram (2), Author
Affiliations:
(1) LSCP, Département d’études cognitives, (ENS, EHESS, CNRS, PSL University), Paris, 75005, France, ou_persistent22
(2) Learning and Extraction of Acoustic Patterns (LEAP) lab, Electrical Engineering, Indian Institute of Science, Bangalore, 560012, India, ou_persistent22
(3) Language Development Department, MPI for Psycholinguistics, Max Planck Society, ou_2340691

Content

Free keywords: speaker diarization, language acquisition, spontaneous speech, i-vectors
 Abstract: Speaker diarization (answering 'who spoke when') is a widely researched subject within speech technology. Numerous experiments have been run on datasets built from broadcast news, meeting data, and call centers—the task sometimes appears close to being solved. Much less work has begun to tackle the hardest diarization task of all: spontaneous conversations in real-world settings. Such diarization would be particularly useful for studies of language acquisition, where researchers investigate the speech children produce and hear in their daily lives. In this paper, we study audio gathered with a recorder worn by small children as they went about their normal days. As a result, each child was exposed to different acoustic environments with a multitude of background noises and a varying number of adults and peers. The inconsistency of speech and noise within and across samples poses a challenging task for speaker diarization systems, which we tackled via retraining and data augmentation techniques. We further studied sources of structured variation across raw audio files, including the impact of speaker type distribution, proportion of speech from children, and child age on diarization performance. We discuss the extent to which these findings might generalize to other samples of speech in the wild.

Details

Language(s): eng - English
 Dates: 2018-03-26, 2018-06-03, 2018-10
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: Peer
 Identifiers: DOI: 10.21437/Interspeech.2018-2078
 Degree: -

Event

Title: Interspeech 2018
Place of Event: Hyderabad, India
Start-/End Date: 2018-09-02 - 2018-09-05

Source 1

Title: Proceedings of Interspeech 2018
Source Genre: Proceedings
Publ. Info: -
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 2583 - 2587
Identifier: -