A Multimodal Corpus of Expert Gaze and Behavior during Phonetic Segmentation Tasks

Khan, Arif; Steiner, Ingmar; Sugano, Yusuke; Bulling, Andreas; Macdonald, Ross

Please note that a newer version of this record is available:
https://pure.mpg.de/pubman/item/item_2535108_5

Released

Research paper

MPG Authors

Bulling, Andreas
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

External Resources

No external resources have been provided

Full Texts (open access)

arXiv:1712.04798.pdf
(Preprint), 490KB

Supplementary Material (open access)

No openly accessible supplementary material is available

Citation

Khan, A., Steiner, I., Sugano, Y., Bulling, A., & Macdonald, R. (2017). A Multimodal Corpus of Expert Gaze and Behavior during Phonetic Segmentation Tasks. Retrieved from http://arxiv.org/abs/1712.04798.

Citation link: https://hdl.handle.net/21.11116/0000-0000-403F-2

Abstract

Phonetic segmentation is the process of splitting speech into distinct phonetic units. Human experts routinely perform this task manually, analyzing auditory and visual cues in analysis software, which is an extremely time-consuming process. Methods exist for automatic segmentation, but these are not always accurate enough. In order to improve automatic segmentation, we need to model it as closely as possible on manual segmentation. This corpus is an effort to capture human segmentation behavior by recording experts performing a segmentation task. We believe that this data will enable us to highlight the important aspects of manual segmentation, which can then be used to improve the accuracy of automatic segmentation.