
Record


Released

Conference Paper

Giving robots a voice: Human-in-the-loop voice creation and open-ended labeling

MPG Authors

van Rijn, Pol
Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Max Planck Society


Jacoby, Nori
Research Group Computational Auditory Perception, Max Planck Institute for Empirical Aesthetics, Max Planck Society

External Resources
No external resources are available for this record.
Full Texts (restricted access)
No full texts are currently released for your IP range.
Full Texts (freely accessible)

24-cap-jac-03-giving.pdf
(publisher version), 15 MB

Supplementary Material (freely accessible)
No freely accessible supplementary materials are available.
Citation

van Rijn, P., Mertes, S., Janowski, K., Weitz, K., Jacoby, N., & André, E. (2024). Giving robots a voice: Human-in-the-loop voice creation and open-ended labeling. In F. F. Mueller, P. Kyburz, J. R. Williamson, C. Sas, M. L. Wilson, P. T. Dugas, et al. (Eds.), CHI '24: Proceedings of the CHI Conference on Human Factors in Computing Systems (pp. 1-34). doi:10.1145/3613904.3642038.


Citation link: https://hdl.handle.net/21.11116/0000-000F-600C-8
Abstract
Speech is a natural interface for humans to interact with robots. Yet, aligning a robot’s voice to its appearance is challenging due to the rich vocabulary of both modalities. Previous research has explored a few labels to describe robots and tested them on a limited number of robots and existing voices. Here, we develop a robot-voice creation tool followed by large-scale behavioral human experiments (N=2,505). First, participants collectively tune robotic voices to match 175 robot images using an adaptive human-in-the-loop pipeline. Then, participants describe their impression of the robot or their matched voice using another human-in-the-loop paradigm for open-ended labeling. The elicited taxonomy is then used to rate robot attributes and to predict the best voice for an unseen robot. We offer a web interface to aid engineers in customizing robot voices, demonstrating the synergy between cognitive science and machine learning for engineering tools.