Information encoding by deep neural networks: what can we learn?

Ten Bosch, Louis; Boves, L.

doi:10.21437/Interspeech.2018-1896

Datensatz

DATENSATZ AKTIONENEXPORT

Zur Ablage hinzufügen

Lokale TagsFreigabegeschichteDetailsÜbersicht

Freigegeben

Konferenzbeitrag

Information encoding by deep neural networks: what can we learn?

MPG-Autoren

/persons/resource/persons73147

Ten Bosch, Louis
Centre for Language Studies, Radboud University;
Other Research, MPI for Psycholinguistics, Max Planck Society;

Externe Ressourcen

Es sind keine externen Ressourcen hinterlegt

Volltexte (beschränkter Zugriff)

Für Ihren IP-Bereich sind aktuell keine Volltexte freigegeben.

Volltexte (frei zugänglich)

TenBosch_Boves_2018_Information encoding by deep neural networks.pdf
(Verlagsversion), 523KB

Ergänzendes Material (frei zugänglich)

Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar

Zitation

Ten Bosch, L., & Boves, L. (2018). Information encoding by deep neural networks: what can we learn? In Proceedings of Interspeech 2018 (pp. 1457-1461). doi:10.21437/Interspeech.2018-1896.

Zitierlink: https://hdl.handle.net/21.11116/0000-0004-9482-1

Zusammenfassung

The recent advent of deep learning techniques in speech tech-nology and in particular in automatic speech recognition hasyielded substantial performance improvements. This suggeststhat deep neural networks (DNNs) are able to capture structurein speech data that older methods for acoustic modeling, suchas Gaussian Mixture Models and shallow neural networks failto uncover. In image recognition it is possible to link repre-sentations on the first couple of layers in DNNs to structuralproperties of images, and to representations on early layers inthe visual cortex. This raises the question whether it is possi-ble to accomplish a similar feat with representations on DNNlayers when processing speech input. In this paper we presentthree different experiments in which we attempt to untanglehow DNNs encode speech signals, and to relate these repre-sentations to phonetic knowledge, with the aim to advance con-ventional phonetic concepts and to choose the topology of aDNNs more efficiently. Two experiments investigate represen-tations formed by auto-encoders. A third experiment investi-gates representations on convolutional layers that treat speechspectrograms as if they were images. The results lay the basisfor future experiments with recursive networks.