  Exploring Portability of Data Programming Paradigm

Rakhmanberdieva, N. (2017). Exploring Portability of Data Programming Paradigm. Master Thesis, Universität des Saarlandes, Saarbrücken.

Basic data

Genre: Thesis

Files

2017_MSc Thesis Rakhmanberdieva, Nurzat.pdf (any full text), 2 MB
File permalink: -
Name: 2017_MSc Thesis Rakhmanberdieva, Nurzat.pdf
Description: -
OA status:
Visibility: Restricted (Max Planck Institute for Informatics, MSIN)
MIME type / checksum: application/pdf
Technical metadata:
Copyright date: -
Copyright info: -
License: -

Creators

Creators:
Rakhmanberdieva, Nurzat (1), Author
Klakow, Dietrich (2), Advisor
Berberich, Klaus (3), Referee
Affiliations:
(1) International Max Planck Research School, MPI for Informatics, Max Planck Society, ou_1116551
(2) External Organizations, ou_persistent22
(3) Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Details

Language(s): eng - English
Date: 2017-11-30, 2017
Publication status: Published
Pages: 83 p.
Place, publisher, edition: Saarbrücken : Universität des Saarlandes
Table of contents: Machine Learning methods, especially Deep Learning, have achieved enormous breakthroughs in Natural Language Processing and Computer Vision. They show remarkable performance on complex problems with minimal human intervention when large amounts of labeled data are available. The hardest part is labeling large quantities of unlabeled data, which is time-consuming, expensive, and requires expert knowledge. The Data Programming Paradigm, introduced at NIPS 2016, proposes a method based on labeling functions: a set of heuristic rules that produce large but noisy training data, which is later denoised by a generative model of these labeling functions.
In this thesis, we explored the portability of the Data Programming Paradigm to new domains. We applied it to sequence labeling, also known as Slot-filling, for Spoken Language Understanding and Named Entity Extraction. First, to allow these tasks to be included in the pipeline, we modified the initial data processing and candidate generation steps of the model. Second, we introduced a new type of labeling function to test the hypothesis that "lightly" trained models can serve as solid labeling functions in combination with other functions. In this context, "lightly" trained models are Deep Learning methods, such as Convolutional and Recurrent Neural Networks, that are trained on a small subset of the data. Third, we described strategies for implementing and selecting optimal labeling functions. Finally, we showed that the Data Programming Paradigm can be successfully extended to such tasks and outperforms its counterparts on noisy data. The experimental results for Slot-filling showed that on clean data, the Data Programming Paradigm achieved an F1 score 5.9 points higher than the baseline, while on noisy data it outperformed counterparts such as Conditional Random Fields by a factor of two. We evaluated the model on benchmarks such as the Air Travel Information System dataset and SAP-related datasets.
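The labeling-function idea described in the abstract can be illustrated with a minimal sketch for token-level slot filling. Everything in it is an assumption made for illustration: the function names, the slot labels, and the word lists are hypothetical and not taken from the thesis, and simple majority voting stands in for the generative model that data programming uses to weight and denoise the labeling functions.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion

# Hypothetical word lists for an air-travel utterance (illustrative only).
CITIES = {"boston", "denver", "seattle"}
DAYS = {"monday", "tuesday", "friday"}


def lf_city_gazetteer(tokens, i):
    """Vote CITY when the token appears in a small city list."""
    return "CITY" if tokens[i].lower() in CITIES else ABSTAIN


def lf_after_to_or_from(tokens, i):
    """Vote CITY when the previous token is 'to' or 'from'."""
    if i > 0 and tokens[i - 1].lower() in {"to", "from"}:
        return "CITY"
    return ABSTAIN


def lf_day_gazetteer(tokens, i):
    """Vote DAY when the token is a known weekday name."""
    return "DAY" if tokens[i].lower() in DAYS else ABSTAIN


LABELING_FUNCTIONS = [lf_city_gazetteer, lf_after_to_or_from, lf_day_gazetteer]


def label_tokens(tokens, default="O"):
    """Combine labeling-function votes per token by simple majority.

    Majority vote is a stand-in here; it is NOT the generative model
    that data programming uses to denoise the labeling functions.
    """
    labels = []
    for i in range(len(tokens)):
        votes = [lf(tokens, i) for lf in LABELING_FUNCTIONS]
        votes = [v for v in votes if v is not ABSTAIN]
        labels.append(Counter(votes).most_common(1)[0][0] if votes else default)
    return labels


tokens = "flights from Boston to Denver on Friday".split()
print(list(zip(tokens, label_tokens(tokens))))
```

Under the same assumptions, a "lightly" trained model would be one more entry in `LABELING_FUNCTIONS`: a function that returns the model's predicted slot for the token, or abstains when the prediction is not confident enough.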
Review method: -
Identifiers: BibTeX citekey: RakhmanberdievaMaster2018
Degree type: Master
