  AligNarr: Aligning Narratives of Different Length for Movie Summarization

Abouhamra, M. (2019). AligNarr: Aligning Narratives of Different Length for Movie Summarization. Master Thesis, Universität des Saarlandes, Saarbrücken.

Files

Master_Thesis_Mostafa_Abouhamra.pdf (Any fulltext), 941KB
 
File Permalink:
-
Name:
Master_Thesis_Mostafa_Abouhamra.pdf
Description:
-
OA-Status:
Visibility:
Restricted (Max Planck Institute for Informatics, MSIN; )
MIME-Type / Checksum:
application/pdf
Technical Metadata:
Copyright Date:
-
Copyright Info:
-
License:
-

Creators

 Creators:
Abouhamra, Mostafa (1, 2), Author
Weikum, Gerhard (1), Advisor
Affiliations:
(1) Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018
(2) International Max Planck Research School, MPI for Informatics, Max Planck Society, Campus E1 4, 66123 Saarbrücken, DE, ou_1116551

Content

Free keywords: -
 Abstract: Automatic text alignment is an important problem in natural language processing. It can be used to create the data needed to train different language models. Most research on automatic summarization revolves around summarizing news articles or scientific papers, which are relatively short texts with a simple, clear structure. The bigger the difference in size between the summary and the original text, the harder the problem becomes, since important information is sparser and harder to identify. Therefore, creating datasets from larger texts can help improve automatic summarization.
In this project, we try to develop an algorithm that can automatically create a dataset for abstractive automatic summarization of bigger narrative text bodies such as movie scripts. To this end, we chose sentences as summary text units and scenes as script text units, and developed an algorithm that uses some of the latest natural language processing techniques to align scenes and sentences based on the similarity of their meanings.
Solving this alignment problem can provide us with important information about how to evaluate the meaning of a text, which can help us create better abstractive summarization models. We developed a method that uses different similarity scoring techniques (embedding similarity, word inclusion and entity inclusion) to align script scenes and summary sentences; it achieved an F1 score of 0.39. Analyzing our results showed that the bigger the difference in the number of text units being aligned, the more difficult the alignment problem is. We also critiqued our own similarity scoring techniques and different alignment algorithms based on integer linear programming and local optimization, showed their limitations, and discussed ideas to improve them.
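
As a rough illustration of the kind of alignment the abstract describes, the following Python sketch scores each summary sentence against each scene with a simple word-inclusion measure, picks the best-scoring scene per sentence, and evaluates the result with F1 over aligned pairs. This record does not give the thesis's actual definitions, so everything here is an assumption: the names `word_inclusion`, `align` and `f1`, the tokenization, the greedy best-match strategy, and the `threshold` value are all hypothetical, and embedding similarity, entity inclusion, and the integer linear programming variant are not shown.

```python
import re


def tokens(text):
    """Return the set of lowercased word tokens in a text unit."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def word_inclusion(sentence, scene):
    """Fraction of the sentence's distinct words that also occur in the scene.

    Assumed definition for illustration; the thesis may define it differently.
    """
    sentence_words = tokens(sentence)
    if not sentence_words:
        return 0.0
    return len(sentence_words & tokens(scene)) / len(sentence_words)


def align(sentences, scenes, threshold=0.5):
    """Greedily map each sentence index to its best-scoring scene index.

    A sentence maps to None if there are no scenes or if its best score
    falls below the (hypothetical) threshold.
    """
    alignment = {}
    for i, sentence in enumerate(sentences):
        scores = [word_inclusion(sentence, scene) for scene in scenes]
        if not scores:
            alignment[i] = None
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        alignment[i] = best if scores[best] >= threshold else None
    return alignment


def f1(predicted, gold):
    """F1 score over two sets of (sentence_index, scene_index) pairs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data invented for this example, not taken from the thesis.
    sentences = ["The detective finds the letter", "The ship sinks at night"]
    scenes = [
        "INT. OFFICE. The detective opens a drawer and finds the letter.",
        "EXT. SEA - NIGHT. The ship sinks slowly.",
        "INT. BAR. Two strangers talk.",
    ]
    result = align(sentences, scenes)
    predicted = {(i, j) for i, j in result.items() if j is not None}
    print(result, f1(predicted, {(0, 0), (1, 1)}))
```

A greedy per-sentence match like this ignores scene order and lets several sentences claim the same scene. A global formulation, such as the integer linear programming approach mentioned in the abstract, could in principle enforce constraints of that kind.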

Details

Language(s): eng - English
 Dates: 2019
 Publication Status: Issued
 Pages: 54 p.
 Publishing info: Saarbrücken : Universität des Saarlandes
 Table of Contents: -
 Rev. Type: -
 Identifiers: BibTex Citekey: AbouhamraMSc2019
 Degree: Master
