English
 
Help Privacy Policy Disclaimer
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT
  Text-image synergy for multimodal retrieval and annotation

Chowdhury, S. N. (2021). Text-image synergy for multimodal retrieval and annotation. PhD Thesis, Universität des Saarlandes, Saarbrücken. doi:10.22028/D291-34509.

Item is

Files

show Files

Locators

show
hide
Description:
-
OA-Status:

Creators

show
hide
 Creators:
Chowdhury, Sreyasi Nag1, 2, Author           
Weikum, Gerhard3, Referee           
de Melo, Gerard3, Referee           
Berberich, Klaus3, Referee           
Affiliations:
1Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society, ou_1116547              
2International Max Planck Research School, MPI for Informatics, Max Planck Society, Campus E1 4, 66123 Saarbrücken, DE, ou_1116551              
3Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018              

Content

show
hide
Free keywords: image retrieval image-text alignment image captioning commonsense knowledge
 Abstract: Text and images are the two most common data modalities found on the Internet. Understanding the synergy between text and images, that is, seamlessly analyzing information from these modalities may be trivial for humans, but is challenging for software systems. In this dissertation we study problems where deciphering text-image synergy is crucial for finding solutions. We propose methods and ideas that establish semantic connections between text and images in multimodal contents, and empirically show their effectiveness in four interconnected problems: Image Retrieval, Image Tag Refinement, Image-Text Alignment, and Image Captioning. Our promising results and observations open up interesting scopes for future research involving text-image data understanding.Text and images are the two most common data modalities found on the Internet. Understanding the synergy between text and images, that is, seamlessly analyzing information from these modalities may be trivial for humans, but is challenging for software systems. In this dissertation we study problems where deciphering text-image synergy is crucial for finding solutions. We propose methods and ideas that establish semantic connections between text and images in multimodal contents, and empirically show their effectiveness in four interconnected problems: Image Retrieval, Image Tag Refinement, Image-Text Alignment, and Image Captioning. Our promising results and observations open up interesting scopes for future research involving text-image data understanding.

Details

show
hide
Language(s): eng - English
 Dates: 2021-06-2820212021
 Publication Status: Published in print
 Pages: 131 p.
 Publishing info: Saarbrücken : Universität des Saarlandes
 Table of Contents: -
 Rev. Type: -
 Identifiers: BibTex Citekey: Chowphd2021
DOI: 10.22028/D291-34509
 Degree: PhD

Event

show

Legal Case

show

Project information

show

Source

show