  VISIR: Visual and Semantic Image Label Refinement

Nag Chowdhury, S., Tandon, N., Ferhatosmanoglu, H., & Weikum, G. (2019). VISIR: Visual and Semantic Image Label Refinement. Retrieved from http://arxiv.org/abs/1909.00741.


Files

arXiv:1909.00741.pdf (Preprint), 3MB
Name: arXiv:1909.00741.pdf
Description: File downloaded from arXiv at 2020-01-21 10:18
OA-Status: -
Visibility: Public
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata:
Copyright Date: -
Copyright Info: -

Creators

Nag Chowdhury, Sreyasi (1), Author
Tandon, Niket (2), Author
Ferhatosmanoglu, Hakan (2), Author
Weikum, Gerhard (1), Author
Affiliations:
(1) Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018
(2) External Organizations, ou_persistent22

Content

Free keywords: Computer Science, Multimedia, cs.MM; Computer Science, Computer Vision and Pattern Recognition, cs.CV; Computer Science, Information Retrieval, cs.IR
Abstract: The social media explosion has populated the Internet with a wealth of images. There are two existing paradigms for image retrieval: 1) content-based image retrieval (CBIR), which has traditionally used visual features for similarity search (e.g., SIFT features), and 2) tag-based image retrieval (TBIR), which has relied on user tagging (e.g., Flickr tags). CBIR now gains semantic expressiveness through advances in deep-learning-based detection of visual labels. TBIR benefits from query-and-click logs to automatically infer more informative labels. However, learning-based tagging still yields noisy labels and is restricted to concrete objects, missing out on generalizations and abstractions. Click-based tagging is limited to terms that appear in the textual context of an image or in queries that lead to a click. This paper addresses the above limitations by semantically refining and expanding the labels suggested by learning-based object detection. We consider the semantic coherence between the labels for different objects, leverage lexical and commonsense knowledge, and cast the label assignment into a constrained optimization problem solved by an integer linear program. Experiments show that our method, called VISIR, improves the quality of state-of-the-art visual labeling tools like LSDA and YOLO.
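
The ILP formulation mentioned in the abstract can be illustrated with a small toy sketch. This is not the paper's actual model: the candidate labels, the confidence and coherence scores, the exactly-one-label constraint, and the use of the PuLP solver are all assumptions made for illustration. Binary variables pick one label per detected object, and pairwise label co-selection is linearized in the standard way so that semantic coherence between chosen labels can enter a linear objective.

# Toy sketch (not the paper's exact formulation): label refinement as an ILP.
# Requires the PuLP package: pip install pulp
from itertools import combinations
from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum, value

# Hypothetical inputs: detector confidences per (object, candidate label) and
# pairwise semantic-coherence scores between labels (values are made up).
confidence = {
    ("obj1", "dog"): 0.9, ("obj1", "wolf"): 0.4,
    ("obj2", "leash"): 0.7, ("obj2", "snake"): 0.5,
}
coherence = {
    ("dog", "leash"): 0.8, ("dog", "snake"): 0.1,
    ("wolf", "leash"): 0.1, ("wolf", "snake"): 0.3,
}

objects = sorted({o for o, _ in confidence})
labels = {o: [l for (oo, l) in confidence if oo == o] for o in objects}

prob = LpProblem("label_refinement", LpMaximize)

# x[o, l] = 1 iff label l is assigned to object o.
x = {(o, l): LpVariable(f"x_{o}_{l}", cat=LpBinary) for (o, l) in confidence}

# y[o1, l1, o2, l2] = 1 iff both assignments are selected (linearized product).
y = {(o1, l1, o2, l2): LpVariable(f"y_{o1}_{l1}_{o2}_{l2}", cat=LpBinary)
     for o1, o2 in combinations(objects, 2)
     for l1 in labels[o1] for l2 in labels[o2]}

# Objective: detector confidence plus coherence of jointly selected labels.
prob += (lpSum(confidence[o, l] * x[o, l] for (o, l) in x)
         + lpSum(coherence.get((l1, l2), coherence.get((l2, l1), 0.0)) * yv
                 for (o1, l1, o2, l2), yv in y.items()))

# Each object receives exactly one label (an assumption; the actual model may
# also allow dropping labels).
for o in objects:
    prob += lpSum(x[o, l] for l in labels[o]) == 1

# Standard AND-linearization tying each y to its two x variables.
for (o1, l1, o2, l2), yv in y.items():
    prob += yv <= x[o1, l1]
    prob += yv <= x[o2, l2]
    prob += yv >= x[o1, l1] + x[o2, l2] - 1

prob.solve()
print({o: l for (o, l) in x if value(x[o, l]) == 1})

With these made-up scores the solver prefers the coherent pair ("dog", "leash") over the individually plausible but incoherent alternatives, which is the intuition behind coupling per-object confidence with cross-object coherence.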

Details

Language(s): eng - English
 Dates: 2019-09-02, 2019
 Publication Status: Published online
 Pages: 9 p.
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: arXiv: 1909.00741
URI: http://arxiv.org/abs/1909.00741
BibTex Citekey: Nag_arXiv1909.00741
 Degree: -
