VISIR: Visual and Semantic Image Label Refinement


Nag Chowdhury, Sreyasi
Databases and Information Systems, MPI for Informatics, Max Planck Society;


Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;


Nag Chowdhury, S., Tandon, N., Ferhatosmanoglu, H., & Weikum, G. (2019). VISIR: Visual and Semantic Image Label Refinement. Retrieved from http://arxiv.org/abs/1909.00741.

Cite as: https://hdl.handle.net/21.11116/0000-0005-83CE-F

The social media explosion has populated the Internet with a wealth of
images. There are two existing paradigms for image retrieval: 1) content-based
image retrieval (CBIR), which has traditionally used visual features for
similarity search (e.g., SIFT features), and 2) tag-based image retrieval
(TBIR), which has relied on user tagging (e.g., Flickr tags). CBIR now gains
semantic expressiveness through advances in deep-learning-based detection of
visual labels. TBIR benefits from query-and-click logs to automatically infer more
informative labels. However, learning-based tagging still yields noisy labels
and is restricted to concrete objects, missing out on generalizations and
abstractions. Click-based tagging is limited to terms that appear in the
textual context of an image or in queries that lead to a click. This paper
addresses the above limitations by semantically refining and expanding the
labels suggested by learning-based object detection. We consider the semantic
coherence between the labels for different objects, leverage lexical and
commonsense knowledge, and cast the label assignment into a constrained
optimization problem solved by an integer linear program. Experiments show that
our method, called VISIR, improves the quality of the labels produced by
state-of-the-art visual labeling tools such as LSDA and YOLO.
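
The idea of casting label assignment as a constrained 0/1 optimization can be illustrated with a toy sketch. This is not the VISIR formulation from the paper: the candidate labels, confidences, relatedness scores, the "exactly one label per box" constraint, and all function names below are hypothetical, and the tiny program is solved by exhaustive enumeration instead of an ILP solver. In a real ILP the pairwise coherence term would have to be linearized with auxiliary variables.

```python
from itertools import product

# Hypothetical detector output: (bounding box id, label, confidence).
# All numbers are made up for illustration.
candidates = [
    (0, "tennis racket", 0.60),
    (0, "guitar", 0.65),
    (1, "tennis ball", 0.70),
    (1, "orange", 0.55),
]

# Hypothetical symmetric relatedness scores between labels, standing in
# for lexical or commonsense knowledge.
relatedness = {
    frozenset(("tennis racket", "tennis ball")): 0.9,
    frozenset(("guitar", "orange")): 0.1,
}


def coherence(a, b):
    """Relatedness of two labels; unknown pairs score 0."""
    return relatedness.get(frozenset((a, b)), 0.0)


def objective(x):
    """Visual confidence of the kept labels plus their pairwise coherence."""
    chosen = [c for c, keep in zip(candidates, x) if keep]
    visual = sum(conf for _, _, conf in chosen)
    semantic = sum(
        coherence(chosen[i][1], chosen[j][1])
        for i in range(len(chosen))
        for j in range(i + 1, len(chosen))
    )
    return visual + semantic


def feasible(x):
    """Constraint: exactly one label is kept for every bounding box."""
    boxes = {box for box, _, _ in candidates}
    return all(
        sum(keep for (b, _, _), keep in zip(candidates, x) if b == box) == 1
        for box in boxes
    )


def refine_labels():
    """Enumerate all 0/1 assignments and return the best feasible label set."""
    assignments = product((0, 1), repeat=len(candidates))
    best = max((x for x in assignments if feasible(x)), key=objective)
    return sorted(label for (_, label, _), keep in zip(candidates, best) if keep)


print(refine_labels())  # ['tennis ball', 'tennis racket']
```

In this toy input the detector alone would prefer "guitar" (0.65) over "tennis racket" (0.60) for box 0, but the coherence with "tennis ball" flips the decision. Enumeration is only workable for a handful of candidates; larger instances would presumably be handed to an ILP solver.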