Paper

TiFi: Taxonomy Induction for Fictional Domains [Extended version]

MPS-Authors

Chu, Cuong Xuan
Databases and Information Systems, MPI for Informatics, Max Planck Society

Razniewski, Simon
Databases and Information Systems, MPI for Informatics, Max Planck Society

Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society

Fulltext (public)

arXiv:1901.10263.pdf
(Preprint), 2MB

Citation

Chu, C. X., Razniewski, S., & Weikum, G. (2019). TiFi: Taxonomy Induction for Fictional Domains [Extended version]. Retrieved from http://arxiv.org/abs/1901.10263.


Cite as: https://hdl.handle.net/21.11116/0000-0003-FE67-C
Abstract
Taxonomies are important building blocks of structured knowledge bases, and
their construction from text sources and Wikipedia has received much attention.
In this paper we focus on the construction of taxonomies for fictional domains,
using noisy category systems from fan wikis or text extraction as input. Such
fictional domains are archetypes of entity universes that are poorly covered by
Wikipedia; other examples include enterprise-specific knowledge bases and highly
specialized verticals. Our fiction-targeted approach, called TiFi, consists of
three phases: (i) category cleaning, by identifying candidate categories that
truly represent classes in the domain of interest, (ii) edge cleaning, by
selecting subcategory relationships that correspond to class subsumption, and
(iii) top-level construction, by mapping classes onto a subset of high-level
WordNet categories. A comprehensive evaluation shows that TiFi is able to
construct taxonomies for a diverse range of fictional domains such as Lord of
the Rings, The Simpsons or Greek Mythology with very high precision and that it
outperforms state-of-the-art baselines for taxonomy induction by a substantial
margin.
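
The three phases named in the abstract can be pictured as a small pipeline over a category graph. The Python sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names, the toy "ends with s" class test, the accept-everything edge test, and the hand-written top-level mapping are all hypothetical stand-ins for whatever models and WordNet mapping the paper actually uses.

```python
from typing import Callable, Optional, Set, Tuple

Edge = Tuple[str, str]  # (child category, parent category)


def clean_categories(categories: Set[str],
                     is_class: Callable[[str], bool]) -> Set[str]:
    """Phase (i): keep only categories judged to represent classes."""
    return {c for c in categories if is_class(c)}


def clean_edges(edges: Set[Edge], classes: Set[str],
                is_subsumption: Callable[[str, str], bool]) -> Set[Edge]:
    """Phase (ii): keep edges between surviving classes that express subsumption."""
    return {
        (child, parent)
        for child, parent in edges
        if child in classes and parent in classes and is_subsumption(child, parent)
    }


def add_top_level(classes: Set[str], edges: Set[Edge],
                  map_to_top: Callable[[str], Optional[str]]) -> Set[Edge]:
    """Phase (iii): attach each parentless class to a high-level category."""
    has_parent = {child for child, _ in edges}
    top_edges = set()
    for root in classes - has_parent:
        target = map_to_top(root)
        if target is not None:
            top_edges.add((root, target))
    return edges | top_edges


# Toy input, invented for illustration (not data from the paper).
categories = {"Hobbits", "Elves", "Characters", "Battles", "Third Age"}
edges = {
    ("Hobbits", "Characters"),
    ("Elves", "Characters"),
    ("Characters", "Third Age"),
    ("Battles", "Third Age"),
}

# Hypothetical stand-ins for the real decision procedures.
is_class = lambda name: name.endswith("s")      # crude plural-noun heuristic
is_subsumption = lambda child, parent: True     # accept every remaining edge
top_level = {"Characters": "person", "Battles": "event"}  # assumed mapping

classes = clean_categories(categories, is_class)
kept = clean_edges(edges, classes, is_subsumption)
taxonomy = add_top_level(classes, kept, top_level.get)
```

On this toy input, "Third Age" is dropped as a non-class in phase (i), its incoming edges disappear in phase (ii), and the two remaining roots, "Characters" and "Battles", are attached to the assumed top-level categories in phase (iii).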