Factoring Out Prior Knowledge from Low-Dimensional Embeddings

Heiter, Edith; Fischer, Jonas; Vreeken, Jilles

Please note that a newer version of this item is available:
https://pure.mpg.de/pubman/item/item_3289867_2

Heiter, E., Fischer, J., & Vreeken, J. (2021). Factoring Out Prior Knowledge from Low-Dimensional Embeddings. Retrieved from https://arxiv.org/abs/2103.01828.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-0008-16ED-5
Version Permalink: https://hdl.handle.net/21.11116/0000-0008-16EE-4

Genre: Paper

Files

arXiv:2103.01828.pdf (Preprint), 13MB

File Permalink:
https://hdl.handle.net/21.11116/0000-0008-16EF-3

Name:
arXiv:2103.01828.pdf

Description:
File downloaded from arXiv at 2021-03-04 10:38

OA-Status:

Visibility:
Public

MIME-Type / Checksum:
application/pdf / [MD5]

Copyright Date:
-

Copyright Info:
-

License:
http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Creators

Creators:
Heiter, Edith¹, Author
Fischer, Jonas², Author
Vreeken, Jilles¹, Author

Affiliations:
¹ External Organizations, ou_persistent22
² Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

Free keywords: Computer Science, Learning, cs.LG; Statistics, Machine Learning, stat.ML

Abstract: Low-dimensional embedding techniques such as tSNE and UMAP make it possible to visualize
high-dimensional data and thereby facilitate the discovery of interesting
structure. Although they are widely used, they visualize data as is, rather
than in light of the background knowledge we have about the data. What we
already know, however, strongly determines what is novel and hence interesting.
In this paper we propose two methods for factoring out prior knowledge in the
form of distance matrices from low-dimensional embeddings. To factor out prior
knowledge from tSNE embeddings, we propose JEDI, which adapts the tSNE objective
in a principled way using Jensen-Shannon divergence. To factor out prior
knowledge from any downstream embedding approach, we propose CONFETTI, in which
we directly operate on the input distance matrices. Extensive experiments on
both synthetic and real-world data show that both methods work well, providing
embeddings that exhibit meaningful structure that would otherwise remain
hidden.

Details

Language(s): eng - English

Dates: Created: 2021-03-02; Published Online: 2021

Publication Status: Published online

Pages: 27 p.

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: arXiv: 2103.01828
URI: https://arxiv.org/abs/2103.01828
BibTex Citekey: heiter:21:factoring

Degree: -
