We Are Not Your Real Parents: Telling Causal from Confounded using MDL

Kaltenpoth, David; Vreeken, Jilles

Kaltenpoth, D., & Vreeken, J. (2019). We Are Not Your Real Parents: Telling Causal from Confounded using MDL. Retrieved from http://arxiv.org/abs/1901.06950.

Item is released

Basic data

Record permalink: https://hdl.handle.net/21.11116/0000-0003-FFEE-3
Version permalink: https://hdl.handle.net/21.11116/0000-0003-FFEF-2

Genre: Research paper

Files

arXiv:1901.06950.pdf (Preprint), 701KB

File permalink:
https://hdl.handle.net/21.11116/0000-0003-FFF0-F

Name:
arXiv:1901.06950.pdf

Description:
File downloaded from arXiv at 2019-07-09 14:10

OA status:

Visibility:
Public

MIME type / checksum:
application/pdf / [MD5]

Copyright date:
-

Copyright info:
-

License:
http://arxiv.org/licenses/nonexclusive-distrib/1.0/

External references

Creators

Creators:
Kaltenpoth, David¹, Author
Vreeken, Jilles¹, Author

Affiliations:
¹ Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

Keywords: Computer Science, Learning, cs.LG, Statistics, Machine Learning, stat.ML

Abstract: Given data over variables $(X_1,...,X_m, Y)$ we consider the problem of
finding out whether $X$ jointly causes $Y$ or whether they are all confounded
by an unobserved latent variable $Z$. To do so, we take an
information-theoretic approach based on Kolmogorov complexity. In a nutshell,
we follow the postulate that first encoding the true cause, and then the
effects given that cause, results in a shorter description than any other
encoding of the observed variables.
The ideal score is not computable, and hence we have to approximate it. We
propose to do so using the Minimum Description Length (MDL) principle. We
compare the MDL scores under the models where $X$ causes $Y$ and where there
exists a latent variable $Z$ confounding both $X$ and $Y$, and show our scores
are consistent. To find potential confounders we propose using latent factor
modeling, in particular, probabilistic PCA (PPCA).
Empirical evaluation on both synthetic and real-world data shows that our
method, CoCa, performs very well -- even when the true generating process of
the data is far from the assumptions made by the models we use. Moreover, it is
robust as its accuracy goes hand in hand with its confidence.
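
The score comparison described in the abstract can be illustrated with a rough sketch. This is not the CoCa implementation: it assumes a crude two-part code in which each model costs its maximized Gaussian negative log-likelihood plus a BIC-style penalty of 0.5 * log(n) nats per parameter. The causal model is assumed to encode each $X_i$ as an independent Gaussian and then $Y$ by linear regression on $X$; the confounded model is assumed to be PPCA with k latent factors fitted jointly to $(X, Y)$, using the closed-form maximum-likelihood solution. The function names `causal_score`, `confounded_score`, and `verdict`, and the synthetic data, are hypothetical.

```python
import numpy as np


def gaussian_nll(residuals, var):
    """NLL in nats of zero-mean residual columns under N(0, var), one variance per column."""
    n = residuals.shape[0]
    return 0.5 * np.sum(n * np.log(2 * np.pi * var) + np.sum(residuals**2, axis=0) / var)


def causal_score(X, y):
    """Assumed causal code: independent Gaussian X_i, then Y as linear-Gaussian in X."""
    n, m = X.shape
    Xc = X - X.mean(axis=0)
    nll_x = gaussian_nll(Xc, Xc.var(axis=0))
    A = np.column_stack([X, np.ones(n)])            # design matrix with intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    res = (y - A @ coef)[:, None]
    nll_y = gaussian_nll(res, res.var(axis=0))
    n_params = 2 * m + (m + 2)                      # means, variances, weights, intercept, noise
    return nll_x + nll_y + 0.5 * n_params * np.log(n)


def confounded_score(X, y, k=1):
    """Assumed confounded code: PPCA with k latent factors over the joint (X, Y)."""
    D = np.column_stack([X, y])
    n, d = D.shape
    S = np.cov(D, rowvar=False, bias=True)          # maximum-likelihood covariance
    eigvals = np.linalg.eigvalsh(S)[::-1]           # eigenvalues, descending
    sigma2 = eigvals[k:].mean()                     # ML isotropic noise variance
    # At the PPCA optimum, log det C = sum(log top-k eigenvalues) + (d - k) log sigma2
    # and trace(C^-1 S) = d, so the NLL has a closed form.
    log_det = np.sum(np.log(eigvals[:k])) + (d - k) * np.log(sigma2)
    nll = 0.5 * n * (d * np.log(2 * np.pi) + log_det + d)
    n_params = d + d * k + 1                        # means, loadings, noise variance
    return nll + 0.5 * n_params * np.log(n)


def verdict(X, y, k=1):
    """Return the label of the model with the shorter (approximate) description."""
    return "causal" if causal_score(X, y) < confounded_score(X, y, k) else "confounded"


rng = np.random.default_rng(0)
n, m = 2000, 3

# Synthetic causal data: independent causes X, effect Y = w^T X + noise.
X_c = rng.normal(size=(n, m))
y_c = X_c @ np.array([1.0, -2.0, 0.5]) + 0.5 * rng.normal(size=n)

# Synthetic confounded data: one latent Z drives every X_i and Y.
z = rng.normal(size=n)
X_z = np.outer(z, [1.0, -1.5, 0.8]) + 0.5 * rng.normal(size=(n, m))
y_z = 2.0 * z + 0.5 * rng.normal(size=n)

print("causal data:    ", verdict(X_c, y_c))
print("confounded data:", verdict(X_z, y_z))
```

The difference between the two scores can be read as a rough confidence: the larger the gap in nats, the more strongly the data favor one explanation. The penalty term and the independent-cause assumption are simplifications chosen for this sketch only.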

Details

Language(s): eng - English

Date: Created: 2019-01-21; Published online: 2019

Publication status: Published online

Pages: 10 p.

Place, publisher, edition: -

Table of contents: -

Review type: -

Identifiers: arXiv: 1901.06950
URI: http://arxiv.org/abs/1901.06950
BibTex Citekey: Kaltenpoth_arXiv1901.06950

Degree type: -
