  We Are Not Your Real Parents: Telling Causal from Confounded using MDL

Kaltenpoth, D., & Vreeken, J. (2019). We Are Not Your Real Parents: Telling Causal from Confounded using MDL. Retrieved from http://arxiv.org/abs/1901.06950.

Item Permalink: http://hdl.handle.net/21.11116/0000-0003-FFEE-3
Version Permalink: http://hdl.handle.net/21.11116/0000-0003-FFEF-2
Genre: Paper

Files

arXiv:1901.06950.pdf (Preprint), 701KB
Name: arXiv:1901.06950.pdf
Description: File downloaded from arXiv at 2019-07-09 14:10
Visibility: Public
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata:
Copyright Date: -
Copyright Info: -

Creators

 Creators:
Kaltenpoth, David [1], Author
Vreeken, Jilles [1], Author
Affiliations:
[1] Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

Free keywords: Computer Science, Learning, cs.LG, Statistics, Machine Learning, stat.ML
Abstract: Given data over variables $(X_1, \ldots, X_m, Y)$ we consider the problem of finding out whether $X$ jointly causes $Y$ or whether they are all confounded by an unobserved latent variable $Z$. To do so, we take an information-theoretic approach based on Kolmogorov complexity. In a nutshell, we follow the postulate that first encoding the true cause, and then the effects given that cause, results in a shorter description than any other encoding of the observed variables. The ideal score is not computable, and hence we have to approximate it. We propose to do so using the Minimum Description Length (MDL) principle. We compare the MDL scores under the model where $X$ causes $Y$ and the model where there exists a latent variable $Z$ confounding both $X$ and $Y$, and show that our scores are consistent. To find potential confounders we propose using latent factor modeling, in particular probabilistic PCA (PPCA). Empirical evaluation on both synthetic and real-world data shows that our method, CoCa, performs very well, even when the true generating process of the data is far from the assumptions made by the models we use. Moreover, it is robust, as its accuracy goes hand in hand with its confidence.
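
The comparison described in the abstract can be illustrated with a rough sketch. This is not the CoCa implementation and does not use the paper's MDL scores: it swaps in a crude two-part code length (Gaussian negative log-likelihood in bits plus a BIC-style parameter cost). The function names, the choice of independent Gaussian $X$ in the causal model, the linear-Gaussian $Y$ given $X$, and the latent dimension `q` are all assumptions made for illustration only.

```python
import numpy as np


def _penalty_bits(num_params, n):
    # BIC-style cost of the parameters (an assumption, not the paper's score).
    return 0.5 * num_params * np.log2(n)


def _gaussian_bits(var, n):
    # Negative log-likelihood, in bits, of n samples under a univariate
    # Gaussian whose variance is the maximum-likelihood estimate `var`.
    return 0.5 * n * (np.log(2 * np.pi * var) + 1.0) / np.log(2)


def causal_bits(X, y):
    """Hypothetical causal score: encode X first, then Y given X."""
    n, m = X.shape
    # Assumption: each X_j is an independent Gaussian.
    bits = sum(_gaussian_bits(X[:, j].var(), n) for j in range(m))
    # Assumption: Y is linear in X with Gaussian noise.
    A = np.column_stack([X, np.ones(n)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    bits += _gaussian_bits(resid.var(), n)
    num_params = 2 * m + (m + 1) + 1  # X means/variances, weights, noise
    return bits + _penalty_bits(num_params, n)


def confounded_bits(X, y, q=1):
    """Hypothetical confounded score: maximum-likelihood PPCA with q latent
    dimensions fitted to the joint data (X, Y). Requires q < m + 1."""
    n = X.shape[0]
    D = np.column_stack([X, y])
    d = D.shape[1]
    S = np.cov(D, rowvar=False, bias=True)
    eig = np.sort(np.linalg.eigvalsh(S))[::-1]
    sigma2 = eig[q:].mean()  # ML noise variance: mean discarded eigenvalue
    logdet = np.log(eig[:q]).sum() + (d - q) * np.log(sigma2)
    # At the ML solution the trace term of the Gaussian likelihood equals d.
    nll_nats = 0.5 * n * (d * np.log(2 * np.pi) + logdet + d)
    num_params = d + d * q - q * (q - 1) // 2 + 1  # mean, loadings, noise
    return nll_nats / np.log(2) + _penalty_bits(num_params, n)


def decide(X, y, q=1):
    """Return the model with the shorter code and the margin in bits."""
    l_ca = causal_bits(X, y)
    l_co = confounded_bits(X, y, q)
    label = "causal" if l_ca < l_co else "confounded"
    return label, l_co - l_ca
```

Because the data are continuous, the individual code lengths can be negative; only their difference is meaningful here. The margin returned by `decide` is a stand-in for a confidence value, and the independence assumption on $X$ means this sketch can misjudge causal data whose $X_j$ are strongly correlated with each other.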

Details

Language(s): eng - English
 Dates: 2019-01-21; 2019
 Publication Status: Published online
 Pages: 10 p.
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: arXiv: 1901.06950
URI: http://arxiv.org/abs/1901.06950
BibTex Citekey: Kaltenpoth_arXiv1901.06950
 Degree: -
