
Released

Research Paper

Causal Inference on Multivariate Mixed-Type Data by Minimum Description Length

MPG Authors

Marx, Alexander
Databases and Information Systems, MPI for Informatics, Max Planck Society

Vreeken, Jilles
Databases and Information Systems, MPI for Informatics, Max Planck Society

External Resources
No external resources are available
Full Texts (freely accessible)

arXiv:1702.06385.pdf
(Preprint), 2MB

Supplementary Material (freely accessible)
No freely accessible supplementary material is available
Citation

Marx, A., & Vreeken, J. (2017). Causal Inference on Multivariate Mixed-Type Data by Minimum Description Length. Retrieved from http://arxiv.org/abs/1702.06385.


Citation link: http://hdl.handle.net/11858/00-001M-0000-002D-90EF-3
Abstract
Given data over the joint distribution of two univariate or multivariate random variables $X$ and $Y$ of mixed-type or single-type data, we consider the problem of inferring the most likely causal direction between $X$ and $Y$. We take an information-theoretic approach, from which it follows that first describing the data over the cause and then the data over the effect given the cause is shorter than describing them in the reverse direction. For practical inference, we propose a score for causal models for mixed-type data based on the Minimum Description Length (MDL) principle. In particular, we model dependencies between $X$ and $Y$ using classification and regression trees. Inferring the optimal model is NP-hard, and hence we propose Crack, a fast greedy algorithm to infer the most likely causal direction directly from the data. Empirical evaluation on synthetic, benchmark, and real-world data shows that Crack reliably and with high accuracy infers the correct causal direction on both univariate and multivariate cause-effect pairs over both single-type and mixed-type data.
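
The decision rule implied by the abstract can be sketched as a comparison of two two-part code lengths. The notation here is an assumption made for illustration and is not taken from the abstract: $L(\cdot)$ denotes an MDL code length in bits, and $\Delta$ denotes the total cost of describing the data in one direction.

$$\Delta_{X \to Y} = L(X) + L(Y \mid X), \qquad \Delta_{Y \to X} = L(Y) + L(X \mid Y)$$

Under this sketch, one would infer $X \to Y$ if $\Delta_{X \to Y} < \Delta_{Y \to X}$, infer $Y \to X$ if $\Delta_{Y \to X} < \Delta_{X \to Y}$, and leave the direction undecided if the two costs are equal. The conditional terms $L(Y \mid X)$ and $L(X \mid Y)$ are where the classification and regression trees mentioned in the abstract would come in; how the trees are encoded and how the greedy search proceeds is described in the paper itself, not here.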