Keywords:
Statistics, Machine Learning, stat.ML, Computer Science, Learning, cs.LG
Abstract:
Given data over the joint distribution of two random variables $X$ and $Y$,
each univariate or multivariate and of single or mixed type, we consider the
problem of inferring the most likely causal direction between $X$ and $Y$. We
take an information-theoretic approach, from which it follows that first
describing the data over the cause and then the data over the effect given the
cause is shorter than describing the data in the reverse direction.
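One common way to write this claim down, as a sketch rather than the paper's
exact statement, assumes that $K(\cdot)$ denotes Kolmogorov complexity (a symbol
the abstract itself does not use). If $X$ causes $Y$, then, up to an additive
constant,

$$
K(P(X)) + K(P(Y \mid X)) \le K(P(Y)) + K(P(X \mid Y)).
$$

Because $K$ is not computable, an MDL-based score would replace it with code
lengths under a fixed model class. Writing $L(\cdot)$ for such a code length
(again an assumed symbol), one compares $L(X) + L(Y \mid X)$ against
$L(Y) + L(X \mid Y)$ and prefers the direction with the smaller total.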
For practical inference, we propose a score for causal models over mixed-type
data based on the Minimum Description Length (MDL) principle. In particular, we
model dependencies between $X$ and $Y$ using classification and regression
trees. Because inferring the optimal model is NP-hard, we propose Crack, a
fast greedy algorithm that infers the most likely causal direction directly
from the data.
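To make the idea of comparing description lengths in both directions concrete,
the following is a minimal toy sketch. It is not Crack and not the score
proposed in the paper. It assumes two univariate numeric variables on
comparable scales, a regression tree limited to at most one split, Gaussian
coding of the values in each leaf at a fixed resolution, and a crude parameter
cost. All function names and the default resolution are hypothetical choices
made for illustration.

```python
import math
import random


def gaussian_bits(values, resolution=0.01):
    """Approximate bits to encode values with a Gaussian fitted to them,
    discretized at the given resolution (an assumed coding scheme)."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    # Floor the variance at resolution**2 so the cost stays non-negative.
    var = max(sum((v - mean) ** 2 for v in values) / n, resolution ** 2)
    return 0.5 * n * math.log2(2 * math.pi * math.e * var) - n * math.log2(resolution)


def leaf_bits(values, n_total, resolution=0.01):
    """Cost of one leaf: two parameters (mean, variance) at 0.5*log2(n) bits
    each, plus the data encoded under the fitted Gaussian."""
    return math.log2(max(n_total, 2)) + gaussian_bits(values, resolution)


def stump_bits(cause, effect, resolution=0.01):
    """Bits to encode `effect` with the cheapest tree that is either a single
    leaf or one split on `cause` (exhaustive search over thresholds)."""
    n = len(effect)
    best = 1.0 + leaf_bits(effect, n, resolution)  # 1 bit: "no split"
    pairs = sorted(zip(cause, effect))
    ys = [y for _, y in pairs]
    split_bits = 1.0 + math.log2(max(n, 2))  # 1 bit: "split", plus threshold index
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold separates equal cause values
        cost = (split_bits
                + leaf_bits(ys[:i], n, resolution)
                + leaf_bits(ys[i:], n, resolution))
        best = min(best, cost)
    return best


def direction_scores(x, y, resolution=0.01):
    """Return (bits for X then Y given X, bits for Y then X given Y)."""
    x_to_y = 1.0 + leaf_bits(x, len(x), resolution) + stump_bits(x, y, resolution)
    y_to_x = 1.0 + leaf_bits(y, len(y), resolution) + stump_bits(y, x, resolution)
    return x_to_y, y_to_x


def infer_direction(x, y, resolution=0.01):
    """Prefer the direction with the shorter total description."""
    x_to_y, y_to_x = direction_scores(x, y, resolution)
    if x_to_y < y_to_x:
        return "X->Y"
    if y_to_x < x_to_y:
        return "Y->X"
    return "undecided"


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.uniform(0, 1) for _ in range(200)]
    # Toy mechanism: y is a noisy step function of x.
    y = [(5.0 if v > 0.5 else 0.0) + rng.gauss(0, 0.1) for v in x]
    print(direction_scores(x, y), infer_direction(x, y))
```

A full method along the lines described here would additionally have to handle
multivariate and categorical variables, grow deeper trees greedily, and use a
carefully defined encoding. This sketch only shows the shape of the comparison.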
Empirical evaluation on synthetic, benchmark, and real-world data shows that
Crack reliably and with high accuracy infers the correct causal direction on
both univariate and multivariate cause--effect pairs over both single-type and
mixed-type data.