Climate classifications: the value of unsupervised clustering

Zscheischler, Jakob; Mahecha, Miguel D.; Harmeling, Stefan

doi:10.1016/j.procs.2012.04.096

Journal Article

MPS-Authors

Zscheischler, Jakob
Research Group Biogeochemical Model-data Integration, Dr. M. Reichstein, Max Planck Institute for Biogeochemistry, Max Planck Society;
IMPRS International Max Planck Research School for Global Biogeochemical Cycles, Max Planck Institute for Biogeochemistry, Max Planck Society;

Mahecha, Miguel D.
Research Group Biogeochemical Model-data Integration, Dr. M. Reichstein, Max Planck Institute for Biogeochemistry, Max Planck Society;

External Resource

http://dx.doi.org/10.1016/j.procs.2012.04.096
(Publisher version)

Fulltext (public)

BGC1666.pdf
(Publisher version), 4MB

Supplementary Material (public)

There is no public supplementary material available

Citation

Zscheischler, J., Mahecha, M. D., & Harmeling, S. (2012). Climate classifications: the value of unsupervised clustering. Procedia Computer Science, 9, 897-906. doi:10.1016/j.procs.2012.04.096.

Cite as: https://hdl.handle.net/11858/00-001M-0000-000E-DDDA-E

Abstract

Classifying the land surface according to different climate zones is often a prerequisite for global diagnostic or predictive modelling studies. Classical classifications such as the prominent Köppen–Geiger (KG) approach rely on heuristic decision rules. Although these heuristics may transport some process understanding, such a discretization may appear “arbitrary” from a data oriented perspective. In this contribution we compare the precision of a KG classification to an unsupervised classification (k-means clustering). Generally speaking, we revisit the problem of “climate classification” by investigating the inherent patterns in multiple data streams in a purely data driven way. One question is whether we can reproduce the KG boundaries by exploring different combinations of climate and remotely sensed vegetation variables. In this context we also investigate whether climate and vegetation variables build similar clusters. In terms of statistical performances, k-means clearly outperforms classical climate classifications. However, a subsequent stability analysis only reveals a meaningful number of clusters if both climate and vegetation data are considered in the analysis. This is a setback for the hope to explain vegetation by means of climate alone. Clearly, classification schemes like Köppen–Geiger will play an important role in the future. However, future developments in this area need to be assessed based on data driven approaches.
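
The general idea of k-means clustering followed by a stability check can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the article: the data are synthetic two-dimensional points standing in for standardized climate or vegetation variables, the function names (make_blobs, kmeans, rand_index, stability) are hypothetical, and stability is measured here as the mean Rand index between runs with different random initializations, which may differ from the procedure the authors used.

```python
import random


def make_blobs(centers, n_per_center, spread=0.3, seed=42):
    """Synthetic points around the given centers (illustrative data only)."""
    rng = random.Random(seed)
    return [
        tuple(rng.gauss(c, spread) for c in center)
        for center in centers
        for _ in range(n_per_center)
    ]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct data points as initial centers
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments unchanged, so converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same vs. different cluster)."""
    n = len(a)
    agree = total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += 1
            if (a[i] == a[j]) == (b[i] == b[j]):
                agree += 1
    return agree / total


def stability(points, k, seeds=(0, 1, 2, 3)):
    """Mean pairwise Rand index between k-means runs with different seeds."""
    runs = [kmeans(points, k, seed=s) for s in seeds]
    scores = [
        rand_index(runs[i], runs[j])
        for i in range(len(runs))
        for j in range(i + 1, len(runs))
    ]
    return sum(scores) / len(scores)


# Three synthetic groups; compare how stable the clustering is for several k.
points = make_blobs([(0.0, 0.0), (3.0, 3.0), (0.0, 3.0)], n_per_center=20)
for k in range(2, 6):
    print(k, round(stability(points, k), 3))
```

The Rand index is used because it ignores how clusters are numbered, so two runs that find the same grouping under different labels still count as fully consistent. A value of k whose score stays close to 1 across runs would be a candidate for a "meaningful" number of clusters in the sense of the abstract.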