Free keywords:
Earth Observations; In-situ Observations; Essential Ecosystem Variables; Regional Validation
Abstract:
Spatio-temporal fields of land–atmosphere fluxes derived from data-driven models can complement simulations by process-based land surface models. Although a number of empirical modeling strategies based on eddy-covariance flux data have been applied, a systematic intercomparison of these methods has been missing so far. In this study, we performed a cross-validation experiment for predicting carbon dioxide, latent heat, sensible heat and net radiation fluxes across different ecosystem types with 11 machine learning (ML) methods from four different classes (kernel methods, neural networks, tree methods, and regression splines). We applied two complementary setups: (1) 8-day average fluxes based on remotely sensed data and (2) daily mean fluxes based on meteorological data and a mean seasonal cycle of remotely sensed variables. The patterns of predictions from different ML methods and experimental setups were highly consistent. There were systematic differences in performance among the fluxes, with the following ascending order: net ecosystem exchange (R² < 0.5), ecosystem respiration (R² > 0.6), gross primary production (R² > 0.7), latent heat (R² > 0.7), sensible heat (R² > 0.7), and net radiation (R² > 0.8). The ML methods predicted the across-site variability and the mean seasonal cycle of the observed fluxes very well (R² > 0.7), while the 8-day deviations from the mean seasonal cycle were not well predicted (R² < 0.5). Fluxes were better predicted at forested and temperate climate sites than at sites in extreme climates or in regions poorly represented in the training data (e.g., the tropics). The evaluated large ensemble of ML-based models will be the basis of new global flux products.
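
The abstract reports skill separately for across-site variability, the mean seasonal cycle, and the 8-day deviations from that cycle. The following is a minimal sketch of how such a split could be computed, not the procedure used in the study. It assumes a complete (site, year, 8-day period) array, uses synthetic data, and takes R² to be the squared Pearson correlation; the names `decompose` and `r2` are hypothetical.

```python
import numpy as np


def decompose(flux):
    """Split a (site, year, period) array into three additive components."""
    # Across-site variability: one long-term mean per site.
    site_mean = flux.mean(axis=(1, 2), keepdims=True)
    # Mean seasonal cycle: per-site multi-year mean of each 8-day period,
    # expressed relative to the site mean.
    seasonal = flux.mean(axis=1, keepdims=True) - site_mean
    # Anomalies: deviations from the mean seasonal cycle.
    anomaly = flux - site_mean - seasonal
    return site_mean, seasonal, anomaly


def r2(obs, pred):
    """Squared Pearson correlation (an assumed definition of R²)."""
    return np.corrcoef(obs.ravel(), pred.ravel())[0, 1] ** 2


# Synthetic example: 20 sites, 5 years, 46 eight-day periods per year.
rng = np.random.default_rng(0)
n_sites, n_years, n_periods = 20, 5, 46
shape = (n_sites, n_years, n_periods)
phase = 2 * np.pi * np.arange(n_periods) / n_periods
site_level = rng.normal(5.0, 2.0, size=(n_sites, 1, 1))

# "Observations": site level + seasonal cycle + unpredictable anomalies.
obs = site_level + 3.0 * np.sin(phase) + rng.normal(0.0, 1.0, size=shape)
# Hypothetical predictions that capture site level and seasonality but whose
# anomalies are unrelated to the observed ones.
pred = site_level + 3.0 * np.sin(phase) + rng.normal(0.0, 0.5, size=shape)

obs_parts = decompose(obs)
pred_parts = decompose(pred)
scores = {
    name: r2(o, p)
    for name, o, p in zip(("across-site", "seasonal", "anomaly"),
                          obs_parts, pred_parts)
}
for name, value in scores.items():
    print(f"{name}: R2 = {value:.2f}")
```

With this synthetic setup the across-site and seasonal components score high while the anomaly component scores near zero, which mirrors the qualitative pattern described in the abstract. The numbers themselves carry no meaning for real flux data.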