hide
Free keywords:

Abstract:
The problem of constructing optimal linear prediction models by multivariance regression methods is reviewed. It is well known that as the number of predictors in a model is increased, the skill of the prediction grows, but the statistical significance generally decreases. For predictions using a large number of candidate predictors, strategies are therefore needed to determine optimal prediction models which properly balance the competing requirements of skill and significance. The popular methods of coefficient screening or stepwise regression represent a posteriori predictor selection methods and therefore cannot be used to recover statistically significant models by truncation if the complete model, including all predictors, is statistically insignificant. Higher significance can be achieved only by a priori reduction of the predictor set. To determine the maximum number of predictors which may be meaningfully incorporated in a model, a model hierarchy can be used in which a series of best fit prediction models is constructed for a (prior defined) nested sequence of predictor sets, the sequence being terminated when the significance level either falls below a prescribed limit or reaches a maximum value. The method requires a reliable assessment of model significance. This is characterized by a quadratic statistic which is defined independently of the model skill or artificial skill. As an example, the method is applied to the prediction of sea surface temperature anomalies at Christmas Island (representative of sea surface temperatures in the central equatorial Pacific) and variations of the central and east Pacific Hadley circulation (characterized by the second empirical orthogonal function (EOF) of the meridional component of the trade wind anomaly field) using a general multiple‐time‐lag prediction matrix. The ordering of the predictors is based on an EOF sequence, defined formally as orthogonal variables in the composite space of all (normalized) predictors, irrespective of their different physical dimensions, time lag, and geographic position. The choice of a large set of 20 predictors at 12 time lags yields significant predictability only for forecast periods of 3 to 5 months. However, a prior reduction of the predictor set to 4 predictors at 10 time lags leads to 95% significant predictions with skill values of the order of 0.4 to 0.7 up to 6 or 8 months. For infinitely long time series the construction of optimal prediction models reduces essentially to the problem of linear system identification. However, the model hierarchies normally considered for the simulation of general linear systems differ in structure from the model hierarchies which appear to be most suitable for constructing pure prediction models. Thus the truncation imposed by statistical significance requirements can result in rather different models for the two cases. The relation between optimal prediction models and linear dynamical models is illustrated by the prediction of east‐west sea level changes in the equatorial Pacific from wind field anomalies. It is shown that the optimal empirical prediction is statistically consistent in this case with both the first‐order relaxation and damped oscillator models recently proposed by McWilliams and Gent (but with somewhat different model parameters than suggested by the authors). Thus the data do not allow a distinction between the two physical models; the simplest acceptable model is the first‐order damped response. Finally, the problem of estimating forecast skill is discussed. It is usually stated that the forecast skill is smaller than the true skill, which in turn is smaller than the hindcast skill, by an amount which in both cases is approximately equal to the artificial skill. However, this result applies to the mean skills averaged over the ensemble of all possible hindcast data sets, given the true model. Under the more appropriate side condition of a given hindcast data set and an unknown true model, the estimation of the forecast skill represents a problem of statistical inference and is dependent on the assumed prior probability distribution of true models. The Bayesian hypothesis of a uniform prior distribution yields an average forecast skill equal to the hindcast skill, but other (equally acceptable) assumptions yield lower forecast skills more compatible with the usual hindcast‐averaged expression