Abstract:
Most existing sparse Gaussian process (g.p.) models seek computational advantages by basing their
computations on a set of m basis functions that are the covariance function of the g.p. with one of its two inputs
fixed. We generalise this for the case of a Gaussian covariance function by basing our computations on m Gaussian
basis functions with arbitrary diagonal covariance matrices (or length scales). For a fixed number of basis
functions and any given criterion, this additional flexibility permits approximations no worse and typically better
than was previously possible. Although we focus on g.p. regression, the central idea is applicable to all
kernel-based algorithms, such as the support vector machine. We perform gradient-based optimisation of the marginal
likelihood, which costs O(m^2 n) time where n is the number of data points, and compare the method to various
other sparse g.p. methods. Our approach outperforms the other methods, particularly for the case of very few basis
functions, i.e. a very high sparsity ratio.
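
As a rough illustration of the basis functions described above, the following sketch evaluates m Gaussian basis functions, each with its own centre and its own diagonal length scales, and fits their weights by regularised least squares. It is not the method of the paper: the weights are fitted directly rather than by gradient-based optimisation of the marginal likelihood, and the names `gaussian_basis`, `fit_weights` and `noise_var`, the toy data, and the chosen centres and length scales are all assumptions made for the example. It does show where an O(m^2 n) cost arises, namely in forming the m-by-m matrix from the n-by-m basis matrix.

```python
import numpy as np


def gaussian_basis(X, centers, length_scales):
    """Evaluate m Gaussian basis functions at n inputs.

    X: (n, d) inputs; centers: (m, d); length_scales: (m, d), all positive.
    Each basis function has its own diagonal length scales (hypothetical
    parameterisation chosen for this sketch).
    Returns Phi with shape (n, m).
    """
    diff = X[:, None, :] - centers[None, :, :]      # (n, m, d)
    z = diff / length_scales[None, :, :]            # scale each dimension per basis
    return np.exp(-0.5 * np.sum(z ** 2, axis=2))    # (n, m)


def fit_weights(Phi, y, noise_var=1e-2):
    """Regularised least-squares weights for the basis expansion.

    Forming Phi.T @ Phi costs O(m^2 n); solving the m-by-m system costs O(m^3).
    """
    m = Phi.shape[1]
    A = Phi.T @ Phi + noise_var * np.eye(m)
    return np.linalg.solve(A, Phi.T @ y)


# Toy one-dimensional data (assumed for illustration only).
rng = np.random.default_rng(0)
n, d, m = 200, 1, 5
X = rng.uniform(-3.0, 3.0, size=(n, d))
y = np.sin(2.0 * X[:, 0]) + 0.1 * rng.standard_normal(n)

# Five basis functions with different length scales per basis function.
centers = np.linspace(-3.0, 3.0, m).reshape(m, d)
length_scales = np.array([[0.3], [0.5], [1.0], [0.5], [0.3]])

Phi = gaussian_basis(X, centers, length_scales)
w = fit_weights(Phi, y)
pred = Phi @ w  # fitted values at the training inputs
```

Setting every row of `length_scales` to the same value recovers the usual case in which each basis function is the covariance function with one input fixed, so the per-basis length scales only add flexibility.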