Published Date: 2018-02-08
Registration: ISSN 2374-3468 (Online) ISSN 2159-5399 (Print)
Copyright: Published by AAAI Press, Palo Alto, California USA Copyright © 2018, Association for the Advancement of Artificial Intelligence All Rights Reserved.
Gaussian Processes (GPs) provide an extremely powerful mechanism to model a variety of problems but incur an O(N^3) complexity in the number of data samples N. Common approximation methods rely on what are often termed inducing points but still typically incur an O(NM^2) complexity in the number of data samples N and the number of inducing points M. Using Random Fourier Feature (RFF) maps, we overcome this by transforming the problem into a Bayesian Linear Regression formulation, to which we apply a Bayesian Variational treatment that also allows learning the corresponding kernel hyperparameters, likelihood and noise parameters. In this paper we introduce an alternative method using Fourier series to obtain spectral representations of common kernels, in particular for periodic warpings, which surprisingly have a convergent, non-random form using special functions, requiring fewer spectral features to approximate their corresponding kernel to high accuracy. Using this, we can fuse the Random Fourier Feature spectral representations of common kernels with their periodic counterparts to show how they can more effectively and expressively learn patterns in time-series for both interpolation and extrapolation. This method combines robustness, scalability and, equally importantly, interpretability through a symbolic declarative grammar that is both functionally and humanly intuitive, a property that is crucial for explainable decision making. Using probabilistic programming and Variational Inference we are able to efficiently optimise over these rich functional representations. We show significantly improved Gram matrix approximation errors, and also demonstrate the method on several time-series problems, comparing against other commonly used approaches such as recurrent neural networks.
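As a rough illustration of the RFF idea mentioned above, the following sketch approximates a squared-exponential Gram matrix with random cosine features and then fits a plain Bayesian linear regression in that feature space. It uses the standard random Fourier feature construction, not the Fourier series method or the variational treatment introduced in the paper; the function names, the toy data, the unit Gaussian prior on the weights and all parameter values are assumptions made for this example only.

```python
import numpy as np


def rbf_kernel(X, Y, lengthscale=1.0):
    # Exact squared-exponential kernel k(x, y) = exp(-||x - y||^2 / (2 l^2)).
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-0.5 * sq_dists / lengthscale**2)


def rff_features(X, W, b):
    # Random Fourier feature map phi(x) = sqrt(2 / M) * cos(W x + b).
    M = W.shape[0]
    return np.sqrt(2.0 / M) * np.cos(X @ W.T + b)


rng = np.random.default_rng(0)
N, D, M = 200, 1, 1000          # data points, input dimension, features (assumed)
lengthscale, noise_std = 0.5, 0.1

# Toy 1-D regression data (hypothetical, for illustration only).
X = rng.uniform(-3.0, 3.0, size=(N, D))
f_true = np.sin(2.0 * X[:, 0])
y = f_true + noise_std * rng.normal(size=N)

# Frequencies drawn from the spectral density of the RBF kernel, N(0, 1 / l^2),
# and phases drawn uniformly from [0, 2 pi).
W = rng.normal(0.0, 1.0 / lengthscale, size=(M, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=M)
Phi = rff_features(X, W, b)     # shape (N, M)

# Gram matrix approximation: Phi Phi^T should be close to the exact kernel.
K_exact = rbf_kernel(X, X, lengthscale)
K_approx = Phi @ Phi.T
mean_err = np.mean(np.abs(K_exact - K_approx))
print("mean absolute Gram error:", mean_err)

# Bayesian linear regression on the features with prior w ~ N(0, I):
# posterior precision A = Phi^T Phi / sigma^2 + I, mean = A^{-1} Phi^T y / sigma^2.
A = Phi.T @ Phi / noise_std**2 + np.eye(M)
w_mean = np.linalg.solve(A, Phi.T @ y / noise_std**2)
pred = Phi @ w_mean
rmse = np.sqrt(np.mean((pred - f_true) ** 2))
print("RMSE against the noise-free function:", rmse)
```

Working in feature space means the linear solve involves an M x M matrix rather than the N x N Gram matrix, which is the reformulation as Bayesian linear regression that the abstract refers to. Increasing M should reduce the Gram error at the cost of a larger solve.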