Volume 9 (2020-2021)
1. Regularized Kernel Machine Learning for Data Driven Forecasting of Chaos
Erik Bollt
Pages: 1-26
Abstract: Forecasting outcomes from initial states continues to be an essential task in dynamical systems theory as applied across the sciences and engineering. The data-driven philosophy has become prevalent across the community. While geometric methods that rebuild the underlying geometry from time series, founded on Takens' embedding theorem, have been popular and successful in previous decades, they are complex, computationally expensive, and parametrically intensive. The wave of machine learning methods has revealed that a black-box-oriented approach has a great deal to offer the fundamental problem of forecasting the future. Modelling the flow operator as a linear combination of nonlinear basis functions, fitted by least squares regression in a data-driven manner, is straightforward to pose. However, there are two major obstacles to overcome. First, the problem of model complexity may lead to either underfitting or overfitting, but this can be mitigated by Tikhonov regularization. Second, computational complexity is a serious issue, which can be overcome by the kernel trick, wherein all necessary inner products in a high-dimensional feature (basis function) space are computed implicitly through low-dimensional kernel operations. In particular, kernel methods from the broader theory of support vector machines are founded in the functional analytic theory of Mercer's theorem and reproducing kernel Hilbert spaces (RKHS), and in practice this fundamental concept in machine learning has become central to many efficient algorithms. Putting these concepts together, the efficiency of kernel methods and the robustness of regularized regression are both available within an approach called kernelized ridge regression, which we show here is an especially useful, simple-to-use, and computationally efficient methodology for time-series forecasting problems.
We demonstrate the utility of these concepts through a progression of examples, from low-dimensional systems where direct analysis is possible, to high-dimensional and spatiotemporally chaotic systems, and finally to an experimental physiological data set of heart rate and breathing interactions.
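
To make the idea of kernelized ridge regression concrete, the following is a minimal illustrative sketch and not the paper's implementation. It assumes a Gaussian kernel, uses the logistic map as a toy chaotic system, and picks the kernel width `sigma` and the Tikhonov parameter `lam` by hand; all function names are hypothetical. The one-step map is learned by solving (K + lam I) alpha = Y, and predictions are formed as k(x, X) alpha, so only kernel evaluations are needed and the feature space is never built explicitly.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def fit_krr(X, Y, sigma, lam):
    """Solve the regularized system (K + lam * I) alpha = Y."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), Y)

def predict_krr(X_train, alpha, X_new, sigma):
    """Evaluate the fitted map at new states: k(X_new, X_train) @ alpha."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

def logistic_orbit(x0, n):
    """Toy data (an assumption here): orbit of x -> 4 x (1 - x)."""
    xs = np.empty(n)
    xs[0] = x0
    for i in range(1, n):
        xs[i] = 4.0 * xs[i - 1] * (1.0 - xs[i - 1])
    return xs

# Training pairs (x_n, x_{n+1}) taken from a single orbit.
orbit = logistic_orbit(0.2, 501)
X, Y = orbit[:-1, None], orbit[1:, None]

sigma, lam = 0.2, 1e-6  # hand-picked, illustrative values
alpha = fit_krr(X, Y, sigma, lam)

# Compare the learned one-step map with the true map on a test grid.
x_test = np.linspace(0.05, 0.95, 19)[:, None]
y_pred = predict_krr(X, alpha, x_test, sigma)
y_true = 4.0 * x_test * (1.0 - x_test)
max_err = float(np.max(np.abs(y_pred - y_true)))
print("max one-step error on test grid:", max_err)
```

A multi-step forecast would be obtained by feeding each prediction back in as the next input; because the system is chaotic, small one-step errors can be expected to grow with the forecast horizon. In practice `sigma` and `lam` would typically be chosen by cross-validation rather than by hand.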