Advances in kernel methods: towards general-purpose and scalable models

Bibliographic details
Main author: Samo, YLK
Other authors: Roberts, S
Format: Thesis
Language: English
Published: 2017
Institution: University of Oxford
Identifier: oxford-uuid:e0ff5f8c-bc28-4d96-8ddb-2d49152b2eee
Subjects: Machine learning

Description:

A wide range of statistical and machine learning problems involve learning one or several latent functions, or properties thereof, from datasets. Examples include regression, classification, principal component analysis, optimisation, learning intensity functions of point processes, and reinforcement learning, to name but a few. For all these problems, positive semi-definite kernels (or simply kernels) provide a powerful tool for postulating flexible nonparametric hypothesis spaces over functions. Despite recent work on such kernel methods, parametric alternatives such as deep neural networks have been at the core of most artificial intelligence breakthroughs in recent years. This thesis presents theoretical and methodological foundations for constructing fully automated, scalable, and general-purpose kernel machines that perform well over a wide range of input dimensions and sample sizes. It aims to help bridge the gap between kernel methods and deep learning, and to propose methods that have the advantage over deep learning of performing well on both small-scale and large-scale problems.

In Part I we give a gentle introduction to kernel methods, review recent work, identify remaining gaps, and outline our contributions.

In Part II we develop flexible and scalable Bayesian kernel methods to address gaps in methods capable of dealing with the special case of datasets exhibiting locally homogeneous patterns. We begin with two motivating applications. First, in Chapter 2, we consider inferring the intensity function of an inhomogeneous point process. This application illustrates that, by carefully introducing mild asymmetry into the dependency structure of Bayesian kernel methods, one can often considerably scale up inference while improving flexibility and accuracy. In Chapter 3 we propose a scalable scheme for online forecasting of time series, and for fully online learning of the related model parameters, under a kernel-based generative model that is provably sufficiently flexible. This application illustrates that, for one-dimensional input spaces, restricting the degree of differentiability of the latent function of interest can considerably speed up inference without resorting to approximations and without any adverse effect on flexibility or accuracy. Chapter 4 generalises these approaches and proposes a novel class of stochastic processes, which we refer to as string Gaussian processes (string GPs), that, when used as functional priors in a Bayesian nonparametric framework, allow inference with time and memory requirements that are linear in the sample size, without resorting to approximations. More importantly, the corresponding inference scheme, derived in Chapter 5, also allows flexible learning of locally homogeneous patterns and automated learning of model complexity: that is, automated learning of whether there are local patterns in the data in the first place, how many such patterns are present, and where they are located.
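
The string GP construction itself is not reproduced here, but as context for the scalability claims above, the sketch below shows standard exact Gaussian process regression with a squared-exponential kernel, whose cubic-time Cholesky solve is the cost that approximation-free linear-time methods seek to avoid. The kernel choice, hyperparameters, and data are illustrative assumptions, not the thesis's method.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    # Squared-exponential (RBF) kernel: a standard positive semi-definite
    # kernel over one-dimensional inputs.
    sqdist = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sqdist / lengthscale ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-2):
    # Exact Gaussian process regression. The Cholesky factorisation of the
    # n x n kernel matrix costs O(n^3) time and O(n^2) memory; this is the
    # bottleneck that linear-time constructions aim to remove.
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_star = rbf_kernel(x_train, x_test)
    mean = K_star.T @ alpha
    v = np.linalg.solve(L, K_star)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mean, var

# Toy usage on synthetic data (purely illustrative).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 50)
y = np.sin(x) + 0.1 * rng.standard_normal(50)
x_star = np.linspace(0.0, 5.0, 200)
mu, sigma2 = gp_posterior(x, y, x_star)
```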

In Part III we provide a broader discussion covering all types of patterns (homogeneous, locally homogeneous, and heterogeneous) and both Bayesian and frequentist kernel methods. In Chapter 6 we begin by discussing what properties a family of kernels should possess to enable fully automated kernel methods applicable to any type of dataset. We introduce a novel mathematical formalism for the notion of 'general-purpose' families of kernels, and we argue that existing families of kernels are not general-purpose. In Chapter 7 we derive weak sufficient conditions for families of kernels to be general-purpose, and we exhibit tractable such families, enjoying a convenient parametrisation, which we refer to as generalized spectral kernels (GSKs). In Chapter 8 we provide a scalable inference scheme for automated kernel learning using general-purpose families of kernels. The proposed inference scheme scales linearly with the sample size and enables automated learning of nonstationarity and model complexity from the data in virtually any kernel method. Finally, we conclude in Chapter 9 with a discussion showing that deep learning can be regarded as a particular type of kernel learning method, and we discuss possible extensions in Chapter 10.
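
The generalized spectral kernels of Chapter 7 are not reproduced here; the sketch below only illustrates the broader idea of parametrising stationary kernels through their spectral densities (Bochner's theorem), using a standard spectral-mixture form. The function name, parametrisation, and hyperparameter values are illustrative assumptions, not the GSK family itself.

```python
import numpy as np

def spectral_mixture_kernel(tau, weights, means, scales):
    # Stationary kernel defined through its spectral density (Bochner's
    # theorem): a Gaussian mixture in the frequency domain corresponds, in
    # the input domain, to cosines damped by Gaussian envelopes.
    # tau is an array of input differences x - x'.
    k = np.zeros_like(tau, dtype=float)
    for w, mu, sigma in zip(weights, means, scales):
        k += w * np.exp(-2.0 * (np.pi * sigma * tau) ** 2) * np.cos(2.0 * np.pi * mu * tau)
    return k

# Illustrative evaluation with two spectral components (hypothetical values).
tau = np.linspace(-3.0, 3.0, 301)
k_vals = spectral_mixture_kernel(tau,
                                 weights=[1.0, 0.5],
                                 means=[0.5, 2.0],
                                 scales=[0.2, 0.1])
```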