Advances in kernel methods: towards general-purpose and scalable models

Bibliographic details
Main author: Samo, YLK
Other authors: Roberts, S
Format: Thesis
Language: English
Published: 2017
Institution: University of Oxford
Identifier: oxford-uuid:e0ff5f8c-bc28-4d96-8ddb-2d49152b2eee
Subjects: Machine learning

Description:

A wide range of statistical and machine learning problems involve learning one or several latent functions, or properties thereof, from datasets. Examples include regression, classification, principal component analysis, optimisation, learning intensity functions of point processes, and reinforcement learning, to name but a few. For all these problems, positive semi-definite kernels (or simply kernels) provide a powerful tool for postulating flexible nonparametric hypothesis spaces over functions. Despite recent work on such kernel methods, parametric alternatives such as deep neural networks have been at the core of most artificial intelligence breakthroughs in recent years. This thesis presents theoretical and methodological foundations for constructing fully automated, scalable, and general-purpose kernel machines that perform well over a wide range of input dimensions and sample sizes. It aims to help bridge the gap between kernel methods and deep learning, and to propose methods that have the advantage over deep learning of performing well on both small-scale and large-scale problems.

In Part I we give a gentle introduction to kernel methods, review recent work, identify remaining gaps, and outline our contributions.

In Part II we develop flexible and scalable Bayesian kernel methods to address gaps in methods capable of dealing with the special case of datasets exhibiting locally homogeneous patterns. We begin with two motivating applications. First, in Chapter 2, we consider inferring the intensity function of an inhomogeneous point process. This application illustrates that, by carefully introducing mild asymmetry into the dependency structure of Bayesian kernel methods, one can often considerably scale up inference while improving flexibility and accuracy. In Chapter 3 we propose a scalable scheme for online forecasting of time series, and for fully online learning of the related model parameters, under a kernel-based generative model that is provably sufficiently flexible. This application illustrates that, for one-dimensional input spaces, restricting the degree of differentiability of the latent function of interest can considerably speed up inference without resorting to approximations and without any adverse effect on flexibility or accuracy. Chapter 4 generalises these approaches and proposes a novel class of stochastic processes, which we refer to as string Gaussian processes (string GPs), that, when used as functional priors in a Bayesian nonparametric framework, allow inference with time and memory requirements that are linear in the sample size, without resorting to approximations. More importantly, the corresponding inference scheme, derived in Chapter 5, also allows flexible learning of locally homogeneous patterns and automated learning of model complexity: that is, automated learning of whether there are local patterns in the data in the first place, how many such patterns are present, and where they are located.
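
The string GP construction itself is not reproduced here, but as context for the scalability claims above, the sketch below shows standard exact Gaussian process regression with a squared-exponential kernel, whose cubic-time Cholesky solve is the cost that approximation-free linear-time methods seek to avoid. The kernel choice, hyperparameters, and data are illustrative assumptions, not the thesis's method.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    # Squared-exponential (RBF) kernel: a standard positive semi-definite
    # kernel over one-dimensional inputs.
    sqdist = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sqdist / lengthscale ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-2):
    # Exact Gaussian process regression. The Cholesky factorisation of the
    # n x n kernel matrix costs O(n^3) time and O(n^2) memory; this is the
    # bottleneck that linear-time constructions aim to remove.
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_star = rbf_kernel(x_train, x_test)
    mean = K_star.T @ alpha
    v = np.linalg.solve(L, K_star)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mean, var

# Toy usage on synthetic data (purely illustrative).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 50)
y = np.sin(x) + 0.1 * rng.standard_normal(50)
x_star = np.linspace(0.0, 5.0, 200)
mu, sigma2 = gp_posterior(x, y, x_star)
```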

In Part III we provide a broader discussion covering all types of patterns (homogeneous, locally homogeneous, and heterogeneous) and both Bayesian and frequentist kernel methods. In Chapter 6 we begin by discussing what properties a family of kernels should possess to enable fully automated kernel methods applicable to any type of dataset. We introduce a novel mathematical formalism for the notion of 'general-purpose' families of kernels, and we argue that existing families of kernels are not general-purpose. In Chapter 7 we derive weak sufficient conditions for families of kernels to be general-purpose, and we exhibit tractable such families, enjoying a convenient parametrisation, which we refer to as generalized spectral kernels (GSKs). In Chapter 8 we provide a scalable inference scheme for automated kernel learning using general-purpose families of kernels. The proposed inference scheme scales linearly with the sample size and enables automated learning of nonstationarity and model complexity from the data in virtually any kernel method. Finally, we conclude in Chapter 9 with a discussion showing that deep learning can be regarded as a particular type of kernel learning method, and we discuss possible extensions in Chapter 10.
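
The generalized spectral kernels of Chapter 7 are not reproduced here; the sketch below only illustrates the broader idea of parametrising stationary kernels through their spectral densities (Bochner's theorem), using a standard spectral-mixture form. The function name, parametrisation, and hyperparameter values are illustrative assumptions, not the GSK family itself.

```python
import numpy as np

def spectral_mixture_kernel(tau, weights, means, scales):
    # Stationary kernel defined through its spectral density (Bochner's
    # theorem): a Gaussian mixture in the frequency domain corresponds, in
    # the input domain, to cosines damped by Gaussian envelopes.
    # tau is an array of input differences x - x'.
    k = np.zeros_like(tau, dtype=float)
    for w, mu, sigma in zip(weights, means, scales):
        k += w * np.exp(-2.0 * (np.pi * sigma * tau) ** 2) * np.cos(2.0 * np.pi * mu * tau)
    return k

# Illustrative evaluation with two spectral components (hypothetical values).
tau = np.linspace(-3.0, 3.0, 301)
k_vals = spectral_mixture_kernel(tau,
                                 weights=[1.0, 0.5],
                                 means=[0.5, 2.0],
                                 scales=[0.2, 0.1])
```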