Testing and learning on distributional and set inputs

As machine learning gains significant attention in many disciplines and research communities, the variety of data structures has increased, with examples including distributions and sets of observations. In this thesis, we consider sets and distributions as inputs for machine learning pr...

Bibliographic Details
Main Author:	Law, H
Other Authors:	Sejdinovic, D
Format:	Thesis
Language:	English
Published:	2019
Subjects:	Statistics Machine learning

_version_	1826291502680113152
author	Law, H
author2	Sejdinovic, D
author_facet	Sejdinovic, D Law, H
author_sort	Law, H
collection	OXFORD
description	<p>As machine learning gains significant attention in many disciplines and research communities, the variety of data structures has increased, with examples including distributions and sets of observations. In this thesis, we consider sets and distributions as inputs for machine learning problems. In particular, we propose non-parametric tests, supervised learning, semi-supervised learning and meta-learning methodologies on these objects. In each case, with careful consideration of the input structure, we construct models that are applicable to various real-life tasks.</p> <p>We begin by considering the problem of <em>weakly supervised learning on aggregate outputs</em>, where the labels are only available at a much coarser resolution than the level of inputs, such that a set of inputs corresponds to each output. Constructing a tractable and scalable framework of aggregated observation models using Gaussian processes, we apply it to the important problem of fine-scale spatial modelling of malaria incidence. In particular, we demonstrate that unobserved pixel-level malaria intensities can be predicted using fine-scale environmental covariates.</p> <p>Utilising the same data structure, but with the interpretation that the set of samples is drawn from a distribution, we consider the problem of modelling distributions in the context of hyperparameter selection for supervised learning tasks. Through transfer of information from previously solved tasks using learnt representations of the training datasets, we construct a Gaussian process framework that jointly models all the meta-information available. In application to a range of regression and classification tasks, we demonstrate faster convergence than state-of-the-art baselines.</p>
first_indexed	2024-03-07T03:00:21Z
format	Thesis
id	oxford-uuid:b0c17cd9-a0f0-4c10-a5f5-b59e2c924e9e
institution	University of Oxford
language	English
last_indexed	2024-03-07T03:00:21Z
publishDate	2019
record_format	dspace
spelling	oxford-uuid:b0c17cd9-a0f0-4c10-a5f5-b59e2c924e9e 2022-03-27T03:58:43Z Testing and learning on distributional and set inputs Thesis http://purl.org/coar/resource_type/c_db06 uuid:b0c17cd9-a0f0-4c10-a5f5-b59e2c924e9e Statistics Machine learning English ORA Deposit 2019 Law, H Sejdinovic, D
spellingShingle	Statistics Machine learning Law, H Testing and learning on distributional and set inputs
title	Testing and learning on distributional and set inputs
title_full	Testing and learning on distributional and set inputs
title_fullStr	Testing and learning on distributional and set inputs
title_full_unstemmed	Testing and learning on distributional and set inputs
title_short	Testing and learning on distributional and set inputs
title_sort	testing and learning on distributional and set inputs
topic	Statistics Machine learning
work_keys_str_mv	AT lawh testingandlearningondistributionalandsetinputs
