On the theory of Lipschitz continuous machine learning


Bibliographic Details
Main Author: Huang, JW
Other Authors: Calliess, J
Format: Thesis
Language: English
Published: 2023
Subjects:
Description
Summary: The field of machine learning theory plays an essential role in establishing the mathematical foundations and performance boundaries of data-driven modelling techniques. By providing a rigorous analysis of the underlying properties of an algorithm, theoretical machine learning guides the development of reliable methods that can be utilised in real-world applications. In this context, Lipschitz regularity has been a particularly useful tool, aiding in establishing robustness, worst-case error bounds, and generalisation capabilities for a wide range of machine learning frameworks. Building on this foundation, this thesis explores the theoretical properties of the general class of Lipschitz continuous machine learning frameworks, with a specific focus on dynamical system identification.

The first part of this thesis investigates a fundamental problem for this class of machine learning frameworks: the estimation of the Lipschitz constant of the target function from data. We derive optimal sample complexity rates for this problem in both the noiseless and the noisy settings under minimal parametric assumptions on the target function. A novel Lipschitz constant estimation technique, shown to be computationally efficient and sample optimal, is also proposed.

The second part of the thesis focuses on a popular non-parametric system identification method used in control: Lipschitz interpolation. It derives a series of theoretical results on the asymptotic properties of the framework under a bounded-noise assumption. More specifically, general asymptotic consistency and precise upper bounds on the uniform non-parametric convergence rates are obtained. These bounds can serve as theoretical tools for comparing Lipschitz interpolation against alternative non-parametric regression methods. Various extensions of these results in the context of online learning, online learning-based control, and a fully data-driven extension of the classical Lipschitz interpolation framework proposed by Calliess et al. [2020] are also obtained.

The final part of the thesis considers the use of Lipschitz regularity properties in conjunction with neural network-based identification methods in the field of time series analysis. We utilise relaxed Lipschitz-type regularity assumptions on the dynamics of a general class of non-linear autoregressive processes to obtain a characterisation of mean reversion through theoretical results on geometric ergodicity and tight upper bounds on the first hitting times of these processes as they revert to the mean. The utility of these results is demonstrated in a financial application to improving trading decision rules, where the theoretical results are harnessed to develop learning-based pairs trading strategies with probabilistic guarantees on their profitability.
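
As a rough illustration of the estimation problem studied in the first part, the sketch below shows the classical maximum-slope estimator of a Lipschitz constant from noiseless samples. It is not the thesis's own estimator or its noisy-setting analysis; the function name and interface are illustrative assumptions.

```python
import numpy as np

def max_slope_lipschitz_estimate(X, y):
    """Classical noiseless estimator: the largest pairwise difference quotient.

    X : (n, d) array of inputs, y : (n,) array of function values.
    With noiseless samples this lower-bounds the true Lipschitz constant and
    approaches it as the inputs fill the domain.
    """
    n = len(y)
    slopes = []
    for i in range(n):
        for j in range(i + 1, n):
            dist = np.linalg.norm(X[i] - X[j])
            if dist > 0:
                slopes.append(abs(y[i] - y[j]) / dist)
    return max(slopes)
```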
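
The Lipschitz interpolation framework analysed in the second part has a standard closed-form predictor: the midpoint of the tightest Lipschitz upper and lower envelopes consistent with the data. A minimal sketch follows, assuming the Euclidean norm and a known Lipschitz constant `L`; the data-driven extension of Calliess et al. [2020] instead estimates this constant from data, which is not reproduced here.

```python
import numpy as np

def lipschitz_interpolation(x_query, X, y, L):
    """Midpoint of the Lipschitz ceiling and floor envelopes at x_query.

    x_query : (d,) query point, X : (n, d) observed inputs,
    y : (n,) observed outputs, L : assumed Lipschitz constant.
    """
    dists = np.linalg.norm(X - x_query, axis=1)
    ceiling = np.min(y + L * dists)  # tightest upper envelope value
    floor = np.max(y - L * dists)    # tightest lower envelope value
    return 0.5 * (ceiling + floor)
```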
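
For the final part, a small simulation may help fix ideas about mean reversion and first hitting times in a non-linear autoregressive process: the drift below contracts towards zero at a Lipschitz-type rate, and the empirical first hitting time of a neighbourhood of the mean is recorded. The specific drift, noise level, and thresholds are illustrative assumptions, not the processes or bounds studied in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def drift(x):
    # Illustrative contractive (Lipschitz constant 0.7) drift pulling the state towards 0.
    return 0.7 * np.tanh(x)

def first_hitting_time(x0, band=0.1, max_steps=10_000, noise_std=0.2):
    """Steps until X_{t+1} = drift(X_t) + eps_t first enters [-band, band]
    around its mean level (taken to be 0 for this symmetric example)."""
    x = x0
    for t in range(1, max_steps + 1):
        x = drift(x) + noise_std * rng.standard_normal()
        if abs(x) <= band:
            return t
    return max_steps  # did not hit within the horizon

# Average empirical hitting time starting from a wide spread of 3.0
# (loosely analogous to waiting for a pairs-trading spread to close).
times = [first_hitting_time(3.0) for _ in range(200)]
print(np.mean(times))
```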