Rough paths, kernels, differential equations and an algebra of functions on streams


Full description

Bibliographic details
Main author: Salvi, C.
Other authors: Lyons, T.
Format: Thesis
Language: English
Published: 2021
Description
Summary: This thesis is organised in the following four chapters. Appendix A provides a summary of rough path theory.

Chapter 1: Recently, there has been increased interest in the development of kernel methods for learning with sequential data. The signature kernel is a learning tool with the potential to handle irregularly sampled, multivariate time series. In [1] the authors introduced a kernel trick for the truncated version of this kernel, avoiding the exponential complexity that would have been involved in a direct computation. In this chapter we show that, for continuously differentiable paths, the signature kernel solves a hyperbolic PDE, and we recognize the connection with a well-known class of differential equations known in the literature as Goursat problems. This Goursat PDE depends only on the increments of the input sequences, does not require the explicit computation of signatures, and can be solved efficiently using state-of-the-art hyperbolic PDE numerical solvers, giving a kernel trick for the untruncated signature kernel with the same raw complexity as the method from [1], but with the advantage that the PDE numerical scheme is well suited to GPU parallelization, which effectively reduces the complexity by a full order of magnitude in the length of the input sequences. In the last section of this chapter we extend the previous analysis to the space of geometric rough paths and establish, using classical results from rough path theory, that the rough version of the signature kernel solves a rough integral equation analogous to the aforementioned Goursat problem. Empirical evidence is given of the effectiveness of this PDE kernel as a machine learning tool in various data science applications.
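To give a flavour of the approach, the Goursat equation in question can be written as ∂²k/∂s∂t = ⟨ẋ_s, ẏ_t⟩ k(s, t) with boundary data k(s, 0) = k(0, t) = 1, and on piecewise-linear data it can be approximated with a simple explicit finite-difference sweep over the grid of increments. The sketch below is illustrative only: the function name and the particular first-order discretisation are ours, not the scheme used in the released library.

```python
import numpy as np

def sig_kernel_pde(x, y):
    """Approximate the untruncated signature kernel k(x, y) by solving the
    Goursat PDE  d^2k/(ds dt) = <dx/ds, dy/dt> * k  with k(s,0) = k(0,t) = 1,
    using an explicit first-order finite-difference scheme on the grid induced
    by the sample points. x: (m, d) and y: (n, d) arrays of path samples.
    Illustrative sketch, not the sigkernel implementation."""
    dx = np.diff(x, axis=0)        # increments of x, shape (m-1, d)
    dy = np.diff(y, axis=0)        # increments of y, shape (n-1, d)
    inner = dx @ dy.T              # <dx_i, dy_j> for every grid cell
    m, n = inner.shape
    k = np.ones((m + 1, n + 1))    # boundary conditions k(., 0) = k(0, .) = 1
    for i in range(m):
        for j in range(n):
            # integrate the PDE over one grid cell to first order
            k[i + 1, j + 1] = (k[i + 1, j] + k[i, j + 1]
                               - k[i, j] + inner[i, j] * k[i, j])
    return k[m, n]
```

Note that the double loop only depends on the increments, never on signatures themselves, and the antidiagonal sweeps of the grid are independent, which is what makes the scheme amenable to GPU parallelization.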
We release the Python library sigkernel, publicly available at https://github.com/crispitagorico/sigkernel.

Chapter 2: Distribution regression refers to the supervised learning problem where inputs are probability distributions (or samples thereof) and outputs are scalars. In practice it can be used to address situations where labels are only available for groups of inputs instead of individual inputs. In this chapter we develop a rigorous mathematical framework for distribution regression where inputs are streams of data. Leveraging properties established in Chapter 1, two new learning techniques are introduced, one feature-based and the other kernel-based. Each is suited to a different data regime in terms of the number of data streams and the dimensionality of the individual streams. We provide theoretical results on the universality of both approaches and demonstrate empirically their robustness to irregularly sampled time series, achieving state-of-the-art performance on examples from thermodynamics, agricultural science and cyber security.

Chapter 3: Neural controlled differential equations (CDEs) are the continuous-time analogue of recurrent neural networks (RNNs), just as neural ordinary differential equations (ODEs) are to residual networks (ResNets), and they offer a memory-efficient continuous-time way to model functions of potentially irregular time series. Existing methods for computing the forward pass of a Neural CDE involve embedding the incoming time series into path space via interpolation and using evaluations of this path to drive the hidden state. Using rough path theory it is possible to extend this formulation: instead of directly embedding the data into a path via interpolation, we represent the input signal over small time intervals through its log-signature. This approach derives from the log-ODE method, an efficient numerical method for solving rough differential equations (RDEs).
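At depth 2, the log-signature of a path segment consists of its total increment together with its Lévy area, which gives a flavour of the compressed local summary that such a method feeds into the vector field in place of raw interpolated values. The following is a minimal illustrative sketch (the function name is ours; in practice a dedicated signature library would be used):

```python
import numpy as np

def logsig_depth2(path):
    """Depth-2 log-signature of a piecewise-linear path given by its sample
    points, shape (n, d). Returns (level1, level2) where level1 is the total
    increment, shape (d,), and level2 is the antisymmetric Levy-area matrix,
    shape (d, d). Illustrative sketch only."""
    inc = np.diff(path, axis=0)            # increments, shape (n-1, d)
    level1 = inc.sum(axis=0)               # total displacement
    # Levy area: 0.5 * sum_{i<j} (inc_i (x) inc_j - inc_j (x) inc_i),
    # computed via prefix sums of the increments
    cum = np.cumsum(inc, axis=0) - inc     # sum of increments before step i
    level2 = 0.5 * (cum.T @ inc - inc.T @ cum)
    return level1, level2
```

For a straight line the Lévy area vanishes, while for a closed loop the increment vanishes and the Lévy area records the signed area enclosed; it is exactly this second-order information, invisible to the raw increments, that the richer local description retains over each small time interval.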
Experimental results suggest that this extension is particularly well suited to problems with long time series.

Chapter 4: This chapter provides an algebraic description of some classes of real-valued functions on streams. More precisely, we identify explicit bases for the algebra of linear functionals on grouplike elements with pointwise multiplication, or equivalently for the space of continuous functions on unparametrized paths. The results can be interpreted as structure theorems for streams, splitting functions on streams (possibly uniquely) into two components: (a) a first that extracts and packages the information of a stream into recursively defined atomic objects, and (b) a second that evaluates a polynomial in these elementary components without further reference to the original stream. The atomic objects in (a) are described via different algebraic products and using classical combinatorial objects (Hall sets) arising in the study of free Lie algebras.

[1] Franz J. Király and Harald Oberhauser. Kernels for sequentially ordered data. Journal of Machine Learning Research, 2019.