Smoothed Online Learning: Theory and Applications


Bibliographic Details
Main Author: Block, Adam B.
Other Authors: Rakhlin, Alexander
Format: Thesis
Published: Massachusetts Institute of Technology, 2024
Online Access: https://hdl.handle.net/1721.1/155382
Description
Summary: Many of the algorithms and theoretical results surrounding modern machine learning are predicated on the assumption that data are independent and identically distributed. Motivated by the numerous applications that do not satisfy this assumption, many researchers have been interested in relaxations of this condition, with online learning being the weakest such assumption. In this setting, the learner observes data points one at a time and makes predictions before incorporating the data into a training set, with the goal of predicting new data points as well as possible. Because no assumptions are placed on the data, this setting is both computationally and statistically challenging. In this thesis, we investigate the statistical rates and efficient algorithms achievable when the data are constrained in a natural way motivated by the smoothed analysis of algorithms. The first part covers the statistical rates achievable by an arbitrary algorithm without regard to efficiency, treating both the fully adversarial setting and the constrained setting in which improved rates are possible. The second part of the thesis focuses on efficient algorithms for this constrained setting, as well as special cases where the bounds can be improved under additional structure. Finally, in the third part we investigate applications of these techniques to sequential decision making, robotics, and differential privacy. We introduce a number of novel techniques, including a Gaussian anti-concentration inequality and a new norm comparison for dependent data.
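The predict-then-update protocol described in the summary can be sketched in a few lines. This is an illustrative toy, not from the thesis: the data stream, the learner (a running-mean predictor), and the squared loss are all assumptions chosen for concreteness.

```python
def online_learning(stream):
    """Toy online learning loop: on each round, predict from past data only,
    suffer a loss, then add the observed point to the training history."""
    history = []
    total_loss = 0.0
    for x in stream:
        # Predict using only previously seen points (0.0 before any data).
        prediction = sum(history) / len(history) if history else 0.0
        # Suffer squared loss on this round's observation.
        total_loss += (prediction - x) ** 2
        # Only now incorporate the new point into the training set.
        history.append(x)
    return total_loss

print(online_learning([1.0, 1.0, 1.0, 1.0]))  # → 1.0 (all loss on round 1)
```

With no assumptions on the stream, an adversary can make such a learner's cumulative loss large; the thesis studies what rates become achievable when the data are smoothed rather than fully adversarial.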