Learning-Augmented Algorithms

Traditional worst case analysis of algorithms does not fully capture real world behavior in many instances. Inspired by the great success of machine learning algorithms for various practical tasks, there has been recent interest in moving beyond pessimistic analysis of algorithms through the use of...

Full description

Bibliographic Details
Main Author: Silwal, Sandeep
Other Authors: Indyk, Piotr
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access:https://hdl.handle.net/1721.1/139333
_version_ 1811093189364285440
author Silwal, Sandeep
author2 Indyk, Piotr
author_facet Indyk, Piotr
Silwal, Sandeep
author_sort Silwal, Sandeep
collection MIT
description Traditional worst case analysis of algorithms does not fully capture real world behavior in many instances. Inspired by the great success of machine learning algorithms for various practical tasks, there has been recent interest in moving beyond pessimistic analysis of algorithms through the use of additional learned information. In this thesis, we consider further application of this “learning-augmented” framework for three classical algorithms problems: 𝑘-means clustering, counting triangles in a graph stream, and estimating the support of a discrete distribution. The problems we study are fundamental in their own right; clustering is typically one of the first methods used to understand the structure of large datasets and 𝑘-means is the most popular clustering formulation by far. In addition, counting triangles in a graph is a basic tool of network analytics and community detection in social networks. Lastly, the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its element occurs in many applications, including biology, genomics, computer systems and linguistics. In each of these applications, we design algorithms that use predictors (that are based, e.g., on prior instances of the problem) which provide structural information about the inputs. Our theoretical analysis shows that such information can indeed be leveraged to overcome worst case barriers. In addition, we also show that such predictors can be implemented in practice and our algorithms are evaluated on real world datasets. Our experiments demonstrate substantial improvements in the performance compared to prior state-of-the-art algorithms that do not employ any learned information.
first_indexed 2024-09-23T15:41:05Z
format Thesis
id mit-1721.1/139333
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T15:41:05Z
publishDate 2022
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/1393332022-01-15T03:08:22Z Learning-Augmented Algorithms Silwal, Sandeep Indyk, Piotr Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Traditional worst case analysis of algorithms does not fully capture real world behavior in many instances. Inspired by the great success of machine learning algorithms for various practical tasks, there has been recent interest in moving beyond pessimistic analysis of algorithms through the use of additional learned information. In this thesis, we consider further application of this “learning-augmented” framework for three classical algorithms problems: 𝑘-means clustering, counting triangles in a graph stream, and estimating the support of a discrete distribution. The problems we study are fundamental in their own right; clustering is typically one of the first methods used to understand the structure of large datasets and 𝑘-means is the most popular clustering formulation by far. In addition, counting triangles in a graph is a basic tool of network analytics and community detection in social networks. Lastly, the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its element occurs in many applications, including biology, genomics, computer systems and linguistics. In each of these applications, we design algorithms that use predictors (that are based, e.g., on prior instances of the problem) which provide structural information about the inputs. Our theoretical analysis shows that such information can indeed be leveraged to overcome worst case barriers. In addition, we also show that such predictors can be implemented in practice and our algorithms are evaluated on real world datasets. Our experiments demonstrate substantial improvements in the performance compared to prior state-of-the-art algorithms that do not employ any learned information. S.M. 2022-01-14T15:04:40Z 2022-01-14T15:04:40Z 2021-06 2021-06-24T19:40:12.512Z Thesis https://hdl.handle.net/1721.1/139333 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle Silwal, Sandeep
Learning-Augmented Algorithms
title Learning-Augmented Algorithms
title_full Learning-Augmented Algorithms
title_fullStr Learning-Augmented Algorithms
title_full_unstemmed Learning-Augmented Algorithms
title_short Learning-Augmented Algorithms
title_sort learning augmented algorithms
url https://hdl.handle.net/1721.1/139333
work_keys_str_mv AT silwalsandeep learningaugmentedalgorithms