Learning-Augmented Algorithms
Traditional worst case analysis of algorithms does not fully capture real world behavior in many instances. Inspired by the great success of machine learning algorithms for various practical tasks, there has been recent interest in moving beyond pessimistic analysis of algorithms through the use of...
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis |
Published: |
Massachusetts Institute of Technology
2022
|
Online Access: | https://hdl.handle.net/1721.1/139333 |
_version_ | 1811093189364285440 |
---|---|
author | Silwal, Sandeep |
author2 | Indyk, Piotr |
author_facet | Indyk, Piotr Silwal, Sandeep |
author_sort | Silwal, Sandeep |
collection | MIT |
description | Traditional worst case analysis of algorithms does not fully capture real world behavior in many instances. Inspired by the great success of machine learning algorithms for various practical tasks, there has been recent interest in moving beyond pessimistic analysis of algorithms through the use of additional learned information.
In this thesis, we consider further application of this “learning-augmented” framework for three classical algorithms problems: 𝑘-means clustering, counting triangles in a graph stream, and estimating the support of a discrete distribution.
The problems we study are fundamental in their own right; clustering is typically one of the first methods used to understand the structure of large datasets and 𝑘-means is the most popular clustering formulation by far. In addition, counting triangles in a graph is a basic tool of network analytics and community detection in social networks. Lastly, the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its element occurs in many applications, including biology, genomics, computer systems and linguistics.
In each of these applications, we design algorithms that use predictors (that are based, e.g., on prior instances of the problem) which provide structural information about the inputs. Our theoretical analysis shows that such information can indeed be leveraged to overcome worst case barriers. In addition, we also show that such predictors can be implemented in practice and our algorithms are evaluated on real world datasets. Our experiments demonstrate substantial improvements in the performance compared to prior state-of-the-art algorithms that do not employ any learned information. |
first_indexed | 2024-09-23T15:41:05Z |
format | Thesis |
id | mit-1721.1/139333 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T15:41:05Z |
publishDate | 2022 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/1393332022-01-15T03:08:22Z Learning-Augmented Algorithms Silwal, Sandeep Indyk, Piotr Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Traditional worst case analysis of algorithms does not fully capture real world behavior in many instances. Inspired by the great success of machine learning algorithms for various practical tasks, there has been recent interest in moving beyond pessimistic analysis of algorithms through the use of additional learned information. In this thesis, we consider further application of this “learning-augmented” framework for three classical algorithms problems: 𝑘-means clustering, counting triangles in a graph stream, and estimating the support of a discrete distribution. The problems we study are fundamental in their own right; clustering is typically one of the first methods used to understand the structure of large datasets and 𝑘-means is the most popular clustering formulation by far. In addition, counting triangles in a graph is a basic tool of network analytics and community detection in social networks. Lastly, the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its element occurs in many applications, including biology, genomics, computer systems and linguistics. In each of these applications, we design algorithms that use predictors (that are based, e.g., on prior instances of the problem) which provide structural information about the inputs. Our theoretical analysis shows that such information can indeed be leveraged to overcome worst case barriers. In addition, we also show that such predictors can be implemented in practice and our algorithms are evaluated on real world datasets. Our experiments demonstrate substantial improvements in the performance compared to prior state-of-the-art algorithms that do not employ any learned information. S.M. 2022-01-14T15:04:40Z 2022-01-14T15:04:40Z 2021-06 2021-06-24T19:40:12.512Z Thesis https://hdl.handle.net/1721.1/139333 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology |
spellingShingle | Silwal, Sandeep Learning-Augmented Algorithms |
title | Learning-Augmented Algorithms |
title_full | Learning-Augmented Algorithms |
title_fullStr | Learning-Augmented Algorithms |
title_full_unstemmed | Learning-Augmented Algorithms |
title_short | Learning-Augmented Algorithms |
title_sort | learning augmented algorithms |
url | https://hdl.handle.net/1721.1/139333 |
work_keys_str_mv | AT silwalsandeep learningaugmentedalgorithms |