On Algorithmic Progress in Data Structures and Approximation Algorithms


Bibliographic Details
Main Author: Li, Jeffery
Other Authors: Lynch, Jayson
Format: Thesis
Published: Massachusetts Institute of Technology, 2024
Online Access: https://hdl.handle.net/1721.1/156755
Description
Summary: In the big data regime, computer systems and algorithms must process large amounts of data, making many traditional exact algorithms too costly to run. To work around this, researchers have developed approximation algorithms, which trade off some accuracy for asymptotic improvements in runtime, and data structures, which can efficiently store and answer multiple queries about a dataset. This naturally leads to the question: how have approximation algorithms and data structures improved over the years? Here, we provide some insight into this question, examining trends in algorithmic and data structure progress, tradeoffs between speed and accuracy or between the runtimes of specific data structure operations, and specific problems of interest. Our analysis is based on a dataset of around 300 approximation algorithms and around 250 data structures. In both fields, we find that research remains fairly active to the present day, though significant or asymptotic gains for data structures have been slowly declining. Improvements have also been fairly heterogeneous: some problems have received a great deal of work and improvement, while others have seen much less progress. In addition, among the problems that have both exact and approximation algorithms, for around 1/6 of them the approximation algorithms' average yearly improvement rates vastly exceed those of the exact algorithms, while for around 1/2 of them the approximation algorithms offer minimal improvement over exact algorithms. For data structures, we find that only 4 of the 28 abstract data types in our dataset have ever exhibited a tradeoff between storage requirements and/or the runtimes of specific operations, and only 2 of these tradeoffs persist today, suggesting that improvements generally build on one another without increasing space usage or slowing other operations.
This research helps us understand how approximation algorithms and data structures have progressed through the years and where they stand today.
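The accuracy-for-runtime tradeoff described in the summary can be illustrated with a classic textbook example that is not drawn from the thesis itself: minimum vertex cover is NP-hard to solve exactly, yet a simple greedy matching-based algorithm runs in linear time and always returns a cover at most twice the optimal size. A minimal sketch:

```python
def approx_vertex_cover(edges):
    """Return a vertex cover at most twice the optimal size, in O(|E|) time.

    Greedy maximal-matching heuristic: repeatedly take both endpoints of
    any edge not yet covered. Any optimal cover must contain at least one
    endpoint of each such edge, which yields the factor-2 guarantee.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover


# Example: path graph 0-1-2-3. The optimal cover {1, 2} has size 2,
# so the approximation is guaranteed to return a cover of size <= 4.
edges = [(0, 1), (1, 2), (2, 3)]
cover = approx_vertex_cover(edges)
assert all(u in cover or v in cover for u, v in edges)  # every edge covered
assert len(cover) <= 4  # within factor 2 of optimal
```

An exact solver for the same problem would need exponential time in the worst case; the constant-factor loss in solution quality is the price paid for the asymptotic runtime improvement.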