Large-Scale Algorithms for Machine Learning: Efficiency, Estimation Errors, and Beyond

Optimization algorithms are a cornerstone of machine learning and statistical inference. The advent of large-scale datasets introduces computational challenges and necessitates more efficient algorithms. Modern optimization techniques are often tailored to particular machine learning problems: they exploit the structural characteristics of each problem and thereby achieve better efficiency than generic methods applied to the same problems. Another key aspect of studying learning algorithms is understanding the estimation accuracy of the resulting estimator. In some scenarios, exact optimization over the training set may be impractical, yet simple and effective heuristics can achieve good estimation accuracy under an appropriate statistical model.

In this thesis, we examine several large-scale algorithms from both optimization and statistical perspectives. In Chapters 2 and 3, we study two continuous optimization algorithms tailored to structural constraints. Chapter 2 focuses on a generalized Frank-Wolfe method for unbounded, cylinder-like constraint sets. Chapter 3 focuses on a CD-like (coordinate-descent-like) method for polyhedral constraints with a small number of extreme points. Both methods achieve state-of-the-art performance by exploiting problem structure. In Chapter 4, we study a variant of linear regression with possible mismatches between covariate-response pairs. We study a simple and efficient heuristic method and give a rigorous analysis of its estimation error in a statistical setting. In Chapters 5 and 6, we examine two algorithms for decision trees. Chapter 5 studies the computation of optimal decision trees and introduces a new branch-and-bound method for optimal decision trees with general continuous features. Chapter 6 turns to the analysis of the CART algorithm under a sufficient impurity decrease condition; we prove tight error bounds for signal functions satisfying this condition and discuss several function classes that satisfy it. Chapter 7 studies a density estimation problem with shape restrictions. We propose a cubic-Newton method framework for the computation and study the approximation properties of finite mixtures.
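For background on the Frank-Wolfe family referenced for Chapter 2, the sketch below shows only the classical method on a bounded set (the probability simplex), not the thesis's generalized variant for unbounded, cylinder-like sets. The function name `frank_wolfe_simplex` and the least-squares objective are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def frank_wolfe_simplex(A, b, n_iters=200):
    """Classical Frank-Wolfe for min ||Ax - b||^2 over the probability simplex.

    The linear minimization oracle over the simplex simply picks the vertex
    (coordinate) with the smallest gradient entry.
    """
    n = A.shape[1]
    x = np.ones(n) / n                      # start at the simplex barycenter
    for t in range(n_iters):
        grad = 2.0 * A.T @ (A @ x - b)      # gradient of the least-squares loss
        s = np.zeros(n)
        s[np.argmin(grad)] = 1.0            # LMO: best vertex of the simplex
        gamma = 2.0 / (t + 2.0)             # standard open-loop step size
        x = (1.0 - gamma) * x + gamma * s   # convex combination stays feasible
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 10))
    b = rng.standard_normal(50)
    x = frank_wolfe_simplex(A, b)
    print("objective:", np.sum((A @ x - b) ** 2), "sum(x):", x.sum())
```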
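For the "sufficient impurity decrease" condition referenced for Chapter 6, the sketch below illustrates the standard CART split search for regression trees and the impurity-decrease quantity it greedily maximizes. This is textbook CART for a single feature, not the thesis's analysis; `best_split` is a hypothetical helper name.

```python
import numpy as np

def best_split(x, y):
    """CART-style search for the single split maximizing impurity decrease.

    Impurity is the within-node variance (regression trees); the returned
    threshold t splits the data into {x <= t} and {x > t}.
    """
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]
    n = len(y_sorted)
    parent_impurity = np.var(y_sorted)
    best_t, best_decrease = None, 0.0
    for i in range(1, n):
        if x_sorted[i] == x_sorted[i - 1]:
            continue                      # no valid threshold between equal values
        left, right = y_sorted[:i], y_sorted[i:]
        weighted = (i * np.var(left) + (n - i) * np.var(right)) / n
        decrease = parent_impurity - weighted
        if decrease > best_decrease:
            best_decrease = decrease
            best_t = 0.5 * (x_sorted[i - 1] + x_sorted[i])
    return best_t, best_decrease

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(size=100)
    y = (x > 0.5).astype(float) + 0.1 * rng.standard_normal(100)
    print(best_split(x, y))               # threshold near 0.5, large decrease
```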

Bibliographic Details
Main Author: Wang, Haoyue
Other Authors: Mazumder, Rahul
Department: Massachusetts Institute of Technology. Operations Research Center; Sloan School of Management
Degree: Ph.D.
Format: Thesis
Published: Massachusetts Institute of Technology, 2024
License: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0); copyright retained by author(s); https://creativecommons.org/licenses/by-nc-nd/4.0/
Online Access: https://hdl.handle.net/1721.1/155490