Optimization Methods for Machine Learning under Structural Constraints
In modern statistical and machine learning models, structural constraints are often imposed to improve model interpretability and to reduce model complexity. In this thesis, we present scalable optimization methods for several large-scale machine learning problems under structural constraints, with a focus on shape constraints in nonparametric statistics and sparsity in high-dimensional statistics.
Main Author: | Chen, Wenyu
---|---
Other Authors: | Mazumder, Rahul
Format: | Thesis
Published: | Massachusetts Institute of Technology, 2023
Online Access: | https://hdl.handle.net/1721.1/151349
author | Chen, Wenyu |
author2 | Mazumder, Rahul |
collection | MIT |
description | In modern statistical and machine learning models, structural constraints are often imposed to improve model interpretability and to reduce model complexity. In this thesis, we present scalable optimization methods for several large-scale machine learning problems under structural constraints, with a focus on shape constraints in nonparametric statistics and sparsity in high-dimensional statistics.
In the first chapter, we consider the subgradient regularized convex regression problem, which fits a convex function of the covariates to the target variable (a standard formulation is sketched after this field). We propose large-scale algorithms based on proximal gradient descent and active-set methods, and derive novel linear convergence guarantees for them. Empirically, our framework can approximately solve instances with n = 100,000 and d = 10 within minutes.
In the second chapter, we develop a new computational framework for computing the log-concave maximum likelihood estimator (MLE), based on smoothing techniques combined with an integral discretization of increasing accuracy (see the convex reformulation below). We establish convergence guarantees for our approaches and demonstrate significant runtime improvements over earlier convex approaches.
In the third chapter, we focus on Gaussian Graphical Models, where the goal is to estimate a sparse precision matrix from i.i.d. multivariate Gaussian samples. We propose a novel estimator via an ℓ₀ℓ₂-penalized pseudolikelihood, and design a specialized nonlinear Branch-and-Bound (BnB) framework that solves a mixed integer programming (MIP) formulation of the proposed estimator (the penalty's proximal step is sketched below). Our estimator scales to p ≈ 10,000 and runs faster than competing ℓ₁ approaches, while delivering superior statistical performance.
In the fourth chapter, we improve the BnB framework for sparse learning problems with the ℓ₀ℓ₂ penalty and general smooth convex losses. We present a novel screening procedure within the BnB framework that fixes relaxed binary variables to 0 or 1 with guarantees (see the sketch below). Our experiments indicate that this screening procedure can significantly reduce the runtimes of BnB solvers. |
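For context on the first chapter: subgradient regularized convex regression is commonly posed as a quadratic program over fitted values φᵢ and subgradients ξᵢ, one pair per sample. The form below, including the regularization weight λ, is a standard formulation we assume for illustration, not a quotation from the thesis.

```latex
% One standard formulation (assumed, not quoted from the thesis):
% fitted values \phi_i and subgradients \xi_i at each sample x_i.
\min_{\phi_i \in \mathbb{R},\ \xi_i \in \mathbb{R}^d}\;
  \frac{1}{2}\sum_{i=1}^{n} (y_i - \phi_i)^2
  \;+\; \frac{\lambda}{2}\sum_{i=1}^{n} \lVert \xi_i \rVert_2^2
\quad \text{s.t.}\quad
  \phi_j \ \ge\ \phi_i + \xi_i^{\top}(x_j - x_i)
  \quad \forall\, i, j \in \{1, \dots, n\}.
```

The O(n²) coupling constraints suggest why active-set methods pay off: at an optimum only a small fraction of them are typically tight, so working with a reduced constraint set keeps each proximal gradient step cheap.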
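For the second chapter: the log-concave MLE has a well-known convex reformulation (due to Cule, Samworth, and Stewart) in which the density is parametrized as f = e^{−φ} for a convex function φ; the integral term below is exactly what an "integral discretization of increasing accuracy" must approximate.

```latex
% Convex reformulation of the log-concave MLE: minimize over convex
% \varphi, then recover the density as \hat{f} = e^{-\varphi}.
\min_{\varphi\ \text{convex}}\;
  \frac{1}{n}\sum_{i=1}^{n} \varphi(X_i)
  \;+\; \int_{\mathbb{R}^d} e^{-\varphi(x)}\, dx.
```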
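For the third and fourth chapters: the ℓ₀ℓ₂ penalty λ₀‖β‖₀ + λ₂‖β‖₂² admits a closed-form proximal operator (ridge shrinkage followed by hard thresholding), the workhorse of proximal methods for such penalties. This is a minimal standalone sketch; the function name `prox_l0l2` and its parameters are ours, not an API from the thesis.

```python
import numpy as np

def prox_l0l2(c: np.ndarray, step: float, lam0: float, lam2: float) -> np.ndarray:
    """Proximal operator of beta -> lam0*||beta||_0 + lam2*||beta||_2^2.

    Solves, coordinate-wise:
        min_b  (1/(2*step)) * (b - c)**2 + lam0*1[b != 0] + lam2*b**2.
    The nonzero candidate is the ridge-shrunk value c/(1 + 2*step*lam2);
    it beats setting b = 0 exactly when
        |c| > sqrt(2*step*lam0*(1 + 2*step*lam2)).
    """
    shrink = c / (1.0 + 2.0 * step * lam2)
    threshold = np.sqrt(2.0 * step * lam0 * (1.0 + 2.0 * step * lam2))
    return np.where(np.abs(c) > threshold, shrink, 0.0)

# Tiny usage example: small entries are zeroed out, large ones shrunk.
c = np.array([0.05, -0.8, 2.3])
print(prox_l0l2(c, step=1.0, lam0=0.1, lam2=0.05))
```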
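Also for the fourth chapter, a generic version of BnB screening: if forcing a relaxed binary variable zᵢ to one of its values already pushes a valid lower bound above the incumbent upper bound, that value is ruled out and zᵢ can be fixed to the other one. The sketch below illustrates the idea with a hypothetical bound oracle `lower_bound_if`; the thesis's actual bound computations and guarantees differ.

```python
from typing import Callable, Dict

def screen_variables(
    relaxed: Dict[int, float],                    # z_i values strictly in (0, 1)
    lower_bound_if: Callable[[int, int], float],  # valid lower bound with z_i fixed to 0 or 1
    incumbent: float,                             # best feasible objective found so far
) -> Dict[int, int]:
    """Fix relaxed binaries whose opposite value is provably suboptimal."""
    fixed: Dict[int, int] = {}
    for i in relaxed:
        rule_out_0 = lower_bound_if(i, 0) > incumbent
        rule_out_1 = lower_bound_if(i, 1) > incumbent
        if rule_out_0 and rule_out_1:
            # Neither value can beat the incumbent: the whole node is
            # prunable. A real solver would fathom it; we just flag it.
            raise RuntimeError(f"node prunable: variable {i} has no improving value")
        if rule_out_0:
            fixed[i] = 1  # z_i = 0 is ruled out
        elif rule_out_1:
            fixed[i] = 0  # z_i = 1 is ruled out
    return fixed

# Toy usage with a hypothetical bound oracle.
bounds = {(0, 0): 12.0, (0, 1): 7.5, (1, 0): 8.0, (1, 1): 11.0}
fixed = screen_variables({0: 0.4, 1: 0.6}, lambda i, v: bounds[(i, v)], incumbent=10.0)
print(fixed)  # {0: 1, 1: 0}
```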
format | Thesis |
id | mit-1721.1/151349 |
institution | Massachusetts Institute of Technology |
publishDate | 2023 |
publisher | Massachusetts Institute of Technology |
department | Massachusetts Institute of Technology. Operations Research Center
degree | Ph.D.
orcid | 0009-0007-7952-7735
rights | In Copyright - Educational Use Permitted; copyright retained by author(s); https://rightsstatements.org/page/InC-EDU/1.0/
file format | application/pdf
title | Optimization Methods for Machine Learning under Structural Constraints |
url | https://hdl.handle.net/1721.1/151349 |