Optimization Methods for Machine Learning under Structural Constraints

In modern statistical and machine learning models, structural constraints are often imposed to improve model interpretability and to reduce model complexity. In this thesis, we present scalable optimization methods for several large-scale machine learning problems under structural constraints, with a focus on shape constraints in nonparametric statistics and sparsity in high-dimensional statistics.

In the first chapter, we consider the subgradient regularized convex regression problem, which fits a convex function relating the covariates to the target variable. We propose novel large-scale algorithms based on proximal gradient descent and active set methods, and derive new linear convergence guarantees for these algorithms. Empirically, our framework can approximately solve instances with n = 100,000 and d = 10 within minutes.

In the second chapter, we develop a new computational framework for computing the log-concave density MLE, based on smoothing techniques combined with an integral discretization of increasing accuracy. We establish convergence guarantees for our approaches and demonstrate significant runtime improvements over earlier convex approaches.

In the third chapter, we focus on Gaussian graphical models, which aim to estimate a sparse precision matrix from i.i.d. multivariate Gaussian samples. We propose a novel estimator via an ℓ₀ℓ₂-penalized pseudolikelihood, and design a specialized nonlinear branch-and-bound (BnB) framework that solves a mixed integer programming (MIP) formulation of the proposed estimator. Our estimator scales to p ≈ 10,000 and runs faster than competing ℓ₁ approaches, while achieving superior statistical performance.

In the fourth chapter, we improve the BnB framework for sparse learning problems with the ℓ₀ℓ₂ penalty and general smooth convex losses. We present a novel screening procedure within the BnB framework that fixes relaxed variables to 0 or 1 with guarantees. Our experiments indicate that this screening procedure can significantly reduce the runtimes of BnB solvers.
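For orientation, here are standard formulations from the literature matching the problems the abstract describes (sketches only; the thesis's exact notation, weights, and regularization choices may differ).

Chapter 1 (subgradient regularized convex regression): given data (x_i, y_i) with x_i ∈ R^d, one fits function values θ_i and subgradients ξ_i via the quadratic program

    \min_{\theta \in \mathbb{R}^n,\ \xi_1,\dots,\xi_n \in \mathbb{R}^d} \ \sum_{i=1}^{n} (y_i - \theta_i)^2 \;+\; \lambda \sum_{i=1}^{n} \|\xi_i\|_2^2
    \quad \text{s.t.} \quad \theta_j \,\ge\, \theta_i + \xi_i^\top (x_j - x_i) \quad \forall\, i, j.

The O(n²) coupling constraints force the fitted values to be consistent with some convex function, which is why specialized proximal gradient and active set methods matter at n = 100,000.

Chapter 2 (log-concave density MLE): writing the density as f = e^{-\varphi} with \varphi convex, the MLE can be obtained by solving

    \min_{\varphi \ \text{convex}} \ \frac{1}{n} \sum_{i=1}^{n} \varphi(x_i) \;+\; \int e^{-\varphi(x)} \, dx.

The integral term generally has no closed form, which is where a discretization of increasing accuracy enters.

Chapters 3 and 4 (ℓ₀ℓ₂-penalized sparse learning): with a smooth convex loss L (e.g., the pseudolikelihood) and tuning parameters λ₀, λ₂ > 0,

    \min_{\beta} \ L(\beta) \;+\; \lambda_0 \sum_{j} \mathbf{1}[\beta_j \neq 0] \;+\; \lambda_2 \sum_{j} \beta_j^2.

Modeling each indicator with a binary variable gives the MIP formulation attacked by branch-and-bound; the ℓ₂ term adds shrinkage and tightens the continuous relaxations used within BnB.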

Bibliographic Details
Main Author: Chen, Wenyu
Other Authors: Mazumder, Rahul
Department: Massachusetts Institute of Technology. Operations Research Center
Degree: Ph.D.
Format: Thesis
Published: Massachusetts Institute of Technology, 2023
Online Access: https://hdl.handle.net/1721.1/151349
ORCID: 0009-0007-7952-7735
Rights: In Copyright - Educational Use Permitted; copyright retained by author(s) (https://rightsstatements.org/page/InC-EDU/1.0/)