Optimization Methods for Machine Learning under Structural Constraints
In modern statistical and machine learning models, structural constraints are often imposed to improve model interpretability and to reduce model complexity. In this thesis, we present scalable optimization methods for several large-scale machine learning problems under structural constraints, with a focus on shape constraints in nonparametric statistics and sparsity in high-dimensional statistics.
Main Author: | Chen, Wenyu
---|---
Other Authors: | Mazumder, Rahul
Format: | Thesis
Published: | Massachusetts Institute of Technology, 2023
Online Access: | https://hdl.handle.net/1721.1/151349
author | Chen, Wenyu |
author2 | Mazumder, Rahul |
collection | MIT |
description | In modern statistical and machine learning models, structural constraints are often imposed to improve model interpretability and to reduce model complexity. In this thesis, we present scalable optimization methods for several large-scale machine learning problems under structural constraints, with a focus on shape constraints in nonparametric statistics and sparsity in high-dimensional statistics.
In the first chapter, we consider the subgradient regularized convex regression problem, which fits a convex function of the covariates to the target variable (a standard formulation is sketched after this field). We propose large-scale algorithms based on proximal gradient descent and active-set methods, and derive novel linear convergence guarantees for them. Empirically, our framework can approximately solve instances with n = 100,000 and d = 10 within minutes.
In the second chapter, we develop a new computational framework for computing the log-concave maximum likelihood estimator (MLE), based on smoothing techniques combined with an integral discretization of increasing accuracy (see the convex reformulation below). We establish convergence guarantees for our approaches and demonstrate significant runtime improvements over earlier convex approaches.
In the third chapter, we focus on Gaussian Graphical Models, where the goal is to estimate a sparse precision matrix from i.i.d. multivariate Gaussian samples. We propose a novel estimator via an ℓ₀ℓ₂-penalized pseudolikelihood, and design a specialized nonlinear Branch-and-Bound (BnB) framework that solves a mixed integer programming (MIP) formulation of the proposed estimator (the penalty's proximal step is sketched below). Our estimator scales to p ≈ 10,000 and runs faster than competing ℓ₁ approaches, while delivering superior statistical performance.
In the fourth chapter, we improve the BnB framework for sparse learning problems with the ℓ₀ℓ₂ penalty and general smooth convex losses. We present a novel screening procedure within the BnB framework that fixes relaxed binary variables to 0 or 1 with guarantees (see the sketch below). Our experiments indicate that this screening procedure can significantly reduce the runtimes of BnB solvers. |
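For context on the first chapter: subgradient regularized convex regression is commonly posed as a quadratic program over fitted values φᵢ and subgradients ξᵢ, one pair per sample. The form below, including the regularization weight λ, is a standard formulation we assume for illustration, not a quotation from the thesis.

```latex
% One standard formulation (assumed, not quoted from the thesis):
% fitted values \phi_i and subgradients \xi_i at each sample x_i.
\min_{\phi_i \in \mathbb{R},\ \xi_i \in \mathbb{R}^d}\;
  \frac{1}{2}\sum_{i=1}^{n} (y_i - \phi_i)^2
  \;+\; \frac{\lambda}{2}\sum_{i=1}^{n} \lVert \xi_i \rVert_2^2
\quad \text{s.t.}\quad
  \phi_j \ \ge\ \phi_i + \xi_i^{\top}(x_j - x_i)
  \quad \forall\, i, j \in \{1, \dots, n\}.
```

The O(n²) coupling constraints suggest why active-set methods pay off: at an optimum only a small fraction of them are typically tight, so working with a reduced constraint set keeps each proximal gradient step cheap.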
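For the second chapter: the log-concave MLE has a well-known convex reformulation (due to Cule, Samworth, and Stewart) in which the density is parametrized as f = e^{−φ} for a convex function φ; the integral term below is exactly what an "integral discretization of increasing accuracy" must approximate.

```latex
% Convex reformulation of the log-concave MLE: minimize over convex
% \varphi, then recover the density as \hat{f} = e^{-\varphi}.
\min_{\varphi\ \text{convex}}\;
  \frac{1}{n}\sum_{i=1}^{n} \varphi(X_i)
  \;+\; \int_{\mathbb{R}^d} e^{-\varphi(x)}\, dx.
```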
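For the third and fourth chapters: the ℓ₀ℓ₂ penalty λ₀‖β‖₀ + λ₂‖β‖₂² admits a closed-form proximal operator (ridge shrinkage followed by hard thresholding), the workhorse of proximal methods for such penalties. This is a minimal standalone sketch; the function name `prox_l0l2` and its parameters are ours, not an API from the thesis.

```python
import numpy as np

def prox_l0l2(c: np.ndarray, step: float, lam0: float, lam2: float) -> np.ndarray:
    """Proximal operator of beta -> lam0*||beta||_0 + lam2*||beta||_2^2.

    Solves, coordinate-wise:
        min_b  (1/(2*step)) * (b - c)**2 + lam0*1[b != 0] + lam2*b**2.
    The nonzero candidate is the ridge-shrunk value c/(1 + 2*step*lam2);
    it beats setting b = 0 exactly when
        |c| > sqrt(2*step*lam0*(1 + 2*step*lam2)).
    """
    shrink = c / (1.0 + 2.0 * step * lam2)
    threshold = np.sqrt(2.0 * step * lam0 * (1.0 + 2.0 * step * lam2))
    return np.where(np.abs(c) > threshold, shrink, 0.0)

# Tiny usage example: small entries are zeroed out, large ones shrunk.
c = np.array([0.05, -0.8, 2.3])
print(prox_l0l2(c, step=1.0, lam0=0.1, lam2=0.05))
```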
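Also for the fourth chapter, a generic version of BnB screening: if forcing a relaxed binary variable zᵢ to one of its values already pushes a valid lower bound above the incumbent upper bound, that value is ruled out and zᵢ can be fixed to the other one. The sketch below illustrates the idea with a hypothetical bound oracle `lower_bound_if`; the thesis's actual bound computations and guarantees differ.

```python
from typing import Callable, Dict

def screen_variables(
    relaxed: Dict[int, float],                    # z_i values strictly in (0, 1)
    lower_bound_if: Callable[[int, int], float],  # valid lower bound with z_i fixed to 0 or 1
    incumbent: float,                             # best feasible objective found so far
) -> Dict[int, int]:
    """Fix relaxed binaries whose opposite value is provably suboptimal."""
    fixed: Dict[int, int] = {}
    for i in relaxed:
        rule_out_0 = lower_bound_if(i, 0) > incumbent
        rule_out_1 = lower_bound_if(i, 1) > incumbent
        if rule_out_0 and rule_out_1:
            # Neither value can beat the incumbent: the whole node is
            # prunable. A real solver would fathom it; we just flag it.
            raise RuntimeError(f"node prunable: variable {i} has no improving value")
        if rule_out_0:
            fixed[i] = 1  # z_i = 0 is ruled out
        elif rule_out_1:
            fixed[i] = 0  # z_i = 1 is ruled out
    return fixed

# Toy usage with a hypothetical bound oracle.
bounds = {(0, 0): 12.0, (0, 1): 7.5, (1, 0): 8.0, (1, 1): 11.0}
fixed = screen_variables({0: 0.4, 1: 0.6}, lambda i, v: bounds[(i, v)], incumbent=10.0)
print(fixed)  # {0: 1, 1: 0}
```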
format | Thesis |
id | mit-1721.1/151349 |
institution | Massachusetts Institute of Technology |
publishDate | 2023 |
publisher | Massachusetts Institute of Technology |
department | Massachusetts Institute of Technology. Operations Research Center
degree | Ph.D.
orcid | 0009-0007-7952-7735
rights | In Copyright - Educational Use Permitted; copyright retained by author(s); https://rightsstatements.org/page/InC-EDU/1.0/
file format | application/pdf
title | Optimization Methods for Machine Learning under Structural Constraints |
url | https://hdl.handle.net/1721.1/151349 |