Integrating Optimization and Modern Machine Learning: Theory, Computation, and Healthcare Applications
Optimization and machine learning are two predominant fields for decision-making today. The increasing availability of data over the past years has facilitated advancements in the intersection of these two domains, which in turn has led to better decision support tools. Optimization has significantl...
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis |
Published: |
Massachusetts Institute of Technology
2024
|
Online Access: | https://hdl.handle.net/1721.1/155509 https://orcid.org/0000-0001-5690-8147 |
Summary: | Optimization and machine learning are two predominant fields for decision-making today. The increasing availability of data over the past years has facilitated advancements in the intersection of these two domains, which in turn has led to better decision support tools. Optimization has significantly enhanced traditional machine learning models by refining their training methods, and machine learning has improved many optimization algorithms by enabling better decision-making through accurate predictions.
However, integrating optimization theory with modern machine learning methods, like neural networks and kernel functions, faces two primary challenges. Firstly, these models don't meet the fundamental convexity assumptions of optimization theory. Secondly, these models are primarily used in tasks with numerous parameters and high-dimensional data, requiring highly efficient and scalable algorithms. This focus on efficiency limits consideration for discrete variables and general constraints that are typical in optimization. This thesis introduces novel algorithms to address these challenges.
The work is divided into four chapters, encompassing rigorous theory, computational tools, and diverse applications. In Chapter 1, we extend state-of-the-art tools from robust optimization to non-convex and non-concave settings, allowing us to generate neural networks that are robust against input perturbations. In Chapter 2, we develop a holistic deep learning framework that jointly optimizes for neural network robustness, stability and sparsity by appropriately modifying the loss function. In Chapter 3 we introduce TabText, a flexible methodology that leverages the power of Large Language Models for patient flow predictions from tabular data. Lastly, in Chapter 4 we present a data-driven approach for solving multistage stochastic optimization problems via sparsified kernel methods. |
---|