Predictive and Prescriptive Analytics in Operations Management

The recent surge in data availability, advances in hardware and software, and the recent development and democratization of analytics highlight the critical importance of prediction and prescription in harnessing the power of data to create value through optimal, data-driven decision making. This...

Bibliographic Details
Main Author: Skali Lami, Omar
Other Authors: Perakis, Georgia
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/144567
_version_ 1811081433553305600
author Skali Lami, Omar
author2 Perakis, Georgia
author_facet Perakis, Georgia
Skali Lami, Omar
author_sort Skali Lami, Omar
collection MIT
description The recent surge in data availability, advances in hardware and software, and the recent development and democratization of analytics highlight the critical importance of prediction and prescription in harnessing the power of data to create value through optimal, data-driven decision making. This thesis proposes novel Machine Learning (ML) and optimization methods in (i) predictive analytics, (ii) prescriptive analytics, and (iii) their high-impact applications in operations management. On the predictive side, this thesis tackles the problems of interpretability and predictive power within the context of tree ensembles. The first chapter introduces the Extended Sampled Trees (XSTrees) method, a novel tree ensemble ML method for classification and regression. Instead of learning a single decision tree like CART, or a collection of trees like Random Forests or Gradient Boosting methods, XSTrees learns the entire probability distribution over the tree space. This approach results in good theoretical guarantees and has a significant edge over other ensemble methods in terms of performance. Analytically, we prove that XSTrees converges to the true underlying tree model with rate [formula], where 𝑛 ∈ N is the number of training observations. Experimentally, we show on publicly available datasets, synthetic data, and two real-world case studies that XSTrees is very competitive with the state-of-the-art models, with an average accuracy between 2.5% and 50% higher than competitors for classification and an average R² between 2% and 85% higher for regression. We further highlight the need for and impact of more powerful and interpretable tree-based methods in the second chapter through the problem of ancillary services in targeted advertising under an ML lens. 
This chapter aims to predict the Net Present Value (NPV) of these services, estimate the probability of a customer subscribing to each of them depending on what services are offered to them, and ultimately prescribe the optimal personalized service recommendation that maximizes the expected long-term revenue. First, we propose a novel method called Cluster-While-Classify (CWC). This hybrid optimization-ML method performs joint clustering and classification and subsequently fits a tree-based classifier on the corresponding assignment to predict the sign-up propensity of services based on customer, product, and session-level features. CWC is competitive with the industry state-of-the-art and can be represented in a simple decision tree, making it interpretable and easily actionable. We then use Double Machine Learning (DML) and Causal Forests, another tree-based ML method, to estimate the NPV for each service and finally propose a scalable and efficient iterative optimization strategy to solve the personalized ancillary service recommendation problem. CWC achieved a competitive 74% out-of-sample accuracy which, alongside the rest of the personalized holistic optimization framework, resulted in an estimated 2.5-3.5% uplift in revenue, which in turn translates to an $80-100 million increase in revenue and a $15-20 million increase in profits. On the prescriptive side, this thesis moves away from the predict-then-optimize paradigm by doing the prediction and the prescription jointly, resulting in a lower prescription error and higher robustness. The third chapter presents a holistic framework for prescriptive analytics. Given side data 𝑥, decisions 𝑧, and uncertain quantities 𝑦 that are functions of 𝑥 and 𝑧, we propose a framework that simultaneously predicts 𝑦 and prescribes the “should be” optimal decisions 𝑧̄. The algorithm can accommodate a large number of predictive machine learning models and continuous and discrete decisions of high cardinality. 
It also allows for constraints on these decision variables. We show wide applicability and strong computational performance on synthetic experiments and two real-world case studies. Additionally, we illustrate the impact of these predictive and prescriptive analytics methods in two additional real-world, high-impact applications: healthcare and industrial operations. The fourth chapter proposes an end-to-end framework to help mitigate the COVID-19 pandemic and its impact through case and death prediction, true prevalence estimation, and fair vaccine distribution. We present the methods we developed for predicting cases and deaths using a novel ML-based aggregation method to create a single prediction we call MIT-Cassandra. We further incorporate COVID-19 case prediction to determine the true prevalence and incorporate this prevalence into an optimization model for efficiently and fairly managing the operations of vaccine allocation. This also allows us to provide insights into how prevalence and exposure of the disease in different parts of the population can affect vaccine distribution. In the last chapter, we propose a novel, machine learning-based methodology to improve the efficiency of maintenance operations, from description to prediction to intervention. The proposed methodology has three main components, applied sequentially to the maintenance scheduling problem. First, a data-driven failure modes and effects analysis fully describes the state of equipment at a given time, including the probabilities of each failure mode and its respective causes. Second, a unified predictive model, which slightly adjusts its parameters for each specific piece of equipment, predicts the future state of that equipment. Third, a holistic prescriptive model optimizes maintenance interventions.
first_indexed 2024-09-23T11:46:36Z
format Thesis
id mit-1721.1/144567
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T11:46:36Z
publishDate 2022
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/144567 2022-08-30T03:38:55Z Predictive and Prescriptive Analytics in Operations Management Skali Lami, Omar Perakis, Georgia Massachusetts Institute of Technology. Operations Research Center Ph.D. 2022-08-29T15:56:21Z 2022-08-29T15:56:21Z 2022-05 2022-07-05T19:56:47.754Z Thesis https://hdl.handle.net/1721.1/144567 0000-0002-8208-3035 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle Skali Lami, Omar
Predictive and Prescriptive Analytics in Operations Management
title Predictive and Prescriptive Analytics in Operations Management
title_full Predictive and Prescriptive Analytics in Operations Management
title_fullStr Predictive and Prescriptive Analytics in Operations Management
title_full_unstemmed Predictive and Prescriptive Analytics in Operations Management
title_short Predictive and Prescriptive Analytics in Operations Management
title_sort predictive and prescriptive analytics in operations management
url https://hdl.handle.net/1721.1/144567
work_keys_str_mv AT skalilamiomar predictiveandprescriptiveanalyticsinoperationsmanagement