Methods to Improve Fairness and Accuracy in Machine Learning, with Applications to Financial Algorithms

Bibliographic Details
Main Author: Lazar Reich, Claire
Other Authors: Mikusheva, Anna
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/140188
Description

Critical decisions like loan approvals, foster care placements, and medical interventions are increasingly determined by data-driven prediction algorithms. These algorithms have the potential to greatly aid decision-makers, but in practice, many can be redesigned to achieve outcomes that are fundamentally fairer and more accurate. This thesis consists of three chapters that develop methods toward that aim.

The first chapter, co-authored with Suhas Vijaykumar, demonstrates that it is possible to reconcile two influential criteria for algorithmic fairness that were previously thought to be in conflict: calibration and equal error rates. We present an algorithm that identifies the most accurate set of predictions satisfying both conditions. In a credit-lending application, we compare our procedure to the common practice of omitting sensitive data and show that it raises both profit and the probability that creditworthy individuals receive loans.

The second chapter extends the canonical economic concept of statistical discrimination to algorithmic decision-making. I show that predictive uncertainty often leads algorithms to systematically disadvantage groups with lower-mean outcomes, assigning them smaller true and false positive rates than their higher-mean counterparts. I prove that this disparate impact can occur even when sensitive data and group identifiers are omitted from training, but that it can be resolved if the data are instead enriched. In particular, I demonstrate that data acquisition for lower-mean groups can increase access to opportunity. I call the strategy “affirmative information” and compare it to traditional affirmative action in the classification task of identifying creditworthy borrowers.

The third chapter, co-authored with Suhas Vijaykumar, establishes a geometric distinction between classification and regression that allows risk in these two settings to be more precisely related. In particular, we note that classification risk depends only on the direction of a regressor, and we take advantage of this scale invariance to improve existing guarantees for how classification risk is bounded by the risk in the associated regression problem. Building on these guarantees, our analysis makes it possible to compare classification algorithms more accurately. Furthermore, it establishes a notion of the “direction” of a conditional expectation function that motivates the design of accurate new classifiers.
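The scale-invariance observation in the third chapter can be illustrated directly: rescaling a score function by any positive constant leaves every predicted sign, and hence the 0-1 classification risk, unchanged. The sketch below checks this numerically on synthetic data with a hypothetical linear score; it is an illustration of the general principle, not code from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))               # synthetic features
w = np.array([1.5, -2.0, 0.5])              # hypothetical regressor weights
y = np.sign(X @ w + rng.normal(size=500))   # noisy labels in {-1, +1}

def zero_one_risk(weights):
    """Fraction of points misclassified by sign(X @ weights)."""
    return np.mean(np.sign(X @ weights) != y)

base = zero_one_risk(w)
for c in (0.01, 1.0, 100.0):
    # Positive rescaling changes the regressor's magnitude but not its
    # direction, so every predicted label -- and the risk -- is identical.
    assert zero_one_risk(c * w) == base
print(base)
```

This is exactly why regression risk (which is sensitive to scale) can over-penalize a regressor that points in the right direction, and why exploiting the invariance yields tighter bounds relating the two risks.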
first_indexed 2024-09-23T09:37:46Z
format Thesis
id mit-1721.1/140188
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T09:37:46Z
publishDate 2022
publisher Massachusetts Institute of Technology
record_format dspace
Additional Details

Other Contributors: David Autor, Iván Werning
Department: Massachusetts Institute of Technology. Department of Economics
Degree: Ph.D.
Thesis Date: September 2021
Deposited: 2022-02-07
Format: application/pdf
Rights: In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/