Methods to Improve Fairness and Accuracy in Machine Learning, with Applications to Financial Algorithms

Bibliographic Details
Main Author: Lazar Reich, Claire
Other Authors: Mikusheva, Anna
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/140188
Description

Critical decisions like loan approvals, foster care placements, and medical interventions are increasingly determined by data-driven prediction algorithms. These algorithms have the potential to greatly aid decision-makers, but in practice, many can be redesigned to achieve outcomes that are fundamentally fairer and more accurate. This thesis consists of three chapters that develop methods toward that aim.

The first chapter, co-authored with Suhas Vijaykumar, demonstrates that it is possible to reconcile two influential criteria for algorithmic fairness that were previously thought to be in conflict: calibration and equal error rates. We present an algorithm that identifies the most accurate set of predictions satisfying both conditions. In a credit-lending application, we compare our procedure to the common practice of omitting sensitive data and show that it raises both profit and the probability that creditworthy individuals receive loans.

The second chapter extends the canonical economic concept of statistical discrimination to algorithmic decision-making. I show that predictive uncertainty often leads algorithms to systematically disadvantage groups with lower-mean outcomes, assigning them smaller true and false positive rates than their higher-mean counterparts. I prove that this disparate impact can occur even when sensitive data and group identifiers are omitted from training, but that it can be resolved if the data are instead enriched. In particular, I demonstrate that data acquisition for lower-mean groups can increase access to opportunity. I call the strategy “affirmative information” and compare it to traditional affirmative action in the classification task of identifying creditworthy borrowers.

The third chapter, co-authored with Suhas Vijaykumar, establishes a geometric distinction between classification and regression that allows risk in these two settings to be more precisely related. In particular, we note that classification risk depends only on the direction of a regressor, and we take advantage of this scale invariance to improve existing guarantees for how classification risk is bounded by the risk in the associated regression problem. Building on these guarantees, our analysis makes it possible to compare classification algorithms more accurately. Furthermore, it establishes a notion of the “direction” of a conditional expectation function that motivates the design of accurate new classifiers.
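The scale-invariance observation in the third chapter can be illustrated directly: rescaling a score function by any positive constant leaves every predicted sign, and hence the 0-1 classification risk, unchanged. The sketch below checks this numerically on synthetic data with a hypothetical linear score; it is an illustration of the general principle, not code from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))               # synthetic features
w = np.array([1.5, -2.0, 0.5])              # hypothetical regressor weights
y = np.sign(X @ w + rng.normal(size=500))   # noisy labels in {-1, +1}

def zero_one_risk(weights):
    """Fraction of points misclassified by sign(X @ weights)."""
    return np.mean(np.sign(X @ weights) != y)

base = zero_one_risk(w)
for c in (0.01, 1.0, 100.0):
    # Positive rescaling changes the regressor's magnitude but not its
    # direction, so every predicted label -- and the risk -- is identical.
    assert zero_one_risk(c * w) == base
print(base)
```

This is exactly why regression risk (which is sensitive to scale) can over-penalize a regressor that points in the right direction, and why exploiting the invariance yields tighter bounds relating the two risks.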
first_indexed 2024-09-23T09:37:46Z
format Thesis
id mit-1721.1/140188
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T09:37:46Z
publishDate 2022
publisher Massachusetts Institute of Technology
record_format dspace
Additional Details

Other Contributors: David Autor, Iván Werning
Department: Massachusetts Institute of Technology. Department of Economics
Degree: Ph.D.
Thesis Date: September 2021
Deposited: 2022-02-07
Format: application/pdf
Rights: In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/