Methods to Improve Fairness and Accuracy in Machine Learning, with Applications to Financial Algorithms
Critical decisions like loan approvals, foster care placements, and medical interventions are increasingly determined by data-driven prediction algorithms. These algorithms have the potential to greatly aid decision-makers, but in practice, many can be redesigned to achieve outcomes that are fundamentally fairer and more accurate. This thesis consists of three chapters that develop methods toward that aim.
Main Author: | Lazar Reich, Claire |
---|---|
Other Authors: | Mikusheva, Anna |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2022 |
Online Access: | https://hdl.handle.net/1721.1/140188 |
_version_ | 1811073732862541824 |
---|---|
author | Lazar Reich, Claire |
author2 | Mikusheva, Anna |
author_facet | Mikusheva, Anna; Lazar Reich, Claire |
author_sort | Lazar Reich, Claire |
collection | MIT |
description | Critical decisions like loan approvals, foster care placements, and medical interventions are increasingly determined by data-driven prediction algorithms. These algorithms have the potential to greatly aid decision-makers, but in practice, many can be redesigned to achieve outcomes that are fundamentally fairer and more accurate. This thesis consists of three chapters that develop methods toward that aim.
The first chapter, co-authored with Suhas Vijaykumar, demonstrates that it is possible to reconcile two influential criteria for algorithmic fairness that were previously thought to be in conflict: calibration and equal error rates. We present an algorithm that identifies the most accurate set of predictions satisfying both conditions. In a credit-lending application, we compare our procedure to the common practice of omitting sensitive data and show that it raises both profit and the probability that creditworthy individuals receive loans.
The second chapter extends the canonical economic concept of statistical discrimination to algorithmic decision-making. I show that predictive uncertainty often leads algorithms to systematically disadvantage groups with lower-mean outcomes, assigning them smaller true and false positive rates than their higher-mean counterparts. I prove that this disparate impact can occur even when sensitive data and group identifiers are omitted from training, but that it can be resolved if instead data are enriched. In particular, I demonstrate that data acquisition for lower-mean groups can increase access to opportunity. I call the strategy “affirmative information” and compare it to traditional affirmative action in the classification task of identifying creditworthy borrowers.
The third chapter, co-authored with Suhas Vijaykumar, establishes a geometric distinction between classification and regression that allows risk in these two settings to be more precisely related. In particular, we note that classification risk depends only on the direction of a regressor, and we take advantage of this scale invariance to improve existing guarantees for how classification risk is bounded by the risk in the associated regression problem. Building on these guarantees, our analysis makes it possible to compare classification algorithms more accurately. Furthermore, it establishes a notion of the “direction” of a conditional expectation function that motivates the design of accurate new classifiers. |
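The two fairness criteria named in the first chapter can be made concrete with a small sketch (synthetic data and an illustrative helper, `group_metrics` — not the thesis's algorithm): calibration asks that predicted scores match observed positive rates within each group, while equal error rates ask that groups share true- and false-positive rates under a common decision threshold.

```python
import numpy as np

def group_metrics(scores, outcomes, groups, threshold=0.5):
    """Per-group calibration gap and error rates for score-based decisions.

    calibration_gap: |mean predicted score - observed positive rate|
    tpr: P(score >= threshold | outcome = 1)
    fpr: P(score >= threshold | outcome = 0)
    """
    metrics = {}
    for g in np.unique(groups):
        m = groups == g
        s, y = scores[m], outcomes[m]
        pred = s >= threshold
        metrics[g] = {
            "calibration_gap": abs(s.mean() - y.mean()),
            "tpr": pred[y == 1].mean(),
            "fpr": pred[y == 0].mean(),
        }
    return metrics

# Toy example: two groups with identical score/outcome patterns,
# so both calibration and equal error rates hold across groups.
scores = np.array([0.9, 0.2, 0.8, 0.3, 0.9, 0.2, 0.8, 0.3])
outcomes = np.array([1, 0, 1, 0, 1, 0, 1, 0])
groups = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
print(group_metrics(scores, outcomes, groups))
```

The chapter's contribution is finding the *most accurate* score set satisfying both conditions at once; the sketch above only shows how one would check them.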
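The second chapter's mechanism can be illustrated with a short simulation (a stylized model with assumed parameters, not the thesis's specification): when a fixed rule thresholds a noisy signal of latent quality, the group with the lower mean receives a lower true-positive rate, and enriching that group's data — reducing its signal noise, the "affirmative information" idea — narrows the gap.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_tpr(mu, noise_sd, n=200_000, threshold=0.0):
    """True-positive rate of a fixed threshold rule on noisy signals.

    Latent quality q ~ N(mu, 1); true outcome y = 1{q > 0};
    observed signal s = q + e with e ~ N(0, noise_sd).
    The rule approves whenever s > threshold.
    """
    q = rng.normal(mu, 1.0, n)
    s = q + rng.normal(0.0, noise_sd, n)
    y = q > 0
    approved = s > threshold
    return approved[y].mean()  # share of truly qualified who are approved

# Same decision rule, two groups differing only in mean latent quality:
tpr_high = simulate_tpr(mu=0.5, noise_sd=1.0)
tpr_low = simulate_tpr(mu=-0.5, noise_sd=1.0)

# "Affirmative information": richer data (less noise) for the low-mean group.
tpr_low_enriched = simulate_tpr(mu=-0.5, noise_sd=0.2)

print(tpr_high, tpr_low, tpr_low_enriched)
```

Note that group identity never enters the decision rule; the disparity arises purely from noisier inference about the lower-mean group, which is why omitting sensitive data does not remove it but better data can.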
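The scale-invariance observation behind the third chapter can be checked numerically (synthetic data; this demonstrates only the elementary fact, not the improved risk bounds): the classifier sign(f) is unchanged by any positive rescaling of f, so empirical 0-1 classification risk depends only on the regressor's direction, while the associated regression risk moves with the scale.

```python
import numpy as np

def zero_one_risk(f_values, labels):
    """Empirical 0-1 risk of the classifier sign(f) against +/-1 labels."""
    return np.mean(np.sign(f_values) != labels)

def squared_risk(f_values, labels):
    """Empirical squared (regression) risk of f against +/-1 labels."""
    return np.mean((f_values - labels) ** 2)

rng = np.random.default_rng(1)
x = rng.normal(size=500)
labels = np.sign(x + rng.normal(scale=0.5, size=500))  # noisy +/-1 labels
f = 0.3 * x  # a regressor with a sensible direction but arbitrary scale

# Positive rescaling changes the regression risk but leaves every sign,
# and hence the classification risk, unchanged.
for c in (0.01, 1.0, 100.0):
    assert zero_one_risk(c * f, labels) == zero_one_risk(f, labels)
print(squared_risk(f, labels), squared_risk(10 * f, labels))
```

This invariance is what lets classification risk be bounded by the regression risk of the *best positively rescaled* regressor rather than the regressor itself, which is the lever the chapter uses to sharpen existing guarantees.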
first_indexed | 2024-09-23T09:37:46Z |
format | Thesis |
id | mit-1721.1/140188 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T09:37:46Z |
publishDate | 2022 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/1401882022-02-08T18:58:49Z Methods to Improve Fairness and Accuracy in Machine Learning, with Applications to Financial Algorithms Lazar Reich, Claire Mikusheva, Anna Autor, David Werning, Iván Massachusetts Institute of Technology. Department of Economics Critical decisions like loan approvals, foster care placements, and medical interventions are increasingly determined by data-driven prediction algorithms. These algorithms have the potential to greatly aid decision-makers, but in practice, many can be redesigned to achieve outcomes that are fundamentally fairer and more accurate. This thesis consists of three chapters that develop methods toward that aim. The first chapter, co-authored with Suhas Vijaykumar, demonstrates that it is possible to reconcile two influential criteria for algorithmic fairness that were previously thought to be in conflict: calibration and equal error rates. We present an algorithm that identifies the most accurate set of predictions satisfying both conditions. In a credit-lending application, we compare our procedure to the common practice of omitting sensitive data and show that it raises both profit and the probability that creditworthy individuals receive loans. The second chapter extends the canonical economic concept of statistical discrimination to algorithmic decision-making. I show that predictive uncertainty often leads algorithms to systematically disadvantage groups with lower-mean outcomes, assigning them smaller true and false positive rates than their higher-mean counterparts. I prove that this disparate impact can occur even when sensitive data and group identifiers are omitted from training, but that it can be resolved if instead data are enriched. In particular, I demonstrate that data acquisition for lower-mean groups can increase access to opportunity. I call the strategy “affirmative information” and compare it to traditional affirmative action in the classification task of identifying creditworthy borrowers. The third chapter, co-authored with Suhas Vijaykumar, establishes a geometric distinction between classification and regression that allows risk in these two settings to be more precisely related. In particular, we note that classification risk depends only on the direction of a regressor, and we take advantage of this scale invariance to improve existing guarantees for how classification risk is bounded by the risk in the associated regression problem. Building on these guarantees, our analysis makes it possible to compare classification algorithms more accurately. Furthermore, it establishes a notion of the “direction” of a conditional expectation function that motivates the design of accurate new classifiers. Ph.D. 2022-02-07T15:29:18Z 2022-02-07T15:29:18Z 2021-09 2021-09-15T19:26:58.710Z Thesis https://hdl.handle.net/1721.1/140188 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology |
spellingShingle | Lazar Reich, Claire Methods to Improve Fairness and Accuracy in Machine Learning, with Applications to Financial Algorithms |
title | Methods to Improve Fairness and Accuracy in Machine Learning, with Applications to Financial Algorithms |
title_full | Methods to Improve Fairness and Accuracy in Machine Learning, with Applications to Financial Algorithms |
title_fullStr | Methods to Improve Fairness and Accuracy in Machine Learning, with Applications to Financial Algorithms |
title_full_unstemmed | Methods to Improve Fairness and Accuracy in Machine Learning, with Applications to Financial Algorithms |
title_short | Methods to Improve Fairness and Accuracy in Machine Learning, with Applications to Financial Algorithms |
title_sort | methods to improve fairness and accuracy in machine learning with applications to financial algorithms |
url | https://hdl.handle.net/1721.1/140188 |
work_keys_str_mv | AT lazarreichclaire methodstoimprovefairnessandaccuracyinmachinelearningwithapplicationstofinancialalgorithms |