Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review

<jats:sec id="sec001"> <jats:title>Background</jats:title> <jats:p>While artificial intelligence (AI) offers possibilities of advanced clinical prediction and decision-making in healthcare, models trained on relatively homogeneous datasets, and populations poorly-...

Full description

Bibliographic Details
Main Authors: Celi, Leo Anthony, Cellini, Jacqueline, Charpignon, Marie-Laure, Dee, Edward Christopher, Dernoncourt, Franck, Eber, Rene, Mitchell, William Greig, Moukheiber, Lama, Schirmer, Julian, Situ, Julia, Paguio, Joseph, Park, Joel, Wawira, Judy Gichoya, Yao, Seth
Other Authors: Massachusetts Institute of Technology. Institute for Medical Engineering & Science
Format: Article
Published: Public Library of Science (PLoS) 2022
Online Access: https://hdl.handle.net/1721.1/142623
_version_ 1811072578972811264
author Celi, Leo Anthony
Cellini, Jacqueline
Charpignon, Marie-Laure
Dee, Edward Christopher
Dernoncourt, Franck
Eber, Rene
Mitchell, William Greig
Moukheiber, Lama
Schirmer, Julian
Situ, Julia
Paguio, Joseph
Park, Joel
Wawira, Judy Gichoya
Yao, Seth
author2 Massachusetts Institute of Technology. Institute for Medical Engineering & Science
author_sort Celi, Leo Anthony
collection MIT
description Background: While artificial intelligence (AI) offers possibilities for advanced clinical prediction and decision-making in healthcare, models trained on relatively homogeneous datasets and on populations poorly representative of underlying diversity limit generalisability and risk biased AI-based decisions. Here, we describe the landscape of AI in clinical medicine to delineate population and data-source disparities.
Methods: We performed a scoping review of clinical papers using AI techniques published in PubMed in 2019. We assessed differences in dataset country of origin, clinical specialty, and author nationality, sex, and expertise. A manually tagged subsample of PubMed articles was used to train a model, leveraging transfer learning (building upon an existing BioBERT model), to predict eligibility for inclusion (original, human, clinical AI literature). For all eligible articles, database country of origin and clinical specialty were manually labelled. A BioBERT-based model predicted first/last author expertise. Author nationality was determined from the corresponding authors' institutional affiliations using Entrez Direct, and first/last author sex was inferred using the Gendarize.io API.
Results: Our search yielded 30,576 articles, of which 7,314 (23.9%) were eligible for further analysis. Most databases came from the US (40.8%) and China (13.7%). Radiology was the most represented clinical specialty (40.4%), followed by pathology (9.1%). Authors were primarily from either China (24.0%) or the US (18.4%). First and last authors were predominantly data experts (i.e., statisticians) rather than clinicians (59.6% and 53.9%, respectively), and the majority of first/last authors were male (74.1%).
Interpretation: US and Chinese datasets and authors were disproportionately overrepresented in clinical AI, and almost all of the top 10 databases and author nationalities were from high-income countries (HICs). AI techniques were most commonly employed for image-rich specialties, and authors were predominantly male, with non-clinical backgrounds. Developing technological infrastructure in data-poor regions, together with diligent external validation and model recalibration prior to clinical implementation in the short term, is crucial to ensuring that clinical AI is meaningful for broader populations and does not perpetuate global health inequity.
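The Methods above describe a transfer-learning step in which a BioBERT model is fine-tuned on a manually tagged subsample to predict article eligibility. A minimal sketch of that kind of setup, assuming the Hugging Face transformers and datasets libraries, is shown below; the checkpoint name, file names, column names, and hyperparameters are illustrative assumptions, not the authors' actual pipeline.

# Illustrative sketch (not the authors' code): fine-tune a public BioBERT checkpoint to
# classify whether a PubMed abstract is eligible (original, human, clinical AI literature).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "dmis-lab/biobert-base-cased-v1.1"  # assumed public BioBERT checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Manually tagged subsample: CSV files with an "abstract" text column and a 0/1
# "eligible" label (file and column names are assumptions for this sketch).
data = load_dataset("csv", data_files={"train": "tagged_train.csv",
                                       "validation": "tagged_valid.csv"})

def tokenize(batch):
    return tokenizer(batch["abstract"], truncation=True,
                     padding="max_length", max_length=512)

data = data.map(tokenize, batched=True).rename_column("eligible", "labels")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="biobert-eligibility",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["validation"],
)
trainer.train()
print(trainer.evaluate())  # check held-out performance before scoring the remaining abstracts

Sequence classification with a pretrained biomedical encoder is a standard way to implement this kind of inclusion screening; the Entrez Direct affiliation lookup and Gendarize.io sex inference mentioned in the Methods are separate metadata steps not shown here.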
first_indexed 2024-09-23T09:08:16Z
format Article
id mit-1721.1/142623
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T09:08:16Z
publishDate 2022
publisher Public Library of Science (PLoS)
record_format dspace
spelling mit-1721.1/142623 (record last updated 2024-03-19T14:21:05Z). Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review. Contributing units: Massachusetts Institute of Technology. Institute for Medical Engineering & Science; Massachusetts Institute of Technology. Institute for Data, Systems, and Society; MIT Critical Data (Laboratory). Deposited 2022-05-20T11:56:46Z; published 2022-03-31. Article (http://purl.org/eprint/type/JournalArticle). ISSN 2767-3170. https://hdl.handle.net/1721.1/142623. Citation: Celi, Leo Anthony, Cellini, Jacqueline, Charpignon, Marie-Laure, Dee, Edward Christopher, Dernoncourt, Franck, et al. 2022. "Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review." PLOS Digital Health 1 (3). doi:10.1371/journal.pdig.0000022. Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0). application/pdf. Public Library of Science (PLoS).
title Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review
url https://hdl.handle.net/1721.1/142623