Population heterogeneity in clinical cohorts affects the predictive accuracy of brain imaging.

Bibliographic Details
Main Authors: Oualid Benkarim, Casey Paquola, Bo-Yong Park, Valeria Kebets, Seok-Jun Hong, Reinder Vos de Wael, Shaoshi Zhang, B T Thomas Yeo, Michael Eickenberg, Tian Ge, Jean-Baptiste Poline, Boris C Bernhardt, Danilo Bzdok
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2022-04-01
Series: PLoS Biology
Online Access: https://doi.org/10.1371/journal.pbio.3001627
Collection: DOAJ
Description: Brain imaging research enjoys increasing adoption of supervised machine learning for single-participant disease classification. Yet, the success of these algorithms likely depends on population diversity, including demographic differences and other factors that may be outside of primary scientific interest. Here, we capitalize on propensity scores as a composite confound index to quantify diversity due to major sources of population variation. We delineate the impact of population heterogeneity on the predictive accuracy and pattern stability in 2 separate clinical cohorts: the Autism Brain Imaging Data Exchange (ABIDE, n = 297) and the Healthy Brain Network (HBN, n = 551). Across various analysis scenarios, our results uncover the extent to which cross-validated prediction performances are interlocked with diversity. The instability of extracted brain patterns attributable to diversity is located preferentially in regions part of the default mode network. Collectively, our findings highlight the limitations of prevailing deconfounding practices in mitigating the full consequences of population diversity.
ISSN: 1544-9173, 1545-7885
Citation: PLoS Biology 20(4): e3001627