Insights from an autism imaging biomarker challenge: Promises and threats to biomarker discovery

MRI has been extensively used to identify anatomical and functional differences in Autism Spectrum Disorder (ASD). Yet, many of these findings have proven difficult to replicate because studies rely on small cohorts and are built on many complex, undisclosed, analytic choices. We conducted an intern...

Bibliographic Details
Main Authors: Nicolas Traut, Katja Heuer, Guillaume Lemaître, Anita Beggiato, David Germanaud, Monique Elmaleh, Alban Bethegnies, Laurent Bonnasse-Gahot, Weidong Cai, Stanislas Chambon, Freddy Cliquet, Ayoub Ghriss, Nicolas Guigui, Amicie de Pierrefeu, Meng Wang, Valentina Zantedeschi, Alexandre Boucaud, Joris van den Bossche, Balázs Kegl, Richard Delorme, Thomas Bourgeron, Roberto Toro, Gaël Varoquaux
Format: Article
Language: English
Published: Elsevier 2022-07-01
Series: NeuroImage
Subjects: Autism; diagnostic; machine learning; benchmark; overfit; prediction
Online Access: http://www.sciencedirect.com/science/article/pii/S1053811922002981
author Nicolas Traut
Katja Heuer
Guillaume Lemaître
Anita Beggiato
David Germanaud
Monique Elmaleh
Alban Bethegnies
Laurent Bonnasse-Gahot
Weidong Cai
Stanislas Chambon
Freddy Cliquet
Ayoub Ghriss
Nicolas Guigui
Amicie de Pierrefeu
Meng Wang
Valentina Zantedeschi
Alexandre Boucaud
Joris van den Bossche
Balázs Kegl
Richard Delorme
Thomas Bourgeron
Roberto Toro
Gaël Varoquaux
collection DOAJ
description MRI has been extensively used to identify anatomical and functional differences in Autism Spectrum Disorder (ASD). Yet, many of these findings have proven difficult to replicate because studies rely on small cohorts and are built on many complex, undisclosed, analytic choices. We conducted an international challenge to predict ASD diagnosis from MRI data, where we provided preprocessed anatomical and functional MRI data from > 2,000 individuals. Evaluation of the predictions was rigorously blinded. 146 challengers submitted prediction algorithms, which were evaluated at the end of the challenge using unseen data and an additional acquisition site. On the best algorithms, we studied the importance of MRI modalities, brain regions, and sample size. We found evidence that MRI could predict ASD diagnosis: the 10 best algorithms reliably predicted diagnosis with AUC∼0.80 – far superior to what can be currently obtained using genotyping data in cohorts 20-times larger. We observed that functional MRI was more important for prediction than anatomical MRI, and that increasing sample size steadily increased prediction accuracy, providing an efficient strategy to improve biomarkers. We also observed that despite a strong incentive to generalise to unseen data, model development on a given dataset faces the risk of overfitting: performing well in cross-validation on the data at hand, but not generalising. Finally, we were able to predict ASD diagnosis on an external sample added after the end of the challenge (EU-AIMS), although with a lower prediction accuracy (AUC=0.72). This indicates that despite being based on a large multisite cohort, our challenge still produced biomarkers fragile in the face of dataset shifts.
first_indexed 2024-12-12T04:20:52Z
format Article
id doaj.art-1fea146ac4c34cb3954bdb6784be4e75
institution Directory Open Access Journal
issn 1095-9572
language English
last_indexed 2024-12-12T04:20:52Z
publishDate 2022-07-01
publisher Elsevier
record_format Article
series NeuroImage
spelling doaj.art-1fea146ac4c34cb3954bdb6784be4e75 2022-12-22T00:38:19Z eng Elsevier NeuroImage 1095-9572 2022-07-01 255 119171
Insights from an autism imaging biomarker challenge: Promises and threats to biomarker discovery
Nicolas Traut (Institut Pasteur, Université de Paris, Département de neuroscience, F-75015 Paris, France; Center for Research and Interdisciplinarity (CRI), Université Paris Descartes, Paris, France)
Katja Heuer (Institut Pasteur, Université de Paris, Département de neuroscience, F-75015 Paris, France; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Center for Research and Interdisciplinarity (CRI), Université Paris Descartes, Paris, France)
Guillaume Lemaître (Parietal, Inria, Saclay, France; Paris-Saclay Center for Data Science, Université Paris Saclay, Saclay, France)
Anita Beggiato (Institut Pasteur, Université de Paris, Département de neuroscience, F-75015 Paris, France; Child and Adolescent Psychiatry Department, Robert Debré, APHP, Paris, France)
David Germanaud (Neurospin CEA, Saclay, France)
Monique Elmaleh (Department of Radiology, Robert Debré, APHP, Paris, France)
Alban Bethegnies (Hosa.io, Paris, France)
Laurent Bonnasse-Gahot (Centre d'Analyse et de Mathématique Sociales, EHESS, CNRS, PSL, Paris, France)
Weidong Cai (Stanford University School of Medicine, Palo Alto, US)
Stanislas Chambon (Rythm.co, 75009 Paris)
Freddy Cliquet (Institut Pasteur, Université de Paris, Département de neuroscience, F-75015 Paris, France)
Ayoub Ghriss (University of Colorado, Boulder, US)
Nicolas Guigui (Neurospin CEA, Saclay, France)
Amicie de Pierrefeu (Neurospin CEA, Saclay, France)
Meng Wang (Brainnetome Center and National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Valentina Zantedeschi (Univ Lyon, UJM-Saint-Etienne, CNRS, Institut d'Optique Graduate School, Laboratoire Hubert Curien UMR 5516, F-42023, Saint-Etienne, France)
Alexandre Boucaud (Parietal, Inria, Saclay, France; Paris-Saclay Center for Data Science, Université Paris Saclay, Saclay, France)
Joris van den Bossche (Parietal, Inria, Saclay, France; Paris-Saclay Center for Data Science, Université Paris Saclay, Saclay, France)
Balázs Kegl (Huawei, Paris)
Richard Delorme (Institut Pasteur, Université de Paris, Département de neuroscience, F-75015 Paris, France; Child and Adolescent Psychiatry Department, Robert Debré, APHP, Paris, France)
Thomas Bourgeron (Institut Pasteur, Université de Paris, Département de neuroscience, F-75015 Paris, France)
Roberto Toro (Institut Pasteur, Université de Paris, Département de neuroscience, F-75015 Paris, France)
Gaël Varoquaux (Parietal, Inria, Saclay, France; Soda, Inria, Saclay, France; corresponding author)
MRI has been extensively used to identify anatomical and functional differences in Autism Spectrum Disorder (ASD). Yet, many of these findings have proven difficult to replicate because studies rely on small cohorts and are built on many complex, undisclosed, analytic choices. We conducted an international challenge to predict ASD diagnosis from MRI data, where we provided preprocessed anatomical and functional MRI data from > 2,000 individuals. Evaluation of the predictions was rigorously blinded. 146 challengers submitted prediction algorithms, which were evaluated at the end of the challenge using unseen data and an additional acquisition site. On the best algorithms, we studied the importance of MRI modalities, brain regions, and sample size. We found evidence that MRI could predict ASD diagnosis: the 10 best algorithms reliably predicted diagnosis with AUC∼0.80 – far superior to what can be currently obtained using genotyping data in cohorts 20-times larger. We observed that functional MRI was more important for prediction than anatomical MRI, and that increasing sample size steadily increased prediction accuracy, providing an efficient strategy to improve biomarkers. We also observed that despite a strong incentive to generalise to unseen data, model development on a given dataset faces the risk of overfitting: performing well in cross-validation on the data at hand, but not generalising. Finally, we were able to predict ASD diagnosis on an external sample added after the end of the challenge (EU-AIMS), although with a lower prediction accuracy (AUC=0.72). This indicates that despite being based on a large multisite cohort, our challenge still produced biomarkers fragile in the face of dataset shifts.
http://www.sciencedirect.com/science/article/pii/S1053811922002981
Autism; diagnostic; machine learning; benchmark; overfit; prediction
title Insights from an autism imaging biomarker challenge: Promises and threats to biomarker discovery
topic Autism
diagnostic
machine learning
benchmark
overfit
prediction
url http://www.sciencedirect.com/science/article/pii/S1053811922002981