Prediction of brain sex from EEG: using large-scale heterogeneous dataset for developing a highly accurate and interpretable ML model

Abstract: This study presents a comprehensive examination of sex-related differences in resting-state electroencephalogram (EEG) data, leveraging two different types of machine learning models to predict an individual's sex. We used data from the Two Decades-Brainclinics Research Archive for...

Bibliographic Details
Main Authors: Mariam Khayretdinova, Ilya Zakharov, Polina Pshonkovskaya, Timothy Adamovich, Andrey Kiryasov, Andrey Zhdanov, Alexey Shovkun
Format: Article
Language: English
Published: Elsevier 2024-01-01
Series: NeuroImage
Subjects: Resting state EEG; Sex-related brain differences; DCNN; LightGBM; Feature importance analysis
Online Access: http://www.sciencedirect.com/science/article/pii/S1053811923006456
author Mariam Khayretdinova
Ilya Zakharov
Polina Pshonkovskaya
Timothy Adamovich
Andrey Kiryasov
Andrey Zhdanov
Alexey Shovkun
collection DOAJ
description Abstract: This study presents a comprehensive examination of sex-related differences in resting-state electroencephalogram (EEG) data, leveraging two different types of machine learning models to predict an individual's sex. We used data from the Two Decades-Brainclinics Research Archive for Insights in Neurophysiology (TDBRAIN) EEG study, affirming that gender prediction can be attained with noteworthy accuracy. The best-performing model achieved an accuracy of 85% and an ROC AUC of 89%, surpassing all prior benchmarks set using EEG data and rivaling the top-tier results derived from fMRI studies. A comparative analysis of LightGBM and Deep Convolutional Neural Network (DCNN) models revealed DCNN's superior performance, attributed to its ability to learn complex spatiotemporal patterns in the EEG data and to handle large volumes of data effectively. Despite this, interpretability remained a challenge for the DCNN model. The LightGBM interpretability analysis revealed that the most important EEG features for accurate sex prediction were related to left fronto-central and parietal EEG connectivity. We also showed the role of both low-frequency (delta and theta) and high-frequency (beta and gamma) activity in accurate sex prediction. These results, however, have to be approached with caution, because they were obtained from a dataset composed largely of participants with various mental health conditions, which limits the generalizability of the results and necessitates further validation in future studies. Overall, the study illuminates the potential of interpretable machine learning for sex prediction, while highlighting the importance of considering individual differences in predicting sex from brain activity.
first_indexed 2024-03-08T15:31:27Z
format Article
id doaj.art-3ac32331aecb44a1a6adf83f68fce090
institution Directory Open Access Journal
issn 1095-9572
language English
last_indexed 2024-03-08T15:31:27Z
publishDate 2024-01-01
publisher Elsevier
record_format Article
series NeuroImage
spelling doaj.art-3ac32331aecb44a1a6adf83f68fce090 2024-01-10T04:35:11Z eng Elsevier NeuroImage 1095-9572 2024-01-01 285 120495
Mariam Khayretdinova (corresponding author; Brainify.AI, Dover, Delaware, United States)
Ilya Zakharov (corresponding author; Brainify.AI, Dover, Delaware, United States)
Polina Pshonkovskaya (Brainify.AI, Dover, Delaware, United States)
Timothy Adamovich (Brainify.AI, Dover, Delaware, United States)
Andrey Kiryasov (Brainify.AI, Dover, Delaware, United States)
Andrey Zhdanov (Brainify.AI, Dover, Delaware, United States)
Alexey Shovkun (Brainify.AI, Dover, Delaware, United States)
title Prediction of brain sex from EEG: using large-scale heterogeneous dataset for developing a highly accurate and interpretable ML model
topic Resting state EEG
Sex-related brain differences
DCNN
LightGBM
Feature importance analysis
url http://www.sciencedirect.com/science/article/pii/S1053811923006456