Multi-objective learning and explanation for stroke risk assessment in Shanxi province

Abstract Stroke is the leading cause of death in China (Zhou et al. in The Lancet, 2019). A dataset from Shanxi Province is analyzed to predict the risk of patients at four states (low/medium/high/attack) and to estimate transition probabilities between various states via a SHAP DeepExplainer. To ha...

Bibliographic Details
Main Authors: Jing Ma, Yiyang Sun, Junjie Liu, Huaxiong Huang, Xiaoshuang Zhou, Shixin Xu
Format: Article
Language: English
Published: Nature Portfolio 2022-12-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-022-26595-z
author Jing Ma
Yiyang Sun
Junjie Liu
Huaxiong Huang
Xiaoshuang Zhou
Shixin Xu
collection DOAJ
description Abstract Stroke is the leading cause of death in China (Zhou et al. in The Lancet, 2019). A dataset from Shanxi Province is analyzed to predict which of four risk states (low/medium/high/attack) each patient is in and to estimate transition probabilities between states via a SHAP DeepExplainer. To handle the issues related to an imbalanced sample set, the quadratic interactive deep model (QIDeep) is first proposed, built by flexibly selecting and appending quadratic interactive features. The experimental results show that the QIDeep model with 3 interactive features achieves a state-of-the-art accuracy of 83.33% (95% CI: 83.14%, 83.52%). Blood pressure, physical inactivity, smoking, weight, and total cholesterol are the five most important features. To obtain high recall in the attack state, stroke occurrence prediction is treated as an auxiliary objective in multi-objective learning. The prediction accuracy improves, and the recall of the attack state rises from 71.49% for QIDeep to 82.06% with the same features, a gain of 10.57 percentage points (a relative increase of 14.79%). The prediction model and analysis tool in this paper provide not only a prediction method but also an attribution explanation of the risk state and transition direction of each patient, making them a valuable aid for doctors in analyzing and diagnosing the disease.
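The abstract names three mechanisms without detail: appending quadratic interactive features, adding stroke occurrence as an auxiliary objective, and measuring recall in the attack state. The following is a minimal NumPy sketch of those ideas and not the authors' implementation. The function names, the choice of feature pairs, the weighted-sum form of the loss, and the `aux_weight` value are all assumptions made for illustration; the record does not specify them.

```python
import numpy as np


def append_quadratic_interactions(X, pairs):
    """Append one product column x_i * x_j per (i, j) pair to X.

    `pairs` is a hypothetical, hand-picked list of column index pairs.
    """
    X = np.asarray(X, dtype=float)
    extra = [X[:, i] * X[:, j] for i, j in pairs]
    return np.column_stack([X] + extra) if extra else X


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of integer labels under row-wise probabilities."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(picked, 1e-12, 1.0))))


def multi_objective_loss(state_probs, state_labels,
                         attack_probs, attack_labels, aux_weight=0.5):
    """Four-state loss plus a weighted auxiliary stroke-occurrence loss.

    The weighted sum and the default aux_weight are assumptions.
    """
    main = cross_entropy(state_probs, state_labels)
    attack_probs = np.asarray(attack_probs, dtype=float)
    two_class = np.column_stack([1.0 - attack_probs, attack_probs])
    aux = cross_entropy(two_class, attack_labels)
    return main + aux_weight * aux


def recall(y_true, y_pred, cls):
    """Fraction of samples whose true class is `cls` that were predicted as `cls`."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    mask = y_true == cls
    return float(np.mean(y_pred[mask] == cls)) if mask.any() else float("nan")


# Toy data: 2 patients, 3 features, 3 appended interaction columns.
X = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
X_q = append_quadratic_interactions(X, [(0, 1), (1, 2), (0, 0)])

# States encoded 0=low, 1=medium, 2=high, 3=attack (an assumed encoding).
attack_recall = recall([3, 3, 0, 3], [3, 0, 0, 3], cls=3)
```

Under this sketch, raising `aux_weight` pushes training toward separating attack from non-attack cases, which is one plausible route to the higher attack-state recall the abstract reports.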
first_indexed 2024-04-11T04:07:28Z
format Article
id doaj.art-7e6f61139bf4429ba16e42d9255494cf
institution Directory Open Access Journal
issn 2045-2322
language English
last_indexed 2024-04-11T04:07:28Z
publishDate 2022-12-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-7e6f61139bf4429ba16e42d9255494cf
2023-01-01T12:17:32Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2022-12-01
Volume 12, Issue 1, Pages 1-10
10.1038/s41598-022-26595-z
Multi-objective learning and explanation for stroke risk assessment in Shanxi province
Jing Ma (Research Center for Mathematics, Beijing Normal University)
Yiyang Sun (Global Health Research Center, Data Science Research Center, Duke Kunshan University)
Junjie Liu (BNU-HKBU United International College)
Huaxiong Huang (Research Center for Mathematics, Beijing Normal University)
Xiaoshuang Zhou (Department of Nephrology, Shanxi Provincial People’s Hospital)
Shixin Xu (Global Health Research Center, Data Science Research Center, Duke Kunshan University)
https://doi.org/10.1038/s41598-022-26595-z
title Multi-objective learning and explanation for stroke risk assessment in Shanxi province
url https://doi.org/10.1038/s41598-022-26595-z