Multi-objective learning and explanation for stroke risk assessment in Shanxi province
Abstract: Stroke is the leading cause of death in China (Zhou et al., The Lancet, 2019). A dataset from Shanxi Province is analyzed to predict each patient's risk state (low, medium, high, or attack) and to estimate transition probabilities between states via a SHAP DeepExplainer.
Main Authors: | Jing Ma, Yiyang Sun, Junjie Liu, Huaxiong Huang, Xiaoshuang Zhou, Shixin Xu |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2022-12-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-022-26595-z |
_version_ | 1797973639579041792 |
---|---|
author | Jing Ma Yiyang Sun Junjie Liu Huaxiong Huang Xiaoshuang Zhou Shixin Xu |
author_sort | Jing Ma |
collection | DOAJ |
description | Abstract: Stroke is the leading cause of death in China (Zhou et al., The Lancet, 2019). A dataset from Shanxi Province is analyzed to predict each patient's risk state (low, medium, high, or attack) and to estimate transition probabilities between states via a SHAP DeepExplainer. To handle the imbalanced sample set, the quadratic interactive deep model (QIDeep) was first proposed; it flexibly selects and appends quadratic interactive features. Experiments showed that QIDeep with 3 interactive features achieved state-of-the-art accuracy of 83.33% (95% CI: 83.14%, 83.52%). Blood pressure, physical inactivity, smoking, weight, and total cholesterol are the five most important features. To obtain high recall in the attack state, stroke occurrence prediction is added as an auxiliary objective in multi-objective learning. Prediction accuracy improved, and the recall of the attack state increased by 17.79% (to 82.06%) compared with QIDeep (71.49%) using the same features. The prediction model and analysis tool provide not only a prediction method but also an attribution explanation of each patient's risk state and transition direction, a valuable aid for doctors in analyzing and diagnosing the disease. |
first_indexed | 2024-04-11T04:07:28Z |
format | Article |
id | doaj.art-7e6f61139bf4429ba16e42d9255494cf |
institution | Directory of Open Access Journals |
issn | 2045-2322 |
language | English |
last_indexed | 2024-04-11T04:07:28Z |
publishDate | 2022-12-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-7e6f61139bf4429ba16e42d9255494cf; 2023-01-01T12:17:32Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2022-12-01; 121110; 10.1038/s41598-022-26595-z; Multi-objective learning and explanation for stroke risk assessment in Shanxi province; Jing Ma (Research Center for Mathematics, Beijing Normal University); Yiyang Sun (Global Health Research Center, Data Science Research Center, Duke Kunshan University); Junjie Liu (BNU-HKBU United International College); Huaxiong Huang (Research Center for Mathematics, Beijing Normal University); Xiaoshuang Zhou (Department of Nephrology, Shanxi Provincial People’s Hospital); Shixin Xu (Global Health Research Center, Data Science Research Center, Duke Kunshan University); https://doi.org/10.1038/s41598-022-26595-z |
title | Multi-objective learning and explanation for stroke risk assessment in Shanxi province |
url | https://doi.org/10.1038/s41598-022-26595-z |
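
The abstract describes QIDeep as selecting and appending quadratic interactive features to the model input. The record does not give the authors' implementation, so the following is only a minimal sketch of the general idea, assuming a quadratic interactive feature is the product of two input features. The function names, the feature order, the sample values, and the choice of which pairs to keep are all hypothetical.

```python
from itertools import combinations


def all_pairs(n_features):
    """Every unordered pair of distinct feature indices (i < j)."""
    return list(combinations(range(n_features), 2))


def append_quadratic_interactions(rows, pairs):
    """Return new rows with the product x[i] * x[j] appended for each pair."""
    return [list(row) + [row[i] * row[j] for i, j in pairs] for row in rows]


# Hypothetical, already-scaled inputs:
# [blood_pressure, smoking, total_cholesterol]
patients = [
    [1.0, 0.0, 2.0],
    [0.5, 1.0, 4.0],
]

# Candidate interactions for 3 features; a real model would pick a subset.
candidates = all_pairs(3)

# Keeping these two pairs is an assumption made for illustration only.
selected = [(0, 1), (0, 2)]
augmented = append_quadratic_interactions(patients, selected)
print(augmented)
```

Each augmented row keeps the original features and gains one extra column per selected pair, so a downstream network sees the interaction terms as ordinary inputs.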