Machine learning identifies prominent factors associated with cardiovascular disease: findings from two million adults in the Kashgar Prospective Cohort Study (KPCS)

Abstract Background Identifying factors associated with cardiovascular disease (CVD) is critical for its prevention, but this topic is scarcely investigated in Kashgar prefecture, Xinjiang, northwestern China. We thus explored the CVD epidemiology and identified prominent factors associated with CVD...

Bibliographic Details
Main Authors: Jia-Xin Li, Li Li, Xuemei Zhong, Shu-Jun Fan, Tao Cen, Jianquan Wang, Chuanjiang He, Zhoubin Zhang, Ya-Na Luo, Xiao-Xuan Liu, Li-Xin Hu, Yi-Dan Zhang, Hui-Ling Qiu, Guang-Hui Dong, Xiao-Guang Zou, Bo-Yi Yang
Format: Article
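Language: English
Published: BMC 2022-12-01
Series: Global Health Research and Policy
Subjects: Cardiovascular disease (CVD); Prediction; Prominent factors; Machine learning; Kashgar prefecture
Online Access: https://doi.org/10.1186/s41256-022-00282-y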
_version_ 1811177910593126400
author Jia-Xin Li
Li Li
Xuemei Zhong
Shu-Jun Fan
Tao Cen
Jianquan Wang
Chuanjiang He
Zhoubin Zhang
Ya-Na Luo
Xiao-Xuan Liu
Li-Xin Hu
Yi-Dan Zhang
Hui-Ling Qiu
Guang-Hui Dong
Xiao-Guang Zou
Bo-Yi Yang
author_sort Jia-Xin Li
collection DOAJ
description Abstract. Background: Identifying factors associated with cardiovascular disease (CVD) is critical for its prevention, but the topic has scarcely been investigated in Kashgar prefecture, Xinjiang, northwestern China. We therefore explored the epidemiology of CVD and identified prominent factors associated with CVD in this region. Methods: A total of 1,887,710 adults at baseline (in 2017) of the Kashgar Prospective Cohort Study were included in the analysis. Sixteen candidate factors (seven demographic, four lifestyle, and five clinical) were collected from a questionnaire and health examination records. CVD was defined according to International Classification of Diseases, 10th Revision (ICD-10) codes. We first used logistic regression models to investigate the association between each candidate factor and CVD. We then employed three machine learning methods (Random Forest, Random Ferns, and Extreme Gradient Boosting) to rank and identify prominent factors associated with CVD. Stratified analyses by sex, ethnicity, education level, economic status, and residential setting were also performed to test the consistency of the ranking. Results: The prevalence of CVD in Kashgar prefecture was 8.1%. All 16 candidate factors were significantly associated with CVD in logistic regression models (odds ratios ranged from 1.03 to 2.99, all p values < 0.05). Further machine learning-based analysis suggested that age, occupation, hypertension, exercise frequency, and dietary pattern were the five most prominent factors associated with CVD. In the stratified analyses, the ranking of factor importance generally followed the same pattern as in the overall sample. Conclusions: CVD is a major public health concern in Kashgar prefecture. Age, occupation, hypertension, exercise frequency, and dietary pattern might be the prominent factors associated with CVD in this region. These factors should be given priority in future CVD prevention efforts.
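The abstract reports odds ratios from logistic regression models. As a minimal illustration of what such a number means, the sketch below computes an unadjusted odds ratio and a 95% Wald (Woolf) confidence interval from a 2x2 table. For a single binary factor this equals the exponentiated coefficient of a univariate logistic regression. The function name `odds_ratio` and the counts are hypothetical and are not taken from the study, whose models may have included adjustments not shown here.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Return (OR, ci_low, ci_high) for a 2x2 table, using a 95% Wald interval."""
    a, b, c, d = exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    # Odds of disease among the exposed divided by odds among the unexposed.
    or_value = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf's method).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = 1.959963984540054  # two-sided 95% normal quantile
    ci_low = math.exp(math.log(or_value) - z * se_log_or)
    ci_high = math.exp(math.log(or_value) + z * se_log_or)
    return or_value, ci_low, ci_high


# Hypothetical counts, for illustration only (not data from the cohort).
or_value, ci_low, ci_high = odds_ratio(300, 700, 200, 800)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1 corresponds to a p value below 0.05 for the association, which is the sense in which the abstract describes all 16 factors as significantly associated with CVD.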
first_indexed 2024-04-11T06:09:16Z
format Article
id doaj.art-fd669cd206984d568feba42d1763ff0c
institution Directory Open Access Journal
issn 2397-0642
language English
last_indexed 2024-04-11T06:09:16Z
publishDate 2022-12-01
publisher BMC
record_format Article
series Global Health Research and Policy
spelling doaj.art-fd669cd206984d568feba42d1763ff0c
2022-12-22T04:41:22Z
eng
BMC
Global Health Research and Policy
2397-0642
2022-12-01
Volume 7, issue 1, pages 1-13
10.1186/s41256-022-00282-y
Machine learning identifies prominent factors associated with cardiovascular disease: findings from two million adults in the Kashgar Prospective Cohort Study (KPCS)
Jia-Xin Li: Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University
Li Li: Department of Respiratory and Critical Care Medicine, The First People’s Hospital of Kashi (The Affiliated Kashi Hospital of Sun Yat-Sen University)
Xuemei Zhong: Department of Respiratory and Critical Care Medicine, The First People’s Hospital of Kashi (The Affiliated Kashi Hospital of Sun Yat-Sen University)
Shu-Jun Fan: Guangzhou Center for Disease Control and Prevention
Tao Cen: Department of Research and Development, Nanfang Hospital, Southern Medical University
Jianquan Wang: Department of Respiratory and Critical Care Medicine, The First People’s Hospital of Kashi (The Affiliated Kashi Hospital of Sun Yat-Sen University)
Chuanjiang He: Department of Respiratory and Critical Care Medicine, The First People’s Hospital of Kashi (The Affiliated Kashi Hospital of Sun Yat-Sen University)
Zhoubin Zhang: Guangzhou Center for Disease Control and Prevention
Ya-Na Luo: Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University
Xiao-Xuan Liu: Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University
Li-Xin Hu: Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University
Yi-Dan Zhang: Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University
Hui-Ling Qiu: Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University
Guang-Hui Dong: Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University
Xiao-Guang Zou: Department of Respiratory and Critical Care Medicine, The First People’s Hospital of Kashi (The Affiliated Kashi Hospital of Sun Yat-Sen University)
Bo-Yi Yang: Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University
title Machine learning identifies prominent factors associated with cardiovascular disease: findings from two million adults in the Kashgar Prospective Cohort Study (KPCS)
topic Cardiovascular disease (CVD)
Prediction
Prominent factors
Machine learning
Kashgar prefecture
url https://doi.org/10.1186/s41256-022-00282-y