Integrating Health Data-Driven Machine Learning Algorithms to Evaluate Risk Factors of Early Stage Hypertension at Different Levels of HDL and LDL Cholesterol

Purpose: Cardiovascular disease (CVD) is a major worldwide health burden, and hypertension and hyperlipidemia are its most frequently cited risk factors. Early stage hypertension in the population with dyslipidemia is an important public health hazard. This study was the application of data-driven...

Bibliographic Details
Main Authors: Pen-Chih Liao, Ming-Shu Chen, Mao-Jhen Jhou, Tsan-Chi Chen, Chih-Te Yang, Chi-Jie Lu
Format: Article
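Language: English
Published: MDPI AG 2022-08-01
Series: Diagnostics
Subjects: health data-driven; high-density lipoprotein cholesterol (HDL-C); low-density lipoprotein cholesterol (LDL-C); hypertension; machine learning
Online Access: https://www.mdpi.com/2075-4418/12/8/1965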
author Pen-Chih Liao
Ming-Shu Chen
Mao-Jhen Jhou
Tsan-Chi Chen
Chih-Te Yang
Chi-Jie Lu
author_sort Pen-Chih Liao
collection DOAJ
description Purpose: Cardiovascular disease (CVD) is a major worldwide health burden, and hypertension and hyperlipidemia are its most frequently cited risk factors. Early stage hypertension in people with dyslipidemia is an important public health hazard. Data-driven machine learning (ML) can capture complex relationships between risk factors and outcomes and has shown promising predictive performance on large volumes of medical data. This study applied ML to investigate the association between dyslipidemia and the incidence of early stage hypertension in a large cohort with normal blood pressure at baseline. Methods: This study analyzed annual health screening data for 71,108 people from 2005 to 2017, including 27 risk-related indicators, sourced from the MJ Group, a major health screening center in Taiwan. Five ML methods were used: stochastic gradient boosting (SGB), multivariate adaptive regression splines (MARS), least absolute shrinkage and selection operator regression (Lasso), ridge regression (Ridge), and gradient boosting with categorical features support (CatBoost). These were combined into a multi-stage ML algorithm-based prediction scheme, which was then used to evaluate important risk factors at the early stage of hypertension, especially for groups with high-density lipoprotein cholesterol (HDL-C) and low-density lipoprotein cholesterol (LDL-C) levels within or outside the reference range. Results: Age, body mass index, waist circumference, waist-to-hip ratio, fasting plasma glucose, and C-reactive protein (CRP) were associated with hypertension. The hemoglobin level was also a positive contributor to blood pressure elevation and appeared among the top three risk factors in all LDL-C/HDL-C groups; these variables may therefore be important in affecting blood pressure in the early stage of hypertension. A residual contribution to blood pressure elevation was found in groups with increased LDL-C. This suggests that LDL-C levels are associated with CRP levels, and that the LDL-C level may be an important factor for predicting the development of hypertension. Conclusion: The five prediction models provided similar classifications of risk factors. The results show that, in health screening of sub-healthy adults, an increase in LDL-C is more important than the onset of a decline in HDL-C. These findings should be of value for raising health awareness about hypertension and for further discussion and follow-up research.
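The abstract describes stratifying the cohort by whether LDL-C and HDL-C fall inside the reference range and then comparing risk-factor importance across several models. The sketch below illustrates one way such a comparison could be organized; it is not the authors' scheme. The cutoffs `LDL_UPPER` and `HDL_LOWER`, the rank-averaging step, the helper names, and the feature names `x1` to `x3` are all assumptions introduced for illustration, and the scores are placeholders rather than study results.

```python
from statistics import mean

# Hypothetical cutoffs in mg/dL, chosen only for illustration; the abstract
# does not state the reference ranges the authors used.
LDL_UPPER = 130.0
HDL_LOWER = 40.0


def lipid_group(ldl_c, hdl_c):
    """Label a record by whether LDL-C and HDL-C fall inside the assumed ranges."""
    ldl = "LDL-C normal" if ldl_c < LDL_UPPER else "LDL-C high"
    hdl = "HDL-C normal" if hdl_c >= HDL_LOWER else "HDL-C low"
    return f"{ldl} / {hdl}"


def ranks(scores):
    """Turn {feature: importance} into {feature: rank}; rank 1 is most important."""
    ordered = sorted(scores, key=lambda f: scores[f], reverse=True)
    return {feature: position for position, feature in enumerate(ordered, start=1)}


def average_rank(per_model_scores):
    """Average each feature's rank across models, sorted from best to worst."""
    per_model_ranks = [ranks(s) for s in per_model_scores.values()]
    features = per_model_ranks[0].keys()
    avg = {f: mean(r[f] for r in per_model_ranks) for f in features}
    return sorted(avg.items(), key=lambda item: item[1])


# Placeholder importance scores for three unnamed models; NOT results from the study.
toy_scores = {
    "model_a": {"x1": 0.5, "x2": 0.3, "x3": 0.2},
    "model_b": {"x1": 0.4, "x2": 0.1, "x3": 0.5},
    "model_c": {"x1": 0.6, "x2": 0.3, "x3": 0.1},
}

consensus = average_rank(toy_scores)
for feature, avg in consensus:
    print(f"{feature}: average rank {avg:.2f}")
```

Averaging ranks rather than raw importance scores sidesteps the fact that tree-based and penalized linear models report importance on different scales; in practice the per-model scores would come from fitting each model separately within each lipid group.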
first_indexed 2024-03-09T04:34:25Z
format Article
id doaj.art-ecd642c52925405fab7678e3c6395059
institution Directory Open Access Journal
issn 2075-4418
language English
last_indexed 2024-03-09T04:34:25Z
publishDate 2022-08-01
publisher MDPI AG
record_format Article
series Diagnostics
spelling doaj.art-ecd642c52925405fab7678e3c6395059 2023-12-03T13:32:10Z eng MDPI AG Diagnostics 2075-4418 2022-08-01 12 8 1965 10.3390/diagnostics12081965
Integrating Health Data-Driven Machine Learning Algorithms to Evaluate Risk Factors of Early Stage Hypertension at Different Levels of HDL and LDL Cholesterol
Pen-Chih Liao (Division of Cardiology, Cardiovascular Center, Far Eastern Memorial Hospital, New Taipei City 220, Taiwan)
Ming-Shu Chen (Department of Healthcare Administration, College of Healthcare and Management, Asia Eastern University of Science and Technology, New Taipei City 220, Taiwan)
Mao-Jhen Jhou (Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City 242, Taiwan)
Tsan-Chi Chen (Department of Medical Research, Far Eastern Memorial Hospital, New Taipei City 220, Taiwan)
Chih-Te Yang (Department of Business Administration, Tamkang University, New Taipei City 251, Taiwan)
Chi-Jie Lu (Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City 242, Taiwan)
https://www.mdpi.com/2075-4418/12/8/1965
health data-driven; high-density lipoprotein cholesterol (HDL-C); low-density lipoprotein cholesterol (LDL-C); hypertension; machine learning
title Integrating Health Data-Driven Machine Learning Algorithms to Evaluate Risk Factors of Early Stage Hypertension at Different Levels of HDL and LDL Cholesterol
topic health data-driven
high-density lipoprotein cholesterol (HDL-C)
low-density lipoprotein cholesterol (LDL-C)
hypertension
machine learning
url https://www.mdpi.com/2075-4418/12/8/1965