Identification of the Framingham Risk Score by an Entropy-Based Rule Model for Cardiovascular Disease


Bibliographic Details
Main Authors: You-Shyang Chen, Ching-Hsue Cheng, Su-Fen Chen, Jhe-You Jhuang
Format: Article
Language: English
Published: MDPI AG 2020-12-01
Series: Entropy
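Subjects: applications of medicine; cardiovascular disease; Framingham risk score (FRS); Framingham risk attributes; entropy-based rule model; machine learning techniques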
Online Access: https://www.mdpi.com/1099-4300/22/12/1406
author You-Shyang Chen
Ching-Hsue Cheng
Su-Fen Chen
Jhe-You Jhuang
collection DOAJ
description Since 2001, cardiovascular disease (CVD) has had the second-highest mortality rate in Taiwan, at about 15,700 deaths per year, and has thus imposed a substantial burden on medical resources. This study was motivated by three factors. First, CVD is an urgent problem: a high priority has been placed on long-term therapy and prevention to reduce the waste of medical resources, particularly in developed countries. Second, from the perspective of preventive medicine, popular data-mining methods have been widely studied and have performed very well in medical fields, so identifying the risk factors of CVD with these techniques is a prime concern. Third, the Framingham risk score is a core indicator that can be used to build an effective prediction model for accurately diagnosing CVD. This study therefore proposes an integrated predictive model that combines five well-known classifiers, namely the rough set (RS), decision tree (DT), random forest (RF), multilayer perceptron (MLP), and support vector machine (SVM), with a novel use of the Framingham risk score for attribute selection (i.e., F-attributes, first identified in this study) to determine the key features for identifying CVD. Verification experiments were conducted with three evaluation criteria (accuracy, sensitivity, and specificity) on 1190 instances of a CVD dataset from a Taiwan teaching hospital and 2019 examples from a public Framingham dataset. In the empirical results, the SVM showed the best performance among the listed classifiers on the CVD dataset for all F-attributes, in terms of accuracy (99.67%), sensitivity (99.93%), and specificity (99.71%). The RS showed the highest performance on the Framingham dataset for most of the F-attributes, in terms of accuracy (85.11%), sensitivity (86.06%), and specificity (85.19%).
These results provide new evidence that no single classifier or model suits all practical medical datasets; identifying an appropriate classifier for specific medical data is therefore important. Notably, this study is novel in identifying key Framingham risk attributes and integrating them with the DT technique to produce entropy-based decision rules as knowledge sets, which has not been undertaken in previous research. The study yielded meaningful entropy-based rules in tree structures and helped differentiate the classifiers across the two datasets, with three research findings and three management implications for subsequent medical research. In particular, these rules offer reasonable solutions for simplifying preventive-medicine processes by standardizing the formats and codes used in medical data to address CVD problems. The specificity of these rules is thus significant compared to those of past research.
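For readers unfamiliar with the quantities named in the description, the following is a minimal Python sketch of the three evaluation criteria and of the Shannon entropy and information gain that decision-tree rules are commonly built on. It is an illustration under stated assumptions, not the authors' implementation: the function names, the binary confusion-matrix counts, and the example values are all hypothetical, and the record does not say which tree algorithm or settings the study used.

```python
import math
from collections import Counter


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from binary confusion counts.

    Assumes tp + fn > 0 and tn + fp > 0 (both classes are present).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),    # share of true positives that were detected
        "specificity": tn / (tn + fp),    # share of true negatives that were detected
    }


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(labels, groups):
    """Entropy reduction when `labels` is partitioned into `groups`.

    `groups` is a list of label lists that together make up `labels`.
    """
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    print(classification_metrics(tp=40, fp=10, tn=45, fn=5))
    # Hypothetical labels: a split that separates the two classes perfectly.
    labels = ["cvd", "cvd", "healthy", "healthy"]
    print(information_gain(labels, [["cvd", "cvd"], ["healthy", "healthy"]]))
```

A tree learner of this kind picks, at each node, the attribute whose split gives the largest information gain; each root-to-leaf path can then be read as an if-then rule.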
first_indexed 2024-03-10T14:07:03Z
format Article
id doaj.art-978145e68f8249b0a7d4e1803bc7ae09
institution Directory Open Access Journal
issn 1099-4300
language English
last_indexed 2024-03-10T14:07:03Z
publishDate 2020-12-01
publisher MDPI AG
record_format Article
series Entropy
spelling doaj.art-978145e68f8249b0a7d4e1803bc7ae09 | 2023-11-21T00:35:39Z | eng | MDPI AG | Entropy | 1099-4300 | 2020-12-01 | 22 | 12 | 1406 | 10.3390/e22121406
You-Shyang Chen (0); Ching-Hsue Cheng (1); Su-Fen Chen (2); Jhe-You Jhuang (3)
Affiliations, in record order:
Department of Information Management, Hwa Hsia University of Technology, New Taipei City 235, Taiwan
Department of Information Management, National Yunlin University of Science and Technology, Douliou, Yunlin 64002, Taiwan
National Museum of Marine Science & Technology, Keelung City 202010, Taiwan
Department of Information Management, National Yunlin University of Science and Technology, Douliou, Yunlin 64002, Taiwan
title Identification of the Framingham Risk Score by an Entropy-Based Rule Model for Cardiovascular Disease
topic applications of medicine
cardiovascular disease
Framingham risk score (FRS)
Framingham risk attributes
entropy-based rule model
machine learning techniques