Osteoporosis Pre-Screening Using Ensemble Machine Learning in Postmenopausal Korean Women
As osteoporosis is a degenerative disease related to postmenopausal aging, early diagnosis is vital. This study used data from the Korea National Health and Nutrition Examination Surveys to predict a patient’s risk of osteoporosis using machine learning algorithms. Data from 1431 postmenopausal wome...
Main Authors: | Youngihn Kwon, Juyeon Lee, Joo Hee Park, Yoo Mee Kim, Se Hwa Kim, Young Jun Won, Hyung-Yong Kim |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-06-01 |
Series: | Healthcare |
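Subjects: | machine learning; feature selection; osteoporosis; postmenopausal women; pre-screening; risk assessment |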
Online Access: | https://www.mdpi.com/2227-9032/10/6/1107 |
_version_ | 1797486885645320192 |
---|---|
author | Youngihn Kwon; Juyeon Lee; Joo Hee Park; Yoo Mee Kim; Se Hwa Kim; Young Jun Won; Hyung-Yong Kim |
author_sort | Youngihn Kwon |
collection | DOAJ |
description | Because osteoporosis is a degenerative disease associated with postmenopausal aging, early diagnosis is vital. This study used data from the Korea National Health and Nutrition Examination Surveys to predict a patient’s risk of osteoporosis with machine learning algorithms. Data from 1431 postmenopausal women aged 40–69 years were used, with 20 osteoporosis-related features selected by feature importance and recursive feature elimination. Random Forest (RF), AdaBoost, and Gradient Boosting (GBM) were each trained on three feature sets: Model A, checkup features; Model B, survey features; and Model C, both checkup and survey features. Model C performed best, with an accuracy of 0.832 for RF, 0.849 for AdaBoost, and 0.829 for GBM, and an area under the receiver operating characteristic curve (AUROC) of 0.919 for RF, 0.921 for AdaBoost, and 0.908 for GBM. By combining multiple feature selection methods, the ensemble models reached an AUROC of up to 0.921 (AdaBoost), which is 0.1–0.2 higher than the best-performing models from recent studies. The model can be further developed into a practical medical tool for the early diagnosis of osteoporosis after menopause. |
first_indexed | 2024-03-09T23:40:40Z |
format | Article |
id | doaj.art-30522079ba744360bcb6c97a131259bf |
institution | Directory of Open Access Journals |
issn | 2227-9032 |
language | English |
last_indexed | 2024-03-09T23:40:40Z |
publishDate | 2022-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Healthcare |
spelling | doaj.art-30522079ba744360bcb6c97a131259bf; 2023-11-23T16:53:01Z; eng; MDPI AG; Healthcare; 2227-9032; 2022-06-01; vol. 10, no. 6, article 1107; 10.3390/healthcare10061107; Osteoporosis Pre-Screening Using Ensemble Machine Learning in Postmenopausal Korean Women; Youngihn Kwon (Insilicogen, Inc., Yongin-si 16954, Korea); Juyeon Lee (AIDX, Inc., Yongin-si 16954, Korea); Joo Hee Park (AIDX, Inc., Yongin-si 16954, Korea); Yoo Mee Kim (Department of Internal Medicine, International St. Mary’s Hospital, Catholic Kwandong University College of Medicine, Incheon 22711, Korea); Se Hwa Kim (Department of Internal Medicine, International St. Mary’s Hospital, Catholic Kwandong University College of Medicine, Incheon 22711, Korea); Young Jun Won (Department of Internal Medicine, International St. Mary’s Hospital, Catholic Kwandong University College of Medicine, Incheon 22711, Korea); Hyung-Yong Kim (AIDX, Inc., Yongin-si 16954, Korea); https://www.mdpi.com/2227-9032/10/6/1107; machine learning; feature selection; osteoporosis; postmenopausal women; pre-screening; risk assessment |
title | Osteoporosis Pre-Screening Using Ensemble Machine Learning in Postmenopausal Korean Women |
topic | machine learning; feature selection; osteoporosis; postmenopausal women; pre-screening; risk assessment |
url | https://www.mdpi.com/2227-9032/10/6/1107 |
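
The description in this record compares models by accuracy and AUROC. As a minimal sketch of how these two metrics are computed from a classifier's predicted risk scores, the following uses only the Python standard library. The labels and scores are hypothetical toy values, not data from the study, and the function names and the 0.5 decision threshold are assumptions made for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in y_score]
    correct = sum(p == t for p, t in zip(predictions, y_true))
    return correct / len(y_true)


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win."""
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = osteoporosis, 0 = no osteoporosis.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

print("accuracy:", accuracy(labels, scores))
print("AUROC:", auroc(labels, scores))
```

Accuracy depends on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which may be why a record like this one reports both.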