Predicting early-onset COPD risk in adults aged 20–50 using electronic health records and machine learning
Chronic obstructive pulmonary disease (COPD) is a major public health concern, affecting an estimated 164 million people worldwide. Early detection and intervention strategies are essential to reduce the burden of COPD, but current screening approaches are limited in their ability to accurately predict...
Main Authors: | Guanglei Liu, Jiani Hu, Jianzhe Yang, Jie Song |
---|---|
Format: | Article |
Language: | English |
Published: | PeerJ Inc. 2024-02-01 |
Series: | PeerJ |
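Subjects: | Chronic obstructive pulmonary disease; COPD; Machine learning; Risk prediction; Genetic data; Electronic health records |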
Online Access: | https://peerj.com/articles/16950.pdf |
_version_ | 1797295377610702848 |
---|---|
author | Guanglei Liu Jiani Hu Jianzhe Yang Jie Song |
author_facet | Guanglei Liu Jiani Hu Jianzhe Yang Jie Song |
author_sort | Guanglei Liu |
collection | DOAJ |
description | Chronic obstructive pulmonary disease (COPD) is a major public health concern, affecting an estimated 164 million people worldwide. Early detection and intervention strategies are essential to reduce the burden of COPD, but current screening approaches are limited in their ability to accurately predict risk. Machine learning (ML) models offer promise for improved accuracy of COPD risk prediction by combining genetic and electronic medical record data. In this study, we developed and evaluated eight ML models for primary screening of COPD utilizing routine screening data, polygenic risk scores (PRS), additional clinical data, or a combination of all three. To assess our models, we conducted a retrospective analysis of approximately 329,396 patients in the UK Biobank database. Incorporating personal information and blood biochemical test results significantly improved the model’s accuracy for predicting COPD risk, achieving a best performance of 0.8505 AUC, a specificity of 0.8539 and a sensitivity of 0.7584. These results indicate that ML models can be effectively utilized for accurate prediction of COPD risk in individuals aged 20 to 50 years, providing a valuable tool for early detection and intervention. |
first_indexed | 2024-03-07T21:46:56Z |
format | Article |
id | doaj.art-e5e95c21a5d14eb8b7d6922d40058220 |
institution | Directory Open Access Journal |
issn | 2167-8359 |
language | English |
last_indexed | 2024-03-07T21:46:56Z |
publishDate | 2024-02-01 |
publisher | PeerJ Inc. |
record_format | Article |
series | PeerJ |
spelling | doaj.art-e5e95c21a5d14eb8b7d6922d40058220; 2024-02-25T15:05:17Z; eng; PeerJ Inc.; PeerJ; 2167-8359; 2024-02-01; 12; e16950; 10.7717/peerj.16950; Predicting early-onset COPD risk in adults aged 20–50 using electronic health records and machine learning; Guanglei Liu (School of Information Science and Engineering, Yunnan University, Kunming, Yunnan, China); Jiani Hu (Ailurus Biotechnology Ltd., Shenzhen, Guangdong, China); Jianzhe Yang (Ailurus Biotechnology Ltd., Shenzhen, Guangdong, China); Jie Song (Ailurus Biotechnology Ltd., Shenzhen, Guangdong, China); Chronic obstructive pulmonary disease (COPD) is a major public health concern, affecting an estimated 164 million people worldwide. Early detection and intervention strategies are essential to reduce the burden of COPD, but current screening approaches are limited in their ability to accurately predict risk. Machine learning (ML) models offer promise for improved accuracy of COPD risk prediction by combining genetic and electronic medical record data. In this study, we developed and evaluated eight ML models for primary screening of COPD utilizing routine screening data, polygenic risk scores (PRS), additional clinical data, or a combination of all three. To assess our models, we conducted a retrospective analysis of approximately 329,396 patients in the UK Biobank database. Incorporating personal information and blood biochemical test results significantly improved the model’s accuracy for predicting COPD risk, achieving a best performance of 0.8505 AUC, a specificity of 0.8539 and a sensitivity of 0.7584. These results indicate that ML models can be effectively utilized for accurate prediction of COPD risk in individuals aged 20 to 50 years, providing a valuable tool for early detection and intervention.; https://peerj.com/articles/16950.pdf; Chronic obstructive pulmonary disease; COPD; Machine learning; Risk prediction; Genetic data; Electronic health records |
spellingShingle | Guanglei Liu Jiani Hu Jianzhe Yang Jie Song Predicting early-onset COPD risk in adults aged 20–50 using electronic health records and machine learning PeerJ Chronic obstructive pulmonary disease COPD Machine learning Risk prediction Genetic data Electronic health records |
title | Predicting early-onset COPD risk in adults aged 20–50 using electronic health records and machine learning |
title_sort | predicting early onset copd risk in adults aged 20 50 using electronic health records and machine learning |
topic | Chronic obstructive pulmonary disease COPD Machine learning Risk prediction Genetic data Electronic health records |
url | https://peerj.com/articles/16950.pdf |
work_keys_str_mv | AT guangleiliu predictingearlyonsetcopdriskinadultsaged2050usingelectronichealthrecordsandmachinelearning AT jianihu predictingearlyonsetcopdriskinadultsaged2050usingelectronichealthrecordsandmachinelearning AT jianzheyang predictingearlyonsetcopdriskinadultsaged2050usingelectronichealthrecordsandmachinelearning AT jiesong predictingearlyonsetcopdriskinadultsaged2050usingelectronichealthrecordsandmachinelearning |
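
The description above reports three evaluation metrics (AUC, specificity and sensitivity) without saying how they were computed. The following is a minimal sketch of the standard definitions, not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all hypothetical and chosen only for illustration. AUC is computed here with the rank-based (Mann-Whitney) formulation, which is one common way to obtain it.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(y_true, y_pred):
    """Fraction of true cases that the model flags: TP / (TP + FN)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    """Fraction of non-cases that the model clears: TN / (TN + FP)."""
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def roc_auc(y_true, scores):
    """Probability that a random case outscores a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 1 = COPD case, 0 = non-case.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
    # Hypothetical threshold; the record does not state the one used.
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print("AUC:", roc_auc(y_true, scores))
    print("Sensitivity:", sensitivity(y_true, y_pred))
    print("Specificity:", specificity(y_true, y_pred))
```

AUC does not depend on a decision threshold, whereas sensitivity and specificity do, so a reported pair such as 0.7584 and 0.8539 corresponds to one particular operating point on the ROC curve.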