Identification of biomarkers by machine learning classifiers to assist diagnose rheumatoid arthritis-associated interstitial lung disease
Abstract Background This study aimed to search for blood biomarkers among the profiles of patients with RA-ILD by using machine learning classifiers and to probe correlations between the markers and the characteristics of RA-ILD. Methods A total of 153 RA patients were enrolled, including 75 RA-ILD and...
Main Authors: | Yan Qin, Yanlin Wang, Fanxing Meng, Min Feng, Xiangcong Zhao, Chong Gao, Jing Luo |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2022-05-01 |
Series: | Arthritis Research & Therapy |
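Subjects: | Interstitial lung disease; Rheumatoid arthritis; Krebs von den Lungen-6; D-dimer; Tumor markers; Machine learning algorithm |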
Online Access: | https://doi.org/10.1186/s13075-022-02800-2 |
_version_ | 1818548719908290560 |
---|---|
author | Yan Qin; Yanlin Wang; Fanxing Meng; Min Feng; Xiangcong Zhao; Chong Gao; Jing Luo |
author_sort | Yan Qin |
collection | DOAJ |
description | Abstract Background This study aimed to search for blood biomarkers among the profiles of patients with RA-ILD by using machine learning classifiers and to probe correlations between the markers and the characteristics of RA-ILD. Methods A total of 153 RA patients were enrolled, including 75 RA-ILD and 78 RA-non-ILD. Routine laboratory data, the levels of tumor markers and autoantibodies, and clinical manifestations were recorded. Univariate analysis, least absolute shrinkage and selection operator (LASSO), random forest (RF), and partial least squares (PLS) were performed, and the receiver operating characteristic (ROC) curves were plotted. Results Univariate analysis showed that, compared to RA-non-ILD, patients with RA-ILD were older (p < 0.001), had higher white blood cell (p = 0.003) and neutrophil counts (p = 0.017), had higher erythrocyte sedimentation rate (p = 0.003) and C-reactive protein (p = 0.003), had higher levels of KL-6 (p < 0.001), D-dimer (p < 0.001), fibrinogen (p < 0.001), fibrinogen degradation products (p < 0.001), lactate dehydrogenase (p < 0.001), hydroxybutyrate dehydrogenase (p < 0.001), carbohydrate antigen 19-9 (CA19-9) (p < 0.001), carcinoembryonic antigen (p = 0.001), and CA242 (p < 0.001), but a significantly lower albumin level (p = 0.003). The areas under the curve (AUCs) of the LASSO, RF, and PLS models reached 0.95 for differentiating patients with RA-ILD from those without. When data from the univariate analysis and the top 10 indicators of the three machine learning models were combined, the most discriminatory markers were age, KL-6, D-dimer, and CA19-9, with AUCs of 0.814 [95% confidence interval (CI) 0.731–0.880], 0.749 (95% CI 0.660–0.824), 0.749 (95% CI 0.660–0.824), and 0.727 (95% CI 0.637–0.805), respectively. When all four markers were combined, the AUC reached 0.928 (95% CI 0.865–0.968). Notably, neither the KL-6 nor the CA19-9 level correlated with disease activity in the RA-ILD group.
Conclusions The levels of KL-6, D-dimer, and tumor markers greatly aided RA-ILD identification. Machine learning algorithms combined with traditional biostatistical analysis can diagnose patients with RA-ILD and identify biomarkers potentially associated with the disease. |
first_indexed | 2024-12-12T08:23:57Z |
format | Article |
id | doaj.art-604090d284f6459baf7abd1d633280d6 |
institution | Directory Open Access Journal |
issn | 1478-6362 |
language | English |
last_indexed | 2024-12-12T08:23:57Z |
publishDate | 2022-05-01 |
publisher | BMC |
record_format | Article |
series | Arthritis Research & Therapy |
spelling | doaj.art-604090d284f6459baf7abd1d633280d6; 2022-12-22T00:31:18Z; eng; BMC; Arthritis Research & Therapy; ISSN 1478-6362; 2022-05-01; vol. 24, no. 1, pp. 1–12; doi:10.1186/s13075-022-02800-2; Identification of biomarkers by machine learning classifiers to assist diagnose rheumatoid arthritis-associated interstitial lung disease; Authors: Yan Qin (Department of Rheumatology, Second Hospital of Shanxi Medical University), Yanlin Wang (Department of Rheumatology, Second Hospital of Shanxi Medical University), Fanxing Meng (The Shanxi Medical University), Min Feng (Department of Rheumatology, Second Hospital of Shanxi Medical University), Xiangcong Zhao (Department of Rheumatology, Second Hospital of Shanxi Medical University), Chong Gao (Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School), Jing Luo (Department of Rheumatology, Second Hospital of Shanxi Medical University); https://doi.org/10.1186/s13075-022-02800-2; Keywords: Interstitial lung disease, Rheumatoid arthritis, Krebs von den Lungen-6, D-dimer, Tumor markers, Machine learning algorithm |
title | Identification of biomarkers by machine learning classifiers to assist diagnose rheumatoid arthritis-associated interstitial lung disease |
topic | Interstitial lung disease; Rheumatoid arthritis; Krebs von den Lungen-6; D-dimer; Tumor markers; Machine learning algorithm |
url | https://doi.org/10.1186/s13075-022-02800-2 |
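The abstract in this record reports each marker's discriminatory power as an area under the ROC curve (AUC). As a rough illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive case has a higher marker value than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy values are hypothetical and are not taken from the study; the authors' actual software and procedure are not described in this record.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: P(positive score > negative score), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    # Compare every positive case against every negative case.
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy marker values, NOT data from the study.
# Label 1 stands for a case with the condition, 0 for a case without it.
toy_marker = [520, 310, 450, 200, 180, 330, 150, 260]
toy_label = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(toy_marker, toy_label)
print(f"toy AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to a marker that separates the groups no better than chance, and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients but would normally be replaced by a sort-based computation for larger data.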