Multidimensional Machine Learning for Assessing Parameters Associated With COVID-19 in Vietnam: Validation Study

Background: Machine learning (ML) is a type of artificial intelligence strategy. Its algorithms are applied to large data sets to detect patterns, learn from their results, and perform tasks autonomously without explicit instructions on how to solve problems. New diseases like COVID-19 pr...

Bibliographic Details
Main Authors: Trong Tue Nguyen, Cam Tu Ho, Huong Thi Thu Bui, Lam Khanh Ho, Van Thanh Ta
Format: Article
Language: English
Published: JMIR Publications 2023-02-01
Series: JMIR Formative Research
Online Access: https://formative.jmir.org/2023/1/e42895
author Trong Tue Nguyen
Cam Tu Ho
Huong Thi Thu Bui
Lam Khanh Ho
Van Thanh Ta
collection DOAJ
description Background: Machine learning (ML) is a type of artificial intelligence strategy. Its algorithms are applied to large data sets to detect patterns, learn from their results, and perform tasks autonomously without explicit instructions on how to solve problems. New diseases like COVID-19 provide important data for ML. Therefore, all relevant parameters should be explicitly quantified and modeled.
Objective: The purpose of this study was to determine (1) the overall preclinical characteristics, (2) the cumulative cutoff values and risk ratios (RRs), and (3) the factors associated with COVID-19 severity in unidimensional and multidimensional analyses involving 2173 patients with SARS-CoV-2 infection.
Methods: The study population consisted of 2173 patients: 1587 with mild or asymptomatic status (mild group), 377 with moderate status (moderate group), and 209 with severe status (severe group). The status of the patients was recorded from September 2021 to March 2022. Two correlation tests, relative risk, and RR were used to eliminate unbalanced parameters and select the most notable ones. The independent methods of hierarchical cluster analysis and k-means were used to classify parameters according to their r values. Finally, network analysis provided a 3-dimensional view of the results.
Results: COVID-19 severity was significantly correlated with age (mild-moderate group: RR 4.19, 95% CI 3.58-4.95; P<.001), scoring index of chest x-ray (mild-moderate group: RR 3.29, 95% CI 2.76-3.92; P<.001; moderate-severe group: RR 3.03, 95% CI 2.4023-3.8314; P<.001), percentage of neutrophils (mild-moderate group: RR 3.18, 95% CI 2.73-3.70; P<.001; moderate-severe group: RR 3.32, 95% CI 2.6480-4.1529; P<.001), quantity of neutrophils (moderate-severe group: RR 3.15, 95% CI 2.6153-3.8025; P<.001), albumin (moderate-severe group: RR 0.46, 95% CI 0.3650-0.5752; P<.001), C-reactive protein (mild-moderate group: RR 3.4, 95% CI 2.91-3.97; P<.001), and ratio of lymphocytes (moderate-severe group: RR 0.34, 95% CI 0.2743-0.4210; P<.001). Notably, several correlations inverted between severity groups. Alanine transaminase and leucocytes showed a significant negative correlation in the mild group (r=−1; P<.001) and a significant positive correlation in the moderate group (r=1; P<.001). Transferrin and the chloride anion (Cl) showed a significant positive correlation in the mild group (r=1; P<.001) and a significant negative correlation in the moderate group (r=−0.59; P<.001). The clustering and network analysis showed that in the mild-moderate group, the closest neighbors of COVID-19 severity were ferritin and age. C-reactive protein, scoring index of chest x-ray, albumin, and lactate dehydrogenase were the next closest neighbors of these 3 factors. In the moderate-severe group, the closest neighbors of COVID-19 severity were ferritin, fibrinogen, albumin, quantity of lymphocytes, scoring index of chest x-ray, white blood cell count, lactate dehydrogenase, and quantity of neutrophils.
Conclusions: This multidimensional study in Vietnam showed possible correlations between several elements and COVID-19 severity to provide clinical reference markers for surveillance and diagnostic management.
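The abstract reports risk ratios (RRs) with 95% CIs but does not say how they were computed. As a rough illustration only, the sketch below uses one common approach, the Wald interval on the log scale, which may differ from the authors' method. The function name `risk_ratio` and the counts in the usage line are hypothetical and are not taken from the study.

```python
import math


def risk_ratio(exposed_events, exposed_total,
               unexposed_events, unexposed_total, z=1.96):
    """Return (rr, lower, upper) for a 2x2 table.

    Uses the log-scale Wald interval; z=1.96 gives an approximate 95% CI.
    Assumes every count is greater than zero.
    """
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    rr = risk_exposed / risk_unexposed

    # Standard error of ln(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / exposed_events - 1 / exposed_total
        + 1 / unexposed_events - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts (NOT from the study): 60 of 100 patients above a
# cutoff progressed, versus 20 of 100 patients below it.
rr, lower, upper = risk_ratio(60, 100, 20, 100)
print(f"RR {rr:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An RR above 1 with a CI that excludes 1 (as for age or C-reactive protein in the Results) indicates higher risk in the group above the cutoff. An RR below 1 (as for albumin or ratio of lymphocytes) indicates lower risk.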
first_indexed 2024-03-12T12:42:11Z
format Article
id doaj.art-3d442fed8a5a40e8911feeb3fd3b09f3
institution Directory Open Access Journal
issn 2561-326X
language English
last_indexed 2024-03-12T12:42:11Z
publishDate 2023-02-01
publisher JMIR Publications
record_format Article
series JMIR Formative Research
spelling doaj.art-3d442fed8a5a40e8911feeb3fd3b09f3 2023-08-28T23:45:27Z eng JMIR Publications JMIR Formative Research 2561-326X 2023-02-01 7 e42895 10.2196/42895
Trong Tue Nguyen https://orcid.org/0000-0002-5986-831X
Cam Tu Ho https://orcid.org/0000-0001-8239-096X
Huong Thi Thu Bui https://orcid.org/0000-0002-4101-5618
Lam Khanh Ho https://orcid.org/0000-0001-6355-1553
Van Thanh Ta https://orcid.org/0000-0002-3195-710X
title Multidimensional Machine Learning for Assessing Parameters Associated With COVID-19 in Vietnam: Validation Study
url https://formative.jmir.org/2023/1/e42895