On the Application of Advanced Machine Learning Methods to Analyze Enhanced, Multimodal Data from Persons Infected with COVID-19
The current COVID-19 pandemic, caused by the rapid worldwide spread of the SARS-CoV-2 virus, is having severe consequences for human health and the world economy. The virus affects different individuals differently, with many infected patients showing only mild symptoms, and others showing critical...
Main Authors: | Wenhuan Zeng, Anupam Gautam, Daniel H. Huson |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-01-01 |
Series: | Computation |
Subjects: | COVID-19; machine learning; deep learning; NLP; weather; sentiment analysis |
Online Access: | https://www.mdpi.com/2079-3197/9/1/4 |
_version_ | 1797415535544107008 |
---|---|
author | Wenhuan Zeng Anupam Gautam Daniel H. Huson |
author_facet | Wenhuan Zeng Anupam Gautam Daniel H. Huson |
author_sort | Wenhuan Zeng |
collection | DOAJ |
description | The current COVID-19 pandemic, caused by the rapid worldwide spread of the SARS-CoV-2 virus, is having severe consequences for human health and the world economy. The virus affects individuals differently: many infected patients show only mild symptoms, while others become critically ill. To lessen the impact of the epidemic, one problem is to determine which factors play an important role in the progression of the disease in a patient. Here, we construct an enhanced, structured COVID-19 dataset from more than one source, using natural language processing to add local weather conditions and country-specific research sentiment. The enhanced dataset contains 301,363 samples and 43 features, and we apply both machine learning and deep learning algorithms to it to forecast each patient’s survival probability. In addition, we import alignment sequence data to improve the performance of the model. Applying Extreme Gradient Boosting (XGBoost) to the enhanced structured dataset achieves 97% accuracy in predicting patient survival, with climatic factors, followed by age, showing the highest importance. Similarly, a Multi-Layer Perceptron (MLP) achieves 98% accuracy. This work suggests that it is useful to enhance the available data, which is mostly basic patient information, with additional, potentially important features such as weather conditions. The explored models suggest that textual weather descriptions can improve outcome forecasting. |
first_indexed | 2024-03-09T05:50:01Z |
format | Article |
id | doaj.art-3f2573c06f9046fc9b8cae5baf8ed958 |
institution | Directory Open Access Journal |
issn | 2079-3197 |
language | English |
last_indexed | 2024-03-09T05:50:01Z |
publishDate | 2021-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Computation |
spelling | doaj.art-3f2573c06f9046fc9b8cae5baf8ed958; 2023-12-03T12:18:11Z; eng; MDPI AG; Computation; 2079-3197; 2021-01-01; 9; 1; 4; 10.3390/computation9010004; On the Application of Advanced Machine Learning Methods to Analyze Enhanced, Multimodal Data from Persons Infected with COVID-19; Wenhuan Zeng, Anupam Gautam, Daniel H. Huson (each: Institute for Bioinformatics and Medical Informatics, University of Tübingen, Sand 14, 72076 Tübingen, Germany); https://www.mdpi.com/2079-3197/9/1/4; COVID-19; machine learning; deep learning; NLP; weather; sentiment analysis |
title | On the Application of Advanced Machine Learning Methods to Analyze Enhanced, Multimodal Data from Persons Infected with COVID-19 |
topic | COVID-19 machine learning deep learning NLP weather sentiment analysis |
url | https://www.mdpi.com/2079-3197/9/1/4 |